Implementation of the merge_sort - comparing the timing of an array versus a vector

Question

I'm going through the book Introduction to Algorithms. I made a comparison between merge-sort for an array of integers versus a vector.

Could I have structured this program better? Why is the vector version so much slower? Sorting 2 million integers with a vector type took almost 2 seconds but sorting the same list using an array only took .4 seconds. Also, if I increase arraylength to over 3 million, then I get a segmentation fault. How can I avoid this?

I am used to Mathematica and Python, but not C++. In what way have I made use of pointers here? How could I better make use of them?

#include <iostream>
#include <math.h>
#include <stdlib.h> //srand, rand
#include <time.h>   //time
#include <vector>
#include <chrono>
using namespace std;

const int arraylength=2000000;

//This is an implementation of merge_sort, an algorithm to sort a list of integers using
//a recursion relation.  The merge_sort is written as two functions, `merge` which takes two
//pre-sorted lists and merges them to a single sorted list.  This is called on by merge_sort, 
//which also recursively calls itself.

//I've implemented it here twice, first with the two functions `merge` and `merge_sort`, and then
//again with `vmerge` and `vmerge_sort`.  The first two take as their argument arrays of integers, 
//while the second two take the data type `vector` from the `vector` package (is package the right word?
//or do I say library?).  


void merge(int A[], int p, int q, int r)
{
    //n1 and n2 are the lengths of the pre-sorted sublists, A[p..q] and A[q+1..r]
    int n1=q-p+1;
    int n2=r-q;
    //copy these pre-sorted lists to L and R
    int L[n1+1];
    int R[n2+1];
    for(int i=0;i<=n1-1; i++)
    {
        L[i]=A[p+i];
    }
    for(int j=0;j<=n2-1; j++)
    {
        R[j]=A[q+1+j];
    }


    //Create a sentinal value for L and R that is larger than the largest
    //element of A
    int largest;
    if(L[n1-1]<R[n2-1]) largest=R[n2-1]; else largest=L[n1-1];
    L[n1]=largest+1;
    R[n2]=largest+1;

    //Merge the L and R lists
    int i=0;
    int j=0;
    for(int k=p; k<=r; k++)
    {
        if (L[i]<=R[j])
        {
            A[k]=L[i];
            i++;
        } else
        {
            A[k]=R[j];
            j++;
        }
    }
}

void merge_sort(int A[], int p, int r)
{
    if(p<r)
    {
        int q=floor((p+r)/2);
        merge_sort(A,p,q);
        merge_sort(A,q+1,r);
        merge(A,p,q,r);
    }

}


void vmerge(vector<int>& A, int p, int q, int r)
{
    //n1 and n2 are the lengths of the pre-sorted sublists, A[p..q] and A[q+1..r]
    int n1=q-p+1;
    int n2=r-q;
    //copy these pre-sorted lists to L and R

    vector<int> L(&A[p],&A[q+1]);
    vector<int> R(&A[q+1],&A[r+1]);


    //Create a sentinal value for L and R that is larger than the largest
    //element of A
    int largest;
    if(L[n1-1]<R[n2-1]) largest=R[n2-1]; else largest=L[n1-1];
    L.push_back(largest+1);
    R.push_back(largest+1);

    //Merge the L and R lists
    int i=0;
    int j=0;
    for(int k=p; k<=r; k++)
    {
        if (L[i]<=R[j])
        {
            A[k]=L[i];
            i++;
        } else
        {
            A[k]=R[j];
            j++;
        }
    }
}


void vmerge_sort(vector<int>& A, int p, int r)
{
    //This recursively splits the vector A into smaller sections 
    if(p<r)
    {
        int q=floor((p+r)/2);
        vmerge_sort(A,p,q);
        vmerge_sort(A,q+1,r);
        vmerge(A,p,q,r);
    }

}    

int main()
{
    //seed the random number generator
    srand(time(0));

    cout<<"C++ merge-sort test"<<endl;
    //vlist is defined to be of type vector<int>
    vector<int> vlist1;
    //rlist1 is defined to be an integer array
    int *rlist1= new int[arraylength];
    //both vlist1 and rlist1 have the same content, 2 million random integers
    for(int i=0;i<=arraylength-1;i++)
    {
        rlist1[i] = rand() % 10000;
        vlist1.push_back(rlist1[i] );
    }

    //here I sort rlist1
    auto   t1 = std::chrono::high_resolution_clock::now();
    merge_sort(rlist1,0,arraylength-1);
    auto   t2 = std::chrono::high_resolution_clock::now();
    cout << "sorting "<<arraylength<<" random numbers with merge sort took "
              << std::chrono::duration_cast<std::chrono::milliseconds>(t2-t1).count()
              << " milliseconds\n";

    //here I sort vlist1          
    t1 = std::chrono::high_resolution_clock::now();
    vmerge_sort(vlist1,0,arraylength-1);
    t2 = std::chrono::high_resolution_clock::now();
    cout << "sorting "<<arraylength<<" random numbers with vmerge sort took "
              << std::chrono::duration_cast<std::chrono::milliseconds>(t2-t1).count()
              << " milliseconds\n";


}

UPDATE: Here is the code I've gotten to after reading Loki Astari and Aleksey Demakov's answers. With the code above, I was able to sort 2 million random numbers in 400 ms using merge_sort and 1926 ms using vmerge_sort. After making the changes, these functions do the task in 410 ms and 860 ms, respectively. So working with the vector type takes twice as long. I suppose this shouldn't be a surprise, as it states here "Therefore, compared to arrays, vectors consume more memory in exchange for the ability to manage storage and grow dynamically in an efficient way."

#include <iostream>
#include <math.h>
#include <stdlib.h> //srand, rand
#include <time.h>   //time
#include <vector>
#include <chrono>

//Is this less offensive than using the entire std namespace?
using std::cout;
using std::endl;

const int arraylength=2000000;

//This is an implementation of merge_sort, an algorithm to sort a list of integers using
//a recursion relation.  The merge_sort is written as two functions, `merge` which takes two
//pre-sorted lists and merges them to a single sorted list.  This is called on by merge_sort, 
//which also recursively calls itself.

//I've implemented it here twice, first with the two functions `merge` and `merge_sort`, and then
//again with `vmerge` and `vmerge_sort`.  The first two take as their argument arrays of integers, 
//while the second two take the data type `vector` from the `vector` package (is package the right word?
//or do I say library?).  


void merge(int A[], int LA[], int RA[], int p, int q, int r)
{
    //n1 and n2 are the lengths of the pre-sorted sublists, A[p..q] and A[q+1..r]
    int n1=q-p+1;
    int n2=r-q;
    //Copy the left and right halves of the A array into the L and R arrays
    for(int i=0;i<n1; i++)
    {
        LA[i]=A[p+i];
    }
    for(int j=0;j<n2; j++)
    {
        RA[j]=A[q+1+j];
    }


    //Merge the L and R lists
    int i=0;
    int j=0;
    int k = p;
    while(i < n1 && j < n2) {
        A[k++] = (LA[i]<=RA[j])  
                   ? LA[i++]    
                   : RA[j++];
    }
    while(i < n1) {
        A[k++] = LA[i++];
    }
    while(j < n2) {
        A[k++] = RA[j++];
    }
}

void merge_sort(int A[], int LA[], int RA[], int p, int r)
{
    //This recursively splits the array A into smaller sections 
    if(p<r)
    {
        int q=floor((p+r)/2);
        merge_sort(A,LA,RA,p,q);
        merge_sort(A,LA,RA,q+1,r);
        merge(A,LA,RA,p,q,r);
    }

}


void vmerge(std::vector<int>& A, std::vector<int>& LA, std::vector<int>& RA, int p, int q, int r)
{
    //n1 and n2 are the lengths of the pre-sorted sublists, A[p..q] and A[q+1..r]
    int n1=q-p+1;
    int n2=r-q;
    //copy these pre-sorted lists to L and R

    for(int i=0;i<n1; i++)
    {
        LA[i]=A[p+i];
    }
    for(int j=0;j<n2; j++)
    {
        RA[j]=A[q+1+j];
    }


    //Merge the L and R lists
    int i=0;
    int j=0;
    int k = p;
    while(i < n1 && j < n2) 
    {
        A[k++] = (LA[i]<=RA[j])  
                   ? LA[i++]    
                   : RA[j++];
    }
    while(i < n1) {
        A[k++] = LA[i++];
    }
    while(j < n2) {
        A[k++] = RA[j++];
    }


}


void vmerge_sort(std::vector<int>& A, std::vector<int>& LA, std::vector<int>& RA, int p, int r)
{
    //This recursively splits the vector A into smaller sections 
    if(p<r)
    {
        int q=floor((p+r)/2);
        vmerge_sort(A,LA,RA,p,q);
        vmerge_sort(A,LA,RA,q+1,r);
        vmerge(A,LA,RA,p,q,r);
    }

}    

int main()
{
    //seed the random number generator
    srand(time(0));
    std::chrono::high_resolution_clock::time_point t1,t2;
    cout<<"C++ merge-sort test"<<endl;


    //rlist1 is defined to be an integer array
    //L and R are the subarrays used in the merge function
    int *rlist1= new int[arraylength];
    int halfarraylength=ceil(arraylength/2)+1;
    int *R= new int[halfarraylength];
    int *L= new int[halfarraylength];


    //vlist is defined to be of type vector<int>
    //vL and vR are the left and right subvectors used in the vmerge function
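    //the vectors are constructed at full size: reserve() alone leaves size() at 0,
    //so indexing vlist1[i], LA[i] or RA[i] would be out of range
    std::vector<int> vlist1(arraylength),vL(halfarraylength),vR(halfarraylength);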



    //both vlist1 and rlist1 have the same content, 2 million random integers
    for(int i=0;i<=arraylength-1;i++)
    {
        rlist1[i] = rand() % 1000000;
        vlist1[i] = rlist1[i];
    }


    //here I sort rlist1
    t1 = std::chrono::high_resolution_clock::now();
    merge_sort(rlist1,L,R,0,arraylength-1);
    t2 = std::chrono::high_resolution_clock::now();
    cout << "sorting "<<arraylength<<" random numbers with merge sort took "
              << std::chrono::duration_cast<std::chrono::milliseconds>(t2-t1).count()
              << " milliseconds\n";



    //here I sort vlist1          
    t1 = std::chrono::high_resolution_clock::now();
    vmerge_sort(vlist1,vL,vR,0,arraylength-1);
    t2 = std::chrono::high_resolution_clock::now();
    cout << "sorting "<<arraylength<<" random numbers with vmerge sort took "
              << std::chrono::duration_cast<std::chrono::milliseconds>(t2-t1).count()
              << " milliseconds\n";

    //Now we test that both sorted lists are identical
    cout << "Testing that both sorted lists are the same"<< endl;
    int testcounter = 0;
    for (int k=0; k< arraylength; k++)
    {
        if (rlist1[k] != vlist1[k]) testcounter+=1;
    }
    if (testcounter==0) cout<< "Both lists are the same\n"; else cout<<"Both lists are not the same\n";




}

Both answers have been very helpful. How does accepting an answer on this Stack Exchange site work, when you aren't asking a specific question but asking for comments on how to improve something?

`Why is the vector version`: Something is different in your code. A vector basically boils down to an array (unless you start resizing).
– Loki Astari, Commented Mar 30, 2015 at 18:30
@LokiAstari, I can see I've done a horrible job of commenting this, let me edit it to make it more clear. — Jason B.
– Jason B., Commented Mar 30, 2015 at 18:48

Loki Astari · Accepted Answer · 2015-03-30 19:17:56Z

At a quick glance, the only difference I can spot is:

 vector<int> L(&A[p],&A[q+1]);
 vector<int> R(&A[q+1],&A[r+1]);

Versus:

int L[n1+1];
int R[n2+1];

Note: These array declarations are technically not legal C++, because the sizes are not compile-time constants (but a lot of compilers accept this as an extension).

~~I am not sure why you are doing this:~~
OK spotted what you are doing now.

L.push_back(largest+1);
R.push_back(largest+1);

But here you may be forcing the vector to resize, which forces a copy of all the data from the original buffer to the new buffer. The array version does not do this. This is a potential cause of your speed problem (a hidden array copy).

It looks like these two lines are causing your problems.
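As a rough illustration of avoiding that hidden copy, here is a minimal sketch. The helper name `copy_with_sentinel` is hypothetical (it is not in the question's code); the idea is simply to reserve room for the sentinel before copying, so the later push_back fits in the existing buffer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: copy A[first, last) and append one sentinel value.
// Reserving n + 1 slots up front means push_back does not need to reallocate.
std::vector<int> copy_with_sentinel(const std::vector<int>& A,
                                    std::size_t first, std::size_t last,
                                    int sentinel)
{
    std::vector<int> out;
    out.reserve(last - first + 1);                    // room for data + sentinel
    out.assign(A.begin() + first, A.begin() + last);  // copies A[first, last)
    out.push_back(sentinel);                          // fits in reserved capacity
    return out;
}
```

This still performs one allocation per call, so it only removes the extra reallocation, not the per-merge allocation itself.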

~~But I think you may have a bug in your code.~~ I see why you write your loop like this now. You have added an element onto the end of each range to make sure you always have an element to compare against. OK. So this is not a bug. But see below for a better way to write it.

for(int k=p; k<=r; k++)
{
    // Here once `i` has gone past the end of L
    // or `j` has gone past the end of R then 
    // accessing the element is undefined behavior.
        //
        // This is going to happen when you have finished looking
        // at one array and the only elements left are in the other array.
        // Which will always happen at least for one element but
        // potentially for lots of elements.
        //
        // Also because you have gone past the end of the array
        // The values you are comparing are going to give you
        // completely different behaviors.
        //
        // Which result in different times.
    if (L[i]<=R[j])
    {

        A[k]=L[i];
        i++;
    } else
    {
        A[k]=R[j];
        j++;
    }
}

The correct way to do this loop is:

int i=0;
int j=0;
int k=p;   // write position: start of the range being merged
while(i < n1 && j < n2) {
    A[k++] = (L[i]<=R[j])  // Sorry could not resist a one liner
               ? L[i++]    // But probably best to write it the original way
               : R[j++];
}
// Because there is no conditional in this loop
// It can run quicker than the version with the conditional above.
// Also: Only one of these two loops is actually run.
while(i < n1) {
    A[k++] = L[i++];
}
while(j < n2) {
    A[k++] = R[j++];
}

Also your bounds are a bit funky.

You use beginning to end (inclusive). Normally in C++ we use beginning to one past the end. If you want to be consistent with C++ usage you should adopt this convention.

Written as:

A[p..q)       // Notice the ) at the end indicating not inclusive.
A[q..r)       // That is how I would expect the two ranges to line up.
              // Since yours don't seem to I find it hard to verify you
              // are doing the maths correctly.

Also by using this convention you get rid of a lot of extra (+1) and (-1) from your code and it looks neater.
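To make the half-open convention concrete, here is a minimal sketch of the same merge sort written over ranges [p, r). The function names are made up for illustration, and it uses a single scratch buffer instead of the separate L and R copies, which is a deliberate simplification rather than the question's approach.

```cpp
#include <cstddef>
#include <vector>

// Merge the sorted runs A[p, q) and A[q, r) through a scratch buffer.
void merge_half_open(std::vector<int>& A, std::vector<int>& scratch,
                     std::size_t p, std::size_t q, std::size_t r)
{
    std::size_t i = p, j = q, k = 0;
    while (i < q && j < r)
        scratch[k++] = (A[j] < A[i]) ? A[j++] : A[i++];  // ties take the left run
    while (i < q) scratch[k++] = A[i++];
    while (j < r) scratch[k++] = A[j++];
    for (std::size_t m = 0; m < k; ++m)  // copy the merged run back into A
        A[p + m] = scratch[m];
}

// Sort A[p, r); r is one past the last element.
void merge_sort_half_open(std::vector<int>& A, std::vector<int>& scratch,
                          std::size_t p, std::size_t r)
{
    if (r - p < 2) return;               // 0 or 1 element: already sorted
    std::size_t q = p + (r - p) / 2;     // midpoint without floor() or +1/-1
    merge_sort_half_open(A, scratch, p, q);
    merge_sort_half_open(A, scratch, q, r);
    merge_half_open(A, scratch, p, q, r);
}

// Convenience wrapper that owns the scratch buffer.
void merge_sort_half_open(std::vector<int>& A)
{
    std::vector<int> scratch(A.size());  // sized, not just reserved, so indexing is valid
    merge_sort_half_open(A, scratch, 0, A.size());
}
```

Note that the two sub-ranges share the boundary q, and the length of a range is just r - p.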

Thank you so much for your help, this is great. Another question, I'm reading a lot and I see the concepts of classes and objects a lot, but coming from Mathematica I only think in terms of functions. Is this code too small to explore the idea of objects and classes? — Jason B.
– Jason B., Commented Mar 30, 2015 at 20:11
Sorting does not readily lend itself to classes. You are performing an operation on a container (also known as an algorithm). But you could change your code to use iterators (rather than a container and indexes). Iterators are the glue between containers and algorithms. And you could update your sort so that it can be applied to any type of object (not just integers), which may involve a functor object.
– Loki Astari, Commented Mar 30, 2015 at 20:19
What do you mean that what I wrote isn't legal C++? I'm not even sure what it is that I've written there. When I wrote vector<int> L(&A[p],&A[q+1]); I had looked up how to copy a portion of one vector to another vector, and it led me to that. I don't understand the notation (&A[p],&A[q+1]), specifically what the & signifies and also why it copies A[p...q] and not A[p....q+1].
– Jason B., Commented Mar 31, 2015 at 8:54
I'm confused what you mean by my bounds, and how I could change them. Say I want to loop over the elements of vlist1, I would write for(int k=0; k<=arraylength-1; k++) and then the body of the loop. Do you mean that I should instead write for(int k=0; k<arraylength; k++) or do you mean something more than that? — Jason B.
– Jason B., Commented Mar 31, 2015 at 10:27
@JasonB: The declaration int L[n1+1]; is not legal by the standard. The size of an array must be fixed at compile time (not runtime). What you are using here is a compiler extension (if you check the warnings, the compiler probably warns you about this).
– Loki Astari, Commented Mar 31, 2015 at 21:06
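The iterator-based approach mentioned in the comments above might look something like the following sketch. The name `merge_sort_range` is invented for illustration, and the merge step is delegated to the standard library's std::inplace_merge rather than hand-written, so this is not a drop-in replacement for the question's code. The comparison is passed in as a functor so the same code can sort types other than int.

```cpp
#include <algorithm>
#include <functional>
#include <iterator>

// Sort the half-open range [first, last) using a caller-supplied comparison.
// std::inplace_merge merges the two sorted halves.
template <typename It, typename Compare>
void merge_sort_range(It first, It last, Compare comp)
{
    auto n = std::distance(first, last);
    if (n < 2) return;                  // 0 or 1 element: already sorted
    It mid = std::next(first, n / 2);
    merge_sort_range(first, mid, comp);
    merge_sort_range(mid, last, comp);
    std::inplace_merge(first, mid, last, comp);
}

// Overload that defaults to ascending order.
template <typename It>
void merge_sort_range(It first, It last)
{
    using T = typename std::iterator_traits<It>::value_type;
    merge_sort_range(first, last, std::less<T>());
}
```

The same function then works on a std::vector via v.begin() and v.end(), or on a plain array via a pointer pair such as rlist1 and rlist1 + arraylength.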

Aleksey Demakov · Accepted Answer · 2015-03-31 13:13:11Z

In the array version you allocate your arrays on the stack. If the arrays are too large, you might get a stack overflow.

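In the C++ vector version, std::vector allocates space on the free store. Every call to vmerge constructs two vectors, and sorting arraylength elements makes arraylength - 1 calls to vmerge, so you get about 2 * arraylength vector allocations (which together copy about log(arraylength) * arraylength elements). Additionally you do push_back for both of the vectors, which might double the number of allocations.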

I would suggest that both versions pre-allocate the required additional memory in the main() function and pass it to the merge functions as a parameter.

For C++ vectors you will need to call the reserve() method, so that they have enough capacity from the beginning, without the need to reallocate.
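One detail worth spelling out is the difference between capacity and size. The sketch below uses two hypothetical helper names to contrast them: reserve() only sets aside memory, so the vector is still empty and must be filled with push_back or insert, whereas constructing (or resizing) the vector with a size creates elements that can be indexed directly.

```cpp
#include <cstddef>
#include <vector>

// Buffer with n usable elements: size() == n, so buf[i] is valid for i < n.
std::vector<int> make_sized_buffer(std::size_t n)
{
    return std::vector<int>(n);     // n value-initialized (zero) ints
}

// Empty buffer with room for n elements: size() == 0.
// Elements must be added with push_back/insert; buf[i] is not valid yet.
std::vector<int> make_reserved_buffer(std::size_t n)
{
    std::vector<int> buf;
    buf.reserve(n);                 // capacity() >= n, size() stays 0
    return buf;
}
```

Which one fits depends on how the merge step writes into the buffer: indexed writes such as LA[i] = A[p+i] need the sized form, while clear() followed by insert or push_back works with the reserved form.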

UPDATE: I put together a merge sort implementation for vectors only; the version for arrays could be done using a similar technique.

#include <algorithm>
#include <limits>
#include <stdexcept>
#include <vector>

void vmerge(std::vector<int> &a,
            int p, int q, int r,
            std::vector<int> &aux1,
            std::vector<int> &aux2) {
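  // Copy the half-open ranges a[p, q) and a[q, r). assign() keeps the reserved
  // capacity, and iterators avoid evaluating a[r] when r == a.size().
  aux1.assign(a.begin() + p, a.begin() + q);
  aux2.assign(a.begin() + q, a.begin() + r);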

  int max = std::max(aux1.back(), aux2.back());
  if (max == std::numeric_limits<int>::max())
    throw std::out_of_range("This version of merge algorithm cannot handle INT MAX value");
  aux1.push_back(max + 1);
  aux2.push_back(max + 1);

  int i1 = 0, i2 = 0;
  for (int k = p; k < r; k++) {
    if (aux1[i1] <= aux2[i2])
      a[k] = aux1[i1++];
    else
      a[k] = aux2[i2++];
  }
}

void vmerge_sort_aux(std::vector<int> &a,
                     int p, int r,
                     std::vector<int> &aux1,
                     std::vector<int> &aux2) {
  int n = r - p;
  if (n > 1) {
    int q = p + n / 2;
    vmerge_sort_aux(a, p, q, aux1, aux2);
    vmerge_sort_aux(a, q, r, aux1, aux2);
    vmerge(a, p, q, r, aux1, aux2);
  }
}

void vmerge_sort(std::vector<int> &a) {
  if (a.size() > 1) {
    std::vector<int> aux1;
    std::vector<int> aux2;
    aux1.reserve(a.size() / 2 + 1);
    aux2.reserve(a.size() - (a.size() / 2) + 1);
    vmerge_sort_aux(a, 0, a.size(), aux1, aux2);
  }
}

edited Mar 31, 2015 at 13:13

answered Mar 30, 2015 at 20:51


Aleksey, how would I go about pre-allocating the memory and passing it to the merge functions? I find that if I comment out the line merge_sort(rlist1,0,arraylength-1);, thus only sorting the vector list vlist1, I can set the arraylength very high. So it is the array that is causing me trouble. But I am still interested in doing things more correctly, so I will look at reserve. I had thought that by initializing rlist1 with the new keyword, I was allocating it on the heap instead of the stack. Is this not correct?
– Jason B., Commented Mar 31, 2015 at 10:42
Also, is there a preferred method for growing a vector besides with the push_back function? If I initialized the vector, then gave it a specific size via the reserve() method, do I not need to use push_back?
– Jason B., Commented Mar 31, 2015 at 10:46
A vector has two sizes: the used size and the allocated size (capacity). If you push_back when the used size is equal to the capacity, then a new memory block with larger capacity is allocated from the free store, all vector elements are copied to the new block, the old block is released, and finally the new element is appended to the vector within the new memory block. If you grow your vector one element at a time, this heavy reallocation mechanism might be repeated several times. The way to avoid this is to specify from the very beginning how large your vector is going to grow by calling reserve(FINAL_SIZE).
– Aleksey Demakov, Commented Mar 31, 2015 at 11:04
Yes, it is correct that you allocated rlist1 on the heap (it is often called the free store in C++-related discussion). But inside the merge() function you have these lines: int L[n1+1]; and int R[n2+1];. This is where you allocate arrays on the stack.
– Aleksey Demakov, Commented Mar 31, 2015 at 11:14
Okay, so I can get around that by allocating L and R in the free store within the merge function, the same way I do rlist1 in main. But if I wanted to do it the way you mentioned instead, by pre-allocating the memory in main and passing it as a parameter to merge, how would I do that? What do you mean by passing memory as a parameter? Do you mean that L and R should be initialized within main rather than within merge? What is the disadvantage of a function besides main allocating something to the free store?
– Jason B., Commented Mar 31, 2015 at 11:34
