Quicksort

Quicksort is one of the most widely used sorting algorithms today. Given its technological impact, it's good to know how it works.

Fundamentally, Quicksort is akin to mergesort, in that it partitions the set into two pieces and then recursively sorts each piece. Mergesort simply splits the set into two equal-sized pieces, which has the side effect of requiring a non-trivial merging process to rejoin the sorted sub-pieces.

Rather than split the set (array) into two arbitrary, equal-sized pieces, Quicksort divides the set into two subsets: one contains the values that are smaller than a given "pivot value", and the other contains the values that are greater than the pivot. Some implementations put the pivot in the set of smaller numbers, some put it in the set of larger numbers, and still others keep it completely separate.

Joining the sorted subsets is trivial because we already know that every value in one set is no larger than every value in the other.

Which value should we use as the pivot? Consider this: if the elements are in random order, then on average a randomly chosen element sits in the middle of the sorted order (its expected rank is the middle one, so it stands in for the median rather than the mean). Thus, on average, half of the values in the set are smaller than any randomly chosen value and half are larger. What does this all mean? It means that, for randomly ordered data, it doesn't matter which element we choose as our pivot, because on average half of the values will be larger than our pivot and half will be smaller, which is what we want. So why not pick the first element? It's statistically no different from any other element, and it's really easy to find!
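The averaging argument above can be checked directly. The following is a minimal sketch, not part of the original notes; the class name `PivotRankDemo` and its helper methods are made up for illustration. It tries every element as the pivot and averages how many values fall below it. For N distinct values the average is exactly (N - 1) / 2.

```java
class PivotRankDemo {
  // Counts how many elements of a are strictly smaller than a[index].
  static int countSmaller(int[] a, int index) {
    int smaller = 0;
    for (int v : a)
      if (v < a[index]) smaller++;
    return smaller;
  }

  // Averages countSmaller over every possible pivot position.
  static double averageSmaller(int[] a) {
    long total = 0;
    for (int i = 0; i < a.length; i++)
      total += countSmaller(a, i);
    return (double) total / a.length;
  }

  public static void main(String[] args) {
    int[] a = {42, 7, 19, 3, 88, 61, 25};
    // 7 distinct values: ranks 0..6 sum to 21, so the average is 3.0
    System.out.println(averageSmaller(a)); // prints 3.0
  }
}
```

Any single pivot can still be far from the middle (here, choosing 3 puts nothing on the "smaller" side); only the average is balanced.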

On average, Quicksort will be faster than mergesort because partitioning the set into "smaller" and "larger" subsets is less computationally expensive than mergesort's generalized merge operation. On the other hand, if the set is pathological for the chosen pivot, e.g. it's already sorted and the first element is used as the pivot, Quicksort degenerates to O(N^2) behavior, while mergesort is always O(N*log(N)).
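To make the pathological case concrete, here is a rough sketch that counts how many elements a first-element-pivot Quicksort inspects while partitioning. The class `QuicksortCost` and its `examined` counter are illustrative assumptions, not code from this page. On an already sorted list of N distinct values, each call peels off only the pivot, so the count is N + (N-1) + ... + 1 = N(N+1)/2.

```java
import java.util.*;

class QuicksortCost {
  static long examined = 0; // elements inspected during partitioning

  static List<Integer> sort(List<Integer> a) {
    if (a.isEmpty()) return new ArrayList<>();
    int pivot = a.get(0); // first element as pivot
    List<Integer> left = new ArrayList<>();
    List<Integer> mid = new ArrayList<>();
    List<Integer> right = new ArrayList<>();
    for (int v : a) {
      examined++;
      if (v < pivot) left.add(v);
      else if (v > pivot) right.add(v);
      else mid.add(v);
    }
    List<Integer> result = sort(left);
    result.addAll(mid);
    result.addAll(sort(right));
    return result;
  }

  // Resets the counter, sorts, and reports how many elements were inspected.
  static long cost(List<Integer> input) {
    examined = 0;
    sort(input);
    return examined;
  }

  public static void main(String[] args) {
    int n = 1000;
    List<Integer> sorted = new ArrayList<>();
    for (int i = 0; i < n; i++) sorted.add(i);
    List<Integer> shuffled = new ArrayList<>(sorted);
    Collections.shuffle(shuffled, new Random(1)); // fixed seed, arbitrary choice
    System.out.println("sorted:   " + cost(sorted));   // n(n+1)/2 = 500500
    System.out.println("shuffled: " + cost(shuffled)); // much smaller; exact value depends on the shuffle
  }
}
```

The sorted input also drives the recursion N levels deep, which is why very large sorted inputs can overflow the call stack with this pivot choice.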

Here's a very basic implementation of Quicksort in an imperative coding style:

import java.util.*;
class QuicksortBasic implements QuickSort {
  public ArrayList<Integer> sort(ArrayList<Integer> a) {

    if (a.isEmpty()) return new ArrayList<Integer>();

    ArrayList<Integer> left = new ArrayList<Integer>();
    ArrayList<Integer> mid = new ArrayList<Integer>();
    ArrayList<Integer> right = new ArrayList<Integer>();

    int pivot = a.get(0); // Use element 0 as pivot

    for (Integer i : a) {
      if (i < pivot)
        left.add(i);
      else if (i > pivot)
        right.add(i);
      else
        mid.add(i);
    }

    ArrayList<Integer> left_s = sort(left);
    ArrayList<Integer> right_s = sort(right);

    left_s.addAll(mid);
    left_s.addAll(right_s);

    return left_s;
  }
}

It is possible to optimize this code further, for example by partitioning the array in place instead of allocating three new lists at every level of recursion.
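One common optimization is to partition in place rather than build new lists. The sketch below is one possible approach, not necessarily the optimization the author had in mind. It works on an `int[]` and uses a three-way partition that mirrors the left/mid/right split of the basic version; the class name `QuicksortInPlace` is made up for this example.

```java
import java.util.Arrays;

class QuicksortInPlace {
  static void sort(int[] a) {
    sort(a, 0, a.length - 1);
  }

  private static void sort(int[] a, int lo, int hi) {
    if (lo >= hi) return; // zero or one element: already sorted
    int pivot = a[lo];    // first element as pivot, as in the basic version
    int lt = lo, i = lo + 1, gt = hi;
    // Loop invariant: a[lo..lt-1] < pivot, a[lt..i-1] == pivot, a[gt+1..hi] > pivot
    while (i <= gt) {
      if (a[i] < pivot) swap(a, lt++, i++);
      else if (a[i] > pivot) swap(a, i, gt--);
      else i++;
    }
    sort(a, lo, lt - 1); // values smaller than the pivot
    sort(a, gt + 1, hi); // values larger than the pivot
  }

  private static void swap(int[] a, int i, int j) {
    int t = a[i];
    a[i] = a[j];
    a[j] = t;
  }

  public static void main(String[] args) {
    int[] a = {5, 3, 8, 3, 1, 9, 2};
    sort(a);
    System.out.println(Arrays.toString(a)); // [1, 2, 3, 3, 5, 8, 9]
  }
}
```

This avoids the per-call list allocations and the boxing of `Integer` values, but it keeps the same first-element pivot, so it has the same worst case on already sorted input.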

From an OO sorting perspective, Quicksort is a hard-split, easy-join example of Merritt's sorting theorem.

/**
 * A concrete sorter that uses the QuickSort method.
 */
public class QuickSorter extends ASorter
{

 /**
  * The constructor for this class.
  * @param iCompareOp The comparison strategy to use in the sorting.
  */
 public QuickSorter(AOrder iCompareOp)
 {
  super(iCompareOp);
 }
 /**
  * Splits A[lo:hi] into A[lo:s-1] and A[s:hi] where s is the returned value of this function.
  * This method places all values greater than the key at A[lo] at indices at or above the split index
  * and all values less than the key at indices below the split index.
  * @param A the array A[lo:hi] to be sorted.
  * @param lo the low index of A.
  * @param hi the high index of A.
  * @return The split index.
  */
 protected int split(Object[] A, int lo, int hi)
 {
      Object key = A[lo];
      int lx = lo; // left index.
      int rx = hi; // right index.
      // Invariant 1: key <= A[rx+1:hi].
      // Invariant 2: A[lo:lx-1] <= key.
      // Invariant 3: there exists ix in [lo:rx] such that A[ix] <= key.
      // Invariant 4: there exists jx in [lx:hi] such that key <= A[jx].
      while (lx <= rx)
      {
         while (aOrder.lt(key, A[rx])) // will terminate due to invariant 3.
         {
            rx--;  // Invariant 1 is maintained.
         } // A[rx] <= key <= A[rx+1:hi]; also  invariant 0, lx <= rx.

         while (aOrder.lt(A[lx], key)) // will terminate due to invariant 4.
         {
            lx++;  // Invariant 2 is maintained.
         } // A[lo:lx-1] <= key <= A[lx]

         if (lx <= rx)
         {
            // swap A[lx] with A[rx]:
            Object temp = A[lx];
            A[lx] = A[rx]; // invariant 3 is maintained.
            A[rx] = temp;  // invariant 4 is maintained.
            rx--;  // invariant 1 is maintained.
            lx++;  // invariant 2 is maintained.
         }
      } // rx < lx and A[lo:lx-1] <= key <= A[rx+1:hi].
      return lx;

 }

 /**
  * Joins sorted A[lo:s-1] and sorted A[s:hi] into A[lo:hi].
  * This method does nothing, as the sub-arrays are already in proper order.
  * @param A A[lo:s-1] and A[s:hi] are sorted.
  * @param lo the low index of A.
  * @param s the split index, as returned by split.
  * @param hi the high index of A.
  */
 protected void join(Object[] A, int lo, int s, int hi)
 {
 }
}

The split operation is just the partitioning of the array into two parts, one holding values no larger than the pivot and one holding values no smaller than the pivot. The join operation is simply a no-op because the two sorted sub-arrays are disjoint and contiguous, so they are already in order.
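The `ASorter` and `AOrder` classes are not shown on this page, so the following is only a guess at how the split/join template might fit together. `SplitJoinSorter` is a hypothetical stand-in for `ASorter`, and a standard `java.util.Comparator` stands in for `AOrder`. The template method splits, recursively sorts both halves, and then joins; the Quicksort subclass reuses the split logic from above and leaves join empty.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical stand-in for ASorter: sort = split, recurse on both parts, join.
abstract class SplitJoinSorter<T> {
  protected final Comparator<T> order; // stand-in for AOrder

  SplitJoinSorter(Comparator<T> order) {
    this.order = order;
  }

  final void sort(T[] a, int lo, int hi) {
    if (lo < hi) {
      int s = split(a, lo, hi);
      sort(a, lo, s - 1);
      sort(a, s, hi);
      join(a, lo, s, hi);
    }
  }

  protected abstract int split(T[] a, int lo, int hi);

  protected abstract void join(T[] a, int lo, int s, int hi);
}

// Hard split, easy join: all the work happens in split.
class QuickSplitJoin<T> extends SplitJoinSorter<T> {
  QuickSplitJoin(Comparator<T> order) {
    super(order);
  }

  protected int split(T[] a, int lo, int hi) {
    T key = a[lo];
    int lx = lo, rx = hi;
    while (lx <= rx) {
      while (order.compare(key, a[rx]) < 0) rx--; // skip values greater than key
      while (order.compare(a[lx], key) < 0) lx++; // skip values less than key
      if (lx <= rx) {
        T tmp = a[lx];
        a[lx] = a[rx];
        a[rx] = tmp;
        lx++;
        rx--;
      }
    }
    return lx;
  }

  protected void join(T[] a, int lo, int s, int hi) {
    // no-op: a[lo:s-1] and a[s:hi] are already in order relative to each other
  }
}

class SplitJoinDemo {
  public static void main(String[] args) {
    Integer[] a = {5, 3, 8, 3, 1, 9, 2};
    new QuickSplitJoin<Integer>(Comparator.naturalOrder()).sort(a, 0, a.length - 1);
    System.out.println(Arrays.toString(a)); // [1, 2, 3, 3, 5, 8, 9]
  }
}
```

Under the same template, a mergesort would be the mirror image: a trivial split that returns the midpoint and a join that does the merging.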

Also download the lecture notes on this subject, which include an interesting decomposition of the Quicksort problem into an architecture that can be run efficiently in parallel on multi-core processors.

Note:  To analyze the performance of the parallel algorithm, use an input ArrayList that gets sorted in a few seconds on your machine.
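The decomposition from the lecture notes is not reproduced here. Purely as a generic illustration of why Quicksort parallelizes well, the sketch below hands the two independent halves produced by a split to Java's fork/join framework. The class name `ParallelQuicksort` and the `SEQUENTIAL_CUTOFF` value are assumptions made for this example, and small ranges simply fall back to `Arrays.sort`.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class ParallelQuicksort extends RecursiveAction {
  private static final int SEQUENTIAL_CUTOFF = 1_000; // assumed threshold; tune per machine
  private final int[] a;
  private final int lo, hi;

  ParallelQuicksort(int[] a, int lo, int hi) {
    this.a = a;
    this.lo = lo;
    this.hi = hi;
  }

  @Override
  protected void compute() {
    if (hi - lo < SEQUENTIAL_CUTOFF) {
      Arrays.sort(a, lo, hi + 1); // small range: sort sequentially (toIndex is exclusive)
      return;
    }
    int s = split(a, lo, hi);
    // The two halves touch disjoint index ranges, so they can run concurrently.
    invokeAll(new ParallelQuicksort(a, lo, s - 1), new ParallelQuicksort(a, s, hi));
  }

  // Same partitioning scheme as the split method above, specialized to int[].
  static int split(int[] a, int lo, int hi) {
    int key = a[lo];
    int lx = lo, rx = hi;
    while (lx <= rx) {
      while (key < a[rx]) rx--;
      while (a[lx] < key) lx++;
      if (lx <= rx) {
        int tmp = a[lx];
        a[lx] = a[rx];
        a[rx] = tmp;
        lx++;
        rx--;
      }
    }
    return lx;
  }

  static void sort(int[] a) {
    ForkJoinPool.commonPool().invoke(new ParallelQuicksort(a, 0, a.length - 1));
  }

  public static void main(String[] args) {
    int[] a = new Random(42).ints(5_000_000).toArray(); // random input; size is arbitrary
    long start = System.nanoTime();
    sort(a);
    System.out.println("sorted in " + (System.nanoTime() - start) / 1_000_000 + " ms");
  }
}
```

Because the pivot is still the first element, already sorted input makes the splits maximally unbalanced, which removes most of the parallelism and deepens the recursion; random input is the fairer test.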