335 lines
12 KiB
Plaintext
335 lines
12 KiB
Plaintext
Note: At the end of the document is a discussion of the various techniques,
|
|
a short evaluation of the goods and bads, and also a small enhancement
|
|
to the Bubble-Sort algorithm called Back-Track (in C).
|
|
- Amit Margalit (9-MAY-1992)
|
|
|
|
Information
|
|
ÄÄÄÄÄÄÄÄÄÄÄ
|
|
|
|
By definition, sort routines place un-ordered numeric or string data into
|
|
ascending or descending order. Many sort algorithms have been discovered,
|
|
and delineating their characteristics is a favorite pastime of Pascal
|
|
tutorials.
|
|
|
|
Bubble Sort
|
|
|
|
The bubble sort algorithm derives its name from the way it systematically
|
|
floats the largest values in an array to the top, like bubbles in a bottle
|
|
of soda.
|
|
|
|
The algorithm goes like this: starting at the bottom of the array, compare
|
|
each adjacent pair of values; if the lower member of the pair is greater
|
|
than the upper member, swap. Once you've worked your way to the top in this
|
|
fashion, theArray[ARRAYSIZE] is known to be the largest number in the
|
|
array. Now repeat the process, but this time stop one element short of the
|
|
top (since we already know this value is the largest). Repeat this process
|
|
until finally on the last pass you examine only the bottom two members of
|
|
the array.
|
|
|
|
procedure BubbleSort (var theArray: arrayType);
|
|
var
|
|
last, current : integer;
|
|
|
|
begin
|
|
for last := ARRAYSIZE downto 2 do
|
|
for current := 1 to last-1 do
|
|
if theArray[current] > theArray[current+1] then
|
|
Swap(theArray[current], theArray[current+1]);
|
|
end; { BubbleSort }
|
|
|
|
Insertion Sort
|
|
|
|
The insertion sort is on the simple end of the scale, and resembles the way
|
|
people sort a bridge hand. Working from one end of the array to the other,
|
|
each value is placed in its correct position relative to the values that
|
|
have already been sorted.
|
|
|
|
The routines shown here are designed to process arrays of integers into
|
|
ascending order (i.e., theArray[1] gets the lowest value;
|
|
theArray[ARRAYSIZE] gets the highest), although with only trivial changes
|
|
the algorithms could easily work with string data and/or descending order.
|
|
|
|
const
|
|
ARRAYSIZE = 100;
|
|
|
|
type
|
|
arrayType = array[1..ARRAYSIZE] of integer;
|
|
|
|
procedure InsertionSort (var theArray: arrayType );
|
|
var
|
|
newValue, newPos, currentPos: integer;
|
|
begin
|
|
for newPos := 2 to ARRAYSIZE do
|
|
begin
|
|
newValue := theArray[newPos];
|
|
currentPos := newPos;
|
|
while (currentPos > 1) and (theArray[currentPos-1] > newValue) do
|
|
begin
|
|
theArray[currentPos] := theArray[currentPos-1];
|
|
currentPos := currentPos - 1;
|
|
end;
|
|
theArray[currentPos] := newValue;
|
|
end;
|
|
end; { InsertionSort }
|
|
|
|
Shell Sort
|
|
|
|
The Shell Sort was invented by a computer scientist named Donald Shell. His
|
|
sorting method compares and swaps numbers the greatest distance from each
|
|
other first, shrinking the distance between numbers until you return to the
|
|
ones in closest proximity to each other.
|
|
|
|
procedure ShellSort(var List: ListType; ListMax: integer);
|
|
label
|
|
ExitLoop;
|
|
|
|
var
|
|
Indx, Jndx, TheVal, Incr : integer;
|
|
begin
|
|
Incr := 1;
|
|
repeat
|
|
Incr := 3 * Incr + 1;
|
|
until Incr > ListMax;
|
|
repeat
|
|
Incr := Incr div 3;
|
|
for Indx := Incr + 1 to ListMax do
|
|
begin
|
|
TheVal := List[Indx];
|
|
Jndx := Indx;
|
|
while List[Jndx - Incr] > TheVal do
|
|
begin
|
|
List[Jndx] := List[Jndx - Incr];
|
|
Jndx := Jndx - Incr;
|
|
if Jndx <= Incr then
|
|
goto ExitLoop;
|
|
end;
|
|
ExitLoop:
|
|
List[Jndx] := TheVal;
|
|
end;
|
|
until Incr = 1;
|
|
end; { ShellSort }
|
|
|
|
Quicksort
|
|
|
|
King of the sorting algorithms is Quicksort. The gist of Quicksort is
|
|
summarized here (in general it is a recursive algorithm):
|
|
|
|
-- Pick a pivot point in the center of the array
|
|
|
|
-- Partition the array:
|
|
Exchange equal or larger elements (working from the left) with equal or
|
|
smaller elements (working from the right). Repeat this process until the
|
|
array has been organized such that every value left of the pivot point is
|
|
smaller than every value right of the pivot point, and the value at the
|
|
pivot point itself is in exactly the right position.
|
|
|
|
-- If there's more than one element left of the pivot point,
|
|
call Quicksort recursively on the left-hand part of the array
|
|
|
|
-- If there's more than one element right of the pivot point,
|
|
call Quicksort recursively on the right-hand part of the array
|
|
|
|
procedure Quicksort(start, finish: integer);
|
|
{ sorts global variable theArray }
|
|
var
|
|
pivot, left, right : integer;
|
|
begin
|
|
{ set pivot point }
|
|
left := start;
|
|
right := finish;
|
|
pivot := theArray[(start + finish) div 2];
|
|
{ partition }
|
|
repeat
|
|
while theArray[left] < pivot do
|
|
left := left + 1;
|
|
while pivot < theArray[right] do
|
|
right := right - 1;
|
|
if left <= right then
|
|
begin
|
|
Swap( theArray[left], theArray[right] );
|
|
left := left + 1;
|
|
right := right - 1;
|
|
end;
|
|
until right <= left;
|
|
{ sort right and left halves }
|
|
if start < right then QuickSort(start, right);
|
|
if left < finish then QuickSort(left, finish);
|
|
end; { QuickSort }
|
|
|
|
------------------------------------------------------------------------------
|
|
Following are a few additions by Amit Margalit
|
|
------------------------------------------------------------------------------
|
|
|
|
BackTrack - An enhancement of the Bubble-Sort.
|
|
|
|
The idea of BackTrack is simple: You run only once on the entire array, and
|
|
every time you find an element out of place, you search backwards through
|
|
the array to find where is the right place for it.
|
|
|
|
/* Assumes that we are sorting int's */
|
|
|
|
void backtrack(int* array, int nelem)
|
|
{
|
|
int idx,temp;
|
|
void track(int* arr,int cidx);
|
|
|
|
if(nelem == 1)
|
|
return;
|
|
for(idx=1;idx < nelem; idx++) {
|
|
if(*(array+idx) < *(array+idx-1)) {
|
|
track(array,idx);
|
|
}
|
|
}
|
|
}
|
|
|
|
void track(int* array, int idx)
|
|
{
|
|
do {
|
|
swap(*(array+idx) , *(array+idx-1)); /* swap elements */
|
|
idx--; /* decrease index */
|
|
} while ( (idx > 0) && (*(array+idx) < *(array+idx-1)) );
|
|
/* repeat while index is 1 or more, and element still not in place */
|
|
}
|
|
|
|
swap(int* v1, int* v2)
|
|
{
|
|
int temp;
|
|
|
|
temp = *v1;
|
|
*v1 = *v2;
|
|
*v2 = temp;
|
|
}
|
|
|
|
|
|
Merge-Sort
|
|
----------
|
|
Actually, this is the most efficient of all algorithms (where speed is
|
|
concerned...). It needs twice as much storage space. Here are the basics:
|
|
|
|
Note: when you have two already-sorted arrays and you want to merge them into
|
|
a single sorted array, you simply read the first elements, compare them. The
|
|
one that fits to be before the other is sent to the output stream, and a new
|
|
one from its array is read, and the comparison goes on again and again until
|
|
the arrays are merged.
|
|
|
|
- Divide the array into 2 separate ones.
|
|
|
|
- If the number of elements on the left side is more than one, call
|
|
merge-sort in recursion giving it the left array as an array to sort.
|
|
|
|
- Do the same about the right side.
|
|
|
|
- When both side have finished and returned, merge the array on the left
|
|
with the one on the right.
|
|
|
|
Here is a C source (to sort an array of ints):
|
|
|
|
|
|
/* assume lower memory addresses on the left going up to the right */
|
|
|
|
mergesort(int* array, int length)
|
|
{
|
|
int leftsize=length/2;
|
|
int rightsize=length-leftsize;
|
|
int lidx,ridx,nidx;
|
|
int *newarray;
|
|
|
|
if(leftsize>1)
|
|
mergesort(array,leftsize);
|
|
if(rightsize>1)
|
|
mergesort(array+leftsize,rightsize);
|
|
/* both sides are now sorted arrays, so we can merge the two sides */
|
|
|
|
lidx=ridx=nidx=0;
|
|
if((newarray=(int*)malloc(length*sizeof(int)))==NULL)
|
|
error("Out of memory."); /* and terminate */
|
|
for(;;) {
|
|
if(*(array+leftsize+ridx) < *(array+lidx)) {
|
|
*(newarray+nidx) = *(array+leftsize+ridx);
|
|
nidx++; ridx++;
|
|
}
|
|
else {
|
|
*(newarray+nidx) = *(array+lidx);
|
|
nidx++; lidx++;
|
|
}
|
|
if(lidx>leftsize) /* no more? copy the rest */
|
|
for(;ridx<=rightsize;ridx++,nidx++)
|
|
*(newarray+nidx) = *(array+leftsize+ridx);
|
|
if(ridx>leftsize)
|
|
for(;lidx<=rightsize;lidx++,nidx++)
|
|
*(newarray+nidx) = *(array+lidx);
|
|
}
|
|
for(lidx=0;lidx<length;lidx++) /* copy the new array back */
|
|
*(array+lidx) = *(newarray+lidx);
|
|
free(newarray); /* and free the memory */
|
|
}
|
|
|
|
Comparing the algorithms:
|
|
-------------------------
|
|
|
|
Here is the list of the algorithms included in this article with a short note:
|
|
|
|
Bubble-Sort Very inefficient, slow, but very easy to implement.
|
|
Back-Track An improvement of the Bubble Sort.
|
|
Insertion-Sort Also easy to implement, easiest to understand.
|
|
Shell-Sort More efficient, a little hard to understand.
|
|
QuickSort The best. A bit hard to understand and requires a large stack.
|
|
Merge-Sort Always better than QuickSort, but needs twice the memory.
|
|
|
|
In detail:
|
|
|
|
BubbleSort
|
|
----------
|
|
Runs a long time, in little efficiency. Good if you wish to sort a small
|
|
number of elements (up to 20 maybe), don't care much about the time it
|
|
takes, or don't wanna start writing a more sophisticated sort routine. It's
|
|
the simplest one to implement, and has no flow-problems. (i.e. having to
|
|
exit from inside a loop or something like that).
|
|
Time of execution: Depends only on length of array.
|
|
|
|
Back-Track
|
|
----------
|
|
Just a bit better. Runs only once on the entire array, and swaps things to
|
|
their right place. Useful also if you don't wanna start writing a more
|
|
sophisticated sort routine, are sorting a small array, and do care little
|
|
about the execution time.
|
|
Time of execution: Depends on length of array, and its randomness (the more
|
|
the array is already sorted, the less it takes).
|
|
|
|
Insertion-Sort
|
|
--------------
|
|
Is good if you are sorting a linked-list. You really can't do it many other
|
|
ways... You copy the first node of the list to a new location, and read
|
|
through the original list, inserting the nodes in proper order to the new
|
|
list, by just keeping a temporary copy of the node, setting the appropriate
|
|
links of the node in the new list and the node to be inserted correctly.
|
|
Time of execution: Depends on length of array, and the complexity of
|
|
changing the links in the list.
|
|
|
|
Shell-Sort
|
|
----------
|
|
The Shell-Sort use a kind of self-recursion. It runs on the large scale, then
|
|
descends into smaller regions of the array. Each time reducing the gap between
|
|
the elements it is sorting.
|
|
|
|
Quick-Sort
|
|
----------
|
|
Is not simple to implement, but is the most efficient. It is recursive and so
|
|
needs a large stack for a large array to be sorted. The idea is to divide the
|
|
array into 2 halves, picking the value of the element in the middle as a
|
|
pivot-value. Then you go all the way from the left and the right edges, and
|
|
swap any values that ought to be on opposite sides. When you get to the
|
|
middle point, you check to see if there are more than one elements on each
|
|
side, and if there are you call the routine again, sending it the half of the
|
|
array as the whole new array for it to sort. This is done in recursion.
|
|
|
|
Merge-Sort
|
|
----------
|
|
As has been said before, it needs memory twice the size of the array, to hold
|
|
both the array, and the sorted one (for the last pass). Its efficiency is the
|
|
best! Its execution time grows only with the growth of the array size, and is
|
|
not affected by the level of sortedness of the array preceding the sort.
|
|
|
|
Amit Margalit, 11-May-1992
|
|
|