This is a VB6 static (.BAS) module for sorting a Variant array of Variant arrays using the Heapsort algorithm. It could easily be converted to a class (.CLS) if desired.
It should be usable in VB5 and VBA, but this hasn't been tested. It should even be convertible to VBScript, though with much reduced performance of course.
Why VarVars?
When I need to work on large sets of row/col-based (tabular) data and I can't use a database or ADO Recordset, I find Variant arrays of Variant arrays handy. They are much handier than a 2-D array of String because I can use typed data, far less clunky than working with inflexible UDT arrays where you can't iterate over the fields, and even more useful than 2-D Variant arrays.
Typed data instead of String for everything can be very useful. In a field typed as Single, Double, Currency, etc., 2 and 2.0 compare as equal, which they would not as Strings.
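As a quick illustration of the difference, here is a minimal sketch you can paste into any module and run from the Immediate window. CompareDemo is just a made-up name and is not part of HeapsortVarVar.bas.

```vb
Option Explicit

' Hypothetical demo routine: shows how String and numeric Variants compare.
Public Sub CompareDemo()
    Dim A As Variant, B As Variant

    A = "2": B = "2.0"              ' both Variants hold Strings
    Debug.Print A = B               ' False: compared character by character

    A = Val("2"): B = Val("2.0")    ' both Variants now hold Doubles
    Debug.Print A = B               ' True: compared as numbers

    Debug.Print "10" < "9"          ' True: "1" sorts before "9" as text
    Debug.Print 10! < 9!            ' False: Single values compare numerically
End Sub
```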
For another thing you can build a "row" of fields easily just using the VB Array() function, then assign it to a "row" slot in your Variant rows array.
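Building a VarVar this way might look something like the sketch below. The function name and the sample values are invented for illustration, and it assumes the default Option Base 0 so that Array() produces zero-based rows.

```vb
Option Explicit

' Hypothetical helper: returns a small VarVar, i.e. an outer Variant
' array whose elements are Variant arrays ("rows") made with Array().
Public Function BuildSampleRows() As Variant
    Dim Rows() As Variant

    ReDim Rows(0 To 2)
    ' Each row holds typed fields: a String, a Single, and a Date.
    Rows(0) = Array("Widget", 2!, #1/15/2001#)
    Rows(1) = Array("Gadget", 2.5!, #3/2/2001#)
    Rows(2) = Array("Gizmo", 10!, #7/4/2000#)

    BuildSampleRows = Rows
End Function
```

A single field is then read with two sets of parentheses, row first and column second, e.g. Rows(1)(0) for the first field of the second row.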
Why Heapsort?
The Heapsort is reasonably speedy, doesn't require additional space, and performs well even in worst-case input sequences.
But the main reason is that it is easy to modify to perform the sort in several steps. This makes it possible to use a Timer control to drive a "background" sort that doesn't cause your programs to become unresponsive. This does add to the time a little, but the Timer.Interval need only be 16 to 32 milliseconds for most purposes. It also allows you to update progress bars, handle "cancel" button clicks, etc.
This has always been the preferred way to handle this in VB6, as the manual states in many places. DoEvents() calls have their place, but DoEvents() is evil, should rarely be used, and even then only used quite carefully.
However, both a synchronous sort and a quantized "pseudo-async" sort are provided here.
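To give a feel for why Heapsort splits into steps so easily, here is a rough sketch of a resumable version. This is not the code in HeapsortVarVar.bas: StepSortStart, StepSortContinue, SiftDown, and MaxOps are all hypothetical names, and the sketch assumes a zero-based outer array with at least one row, an ascending sort, and MaxOps of 1 or more.

```vb
Option Explicit

' Hypothetical sketch of a heapsort that can be resumed between calls.
Private mBuilding As Boolean   ' True while the heap is still being built
Private mIndex As Long         ' next node to sift, or last unsorted slot

' Call once before each sort.
Public Sub StepSortStart(Rows() As Variant)
    mBuilding = True
    mIndex = (UBound(Rows) + 1) \ 2 - 1   ' last node that has a child
End Sub

' Does at most MaxOps sift operations; returns False once sorting is done.
Public Function StepSortContinue(Rows() As Variant, ByVal Col As Long, _
                                 ByVal MaxOps As Long) As Boolean
    Dim Ops As Long
    Dim Temp As Variant

    Do While Ops < MaxOps
        If mBuilding Then
            If mIndex < 0 Then
                mBuilding = False          ' heap complete, start extracting
                mIndex = UBound(Rows)
            Else
                SiftDown Rows, Col, mIndex, UBound(Rows)
                mIndex = mIndex - 1
                Ops = Ops + 1
            End If
        Else
            If mIndex < 1 Then Exit Function   ' finished: returns False
            ' Move the largest remaining row to the end of the unsorted part.
            Temp = Rows(0): Rows(0) = Rows(mIndex): Rows(mIndex) = Temp
            mIndex = mIndex - 1
            SiftDown Rows, Col, 0, mIndex
            Ops = Ops + 1
        End If
    Loop
    StepSortContinue = True                    ' more work remains
End Function

' Restores the max-heap property for the subtree rooted at Root.
Private Sub SiftDown(Rows() As Variant, ByVal Col As Long, _
                     ByVal Root As Long, ByVal Last As Long)
    Dim Child As Long
    Dim Temp As Variant

    Do While Root * 2 + 1 <= Last
        Child = Root * 2 + 1
        If Child < Last Then
            If Rows(Child)(Col) < Rows(Child + 1)(Col) Then Child = Child + 1
        End If
        If Rows(Root)(Col) >= Rows(Child)(Col) Then Exit Do
        Temp = Rows(Root): Rows(Root) = Rows(Child): Rows(Child) = Temp
        Root = Child
    Loop
End Sub

' In a Form, a Timer could then drive the sort (mRows is a form-level VarVar):
'   Private Sub Timer1_Timer()
'       If Not StepSortContinue(mRows, 1, 2000) Then Timer1.Enabled = False
'   End Sub
```

All the state needed to resume is one flag and one index, which is what makes this so much less painful than trying to pause a recursive sort.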
Usage
Basically, just add HeapsortVarVar.bas to your projects.
Then when you have a Variant array of Variant arrays to sort, call the SortBy() subroutine (or the QuantumSortBy() function until it returns False).
You pass the column index to sort on, the outer (rows) array to be sorted, and an optional Boolean Descending value (True or False, default False) to specify the direction of the sort.
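A call might look roughly like the sketch below. The argument order is assumed from the description above and should be checked against the comments in HeapsortVarVar.bas. SortDemo and the sample rows are invented for illustration, and the module must already be in the project for this to compile.

```vb
Option Explicit

' Hypothetical usage sketch; requires HeapsortVarVar.bas in the project.
Public Sub SortDemo()
    Dim Rows() As Variant

    ReDim Rows(0 To 2)
    Rows(0) = Array("Widget", 2.5!)
    Rows(1) = Array("Gadget", 10!)
    Rows(2) = Array("Gizmo", 2!)

    ' Assumed order: column index, rows array, optional Descending flag.
    SortBy 1, Rows            ' ascending on the Single in column 1
    SortBy 0, Rows, True      ' descending on the String in column 0
End Sub
```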
QuantumSortBy() has two more parameters but the source code comments should explain those. These need to be initialized before the first call of each sort to be performed.
There are no extra dependencies.
Issues
There are sorts that can be faster; Quicksort is a popular one. However, they usually aren't enough faster to warrant sacrificing some of the things that are easy to do with Heapsort (e.g. a quantized pseudo-async version).
If you create a Quicksort adapted for this I'd love to see it. Right now HeapsortVarVar works plenty fast for me, but more speed without additional pain is always appreciated.
Bugs. As far as I can tell there aren't any, but if you find some I'd like to know about them.
I find this very versatile, and once "known bug free" it should make a useful and easily reusable sort for VB users. While you can use it for sorting single items, it would be better to just create a new version tailored to work on a simple single-valued 1-D array.
Demos
There are two in the ZIP archive attachment. They are "ready to run" with a sample input file included, though I'd try testing after compiling to an EXE first. There is more data included than it appears: it ZIPped fairly small because it's about 1500 records copied into the file multiple times. The sort isn't too bad even in the IDE; however, populating and re-populating the flexgrid can take a while.
Yes, I know the flexgrids can sort, but here it is just being used for demo purposes.
HeapsortDemo
This is a GUI program that loads and parses the sample data into a VarVar and displays it in an MSHFlexGrid. Then you can click or shift-click the column headers to sort and redisplay the VarVar contents.
Yes, there are a couple of UI quirks, related to the column click selecting cells in the grid. The program tries to clear these selections but it doesn't always succeed. The grid is merely being used for demo purposes here anyway. But you may already know of a fix for that if you care.
DeleteDups
A simpler program without any Forms that loads up the sample data, then sorts on the second column as a Single value, then writes a new file including only the first row for each unique "second column" value.
It uses MsgBox calls to display when it reaches each phase and the time each phase takes.
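The de-duplication step depends on the sort having put equal values next to each other. The sketch below shows the general idea under that assumption. It is not the DeleteDups source: FirstRowPerKey is a hypothetical name, it collects the kept rows into a new VarVar instead of writing a file, and it assumes a zero-based array with at least one row that is already sorted on Col.

```vb
Option Explicit

' Hypothetical sketch: keeps the first row of each run of equal values
' in column Col. Rows must already be sorted on that column.
Public Function FirstRowPerKey(Rows() As Variant, ByVal Col As Long) As Variant
    Dim Result() As Variant
    Dim Count As Long
    Dim I As Long
    Dim Keep As Boolean

    ReDim Result(0 To UBound(Rows))
    For I = 0 To UBound(Rows)
        If I = 0 Then
            Keep = True     ' the first row always starts a new run
        Else
            ' VB evaluates both sides of Or, so test I = 0 separately above.
            Keep = (Rows(I)(Col) <> Rows(I - 1)(Col))
        End If
        If Keep Then
            Result(Count) = Rows(I)
            Count = Count + 1
        End If
    Next

    ReDim Preserve Result(0 To Count - 1)   ' trim the unused slots
    FirstRowPerKey = Result
End Function
```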