The idea was noble, and I was quite happy with the first development candidate. I especially liked the idea of the data structure being an object, for both schematic and semantic reasons. I also thought I would finally get rid of all those boring extra lines of code that plain arrays require, and that routing every call through the class accessors (push, at) would cost only a small performance penalty. However, as I was prototyping Monte Carlo basket equity option pricing and working with large matrices, I quickly realized that the cost of matrix operations was far too high.
Testing processing times
The issue bothered me so much that I finally wanted some hard-tested facts on handling and operating on matrix structures. For this reason, I prepared test cases for three different matrix schemes. In these test cases, the matrix data structure was implemented using:
- Two-dimensional array
- Variant array of double arrays (jagged array)
- Matrix class
In these test cases, a simple procedural program performs the following operations for each of the matrix schemes described above:
- Creates three matrices (A, B, C)
- Fills two matrices (A, B) with random numbers
- Performs matrix multiplication (A, B) and stores the result in matrix (C)
The following chart presents the findings of this experiment.
We can clearly see that a simple two-dimensional array is the most efficient data structure for handling large matrix operations in VBA. There is just no way around this, period. The testing program is presented below. You can copy-paste it into a new standard VBA module if you are interested in running it on your own laptop. Remember to add a reference to the Microsoft Scripting Runtime library.
Additionally, I tested the MMULT worksheet function with simple two-dimensional arrays. This method is only marginally more efficient than multiplying two-dimensional arrays with the code provided above (testRun_2DArray). Moreover, there is a limit on the size of the matrices we can feed to this worksheet function, and that limit is surprisingly low. For example, trying to multiply A (10 * 100 000) by B (100 000 * 10) leads to a runtime error.
The chart below presents the results for test cases with small matrices, including the test case for the MMULT worksheet function. For each matrix scheme, the number of rows in matrix B (columns in A) was varied from 1 000 to 65 000, while the number of columns in matrix B (rows in A) was held constant at 10. For example, in the first calculation, we multiplied matrix A (10 * 1 000) by matrix B (1 000 * 10) and received matrix C (10 * 10). In the second calculation, the matrix dimensions were A (10 * 2 000) and B (2 000 * 10), and we again received matrix C (10 * 10), and so on.
The direction of the results is the same as with large matrices. Using the MMULT worksheet function is the most efficient choice, but only marginally better than using simple two-dimensional arrays. Using the Matrix wrapper class for small matrix operations can still be seen as a reasonable choice, since the time lost compared to the more efficient choices is, after all, relatively small.
Just out of curiosity, I wanted to compare the VBA matrix operation results with the corresponding C++ results. For this, I used dynamically allocated arrays in C++. Otherwise, the testing program was basically the same as in the VBA cases: allocate memory for the arrays, fill them with random numbers, perform the matrix multiplication and finally release the allocated memory. Elapsed time was recorded only for the actual matrix multiplication. The chart below presents the results.
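For readers who want to reproduce the C++ side, here is a minimal sketch of the steps described above. It is not the original test program: the names TimingResult, fillRandom, multiply and timeMultiplication are hypothetical, and the flat row-major layout, the std::mt19937 generator and the std::chrono::steady_clock timer are all assumptions on my part. The checksum is computed only so that an optimizing compiler cannot discard the multiplication.

```cpp
#include <chrono>
#include <cstddef>
#include <random>

// Hypothetical result type: elapsed seconds plus a checksum of matrix C.
struct TimingResult {
    double seconds;
    double checksum;
};

// Fill n values with uniform random numbers in [0, 1).
void fillRandom(double* m, std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    for (std::size_t i = 0; i < n; ++i) m[i] = dist(gen);
}

// Naive triple-loop product on row-major arrays:
// C (rows x cols) = A (rows x inner) * B (inner x cols).
void multiply(const double* a, const double* b, double* c,
              std::size_t rows, std::size_t inner, std::size_t cols) {
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < inner; ++k)
                sum += a[i * inner + k] * b[k * cols + j];
            c[i * cols + j] = sum;
        }
    }
}

// Allocate, fill, multiply and release. Only the multiplication is timed.
TimingResult timeMultiplication(std::size_t rows, std::size_t inner,
                                std::size_t cols) {
    double* a = new double[rows * inner];
    double* b = new double[inner * cols];
    double* c = new double[rows * cols];
    fillRandom(a, rows * inner, 1u);
    fillRandom(b, inner * cols, 2u);

    auto start = std::chrono::steady_clock::now();
    multiply(a, b, c, rows, inner, cols);
    auto stop = std::chrono::steady_clock::now();

    // Sum C after the timer has stopped, so the result is actually used.
    double checksum = 0.0;
    for (std::size_t i = 0; i < rows * cols; ++i) checksum += c[i];

    delete[] a;
    delete[] b;
    delete[] c;
    return {std::chrono::duration<double>(stop - start).count(), checksum};
}
```

Calling timeMultiplication(10, 1000, 10) would correspond to the first small-matrix case, A (10 * 1 000) times B (1 000 * 10). Absolute timings will of course depend on the compiler, optimization flags and machine, so they should not be expected to match the chart.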
In a nutshell, the average efficiency ratio (VBA processing time / C++ processing time) is 5.24 for this experiment sample. Moreover, C++ can handle larger arrays than VBA, which I attribute to the C++ arrays being allocated dynamically from the heap.
So, for any large and time-critical matrix operations performed in VBA, a simple two-dimensional array is the most efficient data structure VBA can provide. For small matrix operations, arrays wrapped in a class can still be used. For real hardcore calculations (very large matrices, fast processing times), VBA is unfortunately not an efficient tool.
Life goes on - have a great weekend!