Statistics and Maps
Several basic statistics are calculated when data is imported.
TDataItem has a "Stats" property that returns the calculations at column and table level.
These basic values (such as standard deviation, kurtosis and skewness) are intended to be used by the machine-learning algorithms. Calculating them in advance, at import time, and persisting them together with the data saves time later, because they do not need to be recomputed each time the data is used.
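As a rough illustration of the kind of values involved, the sketch below computes a few of them for a single numeric column in Python. It is not the library's code: the function name `column_stats` is hypothetical, and the choice of population (rather than sample) moments and non-excess kurtosis is an assumption, since the text does not say which definitions are used.

```python
import math

def column_stats(values):
    """Compute basic statistics for one numeric column (hypothetical helper).

    Assumption: population central moments are used; the real
    implementation may use sample formulas or excess kurtosis instead.
    """
    n = len(values)
    if n == 0:
        raise ValueError("column is empty")

    mean = sum(values) / n

    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n

    return {
        "mean": mean,
        "std_dev": math.sqrt(m2),
        # Skewness and kurtosis are undefined for a constant column;
        # this sketch reports 0.0 in that case.
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
    }

stats = column_stats([2, 4, 4, 4, 5, 5, 7, 9])
print(stats["mean"], stats["std_dev"])  # 5.0 2.0
```

Computing such a dictionary once at import time and storing it next to the column is what lets later consumers read the values instead of scanning the data again.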
A "map" of each column is also created. A map is a class that contains a sorted array with the unique element values of a data item, and the frequency of each element (the number of times it appears).
The map's sorted array is used for fast searching of data (using a binary search algorithm).
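The idea of a map can be sketched in a few lines of Python. This is only an illustration of the data structure described above, not the library's implementation: the class name `ColumnMap` and its `find` and `frequency` methods are hypothetical names chosen for this example.

```python
from bisect import bisect_left
from collections import Counter

class ColumnMap:
    """Sorted unique values of a column plus how often each one appears.

    Hypothetical illustration; names do not come from the library.
    """

    def __init__(self, values):
        counts = Counter(values)
        # Sorted array of the distinct values in the column.
        self.items = sorted(counts)
        # counts[i] is the number of occurrences of items[i].
        self.counts = [counts[v] for v in self.items]

    def find(self, value):
        """Return the index of value in the sorted array, or -1 if absent."""
        i = bisect_left(self.items, value)  # binary search, O(log n)
        if i < len(self.items) and self.items[i] == value:
            return i
        return -1

    def frequency(self, value):
        """Return how many times value appears in the column."""
        i = self.find(value)
        return self.counts[i] if i >= 0 else 0

fruit = ColumnMap(["pear", "apple", "pear", "fig", "apple", "pear"])
print(fruit.items)              # ['apple', 'fig', 'pear']
print(fruit.frequency("pear"))  # 3
```

Because the unique values are kept sorted, a lookup costs a logarithmic number of comparisons in the count of distinct values rather than a scan over every row.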