Namespace: impurity

Functions

getInformationGainRatioForSplit

getInformationGainRatioForSplit(parentSet, childrenSets, config, splitFn): number

Split quality scoring function for classification trees.

Information gain ratio is similar to information gain, but it penalizes splits that produce many distinct values (such as dates, IDs, or names).
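The penalty can be illustrated with a minimal sketch. It assumes the textbook definition of gain ratio (information gain divided by the entropy of the partition sizes) and works on plain arrays of class labels. The helper names below are hypothetical and are not tree-garden's internals or its `getInformationGainRatioForSplit` signature.

```typescript
// Illustrative sketch only: hypothetical helpers, not tree-garden internals.

// Shannon entropy (in bits) of a list of class labels.
function entropy(labels: string[]): number {
  if (labels.length === 0) return 0;
  const counts: Record<string, number> = {};
  for (const label of labels) {
    counts[label] = (counts[label] || 0) + 1;
  }
  let result = 0;
  for (const key of Object.keys(counts)) {
    const p = counts[key] / labels.length;
    result -= p * (Math.log(p) / Math.LN2);
  }
  return result;
}

// Parent entropy minus the size-weighted entropy of the children.
function informationGain(parent: string[], children: string[][]): number {
  const weightedChildEntropy = children.reduce(
    (sum, child) => sum + (child.length / parent.length) * entropy(child),
    0
  );
  return entropy(parent) - weightedChildEntropy;
}

// Split information: entropy of the partition sizes themselves.
// It grows with the number of branches, which is what penalizes ID-like splits.
function splitInformation(parent: string[], children: string[][]): number {
  let result = 0;
  for (const child of children) {
    if (child.length === 0) continue;
    const fraction = child.length / parent.length;
    result -= fraction * (Math.log(fraction) / Math.LN2);
  }
  return result;
}

function informationGainRatio(parent: string[], children: string[][]): number {
  const splitInfo = splitInformation(parent, children);
  // Assumption: a split with a single non-empty branch scores 0.
  return splitInfo === 0 ? 0 : informationGain(parent, children) / splitInfo;
}

const parentLabels = ['yes', 'yes', 'no', 'no'];
const twoWaySplit = [['yes', 'yes'], ['no', 'no']];
const idLikeSplit = [['yes'], ['yes'], ['no'], ['no']];

const twoWayRatio = informationGainRatio(parentLabels, twoWaySplit);
const idLikeRatio = informationGainRatio(parentLabels, idLikeSplit);
console.log(twoWayRatio, idLikeRatio);
```

Both splits produce perfectly pure children, so their plain information gain is identical. The one-sample-per-branch split gets a lower ratio because its split information is larger.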

Remarks

Higher score means a better split.

Parameters

Name Type
parentSet TreeGardenDataSample[]
childrenSets Object
config TreeGardenConfiguration
splitFn (currentSample: TreeGardenDataSample) => any

Returns

number

Defined in

impurity/entropy.ts:63


getInformationGainForSplit

getInformationGainForSplit(parentSet, childrenSets, config, _splitFn): number

Split quality scoring function for classification trees.

It measures the decrease in entropy of the child data sets compared to the parent data set. Low entropy means a pure data set, so a decrease in entropy is an increase in purity: the larger the decrease, the better the split.
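A minimal sketch of that calculation, assuming Shannon entropy in bits and children weighted by their share of the parent set. The functions `labelEntropy` and `entropyDecrease` are hypothetical names used for illustration; they operate on plain label arrays rather than `TreeGardenDataSample` objects.

```typescript
// Illustrative sketch only: hypothetical helpers, not tree-garden internals.

// Shannon entropy (in bits): 0 for a pure set, 1 for an even two-class mix.
function labelEntropy(labels: string[]): number {
  if (labels.length === 0) return 0;
  const counts: Record<string, number> = {};
  for (const label of labels) {
    counts[label] = (counts[label] || 0) + 1;
  }
  let result = 0;
  for (const key of Object.keys(counts)) {
    const p = counts[key] / labels.length;
    result -= p * (Math.log(p) / Math.LN2);
  }
  return result;
}

// Information gain: parent entropy minus the weighted average child entropy.
function entropyDecrease(parent: string[], children: string[][]): number {
  let weightedChildEntropy = 0;
  for (const child of children) {
    weightedChildEntropy += (child.length / parent.length) * labelEntropy(child);
  }
  return labelEntropy(parent) - weightedChildEntropy;
}

const mixedParent = ['yes', 'yes', 'yes', 'no', 'no', 'no'];

// Every child is pure, so all of the parent's entropy is removed.
const pureSplitGain = entropyDecrease(mixedParent, [
  ['yes', 'yes', 'yes'],
  ['no', 'no', 'no'],
]);

// Every child is as mixed as the parent, so nothing is gained.
const uselessSplitGain = entropyDecrease(mixedParent, [
  ['yes', 'no'],
  ['yes', 'no'],
  ['yes', 'no'],
]);

console.log(pureSplitGain, uselessSplitGain);
```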

Remarks

Higher score means a better split.

Parameters

Name Type
parentSet TreeGardenDataSample[]
childrenSets Object
config TreeGardenConfiguration
_splitFn (currentSample: TreeGardenDataSample) => any

Returns

number

Defined in

impurity/entropy.ts:36


getGiniIndexForSplit

getGiniIndexForSplit(parentSet, childrenSets, config, _splitter): number

Split quality scoring function for classification trees.

See Gini impurity.
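For orientation, Gini impurity of a set is one minus the sum of squared class proportions. The sketch below scores a split as the size-weighted average impurity of its children, which is the usual textbook formulation. Whether tree-garden combines the children in exactly this way is an assumption, and `giniImpurity` and `weightedGini` are hypothetical names.

```typescript
// Illustrative sketch only: hypothetical helpers, not tree-garden internals.

// Gini impurity: 0 for a pure set, 0.5 for an even two-class mix.
function giniImpurity(labels: string[]): number {
  if (labels.length === 0) return 0;
  const counts: Record<string, number> = {};
  for (const label of labels) {
    counts[label] = (counts[label] || 0) + 1;
  }
  let sumOfSquares = 0;
  for (const key of Object.keys(counts)) {
    const p = counts[key] / labels.length;
    sumOfSquares += p * p;
  }
  return 1 - sumOfSquares;
}

// Assumed aggregation: average child impurity weighted by child size.
function weightedGini(children: string[][]): number {
  const total = children.reduce((n, child) => n + child.length, 0);
  if (total === 0) return 0;
  return children.reduce(
    (sum, child) => sum + (child.length / total) * giniImpurity(child),
    0
  );
}

const cleanSplitScore = weightedGini([['yes', 'yes'], ['no', 'no']]);
const mixedSplitScore = weightedGini([['yes', 'no'], ['yes', 'no']]);
console.log(cleanSplitScore, mixedSplitScore);
```

The split with pure children gets the lower score, which matches the "lower is better" convention of this function.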

Remarks

Lower score means a better split.

Parameters

Name Type
parentSet TreeGardenDataSample[]
childrenSets Object
config TreeGardenConfiguration
_splitter (currentSample: TreeGardenDataSample) => any

Returns

number

Defined in

impurity/gini.ts:42


getScoreForRegressionTreeSplit

getScoreForRegressionTreeSplit(parentDataSet, childDataSets, config, splitter): number

Split quality scoring function for regression trees.

It is based on the sum of residuals, where a residual is the distance of a particular value from the average value. tree-garden uses the absolute distance, not the squared distance. A lower sum means that the values are closer together, so the data set is more pure.
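The idea can be sketched as follows. The sketch assumes the split score is the plain sum of each child's absolute residuals around that child's own mean. The description above does not say how child sets are combined, so that aggregation, and the names `sumOfAbsoluteResiduals` and `regressionSplitScore`, are assumptions made for illustration.

```typescript
// Illustrative sketch only: hypothetical helpers, not tree-garden internals.

// Sum of absolute distances of each value from the mean of the set.
function sumOfAbsoluteResiduals(values: number[]): number {
  if (values.length === 0) return 0;
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  return values.reduce((sum, v) => sum + Math.abs(v - mean), 0);
}

// Assumed aggregation: add up the residual sums of all child sets.
function regressionSplitScore(children: number[][]): number {
  return children.reduce((sum, child) => sum + sumOfAbsoluteResiduals(child), 0);
}

// Children that group similar target values together...
const tightScore = regressionSplitScore([
  [1, 2, 3],
  [10, 12],
]);

// ...versus children that mix small and large target values.
const looseScore = regressionSplitScore([
  [1, 10],
  [2, 3, 12],
]);

console.log(tightScore, looseScore);
```

The first split keeps similar values together and therefore gets the lower (better) score.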

Remarks

Lower score means a better split.

Parameters

Name Type
parentDataSet TreeGardenDataSample[]
childDataSets Object
config TreeGardenConfiguration
splitter (currentSample: TreeGardenDataSample) => any

Returns

number

Defined in

algorithmConfiguration/buildAlgorithmConfiguration.ts:49