index

Here you can browse all top-level functions, types, variables and namespaces of tree-garden.

Type Aliases

Functions

Variables

Namespaces

Type Aliases

TreeGardenConfiguration

TreeGardenConfiguration: Object

TreeGardenConfiguration is the central object of tree-garden. It holds every option for growing trees, growing forests and using them on unknown data. It can also be used for dependency injection of custom implementations.

You should not need to write your configuration by hand; see buildAlgorithmConfiguration.

growMissingValueReplacement, evaluateMissingValueReplacement, missingValue, getAllPossibleSplitCriteriaForDiscreteAttribute and getAllPossibleSplitCriteriaForContinuousAttribute can be defined differently for a particular attribute.

Type declaration

Name	Type	Description
`treeType`	`"classification"` \| `"regression"`	tree-garden also supports regression trees and forests; switch the tree type here. `Default Value` `classification`
`attributes`	{ `[key: string]`: typeof `defaultAttributeConfiguration`; }	Key is the attribute id, value is the attribute meta object. Filled in by buildAlgorithmConfiguration.
`includedAttributes`	`string`[]	Only these attributes are considered when building the decision tree.
`excludedAttributes`	`string`[]	These attributes are not considered when building the decision tree.
`getScoreForSplit`	(`parentDataSet`: `TreeGardenDataSample`[], `childDataSets`: { `[key: string]`: `TreeGardenDataSample`[]; }, `config`: `TreeGardenConfiguration`, `splitter`: `SplitCriteriaFn`) => `number`	Impurity scoring function. You can switch to Gini, information gain or, for regression trees, the regression tree score. You can also implement your own. `Default Value` getInformationGainRatioForSplit
`biggerScoreBetterSplit`	`boolean`	Depends on the split scoring function you choose: with entropy-based methods a higher score means a better split, whereas with the Gini index a lower score means a better split. `Default Value` `true`
`shouldWeStopGrowth`	(`node`: `TreeGardenNode`, `configuration`: `TreeGardenConfiguration`) => `boolean`	Decides whether growth should stop at the given node. Use it to configure pre-pruning.
`numberOfSplitsKept`	`number`	How many of the splits considered in each node are stored. They can be seen in tree-garden-visualization by clicking on a node. `Default Value` `3`
`growMissingValueReplacement`	(`dataSet`: `TreeGardenDataSample`[], `attributeId`: `string`, `configuration`: `TreeGardenConfiguration`) => (`sample`: `TreeGardenDataSample`) => `any`	How to deal with missing values during the growth phase. `Default Value` getMostCommonValueFF
`evaluateMissingValueReplacement`	(`dataSet`: `TreeGardenDataSample`[], `attributeId`: `string`, `configuration`: `TreeGardenConfiguration`) => (`sample`: `TreeGardenDataSample`) => `any`	How to deal with missing values during the evaluation phase. `Default Value` getMostCommonValueFF
`getClassFromLeafNode`	(`node`: `TreeGardenNode`, `sample?`: `TreeGardenDataSample`) => `string`	Function that retrieves the class from a leaf node of a classification tree for the given sample. `Default Value` getMostCommonClassForNode
`getValueFromLeafNode`	(`node`: `TreeGardenNode`, `sample?`: `TreeGardenDataSample`) => `number`	Function that retrieves the value from a leaf node of a regression tree for the given sample. `Default Value` getValueForNode
`onlyBinarySplits`	`boolean`	If true, only binary splits are allowed. This is the restriction used in the CART algorithm: every possible split has a boolean outcome, so exactly two child nodes lead from each parent. `Default Value` `false`. If `true`, growth is very slow on data sets whose attributes have many possible discrete values, such as dates or names.
`missingValue`	`any`	Which value is treated as a missing value. `Default Value` `undefined`
`keepFullLearningData`	`boolean`	If true, all data partitions are kept in each node. The tree data becomes huge, so this is suitable only for small training sets. `Default Value` `false`
`getAllPossibleSplitCriteriaForDiscreteAttribute`	(`attributeId`: `string`, `dataSet`: `TreeGardenDataSample`[], `configuration`: `TreeGardenConfiguration`) => `SplitCriteriaDefinition`[]	Strategy for generating all possible splits for the given discrete attribute. `Default Value` getPossibleSpitCriteriaForDiscreteAttribute
`getAllPossibleSplitCriteriaForContinuousAttribute`	(`attributeId`: `string`, `dataSet`: `TreeGardenDataSample`[], `configuration`: `TreeGardenConfiguration`) => `SplitCriteriaDefinition`[]	Strategy for generating all possible splits for the given continuous attribute. `Default Value` getPossibleSpitCriteriaForContinuousAttribute
`costComplexityPruningKFold`	`number`	If you use cost complexity pruning, the alpha parameter is found internally by cross-validation. This option sets how many folds are used. `Default Value` `5`
`reducedErrorPruningGetScore`	(`accuracyBeforePruning`: `number`, `accuracyAfterPruning`: `number`, `numberOfNodesInPrunedTree`: `number`) => `number`	Function used for scoring in reduced error pruning. `Default Value` getPrunedTreeScore
`getTreeAccuracy`	(`treeRootNode`: `TreeGardenNode`, `dataSet`: `TreeGardenDataSample`[], `configuration`: `TreeGardenConfiguration`) => `number`	Function that calculates how accurate the tree is. `Default Value` getTreeAccuracy
`numberOfTrees`	`number`	Number of trees in the random forest. `Default Value` `27`
`getAttributesForTree`	(`algorithmConfiguration`: `TreeGardenConfiguration`, `_dataSet`: `TreeGardenDataSample`[]) => `string`[]	Function that gathers the subset of attributes used for a tree of the random forest. `Default Value` getSubsetOfAttributesForTreeOfRandomForest
`numberOfBootstrappedSamples`	`number`	Number of samples bootstrapped for each tree of the random forest. `Default Value` `0`, which means as many samples as the training data set contains.
`calculateOutOfTheBagError`	`boolean`	Whether to calculate the out-of-bag error for the random forest. `Default Value` `true`
`majorityVoting`	(`treeRoots`: `TreeGardenNode`[], `dataSample`: `TreeGardenDataSample`, `config`: `TreeGardenConfiguration`) => `SingleSamplePredictionResult`	Majority voting function for random forests. `Default Value` getResultFromMultipleTrees
`mergeClassificationResults`	(`values`: `string`[]) => `string`	Function for merging classification results (from multiple trees) `Default Value` getMostCommonValue
`mergeRegressionResults`	(`values`: `number`[]) => `number`	Function for merging regression results (from multiple trees) `Default Value` getMedian
`getTagOfSampleWithMissingValueWhileClassifying?`	(`sample`: `TreeGardenDataSample`, `attributeId`: `string`, `nodeWhereWeeNeedValue`: `TreeGardenNode`, `config`: `TreeGardenConfiguration`) => `any`	If a value is missing while classifying (and no reference data set for replacement was provided), this function determines the tag for the given node. See the default implementation.
`allClasses?`	`string`[]	All classes of the training data set. Filled in by buildAlgorithmConfiguration.
`buildTime?`	`number`	Timestamp of when buildAlgorithmConfiguration was called. Filled in automatically.

Defined in

algorithmConfiguration/buildAlgorithmConfiguration.ts:21
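
The `getScoreForSplit` and `biggerScoreBetterSplit` options work as a pair. As an illustration only, a weighted Gini impurity scorer could be sketched as follows. `ScoredSample`, `giniImpurity` and `giniScoreForSplit` are hypothetical names, the sketch uses only the first two parameters of the documented signature, and it is not the library's own Gini implementation.

```typescript
// Hypothetical stand-in for TreeGardenDataSample; only `_class` is read here.
type ScoredSample = { _class?: string | number; [key: string]: unknown };

// Gini impurity of one data set: 1 - sum over classes of p(class)^2.
function giniImpurity(dataSet: ScoredSample[]): number {
  if (dataSet.length === 0) return 0;
  const counts = new Map<string, number>();
  for (const sample of dataSet) {
    const key = String(sample._class);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  let sumOfSquares = 0;
  counts.forEach((count) => {
    sumOfSquares += (count / dataSet.length) ** 2;
  });
  return 1 - sumOfSquares;
}

// Impurity of the child data sets, weighted by their share of the parent.
// A lower value means a purer split, so a scorer like this would be
// paired with `biggerScoreBetterSplit: false`.
function giniScoreForSplit(
  parentDataSet: ScoredSample[],
  childDataSets: { [key: string]: ScoredSample[] },
): number {
  return Object.keys(childDataSets).reduce((score, tag) => {
    const child = childDataSets[tag];
    return score + (child.length / parentDataSet.length) * giniImpurity(child);
  }, 0);
}
```

A split that separates the classes perfectly scores `0` with this sketch, and a two-class split that leaves both children evenly mixed scores `0.5`.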

TreeGardenDataSample

TreeGardenDataSample: Object

For more information, see tree-garden data sample.

Index signature

▪ [key: string]: any

Type declaration

Name	Type
`_class?`	`string` \| `number`
`_label?`	`string` \| `number`
`_id?`	`string` \| `number`

Defined in

dataSet/set.ts:8
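
A sample is a plain object: attribute ids are keys, and the optional `_class`, `_label` and `_id` fields sit alongside them. The sketch below uses a local type that mirrors the declaration above. The attribute ids `color` and `age`, the class names and the helper `getClassesSketch` are made up for illustration, and it assumes `_class` carries the known outcome of a training sample.

```typescript
// Local mirror of the declared shape, so the snippet runs without the library.
type SampleSketch = {
  [key: string]: any;
  _class?: string | number;
  _label?: string | number;
  _id?: string | number;
};

const exampleSamples: SampleSketch[] = [
  { _id: 1, _class: 'buy', color: 'black', age: 12 },
  { _id: 2, _class: 'skip', color: 'white', age: 7 },
  // No `_class`, and `age` is missing (`undefined` is the default `missingValue`).
  { _id: 3, color: 'black', age: undefined },
];

// Distinct classes present in a data set, in order of first appearance.
function getClassesSketch(dataSet: SampleSketch[]): string[] {
  const classes = new Set<string>();
  for (const sample of dataSet) {
    if (sample._class !== undefined) classes.add(String(sample._class));
  }
  return Array.from(classes);
}
```

A check like `getClassesSketch` can help confirm that a data set contains every class before a configuration is built from it.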

TreeGardenNode

TreeGardenNode: Object

TreeGardenNode is an object representing one node of a tree. Under childNodes you can find the split tags and the corresponding child nodes.

Type declaration

Name	Type	Description
`id`	`string`	Every node has a unique identifier.
`isLeaf`	`boolean`	Whether the node is a leaf.
`depth`	`number`	Depth of the node in the tree, starting from zero: the root node has depth `0`.
`alreadyUsedSplits`	`SplitCriteriaDefinition`[]	Split definitions used on the path from the root to this node.
`chosenSplitCriteria`	`SplitCriteriaDefinition`	Best-scoring split criteria.
`bestSplits`	`ReturnType`<typeof `getBestScoringSplits`>	Array of the best-scoring splits and their respective scores. The number of splits kept can be set in the configuration (see `numberOfSplitsKept`).
`dataPartitionsCounts`	`ReturnType`<typeof `dataPartitionsToDataPartitionCounts`>	Counts of samples behind each tag, divided by class. It looks like this: `{tag:{classOne:3, classTwo:3}, anotherTag:{classOne:1, classTwo:6}}`
`classCounts`	`ReturnType`<typeof `dataPartitionsToClassCounts`>	Count of samples by class. It looks like this: `{classOne:8, classTwo:7}`
`parentId?`	`string`	Unique identifier of parent node.
`childNodes?`	{ `[key: string]`: `TreeGardenNode`; }	Object of split tags and child nodes.
`impurityScore?`	`number`	Score of the chosen split criteria.
`dataPartitions?`	`ReturnType`<typeof `splitDataSet`>	Output of the split function: tags and their samples. It is thrown away once it is no longer needed; to change this behaviour, see `keepFullLearningData` in the configuration. It looks like this: `{'tag':[sample,anotherSample],'anotherTag':[sample,anotherSample,nextSample]}`
`regressionTreeAverageOutcome?`	`number`	Average outcome of samples of regression tree in this node.
`regressionTreeStandardDeviation?`	`number`	Standard deviation calculated from values of samples of regression tree in this node.

Defined in

treeNode.ts:23
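
Because child nodes are nested under `childNodes`, a grown tree can be walked with plain recursion. The sketch below collects all leaves. `TreeNodeSketch` is a local type that copies only four of the documented fields, and the tags (`black`, `white`, `true`, `false`) and node ids are invented for the example.

```typescript
// Local subset of the TreeGardenNode shape, so the snippet runs on its own.
type TreeNodeSketch = {
  id: string;
  isLeaf: boolean;
  depth: number;
  childNodes?: { [tag: string]: TreeNodeSketch };
};

// Depth-first walk that returns every leaf below (or equal to) `node`.
function collectLeaves(node: TreeNodeSketch): TreeNodeSketch[] {
  const children = node.childNodes;
  if (node.isLeaf || !children) return [node];
  return Object.keys(children).reduce<TreeNodeSketch[]>(
    (leaves, tag) => leaves.concat(collectLeaves(children[tag])),
    [],
  );
}

// Hand-written tree: the root splits into a leaf and an inner node with two leaves.
const exampleRoot: TreeNodeSketch = {
  id: 'root',
  isLeaf: false,
  depth: 0,
  childNodes: {
    black: { id: 'a', isLeaf: true, depth: 1 },
    white: {
      id: 'b',
      isLeaf: false,
      depth: 1,
      childNodes: {
        true: { id: 'c', isLeaf: true, depth: 2 },
        false: { id: 'd', isLeaf: true, depth: 2 },
      },
    },
  },
};

const leafIds = collectLeaves(exampleRoot).map((leaf) => leaf.id);
```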

SplitCriteriaFn

SplitCriteriaFn: ReturnType<typeof getSplitCriteriaFn>

See return value of getSplitCriteriaFn

Defined in

split.ts:22

SplitCriteriaDefinition

SplitCriteriaDefinition: [string, SplitOperator, any?]

Represents split criteria in a serializable way:

Array of [attributeId, operator, value?]

See split module

Example

['color', '==', 'black']
['age','>',10]

Defined in

split.ts:34
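
To make the tuple format concrete, the sketch below applies a definition to a sample. It is not the library's `getSplitCriteriaFn`: the type names and `evaluateSplitSketch` are hypothetical, only the two operators from the example above are handled, treating `'=='` as strict equality is an assumption, and how the library maps the outcome to a child tag is not covered.

```typescript
// Only the two operators that appear in the documented example.
type OperatorSketch = '==' | '>';
type SplitDefinitionSketch = [string, OperatorSketch, any?];

// Applies [attributeId, operator, value] to one sample.
function evaluateSplitSketch(
  definition: SplitDefinitionSketch,
  sample: { [key: string]: any },
): boolean {
  const [attributeId, operator, value] = definition;
  const sampleValue = sample[attributeId];
  switch (operator) {
    case '==':
      return sampleValue === value; // assumption: strict equality
    case '>':
      return sampleValue > value;
    default:
      throw new Error('Unsupported operator: ' + String(operator));
  }
}

const isBlack: SplitDefinitionSketch = ['color', '==', 'black'];
const olderThanTen: SplitDefinitionSketch = ['age', '>', 10];

// The definition is a plain array, so it survives a JSON round trip.
const restored = JSON.parse(JSON.stringify(isBlack)) as SplitDefinitionSketch;
```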

SplitOperator

SplitOperator: typeof supportedMathOperators extends Set<infer K> ? K : never

A split operator is one of the supported mathematical operators; check the current code for the supported choices.

Defined in

split.ts:18
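
The conditional type in this alias extracts the element type of a `Set`. The pattern can be tried in isolation as below. `operatorsSketch` and its two members are placeholders (the real `supportedMathOperators` may contain different operators), and `isSupportedOperator` is a hypothetical helper.

```typescript
// Placeholder set; `as const` keeps the members as literal types.
const operatorsSketch = new Set(['==', '>'] as const);

// Same pattern as SplitOperator: infer K from Set<K>.
// Here the result is the union '==' | '>'.
type OperatorFromSet = typeof operatorsSketch extends Set<infer K> ? K : never;

// Runtime guard that narrows an arbitrary string to the inferred union.
function isSupportedOperator(candidate: string): candidate is OperatorFromSet {
  return (operatorsSketch as ReadonlySet<string>).has(candidate);
}

const checkedOperator: OperatorFromSet = '>';
```

Deriving the type from the set keeps the runtime list and the compile-time union in one place, so adding an operator to the set widens the type automatically.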

Functions

buildAlgorithmConfiguration

buildAlgorithmConfiguration(dataSet, configuration?): TreeGardenConfiguration

This function helps you create the configuration for your decision tree or forest. If you have at least part of the data set with all classes present, you can create the configuration automatically (see the examples; every training or evaluation run needs a configuration). If you have just one sample, check the example.

See defaultConfiguration for the default values.

Parameters

Name	Type	Description
`dataSet`	`TreeGardenDataSample`[]	Array of tree-garden samples. Make sure all classes are included.
`configuration`	`Partial`<`Omit`<`TreeGardenConfiguration`, `"attributes"`> & { `attributes?`: { `[key: string]`: `Partial`<typeof `defaultAttributeConfiguration`>; } }>	Overrides the default configuration with your own values.

Returns

TreeGardenConfiguration

Defined in

algorithmConfiguration/buildAlgorithmConfiguration.ts:214
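
The optional second argument is a partial configuration, with per-attribute settings nested under `attributes`. An override object might look like the fragment below. The option names come from the tables in this reference, while the attribute ids `age` and `note` and all chosen values are invented for illustration.

```typescript
// Plain object literal; it only shows the expected shape of the overrides.
const configurationOverrides = {
  treeType: 'classification' as const,
  numberOfTrees: 50,
  excludedAttributes: ['note'],
  attributes: {
    // Per-attribute settings: force the data type and use -1 as the missing value.
    age: { dataType: 'continuous' as const, missingValue: -1 },
  },
};

// Intended use, assuming `dataSet` is an array of samples:
// buildAlgorithmConfiguration(dataSet, configurationOverrides)
```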

growTree

growTree(algorithmConfiguration, dataSet): TreeGardenNode

Grow (train) a decision tree from your configuration and data set. See the examples in getting started.

Parameters

Name	Type
`algorithmConfiguration`	`TreeGardenConfiguration`
`dataSet`	`TreeGardenDataSample`[]

Returns

TreeGardenNode

Defined in

growTree.ts:11

growRandomForest

growRandomForest(algorithmConfiguration, dataSet): Object

Grow (train) a random forest from your configuration and data set. See the random forest example.

Parameters

Name	Type
`algorithmConfiguration`	`TreeGardenConfiguration`
`dataSet`	`TreeGardenDataSample`[]

Returns

Object

Name	Type
`trees`	`TreeGardenNode`[]
`oobError`	`undefined` \| `number`
`treesAndOobSets`	readonly [`TreeGardenNode`, `undefined` \| `Set`<`undefined` \| `string` \| `number`>][]

Defined in

growRandomForest.ts:17
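
The `treesAndOobSets` entry pairs each tree with a set of sample ids. To illustrate the general idea of bootstrapping with out-of-bag bookkeeping, here is a generic sketch. It is not taken from the tree-garden source: `SampleWithId` and `bootstrapWithOobIds` are hypothetical, the injectable `random` parameter exists only to make the sketch testable, and the data set is assumed to be non-empty.

```typescript
type SampleWithId = { _id?: string | number; [key: string]: unknown };

// Draws samples with replacement and records which ids were never drawn.
// `howMany = 0` means "as many as the data set contains", mirroring the
// documented meaning of `numberOfBootstrappedSamples`.
function bootstrapWithOobIds(
  dataSet: SampleWithId[],
  howMany = 0,
  random: () => number = Math.random,
): { bootstrapped: SampleWithId[]; oobIds: Set<string | number | undefined> } {
  const size = howMany > 0 ? howMany : dataSet.length;
  const bootstrapped: SampleWithId[] = [];
  // Start with every id out of the bag, then remove the ones that get picked.
  const oobIds = new Set(dataSet.map((sample) => sample._id));
  for (let i = 0; i < size; i += 1) {
    const picked = dataSet[Math.floor(random() * dataSet.length)];
    bootstrapped.push(picked);
    oobIds.delete(picked._id);
  }
  return { bootstrapped, oobIds };
}
```

Samples whose ids remain in `oobIds` were not used for that tree, which is what makes them usable for an out-of-bag error estimate.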

getTreePrediction

getTreePrediction<T>(samplesToPredict, decisionTreeRoot, algorithmConfiguration, referenceDataSetForReplacing?): PredictionReturnValue<T>

Get the outcome of your trained decision tree for unknown samples. See the examples to see it in action.

Type parameters

Name	Type
`T`	extends `TreeGardenDataSample` \| `TreeGardenDataSample`[]

Parameters

Name	Type	Description
`samplesToPredict`	`T`	-
`decisionTreeRoot`	`TreeGardenNode`	-
`algorithmConfiguration`	`TreeGardenConfiguration`	-
`referenceDataSetForReplacing?`	`TreeGardenDataSample`[]	Data set used to replace missing values in the unknown samples you want to classify.

Returns

PredictionReturnValue<T>

Defined in

predict.ts:125

getRandomForestPrediction

getRandomForestPrediction<T>(samplesToPredict, trees, algorithmConfiguration, referenceDataSetForReplacing?): PredictionReturnValue<T>

Get the outcome of your trained random forest for unknown samples. See the random forest example to see it in action.

Type parameters

Name	Type
`T`	extends `TreeGardenDataSample` \| `TreeGardenDataSample`[]

Parameters

Name	Type	Description
`samplesToPredict`	`T`	-
`trees`	`TreeGardenNode`[]	-
`algorithmConfiguration`	`TreeGardenConfiguration`	-
`referenceDataSetForReplacing?`	`TreeGardenDataSample`[]	Data set used to replace missing values in the unknown samples you want to classify.

Returns

PredictionReturnValue<T>

Defined in

predict.ts:146
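
A forest prediction combines the per-tree results through the configurable `mergeClassificationResults` and `mergeRegressionResults` functions. Custom replacements with the documented signatures could look like the sketch below. `mostCommonValueSketch` and `medianSketch` are hypothetical names, and the library defaults (getMostCommonValue, getMedian) may break ties or handle an even number of values differently.

```typescript
// (values: string[]) => string, the shape expected by `mergeClassificationResults`.
// On a tie, the value that reaches the top count first wins (an assumption).
function mostCommonValueSketch(values: string[]): string {
  const counts = new Map<string, number>();
  let best = values[0];
  let bestCount = 0;
  for (const value of values) {
    const count = (counts.get(value) ?? 0) + 1;
    counts.set(value, count);
    if (count > bestCount) {
      best = value;
      bestCount = count;
    }
  }
  return best;
}

// (values: number[]) => number, the shape expected by `mergeRegressionResults`.
// For an even number of values the two middle ones are averaged (an assumption).
function medianSketch(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const middle = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[middle]
    : (sorted[middle - 1] + sorted[middle]) / 2;
}
```

Both helpers assume a non-empty input, which holds whenever the forest contains at least one tree.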

getTreeAccuracy

getTreeAccuracy(treeRootNode, dataSet, configuration): number

Calculates the accuracy of a tree (classification or regression) on the given data set.

See getMissClassificationRateRaw and getRAbsErrorRaw for more information.

Parameters

Name	Type
`treeRootNode`	`TreeGardenNode`
`dataSet`	`TreeGardenDataSample`[]
`configuration`	`TreeGardenConfiguration`

Returns

number

Defined in

statistic/treeStats.ts:93

getDividedSet

getDividedSet(dataSet, portionGoesToFirst?): TreeGardenDataSample[][]

Randomly distributes the samples of a data set into two data sets.

Example

// 70% goes to training, rest to validation
const [trainingDataSet,validationDataSet] = getDividedSet(originalDataSet,0.7)

Parameters

Name	Type	Default value	Description
`dataSet`	`TreeGardenDataSample`[]	`undefined`	-
`portionGoesToFirst`	`number`	`0.5`	Portion of samples, between 0 and 1, that goes to the first data set; the rest goes to the second.

Returns

TreeGardenDataSample[][]

Defined in

dataSet/dividingAndBootstrapping.ts:15
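
For readers who want to see what such a random division involves, a generic version can be sketched with a shuffle followed by a cut. This is not the library's implementation: `divideSetSketch` is a hypothetical name, the injectable `random` parameter is there only for testing, and rounding the size of the first set to the nearest integer is an assumption.

```typescript
// Shuffles a copy of the data set and cuts it at the requested portion.
function divideSetSketch<T>(
  dataSet: T[],
  portionGoesToFirst = 0.5,
  random: () => number = Math.random,
): [T[], T[]] {
  const shuffled = [...dataSet];
  // Fisher-Yates shuffle: swap each position with a random earlier (or same) one.
  for (let i = shuffled.length - 1; i > 0; i -= 1) {
    const j = Math.floor(random() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  const firstSize = Math.round(shuffled.length * portionGoesToFirst);
  return [shuffled.slice(0, firstSize), shuffled.slice(firstSize)];
}
```

The input array is left untouched, and every sample ends up in exactly one of the two returned sets.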

Variables

defaultConfiguration

Const defaultConfiguration: TreeGardenConfiguration

Default configuration. See code for more information.

Defined in

algorithmConfiguration/algorithmDefaultConfiguration.ts:27

defaultAttributeConfiguration

Const defaultAttributeConfiguration: Object

Default configuration for attribute.

Type declaration

Name	Type
`dataType`	`"discrete"` \| `"continuous"` \| `"automatic"`
`growMissingValueReplacement`	`undefined` \| (`dataSet`: `TreeGardenDataSample`[], `attributeId`: `string`, `configuration`: `TreeGardenConfiguration`) => (`sampleWithMissingValue`: `TreeGardenDataSample`) => `string` \| `number`
`evaluateMissingValueReplacement`	`undefined` \| (`dataSet`: `TreeGardenDataSample`[], `attributeId`: `string`, `configuration`: `TreeGardenConfiguration`) => (`sampleWithMissingValue`: `TreeGardenDataSample`) => `string` \| `number`
`missingValue`	`any`
`getAllPossibleSplitCriteriaForDiscreteAttribute`	`undefined` \| (`attributeId`: `string`, `dataSet`: `TreeGardenDataSample`[], `configuration`: `TreeGardenConfiguration`) => `SplitCriteriaDefinition`[]
`getAllPossibleSplitCriteriaForContinuousAttribute`	`undefined` \| (`attributeId`: `string`, `dataSet`: `TreeGardenDataSample`[], `configuration`: `TreeGardenConfiguration`) => `SplitCriteriaDefinition`[]

Defined in

algorithmConfiguration/attibuteDefaultConfiguration.ts:12