
index

Here you can browse all top level functions, types, variables and namespaces of tree-garden.

Type Aliases

Functions

Variables

Namespaces

Type Aliases

TreeGardenConfiguration

TreeGardenConfiguration: Object

TreeGardenConfiguration is the central object of tree-garden. It holds every option for growing trees, growing forests and using them on unknown data. It can also be used for dependency injection of custom implementations.

You should not write the configuration by hand; see buildAlgorithmConfiguration.

growMissingValueReplacement, evaluateMissingValueReplacement, missingValue, getAllPossibleSplitCriteriaForDiscreteAttribute and getAllPossibleSplitCriteriaForContinuousAttribute can be defined differently for each attribute.

Type declaration

Name Type Description
treeType "classification" | "regression" Switches between classification and regression; tree-garden supports both for trees and forests. Default Value classification
attributes { [key: string]: typeof defaultAttributeConfiguration; } Key is attribute id, value is attribute meta object. Filled by buildAlgorithmConfiguration
includedAttributes string[] Only these attributes are considered for building decision tree
excludedAttributes string[] These attributes are not considered for building decision tree
getScoreForSplit (parentDataSet: TreeGardenDataSample[], childDataSets: { [key: string]: TreeGardenDataSample[]; }, config: TreeGardenConfiguration, splitter: SplitCriteriaFn) => number Impurity scoring function. You can switch to gini, information gain, or the regression tree score for regression trees. You can also implement your own. Default Value getInformationGainRatioForSplit
biggerScoreBetterSplit boolean Depends on the split scoring function you choose: for entropy-based methods a higher score means a better split, but for the gini index a lower score means a better split. Default Value true
shouldWeStopGrowth (node: TreeGardenNode, configuration: TreeGardenConfiguration) => boolean You can configure pre-pruning.
numberOfSplitsKept number How many of the splits considered in each node should be stored; they can be seen in tree-garden-visualization by clicking on a node. Default Value 3
growMissingValueReplacement (dataSet: TreeGardenDataSample[], attributeId: string, configuration: TreeGardenConfiguration) => (sample: TreeGardenDataSample) => any How to deal with missing values during growth phase. Default Value getMostCommonValueFF
evaluateMissingValueReplacement (dataSet: TreeGardenDataSample[], attributeId: string, configuration: TreeGardenConfiguration) => (sample: TreeGardenDataSample) => any How to deal with missing values during evaluate phase. Default Value getMostCommonValueFF
getClassFromLeafNode (node: TreeGardenNode, sample?: TreeGardenDataSample) => string Function that will retrieve class from node of classification tree for given sample Default Value getMostCommonClassForNode
getValueFromLeafNode (node: TreeGardenNode, sample?: TreeGardenDataSample) => number Function that will retrieve value from node of regression tree for given sample Default Value getValueForNode
onlyBinarySplits boolean If true, only binary splits are allowed. This is a restriction implemented in the CART algorithm: possible splits are designed so that they always have a boolean outcome, so two child nodes lead from each parent. If true, it will perform very slowly on data sets with attributes that have plenty of possible discrete values, such as date or name. Default Value false
missingValue any Which value is considered a missing value. Default Value undefined
keepFullLearningData boolean If true, all data partitions in each node are kept. The tree data will be huge, so this is suitable just for small training sets. Default Value false
getAllPossibleSplitCriteriaForDiscreteAttribute (attributeId: string, dataSet: TreeGardenDataSample[], configuration: TreeGardenConfiguration) => SplitCriteriaDefinition[] Strategy, how to generate all possible splits for given discrete attribute Default Value getPossibleSpitCriteriaForDiscreteAttribute
getAllPossibleSplitCriteriaForContinuousAttribute (attributeId: string, dataSet: TreeGardenDataSample[], configuration: TreeGardenConfiguration) => SplitCriteriaDefinition[] Strategy, how to generate all possible splits for given continuous attribute Default Value getPossibleSpitCriteriaForContinuousAttribute
costComplexityPruningKFold number If you use cost complexity pruning, the alpha parameter is found internally by cross-validation; you can change how many folds (data sets) are used. Default Value 5
reducedErrorPruningGetScore (accuracyBeforePruning: number, accuracyAfterPruning: number, numberOfNodesInPrunedTree: number) => number Function used for scoring of reduced error pruning Default Value getPrunedTreeScore
getTreeAccuracy (treeRootNode: TreeGardenNode, dataSet: TreeGardenDataSample[], configuration: TreeGardenConfiguration) => number Function that calculates how accurate the tree is. Default Value getTreeAccuracy
numberOfTrees number How many trees we want in the random forest. Default Value 27
getAttributesForTree (algorithmConfiguration: TreeGardenConfiguration, _dataSet: TreeGardenDataSample[]) => string[] Function for gathering subset of attributes for random forest Default Value getSubsetOfAttributesForTreeOfRandomForest
numberOfBootstrappedSamples number How many samples are bootstrapped for each tree of the random forest. Default Value 0, which means the same amount as the number of samples in the training data set
calculateOutOfTheBagError boolean Should we calculate out of the bag error for random forest? Default Value true
majorityVoting (treeRoots: TreeGardenNode[], dataSample: TreeGardenDataSample, config: TreeGardenConfiguration) => SingleSamplePredictionResult Majority voting function for random forests, Default Value getResultFromMultipleTrees
mergeClassificationResults (values: string[]) => string Function for merging classification results (from multiple trees) Default Value getMostCommonValue
mergeRegressionResults (values: number[]) => number Function for merging regression results (from multiple trees) Default Value getMedian
getTagOfSampleWithMissingValueWhileClassifying? (sample: TreeGardenDataSample, attributeId: string, nodeWhereWeeNeedValue: TreeGardenNode, config: TreeGardenConfiguration) => any If there is a missing value while classifying (a reference data set for replacement was not provided), this function will gather the tag for the given node. See default implementation.
allClasses? string[] All classes of the training data set. Filled by buildAlgorithmConfiguration
buildTime? number Timestamp of when buildAlgorithmConfiguration was called. Filled automatically

Defined in

algorithmConfiguration/buildAlgorithmConfiguration.ts:21
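
The getScoreForSplit and biggerScoreBetterSplit options work as a pair. As an illustration, here is a minimal sketch of a custom scoring function based on weighted gini impurity. It only mirrors the first two parameters of the documented signature; `SampleLike`, `giniImpurity` and `weightedGiniForSplit` are hypothetical names defined locally for this example and are not tree-garden exports.

```typescript
// Local stand-in for TreeGardenDataSample (assumption: only `_class` matters here).
type SampleLike = { [key: string]: any; _class?: string | number };

// Gini impurity of one set of samples: 1 - sum of squared class proportions.
function giniImpurity(samples: SampleLike[]): number {
  if (samples.length === 0) return 0;
  const counts: { [className: string]: number } = {};
  for (const sample of samples) {
    const key = String(sample._class);
    counts[key] = (counts[key] || 0) + 1;
  }
  let sumOfSquares = 0;
  for (const key of Object.keys(counts)) {
    const proportion = counts[key] / samples.length;
    sumOfSquares += proportion * proportion;
  }
  return 1 - sumOfSquares;
}

// Gini impurity of the child partitions, weighted by partition size.
// LOWER values mean purer children, i.e. a better split.
function weightedGiniForSplit(
  parentDataSet: SampleLike[],
  childDataSets: { [tag: string]: SampleLike[] },
): number {
  if (parentDataSet.length === 0) return 0;
  let score = 0;
  for (const tag of Object.keys(childDataSets)) {
    const child = childDataSets[tag];
    score += (child.length / parentDataSet.length) * giniImpurity(child);
  }
  return score;
}
```

Because a lower value means a better split here, a function of this kind would be combined with biggerScoreBetterSplit set to false.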


TreeGardenDataSample

TreeGardenDataSample: Object

For more information, see tree-garden data sample.

Index signature

▪ [key: string]: any

Type declaration

Name Type
_class? string | number
_label? string | number
_id? string | number

Defined in

dataSet/set.ts:8
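
To make the shape concrete, here is a small sketch of a data set in this form. The type is re-declared locally so the snippet runs on its own; the attribute names `color` and `age` and the class values are made up, and treating every non-underscore key as an attribute is an assumption of this example.

```typescript
// Local mirror of the documented type declaration.
type DataSampleLike = {
  [key: string]: any;
  _class?: string | number;
  _label?: string | number;
  _id?: string | number;
};

// Hypothetical samples: two attributes plus the optional underscore fields.
const exampleSamples: DataSampleLike[] = [
  { _id: 1, _label: 'first', color: 'black', age: 12, _class: 'yes' },
  { _id: 2, _label: 'second', color: 'white', age: 7, _class: 'no' },
];

// Assumption for illustration: attributes are all keys except the underscore fields.
const reservedKeys = ['_class', '_label', '_id'];
const exampleAttributeIds = Object.keys(exampleSamples[0]).filter(
  (key) => reservedKeys.indexOf(key) === -1,
);
```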


TreeGardenNode

TreeGardenNode: Object

TreeGardenNode is an object representing one node of a tree. Under childNodes, you can see the split tags and child nodes.

Type declaration

Name Type Description
id string Every node has a unique identifier.
isLeaf boolean Is the node a leaf or not?
depth number Depth of the node in the tree, starting from zero: the root node has depth = 0.
alreadyUsedSplits SplitCriteriaDefinition[] Split definitions used from root up to this node.
chosenSplitCriteria SplitCriteriaDefinition Best scoring split criteria.
bestSplits ReturnType<typeof getBestScoringSplits> Array of best scoring splits and their respective scores. The number of kept splits can be set in configuration (see numberOfSplitsKept).
dataPartitionsCounts ReturnType<typeof dataPartitionsToDataPartitionCounts> Counts of samples behind each tag, divided by classes, it should look like: {tag:{classOne:3, classTwo:3}, anotherTag:{classOne:1, classTwo:6}}
classCounts ReturnType<typeof dataPartitionsToClassCounts> Count of samples by class, it should look like: {classOne:8, classTwo:7}
parentId? string Unique identifier of parent node.
childNodes? { [key: string]: TreeGardenNode; } Object of split tags and child nodes.
impurityScore? number Score of chosen best split criteria.
dataPartitions? ReturnType<typeof splitDataSet> Basically the product of the split function: tags and samples. It is thrown away when no longer needed; to change this behaviour, see keepFullLearningData in configuration. It should look like this: {'tag':[sample,anotherSample],'anotherTag':[sample,anotherSample,nextSample]}
regressionTreeAverageOutcome? number Average outcome of samples of regression tree in this node.
regressionTreeStandardDeviation? number Standard deviation calculated from values of samples of regression tree in this node.

Defined in

treeNode.ts:23
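
Since childNodes maps split tags to child nodes, a tree can be walked recursively. The sketch below collects all leaves of a tree; `NodeLike` is a local type with only a subset of the documented fields, and `collectLeaves` and the example tree are hypothetical, written just for this illustration.

```typescript
// Minimal local subset of the documented TreeGardenNode fields.
type NodeLike = {
  id: string;
  isLeaf: boolean;
  depth: number;
  childNodes?: { [tag: string]: NodeLike };
};

// Recursively collect every leaf reachable from `node`.
function collectLeaves(node: NodeLike): NodeLike[] {
  if (node.isLeaf || !node.childNodes) return [node];
  const children = node.childNodes;
  let leaves: NodeLike[] = [];
  for (const tag of Object.keys(children)) {
    leaves = leaves.concat(collectLeaves(children[tag]));
  }
  return leaves;
}

// Hand-written example tree; tags and ids are made up.
const exampleTree: NodeLike = {
  id: 'root',
  isLeaf: false,
  depth: 0,
  childNodes: {
    black: { id: 'n1', isLeaf: true, depth: 1 },
    white: {
      id: 'n2',
      isLeaf: false,
      depth: 1,
      childNodes: {
        young: { id: 'n3', isLeaf: true, depth: 2 },
        old: { id: 'n4', isLeaf: true, depth: 2 },
      },
    },
  },
};

const exampleLeafIds = collectLeaves(exampleTree).map((leaf) => leaf.id);
```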


SplitCriteriaFn

SplitCriteriaFn: ReturnType<typeof getSplitCriteriaFn>

See return value of getSplitCriteriaFn

Defined in

split.ts:22


SplitCriteriaDefinition

SplitCriteriaDefinition: [string, SplitOperator, any?]

This represents split criteria in serializable way:

Array of [attributeId, operator, value?]

See split module

Example

['color', '==', 'black']
['age', '>', 10]

Defined in

split.ts:34
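
Because the definition is a plain array, it survives JSON serialization and can be interpreted later. The following is a rough sketch of such an interpreter, assuming only the two operators that appear in the example above; the real set of operators lives in the split module, and `SplitDefinitionLike` and `matchesSplit` are hypothetical names, not the library's own getSplitCriteriaFn.

```typescript
// Only the operators shown in the example above (assumption: the library supports more).
type OperatorLike = '==' | '>';
type SplitDefinitionLike = [string, OperatorLike, any?];

// Returns true when the sample satisfies the split definition.
function matchesSplit(
  sample: { [key: string]: any },
  [attributeId, operator, value]: SplitDefinitionLike,
): boolean {
  const actual = sample[attributeId];
  switch (operator) {
    case '==':
      return actual === value;
    case '>':
      return actual > value;
    default:
      throw new Error('Unsupported operator: ' + String(operator));
  }
}

// The definition is serializable: a JSON round trip gives an equivalent definition.
const ageSplit: SplitDefinitionLike = ['age', '>', 10];
const restoredAgeSplit = JSON.parse(JSON.stringify(ageSplit)) as SplitDefinitionLike;
const isOlderThanTen = matchesSplit({ color: 'black', age: 12 }, restoredAgeSplit);
```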


SplitOperator

SplitOperator: typeof supportedMathOperators extends Set<infer K> ? K : never

A split operator is one of the supported mathematical operators; check the current code for the supported choices.

Defined in

split.ts:18

Functions

buildAlgorithmConfiguration

buildAlgorithmConfiguration(dataSet, configuration?): TreeGardenConfiguration

This function helps you create the configuration for your decision tree or forest. If you have at least a part of the data set with all classes present, you can create the configuration automatically (see examples; every training and evaluation needs a configuration). If you have just one sample, check the example.

See defaultConfiguration to see default values.

Parameters

Name Type Description
dataSet TreeGardenDataSample[] Array of tree-garden samples; be sure all classes are included.
configuration Partial<Omit<TreeGardenConfiguration, "attributes"> & { attributes?: { [key: string]: Partial<typeof defaultAttributeConfiguration>; } }> Overrides the default configuration with your own.

Returns

TreeGardenConfiguration

Defined in

algorithmConfiguration/buildAlgorithmConfiguration.ts:214


growTree

growTree(algorithmConfiguration, dataSet): TreeGardenNode

Grow (train) a decision tree from your configuration and data set. See the examples in getting started.

Parameters

Name Type
algorithmConfiguration TreeGardenConfiguration
dataSet TreeGardenDataSample[]

Returns

TreeGardenNode

Defined in

growTree.ts:11


growRandomForest

growRandomForest(algorithmConfiguration, dataSet): Object

Grow (train) your random forest on your configuration and data set. See random forest example.

Parameters

Name Type
algorithmConfiguration TreeGardenConfiguration
dataSet TreeGardenDataSample[]

Returns

Object

Name Type
trees TreeGardenNode[]
oobError undefined | number
treesAndOobSets readonly [TreeGardenNode, undefined | Set<undefined | string | number>][]

Defined in

growRandomForest.ts:17
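
The returned treesAndOobSets pairs each tree with the samples that were left out of its bootstrap sample. The sketch below shows the general idea of bootstrapping with an out-of-the-bag remainder. It is not the library's implementation: `bootstrapWithOutOfBag` is a hypothetical helper, and treating a size of 0 as "same size as the data set" follows the description of numberOfBootstrappedSamples.

```typescript
type BagSample = { [key: string]: any; _id?: string | number };

// Draw samples with replacement; samples that were never drawn are "out of the bag".
function bootstrapWithOutOfBag(
  dataSet: BagSample[],
  size: number = 0,
  random: () => number = Math.random,
): { drawn: BagSample[]; outOfBag: BagSample[] } {
  if (dataSet.length === 0) return { drawn: [], outOfBag: [] };
  // 0 means "as many as there are samples in the training data set".
  const count = size > 0 ? size : dataSet.length;
  const drawn: BagSample[] = [];
  const wasDrawn: boolean[] = dataSet.map(() => false);
  for (let i = 0; i < count; i += 1) {
    const index = Math.floor(random() * dataSet.length);
    drawn.push(dataSet[index]);
    wasDrawn[index] = true;
  }
  const outOfBag = dataSet.filter((_sample, index) => !wasDrawn[index]);
  return { drawn, outOfBag };
}
```

The out-of-the-bag samples were not seen by the tree trained on `drawn`, which is why they can serve as a validation set for that tree when an out-of-the-bag error is estimated.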


getTreePrediction

getTreePrediction<T>(samplesToPredict, decisionTreeRoot, algorithmConfiguration, referenceDataSetForReplacing?): PredictionReturnValue<T>

Get outcome of your trained decision tree on unknown samples. See examples to see it in action.

Type parameters

Name Type
T extends TreeGardenDataSample | TreeGardenDataSample[]

Parameters

Name Type Description
samplesToPredict T -
decisionTreeRoot TreeGardenNode -
algorithmConfiguration TreeGardenConfiguration -
referenceDataSetForReplacing? TreeGardenDataSample[] Provide a data set used to replace missing values in the unknown samples you want to classify.

Returns

PredictionReturnValue<T>

Defined in

predict.ts:125


getRandomForestPrediction

getRandomForestPrediction<T>(samplesToPredict, trees, algorithmConfiguration, referenceDataSetForReplacing?): PredictionReturnValue<T>

Get outcome of your trained random forest on unknown samples. See random forest example to see it in action.

Type parameters

Name Type
T extends TreeGardenDataSample | TreeGardenDataSample[]

Parameters

Name Type Description
samplesToPredict T -
trees TreeGardenNode[] -
algorithmConfiguration TreeGardenConfiguration -
referenceDataSetForReplacing? TreeGardenDataSample[] Provide a data set used to replace missing values in the unknown samples you want to classify.

Returns

PredictionReturnValue<T>

Defined in

predict.ts:146


getTreeAccuracy

getTreeAccuracy(treeRootNode, dataSet, configuration): number

Calculate accuracy for tree (classification and regression) on given data set.

See getMissClassificationRateRaw and getRAbsErrorRaw for more information

Parameters

Name Type
treeRootNode TreeGardenNode
dataSet TreeGardenDataSample[]
configuration TreeGardenConfiguration

Returns

number

Defined in

statistic/treeStats.ts:93
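
For classification, accuracy is commonly understood as one minus the misclassification rate. The sketch below illustrates that relationship on plain arrays of predicted and actual classes; it is an assumption about the general idea, not the library's code, and `classificationAccuracy` is a hypothetical helper. The regression case is not covered here.

```typescript
// Accuracy as 1 - misclassification rate, computed from parallel arrays.
function classificationAccuracy(predicted: string[], actual: string[]): number {
  if (predicted.length !== actual.length) {
    throw new Error('predicted and actual must have the same length');
  }
  if (actual.length === 0) return 0; // nothing to evaluate (choice made for this sketch)
  let misses = 0;
  for (let i = 0; i < actual.length; i += 1) {
    if (predicted[i] !== actual[i]) misses += 1;
  }
  const missClassificationRate = misses / actual.length;
  return 1 - missClassificationRate;
}

// One of four predictions is wrong.
const exampleAccuracy = classificationAccuracy(['a', 'b', 'a', 'a'], ['a', 'b', 'b', 'a']);
```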


getDividedSet

getDividedSet(dataSet, portionGoesToFirst?): TreeGardenDataSample[][]

Function that randomly distributes samples of data set into two data sets.

Example

// 70% goes to training, rest to validation
const [trainingDataSet, validationDataSet] = getDividedSet(originalDataSet, 0.7);

Parameters

Name Type Default value Description
dataSet TreeGardenDataSample[] undefined -
portionGoesToFirst number 0.5 Portion of samples (0 - 1) that goes to the first data set; the rest goes to the second one.

Returns

TreeGardenDataSample[][]

Defined in

dataSet/dividingAndBootstrapping.ts:15
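
A random division like this can be sketched as a shuffle followed by a cut. The version below is an illustration only: `divideSet` is a hypothetical stand-in, and both the Fisher-Yates shuffle and the rounding of the cut position are assumptions, not necessarily what getDividedSet does internally.

```typescript
// Shuffle a copy of the data set, then cut it at the requested portion.
function divideSet<T>(
  dataSet: T[],
  portionGoesToFirst: number = 0.5,
  random: () => number = Math.random,
): [T[], T[]] {
  const shuffled = dataSet.slice();
  // Fisher-Yates shuffle: swap each position with a random earlier (or same) position.
  for (let i = shuffled.length - 1; i > 0; i -= 1) {
    const j = Math.floor(random() * (i + 1));
    const swapped = shuffled[i];
    shuffled[i] = shuffled[j];
    shuffled[j] = swapped;
  }
  const cut = Math.round(shuffled.length * portionGoesToFirst);
  return [shuffled.slice(0, cut), shuffled.slice(cut)];
}

// 70% goes to the first set, the rest to the second.
const [firstPart, secondPart] = divideSet([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 0.7);
```

The original array is left untouched, and every sample ends up in exactly one of the two returned sets.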

Variables

defaultConfiguration

Const defaultConfiguration: TreeGardenConfiguration

Default configuration. See code for more information.

Defined in

algorithmConfiguration/algorithmDefaultConfiguration.ts:27


defaultAttributeConfiguration

Const defaultAttributeConfiguration: Object

Default configuration for attribute.

Type declaration

Name Type
dataType "discrete" | "continuous" | "automatic"
growMissingValueReplacement undefined | (dataSet: TreeGardenDataSample[], attributeId: string, configuration: TreeGardenConfiguration) => (sampleWithMissingValue: TreeGardenDataSample) => string | number
evaluateMissingValueReplacement undefined | (dataSet: TreeGardenDataSample[], attributeId: string, configuration: TreeGardenConfiguration) => (sampleWithMissingValue: TreeGardenDataSample) => string | number
missingValue any
getAllPossibleSplitCriteriaForDiscreteAttribute undefined | (attributeId: string, dataSet: TreeGardenDataSample[], configuration: TreeGardenConfiguration) => SplitCriteriaDefinition[]
getAllPossibleSplitCriteriaForContinuousAttribute undefined | (attributeId: string, dataSet: TreeGardenDataSample[], configuration: TreeGardenConfiguration) => SplitCriteriaDefinition[]

Defined in

algorithmConfiguration/attibuteDefaultConfiguration.ts:12