Configuration from data sample
Let's obtain the algorithm configuration from just a single data sample.
This can be handy when we do not have access to the full training data set, for instance when we have built a classification service that uses a pretrained tree/forest.
code
// At prediction time we do not have the whole data set available, but we still need an algorithmConfiguration.
// We can derive the configuration from a single complete sample plus, for a classification tree, knowledge of all classes,
// so we do not need to write it by hand...
import {
  buildAlgorithmConfiguration,
  getTreePrediction,
  sampleTrees
} from 'tree-garden';
// Let's use a pretrained tree, which is bundled with tree-garden
const { tennisTree } = sampleTrees;
// we need a configuration in order to be able to predict unknown samples
// we will build the configuration from a single complete sample (no missing values) [sample for config]
// and knowledge of all classes
const singleSample = {
  _label: '5', outlook: 'Rain', temp: 'Cool', humidity: 'Normal', wind: 'Weak', _class: 'Yes'
};
// full configuration that can be used for predictions
const config = buildAlgorithmConfiguration(
  [singleSample],
  {
    allClasses: ['Yes', 'No'] // [important]
  }
);
// sample of interest - based on today's weather ;)
const shouldIGoToPlayTennisTodaySample = {
  outlook: 'Sunny',
  temp: 'Mild',
  humidity: 'Normal',
  wind: 'Weak'
};
// prediction from our imported tree
const shouldIStayOrShouldIGo = getTreePrediction(shouldIGoToPlayTennisTodaySample, tennisTree, config);
// let's see if I should go
console.log(`Hey mighty tree, should i go play tennis today?\nMighty tree says: ${shouldIStayOrShouldIGo}`);
comments
In this example we imported a bundled tree, took one data sample, and used it to build a configuration.
This configuration is then used to predict our unknown sample.
[sample for config]
Because we used just a single data sample to create the configuration, that sample must have no missing values:
every field that was present in the full data set during the learning phase must be included in this single sample.
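Before passing a sample to buildAlgorithmConfiguration, it may help to check that it really is complete. The sketch below is plain JavaScript and does not use tree-garden. The helper `findMissingFields` and the `expectedFields` list are hypothetical names introduced for illustration, and the check assumes that a missing value shows up as an absent key, `undefined`, or `null`.

```javascript
// Hypothetical helper, not part of tree-garden: returns the names of the
// expected fields that are absent, undefined, or null in the given sample.
function findMissingFields(sample, expectedFields) {
  return expectedFields.filter(
    (field) => sample[field] === undefined || sample[field] === null
  );
}

// Field names taken from the tennis example on this page.
const expectedFields = ['outlook', 'temp', 'humidity', 'wind'];

const completeSample = {
  _label: '5', outlook: 'Rain', temp: 'Cool', humidity: 'Normal', wind: 'Weak', _class: 'Yes'
};

// Illustrative incomplete sample: humidity and wind are not set.
const incompleteSample = { outlook: 'Rain', temp: 'Cool', _class: 'Yes' };

console.log(findMissingFields(completeSample, expectedFields)); // []
console.log(findMissingFields(incompleteSample, expectedFields)); // [ 'humidity', 'wind' ]
```

Only a sample for which the returned list is empty is a good candidate for building the configuration.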
[important] We also need to provide all classes present in our data set; in our case these are 'Yes' and 'No'.
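One way to avoid typing the class list by hand is to collect it while the full training data set is still available and store it next to the trained tree. The sketch below is plain JavaScript and does not use tree-garden. `collectAllClasses` is a hypothetical helper, the three rows are toy data for illustration only, and it assumes every training sample carries its class in the `_class` field, as in the example on this page.

```javascript
// Hypothetical helper, not part of tree-garden: collects the distinct
// `_class` values of the samples, in the order they are first seen.
function collectAllClasses(samples) {
  return [...new Set(samples.map((sample) => sample._class))];
}

// Toy training rows for illustration only.
const trainingSamples = [
  { outlook: 'Sunny', _class: 'No' },
  { outlook: 'Overcast', _class: 'Yes' },
  { outlook: 'Rain', _class: 'Yes' }
];

const allClasses = collectAllClasses(trainingSamples);
console.log(allClasses); // [ 'No', 'Yes' ]
```

The resulting array could then be supplied as the `allClasses` option when the configuration is later built from a single sample.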