Linfa 0.3.0 concentrates on polishing the existing implementations and adds only three new algorithms. A new feature system is introduced, which allows selecting the BLAS/LAPACK backend in the base crate. The Dataset interface has been polished and now follows the ndarray model more closely. The new linfa-datasets crate gives easy access to sample datasets and can be used for testing.
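As a small illustration of the new crate, a sample dataset can be pulled in and inspected with the usual ndarray accessors. This is a minimal sketch; it assumes the iris loader used throughout this post is enabled as a feature of linfa-datasets:

// load a bundled sample dataset (the iris loader is assumed to be enabled
// through the corresponding feature of linfa-datasets)
let dataset = linfa_datasets::iris();

// records() exposes the feature matrix as an ndarray, so its shape can be queried directly
let (nsamples, nfeatures) = dataset.records().dim();
println!("iris: {} samples with {} features", nsamples, nfeatures);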
New algorithms:

- linfa-clustering by [@Sauro98]
- linfa-bayes by [@VasanthakumarV]
- linfa-elasticnet by [@paulkoerbitz] and [@bytesnake]

Other changes:

- introduce linfa-datasets for easier testing (3cec12b4f)
- rename Dataset to DatasetBase and introduce Dataset and DatasetView (21dd579cf)

The following section shows a small example of how datasets interact with the training and testing of a linear decision tree.
You can load a dataset, shuffle it and then split it into training and validation sets:
// initialize the pseudo random number generator with seed 42
let mut rng = Isaac64Rng::seed_from_u64(42);

// load the Iris dataset, shuffle it and split it with a ratio of 0.8
let (train, test) = linfa_datasets::iris()
    .shuffle(&mut rng)
    .split_with_ratio(0.8);
With the training dataset, a linear decision tree model can be trained. Entropy is used here as the criterion for choosing the optimal split:
let entropy_model = DecisionTree::params()
    .split_quality(SplitQuality::Entropy)
    .max_depth(Some(100))
    .min_weight_split(10.0)
    .min_weight_leaf(10.0)
    .fit(&train);
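The same builder also accepts the Gini impurity as split quality. As a small sketch, assuming SplitQuality::Gini is the variant exposed by linfa-trees, a second model can be trained for comparison with otherwise identical parameters:

// a comparison tree using the Gini criterion (SplitQuality::Gini is assumed here;
// all other parameters mirror the entropy model above)
let gini_model = DecisionTree::params()
    .split_quality(SplitQuality::Gini)
    .max_depth(Some(100))
    .min_weight_split(10.0)
    .min_weight_leaf(10.0)
    .fit(&train);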
The validation dataset is now used to estimate the error. For this, labels are predicted for the validation set and a confusion matrix then gives insight into the types of errors:
let cm = entropy_model
    .predict(test.records().view())
    .confusion_matrix(&test);

println!("{:?}", cm);
println!(
    "Test accuracy with Entropy criterion: {:.2}%",
    100.0 * cm.accuracy()
);
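Beyond accuracy, the confusion matrix offers further summary metrics. The following sketch assumes that precision, recall and the Matthews correlation coefficient are exposed as precision(), recall() and mcc() on the confusion matrix type:

// further summary metrics (accessor names are assumptions about the confusion matrix API)
println!(
    "precision {:.2}, recall {:.2}, MCC {:.2}",
    cm.precision(),
    cm.recall(),
    cm.mcc()
);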
Finally, you can analyze which features were used in the decisions and export the whole tree to a TeX file. It will contain a TikZ tree with information on the splitting decisions and the impurity improvement at each split:
let feats = entropy_model.features();
println!("Features trained in this tree {:?}", feats);

let mut tikz = File::create("decision_tree_example.tex").unwrap();
tikz.write_all(entropy_model.export_to_tikz().to_string().as_bytes())
    .unwrap();
The whole example can be found in linfa-trees/examples/decision_tree.rs.