An Overview of the clj-ml API


Machine Learning library for Clojure built around Weka and friends



clj-ml.classifiers

by Antonio Garrote <antoniogarrote@gmail.com>
This namespace contains functions for building classifiers with several
classification algorithms: Bayes networks, multilayer perceptrons, decision trees and
support vector machines are available. Some of these classifiers have incremental
versions, so they can be built without holding all the dataset instances in memory.

Functions for evaluating the resulting classifiers, using either cross-validation or a
separate dataset, are also provided.

A sample use of the API for classifiers is shown below:

 ; clj-ml.data provides dataset-set-class and make-instance
 (use 'clj-ml.classifiers 'clj-ml.data)

 ; Building a classifier using a C4.5 decision tree
 ; (the keyword is spelled :decission-tree in this API)
 (def *classifier* (make-classifier :decission-tree :c45))

 ; We set the class attribute for the loaded dataset.
 ; *dataset* is supposed to contain a set of instances.
 (dataset-set-class *dataset* 4)

 ; Training the classifier
 (classifier-train *classifier* *dataset*)

 ; We evaluate the classifier using a test dataset.
 ; *testset* is supposed to contain the held-out instances.
 (def *evaluation* (classifier-evaluate *classifier* :dataset *dataset* *testset*))

 ; We retrieve some data from the evaluation result
 (:kappa *evaluation*)
 (:root-mean-squared-error *evaluation*)
 (:precision *evaluation*)

 ; A trained classifier can be used to classify new instances
 (def *to-classify* (make-instance *dataset* {:class :Iris-versicolor
                                              :petalwidth 0.2
                                              :petallength 1.4
                                              :sepalwidth 3.5
                                              :sepallength 5.1}))

 ; We retrieve the index of the class value assigned by the classifier
 (classifier-classify *classifier* *to-classify*)

 ; We retrieve a symbol with the value assigned by the classifier
 (classifier-label *classifier* *to-classify*)

A classifier can also be evaluated using cross-validation:

 ; 10-fold cross-validation over *dataset*
 (classifier-evaluate *classifier* :cross-validation *dataset* 10)
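
To make the idea of k-fold cross-validation concrete, here is a minimal sketch in plain Clojure. It is not how clj-ml or Weka implement it; `k-fold-splits` is a hypothetical helper name, and the interleaved fold assignment is an assumption chosen for brevity. Each item lands in the test set of exactly one fold and in the training set of all the others.

```clojure
;; Sketch only: illustrates the splitting behind k-fold cross-validation.
;; `k-fold-splits` is a hypothetical helper, not part of clj-ml.
(defn k-fold-splits
  "Returns a seq of k maps {:train [...] :test [...]}, one per fold.
   Item i goes to the test set of fold (mod i k)."
  [k items]
  (let [indexed (map-indexed vector items)]
    (for [fold (range k)]
      {:test  (vec (for [[i x] indexed :when (= fold (mod i k))] x))
       :train (vec (for [[i x] indexed :when (not= fold (mod i k))] x))})))

;; Seven items split into three folds
(def splits (k-fold-splits 3 [:a :b :c :d :e :f :g]))
```

A classifier would be trained on each `:train` part and scored on the matching `:test` part, with the scores combined across folds.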

Finally, a classifier can be stored in a file for later use:

 (use 'clj-ml.utils)

 (serialize-to-file *classifier*
  "/Users/antonio.garrote/Desktop/classifier.bin")
Public variables and functions: classifier-classify classifier-evaluate classifier-label classifier-train classifier-update make-classifier


clj-ml.clusterers

by Antonio Garrote <antoniogarrote@gmail.com>
This namespace contains functions for building clusterers with different
clustering algorithms. K-means, Cobweb and expectation-maximization algorithms
are currently supported. Some of these algorithms support incremental building
of the clustering without holding the full data set in main memory. Functions
for evaluating a clusterer and for clustering new instances are also provided.
Public variables and functions: clusterer-build clusterer-cluster clusterer-evaluate clusterer-info clusterer-update make-clusterer make-clusterer-options
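
As a rough illustration of what a K-means clusterer does, the sketch below implements one assignment-and-update iteration in plain Clojure. It does not call clj-ml or Weka; the helper names (`sq-dist`, `nearest`, `mean-point`, `k-means-step`) and the sample points are all assumptions made for this example.

```clojure
;; Sketch only: one K-means iteration on numeric vectors.
;; None of these helpers belong to clj-ml.
(defn sq-dist
  "Squared Euclidean distance between two numeric vectors."
  [a b]
  (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b)))

(defn nearest
  "Index of the centroid closest to point p."
  [centroids p]
  (first (apply min-key second
                (map-indexed (fn [i c] [i (sq-dist c p)]) centroids))))

(defn mean-point
  "Component-wise mean of a non-empty collection of points."
  [points]
  (let [n (count points)]
    (apply mapv (fn [& xs] (/ (reduce + xs) n)) points)))

(defn k-means-step
  "Assigns every point to its nearest centroid, then moves each centroid
   to the mean of its points. A centroid with no points is left unchanged."
  [centroids points]
  (let [groups (group-by #(nearest centroids %) points)]
    (vec (map-indexed (fn [i c] (if-let [ps (groups i)] (mean-point ps) c))
                      centroids))))

(def points [[1.0 1.0] [1.0 2.0] [8.0 8.0] [9.0 8.0]])
(def updated (k-means-step [[0.0 0.0] [10.0 10.0]] points))
```

Repeating `k-means-step` until the centroids stop moving gives the usual batch algorithm; assigning a new point with `nearest` corresponds to clustering a new instance.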


clj-ml.data

by Antonio Garrote <antoniogarrote@gmail.com>
This namespace contains functions for creating and manipulating data sets and
instances. The formats of these data sets, as well as their classes, can be
modified and assigned to the instances. Finally, data sets can be converted
into Clojure sequences that can be processed with the usual Clojure functions
such as map and reduce.
Public variables and functions: dataset-add dataset-at dataset-extract-at dataset-pop instance-to-map instance-to-vector make-dataset make-instance
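
Once instances are available as ordinary Clojure data, the standard sequence functions apply directly. The sketch below assumes that each instance has already been converted to a map of attribute name to value (the kind of result `instance-to-map` suggests); the sample maps are hand-written stand-ins, not output from the library.

```clojure
;; Sketch only: hand-written maps standing in for converted instances.
;; The attribute names and values are hypothetical.
(def instances
  [{:length 12.0 :width 34.0 :kind :good}
   {:length 24.0 :width 53.0 :kind :bad}
   {:length 18.0 :width 41.0 :kind :good}])

;; map: derive a new value from every instance
(def areas
  (map (fn [{:keys [length width]}] (* length width)) instances))

;; reduce: aggregate one attribute over the whole data set
(def mean-length
  (/ (reduce + (map :length instances)) (count instances)))

;; count how many instances fall into each class
(def by-kind (frequencies (map :kind instances)))
```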


clj-ml.data-store

by Antonio Garrote <antoniogarrote@gmail.com>
Functions for storing and retrieving data sets from a persistence store, such as a
database system.
Currently MongoDB is the only store supported.
Public variables and functions: data-store-connection-db data-store-load-dataset data-store-save-dataset make-data-store-connection


clj-ml.distance-functions

by Antonio Garrote <antoniogarrote@gmail.com>
Generates different distance metrics that can be passed as parameters to certain
classifiers and clusterers like K-Means.

Euclidean, Manhattan and Chebyshev distance functions are supported.
Public variables and functions: make-distance-function
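
For reference, the three metrics named above can be written in a few lines of plain Clojure. This is a sketch of the textbook definitions on numeric vectors, not the Weka implementations that the library wraps; the function names are chosen for this example only.

```clojure
;; Sketch only: textbook definitions, independent of clj-ml and Weka.
(defn euclidean
  "Square root of the sum of squared coordinate differences."
  [a b]
  (Math/sqrt (double (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b)))))

(defn manhattan
  "Sum of absolute coordinate differences."
  [a b]
  (reduce + (map (fn [x y] (Math/abs (double (- x y)))) a b)))

(defn chebyshev
  "Largest absolute coordinate difference."
  [a b]
  (reduce max (map (fn [x y] (Math/abs (double (- x y)))) a b)))

(euclidean [0 0] [3 4]) ; => 5.0
(manhattan [0 0] [3 4]) ; => 7.0
(chebyshev [0 0] [3 4]) ; => 4.0
```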


clj-ml.filters

by Antonio Garrote <antoniogarrote@gmail.com>
This namespace defines a set of functions that can be applied to data sets to modify the
dataset in some way: transforming nominal attributes into binary attributes, removing
attributes etc.

A sample use of the API is shown below:

  ;; *ds* is the dataset where the first attribute is to be removed
  (def *filter* (make-filter :remove-attributes {:dataset-format *ds* :attributes [0]}))

  ;; We apply the filter to the original data set and obtain the new one
  (def *filtered-ds* (filter-apply *filter* *ds*))


The previous sample of code could be rewritten with the make-apply-filter function:

  ;; There is no need to pass the :dataset-format option; the format of *ds* is used
  ;; automatically
  (def *filtered-ds* (make-apply-filter :remove-attributes {:attributes [0]} *ds*))
Public variables and functions: filter-apply make-apply-filter make-filter
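
The two transformations mentioned in the description, removing attributes and turning a nominal attribute into binary attributes, can be pictured on plain Clojure maps. The sketch below is conceptual only: the library's filters operate on Weka data sets, whereas these hypothetical helpers (`remove-attributes`, `nominal->binary`) work on a single map per instance, and the generated key names are an assumption of this example.

```clojure
;; Sketch only: conceptual equivalents on plain maps, not clj-ml filters.
(defn remove-attributes
  "Drops the given attribute keys from an instance map."
  [attrs instance]
  (apply dissoc instance attrs))

(defn nominal->binary
  "Replaces nominal attribute `attr` with one 0/1 attribute per possible value."
  [attr values instance]
  (let [v (get instance attr)]
    (into (dissoc instance attr)
          (for [x values]
            [(keyword (str (name attr) "=" (name x))) (if (= v x) 1 0)]))))

(def instance {:length 12.0 :width 34.0 :kind :good})

(def without-length (remove-attributes [:length] instance))
(def binarized (nominal->binary :kind [:good :bad] without-length))
```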


clj-ml.io

by Antonio Garrote <antoniogarrote@gmail.com>
Functions for loading datasets, classifiers and clusterers from files and other
persistence mechanisms, and for saving them back.
Public variables and functions: load-instances save-instances


clj-ml.kernel-functions

by Antonio Garrote <antoniogarrote@gmail.com>
Kernel functions that can be passed as parameters to support vector machine classifiers.

Polynomial, radial basis and string kernels are supported.
Public variables and functions: make-kernel-function make-kernel-function-options
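
As a reminder of what the numeric kernels compute, the sketch below writes a polynomial and a radial basis function (RBF) kernel in plain Clojure using common textbook forms. The parameter names (`degree`, `c`, `gamma`) and the exact formulas are assumptions of this example and may differ from the defaults of the underlying Weka kernels.

```clojure
;; Sketch only: textbook kernel forms, independent of clj-ml and Weka.
(defn dot
  "Dot product of two numeric vectors."
  [x y]
  (reduce + (map * x y)))

(defn polynomial-kernel
  "Returns K(x, y) = (x . y + c)^degree."
  [degree c]
  (fn [x y]
    (Math/pow (double (+ (dot x y) c)) (double degree))))

(defn rbf-kernel
  "Returns K(x, y) = exp(-gamma * ||x - y||^2)."
  [gamma]
  (fn [x y]
    (let [diff (map - x y)]
      (Math/exp (double (- (* gamma (dot diff diff))))))))

(def poly2 (polynomial-kernel 2 1))
(def rbf   (rbf-kernel 0.5))

(poly2 [1 2] [3 4]) ; => 144.0
```

Both functions return a closure so that the kernel parameters are fixed once and the resulting function can be handed to whatever code needs a two-argument similarity measure.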


clj-ml.ui

by Antonio Garrote <antoniogarrote@gmail.com>
Namespace containing functions for plotting classifiers, clusterers and data sets.
Public variables and functions: display-object