API for clj-ml.filters

by Antonio Garrote <antoniogarrote@gmail.com>

Usage:
(ns your-namespace
  (:require clj-ml.filters))

Overview

This namespace defines a set of functions that can be applied to datasets to modify them
in some way: transforming nominal attributes into binary attributes, removing
attributes, and so on.

A sample use of the API is shown below:

  ;; *ds* is the dataset where the first attribute is to be removed
  (def *filter* (make-filter :remove-attributes {:dataset-format *ds* :attributes [0]}))

  ;; We apply the filter to the original data set and obtain the new one
  (def *filtered-ds* (filter-apply *filter* *ds*))


The previous sample of code could be rewritten with the make-apply-filter function:

  ;; There is no need to pass the :dataset-format option; the format of *ds* is used
  ;; automatically
  (def *filtered-ds* (make-apply-filter :remove-attributes {:attributes [0]} *ds*))

Public Variables and Functions



filter-apply

function
Usage: (filter-apply filter dataset)
Filters an input dataset using the provided filter and generates an output dataset. The
first argument is the filter and the second is the dataset to which the filter should
be applied.


make-apply-filter

function
Usage: (make-apply-filter kind options dataset)
Creates a new filter with the provided options and applies it to the provided dataset.
The :dataset-format option used to build the filter is set to the
dataset passed as an argument if no other value is provided.

Calling this function is equivalent to the consecutive application of
make-filter and filter-apply.
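
The defaulting rule for :dataset-format can be pictured as a map merge in which
explicitly provided options win. The following is only a sketch of that rule, not
the library's code; with-default-dataset-format is a hypothetical helper name, and
the dataset is represented here by a placeholder keyword.

```clojure
;; Hypothetical helper, for illustration only: fills in :dataset-format
;; from the dataset unless the caller already supplied one.
(defn with-default-dataset-format
  [options dataset]
  (merge {:dataset-format dataset} options))

;; No :dataset-format given, so the dataset itself is used.
(with-default-dataset-format {:attributes [0]} :my-dataset)

;; An explicit :dataset-format is preserved.
(with-default-dataset-format {:attributes [0] :dataset-format :other-format}
                             :my-dataset)
```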


make-filter

multimethod
No usage documentation available
Creates a filter for the provided attributes format. The first argument must be a keyword
identifying the kind of filter to generate.
Currently the following filters are supported:

  - :supervised-discretize
  - :unsupervised-discretize
  - :supervised-nominal-to-binary
  - :unsupervised-nominal-to-binary
  - :remove-attributes
  - :select-append-attributes
  - :project-attributes

 The second parameter is a map of attributes
 for the filter to be built.

 An example of usage could be:

   (make-filter :remove-attributes {:attributes [0 1] :dataset-format dataset})

 Documentation for the different filters:

 * :supervised-discretize

   An instance filter that discretizes a range of numeric attributes
   in the dataset into nominal attributes. Discretization is by Fayyad
   & Irani's MDL method (the default).

   Parameters:

     - :attributes
         Index of the attributes to be discretized, sample value: [0,4,6]
     - :invert
         Invert the matching sense of the columns, sample value: true
     - :kononenko
         Use Kononenko's MDL criterion, sample value: true

 * :unsupervised-discretize

   Unsupervised version of the discretize filter. Discretization is by simple
   binning.

   Parameters:

     - :attributes
         Index of the attributes to be discretized, sample value: [0,4,6]
     - :dataset-format
         The dataset where the filter is going to be applied or a
         description of the format of its attributes. Sample value:
         dataset, (dataset-format dataset)
     - :unset-class
         Does not take the class attribute into account when applying
         the filter, sample value: true
     - :binary
     - :equal-frequency
         Use equal frequency instead of equal width discretization, sample
         value: true
     - :optimize
         Optimize the number of bins using a leave-one-out estimate of
         estimated entropy. Ignores the :binary option. Sample value: true
     - :number-bins
         Defines the number of bins to divide the numeric attributes into,
         sample value: 3
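
   The difference between equal-width binning and the :equal-frequency option can
   be illustrated in plain Clojure. This is a minimal sketch of the two ideas, not
   the filter's implementation: the function names are hypothetical, and the
   handling of ties and bin boundaries in the underlying library may differ.

```clojure
(defn equal-width-bin
  "0-based bin index of x when the range [lo, hi] is split into n equal-width bins."
  [lo hi n x]
  (let [width (/ (- hi lo) n)]
    ;; The maximum value would land in bin n, so clamp it into the last bin.
    (min (dec n) (int (/ (- x lo) width)))))

(defn equal-width-discretize
  "Maps each number in xs to one of n equal-width bins over the observed range."
  [n xs]
  (let [lo (apply min xs)
        hi (apply max xs)]
    (if (= lo hi)
      (mapv (constantly 0) xs)          ; degenerate range: everything in bin 0
      (mapv #(equal-width-bin lo hi n %) xs))))

(defn equal-frequency-discretize
  "Maps each number in xs to one of n bins holding roughly the same number of values."
  [n xs]
  (let [cnt    (count xs)
        ;; [original-index value] pairs, sorted by value
        order  (sort-by second (map-indexed vector xs))
        ;; the bin is derived from the rank of the value, not from its magnitude
        ranked (map-indexed (fn [rank [i _]]
                              [i (min (dec n) (quot (* rank n) cnt))])
                            order)]
    (mapv second (sort-by first ranked))))

(equal-width-discretize 3 [1 2 3 4 5 100])     ; => [0 0 0 0 0 2]
(equal-frequency-discretize 3 [1 2 3 4 5 100]) ; => [0 0 1 1 2 2]
```

   With a single outlier, equal-width binning puts almost every value in the first
   bin, while equal-frequency binning spreads the values evenly across the bins.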

 * :supervised-nominal-to-binary

   Converts nominal attributes into binary numeric attributes. An attribute with k values
   is transformed into k binary attributes if the class is nominal.

   Parameters:
     - :dataset-format
         The dataset where the filter is going to be applied or a
         description of the format of its attributes. Sample value:
         dataset, (dataset-format dataset)
     - :also-binary
         Sets whether binary attributes are to be coded as nominal ones, sample value: true
     - :for-each-nominal
         One binary attribute is created for each nominal value, not only when the
         nominal attribute has more than two values.
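
   The transformation of an attribute with k values into k binary attributes can
   be pictured with plain Clojure vectors. This is a sketch of the idea only,
   assuming rows are vectors and the nominal labels are known in advance; the
   function names are hypothetical and this is not how the filter is implemented.

```clojure
(defn nominal->binary
  "Expands one nominal value into indicator values, one per label in `labels`."
  [labels value]
  (mapv #(if (= % value) 1 0) labels))

(defn binarize-column
  "Replaces column `col` of every row with its indicator columns."
  [labels rows col]
  (mapv (fn [row]
          (vec (concat (subvec row 0 col)
                       (nominal->binary labels (nth row col))
                       (subvec row (inc col)))))
        rows))

;; Column 1 has k = 3 labels, so it becomes 3 binary columns.
(binarize-column [:red :green :blue]
                 [[1.0 :green 7]
                  [2.0 :blue  8]]
                 1)
;; => [[1.0 0 1 0 7] [2.0 0 0 1 8]]
```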

 * :unsupervised-nominal-to-binary

   Unsupervised version of the :supervised-nominal-to-binary filter.

   Parameters:

     - :attributes
         Index of the attributes to be binarized. Sample value: [1 2 3]
     - :dataset-format
         The dataset where the filter is going to be applied or a
         description of the format of its attributes. Sample value:
         dataset, (dataset-format dataset)
     - :also-binary
         Sets whether binary attributes are to be coded as nominal ones, sample value: true
     - :for-each-nominal
         One binary attribute is created for each nominal value, not only when the
         nominal attribute has more than two values. Sample value: true

 * :remove-attributes

   Removes the columns at the provided attribute indices from the dataset.

   Parameters:

     - :dataset-format
         The dataset where the filter is going to be applied or a
         description of the format of its attributes. Sample value:
         dataset, (dataset-format dataset)
     - :attributes
         Index of the attributes to remove. Sample value: [1 2 3]

 * :select-append-attributes

   Append a copy of the selected columns at the end of the dataset.

   Parameters:

     - :dataset-format
         The dataset where the filter is going to be applied or a
         description of the format of its attributes. Sample value:
         dataset, (dataset-format dataset)
     - :attributes
         Index of the attributes to select and append. Sample value: [1 2 3]
     - :invert
         Invert the selection of the columns. Sample value: [0 1]

 * :project-attributes

   Project some columns from the provided dataset

   Parameters:

     - :dataset-format
         The dataset where the filter is going to be applied or a
         description of the format of its attributes. Sample value:
         dataset, (dataset-format dataset)
     - :invert
         Invert the selection of columns. Sample value: [0 1]
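
   The three column-oriented filters (:remove-attributes, :select-append-attributes
   and :project-attributes) can be compared on a single row represented as a plain
   Clojure vector. This is a sketch of the semantics described above, not the
   library's code; select-columns and the three wrappers are hypothetical names,
   and the exact column ordering produced by the real filters is an assumption.

```clojure
(defn select-columns
  "Keeps the columns of `row` whose index is in `idxs`, in their original order.
  When `invert?` is true, keeps every other column instead."
  [idxs invert? row]
  (let [chosen? (set idxs)
        invert? (boolean invert?)]
    (vec (for [[i v] (map-indexed vector row)
               :when (not= invert? (contains? chosen? i))]
           v))))

(defn remove-columns  [idxs row] (select-columns idxs true row))
(defn project-columns [idxs row] (select-columns idxs false row))
(defn append-columns  [idxs row] (into (vec row) (select-columns idxs false row)))

(remove-columns  [0 2] [:a :b :c :d]) ; => [:b :d]
(project-columns [0 2] [:a :b :c :d]) ; => [:a :c]
(append-columns  [0 2] [:a :b :c :d]) ; => [:a :b :c :d :a :c]
```

   In this picture, removing a set of columns is the same as projecting onto the
   inverted selection.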