net.sf.myra.datamining.data
Class AbstractEntropyBasedBuilder

java.lang.Object
  extended by net.sf.myra.datamining.data.IntervalBuilder
      extended by net.sf.myra.datamining.data.AbstractEntropyBasedBuilder
Direct Known Subclasses:
EntropyIntervalBuilder, MDLEntropyIntervalBuilder

public abstract class AbstractEntropyBasedBuilder
extends IntervalBuilder

Version:
$Revision: 1981 $ $Date:: 2009-02-20 11:46:44#$
Author:
Fernando Esteban Barril Otero

Nested Class Summary
 
Nested classes/interfaces inherited from class net.sf.myra.datamining.data.IntervalBuilder
IntervalBuilder.CutPoint, IntervalBuilder.Interval
 
Field Summary
 
Fields inherited from class net.sf.myra.datamining.data.IntervalBuilder
BUILDER, DEFAULT_BUILDER, metadata, MINIMUM, minimumLimit
 
Constructor Summary
AbstractEntropyBasedBuilder(Metadata metadata)
          Creates a builder for the specified dataset metadata.
 
Method Summary
protected  double average(double value1, double value2)
           
 int count(java.util.List<Instance> instances, ContinuousAttribute attribute)
          Returns the number of candidate threshold values.
protected  IntervalBuilder.CutPoint[] create(double[][] matrix, int classes)
          Returns candidate cut point(s) given the data distribution.
protected abstract  IntervalBuilder.CutPoint[] create(double[][] matrix, int classes, boolean filter)
          Returns candidate cut point(s) given the data distribution.
 IntervalBuilder.Interval[] create(java.util.List<Instance> instances, ContinuousAttribute attribute)
          Returns the discrete intervals for the specified continuous attribute.
 IntervalBuilder.Interval createSingle(java.util.List<Instance> instances, ContinuousAttribute attribute)
          Returns an interval for the specified continuous attribute tailored for the specified instances.
 IntervalBuilder.Interval createSingle(java.util.List<Instance> instances, ContinuousAttribute attribute, java.lang.String label)
          Returns a discrete interval for the specified continuous attribute tailored for the specified instances and class value.
protected  int diversity(double[][] matrix, int index, int length, int c)
          Returns the number of classes (diversity) in the specified range of examples.
protected  double entropy(double[][] matrix, int index, int length, int k)
          Returns the entropy of the specified range of examples.
protected  void sort(double[][] matrix)
          Sorts the matrix of (attribute value, class value) pairs using a bubble sort algorithm.
protected  double weightedLength(double[][] matrix, int from, int to)
           
 
Methods inherited from class net.sf.myra.datamining.data.IntervalBuilder
getInstance
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AbstractEntropyBasedBuilder

public AbstractEntropyBasedBuilder(Metadata metadata)
Creates a builder for the specified dataset metadata.

Parameters:
metadata - the dataset metadata.
Method Detail

create

public IntervalBuilder.Interval[] create(java.util.List<Instance> instances,
                                         ContinuousAttribute attribute)
Description copied from class: IntervalBuilder
Returns the discrete intervals for the specified continuous attribute.

Specified by:
create in class IntervalBuilder
Parameters:
instances - the list of instances.
attribute - the continuous attribute.
Returns:
the discrete intervals for the specified continuous attribute.

createSingle

public IntervalBuilder.Interval createSingle(java.util.List<Instance> instances,
                                             ContinuousAttribute attribute)
Returns an interval for the specified continuous attribute tailored for the specified instances.

Specified by:
createSingle in class IntervalBuilder
Parameters:
instances - the list of instances.
attribute - the continuous attribute.
Returns:
an interval for the specified continuous attribute tailored for the specified instances.

createSingle

public IntervalBuilder.Interval createSingle(java.util.List<Instance> instances,
                                             ContinuousAttribute attribute,
                                             java.lang.String label)
Returns a discrete interval for the specified continuous attribute tailored for the specified instances and class value. The instances are arranged into a binary distribution (i.e. instances that belong to the specified class and instances that do not belong to the specified class).

Specified by:
createSingle in class IntervalBuilder
Parameters:
instances - the list of instances.
attribute - the continuous attribute.
label - the class value.
Returns:
a discrete interval for the specified continuous attribute tailored for the specified instances and class value.
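
The binary arrangement described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name BinaryDistributionSketch, the helper toBinary, the use of plain arrays in place of Instance objects, and the encoding (1.0 for the specified class, 0.0 for every other class) are all assumptions made for this example.

```java
import java.util.Arrays;

// Hypothetical sketch; names and encoding are assumptions, not the library's API.
class BinaryDistributionSketch {

    // Builds a (value, class) matrix in which the class column is collapsed
    // to two values: 1.0 when the label equals the target, 0.0 otherwise.
    static double[][] toBinary(double[] values, String[] labels, String target) {
        double[][] matrix = new double[values.length][2];
        for (int i = 0; i < values.length; i++) {
            matrix[i][0] = values[i];
            matrix[i][1] = target.equals(labels[i]) ? 1.0 : 0.0;
        }
        return matrix;
    }

    public static void main(String[] args) {
        double[] values = { 4.9, 6.1, 5.5 };
        String[] labels = { "setosa", "virginica", "versicolor" };
        System.out.println(Arrays.deepToString(toBinary(values, labels, "virginica")));
    }
}
```

With a two-class matrix of this kind, any entropy-based evaluation only has to separate the target class from everything else.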

count

public int count(java.util.List<Instance> instances,
                 ContinuousAttribute attribute)
Description copied from class: IntervalBuilder
Returns the number of candidate threshold values.

Specified by:
count in class IntervalBuilder
Parameters:
instances - the list of instances.
attribute - the continuous attribute.
Returns:
the number of candidate threshold values.

sort

protected void sort(double[][] matrix)
Sorts the matrix of (attribute value, class value) pairs using a bubble sort algorithm.

Parameters:
matrix - the matrix to be sorted.
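
As an illustration of the sort described above, the following minimal sketch bubble-sorts the rows of a matrix by attribute value. The row layout (column 0 holds the attribute value, column 1 the class index), the ascending order, and the class name SortSketch are assumptions made for this example; this page does not confirm them.

```java
import java.util.Arrays;

// Hypothetical sketch; not the library's implementation.
class SortSketch {

    // Assumed layout: matrix[i][0] = attribute value, matrix[i][1] = class index.
    // Rows are swapped as a whole so each value keeps its class.
    static void sort(double[][] matrix) {
        boolean swapped = true;
        for (int end = matrix.length - 1; end > 0 && swapped; end--) {
            swapped = false;
            for (int i = 0; i < end; i++) {
                if (matrix[i][0] > matrix[i + 1][0]) {
                    double[] tmp = matrix[i];
                    matrix[i] = matrix[i + 1];
                    matrix[i + 1] = tmp;
                    swapped = true;
                }
            }
        }
    }

    public static void main(String[] args) {
        double[][] m = { { 3.5, 1 }, { 1.2, 0 }, { 2.8, 1 } };
        sort(m);
        System.out.println(Arrays.deepToString(m));
    }
}
```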

entropy

protected double entropy(double[][] matrix,
                         int index,
                         int length,
                         int k)
Returns the entropy of the specified range of examples.

Parameters:
matrix - the (values,class) distribution.
index - the start index.
length - the number of positions to evaluate.
k - the total number of class values.
Returns:
the entropy of the specified range of examples.
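
For reference, the entropy of a range is conventionally the Shannon entropy, H = -sum(p_i * log2(p_i)), where p_i is the fraction of examples in the range that belong to class i. The sketch below computes it over a matrix range. It is not the library's implementation: the row layout (column 1 holds a class index in [0, k)), the base-2 logarithm, and the treatment of every example as having equal weight are assumptions made for this example.

```java
// Hypothetical sketch; not the library's implementation.
class EntropySketch {

    // Assumed layout: matrix[i][0] = attribute value,
    // matrix[i][1] = class index in [0, k).
    static double entropy(double[][] matrix, int index, int length, int k) {
        if (length <= 0) {
            return 0.0;
        }
        // Count how many examples of each class fall inside the range.
        int[] counts = new int[k];
        for (int i = index; i < index + length; i++) {
            counts[(int) matrix[i][1]]++;
        }
        // Accumulate -p * log2(p) over the classes that occur.
        double h = 0.0;
        for (int count : counts) {
            if (count > 0) {
                double p = (double) count / length;
                h -= p * (Math.log(p) / Math.log(2));
            }
        }
        return h;
    }

    public static void main(String[] args) {
        double[][] m = { { 1.0, 0 }, { 2.0, 0 }, { 3.0, 1 }, { 4.0, 1 } };
        System.out.println(entropy(m, 0, 4, 2)); // evenly mixed range
        System.out.println(entropy(m, 0, 2, 2)); // pure range
    }
}
```

An evenly mixed two-class range gives the maximum entropy of 1 bit, while a range containing a single class gives 0.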

diversity

protected int diversity(double[][] matrix,
                        int index,
                        int length,
                        int c)
Returns the number of classes (diversity) in the specified range of examples.

Parameters:
matrix - the (values,class) distribution.
index - the start index.
length - the number of positions to evaluate.
c - the number of classes.
Returns:
the number of classes within the specified range.
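
Counting the distinct classes in a range might look like the following minimal sketch. It is not the library's implementation: the row layout (column 1 holds a class index in [0, c)) and the class name DiversitySketch are assumptions made for this example.

```java
// Hypothetical sketch; not the library's implementation.
class DiversitySketch {

    // Assumed layout: matrix[i][0] = attribute value,
    // matrix[i][1] = class index in [0, c).
    static int diversity(double[][] matrix, int index, int length, int c) {
        boolean[] seen = new boolean[c];
        int distinct = 0;
        for (int i = index; i < index + length; i++) {
            int label = (int) matrix[i][1];
            if (!seen[label]) {
                seen[label] = true;
                distinct++;
            }
        }
        return distinct;
    }

    public static void main(String[] args) {
        double[][] m = { { 1.0, 0 }, { 2.0, 0 }, { 3.0, 2 }, { 4.0, 1 } };
        System.out.println(diversity(m, 0, 4, 3)); // whole matrix
        System.out.println(diversity(m, 0, 2, 3)); // first two rows only
    }
}
```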

weightedLength

protected double weightedLength(double[][] matrix,
                                int from,
                                int to)

average

protected double average(double value1,
                         double value2)

create

protected IntervalBuilder.CutPoint[] create(double[][] matrix,
                                            int classes)
Returns candidate cut point(s) given the data distribution.

Parameters:
matrix - the (values,class) distribution.
classes - the total number of class values.
Returns:
candidate cut point(s) given the data distribution.

create

protected abstract IntervalBuilder.CutPoint[] create(double[][] matrix,
                                                     int classes,
                                                     boolean filter)
Returns candidate cut point(s) given the data distribution.

Parameters:
matrix - the (values,class) distribution.
classes - the total number of class values.
filter - indicates whether the cut points should be filtered. When filtering cut points, it is guaranteed that this method will return 1 or 2 values; when 2 values are returned, the interval to be selected is the one between the two values.
Returns:
candidate cut point(s) given the data distribution.


Copyright © 2013. All Rights Reserved.