GClasses

GClasses::GNaiveInstance Class Reference

This is an instance-based learner. Instead of finding the k-nearest neighbors of a feature vector, it finds the k-nearest neighbors in each dimension. That is, it finds n*k neighbors (where n is the number of feature dimensions), considering each dimension independently. It then combines the labels from all of these neighbors to make a prediction. Finding neighbors in this way makes it more robust to high-dimensional datasets. It tends to perform worse than k-nn in low-dimensional space, and better than k-nn in high-dimensional space. It may be thought of as a cross between a k-nn instance learner and a Naive Bayes learner. It only supports continuous features and labels, so it is common to wrap it in a Categorize filter, which converts nominal features to a categorical distribution of continuous values.

#include <GNaiveInstance.h>

Inheritance diagram for GClasses::GNaiveInstance:
GClasses::GIncrementalLearner GClasses::GSupervisedLearner GClasses::GTransducer


Public Member Functions

 GNaiveInstance (GRand &rand)
 Constructs the learner. The number of neighbors (in each dimension) that contribute to the output value is set separately with setNeighbors.
 GNaiveInstance (GDomNode *pNode, GLearnerLoader &ll)
 Deserializing constructor.
virtual ~GNaiveInstance ()
virtual GDomNode * serialize (GDom *pDoc)
 Marshal this object into a DOM, which can then be converted to a variety of serial formats.
void setNeighbors (size_t k)
 Specify the number of neighbors to use.
size_t neighbors ()
 Returns the number of neighbors.
virtual void trainSparse (GSparseMatrix &features, GMatrix &labels)
 See the comment for GIncrementalLearner::trainSparse.
virtual void clear ()
 See the comment for GSupervisedLearner::clear.
void autoTune (GMatrix &features, GMatrix &labels)
 Uses cross-validation to find a set of parameters that works well with the provided data.

Static Public Member Functions

static void test ()
 Performs unit tests for this class. Throws an exception if there is a failure.

Protected Member Functions

void evalInput (size_t nInputDim, double dInput)
virtual void trainInner (GMatrix &features, GMatrix &labels)
 See the comment for GSupervisedLearner::trainInner.
virtual void predictInner (const double *pIn, double *pOut)
 See the comment for GSupervisedLearner::predictInner.
virtual void predictDistributionInner (const double *pIn, GPrediction *pOut)
 See the comment for GSupervisedLearner::predictDistributionInner.
virtual bool canImplicitlyHandleNominalFeatures ()
 See the comment for GTransducer::canImplicitlyHandleNominalFeatures.
virtual bool canImplicitlyHandleNominalLabels ()
 See the comment for GTransducer::canImplicitlyHandleNominalLabels.
virtual void beginIncrementalLearningInner (sp_relation &pFeatureRel, sp_relation &pLabelRel)
 See the comment for GIncrementalLearner::beginIncrementalLearningInner.
virtual void trainIncrementalInner (const double *pIn, const double *pOut)
 Incrementally train with a single instance.

Protected Attributes

size_t m_internalLabelDims
size_t m_internalFeatureDims
size_t m_nNeighbors
GNaiveInstanceAttr ** m_pAttrs
double * m_pValueSums
double * m_pWeightSums
double * m_pSumBuffer
double * m_pSumOfSquares
GHeap * m_pHeap

Detailed Description

This is an instance-based learner. Instead of finding the k-nearest neighbors of a feature vector, it finds the k-nearest neighbors in each dimension. That is, it finds n*k neighbors (where n is the number of feature dimensions), considering each dimension independently. It then combines the labels from all of these neighbors to make a prediction. Finding neighbors in this way makes it more robust to high-dimensional datasets. It tends to perform worse than k-nn in low-dimensional space, and better than k-nn in high-dimensional space. It may be thought of as a cross between a k-nn instance learner and a Naive Bayes learner. It only supports continuous features and labels, so it is common to wrap it in a Categorize filter, which converts nominal features to a categorical distribution of continuous values.
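
The per-dimension neighbor search described above can be illustrated with a small standalone sketch. It does not use the GClasses API and is not the library's implementation: the function name predictPerDimension is hypothetical, the data is held in plain std::vector containers, and the labels of the n*k neighbors are combined with a plain unweighted mean, which is an assumption made only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of GClasses. Predicts one continuous label by
// averaging the labels of the k nearest training rows in each feature
// dimension (n*k neighbors in total; a row may be counted more than once).
// The unweighted mean is an assumption of this sketch.
double predictPerDimension(const std::vector<std::vector<double>>& features,
                           const std::vector<double>& labels,
                           const std::vector<double>& query,
                           std::size_t k)
{
    assert(features.size() == labels.size());
    const std::size_t rows = features.size();
    const std::size_t dims = query.size();
    if (rows == 0 || dims == 0 || k == 0)
        return 0.0;
    k = std::min(k, rows);

    double sum = 0.0;
    std::size_t count = 0;
    std::vector<std::size_t> order(rows);
    for (std::size_t d = 0; d < dims; ++d) {
        for (std::size_t i = 0; i < rows; ++i)
            order[i] = i;
        // Move the k rows closest to the query in dimension d to the front.
        std::partial_sort(order.begin(), order.begin() + k, order.end(),
            [&](std::size_t a, std::size_t b) {
                return std::fabs(features[a][d] - query[d]) <
                       std::fabs(features[b][d] - query[d]);
            });
        // Accumulate the labels of those k neighbors.
        for (std::size_t j = 0; j < k; ++j) {
            sum += labels[order[j]];
            ++count;
        }
    }
    return sum / static_cast<double>(count);
}
```

Because each dimension is searched on its own, no distance over the full feature vector is ever computed, which is the property the description credits for the robustness in high-dimensional space.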


Constructor & Destructor Documentation

GClasses::GNaiveInstance::GNaiveInstance ( GRand & rand)

Constructs the learner. The number of neighbors (in each dimension) that contribute to the output value is set separately with setNeighbors.

GClasses::GNaiveInstance::GNaiveInstance ( GDomNode * pNode,
GLearnerLoader & ll 
)

Deserializing constructor.

virtual GClasses::GNaiveInstance::~GNaiveInstance ( ) [virtual]

Member Function Documentation

void GClasses::GNaiveInstance::autoTune ( GMatrix & features,
GMatrix & labels 
)

Uses cross-validation to find a set of parameters that works well with the provided data.
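
As a rough illustration of tuning by cross-validation, the sketch below picks a neighbor count from a list of candidates by summing an error measure over folds. It is not the body of autoTune: which parameters the library tunes and how it searches are not stated here. The names pickNeighborCount and FoldError are hypothetical, and the caller is assumed to supply a function that trains on one set of rows and reports the error on another.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical signature: error of a model with k neighbors that is trained on
// trainRows and evaluated on testRows (row indices into the caller's data).
using FoldError = std::function<double(std::size_t k,
                                       const std::vector<std::size_t>& trainRows,
                                       const std::vector<std::size_t>& testRows)>;

// Hypothetical helper, not part of GClasses. Returns the candidate k with the
// lowest error summed over all folds. Row i is held out in fold i % folds.
std::size_t pickNeighborCount(const std::vector<std::size_t>& candidates,
                              std::size_t rows, std::size_t folds,
                              const FoldError& foldError)
{
    assert(!candidates.empty() && folds >= 2 && rows >= folds);
    std::size_t bestK = candidates.front();
    double bestErr = std::numeric_limits<double>::infinity();
    for (std::size_t k : candidates) {
        double err = 0.0;
        for (std::size_t f = 0; f < folds; ++f) {
            // Split the row indices into held-out and training rows.
            std::vector<std::size_t> trainRows, testRows;
            for (std::size_t i = 0; i < rows; ++i)
                (i % folds == f ? testRows : trainRows).push_back(i);
            err += foldError(k, trainRows, testRows);
        }
        // Keep the first candidate that achieves the lowest total error.
        if (err < bestErr) {
            bestErr = err;
            bestK = k;
        }
    }
    return bestK;
}
```

In practice the supplied error function would train a learner with k neighbors on the training rows and measure its prediction error on the held-out rows.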

virtual void GClasses::GNaiveInstance::beginIncrementalLearningInner ( sp_relation & pFeatureRel,
sp_relation & pLabelRel 
) [protected, virtual]

See the comment for GIncrementalLearner::beginIncrementalLearningInner.

virtual bool GClasses::GNaiveInstance::canImplicitlyHandleNominalFeatures ( ) [inline, protected, virtual]

See the comment for GTransducer::canImplicitlyHandleNominalFeatures.

virtual bool GClasses::GNaiveInstance::canImplicitlyHandleNominalLabels ( ) [inline, protected, virtual]

See the comment for GTransducer::canImplicitlyHandleNominalLabels.

Reimplemented from GClasses::GTransducer.

virtual void GClasses::GNaiveInstance::clear ( ) [virtual]

See the comment for GSupervisedLearner::clear.

Implements GClasses::GSupervisedLearner.

void GClasses::GNaiveInstance::evalInput ( size_t  nInputDim,
double  dInput 
) [protected]

size_t GClasses::GNaiveInstance::neighbors ( ) [inline]

Returns the number of neighbors.

virtual void GClasses::GNaiveInstance::predictDistributionInner ( const double *  pIn,
GPrediction * pOut 
) [protected, virtual]

See the comment for GSupervisedLearner::predictDistributionInner.

virtual void GClasses::GNaiveInstance::predictInner ( const double *  pIn,
double *  pOut 
) [protected, virtual]

See the comment for GSupervisedLearner::predictInner.

virtual GDomNode* GClasses::GNaiveInstance::serialize ( GDom * pDoc) [virtual]

Marshal this object into a DOM, which can then be converted to a variety of serial formats.

Implements GClasses::GSupervisedLearner.

void GClasses::GNaiveInstance::setNeighbors ( size_t  k) [inline]

Specify the number of neighbors to use.

static void GClasses::GNaiveInstance::test ( ) [static]

Performs unit tests for this class. Throws an exception if there is a failure.

Reimplemented from GClasses::GSupervisedLearner.

virtual void GClasses::GNaiveInstance::trainIncrementalInner ( const double *  pIn,
const double *  pOut 
) [protected, virtual]

Incrementally train with a single instance.

Implements GClasses::GIncrementalLearner.

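virtual void GClasses::GNaiveInstance::trainInner ( GMatrix & features,
GMatrix & labels 
) [protected, virtual]

See the comment for GSupervisedLearner::trainInner.

virtual void GClasses::GNaiveInstance::trainSparse ( GSparseMatrix & features,
GMatrix & labels 
) [virtual]

See the comment for GIncrementalLearner::trainSparse.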

Member Data Documentation

GNaiveInstanceAttr** GClasses::GNaiveInstance::m_pAttrs [protected]