GClasses
This is an instance-based learner. Instead of finding the k nearest neighbors of a feature vector, it finds the k nearest neighbors in each dimension. That is, for n feature dimensions it finds n*k neighbors, considering each dimension independently. It then combines the labels from all of these neighbors to make a prediction. Finding neighbors in this way makes it more robust to high-dimensional datasets: it tends to perform worse than k-nn in low-dimensional space, and better than k-nn in high-dimensional space. It may be thought of as a cross between a k-nn instance learner and a Naive Bayes learner. It only supports continuous features and labels, so it is common to wrap it in a Categorize filter, which converts nominal features to a categorical distribution of continuous values.
#include <GNaiveInstance.h>
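The per-dimension neighbor search described above can be illustrated with a small standalone sketch. It is not the GClasses implementation: `predictPerDimension` is a hypothetical helper, it handles a single continuous label, and it assumes the neighbors' labels are combined by an unweighted mean, which the description does not specify.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not part of GClasses: predicts one continuous label by
// averaging the labels of the k nearest training rows in EACH feature
// dimension (n*k neighbors in total for n dimensions).
// Assumes every row of `features` has at least query.size() columns.
double predictPerDimension(const std::vector<std::vector<double>>& features,
                           const std::vector<double>& labels,
                           const std::vector<double>& query,
                           std::size_t k)
{
    assert(features.size() == labels.size());
    double sum = 0.0;
    std::size_t count = 0;
    for (std::size_t dim = 0; dim < query.size(); ++dim)
    {
        // Distance of every training row to the query in this dimension only.
        std::vector<std::pair<double, std::size_t>> byDistance;
        for (std::size_t row = 0; row < features.size(); ++row)
            byDistance.emplace_back(std::fabs(features[row][dim] - query[dim]), row);

        // Move the k closest rows to the front of the list.
        const std::size_t take = std::min(k, byDistance.size());
        std::partial_sort(byDistance.begin(), byDistance.begin() + take, byDistance.end());

        // Assumption: neighbors are combined with an unweighted mean.
        for (std::size_t i = 0; i < take; ++i)
        {
            sum += labels[byDistance[i].second];
            ++count;
        }
    }
    return count > 0 ? sum / static_cast<double>(count) : 0.0;
}
```

Because each dimension is searched on its own, a query can draw neighbors from rows that are far apart as full vectors, which is what distinguishes this approach from ordinary k-nn.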
Public Member Functions | |
GNaiveInstance (GRand &rand) | |
Constructor. The number of neighbors (in each dimension) that contribute to the output value is specified with setNeighbors. | |
GNaiveInstance (GDomNode *pNode, GLearnerLoader &ll) | |
Deserializing constructor. | |
virtual | ~GNaiveInstance () |
virtual GDomNode * | serialize (GDom *pDoc) |
Marshal this object into a DOM, which can then be converted to a variety of serial formats. | |
void | setNeighbors (size_t k) |
Specify the number of neighbors to use. | |
size_t | neighbors () |
Returns the number of neighbors. | |
virtual void | trainSparse (GSparseMatrix &features, GMatrix &labels) |
See the comment for GIncrementalLearner::trainSparse. | |
virtual void | clear () |
See the comment for GSupervisedLearner::clear. | |
void | autoTune (GMatrix &features, GMatrix &labels) |
Uses cross-validation to find a set of parameters that works well with the provided data. | |
Static Public Member Functions | |
static void | test () |
Performs unit tests for this class. Throws an exception if there is a failure. | |
Protected Member Functions | |
void | evalInput (size_t nInputDim, double dInput) |
virtual void | trainInner (GMatrix &features, GMatrix &labels) |
See the comment for GSupervisedLearner::trainInner. | |
virtual void | predictInner (const double *pIn, double *pOut) |
See the comment for GSupervisedLearner::predictInner. | |
virtual void | predictDistributionInner (const double *pIn, GPrediction *pOut) |
See the comment for GSupervisedLearner::predictDistributionInner. | |
virtual bool | canImplicitlyHandleNominalFeatures () |
See the comment for GTransducer::canImplicitlyHandleNominalFeatures. | |
virtual bool | canImplicitlyHandleNominalLabels () |
See the comment for GTransducer::canImplicitlyHandleNominalLabels. | |
virtual void | beginIncrementalLearningInner (sp_relation &pFeatureRel, sp_relation &pLabelRel) |
See the comment for GIncrementalLearner::beginIncrementalLearningInner. | |
virtual void | trainIncrementalInner (const double *pIn, const double *pOut) |
Incrementally train with a single instance. | |
Protected Attributes | |
size_t | m_internalLabelDims |
size_t | m_internalFeatureDims |
size_t | m_nNeighbors |
GNaiveInstanceAttr ** | m_pAttrs |
double * | m_pValueSums |
double * | m_pWeightSums |
double * | m_pSumBuffer |
double * | m_pSumOfSquares |
GHeap * | m_pHeap |
GClasses::GNaiveInstance::GNaiveInstance | ( | GRand & | rand | ) |
Constructor. The number of neighbors (in each dimension) that contribute to the output value is specified with setNeighbors.
GClasses::GNaiveInstance::GNaiveInstance | ( | GDomNode * | pNode, |
GLearnerLoader & | ll | ||
) |
Deserializing constructor.
virtual GClasses::GNaiveInstance::~GNaiveInstance | ( | ) | [virtual] |
void GClasses::GNaiveInstance::autoTune | ( | GMatrix & | features, |
GMatrix & | labels | ||
) |
Uses cross-validation to find a set of parameters that works well with the provided data.
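As a rough illustration of tuning by cross-validation, the sketch below picks the neighbor count with the lowest leave-one-out squared error. It is not what autoTune does internally: `predictWithout` and `tuneNeighbors` are hypothetical helpers, leave-one-out is just one form of cross-validation chosen for brevity, and the stand-in predictor assumes a single label combined by an unweighted mean.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

using Rows = std::vector<std::vector<double>>;

// Hypothetical stand-in for the learner: mean label of the k nearest rows in
// each dimension, ignoring the held-out row at index `skip`.
double predictWithout(const Rows& x, const std::vector<double>& y,
                      std::size_t skip, std::size_t k)
{
    double sum = 0.0;
    std::size_t count = 0;
    for (std::size_t dim = 0; dim < x[skip].size(); ++dim)
    {
        std::vector<std::pair<double, std::size_t>> d;
        for (std::size_t r = 0; r < x.size(); ++r)
            if (r != skip)
                d.emplace_back(std::fabs(x[r][dim] - x[skip][dim]), r);
        const std::size_t take = std::min(k, d.size());
        std::partial_sort(d.begin(), d.begin() + take, d.end());
        for (std::size_t i = 0; i < take; ++i, ++count)
            sum += y[d[i].second];
    }
    return count > 0 ? sum / static_cast<double>(count) : 0.0;
}

// Returns the candidate k with the lowest leave-one-out squared error.
// Ties are resolved in favor of the earlier candidate.
std::size_t tuneNeighbors(const Rows& x, const std::vector<double>& y,
                          const std::vector<std::size_t>& candidates)
{
    assert(x.size() == y.size() && !candidates.empty());
    std::size_t best = candidates.front();
    double bestErr = std::numeric_limits<double>::infinity();
    for (std::size_t k : candidates)
    {
        double err = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i)
        {
            const double diff = predictWithout(x, y, i, k) - y[i];
            err += diff * diff;
        }
        if (err < bestErr)
        {
            bestErr = err;
            best = k;
        }
    }
    return best;
}
```

With the real class, the chosen value would presumably be applied through setNeighbors; the sketch only shows the selection loop.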
virtual void GClasses::GNaiveInstance::beginIncrementalLearningInner | ( | sp_relation & | pFeatureRel, |
sp_relation & | pLabelRel | ||
) | [protected, virtual] |
See the comment for GIncrementalLearner::beginIncrementalLearningInner.
Implements GClasses::GIncrementalLearner.
virtual bool GClasses::GNaiveInstance::canImplicitlyHandleNominalFeatures | ( | ) | [inline, protected, virtual] |
See the comment for GTransducer::canImplicitlyHandleNominalFeatures.
Reimplemented from GClasses::GTransducer.
virtual bool GClasses::GNaiveInstance::canImplicitlyHandleNominalLabels | ( | ) | [inline, protected, virtual] |
See the comment for GTransducer::canImplicitlyHandleNominalLabels.
Reimplemented from GClasses::GTransducer.
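Since the learner supports only continuous features and labels, nominal attributes have to be expanded into continuous values before it sees them. The following is a minimal sketch of one such expansion (a one-hot encoding); `expandNominal` is a hypothetical helper and is not the Categorize filter's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the GClasses Categorize filter: expands one nominal
// value (an index in [0, valueCount)) into valueCount continuous values, with
// 1.0 in the slot of the observed category and 0.0 everywhere else.
std::vector<double> expandNominal(std::size_t value, std::size_t valueCount)
{
    assert(value < valueCount);
    std::vector<double> out(valueCount, 0.0);
    out[value] = 1.0;
    return out;
}
```

A nominal attribute with three possible values therefore becomes three continuous feature dimensions, each of which gets its own neighbor search.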
virtual void GClasses::GNaiveInstance::clear | ( | ) | [virtual] |
See the comment for GSupervisedLearner::clear.
Implements GClasses::GSupervisedLearner.
void GClasses::GNaiveInstance::evalInput | ( | size_t | nInputDim, |
double | dInput | ||
) | [protected] |
size_t GClasses::GNaiveInstance::neighbors | ( | ) | [inline] |
Returns the number of neighbors.
virtual void GClasses::GNaiveInstance::predictDistributionInner | ( | const double * | pIn, |
GPrediction * | pOut | ||
) | [protected, virtual] |
See the comment for GSupervisedLearner::predictDistributionInner.
Implements GClasses::GSupervisedLearner.
virtual void GClasses::GNaiveInstance::predictInner | ( | const double * | pIn, |
double * | pOut | ||
) | [protected, virtual] |
See the comment for GSupervisedLearner::predictInner.
Implements GClasses::GSupervisedLearner.
virtual GDomNode* GClasses::GNaiveInstance::serialize | ( | GDom * | pDoc | ) | [virtual] |
Marshal this object into a DOM, which can then be converted to a variety of serial formats.
Implements GClasses::GSupervisedLearner.
void GClasses::GNaiveInstance::setNeighbors | ( | size_t | k | ) | [inline] |
Specify the number of neighbors to use.
static void GClasses::GNaiveInstance::test | ( | ) | [static] |
Performs unit tests for this class. Throws an exception if there is a failure.
Reimplemented from GClasses::GSupervisedLearner.
virtual void GClasses::GNaiveInstance::trainIncrementalInner | ( | const double * | pIn, |
const double * | pOut | ||
) | [protected, virtual] |
Incrementally train with a single instance.
Implements GClasses::GIncrementalLearner.
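One way to picture incremental training for a per-dimension learner is to keep a sorted index per feature dimension and insert each new instance into every index. The sketch below is purely illustrative: `PerDimensionIndex` is a hypothetical class, it stores a single label per instance, and nothing here is taken from the GClasses internals (m_pAttrs and m_pHeap may work quite differently).

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <vector>

// Hypothetical illustration, not the GClasses implementation: one sorted
// index per feature dimension, mapping a feature value to its row's label.
class PerDimensionIndex
{
public:
    explicit PerDimensionIndex(std::size_t dims) : m_index(dims) {}

    // Incrementally train with a single instance (one label for simplicity).
    void addInstance(const std::vector<double>& in, double label)
    {
        assert(in.size() == m_index.size());
        for (std::size_t dim = 0; dim < in.size(); ++dim)
            m_index[dim].emplace(in[dim], label);
    }

    // Mean label of the k nearest stored values in every dimension.
    double predict(const std::vector<double>& in, std::size_t k) const
    {
        assert(in.size() == m_index.size());
        double sum = 0.0;
        std::size_t count = 0;
        for (std::size_t dim = 0; dim < in.size(); ++dim)
        {
            const auto& index = m_index[dim];
            auto right = index.lower_bound(in[dim]); // first value >= query
            auto left = right;                       // smaller values lie before it
            for (std::size_t taken = 0; taken < k; ++taken)
            {
                const bool hasLeft = left != index.begin();
                const bool hasRight = right != index.end();
                if (!hasLeft && !hasRight)
                    break; // fewer than k instances stored
                // Walk outward, taking whichever side is closer to the query.
                if (hasLeft && (!hasRight ||
                    in[dim] - std::prev(left)->first <= right->first - in[dim]))
                    sum += (--left)->second;
                else
                    sum += (right++)->second;
                ++count;
            }
        }
        return count > 0 ? sum / static_cast<double>(count) : 0.0;
    }

private:
    std::vector<std::multimap<double, double>> m_index;
};
```

Keeping each dimension sorted means a new instance costs one ordered insert per dimension, and a prediction only has to walk outward from the query position instead of scanning every stored row.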
virtual void GClasses::GNaiveInstance::trainInner | ( | GMatrix & | features, |
GMatrix & | labels | ||
) | [protected, virtual] |
See the comment for GSupervisedLearner::trainInner.
Implements GClasses::GSupervisedLearner.
virtual void GClasses::GNaiveInstance::trainSparse | ( | GSparseMatrix & | features, |
GMatrix & | labels | ||
) | [virtual] |
See the comment for GIncrementalLearner::trainSparse.
Implements GClasses::GIncrementalLearner.
size_t GClasses::GNaiveInstance::m_internalFeatureDims [protected] |
size_t GClasses::GNaiveInstance::m_internalLabelDims [protected] |
size_t GClasses::GNaiveInstance::m_nNeighbors [protected] |
GNaiveInstanceAttr** GClasses::GNaiveInstance::m_pAttrs [protected] |
GHeap* GClasses::GNaiveInstance::m_pHeap [protected] |
double* GClasses::GNaiveInstance::m_pSumBuffer [protected] |
double* GClasses::GNaiveInstance::m_pSumOfSquares [protected] |
double* GClasses::GNaiveInstance::m_pValueSums [protected] |
double* GClasses::GNaiveInstance::m_pWeightSums [protected] |