GClasses
|
An implementation of a Kohonen self-organizing map. More...
#include <GSelfOrganizingMap.h>
Public Member Functions | |
GSelfOrganizingMap (int nMapDims, int nMapWidth, GRand *pRand, SOM::Reporter *r=new SOM::NoReporting()) | |
Creates a map whose nodes are on a grid that uses euclidean distance to find the nearest point and is trained with the batch SOM algorithm with neighborhood decreasing exponentially from 2*nMapWidth to 1 in 100 super-epochs with each super-epoch involving waiting up to 100 iterations for convergence at that neighborhood width. The neighbors affect one another through a unit height Gaussian with sigma=neighborhood size. Every epoch and super-epoch, the reporter is notified. This map will be responsible for deleting the dynamically allocated reporter. | |
GSelfOrganizingMap (const std::vector< double > &outputAxes, std::size_t numNodes, SOM::NodeLocationInitialization *topology, SOM::TrainingAlgorithm *trainer, GDistanceMetric *weightDistance, GDistanceMetric *nodeDistance) | |
Create a self-organizing map. | |
GSelfOrganizingMap (GDomNode *pNode) | |
Reconstruct this self-organizing map from its serialized form in a dom document. | |
virtual | ~GSelfOrganizingMap () |
virtual GMatrix * | doit (GMatrix &in) |
Transforms pIn after training on it. | |
virtual GDomNode * | serialize (GDom *) |
Add this map to a dom document and return the pointer to the tree added. WARNING: the current imlementation DOES NOT SERIALIZE THE TRAINING ALGORITHM - do not train a GSelfOrganizingMap obtained from a dom file. | |
virtual void | train (GMatrix &in) |
see comment on GIncrementalTransform::train(GMatrix&) | |
virtual void | beginIncrementalTraining (sp_relation &, double *, double *) |
TODO: implement enableIncrementalTraining. | |
void | transform (const double *pIn, double *pOut) |
Transform the given vector from input coordinates to map coordinates by finding the best match among the nodes. | |
std::size_t | bestMatch (const double *pIn) const |
Return the index of the node whose weight vector best matches in under the distance metric for this SOM. Assumes there is at least one node. | |
std::vector< std::size_t > | bestData (const GMatrix *pData) const |
Given a matrix containing input data of the correct dimensions, returns a vector v of nodes.size() indices into that matrix. v[i] is the index of the data in pData that gives the strongest response for node i. That is, pData->row(v[i]) is the data element that best matches node i. | |
std::vector< std::size_t > | nearestNeighbors (unsigned nodeIdx, unsigned numNeighbors, double epsilon=1e-8) const |
Inspector giving the indices of the nearest neighbors in output space of the node at index nodeIdx. If any neighbor with a given distance is returned all neighbors at that distance will be returned. Any distances within epsilon of one another are considered equivalent. | |
std::vector< SOM::NodeAndDistance > | neighborsInCircle (unsigned nodeIdx, double radius) const |
Return all neighbors of the node at that have a distance from nodeIdx of less than radius. | |
unsigned | inputDimensions () const |
Return the number of dimensions this takes as input. | |
unsigned | outputDimensions () const |
Return the number of dimensions this returns as output. | |
std::vector< double > | outputAxes () const |
Return the maximum for each output axis. | |
const std::vector< SOM::Node > & | nodes () const |
Inspector for the nodes making up this map. | |
std::vector< SOM::Node > & | nodes () |
Accessor for the nodes making up this map. | |
const GDistanceMetric * | weightDistance () const |
Inspector for the distance metric used in input space, that is, between an input point and a weight for determining the winner. | |
const GDistanceMetric * | nodeDistance () const |
Inspector for the distance metric used in output space, that is, between two nodes for their relative influence. | |
Static Public Member Functions | |
static void | test () |
Performs unit tests for this class. Throws an exception if there is a failure. | |
Protected Member Functions | |
void | invalidateSortedNeighbors () |
Marks sortedNeighbors as invalid and need of regeneration. | |
void | regenerateSortedNeighbors () const |
Eliminates the current contents of m_sortedNeighbors and regenerates it from m_pNodeDistance and m_nodes. sets m_sortedNeighborsIsValid on completion. Note that the members this method changes are mutable, thus it can be marked as a const method and called from const methods. | |
const std::vector< std::vector < SOM::NodeAndDistance > > & | sortedNeighbors () const |
Return a sorted neighbor structure in which entry i is a list of the node indices sorted by their distance from node i. | |
Protected Attributes | |
unsigned | m_nInputDims |
Number of input dimensions. | |
std::vector< double > | m_outputAxes |
Vector containing the maximum number in each output axis - so axis #0 would have a range 0..m_vOutputAxes[0]. Remember to call m_pNodeDistance->init whenever you change the number of output axes. | |
SOM::TrainingAlgorithm * | m_pTrainer |
The algorithm to be used to train this map on the next call to train. This object owns the training algorithm and is responsable for deleting it. | |
GDistanceMetric * | m_pWeightDistance |
The distance function to use when finding the closest weight to a given input point - owned by this object. Remember to call init on this whenever you change the number of weights. | |
GDistanceMetric * | m_pNodeDistance |
The distance function to use when calculating the distance between nodes in the map. If you change this, remember to invalidate the neighbor structure. - owned by this object. Remember to call init whenever you change the number of output axes. | |
std::vector< SOM::Node > | m_nodes |
The nodes in the map. Do not change the order of nodes once sortedNeighbors has been created unless you also invalidate sortedNeighbors. Remember to call m_pWeightDistance->init if you change the number of weights. | |
bool | m_sortedNeighborsIsValid |
std::vector< std::vector < SOM::NodeAndDistance > > | m_sortedNeighbors |
Each entry sortedNeighbors[i] contains a vector of all the other nodes sorted by their distance from i in outputSpace. | |
Friends | |
class | SOM::TrainingAlgorithm |
An implementation of a Kohonen self-organizing map.
Note: does not support more than 2^52 nodes -- I don't believe this will be a problem within the next 20 years.
See: T. Kohonen "Self Organizing Maps" Third Edition, 2001, published by Springer
GClasses::GSelfOrganizingMap::GSelfOrganizingMap | ( | int | nMapDims, |
int | nMapWidth, | ||
GRand * | pRand, | ||
SOM::Reporter * | r = new SOM::NoReporting() |
||
) |
Creates a map whose nodes are on a grid that uses euclidean distance to find the nearest point and is trained with the batch SOM algorithm with neighborhood decreasing exponentially from 2*nMapWidth to 1 in 100 super-epochs with each super-epoch involving waiting up to 100 iterations for convergence at that neighborhood width. The neighbors affect one another through a unit height Gaussian with sigma=neighborhood size. Every epoch and super-epoch, the reporter is notified. This map will be responsible for deleting the dynamically allocated reporter.
nMapDims specifies the number of dimensions for the map.
nMapWidth specifies the size in one dimension of the map.
(so if nMapDims is 3 and nMapWidth is 10, the map will contain 1000 (10^3) nodes.)
GClasses::GSelfOrganizingMap::GSelfOrganizingMap | ( | const std::vector< double > & | outputAxes, |
std::size_t | numNodes, | ||
SOM::NodeLocationInitialization * | topology, | ||
SOM::TrainingAlgorithm * | trainer, | ||
GDistanceMetric * | weightDistance, | ||
GDistanceMetric * | nodeDistance | ||
) |
Create a self-organizing map.
outputAxes | each node in the network will have a vector location and component i of that vector will range from 0..outputAxes[i] |
numNodes | the number of nodes in the network |
topology | determines the locations assigned to each node |
trainer | algorithm to train the network on data |
weightDistance | the distance used in determining which node's weight-set is closest to an input point |
nodeDistance | the distance used to determine the influence of two nodes with different coordinates on one another. |
GClasses::GSelfOrganizingMap::GSelfOrganizingMap | ( | GDomNode * | pNode | ) |
Reconstruct this self-organizing map from its serialized form in a dom document.
virtual GClasses::GSelfOrganizingMap::~GSelfOrganizingMap | ( | ) | [virtual] |
virtual void GClasses::GSelfOrganizingMap::beginIncrementalTraining | ( | sp_relation & | , |
double * | , | ||
double * | |||
) | [inline, virtual] |
TODO: implement enableIncrementalTraining.
std::vector<std::size_t> GClasses::GSelfOrganizingMap::bestData | ( | const GMatrix * | pData | ) | const |
Given a matrix containing input data of the correct dimensions, returns a vector v of nodes.size() indices into that matrix. v[i] is the index of the data in pData that gives the strongest response for node i. That is, pData->row(v[i]) is the data element that best matches node i.
Assumes there is at least one row in pData
std::size_t GClasses::GSelfOrganizingMap::bestMatch | ( | const double * | pIn | ) | const |
Return the index of the node whose weight vector best matches in under the distance metric for this SOM. Assumes there is at least one node.
Transforms pIn after training on it.
Reimplemented from GClasses::GIncrementalTransform.
unsigned GClasses::GSelfOrganizingMap::inputDimensions | ( | ) | const [inline] |
Return the number of dimensions this takes as input.
void GClasses::GSelfOrganizingMap::invalidateSortedNeighbors | ( | ) | [inline, protected] |
Marks sortedNeighbors as invalid and need of regeneration.
std::vector<std::size_t> GClasses::GSelfOrganizingMap::nearestNeighbors | ( | unsigned | nodeIdx, |
unsigned | numNeighbors, | ||
double | epsilon = 1e-8 |
||
) | const |
Inspector giving the indices of the nearest neighbors in output space of the node at index nodeIdx. If any neighbor with a given distance is returned all neighbors at that distance will be returned. Any distances within epsilon of one another are considered equivalent.
If there are enough nodes in the network, at least numNeigbors neighbors will be returned. If there are fewer nodes in the network than the number of neighbors, a list of all indices will be returned.
std::vector<SOM::NodeAndDistance> GClasses::GSelfOrganizingMap::neighborsInCircle | ( | unsigned | nodeIdx, |
double | radius | ||
) | const |
Return all neighbors of the node at that have a distance from nodeIdx of less than radius.
const GDistanceMetric* GClasses::GSelfOrganizingMap::nodeDistance | ( | ) | const [inline] |
Inspector for the distance metric used in output space, that is, between two nodes for their relative influence.
const std::vector<SOM::Node>& GClasses::GSelfOrganizingMap::nodes | ( | ) | const [inline] |
Inspector for the nodes making up this map.
std::vector<SOM::Node>& GClasses::GSelfOrganizingMap::nodes | ( | ) | [inline] |
Accessor for the nodes making up this map.
std::vector<double> GClasses::GSelfOrganizingMap::outputAxes | ( | ) | const [inline] |
Return the maximum for each output axis.
unsigned GClasses::GSelfOrganizingMap::outputDimensions | ( | ) | const [inline] |
Return the number of dimensions this returns as output.
void GClasses::GSelfOrganizingMap::regenerateSortedNeighbors | ( | ) | const [protected] |
Eliminates the current contents of m_sortedNeighbors and regenerates it from m_pNodeDistance and m_nodes. sets m_sortedNeighborsIsValid on completion. Note that the members this method changes are mutable, thus it can be marked as a const method and called from const methods.
Add this map to a dom document and return the pointer to the tree added.
WARNING: the current imlementation DOES NOT SERIALIZE THE TRAINING ALGORITHM - do not train a GSelfOrganizingMap obtained from a dom file.
TODO: make all serialize's const
Implements GClasses::GIncrementalTransform.
const std::vector<std::vector<SOM::NodeAndDistance> >& GClasses::GSelfOrganizingMap::sortedNeighbors | ( | ) | const [inline, protected] |
Return a sorted neighbor structure in which entry i is a list of the node indices sorted by their distance from node i.
static void GClasses::GSelfOrganizingMap::test | ( | ) | [static] |
Performs unit tests for this class. Throws an exception if there is a failure.
virtual void GClasses::GSelfOrganizingMap::train | ( | GMatrix & | in | ) | [inline, virtual] |
see comment on GIncrementalTransform::train(GMatrix&)
Implements GClasses::GIncrementalTransform.
void GClasses::GSelfOrganizingMap::transform | ( | const double * | pIn, |
double * | pOut | ||
) | [virtual] |
Transform the given vector from input coordinates to map coordinates by finding the best match among the nodes.
see comment on GIncrementalTransform::(const double*, double*)
Implements GClasses::GIncrementalTransform.
const GDistanceMetric* GClasses::GSelfOrganizingMap::weightDistance | ( | ) | const [inline] |
Inspector for the distance metric used in input space, that is, between an input point and a weight for determining the winner.
friend class SOM::TrainingAlgorithm [friend] |
unsigned GClasses::GSelfOrganizingMap::m_nInputDims [protected] |
Number of input dimensions.
std::vector<SOM::Node> GClasses::GSelfOrganizingMap::m_nodes [protected] |
The nodes in the map. Do not change the order of nodes once sortedNeighbors has been created unless you also invalidate sortedNeighbors. Remember to call m_pWeightDistance->init if you change the number of weights.
std::vector<double> GClasses::GSelfOrganizingMap::m_outputAxes [protected] |
Vector containing the maximum number in each output axis - so axis #0 would have a range 0..m_vOutputAxes[0]. Remember to call m_pNodeDistance->init whenever you change the number of output axes.
The distance function to use when calculating the distance between nodes in the map. If you change this, remember to invalidate the neighbor structure. - owned by this object. Remember to call init whenever you change the number of output axes.
The algorithm to be used to train this map on the next call to train. This object owns the training algorithm and is responsable for deleting it.
The distance function to use when finding the closest weight to a given input point - owned by this object. Remember to call init on this whenever you change the number of weights.
std::vector<std::vector<SOM::NodeAndDistance> > GClasses::GSelfOrganizingMap::m_sortedNeighbors [mutable, protected] |
Each entry sortedNeighbors[i] contains a vector of all the other nodes sorted by their distance from i in outputSpace.
bool GClasses::GSelfOrganizingMap::m_sortedNeighborsIsValid [mutable, protected] |