GClasses

GClasses::GSelfOrganizingMap Class Reference

An implementation of a Kohonen self-organizing map. More...

#include <GSelfOrganizingMap.h>

Inheritance diagram for GClasses::GSelfOrganizingMap:
GClasses::GIncrementalTransform GClasses::GTransform

List of all members.

Public Member Functions

 GSelfOrganizingMap (int nMapDims, int nMapWidth, GRand *pRand, SOM::Reporter *r=new SOM::NoReporting())
 Creates a map whose nodes are on a grid that uses euclidean distance to find the nearest point and is trained with the batch SOM algorithm with neighborhood decreasing exponentially from 2*nMapWidth to 1 in 100 super-epochs with each super-epoch involving waiting up to 100 iterations for convergence at that neighborhood width. The neighbors affect one another through a unit height Gaussian with sigma=neighborhood size. Every epoch and super-epoch, the reporter is notified. This map will be responsible for deleting the dynamically allocated reporter.
 GSelfOrganizingMap (const std::vector< double > &outputAxes, std::size_t numNodes, SOM::NodeLocationInitialization *topology, SOM::TrainingAlgorithm *trainer, GDistanceMetric *weightDistance, GDistanceMetric *nodeDistance)
 Create a self-organizing map.
 GSelfOrganizingMap (GDomNode *pNode)
 Reconstruct this self-organizing map from its serialized form in a dom document.
virtual ~GSelfOrganizingMap ()
virtual GMatrixdoit (GMatrix &in)
 Transforms pIn after training on it.
virtual GDomNodeserialize (GDom *)
 Add this map to a dom document and return the pointer to the tree added.

WARNING: the current imlementation DOES NOT SERIALIZE THE TRAINING ALGORITHM - do not train a GSelfOrganizingMap obtained from a dom file.
virtual void train (GMatrix &in)
 see comment on GIncrementalTransform::train(GMatrix&)
virtual void beginIncrementalTraining (sp_relation &, double *, double *)
 TODO: implement enableIncrementalTraining.
void transform (const double *pIn, double *pOut)
 Transform the given vector from input coordinates to map coordinates by finding the best match among the nodes.
std::size_t bestMatch (const double *pIn) const
 Return the index of the node whose weight vector best matches in under the distance metric for this SOM. Assumes there is at least one node.
std::vector< std::size_t > bestData (const GMatrix *pData) const
 Given a matrix containing input data of the correct dimensions, returns a vector v of nodes.size() indices into that matrix. v[i] is the index of the data in pData that gives the strongest response for node i. That is, pData->row(v[i]) is the data element that best matches node i.
std::vector< std::size_t > nearestNeighbors (unsigned nodeIdx, unsigned numNeighbors, double epsilon=1e-8) const
 Inspector giving the indices of the nearest neighbors in output space of the node at index nodeIdx. If any neighbor with a given distance is returned all neighbors at that distance will be returned. Any distances within epsilon of one another are considered equivalent.
std::vector< SOM::NodeAndDistanceneighborsInCircle (unsigned nodeIdx, double radius) const
 Return all neighbors of the node at that have a distance from nodeIdx of less than radius.
unsigned inputDimensions () const
 Return the number of dimensions this takes as input.
unsigned outputDimensions () const
 Return the number of dimensions this returns as output.
std::vector< double > outputAxes () const
 Return the maximum for each output axis.
const std::vector< SOM::Node > & nodes () const
 Inspector for the nodes making up this map.
std::vector< SOM::Node > & nodes ()
 Accessor for the nodes making up this map.
const GDistanceMetricweightDistance () const
 Inspector for the distance metric used in input space, that is, between an input point and a weight for determining the winner.
const GDistanceMetricnodeDistance () const
 Inspector for the distance metric used in output space, that is, between two nodes for their relative influence.

Static Public Member Functions

static void test ()
 Performs unit tests for this class. Throws an exception if there is a failure.

Protected Member Functions

void invalidateSortedNeighbors ()
 Marks sortedNeighbors as invalid and need of regeneration.
void regenerateSortedNeighbors () const
 Eliminates the current contents of m_sortedNeighbors and regenerates it from m_pNodeDistance and m_nodes. sets m_sortedNeighborsIsValid on completion. Note that the members this method changes are mutable, thus it can be marked as a const method and called from const methods.
const std::vector< std::vector
< SOM::NodeAndDistance > > & 
sortedNeighbors () const
 Return a sorted neighbor structure in which entry i is a list of the node indices sorted by their distance from node i.

Protected Attributes

unsigned m_nInputDims
 Number of input dimensions.
std::vector< double > m_outputAxes
 Vector containing the maximum number in each output axis - so axis #0 would have a range 0..m_vOutputAxes[0]. Remember to call m_pNodeDistance->init whenever you change the number of output axes.
SOM::TrainingAlgorithmm_pTrainer
 The algorithm to be used to train this map on the next call to train. This object owns the training algorithm and is responsable for deleting it.
GDistanceMetricm_pWeightDistance
 The distance function to use when finding the closest weight to a given input point - owned by this object. Remember to call init on this whenever you change the number of weights.
GDistanceMetricm_pNodeDistance
 The distance function to use when calculating the distance between nodes in the map. If you change this, remember to invalidate the neighbor structure. - owned by this object. Remember to call init whenever you change the number of output axes.
std::vector< SOM::Nodem_nodes
 The nodes in the map. Do not change the order of nodes once sortedNeighbors has been created unless you also invalidate sortedNeighbors. Remember to call m_pWeightDistance->init if you change the number of weights.
bool m_sortedNeighborsIsValid
std::vector< std::vector
< SOM::NodeAndDistance > > 
m_sortedNeighbors
 Each entry sortedNeighbors[i] contains a vector of all the other nodes sorted by their distance from i in outputSpace.

Friends

class SOM::TrainingAlgorithm

Detailed Description

An implementation of a Kohonen self-organizing map.

Note: does not support more than 2^52 nodes -- I don't believe this will be a problem within the next 20 years.

See: T. Kohonen "Self Organizing Maps" Third Edition, 2001, published by Springer


Constructor & Destructor Documentation

GClasses::GSelfOrganizingMap::GSelfOrganizingMap ( int  nMapDims,
int  nMapWidth,
GRand pRand,
SOM::Reporter r = new SOM::NoReporting() 
)

Creates a map whose nodes are on a grid that uses euclidean distance to find the nearest point and is trained with the batch SOM algorithm with neighborhood decreasing exponentially from 2*nMapWidth to 1 in 100 super-epochs with each super-epoch involving waiting up to 100 iterations for convergence at that neighborhood width. The neighbors affect one another through a unit height Gaussian with sigma=neighborhood size. Every epoch and super-epoch, the reporter is notified. This map will be responsible for deleting the dynamically allocated reporter.

nMapDims specifies the number of dimensions for the map.

nMapWidth specifies the size in one dimension of the map.

(so if nMapDims is 3 and nMapWidth is 10, the map will contain 1000 (10^3) nodes.)

GClasses::GSelfOrganizingMap::GSelfOrganizingMap ( const std::vector< double > &  outputAxes,
std::size_t  numNodes,
SOM::NodeLocationInitialization topology,
SOM::TrainingAlgorithm trainer,
GDistanceMetric weightDistance,
GDistanceMetric nodeDistance 
)

Create a self-organizing map.

Parameters:
outputAxeseach node in the network will have a vector location and component i of that vector will range from 0..outputAxes[i]
numNodesthe number of nodes in the network
topologydetermines the locations assigned to each node
traineralgorithm to train the network on data
weightDistancethe distance used in determining which node's weight-set is closest to an input point
nodeDistancethe distance used to determine the influence of two nodes with different coordinates on one another.
GClasses::GSelfOrganizingMap::GSelfOrganizingMap ( GDomNode pNode)

Reconstruct this self-organizing map from its serialized form in a dom document.

virtual GClasses::GSelfOrganizingMap::~GSelfOrganizingMap ( ) [virtual]

Member Function Documentation

virtual void GClasses::GSelfOrganizingMap::beginIncrementalTraining ( sp_relation ,
double *  ,
double *   
) [inline, virtual]

TODO: implement enableIncrementalTraining.

std::vector<std::size_t> GClasses::GSelfOrganizingMap::bestData ( const GMatrix pData) const

Given a matrix containing input data of the correct dimensions, returns a vector v of nodes.size() indices into that matrix. v[i] is the index of the data in pData that gives the strongest response for node i. That is, pData->row(v[i]) is the data element that best matches node i.

Assumes there is at least one row in pData

std::size_t GClasses::GSelfOrganizingMap::bestMatch ( const double *  pIn) const

Return the index of the node whose weight vector best matches in under the distance metric for this SOM. Assumes there is at least one node.

virtual GMatrix* GClasses::GSelfOrganizingMap::doit ( GMatrix in) [virtual]

Transforms pIn after training on it.

Reimplemented from GClasses::GIncrementalTransform.

unsigned GClasses::GSelfOrganizingMap::inputDimensions ( ) const [inline]

Return the number of dimensions this takes as input.

void GClasses::GSelfOrganizingMap::invalidateSortedNeighbors ( ) [inline, protected]

Marks sortedNeighbors as invalid and need of regeneration.

std::vector<std::size_t> GClasses::GSelfOrganizingMap::nearestNeighbors ( unsigned  nodeIdx,
unsigned  numNeighbors,
double  epsilon = 1e-8 
) const

Inspector giving the indices of the nearest neighbors in output space of the node at index nodeIdx. If any neighbor with a given distance is returned all neighbors at that distance will be returned. Any distances within epsilon of one another are considered equivalent.

If there are enough nodes in the network, at least numNeigbors neighbors will be returned. If there are fewer nodes in the network than the number of neighbors, a list of all indices will be returned.

std::vector<SOM::NodeAndDistance> GClasses::GSelfOrganizingMap::neighborsInCircle ( unsigned  nodeIdx,
double  radius 
) const

Return all neighbors of the node at that have a distance from nodeIdx of less than radius.

const GDistanceMetric* GClasses::GSelfOrganizingMap::nodeDistance ( ) const [inline]

Inspector for the distance metric used in output space, that is, between two nodes for their relative influence.

const std::vector<SOM::Node>& GClasses::GSelfOrganizingMap::nodes ( ) const [inline]

Inspector for the nodes making up this map.

std::vector<SOM::Node>& GClasses::GSelfOrganizingMap::nodes ( ) [inline]

Accessor for the nodes making up this map.

std::vector<double> GClasses::GSelfOrganizingMap::outputAxes ( ) const [inline]

Return the maximum for each output axis.

unsigned GClasses::GSelfOrganizingMap::outputDimensions ( ) const [inline]

Return the number of dimensions this returns as output.

void GClasses::GSelfOrganizingMap::regenerateSortedNeighbors ( ) const [protected]

Eliminates the current contents of m_sortedNeighbors and regenerates it from m_pNodeDistance and m_nodes. sets m_sortedNeighborsIsValid on completion. Note that the members this method changes are mutable, thus it can be marked as a const method and called from const methods.

virtual GDomNode* GClasses::GSelfOrganizingMap::serialize ( GDom ) [virtual]

Add this map to a dom document and return the pointer to the tree added.

WARNING: the current imlementation DOES NOT SERIALIZE THE TRAINING ALGORITHM - do not train a GSelfOrganizingMap obtained from a dom file.

TODO: make all serialize's const

Implements GClasses::GIncrementalTransform.

const std::vector<std::vector<SOM::NodeAndDistance> >& GClasses::GSelfOrganizingMap::sortedNeighbors ( ) const [inline, protected]

Return a sorted neighbor structure in which entry i is a list of the node indices sorted by their distance from node i.

static void GClasses::GSelfOrganizingMap::test ( ) [static]

Performs unit tests for this class. Throws an exception if there is a failure.

virtual void GClasses::GSelfOrganizingMap::train ( GMatrix in) [inline, virtual]
void GClasses::GSelfOrganizingMap::transform ( const double *  pIn,
double *  pOut 
) [virtual]

Transform the given vector from input coordinates to map coordinates by finding the best match among the nodes.

see comment on GIncrementalTransform::(const double*, double*)

Implements GClasses::GIncrementalTransform.

const GDistanceMetric* GClasses::GSelfOrganizingMap::weightDistance ( ) const [inline]

Inspector for the distance metric used in input space, that is, between an input point and a weight for determining the winner.


Friends And Related Function Documentation

friend class SOM::TrainingAlgorithm [friend]

Member Data Documentation

Number of input dimensions.

The nodes in the map. Do not change the order of nodes once sortedNeighbors has been created unless you also invalidate sortedNeighbors. Remember to call m_pWeightDistance->init if you change the number of weights.

std::vector<double> GClasses::GSelfOrganizingMap::m_outputAxes [protected]

Vector containing the maximum number in each output axis - so axis #0 would have a range 0..m_vOutputAxes[0]. Remember to call m_pNodeDistance->init whenever you change the number of output axes.

The distance function to use when calculating the distance between nodes in the map. If you change this, remember to invalidate the neighbor structure. - owned by this object. Remember to call init whenever you change the number of output axes.

The algorithm to be used to train this map on the next call to train. This object owns the training algorithm and is responsable for deleting it.

The distance function to use when finding the closest weight to a given input point - owned by this object. Remember to call init on this whenever you change the number of weights.

std::vector<std::vector<SOM::NodeAndDistance> > GClasses::GSelfOrganizingMap::m_sortedNeighbors [mutable, protected]

Each entry sortedNeighbors[i] contains a vector of all the other nodes sorted by their distance from i in outputSpace.