A command-line tool for learning from and operating on sparse data, of the kind typically used to represent text documents and similar datasets. Here's the usage information:
Full Usage Information

[Square brackets] are used to indicate required arguments. <Angled brackets> are used to indicate optional arguments.

waffles_recommend [command]
   Predict missing values in data, and test collaborative-filtering recommendation systems.
   crossvalidate <options> [3col-data] [collab-filter]
      Measure accuracy using cross-validation. Prints MSE and MAE to stdout.
      <options>
         -seed [value]
            Specify a seed for the random number generator.
         -folds [n]
            Specify the number of folds. If not specified, the default is 2.
      [3col-data]
         The filename of a 3-column dataset with one row for each rating. Column 0 contains a user ID. Column 1 contains an item ID. Column 2 contains the known rating for that user-item pair.
   fillmissingvalues <options> [data] [collab-filter]
      Fill in the missing values in an ARFF file with predicted values, and print the resulting full dataset to stdout.
      <options>
         -seed [value]
            Specify a seed for the random number generator.
      [data]
         The filename of a dataset with missing values to impute.
   precisionrecall <options> [3col-data] [collab-filter]
      Compute precision-recall data.
      <options>
         -seed [value]
            Specify a seed for the random number generator.
         -ideal
            Ignore the model and compute ideal results (as if the model always predicted correct ratings).
      [3col-data]
         The filename of a 3-column dataset with one row for each rating. Column 0 contains a user ID. Column 1 contains an item ID. Column 2 contains the known rating for that user-item pair.
   roc <options> [3col-data] [collab-filter]
      Compute data for an ROC curve. (The area under the curve will appear in the comments at the top of the data.)
      <options>
         -seed [value]
            Specify a seed for the random number generator.
         -ideal
            Ignore the model and compute ideal results (as if the model always predicted correct ratings).
      [3col-data]
         The filename of a 3-column dataset with one row for each rating. Column 0 contains a user ID. Column 1 contains an item ID. Column 2 contains the known rating for that user-item pair.
   transacc <options> [train] [test] [collab-filter]
      Train using [train], then test using [test]. Prints MSE and MAE to stdout.
      <options>
         -seed [value]
            Specify a seed for the random number generator.
      [train]
         The filename of a 3-column dataset with one row for each rating. Column 0 contains a user ID. Column 1 contains an item ID. Column 2 contains the known rating for that user-item pair.
      [test]
         The filename of a 3-column dataset with one row for each rating. Column 0 contains a user ID. Column 1 contains an item ID. Column 2 contains the known rating for that user-item pair.
   usage
      Print usage information.

[collab-filter]
   A collaborative-filtering recommendation algorithm.
   bag <contents> end
      A bagging (bootstrap aggregating) ensemble. This is a way to combine the power of collaborative-filtering algorithms through voting. "end" marks the end of the ensemble contents. Each collaborative-filtering algorithm instance is trained on a subset of the original data, where each expressed element is given a probability of 0.5 of occurring in the training set.
      <contents>
         [instance_count] [collab-filter]
            Specify the number of instances of a collaborative-filtering algorithm to add to the bagging ensemble.
   baseline
      A very simple recommendation algorithm. It always predicts the average rating for each item. This algorithm is useful as a baseline for comparison.
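      For concreteness, a hedged sketch follows. The file name ratings.arff, the attribute names, and the values shown are invented for illustration only; any 3-column dataset the tool can load should work the same way. A small ratings file in ARFF form might look like:
         @RELATION ratings
         @ATTRIBUTE user real
         @ATTRIBUTE item real
         @ATTRIBUTE rating real
         @DATA
         0,0,4.0
         0,5,1.5
         1,2,3.0
      and the baseline recommender could then be cross-validated on it with something like:
         waffles_recommend crossvalidate -seed 0 -folds 5 ratings.arff baseline
      The -seed and -folds options and the argument order come directly from the crossvalidate listing above; MSE and MAE are printed to stdout.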
   clusterdense [n] <options>
      A collaborative-filtering algorithm that clusters users based on a dense distance metric with k-means, and then makes uniform recommendations within each cluster.
      [n]
         The number of clusters to use.
      <options>
         -norm [l]
            Specify the norm for the L-norm distance metric to use.
         -missingpenalty [d]
            Specify the difference to use in the distance computation when a value is missing from one or both of the vectors.
   clustersparse [n] <options>
      A collaborative-filtering algorithm that clusters users based on a sparse similarity metric with k-means, and then makes uniform recommendations within each cluster.
      [n]
         The number of clusters to use.
      <options>
         -pearson
            Use Pearson correlation to compute the similarity between users. (The default is to use the cosine method.)
   instance [k] <options>
      An instance-based collaborative-filtering algorithm that makes recommendations based on the k-nearest neighbors of a user.
      [k]
         The number of neighbors to use.
      <options>
         -pearson
            Use Pearson correlation to compute the similarity between users. (The default is to use the cosine method.)
         -regularize [value]
            Add [value] to the denominator in order to regularize the results. This ensures that recommendations will not be dominated when only a small number of overlapping items occurs. Typically, [value] will be a small number, like 0.5 or 1.5.
   matrix [intrinsic] <options>
      A matrix-factorization collaborative-filtering algorithm. (Implemented according to the specification on page 631 in Takacs, G., Pilaszy, I., Nemeth, B., and Tikk, D. Scalable collaborative filtering approaches for large recommender systems. The Journal of Machine Learning Research, 10:623-656, 2009. ISSN 1532-4435, except with the addition of learning-rate decay and a different stopping criterion.)
      [intrinsic]
         The number of intrinsic (or latent) feature dimensions to use to represent each user's preferences.
      <options>
         -regularize [value]
            Specify a regularization value. Typically, this is a small value. Larger values will put more pressure on the system to use small values in the matrix factors.
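      As an illustrative sketch (the file name sparse_data.arff and the parameter values are hypothetical, not defaults), missing values could be imputed with a 2-dimensional matrix factorization and light regularization like so:
         waffles_recommend fillmissingvalues -seed 0 sparse_data.arff matrix 2 -regularize 0.01
      The argument order follows the fillmissingvalues and matrix listings: the command's own options come first, then the data file, then the algorithm name, its required [intrinsic] value, and its options.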
   nlpca [intrinsic] <options>
      A non-linear PCA collaborative-filtering algorithm. This algorithm was published in Scholz, M., Kaplan, F., Guy, C. L., Kopka, J., and Selbig, J., Non-linear PCA: a missing data approach, Bioinformatics, Vol. 21, Number 20, pp. 3887-3895, Oxford University Press, 2005. It uses a generalization of backpropagation to train a multi-layer perceptron to fit the known ratings and to predict unknown values.
      [intrinsic]
         The number of intrinsic (or latent) feature dimensions to use to represent each user's preferences.
      <options>
         -addlayer [size]
            Add a hidden layer with "size" logistic units to the network. You may use this option multiple times to add multiple layers. The first layer added is adjacent to the input features. The last layer added is adjacent to the output labels. If you don't add any hidden layers, the network is just a single layer of sigmoid units.
         -learningrate [value]
            Specify a value for the learning rate. The default is 0.1.
         -momentum [value]
            Specify a value for the momentum. The default is 0.0.
         -windowepochs [value]
            Specify the number of training epochs that are performed before the stopping criterion is tested again. Bigger values will result in a more stable stopping criterion. Smaller values will check the stopping criterion more frequently.
         -minwindowimprovement [value]
            Specify the minimum improvement that must occur over the window of epochs for training to continue. [value] specifies the minimum decrease in error as a ratio. For example, if [value] is 0.02, then training will stop when the mean squared error does not decrease by two percent over the window of epochs. Smaller values will typically result in longer training times.
         -dontsquashoutputs
            Don't squash the output values with the logistic function. Just report the net value at the output layer. This is often used for regression.
         -crossentropy
            Use cross-entropy instead of squared error for the error signal.
         -noinputbias
            Do not use an input bias.
         -nothreepass
            Use one-pass training instead of three-pass training.
         -activation [func]
            Specify the activation function to use with all subsequently added layers. (For example, if you add this option after all of the -addlayer options, then the specified activation function will only apply to the output layer. If you add this option before all of the -addlayer options, then the specified activation function will be used in all layers. It is okay to use a different activation function with each layer, if you want.)
            logistic
               The logistic sigmoid function. (This is the default activation function.)
            arctan
               The arctan sigmoid function.
            tanh
               The hyperbolic tangent sigmoid function.
            algebraic
               An algebraic sigmoid function.
            identity
               The identity function. This activation function is used to create a layer of linear perceptrons. (For regression problems, it is common to use this activation function on the output layer.)
            bidir
               A sigmoid-shaped function with a range from -inf to inf. It converges at both ends to -sqrt(-x) and sqrt(x). This activation function is designed to be used on the output layer with regression problems instead of identity.
            gaussian
               A Gaussian activation function.
            sinc
               A sinc wavelet activation function.
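      To tie the nlpca options together, here is a hedged example (the file name and the parameter values are invented for illustration, not recommended defaults). It uses 2 latent dimensions, adds one hidden layer of 16 logistic units, requests the identity activation for the output layer (since -activation appears after the -addlayer option), and lowers the learning rate:
         waffles_recommend fillmissingvalues -seed 0 ratings.arff nlpca 2 -addlayer 16 -activation identity -learningrate 0.05
      As with the matrix example above, the nlpca options follow the algorithm name and its required [intrinsic] argument.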