A command-line tool for plotting and visualizing datasets. Here's the usage information:
Full Usage Information [Square brackets] are used to indicate required arguments. <Angled brackets> are used to indicate optional arguments. waffles_plot [command] Visualize data, plot functions, make charts, etc. 3d [dataset] <options> Make a 3d scatter plot. Points are colored with a spectrum according to their order in the dataset. [dataset] The filename of a dataset to plot. It must have exactly 3 continuous attributes. <options> -blast Produce a 5-by-5 grid of renderings, each time using a random point of view. It will print the random camera directions that it selects to stdout. -seed [value] Specify a seed for the random number generator. -size [width] [height] Sets the size of the image. The default is 1000 1000. -pointradius [radius] Set the size of the points. The default is 40.0. -bgcolor [color] Set the background color. If not specified, the default is ffffff. -cameradistance [dist] Set the distance between the camera and the mean of the data. This value is specified as a factor, which is multiplied by the distance between the min and max corners of the data. If not specified, the default is 1.5. (If the camera is too close to the data, make this value bigger.) -cameradirection [dx] [dy] [dz] Specifies the direction from the camera to the mean of the data. (The camera always looks at the mean.) The default is 0.6 -0.3 -0.8. -out [filename] Specify the name of the output file. (The default is plot.png.) It should have the .png extension because other image formats are not yet supported. -nolabels Don't put axis labels on the bounding box. -nobox Don't draw a bounding box around the plot. bar [dataset] <options> Make a bar chart. [dataset] The filename of a dataset for the bar chart. The dataset must contain exactly one continuous attribute. Each data row specifies the height of a bar. <options> -log Use a logarithmic scale. -out [filename] Specifies the name of the output file. (The default is plot.png.) It should have the .png extension because other image formats are not yet supported. bigo [dataset] Estimate the Big-O runtime of algorithms based on empirical results. Regresses the formula t=a*(n^b+c) to fit the data, where n is the value in attribute 0 (representing the size of the data), and t (representing time) in the other attributes for each algorithm. The values of a, b, and c are reported for each attribute > 0. equation <options> [equations] Plot an equation (or multiple equations) in 2D <options> -out [filename] Specify the name of the output file. (The default is plot.png.) It should have the .png extension because other image formats are not yet supported. -size [width] [height] Specify the size of the chart. (The default is 1024 1024.) -range [xmin] [ymin] [xmax] [ymax] Set the range. (The default is: -10 -10 10 10.) -textsize [size] Sets the label font size. If not specified, the default is 2.0. -nogrid Do not draw any grid lines. [equations] A set of equations separated by semicolons. Since '^' is a special character for many shells, it's usually a good idea to put your equations inside quotation marks. Here are some examples: "f1(x)=3*x+2" "f1(x)=(g(x)+1)/g(x); g(x)=sqrt(x)+pi" "h(bob)=bob^2;f1(x)=3+bar(x,5)*h(x)-(x/foo);bar(a,b)=a*b-b;foo=3.2" Only functions that begin with 'f' followed by a number will be plotted, starting with 'f1', and it will stop when the next number in ascending order is not defined. You may define any number of helper functions or constants with any name you like. Built in constants include: e, and pi. Built in functions include: +, -, *, /, %, ^, abs, acos, acosh, asin, asinh, atan, atanh, ceil, cos, cosh, erf, floor, gamma, lgamma, log, max, min, sin, sinh, sqrt, tan, and tanh. These generally have the same meaning as in C, except '^' means exponent, "gamma" is the gamma function, and max and min can support any number (>=1) of parameters. (Some of these functions may not not be available on Windows, but most of them are.) You can override any built in constants or functions with your own variables or functions, so you don't need to worry too much about name collisions. Variables must begin with an alphabet character or an underscore. Multiplication is never implicit, so you must use a '*' character to multiply. Whitespace is ignored. histogram [dataset] <options> Make a histogram. [dataset] The filename of a dataset for the histogram. <options> -size [width] [height] Specify the size of the chart. (The default is 1024 1024.) -attr [index] Specify which attribute is charted. (The default is 0.) -out [filename] Specify the name of the output file. (If not specified, the default is plot.png.) It should have the .png extension because other image formats are not yet supported. -range [xmin] [xmax] [ymax] Specify the range of the histogram plot model [model-file] [dataset] [attr-x] [attr-y] <options> Plot the model space of a trained supervised learning algorithm. [model-file] The filename of the trained model. (You can use "waffles_learn train" to make a model file.) [dataset] The filename of a dataset to be plotted. It can be the training set that was used to train the model, or a test set that it hasn't yet seen. [attr-x] The zero-based index of a continuous feature attributes for the horizontal axis. [attr-y] The zero-based index of a continuous feature attributes for the vertical axis. <options> -out [filename] Specify the name of the output file. (The default is plot.png.) It should have the .png extension because other image formats are not yet supported. -size [width] [height] Specify the size of the image. -pointradius [size] Specify the size of the dots used to represent each instance. overlay [png1] [png2] <options> Make an image comprised of [png1] with [png2] on top of it. The two images must be the same size. [png1] The filename of an image in png format. [png2] The filename of an image in png format. <options> -out [filename] Specify the name of the output file. (The default is plot.png.) It should have the .png extension because other image formats are not yet supported. -backcolor [hex] Specify the six-digit hexadecimal representation of the background color. (This color will be treated as being transparent in [png2]. If not specified, the default is ffffff (white). -tolerance [n] Specify the tolerance (an integer). If not specified, the default is 0. If a larger value is specified, then pixels in [png2] that are close to the background color will also be treated as being transparent. overview [dataset] Generate a matrix of plots of attribute distributions and correlations. This is a useful chart for becoming acquainted with a dataset. [dataset] The filename of a dataset to be charted. <options> -out [filename] Specify the name of the output file. (The default is plot.png.) It should have the .png extension because other image formats are not yet supported. -cellsize [value] Change the size of each cell. The default is 100. -jitter [value] Specify how much to jitter the plotted points. The default is 0.03. -maxattrs [value] Specifies the maximum number of attributes to plot. The default is 20. printdecisiontree [model-file] <dataset> <data_opts> Print a textual representation of a decision tree to stdout. [model-file] The filename of a trained decision tree model. (You can make one with the command "waffles_learn train [dataset] decisiontree > [filename]".) <dataset> An optional filename of the arff file that was used to train the decision tree. The data in this file is ignored, but the meta-data will be used to make the printed model richer. <data_opts> -labels [attr_list] Specify which attributes to use as labels. (If not specified, the default is to use the last attribute for the label.) [attr_list] is a comma-separated list of zero-indexed columns. A hypen may be used to specify a range of columns. A '*' preceding a value means to index from the right instead of the left. For example, "0,2-5" refers to columns 0, 2, 3, 4, and 5. "*0" refers to the last column. "0-*1" refers to all but the last column. -ignore [attr_list] Specify attributes to ignore. [attr_list] is a comma-separated list of zero-indexed columns. A hypen may be used to specify a range of columns. A '*' preceding a value means to index from the right instead of the left. For example, "0,2-5" refers to columns 0, 2, 3, 4, and 5. "*0" refers to the last column. "0-*1" refers to all but the last column. scatter [dataset] <options> Makes a scatter plot or line graph. [dataset] The filename of a dataset to be plotted. The first attribute specifies the values on the horizontal axis. All other attributes specify the values on the vertical axis for a certain color. <options> -lines Draw lines connecting sequential point in the data. (In other words, make a line graph instead of a scatter plot.) -size [width] [height] Specify the size of the chart. (The default is 1024 1024.) -logx Show the horizontal axis on a logarithmic scale -logy Show the vertical axis on a logarithmic scale -nogrid Do not draw any grid lines. (This is the same as doing both -novgrid and -nohgrid.) -novgrid Do not draw any vertical grid lines. -nohgrid Do not draw any horizontal grid lines. -textsize [size] Sets the label font size. If not specified, the default is 2.0. -pointradius [radius] Set the size of the point dots. If not specified, the default is 7.0. -linethickness [value] Specify the line thickness. (The default is 3.0.) -range [xmin] [ymin] [xmax] [ymax] Sets the range. (The default is to determine the range automatically.) -aspect Adjust the range to preserve the aspect ratio. In other words, make sure that both axes visually have the same scale. -chartcolors [background] [text] [grid] Sets colors for the specified areas. (The default is ffffff 000000 808080.) -linecolors [c1] [c2] [c3] [c4] Sets the colors for the first four attributes. The default is 0000a0 a00000 008000 504010 (blue, red, green, brown). (If there are more than four lines, it will just distribute them evenly over the color spectrum.) -spectrum Instead of giving each line a unique color, this will use the color spectrum to indicate the position of each point within the data. -specmod [cycle] Like -spectrum, except it repeats the spectrum with the specified cycle size. -out [filename] Specifies the name of the output file. (The default is plot.png.) It should have the .png extension because other image formats are not yet supported. -neighbors [neighbor-finder] Draw lines connecting each point with its neighbors as determined by the specified neighbor finding algorithm. percentsame [dataset1] [dataset2] Given two data files, counts the number of identical values in the same place in each dataset. Prints as a percent for each column. The data files must have the same number and type of attributes as well as the same number of rows. semanticmap [model-file] [dataset] <options> Write a svg file representing a semantic map for the given self-organizing map processing the given dataset. For each node n, a semantic map plots, at n's location in the map, one attribute (usually a class label) of the entry of the input data to which n responds most strongly. [model-file] The self-organizing map output from "waffles_transform som". [dataset] Data for the semantic map in .arff format. Any attributes over the number needed for input to the self-organizing map are ignored in determining som node responses. <options> -out [filename] Write the svg file to filename. The default is "semantic_map.svg". -labels [column] Use the attribute column for labeling. Column is a zero-based index into the attributes. The default is to use the last column. -variance Label the nodes with the variance of the label column values for their winning dataset entries. If the label column is a variable being predicted, then its variance its related to the predictive power of that node. Higher variance means lower predictive power. stats [dataset] Prints some basic stats about the dataset to stdout. [dataset] The filename of a dataset. usage Print usage information.