waffles_plot

A command-line tool for plotting and visualizing datasets. Here's the usage information:

Full Usage Information
[Square brackets] are used to indicate required arguments.
<Angled brackets> are used to indicate optional arguments.

waffles_plot [command]
   Visualize data, plot functions, make charts, etc.
   3d [dataset] <options>
      Make a 3d scatter plot. Points are colored with a spectrum according to
      their order in the dataset.
      [dataset]
         The filename of a dataset to plot. It must have exactly 3 continuous
         attributes.
      <options>
         -blast
            Produce a 5-by-5 grid of renderings, each time using a random point
            of view. It will print the random camera directions that it selects
            to stdout.
         -seed [value]
            Specify a seed for the random number generator.
         -size [width] [height]
            Sets the size of the image. The default is 1000 1000.
         -pointradius [radius]
            Set the size of the points. The default is 40.0.
         -bgcolor [color]
            Set the background color. If not specified, the default is ffffff.
         -cameradistance [dist]
            Set the distance between the camera and the mean of the data. This
            value is specified as a factor, which is multiplied by the distance
            between the min and max corners of the data. If not specified, the
            default is 1.5. (If the camera is too close to the data, make this
            value bigger.)
         -cameradirection [dx] [dy] [dz]
            Specifies the direction from the camera to the mean of the data.
            (The camera always looks at the mean.) The default is 0.6 -0.3
            -0.8.
         -out [filename]
            Specify the name of the output file. (The default is plot.png.) It
            should have the .png extension because other image formats are not
            yet supported.
         -nolabels
            Don't put axis labels on the bounding box.
         -nobox
            Don't draw a bounding box around the plot.
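The -cameradistance and -cameradirection options together determine where the camera sits. The following Python sketch shows one way to compute that position from the description above. It is an illustration only, not the tool's code: the function name camera_position is hypothetical, and it assumes the direction vector is normalized before use.

```python
import math

def camera_position(points, direction=(0.6, -0.3, -0.8), distance_factor=1.5):
    """Estimate the camera position for a list of (x, y, z) points.

    Assumption: the direction vector is normalized before it is scaled.
    """
    dims = range(3)
    mean = [sum(p[i] for p in points) / len(points) for i in dims]
    lo = [min(p[i] for p in points) for i in dims]  # min corner of the data
    hi = [max(p[i] for p in points) for i in dims]  # max corner of the data
    diagonal = math.dist(lo, hi)

    length = math.hypot(*direction)
    unit = [d / length for d in direction]

    # The direction points from the camera to the mean, so step backwards
    # from the mean by the scaled distance.
    return [mean[i] - unit[i] * distance_factor * diagonal for i in dims]
```

With the defaults, a larger distance_factor moves the camera farther from the mean along the same line of sight, which matches the advice to increase the value when the camera is too close.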
   bar [dataset] <options>
      Make a bar chart.
      [dataset]
         The filename of a dataset for the bar chart. The dataset must contain
         exactly one continuous attribute. Each data row specifies the height
         of a bar.
      <options>
         -log
            Use a logarithmic scale.
         -out [filename]
            Specifies the name of the output file. (The default is plot.png.)
            It should have the .png extension because other image formats are
            not yet supported.
   bigo [dataset]
      Estimate the Big-O runtime of algorithms based on empirical results.
      Regresses the formula t=a*(n^b+c) to fit the data, where n is the value
      in attribute 0 (representing the size of the data), and t (representing
      time) is the value in each of the other attributes, one per algorithm.
      The values of a, b, and c are reported for each attribute > 0.
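To make the formula t=a*(n^b+c) concrete, here is a small Python sketch that fits it to timing data. The regression method that waffles_plot uses is not stated here, so this sketch swaps in a simple stand-in: it scans candidate values of b and, for each one, solves for a and c with ordinary linear least squares. The function name fit_big_o and the grid of b values are assumptions made for illustration.

```python
def fit_big_o(ns, ts, b_grid=None):
    """Fit t = a*(n^b + c) by scanning b and solving a and c in closed form."""
    if b_grid is None:
        b_grid = [i / 100 for i in range(10, 401)]  # candidate b: 0.10 .. 4.00

    count = len(ns)
    best = None
    for b in b_grid:
        xs = [n ** b for n in ns]
        # With b fixed, t = a*x + d (where d = a*c) is plain linear regression.
        mean_x = sum(xs) / count
        mean_t = sum(ts) / count
        sxx = sum((x - mean_x) ** 2 for x in xs)
        if sxx == 0:
            continue
        a = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts)) / sxx
        d = mean_t - a * mean_x
        err = sum((a * x + d - t) ** 2 for x, t in zip(xs, ts))
        if best is None or err < best[0]:
            c = d / a if a != 0 else 0.0
            best = (err, a, b, c)

    _, a, b, c = best
    return a, b, c
```

The exponent b is the part that corresponds to the Big-O estimate: a value near 1 suggests linear growth, and a value near 2 suggests quadratic growth.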
   equation <options> [equations]
      Plot an equation (or multiple equations) in 2D.
      <options>
         -out [filename]
            Specify the name of the output file. (The default is plot.png.) It
            should have the .png extension because other image formats are not
            yet supported.
         -size [width] [height]
            Specify the size of the chart. (The default is 1024 1024.)
         -range [xmin] [ymin] [xmax] [ymax]
            Set the range. (The default is: -10 -10 10 10.)
         -textsize [size]
            Sets the label font size. If not specified, the default is 2.0.
         -nogrid
            Do not draw any grid lines.
      [equations]
         A set of equations separated by semicolons. Since '^' is a special
         character for many shells, it's usually a good idea to put your
         equations inside quotation marks. Here are some examples:
         "f1(x)=3*x+2"
         "f1(x)=(g(x)+1)/g(x); g(x)=sqrt(x)+pi"
         "h(bob)=bob^2;f1(x)=3+bar(x,5)*h(x)-(x/foo);bar(a,b)=a*b-b;foo=3.2"
         Only functions that begin with 'f' followed by a number will be
         plotted, starting with 'f1', and it will stop when the next number in
         ascending order is not defined. You may define any number of helper
         functions or constants with any name you like. Built-in constants
         include e and pi. Built-in functions include: +, -, *, /, %, ^, abs,
         acos, acosh, asin, asinh, atan, atanh, ceil, cos, cosh, erf, floor,
         gamma, lgamma, log, max, min, sin, sinh, sqrt, tan, and tanh. These
         generally have the same meaning as in C, except '^' means exponent,
         "gamma" is the gamma function, and max and min can support any number
         (>=1) of parameters. (Some of these functions may not be available
         on Windows, but most of them are.) You can override any built-in
         constants or functions with your own variables or functions, so you
         don't need to worry too much about name collisions. Variables must
         begin with an alphabetic character or an underscore. Multiplication is
         never implicit, so you must use a '*' character to multiply.
         Whitespace is ignored.
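The rule for which functions get plotted (f1, f2, and so on, stopping at the first missing number) can be sketched in Python. This is only an illustration of the rule as described above, not the tool's parser. The function name plotted_functions is hypothetical, and the sketch assumes that a definition counts as a function only when its left-hand side has a parameter list.

```python
def plotted_functions(spec):
    """Return the names that would be plotted for an [equations] string."""
    # Whitespace is ignored, and definitions are separated by semicolons.
    compact = "".join(spec.split())

    function_names = set()
    for definition in compact.split(";"):
        left = definition.split("=", 1)[0]
        if "(" in left:  # assumption: only "name(params)=..." is a function
            function_names.add(left.split("(", 1)[0])

    # Plot f1, f2, ... and stop at the first number that is not defined.
    plotted = []
    index = 1
    while f"f{index}" in function_names:
        plotted.append(f"f{index}")
        index += 1
    return plotted
```

For example, a string that defines f1, f2, and f4 yields only f1 and f2, because f3 is missing and the search stops there. Helper functions and constants such as h, bar, and foo are never plotted.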
   histogram [dataset] <options>
      Make a histogram.
      [dataset]
         The filename of a dataset for the histogram.
      <options>
         -size [width] [height]
            Specify the size of the chart. (The default is 1024 1024.)
         -attr [index]
            Specify which attribute is charted. (The default is 0.)
         -out [filename]
            Specify the name of the output file. (If not specified, the default
            is plot.png.) It should have the .png extension because other image
            formats are not yet supported.
         -range [xmin] [xmax] [ymax]
            Specify the range of the histogram plot.
   model [model-file] [dataset] [attr-x] [attr-y] <options>
      Plot the model space of a trained supervised learning algorithm.
      [model-file]
         The filename of the trained model. (You can use "waffles_learn train"
         to make a model file.)
      [dataset]
         The filename of a dataset to be plotted. It can be the training set
         that was used to train the model, or a test set that it hasn't yet
         seen.
      [attr-x]
         The zero-based index of a continuous feature attribute for the
         horizontal axis.
      [attr-y]
         The zero-based index of a continuous feature attribute for the
         vertical axis.
      <options>
         -out [filename]
            Specify the name of the output file. (The default is plot.png.) It
            should have the .png extension because other image formats are not
            yet supported.
         -size [width] [height]
            Specify the size of the image.
         -pointradius [size]
            Specify the size of the dots used to represent each instance.
   overlay [png1] [png2] <options>
      Make an image comprised of [png1] with [png2] on top of it. The two
      images must be the same size.
      [png1]
         The filename of an image in png format.
      [png2]
         The filename of an image in png format.
      <options>
         -out [filename]
            Specify the name of the output file. (The default is plot.png.) It
            should have the .png extension because other image formats are not
            yet supported.
         -backcolor [hex]
            Specify the six-digit hexadecimal representation of the background
            color. (This color will be treated as being transparent in [png2].)
            If not specified, the default is ffffff (white).
         -tolerance [n]
            Specify the tolerance (an integer). If not specified, the default
            is 0. If a larger value is specified, then pixels in [png2] that
            are close to the background color will also be treated as being
            transparent.
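The interaction between -backcolor and -tolerance can be illustrated with a short Python sketch that works on images stored as nested lists of (r, g, b) tuples. How waffles_plot measures closeness to the background color is not stated here, so the sketch assumes the largest per-channel difference is compared against the tolerance. All function names are hypothetical.

```python
def parse_hex_color(text):
    """Turn a six-digit hex string such as 'ffffff' into an (r, g, b) tuple."""
    if len(text) != 6:
        raise ValueError("expected six hexadecimal digits")
    return tuple(int(text[i:i + 2], 16) for i in (0, 2, 4))

def is_transparent(pixel, backcolor, tolerance=0):
    # Assumption: "close" means every channel differs by at most `tolerance`.
    return max(abs(p - b) for p, b in zip(pixel, backcolor)) <= tolerance

def overlay(bottom, top, backcolor="ffffff", tolerance=0):
    """Place `top` over `bottom`, treating background pixels in `top` as clear."""
    same_size = len(bottom) == len(top) and all(
        len(a) == len(b) for a, b in zip(bottom, top)
    )
    if not same_size:
        raise ValueError("the two images must be the same size")

    back = parse_hex_color(backcolor)
    return [
        [b if is_transparent(t, back, tolerance) else t
         for b, t in zip(bottom_row, top_row)]
        for bottom_row, top_row in zip(bottom, top)
    ]
```

With a tolerance of 0, only pixels that exactly match the background color let the bottom image show through. A larger tolerance also clears near-background pixels, which can help with anti-aliased edges.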
   overview [dataset] <options>
      Generate a matrix of plots of attribute distributions and correlations.
      This is a useful chart for becoming acquainted with a dataset.
      [dataset]
         The filename of a dataset to be charted.
      <options>
         -out [filename]
            Specify the name of the output file. (The default is plot.png.) It
            should have the .png extension because other image formats are not
            yet supported.
         -cellsize [value]
            Change the size of each cell. The default is 100.
         -jitter [value]
            Specify how much to jitter the plotted points. The default is 0.03.
         -maxattrs [value]
            Specifies the maximum number of attributes to plot. The default is
            20.
   printdecisiontree [model-file] <dataset> <data_opts>
      Print a textual representation of a decision tree to stdout.
      [model-file]
         The filename of a trained decision tree model. (You can make one with
         the command "waffles_learn train [dataset] decisiontree >
         [filename]".)
      <dataset>
         An optional filename of the arff file that was used to train the
         decision tree. The data in this file is ignored, but the meta-data
         will be used to make the printed model richer.
      <data_opts>
         -labels [attr_list]
            Specify which attributes to use as labels. (If not specified, the
            default is to use the last attribute for the label.) [attr_list] is
            a comma-separated list of zero-indexed columns. A hyphen may be used
            to specify a range of columns.  A '*' preceding a value means to
            index from the right instead of the left. For example, "0,2-5"
            refers to columns 0, 2, 3, 4, and 5. "*0" refers to the last
            column. "0-*1" refers to all but the last column.
         -ignore [attr_list]
            Specify attributes to ignore. [attr_list] is a comma-separated list
            of zero-indexed columns. A hyphen may be used to specify a range of
            columns.  A '*' preceding a value means to index from the right
            instead of the left. For example, "0,2-5" refers to columns 0, 2,
            3, 4, and 5. "*0" refers to the last column. "0-*1" refers to all
            but the last column.
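The [attr_list] syntax used by -labels and -ignore can be expressed as a short Python sketch. It follows the rules and examples given above, but it is not the tool's own parser: the function name parse_attr_list is hypothetical, and the total column count is passed in explicitly because '*' indexing needs it.

```python
def parse_attr_list(spec, column_count):
    """Expand an [attr_list] string into a list of zero-based column indexes."""

    def resolve(token):
        token = token.strip()
        if token.startswith("*"):
            # '*' means to index from the right instead of the left.
            return column_count - 1 - int(token[1:])
        return int(token)

    columns = []
    for part in spec.split(","):
        if "-" in part:
            # A hyphen specifies an inclusive range of columns.
            first, last = part.split("-", 1)
            columns.extend(range(resolve(first), resolve(last) + 1))
        else:
            columns.append(resolve(part))
    return columns
```

For a dataset with 8 columns, "0,2-5" expands to 0, 2, 3, 4, and 5, "*0" expands to column 7, and "0-*1" expands to columns 0 through 6.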
   scatter [dataset] <options>
      Makes a scatter plot or line graph.
      [dataset]
         The filename of a dataset to be plotted. The first attribute specifies
         the values on the horizontal axis. All other attributes specify the
         values on the vertical axis for a certain color.
      <options>
         -lines
            Draw lines connecting sequential points in the data. (In other
            words, make a line graph instead of a scatter plot.)
         -size [width] [height]
            Specify the size of the chart. (The default is 1024 1024.)
         -logx
            Show the horizontal axis on a logarithmic scale.
         -logy
            Show the vertical axis on a logarithmic scale.
         -nogrid
            Do not draw any grid lines. (This is the same as doing both
            -novgrid and -nohgrid.)
         -novgrid
            Do not draw any vertical grid lines.
         -nohgrid
            Do not draw any horizontal grid lines.
         -textsize [size]
            Sets the label font size. If not specified, the default is 2.0.
         -pointradius [radius]
            Set the size of the point dots. If not specified, the default is
            7.0.
         -linethickness [value]
            Specify the line thickness. (The default is 3.0.)
         -range [xmin] [ymin] [xmax] [ymax]
            Sets the range. (The default is to determine the range
            automatically.)
         -aspect
            Adjust the range to preserve the aspect ratio. In other words, make
            sure that both axes visually have the same scale.
         -chartcolors [background] [text] [grid]
            Sets colors for the specified areas. (The default is ffffff 000000
            808080.)
         -linecolors [c1] [c2] [c3] [c4]
            Sets the colors for the first four attributes. The default is
            0000a0 a00000 008000 504010 (blue, red, green, brown). (If there
            are more than four lines, it will just distribute them evenly over
            the color spectrum.)
         -spectrum
            Instead of giving each line a unique color, this will use the color
            spectrum to indicate the position of each point within the data.
         -specmod [cycle]
            Like -spectrum, except it repeats the spectrum with the specified
            cycle size.
         -out [filename]
            Specifies the name of the output file. (The default is plot.png.)
            It should have the .png extension because other image formats are
            not yet supported.
         -neighbors [neighbor-finder]
            Draw lines connecting each point with its neighbors as determined
            by the specified neighbor finding algorithm.
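When scatter is called from a script, it can help to assemble the argument list in one place. The Python sketch below builds such a list using only options documented above. The helper name scatter_command and the file names in the usage note are hypothetical, and the sketch only constructs the arguments without running anything.

```python
def scatter_command(dataset, out="plot.png", lines=False, logx=False,
                    logy=False, size=None):
    """Build an argument list for 'waffles_plot scatter'."""
    args = ["waffles_plot", "scatter", dataset]
    if lines:
        args.append("-lines")  # line graph instead of a scatter plot
    if logx:
        args.append("-logx")   # logarithmic horizontal axis
    if logy:
        args.append("-logy")   # logarithmic vertical axis
    if size is not None:
        width, height = size
        args += ["-size", str(width), str(height)]
    args += ["-out", out]
    return args
```

Assuming waffles_plot is installed and on the PATH, the result of scatter_command("results.arff", out="timing.png", lines=True) could be passed to subprocess.run(args, check=True). Passing a list rather than a single string avoids shell quoting issues.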
   percentsame [dataset1] [dataset2]
      Given two data files, count the number of identical values in the same
      place in each dataset, and print the result as a percentage for each
      column. The data files must have the same number and type of attributes
      as well as the same number of rows.
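The comparison that percentsame describes can be sketched in a few lines of Python. This is an illustration of the idea, not the tool's implementation: the function name percent_same is hypothetical, and the datasets are represented as plain lists of rows rather than loaded from files.

```python
def percent_same(rows_a, rows_b):
    """Return, for each column, the percentage of rows with identical values."""
    if not rows_a or len(rows_a) != len(rows_b):
        raise ValueError("datasets must be non-empty with the same row count")
    if any(len(a) != len(b) for a, b in zip(rows_a, rows_b)):
        raise ValueError("datasets must have the same number of attributes")

    row_count = len(rows_a)
    column_count = len(rows_a[0])
    percents = []
    for col in range(column_count):
        # Count the rows where both datasets hold the same value in this column.
        matches = sum(1 for a, b in zip(rows_a, rows_b) if a[col] == b[col])
        percents.append(100.0 * matches / row_count)
    return percents
```

A result of 100.0 for a column means the two datasets agree on every row of that attribute.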
   semanticmap [model-file] [dataset] <options>
      Write an svg file representing a semantic map for the given
      self-organizing map processing the given dataset.  For each node n, a
      semantic map plots, at n's location in the map, one attribute (usually a
      class label) of the entry of the input data to which n responds most
      strongly.
      [model-file]
         The self-organizing map output from "waffles_transform som".
      [dataset]
         Data for the semantic map in .arff format.  Any attributes over the
         number needed for input to the self-organizing map are ignored in
         determining som node responses.
      <options>
         -out [filename]
            Write the svg file to filename.  The default is "semantic_map.svg".
         -labels [column]
            Use the attribute column for labeling.  Column is a zero-based
            index into the attributes.  The default is to use the last column.
         -variance
            Label the nodes with the variance of the label column values for
            their winning dataset entries. If the label column is a variable
            being predicted, then its variance is related to the predictive
            power of that node.  Higher variance means lower predictive power.
   stats [dataset]
      Prints some basic stats about the dataset to stdout.
      [dataset]
         The filename of a dataset.
   usage
      Print usage information.