waffles_generate

A command-line tool to help generate various types of data. (Most of the datasets it generates are for testing manifold learning algorithms. I add them as I need them.) Here's the usage information:

Full Usage Information
[Square brackets] are used to indicate required arguments.
<Angled brackets> are used to indicate optional arguments.

waffles_generate [command]
   Generate certain useful datasets
   crane <options>
      Generate a dataset where each row represents a ray-traced image of a
      crane with a ball.
      <options>
         -saveimage [filename]
            Save an image showing all the frames.
         -ballradius [size]
            Specify the size of the ball. The default is 0.3.
         -frames [horiz] [vert]
            Specify the number of frames to render.
         -size [wid] [hgt]
            Specify the size of each frame.
         -blur [radius]
            Blurs the images. A good starting value might be 5.0.
         -gray
            Use a single grayscale value for every pixel instead of three (red,
            green, blue) channel values.
   cube [n]
      Generate data evenly distributed on the surface of a unit cube. Each side
      is sampled with [n]x[n] points. The total number of points in the dataset
      will be 6*[n]*[n]-12*[n]+8.
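
The point count quoted for cube can be checked by brute force. The Python sketch below assumes the points sit on a regular lattice with [n] points along each edge, so that points on shared edges and corners are stored only once. The function name is illustrative and is not part of waffles.

```python
from itertools import product


def cube_surface_points(n):
    """Return the lattice points on the surface of a unit cube.

    Assumes n >= 2 points along every edge, with points on shared edges
    and corners stored only once.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    last = n - 1
    return [
        (i / last, j / last, k / last)
        for i, j, k in product(range(n), repeat=3)
        # Keep a point only if at least one coordinate lies on a face.
        if 0 in (i, j, k) or last in (i, j, k)
    ]


# The brute-force count agrees with the formula in the usage text.
for n in (2, 3, 10):
    assert len(cube_surface_points(n)) == 6 * n * n - 12 * n + 8
```

Summing six faces gives 6*[n]*[n], which counts every edge point twice. Subtracting the 12 edges of [n] points each removes every corner once too often, so the 8 corners are added back.
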
   entwinedspirals [points] <options>
      Generate points that lie on an entwined spirals manifold.
      [points]
         The number of points with which to sample the manifold.
      <options>
         -seed [value]
            Specify a seed for the random number generator.
         -reduced
            Generate intrinsic values instead of extrinsic values. (This might
            be useful to empirically measure the accuracy of a manifold
            learner.)
   fishbowl [n] <options>
      Generate samples on the surface of a fish-bowl manifold.
      [n]
         The number of samples to draw.
      <options>
         -seed [value]
            Specify a seed for the random number generator.
         -opening [size]
            The size of the opening. (0.0 = no opening. 0.25 = default. 1.0 =
            half of the sphere.)
   gridrandomwalk [arff-file] [width] [samples] <options>
      Generate a sequence of action-observation pairs by randomly walking
      around on a grid of observation vectors. Assumes there are four possible
      actions: up, down, left, and right.
      [arff-file]
         The filename of an arff file containing observation vectors arranged
         in a grid.
      [width]
         The width of the grid.
      [samples]
         The number of samples to take. In other words, the length of the
         random walk.
      <options>
         -seed [value]
            Specify a seed for the random number generator.
         -start [x] [y]
            Specifies the starting state. The default is to start in the center
            of the grid.
         -obsfile [filename]
            Specify the filename for the observation sequence data. The default
            is observations.arff.
         -actionfile [filename]
            Specify the filename for the actions data. The default is
            actions.arff.
   imagetranslatedovernoise [png-file] <options>
      Sample a manifold by translating an image over a background of noise.
      [png-file]
         The filename of a png image.
      <options>
         -seed [value]
            Specify a seed for the random number generator.
         -reduced
            Generate intrinsic values instead of extrinsic values. (This might
            be useful to empirically measure the accuracy of a manifold
            learner.)
   manifold [samples] <options> [equations]
      Generate sample points randomly distributed on the surface of a manifold.
      [samples]
         The number of points with which to sample the manifold.
      <options>
         -seed [value]
            Specify a seed for the random number generator.
      [equations]
         A set of equations that define the manifold. The equations that define
         the manifold must be named y1, y2, ..., but helper equations may be
         included. The manifold-defining equations must all have the same
         number of parameters. The parameters will be drawn from a uniform
         distribution (from 0 to 1). Usually it is a good idea to wrap
         the equations in quotes. Example:
         "y1(x1,x2)=x1;y2(x1,x2)=sqrt(x1*x2);h(x)=sqrt(1-x);y3(x1,x2)=x2*x2-h(x1)"
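
To make the equation syntax concrete, the Python sketch below evaluates the example equations by hand. It assumes each parameter is drawn uniformly from 0 to 1, which keeps both square roots defined. The sample_manifold helper and its seed handling are hypothetical and do not reproduce the random number generator that waffles uses.

```python
import math
import random


def h(x):
    # Helper equation: it is not named y1, y2, ..., so it adds no column.
    return math.sqrt(1 - x)


def y1(x1, x2):
    return x1


def y2(x1, x2):
    return math.sqrt(x1 * x2)


def y3(x1, x2):
    return x2 * x2 - h(x1)


def sample_manifold(samples, seed=0):
    """Return one (y1, y2, y3) row per sample.

    Assumption: x1 and x2 are drawn uniformly from [0, 1).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(samples):
        x1, x2 = rng.random(), rng.random()
        rows.append((y1(x1, x2), y2(x1, x2), y3(x1, x2)))
    return rows


rows = sample_manifold(5)
```

The two parameters x1 and x2 are the intrinsic coordinates, and the three y equations give the extrinsic coordinates, so this example describes a 2-dimensional manifold embedded in 3 dimensions.
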
   noise [rows] <options>
      Generate random data by sampling from a distribution.
      [rows]
         The number of patterns to generate.
      <options>
         -seed [value]
            Specify a seed for the random number generator.
         -dist [distribution]
            Specify the distribution. The default is normal 0 1. The available
            distributions are:
            beta [alpha] [beta]
            binomial [n] [p]
            categorical 3 [p0] [p1] [p2]
               A categorical distribution with 3 classes. [p0], [p1], and [p2]
               specify the probabilities of each of the 3 classes. (This is
               just an example. Other values besides 3 may be used for the
               number of classes.)
            cauchy [median] [scale]
            chisquare [t]
            exponential [beta]
            f [t] [u]
            gamma [alpha] [beta]
            gaussian [mean] [deviation]
            geometric [p]
            logistic [mu] [s]
            lognormal [mu] [sigma]
            normal [mean] [deviation]
            poisson [mu]
            softimpulse [s]
            spherical [dims] [radius]
            student [t]
            uniform [a] [b]
            weibull [gamma]
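
The way a distribution name and its parameters fit together can be sketched in Python for three of the entries above. This is only an illustration built on Python's random module: the sample_noise function is hypothetical, it covers only normal/gaussian, uniform, and categorical, and it will not reproduce the values that waffles generates for a given seed.

```python
import random


def sample_noise(rows, dist=("normal", 0.0, 1.0), seed=0):
    """Draw `rows` values from a named distribution.

    `dist` mirrors the command-line form: a name followed by its
    parameters, for example ("categorical", 3, 0.2, 0.3, 0.5).
    """
    rng = random.Random(seed)
    name, *params = dist
    if name in ("normal", "gaussian"):
        mean, deviation = params
        draw = lambda: rng.gauss(mean, deviation)
    elif name == "uniform":
        a, b = params
        draw = lambda: rng.uniform(a, b)
    elif name == "categorical":
        count, *probs = params
        if len(probs) != count:
            raise ValueError("expected one probability per class")
        # Return the index of the sampled class.
        draw = lambda: rng.choices(range(count), weights=probs)[0]
    else:
        raise ValueError("distribution not covered by this sketch: " + name)
    return [draw() for _ in range(rows)]


values = sample_noise(10)
classes = sample_noise(10, ("categorical", 3, 0.2, 0.3, 0.5))
```
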
   randomsequence [length] <options>
      Generate a sequential list of integer values, shuffle them randomly,
      and then print the shuffled list to stdout.
      [length]
         The number of values in the random sequence.
      <options>
         -seed [value]
            Specify a seed for the random number generator.
         -start [value]
            Specify the smallest value in the sequence.
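
The behavior described for randomsequence amounts to shuffling a run of consecutive integers. A minimal Python sketch follows. The function name, the default start of 0, and the one-value-per-line printing are assumptions, since the usage text does not state them, and the shuffle order will differ from what waffles produces.

```python
import random


def random_sequence(length, start=0, seed=0):
    """Return the integers start .. start+length-1 in shuffled order."""
    values = list(range(start, start + length))
    random.Random(seed).shuffle(values)
    return values


# Print the shuffled values, one per line (output format is an assumption).
for value in random_sequence(5, start=10, seed=42):
    print(value)
```

Every integer in the range appears exactly once, so the result is a random permutation rather than a list of independent random draws.
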
   scalerotate [png-file] <options>
      Generate a dataset where each row represents an image that has been
      scaled and rotated by various amounts. Thus, these images form an
      open-cylinder (although somewhat cone-shaped) manifold.
      [png-file]
         The filename of a PNG image.
      <options>
         -saveimage [filename]
            Save a composite image showing all the frames in a grid.
         -frames [rotate-frames] [scale-frames]
            Specify the number of frames. The default is 40 15.
         -arc [radians]
            Specify the rotation amount. If not specified, the default is
            6.2831853... (2*PI).
   scurve [points] <options>
      Generate points that lie on an s-curve manifold.
      [points]
         The number of points with which to sample the manifold.
      <options>
         -seed [value]
            Specify a seed for the random number generator.
         -reduced
            Generate intrinsic values instead of extrinsic values. (This might
            be useful to empirically measure the accuracy of a manifold
            learner.)
   selfintersectingribbon [points] <options>
      Generate points that lie on a self-intersecting ribbon manifold.
      [points]
         The number of points with which to sample the manifold.
      <options>
         -seed [value]
            Specify a seed for the random number generator.
   swissroll [points] <options>
      Generate points that lie on a swiss roll manifold.
      [points]
         The number of points with which to sample the manifold.
      <options>
         -seed [value]
            Specify a seed for the random number generator.
         -reduced
            Generate intrinsic values instead of extrinsic values. (This might
            be useful to empirically measure the accuracy of a manifold
            learner.)
         -cutoutstar
            Don't sample within a star-shaped region on the manifold.
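
The difference between intrinsic and extrinsic values, which the -reduced option switches between, can be illustrated with a textbook swiss roll. The Python sketch below is not the parametrization that waffles uses: the angle range, the width, and the swiss_roll function itself are assumptions chosen for illustration.

```python
import math
import random


def swiss_roll(points, reduced=False, seed=0):
    """Sample a textbook swiss roll.

    With reduced=False each row holds extrinsic (x, y, z) values.
    With reduced=True each row holds the intrinsic (t, h) values
    that generated the same point.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(points):
        t = rng.uniform(1.5 * math.pi, 4.5 * math.pi)  # position along the spiral
        h = rng.uniform(0.0, 10.0)  # position across the width of the roll
        if reduced:
            rows.append((t, h))
        else:
            rows.append((t * math.cos(t), h, t * math.sin(t)))
    return rows


extrinsic = swiss_roll(1000)
intrinsic = swiss_roll(1000, reduced=True)
```

Because both calls use the same seed, row i of intrinsic holds the two values that produced row i of extrinsic. A manifold learner applied to the 3-dimensional data can then be scored by how well its 2-dimensional output matches the intrinsic values.
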
   windowedimage [png-file] <options>
      Sample a manifold by translating a window over an image. Each pattern
      represents the windowed portion of the image.
      [png-file]
         The filename of the png image from which to generate the data.
      <options>
         -reduced
            Generate intrinsic values instead of extrinsic values. (This might
            be useful to empirically measure the accuracy of a manifold
            learner.)
         -stepsizes [horiz] [vert]
            Specify the horizontal and vertical step sizes (how many pixels to
            move the window between samples).
         -windowsize [width] [height]
            Specify the size of the window. The default is half the width and
            height of [png-file].
   usage
      Print usage information.