it.unipi.di.textdb
Class BucketedPPM

java.lang.Object
  extended by it.unipi.di.textdb.TextDB
      extended by it.unipi.di.textdb.BucketedPPM

public class BucketedPPM
extends TextDB

Author:
Claudio Corsi, Paolo Ferragina

Field Summary
 
Fields inherited from class it.unipi.di.textdb.TextDB
DEFAULT_FIELD_SEPARATOR
 
Constructor Summary
BucketedPPM(String filename)
           
 
Method Summary
 TextDB build(PrintStream log)
          Builds the TextDB over the textual file identified by the filename string used in the constructor (see TextDB.TextDB(String)).
static TextDB build(String filename, int contextSize, int minFreq, int memLimit, int blockSize, int bucketSize, PrintStream log)
           
 String get(int record)
          Returns the record for a given position in the range [0, N-1], where N is the number of records present in the TextDB.
 String[] getRange(int i, int j)
          Returns the records having positions from i to j in the TextDB.
 String[] getSequential(int[] records)
          Given a sorted array of record positions, this method returns the records at those positions.
 void getSequential(int[] records, int field, PrintStream out)
          Given a sorted array of record positions and the position of a field, this method retrieves the specified field from those records.
static void main(String[] args)
           
 int size()
          Returns the number of records contained in this TextDB.
 
Methods inherited from class it.unipi.di.textdb.TextDB
build, close, fromTDBFile, get, getFieldValues, getName, getRange, getRecordFields, getSequential, open, setFieldSeparator
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BucketedPPM

public BucketedPPM(String filename)
Parameters:
filename - the name of the textual file over which the TextDB is built
Method Detail

build

public TextDB build(PrintStream log)
             throws IOException
Description copied from class: TextDB
Builds the TextDB over the textual file identified by the filename string used in the constructor (see TextDB.TextDB(String)). This method runs a build process with default values for all input parameters.

Log messages will be dumped into the passed PrintStream, or suppressed if the passed reference is null.

Specified by:
build in class TextDB
Parameters:
log - a PrintStream for log messages. A null value will suppress any output message
Returns:
A TextDB instance to access the built database.
Throws:
IOException
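
The null-suppresses-output convention described above can be illustrated with a small helper. This is only a sketch: LogSketch, emit and runSteps are hypothetical names, the step messages are invented for illustration, and nothing here reflects how BucketedPPM actually logs or builds.

```java
import java.io.PrintStream;

// Hypothetical helper illustrating the "null log suppresses output" convention.
class LogSketch {
    // Writes the message only when a log stream was supplied.
    static void emit(PrintStream log, String message) {
        if (log != null) {
            log.println(message);
        }
    }

    // Stand-in for a process that reports progress; it does not build anything.
    static int runSteps(PrintStream log) {
        String[] steps = {"scanning input", "writing output"}; // illustrative step names
        for (String step : steps) {
            emit(log, step);
        }
        return steps.length;
    }

    public static void main(String[] args) {
        runSteps(System.out); // messages go to standard output
        runSteps(null);       // same steps, no output
    }
}
```

Routing every message through one null-checking helper keeps the calling code free of repeated null tests.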

build

public static TextDB build(String filename,
                           int contextSize,
                           int minFreq,
                           int memLimit,
                           int blockSize,
                           int bucketSize,
                           PrintStream log)
                    throws IOException
Throws:
IOException

get

public String get(int record)
           throws IOException
Description copied from class: TextDB
Returns the record for a given position in the range [0, N-1], where N is the number of records present in the TextDB.

Specified by:
get in class TextDB
Parameters:
record - a position in the range [0, N-1]
Returns:
the requested record
Throws:
IOException

getRange

public String[] getRange(int i,
                         int j)
                  throws IOException
Description copied from class: TextDB
Returns the records having positions from i to j in the TextDB.

Specified by:
getRange in class TextDB
Parameters:
i - the starting position of the records to retrieve (inclusive)
j - the ending position of the records to retrieve (inclusive)
Returns:
the records in the defined range
Throws:
IOException
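
Because both bounds are inclusive, a call getRange(i, j) yields j - i + 1 records. The following minimal sketch shows that contract over an in-memory list. RangeSketch and its List-backed storage are hypothetical stand-ins, not the BucketedPPM implementation, and the bounds check is an assumption, since this page does not say how out-of-range positions are handled.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory stand-in used only to illustrate inclusive range semantics.
class RangeSketch {
    // Returns the records at positions i..j, both ends inclusive.
    static String[] getRange(List<String> records, int i, int j) {
        // Assumption: invalid bounds are rejected; the real class may behave differently.
        if (i < 0 || j >= records.size() || i > j) {
            throw new IndexOutOfBoundsException("invalid range [" + i + ", " + j + "]");
        }
        // subList excludes its upper bound, hence j + 1.
        return records.subList(i, j + 1).toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("r0", "r1", "r2", "r3", "r4");
        String[] slice = getRange(records, 1, 3);
        System.out.println(slice.length);            // j - i + 1 records
        System.out.println(String.join(",", slice)); // the records at positions 1, 2 and 3
    }
}
```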

getSequential

public String[] getSequential(int[] records)
                       throws IOException
Description copied from class: TextDB
Given a sorted array of record positions, this method returns the records at those positions.

If some of the requested records are not available, the behavior is unspecified and depends on the underlying implementation.

Specified by:
getSequential in class TextDB
Parameters:
records - a sorted array of record positions
Returns:
the records having these positions (order is preserved)
Throws:
IOException
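
The sorted-input requirement and the order-preserving result can be sketched as follows. SequentialSketch is a hypothetical in-memory stand-in, and rejecting unsorted input with an exception is an assumption: this page does not state what BucketedPPM does when the array is not sorted.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in illustrating lookup by a sorted array of positions.
class SequentialSketch {
    // True when the positions are in non-decreasing order.
    static boolean isSorted(int[] positions) {
        for (int k = 1; k < positions.length; k++) {
            if (positions[k - 1] > positions[k]) {
                return false;
            }
        }
        return true;
    }

    // Returns the records at the given positions, in the same order as the input array.
    static String[] getSequential(List<String> records, int[] positions) {
        // Assumption: unsorted input is rejected rather than silently accepted.
        if (!isSorted(positions)) {
            throw new IllegalArgumentException("positions must be sorted");
        }
        String[] result = new String[positions.length];
        for (int k = 0; k < positions.length; k++) {
            result[k] = records.get(positions[k]);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("r0", "r1", "r2", "r3", "r4");
        int[] wanted = {4, 0, 2};
        Arrays.sort(wanted); // callers holding unsorted positions can sort them first
        System.out.println(String.join(",", getSequential(records, wanted)));
    }
}
```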

getSequential

public void getSequential(int[] records,
                          int field,
                          PrintStream out)
                   throws IOException
Description copied from class: TextDB
Given a sorted array of record positions and the position of a field, this method retrieves the specified field from those records. If a record does not contain the requested field, the behavior depends on the implementation; implementing classes are encouraged to print an empty line in that case.
To dump all fields of the specified records, pass -1 as the field position.

The retrieved records are not kept in memory; each one is written to the provided PrintStream as soon as it is retrieved.

NOTE: implementations can use the method TextDB.getField(String, int), provided by this abstract class, which selects a field by scanning the record sequentially. Using a more efficient implementation of this function is encouraged.

Specified by:
getSequential in class TextDB
Parameters:
records - a sorted array of record positions
field - the position of the field to extract, or -1 to dump all fields
out - the output PrintStream
Throws:
IOException
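
The field-selection rules above (a field position, -1 for the whole record, an empty line for a missing field) can be sketched as follows. FieldDumpSketch is a hypothetical in-memory stand-in. The tab separator is an assumption, because the value of DEFAULT_FIELD_SEPARATOR is not given on this page, and printing an empty line for a missing field follows the encouraged behavior rather than any confirmed BucketedPPM behavior.

```java
import java.io.PrintStream;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in illustrating streaming field extraction.
class FieldDumpSketch {
    // Assumption: fields are tab-separated; the real default separator is not documented here.
    static final String SEPARATOR = "\t";

    // Writes one line per requested record: the chosen field, or the whole record when field is -1.
    static void dump(List<String> records, int[] positions, int field, PrintStream out) {
        for (int position : positions) {
            String record = records.get(position);
            if (field == -1) {
                out.println(record); // dump all fields
                continue;
            }
            // A limit of -1 keeps trailing empty fields.
            String[] fields = record.split(Pattern.quote(SEPARATOR), -1);
            boolean present = field >= 0 && field < fields.length;
            out.println(present ? fields[field] : ""); // empty line when the field is missing
        }
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a\tb\tc", "x");
        dump(records, new int[] {0, 1}, 1, System.out);  // second field of each record
        dump(records, new int[] {0, 1}, -1, System.out); // whole records
    }
}
```

Each record is written as soon as it is read, so memory use does not grow with the number of requested positions.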

size

public int size()
Description copied from class: TextDB
Returns the number of records contained in this TextDB. If N is the returned value then records of this database are numbered from 0 to N-1.

Specified by:
size in class TextDB
Returns:
the number of records contained in this TextDB
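
Since records are numbered from 0 to N-1, a full scan pairs size() with get(int). The sketch below illustrates that loop. RecordStore and ArrayStore are hypothetical stand-ins that mirror only these two method names; unlike BucketedPPM.get, the stand-in does not declare IOException.

```java
// Hypothetical stand-in for a record store; only size() and get(int) are mirrored.
class ScanSketch {
    interface RecordStore {
        int size();
        String get(int record);
    }

    // Array-backed store used purely for illustration.
    static final class ArrayStore implements RecordStore {
        private final String[] records;

        ArrayStore(String... records) {
            this.records = records.clone();
        }

        public int size() {
            return records.length;
        }

        public String get(int record) {
            return records[record];
        }
    }

    // Visits positions 0..N-1 and returns the position of the longest record, or -1 if the store is empty.
    static int longestRecord(RecordStore db) {
        int best = -1;
        int bestLength = -1;
        int n = db.size();
        for (int position = 0; position < n; position++) {
            int length = db.get(position).length();
            if (length > bestLength) {
                best = position;
                bestLength = length;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        RecordStore db = new ArrayStore("alpha", "be", "gamma-delta");
        System.out.println(longestRecord(db));
    }
}
```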

main

public static void main(String[] args)
                 throws Exception
Throws:
Exception