morfologik.fsa
Class FSABuilder

java.lang.Object
  extended by morfologik.fsa.FSABuilder

public final class FSABuilder
extends java.lang.Object

Fast, memory-conservative finite state automaton builder. It returns a byte-serialized ConstantArcSizeFSA, a representation chosen as a tradeoff between construction speed and memory consumption.


Nested Class Summary
static class FSABuilder.InfoEntry
          Debug and information constants.
 
Field Summary
static java.util.Comparator<byte[]> LEXICAL_ORDERING
          Comparator comparing full byte arrays consistently with compare(byte[], int, int, byte[], int, int).
 
Constructor Summary
FSABuilder()
           
FSABuilder(int bufferGrowthSize)
           
 
Method Summary
 void add(byte[] sequence, int start, int len)
          Add a single sequence of bytes to the FSA.
static FSA build(byte[][] input)
          Build a minimal, deterministic automaton from a sorted list of byte sequences.
static FSA build(java.lang.Iterable<byte[]> input)
          Build a minimal, deterministic automaton from an iterable list of byte sequences.
static int compare(byte[] s1, int start1, int lens1, byte[] s2, int start2, int lens2)
          Lexicographic order of input sequences.
 FSA complete()
          Complete the automaton.
 java.util.Map<FSABuilder.InfoEntry,java.lang.Object> getInfo()
          Return various statistics concerning the FSA and its compilation.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LEXICAL_ORDERING

public static final java.util.Comparator<byte[]> LEXICAL_ORDERING
Comparator comparing full byte arrays consistently with compare(byte[], int, int, byte[], int, int).

Constructor Detail

FSABuilder

public FSABuilder()

FSABuilder

public FSABuilder(int bufferGrowthSize)

Method Detail

add

public void add(byte[] sequence,
                int start,
                int len)
Add a single sequence of bytes to the FSA. The input must be lexicographically greater than any previously added sequence.
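The ordering requirement can be checked before any data reaches the builder. The following is a minimal sketch and not part of morfologik: the class SortedInputCheck and the method isStrictlyIncreasing are hypothetical names. It assumes the unsigned byte order (0-255) described for compare, reads "greater than" literally so that duplicates are rejected, and needs Java 9 or later for Arrays.compareUnsigned.

```java
import java.util.Arrays;

public final class SortedInputCheck {
    private SortedInputCheck() {}

    /**
     * Returns true if every sequence is strictly greater than its predecessor
     * when bytes are compared as unsigned values (0-255).
     * Hypothetical helper; not a morfologik API.
     */
    public static boolean isStrictlyIncreasing(byte[][] input) {
        for (int i = 1; i < input.length; i++) {
            // compareUnsigned is lexicographic; a proper prefix sorts first.
            if (Arrays.compareUnsigned(input[i - 1], input[i]) >= 0) {
                return false;
            }
        }
        return true;
    }
}
```

A caller might run such a check once over the whole input and only then add the sequences one by one.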


complete

public FSA complete()
Complete the automaton.


build

public static FSA build(byte[][] input)
Build a minimal, deterministic automaton from a sorted list of byte sequences.
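Since the input is expected to be sorted already, it may help to see one way of preparing it. This is a sketch under assumptions, not morfologik code: SortedInputPreparer, UNSIGNED_ORDER and toSortedUtf8 are hypothetical names, the words are assumed to be encoded as UTF-8, and the sort order is assumed to be the unsigned byte order described for compare. It requires Java 9 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;

public final class SortedInputPreparer {
    private SortedInputPreparer() {}

    /** Unsigned, lexicographic order over whole byte arrays (assumed ordering). */
    static final Comparator<byte[]> UNSIGNED_ORDER =
            (a, b) -> Arrays.compareUnsigned(a, b);

    /** Encodes each word as UTF-8 and returns the arrays sorted in unsigned byte order. */
    public static byte[][] toSortedUtf8(String... words) {
        byte[][] out = new byte[words.length][];
        for (int i = 0; i < words.length; i++) {
            out[i] = words[i].getBytes(StandardCharsets.UTF_8);
        }
        // A signed comparison would place bytes >= 0x80 before ASCII; this one does not.
        Arrays.sort(out, UNSIGNED_ORDER);
        return out;
    }
}
```

An array prepared this way is the kind of input build(byte[][]) describes. Duplicates are not removed here, so a caller who needs strictly increasing input would have to drop them separately.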


build

public static FSA build(java.lang.Iterable<byte[]> input)
Build a minimal, deterministic automaton from an iterable list of byte sequences.


getInfo

public java.util.Map<FSABuilder.InfoEntry,java.lang.Object> getInfo()
Return various statistics concerning the FSA and its compilation.


compare

public static int compare(byte[] s1,
                          int start1,
                          int lens1,
                          byte[] s2,
                          int start2,
                          int lens2)
Lexicographic order of input sequences. By default, consistent with the "C" sort: bytes are compared as unsigned values (0-255).
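To make the ordering concrete, here is a minimal sketch of a range-based comparison with the same parameter list. It is an illustration, not the library's implementation: the class name UnsignedRangeCompare is hypothetical, and it assumes that the "0-255" remark means each byte is widened to an unsigned int before comparison and that a sequence which is a proper prefix of another sorts first.

```java
public final class UnsignedRangeCompare {
    private UnsignedRangeCompare() {}

    /**
     * Compares s1[start1, start1 + lens1) with s2[start2, start2 + lens2)
     * lexicographically, treating each byte as an unsigned value (0-255).
     * Returns a negative, zero, or positive int.
     */
    public static int compare(byte[] s1, int start1, int lens1,
                              byte[] s2, int start2, int lens2) {
        final int common = Math.min(lens1, lens2);
        for (int i = 0; i < common; i++) {
            final int b1 = s1[start1 + i] & 0xff; // mask away the sign extension
            final int b2 = s2[start2 + i] & 0xff;
            if (b1 != b2) {
                return b1 - b2;
            }
        }
        // All shared positions are equal: the shorter range sorts first.
        return lens1 - lens2;
    }
}
```

The mask with 0xff matters because Java bytes are signed: without it, 0x80 would compare as -128 and sort before 0x7f.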