morfologik.stemming
Class Dictionary

java.lang.Object
  extended by morfologik.stemming.Dictionary

public final class Dictionary
extends java.lang.Object

A dictionary combines an FSA automaton with metadata describing the internals of dictionary entries' coding (DictionaryMetadata).

A dictionary consists of two files: an FSA file with the compiled dictionary data, and a metadata file describing the dictionary.

Use static methods in this class to read dictionaries and their metadata.


Field Summary
static java.util.WeakHashMap<java.lang.String,Dictionary> defaultDictionaries
          Default loaded dictionaries.
 FSA fsa
          FSA automaton with the compiled dictionary data.
 DictionaryMetadata metadata
          Metadata associated with the dictionary.
static java.lang.String METADATA_FILE_EXTENSION
          Expected metadata file extension.
 
Constructor Summary
Dictionary(FSA fsa, DictionaryMetadata metadata)
          It is strongly recommended to use static methods in this class for reading dictionaries.
 
Method Summary
static java.lang.String getExpectedFeaturesName(java.lang.String name)
          Returns the expected name of the metadata file, based on the name of the FSA dictionary file.
static Dictionary getForLanguage(java.lang.String languageCode)
          Returns a built-in dictionary for a given ISO language code.
static Dictionary read(java.io.File fsaFile)
          Attempts to load a dictionary using the path to the FSA file and the expected metadata extension.
static Dictionary read(java.net.URL fsaURL)
          Attempts to load a dictionary using the URL to the FSA file and the expected metadata extension.
static Dictionary readAndClose(java.io.InputStream fsaData, java.io.InputStream featuresData)
          Attempts to load a dictionary from opened streams of FSA dictionary data and associated metadata.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

METADATA_FILE_EXTENSION

public static final java.lang.String METADATA_FILE_EXTENSION
Expected metadata file extension.

See Also:
Constant Field Values

fsa

public final FSA fsa
FSA automaton with the compiled dictionary data.


metadata

public final DictionaryMetadata metadata
Metadata associated with the dictionary.


defaultDictionaries

public static final java.util.WeakHashMap<java.lang.String,Dictionary> defaultDictionaries
Default loaded dictionaries.

Constructor Detail

Dictionary

public Dictionary(FSA fsa,
                  DictionaryMetadata metadata)
It is strongly recommended to use static methods in this class for reading dictionaries.

Parameters:
fsa - An instantiated FSA instance.
metadata - A map of attributes describing the compression format and other settings not contained in the FSA automaton. For an explanation of available attributes and their possible values, see DictionaryMetadata.

Method Detail

read

public static Dictionary read(java.io.File fsaFile)
                       throws java.io.IOException
Attempts to load a dictionary using the path to the FSA file and the expected metadata extension.

Throws:
java.io.IOException

read

public static Dictionary read(java.net.URL fsaURL)
                       throws java.io.IOException

Attempts to load a dictionary using the URL to the FSA file and the expected metadata extension.

This method can be used to load resource-based dictionaries, but be aware of JAR resource-locking issues that arise from resource URLs.

Throws:
java.io.IOException
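
The locking issue mentioned above is commonly attributed to the JDK's URL connection cache keeping a JAR file open. A minimal sketch of one widely used workaround follows: it opens the stream with caching disabled, using only standard JDK calls. The class and method names (UncachedUrlStreams, openUncached) are hypothetical and not part of this library, and whether this avoids locking in a given environment is an assumption to verify.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper, not part of morfologik.stemming.
class UncachedUrlStreams {
    /** Opens a stream for the given URL with connection caching turned off. */
    static InputStream openUncached(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        // Ask the JDK not to serve this connection from, or store it in,
        // its shared cache (relevant for jar: URLs).
        connection.setUseCaches(false);
        return connection.getInputStream();
    }
}
```

Streams opened this way for the FSA data and its metadata could then be passed to readAndClose(java.io.InputStream, java.io.InputStream), which accepts already opened streams.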

readAndClose

public static Dictionary readAndClose(java.io.InputStream fsaData,
                                      java.io.InputStream featuresData)
                               throws java.io.IOException
Attempts to load a dictionary from opened streams of FSA dictionary data and associated metadata.

Throws:
java.io.IOException

getExpectedFeaturesName

public static java.lang.String getExpectedFeaturesName(java.lang.String name)
Returns the expected name of the metadata file, based on the name of the FSA dictionary file. The expected name is resolved by truncating any suffix of name and appending METADATA_FILE_EXTENSION.
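
The name resolution described above can be illustrated with a small stand-alone sketch. It is not the library's code: the class name is hypothetical, the extension value "info" is an assumed placeholder for METADATA_FILE_EXTENSION (whose actual value is listed under Constant Field Values), and the "suffix" is assumed to be everything from the last dot onward.

```java
// Hypothetical illustration, not part of morfologik.stemming.
class ExpectedFeaturesNameSketch {
    // Assumption: placeholder standing in for Dictionary.METADATA_FILE_EXTENSION.
    static final String ASSUMED_METADATA_EXTENSION = "info";

    /** Drops the part of name after the last dot, if any, then appends the extension. */
    static String expectedFeaturesName(String name) {
        int dotIndex = name.lastIndexOf('.');
        String base = (dotIndex >= 0) ? name.substring(0, dotIndex) : name;
        return base + "." + ASSUMED_METADATA_EXTENSION;
    }

    public static void main(String[] args) {
        System.out.println(expectedFeaturesName("polish.dict")); // prints "polish.info"
        System.out.println(expectedFeaturesName("polish"));      // prints "polish.info"
    }
}
```

Under these assumptions only the last suffix is replaced, so a name with several dots keeps all but its final segment.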


getForLanguage

public static Dictionary getForLanguage(java.lang.String languageCode)
Returns a built-in dictionary for a given ISO language code. Dictionaries are cached internally for potential reuse.

Throws:
java.lang.RuntimeException - if the dictionary is not bundled with the library.
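
The caching behaviour can be sketched with a WeakHashMap keyed by language code, the same map type as the defaultDictionaries field. This is an illustration under stated assumptions, not the library's implementation: the class name, the generic stand-in type D, the loader function, and the synchronization are all hypothetical.

```java
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

// Hypothetical illustration, not part of morfologik.stemming.
class LanguageCacheSketch<D> {
    // Keys are held weakly: an entry may disappear once its key string
    // is no longer strongly referenced elsewhere.
    private final Map<String, D> cache = new WeakHashMap<>();
    // Assumption: returns null when no dictionary exists for the code.
    private final Function<String, D> loader;

    LanguageCacheSketch(Function<String, D> loader) {
        this.loader = loader;
    }

    /** Returns the cached dictionary, loading and caching it on first use. */
    synchronized D getForLanguage(String languageCode) {
        D dictionary = cache.get(languageCode);
        if (dictionary == null) {
            dictionary = loader.apply(languageCode);
            if (dictionary == null) {
                throw new RuntimeException(
                    "No dictionary bundled for language: " + languageCode);
            }
            cache.put(languageCode, dictionary);
        }
        return dictionary;
    }
}
```

In this sketch, repeated calls with the same code return the same instance while the entry remains in the map, and an unknown code fails with a RuntimeException, mirroring the documented contract.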