Package semantics
Classes and interfaces for producing tree structures that represent
the internal organization of a text. This task is known as parsing the text, and
the resulting tree structures are called the text's parses. Typically, the
text is a single sentence, and the tree structure represents the
syntactic structure of the sentence. However, parsers can also be used in
other domains. For example, parsers can be used to derive the
morphological structure of the morphemes that make up a word, or to
derive the discourse structure for a set of utterances.
Sometimes, a single piece of text can be represented by more than one
tree structure. Texts represented by more than one tree structure are
called ambiguous texts. Note that there are actually two ways in which
a text can be ambiguous:
- The text has multiple correct parses.
- There is not enough information to decide which of several candidate
  parses is correct.
However, the parser module does not distinguish these two types
of ambiguity.
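To make ambiguity concrete, the following minimal sketch enumerates every
parse of a sentence under a toy grammar. It does not use the parser module:
the grammar, the lexicon, and the function names (parses, all_parses,
bracket) are hypothetical and chosen only for illustration, and trees are
represented as plain nested tuples.

```python
# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
# Keys are (left child label, right child label); values are parent labels.
BINARY_RULES = {
    ("NP", "VP"): ["S"],
    ("V", "NP"): ["VP"],
    ("VP", "PP"): ["VP"],   # prepositional phrase attaches to the verb phrase
    ("Det", "N"): ["NP"],
    ("NP", "PP"): ["NP"],   # prepositional phrase attaches to the noun phrase
    ("P", "NP"): ["PP"],
}
LEXICON = {
    "I": ["NP"], "saw": ["V"], "the": ["Det"],
    "man": ["N"], "with": ["P"], "telescope": ["N"],
}


def parses(tokens, start, end):
    """Return every tree that spans tokens[start:end], as nested tuples."""
    if end - start == 1:
        word = tokens[start]
        return [(label, word) for label in LEXICON.get(word, [])]
    trees = []
    for mid in range(start + 1, end):
        for left in parses(tokens, start, mid):
            for right in parses(tokens, mid, end):
                for label in BINARY_RULES.get((left[0], right[0]), []):
                    trees.append((label, left, right))
    return trees


def all_parses(tokens, start_symbol="S"):
    """Return the trees covering all of tokens whose root is start_symbol."""
    return [t for t in parses(tokens, 0, len(tokens)) if t[0] == start_symbol]


def bracket(tree):
    """Render a nested-tuple tree in bracketed notation."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return f"({label} {children[0]})"
    return f"({label} " + " ".join(bracket(c) for c in children) + ")"


trees = all_parses("I saw the man with the telescope".split())
for tree in trees:
    print(bracket(tree))
```

Under this grammar the sentence receives more than one tree, because the
prepositional phrase can attach either to the noun phrase or to the verb
phrase. The enumeration itself cannot say whether both readings are correct
or whether more information would rule one out, which mirrors the point
that the two kinds of ambiguity are not distinguished.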
The parser module defines ParseI, a standard interface for parsing texts,
and two simple implementations of that interface, ShiftReduce and
RecursiveDescent. It also contains three sub-modules for specialized kinds
of parsing, including:
- nltk.parser.chart defines chart parsing, which uses dynamic programming
  to efficiently parse texts.
- nltk.parser.probabilistic defines probabilistic parsing, which associates
  a probability with each parse.
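The two ideas above, dynamic programming over a chart and probabilities
attached to parses, can be combined in a small CYK-style sketch. This is
not the code in nltk.parser.chart or nltk.parser.probabilistic: the
grammar, the rule probabilities, and the name viterbi_cyk are assumptions
made up for this example. Each chart cell stores, for every label, the most
probable tree found so far for that span, so each sub-span is analyzed once
and reused.

```python
from collections import defaultdict

# Hypothetical probabilistic grammar in Chomsky normal form.
# Each binary rule is (parent, left child, right child, probability).
BINARY = [
    ("S", "NP", "VP", 1.0),
    ("VP", "V", "NP", 0.7),
    ("VP", "VP", "PP", 0.3),
    ("NP", "Det", "N", 0.6),
    ("NP", "NP", "PP", 0.2),
    ("PP", "P", "NP", 1.0),
]
# Each word maps to a list of (label, probability) pairs.
LEXICAL = {
    "I": [("NP", 0.2)],
    "saw": [("V", 1.0)],
    "the": [("Det", 1.0)],
    "man": [("N", 0.5)],
    "telescope": [("N", 0.5)],
    "with": [("P", 1.0)],
}


def viterbi_cyk(tokens, start_symbol="S"):
    """Return (probability, tree) for the best parse, or None if there is none."""
    n = len(tokens)
    # chart[i, j][label] = (probability, tree) of the best analysis of tokens[i:j]
    chart = defaultdict(dict)
    for i, word in enumerate(tokens):
        for label, prob in LEXICAL.get(word, []):
            chart[i, i + 1][label] = (prob, (label, word))
    # Fill wider spans from narrower ones, reusing already computed cells.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = chart[i, j]
            for k in range(i + 1, j):
                for parent, left, right, rule_prob in BINARY:
                    if left in chart[i, k] and right in chart[k, j]:
                        left_prob, left_tree = chart[i, k][left]
                        right_prob, right_tree = chart[k, j][right]
                        prob = rule_prob * left_prob * right_prob
                        if prob > cell.get(parent, (0.0, None))[0]:
                            cell[parent] = (prob, (parent, left_tree, right_tree))
    return chart[0, n].get(start_symbol)


best = viterbi_cyk("I saw the man with the telescope".split())
if best is not None:
    probability, tree = best
    print(probability, tree)
```

Keeping only the best entry per label and span is what makes this a
Viterbi-style search. A chart parser that must return all parses would
store every analysis in each cell instead.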

- ParseI: A processing class for deriving trees that represent possible
  structures for a sequence of tokens.
- AbstractParse: An abstract base class for parsers.
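The relationship between an interface, an abstract base class, and a
concrete parser can be sketched as follows. The class and method names here
(ParserInterface, BaseParser, GreedyShiftReduceParser, parse_all,
parse_best) are hypothetical stand-ins and are not the actual signatures of
ParseI, AbstractParse, or ShiftReduce. The concrete class is a deliberately
simple greedy shift-reduce strategy, assumed for illustration.

```python
from abc import ABC, abstractmethod


class ParserInterface(ABC):
    """Hypothetical interface: derive trees for a sequence of tokens."""

    @abstractmethod
    def parse_all(self, tokens):
        """Return a list of trees for tokens (empty if there is no parse)."""


class BaseParser(ParserInterface):
    """Hypothetical abstract base class that adds shared convenience methods."""

    def parse_best(self, tokens):
        """Return the first tree from parse_all, or None if there is none."""
        trees = self.parse_all(tokens)
        return trees[0] if trees else None


class GreedyShiftReduceParser(BaseParser):
    """Shift one token at a time and reduce whenever a rule matches the stack top."""

    def __init__(self, rules, lexicon, start_symbol="S"):
        self.rules = rules          # list of (lhs, tuple of rhs labels)
        self.lexicon = lexicon      # word -> label
        self.start_symbol = start_symbol

    def parse_all(self, tokens):
        stack = []
        for word in tokens:
            if word not in self.lexicon:
                return []
            stack.append((self.lexicon[word], word))    # shift
            self._reduce(stack)
        if len(stack) == 1 and stack[0][0] == self.start_symbol:
            return [stack[0]]
        return []

    def _reduce(self, stack):
        """Apply the first matching rule repeatedly until none applies."""
        reduced = True
        while reduced:
            reduced = False
            for lhs, rhs in self.rules:
                n = len(rhs)
                if len(stack) >= n and tuple(t[0] for t in stack[-n:]) == rhs:
                    children = stack[-n:]
                    del stack[-n:]
                    stack.append((lhs, *children))
                    reduced = True
                    break


rules = [("S", ("NP", "VP")), ("VP", ("V", "NP")), ("NP", ("Det", "N"))]
lexicon = {"the": "Det", "dog": "N", "cat": "N", "chased": "V"}
parser = GreedyShiftReduceParser(rules, lexicon)
print(parser.parse_best("the dog chased the cat".split()))
```

Because this strategy never backtracks, it returns at most one tree and can
fail on sentences that the grammar does accept. Callers that depend only on
the interface would not need to change if a more thorough parser were
substituted.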