Class ConfusionProbabilityRule

java.lang.Object
org.languagetool.rules.Rule
org.languagetool.rules.ngrams.ConfusionProbabilityRule

public abstract class ConfusionProbabilityRule extends Rule
LanguageTool's homophone confusion check, which uses ngram lookups to decide which word in a confusion set (from confusion_sets.txt) fits the context best. See also http://wiki.languagetool.org/finding-errors-using-n-gram-data.
Since:
2.7
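
The idea described above can be illustrated with a minimal, self-contained sketch. It is not LanguageTool's implementation: the class ConfusionSketch, the toy trigram counts, the scoring function, and the interpretation of factor as "the alternative must be at least this many times more likely" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an ngram-based confusion check; not LanguageTool code.
class ConfusionSketch {

  // Toy trigram counts (invented). A real setup would query a large ngram index.
  static final Map<String, Long> TRIGRAM_COUNTS = Map.of(
      "went to their", 50L,
      "to their house", 900L,
      "went to there", 40L,
      "to there house", 2L);

  // Multiplies the counts of all trigrams in the token list. Each count is
  // incremented by one so that an unseen trigram does not zero the product.
  static double score(List<String> tokens) {
    double score = 1.0;
    for (int i = 0; i + 3 <= tokens.size(); i++) {
      String trigram = String.join(" ", tokens.subList(i, i + 3));
      score *= TRIGRAM_COUNTS.getOrDefault(trigram, 0L) + 1;
    }
    return score;
  }

  // Returns the alternative if swapping it in at position pos makes the
  // sentence at least `factor` times more likely (assumed meaning), else null.
  static String betterAlternativeOrNull(List<String> tokens, int pos,
                                        String alternative, long factor) {
    List<String> replaced = new ArrayList<>(tokens);
    replaced.set(pos, alternative);
    return score(replaced) >= factor * score(tokens) ? alternative : null;
  }

  public static void main(String[] args) {
    List<String> sentence = List.of("went", "to", "there", "house");
    // prints "their"
    System.out.println(betterAlternativeOrNull(sentence, 2, "their", 10));
  }
}
```

Requiring a margin (the factor) rather than a plain "greater than" comparison is one way to keep such a check from flagging words whose alternatives are only slightly more frequent.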
  • Field Details

  • Constructor Details

  • Method Details

    • getFilenames

      @NotNull protected List<String> getFilenames()
    • getId

      public String getId()
      Description copied from class: Rule
      A string used to identify the rule in e.g. configuration files. This string is supposed to be unique and to stay the same in all upcoming versions of LanguageTool. It's supposed to contain only the characters A-Z and the underscore.
      Specified by:
      getId in class Rule
    • estimateContextForSureMatch

      public int estimateContextForSureMatch()
      Description copied from class: Rule
      A number that estimates how many words there must be after a match before we can be (relatively) sure the match is valid. This is useful for check-as-you-type, where a match might occur and the word that gets typed next makes the match disappear (something one would obviously like to avoid). Note: this may over-estimate the real context size. Returns -1 when the sentence needs to end to be sure there's a match.
      Overrides:
      estimateContextForSureMatch in class Rule
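
      A check-as-you-type client could use the returned estimate roughly as sketched here. The class SureMatchSketch and the method canShowMatch are hypothetical and not part of LanguageTool; treating a finished sentence as always sufficient is also an assumption.

```java
// Hypothetical editor-side helper; not part of LanguageTool.
class SureMatchSketch {

  // contextEstimate: the value returned by estimateContextForSureMatch().
  // wordsAfterMatch: number of words typed so far after the matched span.
  // sentenceEnded:   whether the sentence containing the match is complete.
  static boolean canShowMatch(int contextEstimate, int wordsAfterMatch,
                              boolean sentenceEnded) {
    if (sentenceEnded) {
      return true; // assumed: no further typing can change this sentence's context
    }
    if (contextEstimate == -1) {
      return false; // the rule needs the sentence to end first
    }
    return wordsAfterMatch >= contextEstimate;
  }

  public static void main(String[] args) {
    System.out.println(canShowMatch(3, 1, false));  // prints "false"
    System.out.println(canShowMatch(3, 3, false));  // prints "true"
    System.out.println(canShowMatch(-1, 5, false)); // prints "false"
  }
}
```

      Because the estimate may be larger than the real context size, a client following this sketch would sometimes show a match later than strictly necessary, but it would avoid showing one that disappears with the next word.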
    • match

      public RuleMatch[] match(AnalyzedSentence sentence)
      Description copied from class: Rule
      Check whether the given sentence matches this error rule, i.e. whether it contains the error detected by this rule. Note that the order in which this method is called is not always guaranteed, i.e. the sentence order in the text may be different than the order in which you get the sentences (this may be the case when LanguageTool is used as a LibreOffice/OpenOffice add-on, for example).
      Specified by:
      match in class Rule
      Parameters:
      sentence - a pre-analyzed sentence
      Returns:
      an array of RuleMatch objects
    • isLocalException

      private boolean isLocalException(AnalyzedSentence sentence, GoogleToken googleToken)
    • covers

      private boolean covers(int exceptionStartPos, int exceptionEndPos, int startPos, int endPos)
    • getSuggestions

      private List<String> getSuggestions(String message)
    • isException

      protected boolean isException(String sentenceText)
      Return true to prevent a match.
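
      The hook pattern can be sketched with stand-in classes. Everything in this example is hypothetical: SketchRule is a simplified stand-in rather than LanguageTool's class, its default of returning false is an assumption, and the URL-based exception is an invented example.

```java
// Hypothetical illustration of an isException-style hook; not LanguageTool code.
class IsExceptionSketch {

  // Simplified stand-in for a rule that consults a hook before reporting.
  static abstract class SketchRule {
    // Subclasses return true to prevent a match (default assumed to be false).
    protected boolean isException(String sentenceText) {
      return false;
    }

    // Reports only when the hook does not veto the sentence.
    final boolean reports(String sentenceText) {
      return !isException(sentenceText);
    }
  }

  // Invented subclass: never report on sentences that contain a URL.
  static class NoUrlRule extends SketchRule {
    @Override
    protected boolean isException(String sentenceText) {
      return sentenceText.contains("http://") || sentenceText.contains("https://");
    }
  }

  public static void main(String[] args) {
    SketchRule rule = new NoUrlRule();
    System.out.println(rule.reports("See http://example.org for there house")); // prints "false"
    System.out.println(rule.reports("I went to there house"));                  // prints "true"
  }
}
```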
    • getDescription

      public String getDescription()
      Description copied from class: Rule
      A short description of the error this rule can detect, usually in the language of the text that is checked.
      Specified by:
      getDescription in class Rule
    • getMessage

      protected String getMessage(ConfusionString textString, ConfusionString suggestion)
    • setConfusionPair

      public void setConfusionPair(ConfusionPair pair)
      Deprecated.
      used only for tests
    • getNGrams

      public int getNGrams()
      Returns the ngram level used, typically 3.
      Since:
      3.1
    • getBetterAlternativeOrNull

      @Nullable private ConfusionString getBetterAlternativeOrNull(GoogleToken token, List<GoogleToken> tokens, List<ConfusionString> confusionSet, long factor)
    • getAlternativeTerm

      private ConfusionString getAlternativeTerm(List<ConfusionString> confusionSet, GoogleToken token)
    • getConfusionString

      private ConfusionString getConfusionString(List<ConfusionString> confusionSet, GoogleToken token)
    • getBetterAlternativeOrNull

      private ConfusionString getBetterAlternativeOrNull(GoogleToken token, List<GoogleToken> tokens, ConfusionString otherWord, long factor)
    • getContext

      List<String> getContext(GoogleToken token, List<GoogleToken> tokens, String newToken, int toLeft, int toRight)
    • debug

      private void debug(String message, Object... vars)