Class CompoundAwareHunspellRule
java.lang.Object
org.languagetool.rules.Rule
org.languagetool.rules.spelling.SpellingCheckRule
org.languagetool.rules.spelling.hunspell.HunspellRule
org.languagetool.rules.spelling.hunspell.CompoundAwareHunspellRule
A spell checker that combines Hunspell and Morfologik spell checking
to support compound words and offer fast suggestions for some misspelled
compound words.
-
Field Summary
Fields
Modifier and Type  Field  Description
private final CompoundWordTokenizer  compoundSplitter
private static final int  MAX_SUGGESTIONS
private final MorfologikMultiSpeller  morfoSpeller
Fields inherited from class org.languagetool.rules.spelling.hunspell.HunspellRule
FILE_EXTENSION, hunspellDict, needsInit, nonWordPattern, RULE_ID, suggestionsOrderer
Fields inherited from class org.languagetool.rules.spelling.SpellingCheckRule
ignoreWordsWithLength, language, languageModel, LANGUAGETOOL, LANGUAGETOOLER, wordListLoader
-
Constructor Summary
Constructors
Constructor  Description
CompoundAwareHunspellRule(ResourceBundle messages, Language language, CompoundWordTokenizer compoundSplitter, MorfologikMultiSpeller morfoSpeller, UserConfig userConfig)
CompoundAwareHunspellRule(ResourceBundle messages, Language language, CompoundWordTokenizer compoundSplitter, MorfologikMultiSpeller morfoSpeller, UserConfig userConfig, List<Language> altLanguages)
CompoundAwareHunspellRule(ResourceBundle messages, Language language, CompoundWordTokenizer compoundSplitter, MorfologikMultiSpeller morfoSpeller, UserConfig userConfig, List<Language> altLanguages, LanguageModel languageModel)
-
Method Summary
Modifier and Type  Method  Description
protected abstract void filterForLanguage(List<String> suggestions)
getCandidates(String word)
    Finds potential corrections. It is okay if some of these are not valid words, because the list is filtered against the spell checker before being returned to the user.
getCandidates(List<String> parts)
getCorrectWords(List<String> wordsOrPhrases)
getFilteredSuggestions(List<String> wordsOrPhrases)
getSuggestions(String word)
    Because generating suggestions with Hunspell is too slow, Morfologik is used to create them.
private void handleWordEndPunctuation(String punct, String word, List<String> noSplitSuggestions)
sortSuggestionByQuality(String misspelling, List<String> suggestions)
Methods inherited from class org.languagetool.rules.spelling.hunspell.HunspellRule
getActiveChecks, getDescription, getDictFilenameInResources, getId, getSentenceTextWithoutUrlsAndImmunizedTokens, init, isAcceptedWordFromLanguage, isMisspelled, isQuotedCompound, match, tokenizeText
Methods inherited from class org.languagetool.rules.spelling.SpellingCheckRule
acceptedInAlternativeLanguage, acceptPhrases, addIgnoreTokens, addIgnoreWords, addProhibitedWords, addSuggestionsToRuleMatch, createWrongSplitMatch, expandLine, filterDupes, filterSuggestions, getAdditionalProhibitFileNames, getAdditionalSpellingFileNames, getAdditionalSuggestions, getAdditionalTopSuggestions, getAlternativeLangSpellingRules, getAntiPatterns, getIgnoreFileName, getLanguageVariantSpellingFileName, getProhibitFileName, getSpellingFileName, ignoreToken, ignoreWord, ignoreWord, isDictionaryBasedSpellingRule, isEMail, isProhibited, isUrl, reorderSuggestions, setConsiderIgnoreWords, setConvertsCase, startsWithIgnoredWord
Methods inherited from class org.languagetool.rules.Rule
addExamplePair, estimateContextForSureMatch, getCategory, getConfigureText, getCorrectExamples, getDefaultValue, getErrorTriggeringExamples, getIncorrectExamples, getLocQualityIssueType, getMaxConfigurableValue, getMinConfigurableValue, getSentenceWithImmunization, getUrl, hasConfigurableValue, isDefaultOff, isDefaultTempOff, isOfficeDefaultOff, isOfficeDefaultOn, makeAntiPatterns, setCategory, setCorrectExamples, setDefaultOff, setDefaultOn, setDefaultTempOff, setErrorTriggeringExamples, setIncorrectExamples, setLocQualityIssueType, setOfficeDefaultOff, setOfficeDefaultOn, setUrl, supportsLanguage, toRuleMatchArray, useInOffice
-
Field Details
-
MAX_SUGGESTIONS
private static final int MAX_SUGGESTIONS
-
compoundSplitter
-
morfoSpeller
-
-
Constructor Details
-
CompoundAwareHunspellRule
public CompoundAwareHunspellRule(ResourceBundle messages, Language language, CompoundWordTokenizer compoundSplitter, MorfologikMultiSpeller morfoSpeller, UserConfig userConfig)
-
CompoundAwareHunspellRule
public CompoundAwareHunspellRule(ResourceBundle messages, Language language, CompoundWordTokenizer compoundSplitter, MorfologikMultiSpeller morfoSpeller, UserConfig userConfig, List<Language> altLanguages)
- Since:
- 4.3
-
CompoundAwareHunspellRule
public CompoundAwareHunspellRule(ResourceBundle messages, Language language, CompoundWordTokenizer compoundSplitter, MorfologikMultiSpeller morfoSpeller, UserConfig userConfig, List<Language> altLanguages, LanguageModel languageModel)
-
-
Method Details
-
filterForLanguage
-
getSuggestions
Because generating suggestions with Hunspell is too slow, Morfologik is used to create them. As this does not work for compounds that are not in the dictionary, the word is also split and suggestions are generated for its compound parts. In the end, all candidates are filtered against Hunspell again (which supports compounds).
- Overrides:
getSuggestions in class HunspellRule
- Throws:
IOException
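
The three steps described above (fast suggestions for the whole word, suggestions on the compound parts, final filtering with a compound-aware checker) can be illustrated with a minimal, self-contained sketch. It does not use the LanguageTool API: the class name CompoundSuggestionSketch, the method suggest, the limit MAX, and the toy dictionaries are all hypothetical stand-ins, and the way parts are recombined (replacing one part at a time) is an assumption made for illustration, not necessarily what this class does.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

public class CompoundSuggestionSketch {

  // Hypothetical cap on returned suggestions; the real MAX_SUGGESTIONS value is not stated here.
  static final int MAX = 5;

  /**
   * fastSuggester stands in for the Morfologik speller, splitter for the
   * compound tokenizer, and compoundAwareChecker for the Hunspell check.
   */
  static List<String> suggest(String word,
                              Function<String, List<String>> fastSuggester,
                              Function<String, List<String>> splitter,
                              Predicate<String> compoundAwareChecker) {
    // Step 1: fast suggestions for the whole word (insertion order kept, duplicates dropped).
    Set<String> candidates = new LinkedHashSet<>(fastSuggester.apply(word));

    // Step 2: split the word and build candidates by replacing one part at a time
    // (assumed recombination strategy). Some candidates may not be valid words.
    List<String> parts = splitter.apply(word);
    if (parts.size() > 1) {
      for (int i = 0; i < parts.size(); i++) {
        for (String replacement : fastSuggester.apply(parts.get(i))) {
          List<String> copy = new ArrayList<>(parts);
          copy.set(i, replacement);
          candidates.add(String.join("", copy));
        }
      }
    }

    // Step 3: keep only candidates the compound-aware checker accepts, up to MAX.
    List<String> result = new ArrayList<>();
    for (String candidate : candidates) {
      if (result.size() >= MAX) {
        break;
      }
      if (compoundAwareChecker.test(candidate)) {
        result.add(candidate);
      }
    }
    return result;
  }

  /** Toy data: "doorbel" is unknown as a whole, but its part "bel" has suggestions. */
  static List<String> demo() {
    Map<String, List<String>> fast = Map.of("bel", List.of("bell", "belt"));
    Map<String, List<String>> splits = Map.of("doorbel", List.of("door", "bel"));
    Set<String> valid = Set.of("doorbell");
    return suggest("doorbel",
        w -> fast.getOrDefault(w, List.of()),
        w -> splits.getOrDefault(w, List.of(w)),
        valid::contains);
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints [doorbell]
  }
}
```

In this toy run the intermediate candidate "doorbelt" is generated from the parts but dropped in the final filtering step, which is why over-generating candidates is harmless.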
-
handleWordEndPunctuation
-
getCandidates
Finds potential corrections. It is okay if some of these are not valid words, because the list is filtered against the spell checker before being returned to the user.
-
getCandidates
-
sortSuggestionByQuality
- Overrides:
sortSuggestionByQuality in class HunspellRule
-
getCorrectWords
-
getFilteredSuggestions
- Since:
- 4.7
-