Package com.inet.jorthodictionaries
Class BookGenerator
- java.lang.Object
-
- com.inet.jorthodictionaries.BookGenerator
-
- Direct Known Subclasses:
BookGenerator_ar
,BookGenerator_de
,BookGenerator_en
,BookGenerator_es
,BookGenerator_fr
,BookGenerator_it
,BookGenerator_nl
,BookGenerator_pl
,BookGenerator_pl_Engish
,BookGenerator_ru
,BookGenerator_ru_templates
,BookGenerator_sv
public abstract class BookGenerator extends java.lang.Object
How to use
- Download the latest Wiktionary file "pages_articles.xml". It is typically compressed. Its location has changed over time; it was last found at:
- http://dumps.wikimedia.org/arwiktionary/latest/arwiktionary-latest-pages-articles.xml.bz2
- http://dumps.wikimedia.org/dewiktionary/latest/dewiktionary-latest-pages-articles.xml.bz2
- http://dumps.wikimedia.org/enwiktionary/latest/enwiktionary-latest-pages-articles.xml.bz2
- http://dumps.wikimedia.org/eswiktionary/latest/eswiktionary-latest-pages-articles.xml.bz2
- http://dumps.wikimedia.org/frwiktionary/latest/frwiktionary-latest-pages-articles.xml.bz2
- http://dumps.wikimedia.org/itwiktionary/latest/itwiktionary-latest-pages-articles.xml.bz2
- http://dumps.wikimedia.org/nlwiktionary/latest/nlwiktionary-latest-pages-articles.xml.bz2
- http://dumps.wikimedia.org/plwiktionary/latest/plwiktionary-latest-pages-articles.xml.bz2
- http://dumps.wikimedia.org/ruwiktionary/latest/ruwiktionary-latest-pages-articles.xml.bz2
- Start the generator with the following command line (here with the language code de):
java -Xmx256m com.inet.jorthodictionaries.BookGenerator de
-
-
Constructor Summary
BookGenerator()
BookGenerator(Book book)
-
Method Summary
private void addFileToZip(java.util.zip.ZipOutputStream out, java.lang.String filename, boolean delete)
protected void addWord(java.lang.String word)
Add a word to the tree.
private void createPackage(java.lang.String language)
Generate the distribution package.
(package private) Book getBook()
Get the resulting book for the current generator.
protected int indexOf(java.lang.String string, char[] chars, int fromIndex)
Helper function for parsing the Wiktionary formats.
(package private) abstract boolean isValidLanguage(java.lang.String word, java.lang.String wikiText)
Check if a word is a valid word of the current language.
protected boolean isValidWord(java.lang.String word)
Check if the word is a valid word.
static void main(java.lang.String[] args)
(package private) void save(java.lang.String language)
private void saveStatistics(java.io.File dictFile)
Create statistics data and save it in statistics.txt.
(package private) void start(java.io.File file)
Begin reading the data from the XML stream.
-
-
-
Field Detail
-
book
private final Book book
-
-
Constructor Detail
-
BookGenerator
BookGenerator()
-
BookGenerator
BookGenerator(Book book)
-
-
Method Detail
-
main
public static void main(java.lang.String[] args) throws java.lang.Exception
- Throws:
java.lang.Exception
-
start
void start(java.io.File file) throws java.lang.Exception
Begin reading the data from the XML stream.
- Parameters:
file - the data in XML format
- Throws:
java.lang.Exception
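How start(File) reads the dump is not documented on this page. As a rough illustration only, the following sketch streams a MediaWiki export with the JDK's StAX parser and collects the title and wiki text of each page. The class name DumpReaderSketch and the method readPages are hypothetical and are not part of BookGenerator.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical stand-alone sketch; not the actual BookGenerator code. */
final class DumpReaderSketch {

    /** Collects title -> wiki text for every page found in the export XML. */
    static Map<String, String> readPages(Reader xml) throws XMLStreamException {
        Map<String, String> pages = new LinkedHashMap<>();
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // The dump needs no DTD processing, so switch it off.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        String title = null;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                String name = reader.getLocalName();
                if ("title".equals(name)) {
                    // Remember the title until the matching <text> arrives.
                    title = reader.getElementText();
                } else if ("text".equals(name) && title != null) {
                    pages.put(title, reader.getElementText());
                    title = null;
                }
            }
        }
        reader.close();
        return pages;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<mediawiki><page><title>Haus</title>"
                + "<revision><text>== Haus ==</text></revision></page></mediawiki>";
        for (Map.Entry<String, String> page : readPages(new StringReader(xml)).entrySet()) {
            System.out.println(page.getKey() + " -> " + page.getValue());
        }
    }
}
```

Streaming matters here because the dump files are large; a StAX reader keeps only the current page in memory. For a real dump, the Reader would wrap the decompressed file instead of an inline string.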
-
save
final void save(java.lang.String language) throws java.lang.Exception
- Throws:
java.lang.Exception
-
saveStatistics
private final void saveStatistics(java.io.File dictFile) throws java.lang.Exception
Create statistics data and save it in statistics.txt.
- Parameters:
dictFile - the created ortho file
- Throws:
java.lang.Exception - if an error occurs
-
createPackage
private final void createPackage(java.lang.String language) throws java.lang.Exception
Generate the distribution package.
- Throws:
java.lang.Exception
-
addFileToZip
private final void addFileToZip(java.util.zip.ZipOutputStream out, java.lang.String filename, boolean delete) throws java.lang.Exception
- Throws:
java.lang.Exception
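addFileToZip has no description here. A minimal sketch of what such a helper might do with java.util.zip is shown below. It assumes that the delete flag means "remove the source file after it has been written to the archive" and that the entry is stored under the bare file name. Both are guesses, and the class name ZipSketch is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical stand-alone sketch; not the actual BookGenerator code. */
final class ZipSketch {

    /** Writes the file as one entry into the open ZIP stream. */
    static void addFileToZip(ZipOutputStream out, String filename, boolean delete)
            throws IOException {
        Path path = Paths.get(filename);
        // Assumption: the entry name is the file name without its directory.
        out.putNextEntry(new ZipEntry(path.getFileName().toString()));
        Files.copy(path, out);
        out.closeEntry();
        if (delete) {
            // Assumption: "delete" removes the source file after it was added.
            Files.delete(path);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("dictionary", ".txt");
        Files.write(tmp, "Haus\n".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            addFileToZip(zip, tmp.toString(), true);
        }
        System.out.println("archive size in bytes: " + buffer.size());
    }
}
```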
-
indexOf
protected final int indexOf(java.lang.String string, char[] chars, int fromIndex)
Helper function for parsing the Wiktionary formats.
- Parameters:
string - the string to search
chars - the characters to search for; must not be empty
fromIndex - the start position of the search; the index starts at 0
- Returns:
- the first occurrence of any of the characters in chars, or -1 if none is found
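The contract described above (first occurrence of any of several characters, starting at fromIndex, or -1) can be sketched as follows. This is an illustration written from the description, not the original implementation. The class name IndexOfSketch is hypothetical, and treating a negative fromIndex like 0 is an assumption.

```java
/** Hypothetical stand-alone sketch; not the actual BookGenerator code. */
final class IndexOfSketch {

    /**
     * Returns the index of the first character at or after fromIndex that
     * equals any entry of chars, or -1 if there is no such character.
     */
    static int indexOf(String string, char[] chars, int fromIndex) {
        // Assumption: a negative start index is treated like 0.
        for (int i = Math.max(fromIndex, 0); i < string.length(); i++) {
            char current = string.charAt(i);
            for (char candidate : chars) {
                if (current == candidate) {
                    return i;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        char[] delimiters = {'|', '}'};
        // Locate the end of the template name in a wiki template call.
        System.out.println(indexOf("{{Sprache|Deutsch}}", delimiters, 2));
    }
}
```

A helper like this is convenient when a token in wiki markup can be terminated by more than one delimiter, for example a template name that ends at either a pipe or a closing brace.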
-
isValidWord
protected boolean isValidWord(java.lang.String word)
Check if the word is a valid word. This excludes help pages and some phrases. It should always be called before addWord(String).
- Parameters:
word - the word to check
- Returns:
- true if the word is valid
-
addWord
protected final void addWord(java.lang.String word)
Add a word to the tree.
- Parameters:
word - must not be null
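The calling order noted for isValidWord(String) and addWord(String) can be illustrated with a small stand-alone sketch. The filter rules below (reject titles containing a colon, such as namespace or help pages, and titles containing a space, such as phrases) are assumptions made for the example; the actual rules are not listed on this page. A sorted set stands in for the word tree, and the names WordFilterSketch and offer are hypothetical.

```java
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-alone sketch; not the actual BookGenerator code. */
final class WordFilterSketch {

    // Stand-in for the word tree of the real generator.
    private final Set<String> words = new TreeSet<>();

    /** Assumed rules: no empty titles, no "Namespace:Page" titles, no phrases. */
    boolean isValidWord(String word) {
        if (word == null || word.isEmpty()) {
            return false;
        }
        if (word.indexOf(':') >= 0) {
            return false; // e.g. help pages such as "Hilfe:Inhalt"
        }
        return word.indexOf(' ') < 0; // multi-word phrases are skipped
    }

    /** Adds a word; the argument must not be null. */
    void addWord(String word) {
        words.add(Objects.requireNonNull(word, "word"));
    }

    /** Validates first, then adds, which is the order the documentation asks for. */
    void offer(String title) {
        if (isValidWord(title)) {
            addWord(title);
        }
    }

    Set<String> words() {
        return words;
    }

    public static void main(String[] args) {
        WordFilterSketch filter = new WordFilterSketch();
        for (String title : new String[] {"Haus", "Hilfe:Inhalt", "guten Tag"}) {
            filter.offer(title);
        }
        System.out.println(filter.words());
    }
}
```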
-
getBook
Book getBook()
Get the resulting book for the current generator.
- Returns:
- the book
-
isValidLanguage
abstract boolean isValidLanguage(java.lang.String word, java.lang.String wikiText)
Check if a word is a valid word of the current language. With getBook().addWord() you can add additional inflections of the word. The current word itself does not need to be added.
- Parameters:
word - the word to test
wikiText - the description from Wiktionary
- Returns:
- true if valid
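A language-specific subclass has to decide from the wiki text whether the entry belongs to its language. The sketch below shows one way this could look for German. It is self-contained and therefore uses a stub in place of Book. The language marker "{{Sprache|Deutsch}}" and the "Nominativ Plural=" key are assumed markup conventions chosen for illustration, and the names LanguageCheckSketch and BookStub are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-alone sketch; not an actual BookGenerator subclass. */
final class LanguageCheckSketch {

    /** Stand-in for the Book returned by getBook(). */
    static final class BookStub {
        final Set<String> words = new TreeSet<>();

        void addWord(String word) {
            words.add(word);
        }
    }

    // Assumed markup conventions, for illustration only.
    private static final String LANGUAGE_MARKER = "{{Sprache|Deutsch}}";
    private static final String PLURAL_KEY = "Nominativ Plural=";

    private final BookStub book = new BookStub();

    BookStub getBook() {
        return book;
    }

    /** Returns true if the entry is marked as German; also records a plural form if present. */
    boolean isValidLanguage(String word, String wikiText) {
        if (!wikiText.contains(LANGUAGE_MARKER)) {
            return false;
        }
        int start = wikiText.indexOf(PLURAL_KEY);
        if (start >= 0) {
            start += PLURAL_KEY.length();
            int end = start;
            while (end < wikiText.length() && Character.isLetter(wikiText.charAt(end))) {
                end++;
            }
            if (end > start) {
                // Only the inflected form is added; the caller handles the word itself.
                getBook().addWord(wikiText.substring(start, end));
            }
        }
        return true;
    }

    public static void main(String[] args) {
        LanguageCheckSketch generator = new LanguageCheckSketch();
        String wikiText = "== Hund ({{Sprache|Deutsch}}) ==\n|Nominativ Plural=Hunde\n";
        System.out.println(generator.isValidLanguage("Hund", wikiText));
        System.out.println(generator.getBook().words);
    }
}
```

In the real class the method is package-private and abstract, so an actual subclass would live in the same package as BookGenerator and override it there.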
-
-