Package org.apache.pdfbox.tools
Class ExtractText
java.lang.Object
org.apache.pdfbox.tools.ExtractText
This is the main program that parses a PDF document and transforms it into
text.
-
Field Summary
Modifier and Type                                    Field
private static final String                          ALWAYSNEXT
private static final String                          CONSOLE
private static final String                          DEBUG
private boolean                                      debugOutput
private static final String                          ENCODING
private static final String                          END_PAGE
private static final String                          HTML
private static final String                          IGNORE_BEADS
private static final org.apache.commons.logging.Log  LOG
private static final String                          PASSWORD
private static final String                          ROTATION_MAGIC
private static final String                          SORT
private static final String                          START_PAGE
private static final String                          STD_ENCODING
-
Constructor Summary
Modifier  Constructor    Description
private   ExtractText()  Private constructor.
-
Method Summary
Modifier and Type             Method and Description
private void                  extractPages(int startPage, int endPage, PDFTextStripper stripper, PDDocument document, Writer output, boolean rotationMagic, boolean alwaysNext)
(package private) static int  getAngle(TextPosition text)
static void                   main(String[] args)
                              Infamous main method.
void                          startExtraction(String[] args)
                              Starts the text extraction.
private long                  startProcessing(String message)
private void                  stopProcessing(String message, long startTime)
private static void           usage()
                              This will print the usage requirements and exit.
-
Field Details
-
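LOG
private static final org.apache.commons.logging.Log LOG
-
PASSWORD
private static final String PASSWORD
-
ENCODING
private static final String ENCODING
-
CONSOLE
private static final String CONSOLE
-
START_PAGE
private static final String START_PAGE
-
END_PAGE
private static final String END_PAGE
-
SORT
private static final String SORT
-
IGNORE_BEADS
private static final String IGNORE_BEADS
-
DEBUG
private static final String DEBUG
-
HTML
private static final String HTML
-
ALWAYSNEXT
private static final String ALWAYSNEXT
-
ROTATION_MAGIC
private static final String ROTATION_MAGIC
-
STD_ENCODING
private static final String STD_ENCODING
-
debugOutput
private boolean debugOutput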
-
Constructor Details
-
ExtractText
private ExtractText()
Private constructor.
-
Method Details
-
main
public static void main(String[] args) throws IOException
Infamous main method.
Parameters:
args - Command line arguments; there should be one, a reference to a file.
Throws:
IOException - if there is an error reading the document or extracting the text.
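The constants in the field list (PASSWORD, START_PAGE, END_PAGE, SORT and so on) suggest that the command line arguments are matched against option flags. The following is a minimal sketch of that kind of matching, not the PDFBox source: the class name OptionSketch, the parse method, the "input" key and the flag spellings are all assumptions, since this page lists only the constant names. The sketch also accepts just one input file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for option handling; not taken from the PDFBox source. */
final class OptionSketch {
    // Assumed flag spellings: the documentation lists only the constant names.
    static final String PASSWORD = "-password";
    static final String START_PAGE = "-startPage";
    static final String END_PAGE = "-endPage";
    static final String SORT = "-sort";

    /** Collects recognised options into a map; the first bare argument is stored under "input". */
    static Map<String, String> parse(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (arg.equals(PASSWORD) || arg.equals(START_PAGE) || arg.equals(END_PAGE)) {
                // These options consume the following argument as their value.
                if (++i >= args.length) {
                    throw new IllegalArgumentException("Missing value for " + arg);
                }
                options.put(arg, args[i]);
            } else if (arg.equals(SORT)) {
                // Boolean switch without a value.
                options.put(arg, "true");
            } else if (!options.containsKey("input")) {
                options.put("input", arg);
            } else {
                throw new IllegalArgumentException("Unexpected argument: " + arg);
            }
        }
        if (!options.containsKey("input")) {
            throw new IllegalArgumentException("No input file given");
        }
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> options =
                parse(new String[] {"-startPage", "2", "-sort", "report.pdf"});
        System.out.println(options);
    }
}
```

A missing option value or a missing input file raises IllegalArgumentException here; the documented class instead has a usage() method that prints the usage requirements and exits.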
-
startExtraction
public void startExtraction(String[] args) throws IOException
Starts the text extraction.
Parameters:
args - the command line arguments.
Throws:
IOException - if there is an error reading the document or extracting the text.
-
extractPages
private void extractPages(int startPage, int endPage, PDFTextStripper stripper, PDDocument document, Writer output, boolean rotationMagic, boolean alwaysNext) throws IOException
Throws:
IOException
-
startProcessing
private long startProcessing(String message)
-
stopProcessing
private void stopProcessing(String message, long startTime)
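The signatures of startProcessing and stopProcessing, together with the debugOutput field, suggest a pair of timing helpers: one returns a start time and the other receives it back. The sketch below shows one way such a pair could work. Only the method names and parameter lists come from this page; the class name TimingSketch, the method bodies and the message format are assumptions.

```java
/** Hypothetical sketch of a start/stop timing pair; the bodies are assumptions. */
final class TimingSketch {
    private final boolean debugOutput;

    TimingSketch(boolean debugOutput) {
        this.debugOutput = debugOutput;
    }

    /** Announces a processing step (debug mode only) and returns its start time in milliseconds. */
    long startProcessing(String message) {
        if (debugOutput) {
            System.err.println(message);
        }
        return System.currentTimeMillis();
    }

    /** Reports the time elapsed since startTime (debug mode only). */
    void stopProcessing(String message, long startTime) {
        if (debugOutput) {
            long elapsed = System.currentTimeMillis() - startTime;
            System.err.println(message + elapsed + " ms");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        TimingSketch timer = new TimingSketch(true);
        long start = timer.startProcessing("Starting text extraction");
        Thread.sleep(20); // stand-in for the real work
        timer.stopProcessing("Time for extraction: ", start);
    }
}
```

Returning the start time from the first call keeps the helpers stateless, so several steps can be timed independently.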
-
getAngle
static int getAngle(TextPosition text)
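This page gives only the signature of getAngle, so the following is a sketch under a stated assumption: that the angle is the rotation, in whole degrees, encoded in a glyph's transformation matrix. To stay runnable without PDFBox, it uses java.awt.geom.AffineTransform as a stand-in for the matrix that a TextPosition would supply; the class name AngleSketch is hypothetical.

```java
import java.awt.geom.AffineTransform;

/** Hypothetical sketch: derives a rotation angle in whole degrees from a 2D transformation matrix. */
final class AngleSketch {
    static int getAngle(AffineTransform matrix) {
        // For a pure rotation by t, shearY = sin(t) and scaleY = cos(t),
        // so atan2 recovers t regardless of the quadrant.
        double radians = Math.atan2(matrix.getShearY(), matrix.getScaleY());
        return (int) Math.round(Math.toDegrees(radians));
    }

    public static void main(String[] args) {
        AffineTransform upright = new AffineTransform();
        AffineTransform rotated = AffineTransform.getRotateInstance(Math.toRadians(90));
        System.out.println(getAngle(upright));
        System.out.println(getAngle(rotated));
    }
}
```

Rounding to an integer makes it possible to group text by orientation, which is plausibly what the rotationMagic parameter of extractPages relies on, although this page does not say so.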
-
usage
private static void usage()
This will print the usage requirements and exit.
-