public abstract class PageExtractor.Text extends Object implements Comparable<PageExtractor.Text>
A piece of text extracted by a PageExtractor. Each text object has a location on the page, font-size, font-name,
color and text.

Constructor and Description |
---|
Text() |
Modifier and Type | Method | Description |
---|---|---|
AnnotationMarkup | createAnnotationMarkup(String type) | Create a new AnnotationMarkup of the specified type to cover this text. |
float | getAngle() | Return the angle of rotation of this text on the page, in degrees clockwise from 12 o'clock. |
abstract float | getBaseline() | Return the baseline of the text item, as a fraction between 0 and 1. 0 would indicate the baseline is at the top of the text, 1 at the absolute bottom. |
abstract int | getByteLength() | Get the length of the original text in bytes. |
abstract int | getByteToCharOffset(int byteoffset) | Given a byte offset into the original String, return the character offset it refers to. |
abstract Paint | getColor() | Return the color of this text, or null if none was set. |
float[] | getCorners() | Return the four corners (x1,y1) (x2,y2) (x3,y3) (x4,y4) of the quadrilateral that encompasses the text. |
abstract float | getEndOffset(int pos) | As for getOffset(), but return the end position of that letter. |
abstract Reader | getFontMetaData() | Return any XMP MetaData that has been set on the Font, or null if none exists. |
abstract String | getFontName() | Return the font name of this text. |
abstract float | getFontSize() | Return the font size of this text in points. |
abstract float | getHorizontalScale() | Return an indication of the horizontal scale of the text. |
float | getLength() | Return the length of this Text in points. |
abstract Paint | getLineColor() | Return the outline color of this text, or null if none was set. |
abstract float | getOffset(int pos) | Given an offset into the text, return the start position of that letter. |
PDFPage | getPage() | Return the PDFPage this text was found on - simply the page the parent PageExtractor was created from. |
PageExtractor | getPageExtractor() | Return the PageExtractor this text was created from. |
abstract PageExtractor.Text | getPrimaryText() | If this text is a subtext or collection of Text objects, return the primary text it starts with. |
abstract int | getPrimaryTextOffset() | If this text is a subtext or collection of Text objects, return the offset into the primary text where it starts. |
abstract PageExtractor.Text | getRowNext() | Return the next Text item in this row, or null if there are none. |
abstract PageExtractor.Text | getRowPrevious() | Return the previous Text item in this row, or null if there are none. |
abstract PageExtractor.Text | getSubText(int off, int len) | Return a substring of this Text object as another Text object. |
abstract String | getText() | Return the text content of this text. |
abstract int | getTextLength() | Return the length of the String returned by getText(). |
abstract Shape | getVisualBounds() | Return the visual bounds of this text. |
abstract boolean | isHorizontal() | Indicates whether this text is horizontal or vertical. |
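Methods inherited from class java.lang.Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface java.lang.Comparable: compareTo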
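getRowNext() and getRowPrevious() return null at the ends of a row, so a row can be walked like a linked list. The following is a minimal sketch of that traversal, not code from the library. RowItem is a hypothetical stand-in interface that copies only three method signatures from the summary so the example compiles on its own, and DemoItem exists purely for demonstration; with the real API the same loop would be written against PageExtractor.Text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for PageExtractor.Text: only the methods used below.
interface RowItem {
    String getText();
    RowItem getRowNext();      // null if there are none
    RowItem getRowPrevious();  // null if there are none
}

// Demo-only implementation so the sketch can be exercised without a PDF.
class DemoItem implements RowItem {
    private final String text;
    private DemoItem next, previous;
    DemoItem(String text) { this.text = text; }
    static void link(DemoItem a, DemoItem b) { a.next = b; b.previous = a; }
    public String getText() { return text; }
    public RowItem getRowNext() { return next; }
    public RowItem getRowPrevious() { return previous; }
}

class RowWalker {
    /** Rewind to the first item of the row, then collect each item's text in row order. */
    static List<String> rowTexts(RowItem any) {
        RowItem first = any;
        while (first.getRowPrevious() != null) {
            first = first.getRowPrevious();
        }
        List<String> out = new ArrayList<>();
        for (RowItem t = first; t != null; t = t.getRowNext()) {
            out.add(t.getText());
        }
        return out;
    }
}
```

Starting from any item in the row gives the same list, because the walk first rewinds with getRowPrevious().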
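public float getLength()
Return the length of this Text in points.
public final float[] getCorners()
Return the four corners (x1,y1) (x2,y2) (x3,y3) (x4,y4) of the quadrilateral that encompasses the text.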
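Because the text may be rotated, the four corners from getCorners() do not necessarily form an axis-aligned rectangle. A minimal sketch of reducing them to an axis-aligned bounding box follows. It assumes the array is laid out as {x1, y1, x2, y2, x3, y3, x4, y4}, which is suggested by the summary but not confirmed here. CornerBounds and boundsOf are hypothetical names, not part of the library.

```java
class CornerBounds {
    /**
     * Reduce eight corner coordinates to {minX, minY, maxX, maxY}.
     * Assumes the layout {x1, y1, x2, y2, x3, y3, x4, y4}.
     */
    static float[] boundsOf(float[] corners) {
        if (corners == null || corners.length != 8) {
            throw new IllegalArgumentException("expected 8 values: x1,y1 ... x4,y4");
        }
        float minX = corners[0], maxX = corners[0];
        float minY = corners[1], maxY = corners[1];
        for (int i = 2; i < 8; i += 2) {
            minX = Math.min(minX, corners[i]);
            maxX = Math.max(maxX, corners[i]);
            minY = Math.min(minY, corners[i + 1]);
            maxY = Math.max(maxY, corners[i + 1]);
        }
        return new float[] { minX, minY, maxX, maxY };
    }
}
```

With the real API the input would be text.getCorners().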
public AnnotationMarkup createAnnotationMarkup(String type)
Create a new AnnotationMarkup of the specified type to cover this text. The annotation is not added to the page.
type - the type of markup - "Highlight", "Underline" etc.
public final float getAngle()
Return the angle of rotation of this text on the page, in degrees clockwise from 12 o'clock.
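getAngle() measures degrees clockwise from 12 o'clock, whereas most math routines expect angles counter-clockwise from the positive x-axis. The sketch below shows one way to convert between the two, assuming "12 o'clock" means the positive y direction. ClockAngle and its methods are hypothetical helpers, not part of the library.

```java
class ClockAngle {
    /** Convert degrees clockwise from 12 o'clock to degrees counter-clockwise from the +x axis, in [0, 360). */
    static double toMathDegrees(double clockDegrees) {
        double a = (90.0 - clockDegrees) % 360.0;
        return a < 0 ? a + 360.0 : a;
    }

    /** Unit direction {dx, dy} for a clock-style angle, assuming 12 o'clock is +y. */
    static double[] direction(double clockDegrees) {
        double r = Math.toRadians(clockDegrees);
        return new double[] { Math.sin(r), Math.cos(r) };
    }
}
```

With the real API the argument would be text.getAngle().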
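public abstract float getFontSize()
Return the font size of this text in points.
public abstract boolean isHorizontal()
Indicates whether this text is horizontal or vertical.
public abstract float getHorizontalScale()
Return an indication of the horizontal scale of the text.
public abstract float getBaseline()
Return the baseline of the text item, as a fraction between 0 and 1. 0 would indicate the baseline is at the top of the text, 1 at the absolute bottom.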
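Since getBaseline() is a fraction measured from the top of the text (0) to the bottom (1), it can be turned into a coordinate by linear interpolation. This is a minimal sketch under that reading. The topY and bottomY parameters are hypothetical inputs that the caller derives from the text's corners, and BaselinePosition is not part of the library.

```java
class BaselinePosition {
    /**
     * Interpolate between the top and bottom edge of the text.
     * baselineFraction is 0 at the top and 1 at the bottom, as getBaseline() is described.
     */
    static float baselineY(float topY, float bottomY, float baselineFraction) {
        return topY + baselineFraction * (bottomY - topY);
    }
}
```

The same formula works whether y increases upwards or downwards, as long as topY and bottomY are given in the same coordinate system.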
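public abstract float getOffset(int pos)
Given an offset into the text, return the start position of that letter. The returned value is scaled by getLength() to give a distance along the text, for example:
float left = text.getCorners()[0] + (text.getOffset(pos) * text.getLength());
pos - the position of the letter in the Text to retrieve the position for. In the range 0 to getText().length() - 1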
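The one-line formula above extends naturally to getEndOffset(), giving both edges of a letter. The sketch below assumes unrotated horizontal text, that getCorners()[0] is the x-coordinate of the start of the text, and that getEndOffset() uses the same scale as getOffset(). MeasuredText is a hypothetical stand-in interface that copies four method signatures from this class so the example compiles without the library.

```java
// Hypothetical stand-in for PageExtractor.Text: only the methods used below.
interface MeasuredText {
    float[] getCorners();
    float getLength();
    float getOffset(int pos);
    float getEndOffset(int pos);
}

class LetterSpan {
    /** Return {left, right} x-coordinates of the letter at pos, for unrotated horizontal text. */
    static float[] span(MeasuredText text, int pos) {
        float startX = text.getCorners()[0];
        float left = startX + text.getOffset(pos) * text.getLength();
        float right = startX + text.getEndOffset(pos) * text.getLength();
        return new float[] { left, right };
    }
}
```

For rotated text the same distances would have to be applied along the direction of the text rather than along the x-axis.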
public abstract float getEndOffset(int pos)
As for getOffset(), but return the end position of that letter.
public PDFPage getPage()
Return the PDFPage this text was found on - simply the page the parent PageExtractor was created from.
public PageExtractor getPageExtractor()
Return the PageExtractor this text was created from.
public abstract Paint getColor()
Return the color of this text, or null if none was set.
public abstract Paint getLineColor()
Return the outline color of this text, or null if none was set.
public abstract String getFontName()
Return the font name of this text.
public abstract String getText()
Return the text content of this text.
public abstract int getTextLength()
Return the length of the String returned by getText().
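public abstract PageExtractor.Text getRowNext()
Return the next Text item in this row, or null if there are none.
public abstract PageExtractor.Text getRowPrevious()
Return the previous Text item in this row, or null if there are none.
public abstract Reader getFontMetaData() throws IOException
Return any XMP MetaData that has been set on the Font, or null if none exists.
Throws: IOException
See also: PDF.getMetaData()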
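getFontMetaData() returns a Reader that may be null and may throw IOException while being read, so callers need to handle both cases. The following sketch drains such a Reader into a String using only java.io. MetaDataReader and readAll are hypothetical helper names, not part of the library.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringWriter;

class MetaDataReader {
    /** Read everything from the Reader and close it; returns null when no Reader was supplied. */
    static String readAll(Reader reader) throws IOException {
        if (reader == null) {
            return null; // no metadata was set on the font
        }
        try (Reader in = reader) {
            StringWriter out = new StringWriter();
            char[] buf = new char[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toString();
        }
    }
}
```

With the real API this would be called as MetaDataReader.readAll(text.getFontMetaData()).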
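public abstract PageExtractor.Text getSubText(int off, int len)
Return a substring of this Text object as another Text object.
off - the offset into the text
len - the number of characters to return
public abstract PageExtractor.Text getPrimaryText()
If this text is a subtext or collection of Text objects, return the primary text it starts with. If not, returns null.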
public abstract int getPrimaryTextOffset()
If this text is a subtext or collection of Text objects, return the offset into the primary text where it starts. If not, returns 0.
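public abstract int getByteLength()
Get the length of the original text in bytes.
public abstract int getByteToCharOffset(int byteoffset)
Given a byte offset into the original String, return the character offset it refers to.
See also: getByteLength()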
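Byte offsets and character offsets differ whenever a character occupies more than one byte. The sketch below illustrates the idea only. It assumes UTF-8 purely to make the mapping concrete, whereas the encoding of the original bytes in a PDF depends on the font and is not stated here. ByteToChar is a hypothetical helper and does not reproduce how getByteToCharOffset() is implemented.

```java
import java.nio.charset.StandardCharsets;

class ByteToChar {
    /**
     * Map a byte offset to a character offset by decoding the bytes before it.
     * Assumes UTF-8 input and that byteOffset falls on a character boundary.
     */
    static int byteToCharOffset(byte[] utf8, int byteOffset) {
        return new String(utf8, 0, byteOffset, StandardCharsets.UTF_8).length();
    }
}
```

For example, in the UTF-8 encoding of "a\u00e9 b" the letter "\u00e9" takes two bytes, so byte offset 3 corresponds to character offset 2.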
public abstract Shape getVisualBounds()
Return the visual bounds of this text.
Copyright © 2001-2017 Big Faceless Organization