public class SplitTagger extends Object implements IDocumentTagger, IXMLConfigurable
Can be used as either a pre-parse or a post-parse handler.
XML configuration usage:
<tagger class="com.norconex.importer.tagger.impl.SplitTagger">
    <split fromName="sourceFieldName" toName="targetFieldName" regex="[false|true]">
        <separator>(separator value)</separator>
    </split>
    <!-- multiple split tags allowed -->
</tagger>
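The regex attribute above suggests that the separator is treated either as a literal string or as a regular expression. The following is a minimal sketch of that distinction in plain Java, not the library's actual implementation: the class SeparatorSketch and its split method are hypothetical names, and quoting the separator when regex is false is an assumption about the intended behavior.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SeparatorSketch {

    // Hypothetical helper: splits one field value on the given separator.
    // Assumption: when regex is false the separator is matched literally,
    // so it is quoted before being handed to String.split.
    static List<String> split(String value, String separator, boolean regex) {
        String pattern = regex ? separator : Pattern.quote(separator);
        return Arrays.asList(value.split(pattern));
    }

    public static void main(String[] args) {
        // Literal separator: '|' is not interpreted as regex alternation.
        System.out.println(split("red|green|blue", "|", false));
        // Regex separator: commas or semicolons with optional whitespace.
        System.out.println(split("red, green;blue", "\\s*[,;]\\s*", true));
    }
}
```

Both calls in main print [red, green, blue]. Without the quoting step, a literal separator such as | or . would be interpreted as a regular expression and split the value in unexpected places.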
Modifier and Type | Class and Description |
---|---|
class | SplitTagger.Split |
Constructor and Description |
---|
SplitTagger() |
Modifier and Type | Method and Description |
---|---|
void | addSplit(String fromName, String separator, boolean regex) |
void | addSplit(String fromName, String toName, String separator, boolean regex) |
boolean | equals(Object obj) |
List<SplitTagger.Split> | getSplits() |
int | hashCode() |
void | loadFromXML(Reader in) |
void | removeSplit(String fromName) |
void | saveToXML(Writer out) |
void | tagDocument(String reference, InputStream document, Properties metadata, boolean parsed) Tags a document with extra metadata information. |
String | toString() |
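The two addSplit overloads differ only in the toName argument. The sketch below illustrates one plausible reading of that difference using plain Java collections rather than the library itself. The class SplitFieldSketch, the applySplit method, and the use of a Map in place of the Properties metadata object are all hypothetical, and the rule that a missing toName overwrites the source field is an assumption, not something this page confirms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class SplitFieldSketch {

    // Hypothetical stand-in for the tagging step: the metadata is modeled
    // as a map from field name to a list of values.
    static void applySplit(Map<String, List<String>> metadata,
            String fromName, String toName, String separator, boolean regex) {
        List<String> source = metadata.get(fromName);
        if (source == null) {
            return; // source field absent: nothing to split
        }
        String pattern = regex ? separator : Pattern.quote(separator);
        List<String> parts = new ArrayList<>();
        for (String value : source) {
            parts.addAll(Arrays.asList(value.split(pattern)));
        }
        // Assumption: without a target name, the source field is overwritten.
        String target = (toName == null || toName.isEmpty()) ? fromName : toName;
        metadata.put(target, parts);
    }

    public static void main(String[] args) {
        Map<String, List<String>> metadata = new HashMap<>();
        metadata.put("keywords", Arrays.asList("java,xml", "search"));
        applySplit(metadata, "keywords", "keyword", ",", false);
        System.out.println(metadata.get("keyword"));
    }
}
```

Running main prints [java, xml, search], and the original keywords field is left unchanged because a target name was supplied.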
public void tagDocument(String reference, InputStream document, Properties metadata, boolean parsed) throws IOException
Description copied from interface: IDocumentTagger
Tags a document with extra metadata information.
Specified by: tagDocument in interface IDocumentTagger
Parameters:
reference - document reference (e.g. URL)
document - document
metadata - document metadata
parsed - whether the document has been parsed already or not (a parsed document should normally be text-based)
Throws:
IOException - problem reading the document
public List<SplitTagger.Split> getSplits()
public void removeSplit(String fromName)
public void loadFromXML(Reader in) throws IOException
Specified by: loadFromXML in interface IXMLConfigurable
Throws: IOException
public void saveToXML(Writer out) throws IOException
Specified by: saveToXML in interface IXMLConfigurable
Throws: IOException
Copyright © 2009-2014 Norconex Inc. All Rights Reserved.