public abstract class AbstractMappedCommitter extends AbstractBatchCommitter implements IXMLConfigurable
A base class batching documents and offering mappings of source id and source content fields to target id and target content fields. Batched documents are queued on the file system.
Both the idSourceField and idTargetField must be set for ID mapping to take place. The default source id field is the metadata normally set by the Norconex Importer module called document.reference. The default (or constant) target id field is for subclasses to define. When an ID mapping is defined, the source id field will be deleted unless the keepIdSourceField attribute is set to true.
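The ID mapping rule above can be sketched with a plain map standing in for document metadata. This is an illustrative sketch only, not the class's actual implementation: the class name IdMappingSketch, the helper mapId, the target field name "id", and the choice to leave the field in place when source and target names are identical are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

public class IdMappingSketch {

    /**
     * Hypothetical helper: copies the value of idSourceField to idTargetField,
     * then removes the source field unless keepIdSourceField is true.
     * Nothing happens unless both field names are set.
     */
    static void mapId(Map<String, String> metadata, String idSourceField,
            String idTargetField, boolean keepIdSourceField) {
        if (idSourceField == null || idSourceField.isEmpty()
                || idTargetField == null || idTargetField.isEmpty()) {
            return; // both fields must be set for ID mapping to take place
        }
        String id = metadata.get(idSourceField);
        if (id == null) {
            return; // no source value to map
        }
        metadata.put(idTargetField, id);
        // Assumption: never delete the field when source and target are the same.
        if (!keepIdSourceField && !idSourceField.equals(idTargetField)) {
            metadata.remove(idSourceField);
        }
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put("document.reference", "http://example.com/page.html");
        mapId(metadata, "document.reference", "id", false);
        System.out.println(metadata);
    }
}
```

With keepIdSourceField set to true, the same call would leave document.reference in the map alongside the target field.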
Only the contentTargetField needs to be set for content mapping to take place. The default source content is the actual document content. Defining a contentSourceField will use the matching metadata property instead. The default (or constant) target content field is for subclasses to define. When a content mapping is defined, the source content field will be deleted (if provided) unless the keepContentSourceField attribute is set to true.
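The content mapping rule can be sketched the same way. Again this is a sketch under stated assumptions rather than the class's real code: ContentMappingSketch and mapContent are hypothetical names, the document content is represented as a plain String, and the field names "summary" and "body" are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class ContentMappingSketch {

    /**
     * Hypothetical helper: stores the content under contentTargetField.
     * The content comes from contentSourceField when one is defined,
     * otherwise from the document content itself.
     */
    static void mapContent(Map<String, String> metadata, String documentContent,
            String contentSourceField, String contentTargetField,
            boolean keepContentSourceField) {
        if (contentTargetField == null || contentTargetField.isEmpty()) {
            return; // only the target field is required for content mapping
        }
        String content;
        if (contentSourceField != null && !contentSourceField.isEmpty()) {
            // A metadata property acts as the document content.
            content = metadata.get(contentSourceField);
            if (!keepContentSourceField) {
                metadata.remove(contentSourceField);
            }
        } else {
            // Default: use the actual document content.
            content = documentContent;
        }
        if (content != null) {
            metadata.put(contentTargetField, content);
        }
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put("summary", "A short summary.");
        mapContent(metadata, "Full document text.", "summary", "body", false);
        System.out.println(metadata);
    }
}
```

Passing null as the contentSourceField would store "Full document text." under the target field and leave the other metadata untouched.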
Subclasses implementing IXMLConfigurable should allow this inner configuration:
<idSourceField keep="[false|true]">
    (Name of source field that will be mapped to the IDOL "DREREFERENCE" field or whatever "idTargetField" specified. Default is the document reference metadata field: "document.reference". Once re-mapped, the metadata source field is deleted, unless "keep" is set to true.)
</idSourceField>
<idTargetField>
    (Name of IDOL target field where to store a document unique identifier (idSourceField). If not specified, default is "DREREFERENCE".)
</idTargetField>
<contentSourceField keep="[false|true]">
    (If you wish to use a metadata field to act as the document "content", you can specify that field here. Default does not take a metadata field but rather the document content. Once re-mapped, the metadata source field is deleted, unless "keep" is set to true.)
</contentSourceField>
<contentTargetField>
    (IDOL target field name for a document content/body. Default is: DRECONTENT)
</contentTargetField>
<commitBatchSize>
    (max number of documents to send IDOL at once)
</commitBatchSize>
<queueDir>(optional path where to queue files)</queueDir>
<queueSize>(max queue size before committing)</queueSize>
<maxRetries>(max retries upon commit failures)</maxRetries>
<maxRetryWait>(max delay between retries)</maxRetryWait>
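To show how the elements and the "keep" attribute of such a configuration might be read, here is a minimal sketch using the JDK's DOM parser. It does not reflect how the class loads its configuration (the abstract loadFromXml method receives an Apache Commons XMLConfiguration instead). The wrapping committer element, the sample values, and the names MappedConfigSketch, parse, text and keep are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class MappedConfigSketch {

    // Hypothetical configuration fragment; element values are made up.
    static final String SAMPLE =
            "<committer>"
          + "<idSourceField keep=\"true\">document.reference</idSourceField>"
          + "<idTargetField>id</idTargetField>"
          + "<contentTargetField>body</contentTargetField>"
          + "<commitBatchSize>50</commitBatchSize>"
          + "</committer>";

    /** Parses the XML string and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid XML configuration", e);
        }
    }

    /** Returns the trimmed text of the first matching element, or a default. */
    static String text(Element root, String tag, String defaultValue) {
        NodeList nodes = root.getElementsByTagName(tag);
        if (nodes.getLength() == 0) {
            return defaultValue;
        }
        String value = nodes.item(0).getTextContent().trim();
        return value.isEmpty() ? defaultValue : value;
    }

    /** Returns the "keep" attribute of the first matching element (false if absent). */
    static boolean keep(Element root, String tag) {
        NodeList nodes = root.getElementsByTagName(tag);
        if (nodes.getLength() == 0) {
            return false;
        }
        return Boolean.parseBoolean(((Element) nodes.item(0)).getAttribute("keep"));
    }

    public static void main(String[] args) {
        Element root = parse(SAMPLE);
        System.out.println("idSourceField=" + text(root, "idSourceField", "document.reference"));
        System.out.println("keepIdSourceField=" + keep(root, "idSourceField"));
        System.out.println("idTargetField=" + text(root, "idTargetField", null));
        System.out.println("contentTargetField=" + text(root, "contentTargetField", null));
    }
}
```

A subclass would typically pass the values read this way to the corresponding setters, such as setIdSourceField and setKeepIdSourceField.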
DEFAULT_COMMIT_BATCH_SIZE
DEFAULT_QUEUE_DIR
DEFAULT_QUEUE_SIZE
DEFAULT_DOCUMENT_REFERENCE
Constructor and Description |
---|
AbstractMappedCommitter() - Creates a new instance. |
AbstractMappedCommitter(int commitBatchSize) - Creates a new instance with given commit batch size. |
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object obj) |
String | getContentSourceField() - Gets the source field name holding the document content. |
String | getContentTargetField() - Gets the target field where to store the document content. |
String | getIdSourceField() - Gets the source field name holding the unique identifier. |
String | getIdTargetField() - Gets the target field name to store the unique identifier. |
int | hashCode() |
boolean | isKeepContentSourceField() - Whether to keep the content source field or not, once mapped. |
boolean | isKeepIdSourceField() - Whether to keep the ID source field or not, once mapped. |
void | loadFromXML(Reader in) |
protected abstract void | loadFromXml(org.apache.commons.configuration.XMLConfiguration xml) - Allows subclasses to load their config from XML. |
protected void | prepareCommitAddition(IAddOperation operation) - Optionally performs actions on a document to be added before actually committing it. |
void | saveToXML(Writer out) |
protected abstract void | saveToXML(XMLStreamWriter writer) - Allows subclasses to write their config to XML. |
void | setContentSourceField(String contentSourceField) - Sets the source field name holding the document content. |
void | setContentTargetField(String contentTargetField) - Sets the target field where to store the document content. |
void | setIdSourceField(String idSourceField) - Sets the source field name holding the unique identifier. |
void | setIdTargetField(String idTargetField) - Sets the target field name to store the unique identifier. |
void | setKeepContentSourceField(boolean keepContentSourceField) - Sets whether to keep the content source field or not, once mapped. |
void | setKeepIdSourceField(boolean keepIdSourceField) - Sets whether to keep the ID source field or not, once mapped. |
String | toString() |
commitAddition, commitBatch, commitComplete, commitDeletion, getCommitBatchSize, getMaxRetries, getMaxRetryWait, setCommitBatchSize, setMaxRetries, setMaxRetryWait
commit, getQueueDir, prepareCommitDeletion, queueAddittion, queueRemoval, setQueueDir
getQueueSize, queueAdd, queueRemove, setQueueSize
public AbstractMappedCommitter()

public AbstractMappedCommitter(int commitBatchSize)
Parameters: commitBatchSize - commit batch size

public String getIdSourceField()

public void setIdSourceField(String idSourceField)
Parameters: idSourceField - source field name

public String getIdTargetField()

public void setIdTargetField(String idTargetField)
Parameters: idTargetField - target field name

public String getContentTargetField()

public void setContentTargetField(String contentTargetField)
Parameters: contentTargetField - target field name

public String getContentSourceField()

public void setContentSourceField(String contentSourceField)
Parameters: contentSourceField - source field name

public boolean isKeepIdSourceField()
Returns: true when keeping ID source field

public void setKeepIdSourceField(boolean keepIdSourceField)
Parameters: keepIdSourceField - true when keeping ID source field

public boolean isKeepContentSourceField()
Returns: true when keeping content source field

public void setKeepContentSourceField(boolean keepContentSourceField)
Parameters: keepContentSourceField - true when keeping content source field

protected void prepareCommitAddition(IAddOperation operation) throws IOException
Description copied from class: AbstractFileQueueCommitter
Optionally performs actions on a document to be added before actually committing it.
Overrides: prepareCommitAddition in class AbstractFileQueueCommitter
Parameters: operation - addition to be performed
Throws: IOException

public void saveToXML(Writer out) throws IOException
Specified by: saveToXML in interface IXMLConfigurable
Throws: IOException

protected abstract void saveToXML(XMLStreamWriter writer) throws XMLStreamException
Parameters: writer - the xml being written
Throws: XMLStreamException

public void loadFromXML(Reader in)
Specified by: loadFromXML in interface IXMLConfigurable

protected abstract void loadFromXml(org.apache.commons.configuration.XMLConfiguration xml)
Parameters: xml - the XML configuration

public int hashCode()
Overrides: hashCode in class AbstractBatchCommitter

public boolean equals(Object obj)
Overrides: equals in class AbstractBatchCommitter

public String toString()
Overrides: toString in class AbstractBatchCommitter

Copyright © 2009-2014 Norconex Inc. All Rights Reserved.