java.lang.Object
  cox.jmatt.java.MathTools.util.NoteworthyParser
public class NoteworthyParser
This class is designed to take a Noteworthy-formatted document and convert it to HTML. To use the parser first create an instance. Next create and configure a MathGenHTML for the parser to use. This is handy if some of the tags generated need default attributes set. Finally, feed the parse() method a newline-separated String of Noteworthy-formatted text. Unless the universe collapses the method will return an HTMLTag containing the HTML-formatted document. This can then be manipulated as needed.
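The steps above can be sketched roughly as follows. This is only an illustration: the sample input lines are made up, and the exact form of a paragraph line (marker followed directly by text) is an assumption. The parser calls appear as comments because they need the MathTools library on the classpath; they use only the constructor and methods listed on this page.

```java
class NoteworthyUsageSketch {

    // Build a newline-separated Noteworthy document.
    // The sample lines are illustrative, not taken from the library.
    static String sampleNotes() {
        return String.join("\n",
            "\"# Lecture 1 #",                 // heading, one hash = level 1
            "\"Opening paragraph text.",       // paragraph line (assumed form)
            "",                                // blank line closes the paragraph
            "\"1 First point",                 // ordered list item
            "\"2 Second point",
            "private reminder, not parsed");   // unquoted line, ignored per the docs
    }

    public static void main(String[] args) {
        String notes = sampleNotes();
        System.out.println(notes);

        // With the MathTools library available, the documented calls are:
        //   NoteworthyParser parser = new NoteworthyParser();
        //   parser.setGenerator(null);  // null: a generator is created when needed
        //   HTMLTag document = parser.parse(notes);
    }
}
```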
Noteworthy is a set of rules and a parser designed to convert plain text lecture notes into HTML documents. Specifically, it allows both unparsed notes and parsed markup to be mixed in a single document. The basic parse-marker is the double quote (N.B. This is the single-key double-quote character, not a double apostrophe!) followed by a special formatting character. The parser extracts these lines, converts them to HTML, and returns them in an HTMLTag instance. Unquoted lines are ignored.
Noteworthy input is, as stated previously, plain text. Lines that begin with special (and logical) formatting characters are formatted per their function. Other parsed lines are assumed to be part of a paragraph, division, or directly within the <body> of the document. Some nesting of tag types is possible but it is tricky at best.
Any tag, list, paragraph or division can be closed with a single blank line. One blank line closes the closest previous open 'thing.' So if a list follows a paragraph the first blank line closes the list. In the case of lists, though, any line starting with a non-list character terminates the list. If the list immediately follows an open paragraph (no blank lines between) it will be included in the paragraph. The best way to gain a feel for this is to experiment!
The structural philosophy of Noteworthy is fairly rigid. The ultimate top-level container is the <body> tag. All elements are contained within it and it doesn't close until the end of the document. The body can hold divisions, paragraphs, or lists. Divisions can hold paragraphs and lists. Heading elements of any level can be included in the body or a division.
Opening a new division ('<div>' tag) closes the previous division, if one was defined, which automatically closes all other open things within it. A division can also be explicitly closed by issuing the close all command: '!'.
The text structure is very simple. Special characters must be the first non-whitespace on the line, and all lines are trimmed before parsing. The data portion of the line, which is anything after a special character or the entire line for paragraphs, is also trimmed before it is used. This allows hard-formatting using spaces or tabs, which allows and encourages organized and readable input.
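The closing rules described so far (a blank line closes the most recently opened element, '!' closes everything back to the body, and the body itself stays open) behave like a stack. The sketch below is a hypothetical model of that behavior, not the parser's actual implementation; the class and method names are invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class OpenElementsSketch {
    // Most recently opened element sits on top; "body" is always at the bottom.
    private final Deque<String> open = new ArrayDeque<>();

    OpenElementsSketch() {
        open.push("body");
    }

    void open(String tag) {
        open.push(tag);
    }

    /** Blank line: close the most recently opened element, but never the body. */
    String closeLast() {
        return open.size() > 1 ? open.pop() : null;
    }

    /** '!' command: close every open element back to the body. */
    void closeAll() {
        while (open.size() > 1) {
            open.pop();
        }
    }

    String current() {
        return open.peek();
    }
}
```

With a division, a paragraph and a list open in that order, the first closeLast() closes the list and leaves the paragraph open, which matches the blank-line rule for a list that follows a paragraph.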
Headings are preceded by one or more hash marks ('#'). The number of these characters indicates the heading level. Any hash marks at the end of the line are ignored and stripped before the heading is created, so a matching run of closing hashes is optional, although symmetry is more visually appealing.
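A minimal sketch of the heading rule, assuming the line has already had its parse marker removed and been trimmed. The helper names are hypothetical, and the level is not clamped because the text above does not say what happens beyond six hashes.

```java
class HeadingSketch {

    /** Heading level: the number of leading '#' characters. */
    static int level(String line) {
        int n = 0;
        while (n < line.length() && line.charAt(n) == '#') {
            n++;
        }
        return n;
    }

    /** Heading text with the leading hashes, trailing hashes and outer whitespace removed. */
    static String text(String line) {
        String s = line.substring(level(line));
        int end = s.length();
        while (end > 0
                && (s.charAt(end - 1) == '#' || Character.isWhitespace(s.charAt(end - 1)))) {
            end--;
        }
        return s.substring(0, end).trim();
    }

    public static void main(String[] args) {
        String line = "## Limits and Continuity ##";
        System.out.println("h" + level(line) + ": " + text(line));
    }
}
```

One consequence of stripping trailing hashes this way is that a heading whose text legitimately ends in '#' would lose that character.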
Tables are started with a single pipe character, '|'. The first such line indicates that a table is to be created; subsequent lines become the rows of the table. Table cell elements are separated by a double pipe ('||') with no space between the two pipes. If the first line of the table contains cell elements they become the headers ('<th>'); otherwise the table has no headers. All subsequent rows describe standard cells ('<td>').
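The row format can be illustrated with a small splitter. This is a sketch under assumptions: the line is taken after its parse marker has been removed, cell text is inserted without HTML escaping, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class TableRowSketch {

    /** Split a table line into trimmed cells; a bare "|" yields no cells. */
    static List<String> cells(String line) {
        String data = line.trim();
        if (data.startsWith("|")) {
            data = data.substring(1);   // drop the single leading pipe
        }
        data = data.trim();
        List<String> out = new ArrayList<>();
        if (data.isEmpty()) {
            return out;                 // table opened without header cells
        }
        for (String cell : data.split(Pattern.quote("||"), -1)) {
            out.add(cell.trim());
        }
        return out;
    }

    /** Render one row; the first row of a table uses th, later rows use td. */
    static String rowHtml(List<String> cells, boolean header) {
        String tag = header ? "th" : "td";
        StringBuilder sb = new StringBuilder("<tr>");
        for (String c : cells) {
            sb.append('<').append(tag).append('>')
              .append(c)
              .append("</").append(tag).append('>');
        }
        return sb.append("</tr>").toString();
    }
}
```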
All major structural elements within the document are assigned 'id=' attributes automatically: <div>, <p>, <ol>, <ul> and <dl>. Other tags do not receive IDs automatically. The IDs are very straightforward: tag type plus a count appended to the end. For example the second paragraph in the document receives the ID 'par2'.
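The ID scheme amounts to one counter per tag type. The sketch below shows the idea; only the 'par' prefix is confirmed by the example above, so any other prefix (such as "div") is an assumption, and the class is hypothetical rather than part of the library.

```java
import java.util.HashMap;
import java.util.Map;

class AutoIdSketch {
    // One running count per prefix, starting at 1.
    private final Map<String, Integer> counts = new HashMap<>();

    /** Return the next ID for a prefix, for example "par1", then "par2". */
    String nextId(String prefix) {
        int n = counts.merge(prefix, 1, Integer::sum);
        return prefix + n;
    }

    public static void main(String[] args) {
        AutoIdSketch ids = new AutoIdSketch();
        System.out.println(ids.nextId("par"));
        System.out.println(ids.nextId("par"));
    }
}
```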
When the automatic behavior simply will not suffice, raw mode is available. Raw mode is toggled by three equal signs alone on a line, '==='. Everything from the line after raw mode is activated to the line before it ends is added to the document without formatting. The lines are trimmed and any blank lines are ignored, but any non-blank lines are included exactly as written.
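Raw mode can be pictured as a boolean flag flipped by each '===' line. The following sketch collects only the raw lines from a block of text; it assumes any parse marker has already been removed from each line, and the class is a hypothetical illustration rather than the parser's code.

```java
import java.util.ArrayList;
import java.util.List;

class RawModeSketch {

    /** Collect the trimmed, non-blank lines found between '===' toggles. */
    static List<String> rawLines(String text) {
        List<String> raw = new ArrayList<>();
        boolean inRaw = false;
        for (String line : text.split("\n", -1)) {
            String t = line.trim();
            if (t.equals("===")) {
                inRaw = !inRaw;     // the toggle line itself is never emitted
                continue;
            }
            if (inRaw && !t.isEmpty()) {
                raw.add(t);         // blank lines are dropped, the rest kept as written
            }
        }
        return raw;
    }
}
```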
The special formatting characters and their actions are:
Character | Action |
---|---|
/ or /#... | Open a division. This closes all open tags back to <body> and starts a fresh division. The second version ('/#') opens a division and includes a header of the given level. |
"0, "1, "2, "3, "4, "5, "6, "7, "8, "9 | Start an ordered list or add an item to an existing list or a <dd> to a definition list. |
"*, "., "+, "- | Start an unordered list or add an item to an existing list or a <dd> to a definition list. |
": | Start a definition list or add a <dt> to an existing one. |
"# | Add a heading tag to any eligible element. The number of hashes indicates the heading level. |
"| | Start a table or add a row to an existing one. Individual elements are double-pipe separated. The first row (if defined) becomes the table headers, subsequent rows are <td> elements. |
"(blank line) | A blank line closes the last tag opened. |
"= | A single equals sign is used to escape a line that begins with an otherwise-significant character. To begin a line with an actual equals sign, double it! To start a line with a double-equal use '= =='. |
"! | Forcibly close all tags back to the <body> tag. |
"{ | Execute a macro command (see below). |
'"===' | Toggle raw mode. In raw mode all content is added as-is to whatever tag is currently open. Null or blank lines are ignored and the lines added are trimmed. The triple-equals must be the only thing on the line. |
Any other character | Other characters either form <p> tags or add directly to existing <p> or <div> tags. |
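The table can be read as a dispatch on the first character of each trimmed line. The sketch below models that dispatch and the '=' escape rule. It assumes the parse marker has already been removed, groups all list-item characters under a single kind, and uses hypothetical names throughout; the real parser may be organized quite differently.

```java
class LineKindSketch {

    enum Kind {
        BLANK, RAW_TOGGLE, ESCAPED, DIVISION, LIST_ITEM, DEFINITION_TERM,
        HEADING, TABLE_ROW, CLOSE_ALL, MACRO, TEXT
    }

    /** Classify a line by its first non-whitespace character. */
    static Kind classify(String content) {
        String t = content.trim();
        if (t.isEmpty()) return Kind.BLANK;
        if (t.equals("===")) return Kind.RAW_TOGGLE;   // must be alone on the line
        char c = t.charAt(0);
        if (c >= '0' && c <= '9') return Kind.LIST_ITEM;
        switch (c) {
            case '=': return Kind.ESCAPED;
            case '/': return Kind.DIVISION;
            case '*': case '.': case '+': case '-': return Kind.LIST_ITEM;
            case ':': return Kind.DEFINITION_TERM;
            case '#': return Kind.HEADING;
            case '|': return Kind.TABLE_ROW;
            case '!': return Kind.CLOSE_ALL;
            case '{': return Kind.MACRO;
            default:  return Kind.TEXT;
        }
    }

    /** For an ESCAPED line only: drop the leading '=' and trim what remains. */
    static String unescape(String content) {
        return content.trim().substring(1).trim();
    }
}
```

Under this reading, '==x' yields the literal text '=x' and '= ==' yields '==', which agrees with the escape row of the table.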
Ordered, unordered and definition lists opened immediately beneath a paragraph (no blank lines between) are included in the paragraph. Lines beginning immediately after this are also included in the same paragraph. Whitespace before and after the formatting characters is trimmed, as is any at the end of each line. Where necessary, space is replaced.
Macro commands are directives that do not specifically affect formatting. They may affect processing or inline configuration.
All macro commands consist of a left curly brace followed (no space!) by one or more other letters or symbols. Some commands have arguments; these are separated from the command itself by at least one space. The closing brace is not required. Available macro commands are:
Command | Result |
---|---|
"{t Page Title | Set the title of the HTML document; HTMLTag setTitle(). |
"{i fileName or "{g fileName | Copy the contents of an external file into the document (rawly!). The first version uses a standard FileReader and the second uses getResourceAsStream(). Any exceptions are reported at Error level. |
Macro commands are still considered standard directives; they will not be processed in raw mode!
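The macro syntax described above (brace, command, optional space-separated argument, optional closing brace) can be split apart as in the sketch below. It assumes the parse marker has already been removed; the class name and the file name in the usage line are hypothetical.

```java
class MacroSketch {

    /** Returns {command, argument}; the argument is "" when none is given. */
    static String[] parse(String content) {
        String t = content.trim();
        if (!t.startsWith("{")) {
            throw new IllegalArgumentException("not a macro line: " + content);
        }
        t = t.substring(1);
        if (t.endsWith("}")) {
            t = t.substring(0, t.length() - 1);   // the closing brace is optional
        }
        // The command runs up to the first whitespace; the rest is the argument.
        String[] parts = t.split("\\s+", 2);
        String command = parts[0];
        String argument = parts.length > 1 ? parts[1].trim() : "";
        return new String[] { command, argument };
    }

    public static void main(String[] args) {
        String[] m = parse("{i notes.txt}");
        System.out.println(m[0] + " -> " + m[1]);
    }
}
```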
Constructor Summary | |
---|---|
NoteworthyParser() | Standard constructor used for scripting and other instances. |
Method Summary | |
---|---|
HTMLTag | parse(java.lang.String pNoteworthy): This is the reason this class exists. |
void | reset(): Reset the parser to its initial state. |
void | setGenerator(MathGenHTML pGen): Set the MathGenHTML instance used to generate tag classes. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public NoteworthyParser()
Method Detail |
---|
public void setGenerator(MathGenHTML pGen)
Set the MathGenHTML instance used to generate tag classes. Null clears it and one will be generated automatically when needed.
public HTMLTag parse(java.lang.String pNoteworthy)
This is the reason this class exists.
Parameters: pNoteworthy - The Noteworthy-formatted data String.
Returns: HTMLTag containing the proper markup.
public void reset()
Reset the parser to its initial state.