This class parses HTML files and is designed to be very simple. It might be useful, for example, for building a web crawler or for extracting readable text from a web page.
#include <GHtml.h>
Public Member Functions |
| GHtml (const char *pDoc, size_t nSize) |
virtual | ~GHtml () |
bool | parseSomeMore () |
| You should call this method in a loop until it returns false. It parses a little bit more of the document each time you call it. It returns false if there was nothing more to parse. The various virtual methods are called whenever it finds something interesting.
|
virtual void | onTextChunk (const char *pChunk, size_t chunkSize) |
| This method will be called whenever the parser finds a section of display text.
|
virtual void | onTag (const char *pTagName, size_t tagNameLen) |
| This method is called whenever a new tag is found.
|
virtual void | onTagParam (const char *pTagName, size_t tagNameLen, const char *pParamName, size_t paramNameLen, const char *pValue, size_t valueLen) |
| This method is called for each parameter in the tag.
|
virtual void | onComment (const char *pComment, size_t len) |
| This method is called when an HTML comment (<!-- ... -->) is found.
|
Protected Member Functions |
void | parseTag () |
Protected Attributes |
const char * | m_pDoc |
size_t | m_nSize |
size_t | m_nPos |
Constructor & Destructor Documentation
GClasses::GHtml::GHtml (const char *pDoc, size_t nSize)
virtual GClasses::GHtml::~GHtml () [virtual]
Member Function Documentation
virtual void GClasses::GHtml::onComment (const char *pComment, size_t len) [inline, virtual]
This method is called when an HTML comment (<!-- ... -->) is found.
virtual void GClasses::GHtml::onTag (const char *pTagName, size_t tagNameLen) [inline, virtual]
This method is called whenever a new tag is found.
virtual void GClasses::GHtml::onTagParam (const char *pTagName, size_t tagNameLen, const char *pParamName, size_t paramNameLen, const char *pValue, size_t valueLen) [inline, virtual]
This method is called for each parameter in the tag.
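As an illustration of the web-crawler use case, the sketch below shows one way a handler with this signature might collect link targets. It assumes the name and value buffers are not null-terminated (hence the explicit lengths) and that names may arrive in any letter case; neither point is confirmed here. LinkCollector and nameEquals are hypothetical names, and the struct is free-standing so the block compiles without the library. In real use the method would be a virtual override in a class derived from GHtml.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: case-insensitive comparison of a length-delimited
// name against a null-terminated literal.
static bool nameEquals(const char* pName, size_t len, const char* pLiteral)
{
    assert(pLiteral != nullptr);
    size_t i = 0;
    for(; i < len && pLiteral[i] != '\0'; i++)
    {
        if(std::tolower((unsigned char)pName[i]) != std::tolower((unsigned char)pLiteral[i]))
            return false;
    }
    // Equal only if both strings ended at the same position
    return i == len && pLiteral[i] == '\0';
}

// Hypothetical collector. The method mirrors the documented onTagParam
// signature, but this struct does not derive from GHtml.
struct LinkCollector
{
    std::vector<std::string> links;

    void onTagParam(const char* pTagName, size_t tagNameLen,
                    const char* pParamName, size_t paramNameLen,
                    const char* pValue, size_t valueLen)
    {
        // Keep only the href parameter of anchor tags
        if(nameEquals(pTagName, tagNameLen, "a") && nameEquals(pParamName, paramNameLen, "href"))
            links.push_back(std::string(pValue, valueLen)); // copy using the explicit length
    }
};
```

Whether the real parser passes the value with or without its surrounding quotes is not stated in this reference, so a production handler may need to strip them.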
virtual void GClasses::GHtml::onTextChunk (const char *pChunk, size_t chunkSize) [inline, virtual]
This method will be called whenever the parser finds a section of display text.
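When the goal is readable text, the chunks usually need some whitespace cleanup before they are joined. The helper below is a minimal sketch of one approach, assuming chunks are not null-terminated and may contain arbitrary runs of whitespace. appendNormalized is a hypothetical name, not part of the library; an onTextChunk override could simply forward its two arguments to it.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: appends a length-delimited chunk to 'out', collapsing
// each run of whitespace into a single space and never starting the output
// with a space.
void appendNormalized(std::string& out, const char* pChunk, size_t chunkSize)
{
    assert(pChunk != nullptr || chunkSize == 0);
    for(size_t i = 0; i < chunkSize; i++)
    {
        unsigned char c = (unsigned char)pChunk[i];
        if(std::isspace(c))
        {
            // Emit at most one space between words, including across chunks
            if(!out.empty() && out[out.size() - 1] != ' ')
                out += ' ';
        }
        else
            out += (char)c;
    }
}
```

Because the helper looks at the last character already in the output, whitespace that straddles two chunks is still collapsed to a single space.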
bool GClasses::GHtml::parseSomeMore ()
You should call this method in a loop until it returns false. It parses a little bit more of the document each time you call it. It returns false if there was nothing more to parse. The various virtual methods are called whenever it finds something interesting.
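The calling pattern described here can be sketched as follows. To keep the block compilable on its own, it uses HtmlStandIn, a hypothetical and heavily simplified stand-in that mirrors only part of the documented interface (it reports tag names and text, but not parameters or comments). It is not the library's implementation. With the real library, the subclass would presumably derive from GClasses::GHtml after including GHtml.h, and the loop would look the same. TextCollector and extractText are likewise illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the documented interface (NOT the real parser).
class HtmlStandIn
{
public:
    HtmlStandIn(const char* pDoc, size_t nSize) : m_pDoc(pDoc), m_nSize(nSize), m_nPos(0)
    {
        assert(pDoc != nullptr || nSize == 0);
    }
    virtual ~HtmlStandIn() {}

    // Handles one tag or one run of text per call; false means end of input.
    bool parseSomeMore()
    {
        if(m_nPos >= m_nSize)
            return false;
        size_t start = m_nPos;
        if(m_pDoc[m_nPos] == '<')
        {
            while(m_nPos < m_nSize && m_pDoc[m_nPos] != '>')
                m_nPos++;
            // The tag name runs from just after '<' to the first space or '>'
            size_t nameEnd = start + 1;
            while(nameEnd < m_nPos && m_pDoc[nameEnd] != ' ')
                nameEnd++;
            onTag(m_pDoc + start + 1, nameEnd - (start + 1));
            if(m_nPos < m_nSize)
                m_nPos++; // consume '>'
        }
        else
        {
            while(m_nPos < m_nSize && m_pDoc[m_nPos] != '<')
                m_nPos++;
            onTextChunk(m_pDoc + start, m_nPos - start);
        }
        return true;
    }

    virtual void onTextChunk(const char*, size_t) {}
    virtual void onTag(const char*, size_t) {}

protected:
    const char* m_pDoc;
    size_t m_nSize;
    size_t m_nPos;
};

// Overrides the callbacks to gather display text and count tags.
class TextCollector : public HtmlStandIn
{
public:
    std::string text;
    size_t tagCount;

    TextCollector(const char* pDoc, size_t nSize) : HtmlStandIn(pDoc, nSize), tagCount(0) {}
    void onTextChunk(const char* pChunk, size_t chunkSize) override { text.append(pChunk, chunkSize); }
    void onTag(const char*, size_t) override { tagCount++; }
};

std::string extractText(const std::string& html, size_t* pTagCount = nullptr)
{
    TextCollector parser(html.data(), html.size());
    while(parser.parseSomeMore())
    {
        // All the work happens in the callbacks
    }
    if(pTagCount)
        *pTagCount = parser.tagCount;
    return parser.text;
}
```

The constructor takes a pointer and a size, which suggests the parser reads the buffer in place rather than copying it. If that holds, the document buffer needs to stay alive for as long as the parser is being driven.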
void GClasses::GHtml::parseTag () [protected]
Member Data Documentation