GClasses::GHtml Class Reference

This class is for parsing HTML files. It's designed to be very simple. This class might be useful, for example, for building a web-crawler or for extracting readable text from a web page.

#include <GHtml.h>

Public Member Functions

 GHtml (const char *pDoc, size_t nSize)
virtual ~GHtml ()
bool parseSomeMore ()
 Call this method in a loop until it returns false. Each call parses a little more of the document and invokes the virtual callback methods (onTextChunk, onTag, onTagParam, onComment) as it encounters the corresponding content. It returns false when there is nothing more to parse.
virtual void onTextChunk (const char *pChunk, size_t chunkSize)
 This method will be called whenever the parser finds a section of display text.
virtual void onTag (const char *pTagName, size_t tagNameLen)
 This method is called whenever a new tag is found.
virtual void onTagParam (const char *pTagName, size_t tagNameLen, const char *pParamName, size_t paramNameLen, const char *pValue, size_t valueLen)
 This method is called for each parameter in the tag.
virtual void onComment (const char *pComment, size_t len)
 This method is called when an HTML comment (<!-- ... -->) is found.

Protected Member Functions

void parseTag ()

Protected Attributes

const char * m_pDoc
size_t m_nSize
size_t m_nPos

Detailed Description

This class is for parsing HTML files. It's designed to be very simple. This class might be useful, for example, for building a web-crawler or for extracting readable text from a web page.


Constructor & Destructor Documentation

GClasses::GHtml::GHtml ( const char *  pDoc,
size_t  nSize 
)
virtual GClasses::GHtml::~GHtml ( ) [virtual]

Member Function Documentation

virtual void GClasses::GHtml::onComment ( const char *  pComment,
size_t  len 
) [inline, virtual]

This method is called when an HTML comment (<!-- ... -->) is found.

virtual void GClasses::GHtml::onTag ( const char *  pTagName,
size_t  tagNameLen 
) [inline, virtual]

This method is called whenever a new tag is found.

virtual void GClasses::GHtml::onTagParam ( const char *  pTagName,
size_t  tagNameLen,
const char *  pParamName,
size_t  paramNameLen,
const char *  pValue,
size_t  valueLen 
) [inline, virtual]

This method is called for each parameter in the tag.
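
For a web crawler, onTagParam is a natural place to collect links. The following is only a sketch: LinkCollector and nameEquals are hypothetical names, and the handler is written as a free-standing struct so that it compiles without the library. In real use it would presumably derive from GHtml and override the virtual method of the same signature. It assumes that the name and value buffers are length-delimited rather than null-terminated (as the size parameters suggest), and it compares names case-insensitively because this page does not say whether the parser normalizes case. Whether pValue includes surrounding quotes is also not documented here.

```cpp
#include <cctype>
#include <cstring>
#include <stddef.h>
#include <string>
#include <vector>

// Case-insensitive comparison of a length-delimited name against a C string.
// (Hypothetical helper, not part of GHtml.)
static bool nameEquals(const char* p, size_t len, const char* lit) {
    if (std::strlen(lit) != len)
        return false;
    for (size_t i = 0; i < len; ++i) {
        if (std::tolower(static_cast<unsigned char>(p[i])) !=
            std::tolower(static_cast<unsigned char>(lit[i])))
            return false;
    }
    return true;
}

// Hypothetical handler. With the library, this would derive from
// GClasses::GHtml, and onTagParam would override the virtual method.
struct LinkCollector {
    std::vector<std::string> m_links;

    void onTagParam(const char* pTagName, size_t tagNameLen,
                    const char* pParamName, size_t paramNameLen,
                    const char* pValue, size_t valueLen) {
        // Keep only the href parameter of <a> tags.
        if (nameEquals(pTagName, tagNameLen, "a") &&
            nameEquals(pParamName, paramNameLen, "href")) {
            // Copy exactly valueLen bytes; the buffer is assumed not to be
            // null-terminated.
            m_links.push_back(std::string(pValue, valueLen));
        }
    }
};
```

Copying the value into a std::string means the collected links remain valid even if the document buffer is released later.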

virtual void GClasses::GHtml::onTextChunk ( const char *  pChunk,
size_t  chunkSize 
) [inline, virtual]

This method will be called whenever the parser finds a section of display text.

bool GClasses::GHtml::parseSomeMore ( )

Call this method in a loop until it returns false. Each call parses a little more of the document and invokes the virtual callback methods (onTextChunk, onTag, onTagParam, onComment) as it encounters the corresponding content. It returns false when there is nothing more to parse.
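
The calling pattern can be sketched as follows, here for extracting readable text. So that the example compiles on its own, it uses HtmlStandIn, a hypothetical stand-in that mirrors the interface documented on this page. It is not GHtml's implementation: it only splits the buffer into text chunks and tag names, and ignores parameters and comments. TextExtractor and extractText are likewise illustrative names. With the library, the subclass would presumably derive from GClasses::GHtml instead.

```cpp
#include <stddef.h>
#include <string>

// Hypothetical stand-in with the same shape as the documented interface.
// NOT the library's parser: it handles only text chunks and tag names.
class HtmlStandIn {
public:
    HtmlStandIn(const char* pDoc, size_t nSize)
        : m_pDoc(pDoc), m_nSize(nSize), m_nPos(0) {}
    virtual ~HtmlStandIn() {}

    // Parses one text chunk or one tag per call.
    // Returns false when there is nothing more to parse.
    bool parseSomeMore() {
        if (m_nPos >= m_nSize)
            return false;
        if (m_pDoc[m_nPos] == '<') {
            size_t start = m_nPos + 1;
            size_t end = start;
            while (end < m_nSize && m_pDoc[end] != '>')
                ++end;
            size_t nameEnd = start;
            while (nameEnd < end && m_pDoc[nameEnd] != ' ')
                ++nameEnd;
            onTag(m_pDoc + start, nameEnd - start);
            m_nPos = (end < m_nSize) ? end + 1 : m_nSize;
        } else {
            size_t start = m_nPos;
            while (m_nPos < m_nSize && m_pDoc[m_nPos] != '<')
                ++m_nPos;
            onTextChunk(m_pDoc + start, m_nPos - start);
        }
        return true;
    }

    virtual void onTextChunk(const char*, size_t) {}
    virtual void onTag(const char*, size_t) {}

protected:
    const char* m_pDoc;
    size_t m_nSize;
    size_t m_nPos;
};

// Accumulates display text by overriding onTextChunk.
class TextExtractor : public HtmlStandIn {
public:
    TextExtractor(const char* pDoc, size_t nSize)
        : HtmlStandIn(pDoc, nSize) {}

    virtual void onTextChunk(const char* pChunk, size_t chunkSize) {
        m_text.append(pChunk, chunkSize); // chunk is length-delimited
    }

    std::string m_text;
};

// Drives the parser to completion and returns the collected text.
std::string extractText(const std::string& html) {
    TextExtractor parser(html.data(), html.size());
    while (parser.parseSomeMore()) {
        // All work happens in the callbacks.
    }
    return parser.m_text;
}
```

Note that the document buffer passed to the constructor must outlive the parser in this sketch, since only the pointer and size are stored.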

void GClasses::GHtml::parseTag ( ) [protected]

Member Data Documentation

size_t GClasses::GHtml::m_nPos [protected]
size_t GClasses::GHtml::m_nSize [protected]
const char* GClasses::GHtml::m_pDoc [protected]