How does SGMLReader work?



roger nie
06-21-2010, 10:59 AM
I have been working with SgmlReader lately, and it has always worked for me.
But I don't know how it works. I would like to understand the process, or the theory, that SgmlReader follows when converting an HTML document to an XMLDocument.

I have the source code, but I don't understand it.
Thanks for answering. If you have any documentation about SgmlReader, could you e-mail it to me?

My email is niejing0804@hotmail.com.

Thanks a lot

SteveB
08-15-2010, 04:45 PM
SgmlReader uses a slightly relaxed HTML DTD to determine which elements are legal inside which. If an element, such as <td>, is not legal within the current element, the current element is closed and the parser checks the new element again against the element that is now current. This repeats until it reaches a valid container element or the <body> element, which it is not allowed to close. This structural check is combined with a forgiving parser that can recover from many kinds of badly formed HTML (though sadly not all). Together, those two pieces form the core of SgmlReader.
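
To make the close-and-retry loop a bit more concrete, here is a rough Python sketch of the idea. It is not SgmlReader's code (SgmlReader is a .NET library), and everything in it is an assumption made for illustration: the ALLOWED table is a tiny hand-written stand-in for the relaxed HTML DTD, and the names open_element and repair are hypothetical. It also only handles start tags, and it assumes that an element which is not legal anywhere is simply opened inside <body>, which may not match what SgmlReader really does.

```python
# Toy content model: parent element -> children it may contain.
# Hand-written for illustration only; NOT SgmlReader's actual DTD.
ALLOWED = {
    "body": {"table", "p", "b", "ul"},
    "table": {"tr"},
    "tr": {"td"},
    "td": {"p", "b", "table", "ul"},
    "p": {"b"},
    "b": set(),
    "ul": {"li"},
    "li": {"p", "b", "ul"},
}


def open_element(stack, tag, events):
    """Close open elements until `tag` is legal in the current one, then open it.

    <body> is never closed here; if nothing else accepts the tag, it is
    opened inside <body> anyway (an assumption of this sketch).
    """
    while stack[-1] != "body" and tag not in ALLOWED.get(stack[-1], set()):
        events.append("</%s>" % stack.pop())
    stack.append(tag)
    events.append("<%s>" % tag)


def repair(start_tags):
    """Turn a sequence of start tags (no end tags) into well-formed markup."""
    stack = ["body"]
    events = ["<body>"]
    for tag in start_tags:
        open_element(stack, tag, events)
    # End of input: close whatever is still open, innermost first.
    while stack:
        events.append("</%s>" % stack.pop())
    return "".join(events)


if __name__ == "__main__":
    # The second <td> is not legal inside <b> or <td>, so both get closed
    # before it is opened inside the <tr>.
    print(repair(["table", "tr", "td", "b", "td"]))
```

Running it prints <body><table><tr><td><b></b></td><td></td></tr></table></body>, which shows the second <td> forcing the open <b> and the first <td> to close before it is accepted by the <tr>.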