Thread: How does SGMLReader work?

  1. #1

    How does SGMLReader work?

    I have been using SgmlReader lately, and it has always worked for me.
    But I don't know how it works. I would like to understand the process, or the theory, SgmlReader follows when converting an HTML document into an XMLDocument.

    I've got the source code, but I don't understand it.
    Thanks for answering. If you have any documentation about SgmlReader, could you e-mail it to me?

    My email is niejing0804@hotmail.com.

    Thanks a lot

  2. #2

    SgmlReader uses a slightly relaxed HTML DTD to determine the meaning of elements. If an element, such as <td>, is not legal within the current element, the current element is closed and the parser tries to validate the new element again, this time against the parent. This process repeats until the parser reaches a valid container element or the <body> element, which it is not allowed to close. This structural check is combined with a forgiving parser that can recover from many kinds of badly formed HTML (but sadly, not all). Together, these form the core of SgmlReader.
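
    The close-and-retry loop described above can be sketched roughly as follows. This is not SgmlReader's code (SgmlReader is a .NET library). It is a minimal Python illustration, and the content model `ALLOWED_CHILDREN` and the helpers `open_element` and `repair` are hypothetical names invented for the example. The sketch also assumes that an element which is not legal anywhere is simply opened under <body>; the post above does not say what SgmlReader does in that case.

```python
# Toy content model (an assumption for illustration, NOT SgmlReader's real DTD):
# each container maps to the set of elements it may directly contain.
ALLOWED_CHILDREN = {
    "body": {"p", "table", "b"},
    "p": {"b"},
    "b": set(),
    "table": {"tr"},
    "tr": {"td"},
    "td": {"p", "b", "table"},
}


def open_element(stack, name, events):
    """Open `name`, first closing every open element that cannot contain it."""
    # Close the current element and re-check against its parent.
    # The loop stops at a valid container or at <body>, which is never closed here.
    while stack[-1] != "body" and name not in ALLOWED_CHILDREN.get(stack[-1], set()):
        events.append("</%s>" % stack.pop())
    stack.append(name)
    events.append("<%s>" % name)


def repair(open_tags):
    """Turn a sequence of start tags (no end tags at all) into balanced markup."""
    stack = ["body"]
    events = ["<body>"]
    for name in open_tags:
        open_element(stack, name, events)
    # End of input: close whatever is still open, innermost first.
    while stack:
        events.append("</%s>" % stack.pop())
    return "".join(events)


if __name__ == "__main__":
    # The second <td> is not legal inside <b> or inside <td>,
    # so both are closed before it is opened inside <tr>.
    print(repair(["table", "tr", "td", "b", "td"]))
```

    In this sketch the second <td> forces the open <b> and the first <td> to be closed, because neither may contain a <td> in the toy content model, and the new cell is then opened inside the <tr>. The forgiving tokenizer mentioned in the post, which recovers from malformed tags and attributes, is a separate concern and is not modelled here.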
    Steve G. Bjorg - Chief Architect
