I just noticed another problem with ParseDocType, in particular regarding the internal subset.
Take the following doctype declaration:
<!DOCTYPE root PUBLIC "-//TEST//SGML-DTD for testing//EN" [
<!ENTITY lt SDATA "[lt ]">
]>
sgmlReader.InternalSubset ends up as "\r\n<!ENTITY lt SDATA \"[lt ", and the parser then dies trying to parse the remaining "]\">\r\n]>". In other words, the ']' inside the quoted entity literal is taken as the end of the internal subset.
I haven't gotten around to writing a fix for this yet, but I thought I'd let you know about the problem.
I'm not sure whether it makes more sense to simply loop inside ParseDocType when the parsed subset contains a '[', or to add logic to Entity.ScanToEnd so that it accounts for terminators that can also appear inside the entity being parsed (though only in symmetric pairs, like quotes or brackets that open and close at some point).
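To make the second option concrete, here is a rough sketch of a quote-aware scan. It is written in Python purely for illustration and is not SgmlReader code: the function name scan_internal_subset and its signature are hypothetical, and it only tracks quoted literals (it does not handle comments or nested brackets outside quotes).

```python
from typing import Tuple


def scan_internal_subset(text: str, start: int) -> Tuple[str, int]:
    """Scan from `start` (the first character after the opening '[').

    Returns the internal subset and the index of the ']' that closes it.
    A ']' inside a quoted literal is not treated as a terminator.
    Hypothetical helper for illustration only, not part of SgmlReader.
    """
    quote = None  # the quote character we are currently inside, if any
    i = start
    while i < len(text):
        ch = text[i]
        if quote is not None:
            # Inside a literal: only the matching quote ends it.
            if ch == quote:
                quote = None
        elif ch in ('"', "'"):
            quote = ch
        elif ch == "]":
            return text[start:i], i
        i += 1
    raise ValueError("unterminated internal subset")


doctype = (
    '<!DOCTYPE root PUBLIC "-//TEST//SGML-DTD for testing//EN" [\r\n'
    '<!ENTITY lt SDATA "[lt ]">\r\n'
    "]>"
)
# The public identifier contains no '[', so the first '[' opens the subset.
start = doctype.index("[") + 1
subset, end = scan_internal_subset(doctype, start)
```

With the declaration above, subset holds the complete entity declaration and doctype[end:] is just the closing "]>". Looping inside ParseDocType would instead have to stitch the pieces back together after the fact, which is why tracking the quote state during the scan seems simpler to me.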
Note that I haven't tried the latest HEAD from Git since the project moved from SVN, but from a quick look the problem might still be there.