Incorrect JavaScript string handling



koruyucu
06-29-2011, 09:02 AM
Good morning/day/evening.

First of all, thank you for SgmlReader. I've been using it quite a lot and it's a really helpful library.

I've encountered an issue: SgmlReader does not handle JavaScript strings correctly. Here is a sample of 'valid' HTML that works in all browsers, but on which SgmlReader throws an "InvalidOperationException" with the error message "There was no XML start tag open.":

<html>
<body>
<div>
<script type="text/javascript">document.write('<div><script></scr' + 'ipt>\n </div><!--\n<script></scr' + 'ipt> -->');</script>
</div>
</body>
</html>

It happens because SgmlReader tries to parse the HTML inside the document.write parameter, and does so incorrectly because of the newline escapes (\n) and because the closing tags are split across concatenated strings.

Can you think of any workaround for this issue?

Thank you

koruyucu
06-29-2011, 09:17 AM
It looks like replacing the HTML comments inside the JavaScript string ('<!-- ... -->') with something else lets SgmlReader parse it correctly.
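
A rough sketch of that workaround as a pre-processing step, in case it helps anyone. The helper name maskCommentsInScripts is made up for illustration and is not part of SgmlReader. It assumes that the comment delimiters only occur inside JavaScript string literals and that no literal "</script>" appears inside a script body. I have not checked it against every SgmlReader version, and in a real project the same replacement would be done in .NET before the markup reaches the reader.

```javascript
// Hypothetical helper (not part of SgmlReader): hide HTML comment delimiters
// that appear inside <script> bodies before the markup is handed to a parser.
// Assumption: the delimiters only occur inside JavaScript string literals,
// where \x3c and \x3e are escapes for "<" and ">".
function maskCommentsInScripts(html) {
  // Match each <script ...> ... </script> block (lazy, case-insensitive).
  return html.replace(
    /(<script\b[^>]*>)([\s\S]*?)(<\/script>)/gi,
    function (match, open, body, close) {
      var masked = body
        .replace(/<!--/g, '\\x3c!--') // "<!--" becomes the text \x3c!--
        .replace(/-->/g, '--\\x3e');  // "-->" becomes the text --\x3e
      return open + masked + close;
    }
  );
}
```

Because \x3c and \x3e are ordinary string escapes, the script still writes the same text when a browser runs it. Comments outside script elements are left alone.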