Date: 2012-11-20 04:59 pm (UTC)
From: [personal profile] jack
Ah! I think maybe we do have an ambiguity in "parse".

I think I interpreted "parse html" to mean "parse an html document as html", i.e. inherently involving parsing the salient features of html, notably nested tags. But you equally validly interpreted "parse html" more generally, as "parse an html document in a structured way to get data out of it in a useful form".

In the first sense, "parse html" is inherently problematic: even if it works in some cases, "works" implies "generate some sort of data representation of the structure of the document", which is exactly what you can't do. But in the second sense, "parse html" includes things like "scraping useful information out of it", which is 100% valid in a reading-a-clock way.
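
As a rough illustration of the two senses, here is a minimal Python sketch. It assumes the limited tool under discussion is something like a regular expression (the comment itself does not name it), and the sample document and the names SAMPLE, scrape_time, TreeBuilder and parse_tree are all made up for the example. The scraper never looks at the nesting; the tree builder needs a stack to follow it.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample document, invented for this illustration.
SAMPLE = (
    "<html><head><title>Clock</title></head>"
    "<body><div><p>It is <b>4:59</b> pm</p></div></body></html>"
)


def scrape_time(html):
    """Second sense: pull one useful value out, ignoring the tag structure."""
    match = re.search(r"\b(\d{1,2}:\d{2})\b", html)
    return match.group(1) if match else None


class TreeBuilder(HTMLParser):
    """First sense: build a nested representation of the tags."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "#root", "children": []}
        self._stack = [self.root]  # the currently open elements

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "children": []}
        self._stack[-1]["children"].append(node)
        self._stack.append(node)

    def handle_endtag(self, tag):
        # Naive: assumes every tag is closed in order. Void tags such as
        # <br> and sloppy real-world markup would need extra handling.
        if len(self._stack) > 1:
            self._stack.pop()


def parse_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


print(scrape_time(SAMPLE))
print(parse_tree(SAMPLE)["children"][0]["tag"])
```

The scraper would keep working if the surrounding tags changed completely, which is the reading-a-clock property; the tree builder only gives a sensible answer when the nesting is tracked correctly.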

Does that sound right?