Possibly one of the suppositions ptc24 is alluding to is the implicit difference in standards of success between natural languages and computer languages. With any computer language, we aim for perfection in parsing: if just one program, no matter how fiddly, satisfies the syntax specification but is not correctly understood by the parser, then that's a bug and it wants fixing. Hence, the fact that regexps will inevitably fall down in some complicated corner case or other of HTML, no matter how many common cases they get right, is sufficient to justify the statement that regexps cannot parse HTML.
But if you applied the same standard to natural language, you'd have to say that nobody and nothing can parse it!
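To make the regexp point concrete, here is a rough sketch of the kind of corner case I mean. The pattern `HREF_RE`, the class `HrefCollector`, the two helper functions and both sample strings are made up for illustration; the only real library pieces are Python's standard `re` and `html.parser` modules. The regexp gets the everyday case right, then reports a link that is commented out and misses one whose attributes are merely written in a different order with different quotes.

```python
import re
from html.parser import HTMLParser

# Hypothetical pattern: fine for the common double-quoted, href-first case.
HREF_RE = re.compile(r'<a\s+href="([^"]*)"')


class HrefCollector(HTMLParser):
    """Collects href values from <a> start tags using a real HTML tokenizer."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with the quotes already removed.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.hrefs.append(value)


def hrefs_by_regexp(html):
    return HREF_RE.findall(html)


def hrefs_by_parser(html):
    collector = HrefCollector()
    collector.feed(html)
    collector.close()
    return collector.hrefs


# Illustrative inputs (assumptions, not taken from any real page).
common = '<p><a href="/home">Home</a></p>'
tricky = '<!-- <a href="/old">Old</a> --><a class="nav" href=\'/new\'>New</a>'

print(hrefs_by_regexp(common), hrefs_by_parser(common))
print(hrefs_by_regexp(tricky), hrefs_by_parser(tricky))
```

On the common input the two approaches agree. On the tricky one the regexp picks up the commented-out link and misses the live one, and by the computer-language standard that single miss is enough to count as a bug.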
Date: 2012-11-20 02:31 pm (UTC)