XML II?

There is a lot of debate at the moment about whether XML parsers should try to recover from non-well-formedness. One side argues that no two parsers are going to recover from non-well-formed XML the same way, so who knows what you would end up with. The other side says you should at least attempt to do something.

Now, my thoughts are that if you made it so that non-well-formed XML had no obvious meaning, people would be less inclined to try to correct it. For instance, XML has </tag> tags at the end of a block, but if the closing tag was just </> then it’s ambiguous which tag it’s closing unless the document is well formed. As anyone who’s spent time trying to find out which } is missing in a C/C++/Java/etc program will tell you, it’s nearly impossible to do if you’re a human, let alone write a program to do anything sane with it.
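A rough sketch of that point, assuming the anonymous closer is spelled </> and ignoring attributes, comments and everything else a real parser handles. The first_problem function is a made-up helper, not part of any XML library. It walks the tags with a stack and reports the first thing that looks wrong.

```python
import re

# Matches an opening tag like <a>, or a closer like </a> or the anonymous </>.
TOKEN = re.compile(r"<(\w+)>|</(\w*)>")

def first_problem(doc):
    """Return a description of the first nesting problem, or None if balanced."""
    stack = []
    for m in TOKEN.finditer(doc):
        opened, closed = m.group(1), m.group(2)
        if opened is not None:
            stack.append(opened)
        elif not stack:
            return "closer at offset %d has nothing to close" % m.start()
        else:
            top = stack.pop()
            # A named closer can be checked against the tag it should close;
            # an anonymous one (empty name) matches whatever is on top.
            if closed and closed != top:
                return "</%s> at offset %d does not match <%s>" % (closed, m.start(), top)
    if stack:
        return "%d element(s) left open at end of input" % len(stack)
    return None

print(first_problem("<a><b>x</a>"))  # named closer: the mismatch is pinpointed
print(first_problem("<a><b>x</>"))   # anonymous closer: only a count is known
```

With named closers the checker can point at the offending tag, which is exactly the hint that tempts people into guessing a repair. With anonymous closers all it learns is that something, somewhere, was never closed.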

The next thing is to make sure that attributes have closing quotes. Well, no one really knows what an XML attribute is anyway, so get rid of ’em. So we have instead:


And well, that’s kinda messy, so let’s replace <a>b</> with (a b) and uh, wee! sexps!
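For the curious, here is a minimal sketch of that rewrite in Python, using the standard library's xml.etree.ElementTree. The to_sexp function is a hypothetical helper, and turning each attribute into a nested (name value) list is an assumed convention, just one way of getting rid of attributes.

```python
import xml.etree.ElementTree as ET

def to_sexp(elem):
    """Render an element as (tag (attr value)... text children...)."""
    parts = [elem.tag]
    # Assumed convention: each attribute becomes a nested (name value) list.
    for key, value in elem.attrib.items():
        parts.append("(%s %s)" % (key, value))
    if elem.text and elem.text.strip():
        parts.append(elem.text.strip())
    for child in elem:
        parts.append(to_sexp(child))
        # Text that follows a child element belongs to the parent.
        if child.tail and child.tail.strip():
            parts.append(child.tail.strip())
    return "(" + " ".join(parts) + ")"

print(to_sexp(ET.fromstring('<a href="foo">bar</a>')))
```

This does no quoting at all, so text containing spaces or parentheses would need escaping before the output could be read back in.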

4 Responses to “XML II?”

  1. Sam Jansen Says:

    Categories don’t seem to work. If I click on “The Web”, for example, it doesn’t show any articles. Buggy wordpress? I dunno, works on mine.

  2. Sam Jansen Says:

    Also, the title is fucked. The ‘ character doesn’t show up properly in the title. This is the same in my wordpress. Maybe we need to find this bug and submit a patch.

  3. Jonathan Purvis Says:

    The reason why we have <a href="foo">bar</a> is that no one remembers the order of the foo and bar.

    See also http://www.ccs.neu.edu/home/dorai/t-y-scheme/t-y-scheme-Z-H-11.html for what happens when sexps get out of hand.

  4. Perry Lorier Says:

    Right, that’s why you have extra tags for attributes, i.e. (a (href foo) bar)