Archive for October, 2004

The (un?)reliability of the Internet

Tuesday, October 26th, 2004

I’ve been tinkering with some of the tools I’ve been working on. I have my Traceroute Mesh tool, which I wrote ages ago. With some help from AJ from Maxnet, I got a BGP feed and started doing some more interesting things. I wrote a simple Whois Interface with syntax colour highlighting and live information from BGP. It’s kinda useful for seeing which prefixes an AS announces and which AS announces a given prefix, as well as for looking up abuse information. I had fun writing this, especially implementing the leaky bucket rate limiting code.
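For anyone who hasn’t met a leaky bucket before, here is a minimal sketch of the idea in Python. It is not the code from the Whois Interface; the class name, the parameters and the choice to pass the clock in by hand are all assumptions made for illustration. Each request adds one unit to the bucket, the bucket drains at a fixed rate, and a request is refused if it would overflow.

```python
import time


class LeakyBucket:
    """Leaky bucket used as a meter (hypothetical sketch, not the original code)."""

    def __init__(self, capacity, leak_rate, now=None):
        self.capacity = float(capacity)    # most requests the bucket can hold
        self.leak_rate = float(leak_rate)  # requests drained per second
        self.level = 0.0                   # how full the bucket currently is
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Return True if a request arriving at time `now` should be served."""
        now = time.monotonic() if now is None else now
        # Drain the bucket for the time that has passed since the last call.
        elapsed = max(0.0, now - self.last)
        self.level = max(0.0, self.level - elapsed * self.leak_rate)
        self.last = now
        # Accept the request only if it still fits in the bucket.
        if self.level + 1.0 <= self.capacity:
            self.level += 1.0
            return True
        return False
```

In a whois server you would presumably keep one bucket per client address, so a single noisy client gets refused without slowing anyone else down.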

I also wrote a tool (astr) that takes a destination prefix/AS and shows all the paths I’ve ever seen to that prefix/AS. This can be used when the Internet is “broken” to figure out how the Internet actually works when it’s working, or to look at historic data. Yay. However, it gave me an idea: looking at almost any prefix, it shows multiple paths. So how unstable is the Internet?

Well, the answer seems to be “very” unstable. For each prefix I measured the longest time it went between route changes, and plotted this. This shows, for instance, that 50% of prefixes change more frequently than once every 2 days.
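A rough sketch of that measurement, for the curious. The input format here is an assumption: a list of (timestamp in seconds, prefix, AS path) observations, which is not necessarily how the real data is stored. It also assumes that the time from the first sighting to the first change counts, and that the last route is treated as stable up to the final timestamp in the data.

```python
def longest_stable_intervals(observations):
    """Map each prefix to the longest time (seconds) its path stayed unchanged.

    `observations` is a hypothetical list of (timestamp, prefix, as_path) tuples.
    """
    obs = sorted(observations, key=lambda o: o[0])
    if not obs:
        return {}
    end = obs[-1][0]   # end of the measurement window
    current_path = {}  # prefix -> path currently in use
    since = {}         # prefix -> time the current path was first seen
    longest = {}       # prefix -> longest unchanged period so far
    for ts, prefix, path in obs:
        if prefix not in current_path:
            current_path[prefix] = path
            since[prefix] = ts
            longest[prefix] = 0
        elif path != current_path[prefix]:
            # The route changed: close off the previous stable period.
            longest[prefix] = max(longest[prefix], ts - since[prefix])
            current_path[prefix] = path
            since[prefix] = ts
    # The final path of each prefix is stable until the end of the data.
    for prefix in longest:
        longest[prefix] = max(longest[prefix], end - since[prefix])
    return longest


def fraction_less_stable_than(longest, threshold):
    """Fraction of prefixes whose longest stable period is under `threshold`."""
    if not longest:
        return 0.0
    return sum(1 for v in longest.values() if v < threshold) / len(longest)
```

Calling `fraction_less_stable_than(longest, 2 * 86400)` gives the share of prefixes that never went two full days without a route change, which is the kind of figure quoted above.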

The answers to the journal problem?

Monday, October 4th, 2004

Scientific journals take the paper you’ve worked hard on, organise (unpaid) people in the field to peer review it, publish it, and then charge everyone to read it. Computer scientists in particular are in a good position to avoid this completely. We have the ultimate publishing infrastructure, and it’s called the Internet. It’s nearly free to publish new papers, and it’s nearly free to “subscribe” to papers. However, it’s missing the all-important peer review step.

Well, you could allow anyone to review a paper at any time; we kind of have this now, with people sometimes publishing “response” papers. However, the journals choose reviewers who have a clue. The Internet certainly has its share of people with a clue, but there are also a lot of people out there who don’t know what they’re talking about. So obviously we need some kind of “rating” of how much weight someone’s view carries on a given topic. Note that people who are good in one area can’t be assumed to be good in another. I may be able to write two lines of code without looking like a total fool, but don’t expect me to be able to figure out how to do neurosurgery.

Who do we trust? Well, we can trust people who have published papers in peer-reviewed journals. We can also push the trust metric outwards, like Google’s PageRank, based on citations. Good papers get cited regularly, and good people cite good papers. Somehow we need to find a way of determining whether a paper is within someone’s field or not. Maybe by looking at the papers they cite? Is this too easy to abuse? Perhaps something like the NZDL’s “phrasier” could be used to search for key phrases and group papers based on those?
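To make the PageRank-over-citations idea a bit more concrete, here is a minimal sketch using plain power iteration. The input format (a dict mapping each paper to the list of papers it cites), the function name and the damping value of 0.85 are assumptions for illustration; a real system would also have to deal with self-citation, citation rings and the “which field?” question.

```python
def citation_rank(citations, damping=0.85, iterations=50):
    """PageRank-style score per paper from a {paper: [cited papers]} dict."""
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    n = len(papers)
    if n == 0:
        return {}
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Papers that cite nothing spread their score evenly over everyone.
        dangling = sum(rank[p] for p in papers if not citations.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in papers}
        # Every other paper splits its score between the papers it cites.
        for p, cited in citations.items():
            if cited:
                share = damping * rank[p] / len(cited)
                for c in cited:
                    new[c] += share
        rank = new
    return rank
```

A reviewer’s “rating” could then be derived from the scores of the papers they have written, although that step is not shown here.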

A good search interface over this is of course necessary; being able to quickly and easily find papers that relate to a topic is very important. You could even earn money from it by letting universities and other educational institutions sign up for “gold” services, including things like email/RSS feeds for:

  • A paper that cites a given paper is “published”
  • A review for a given paper
  • A paper is written in your “field” (for reasonably narrow definitions of “field”)
  • Any paper written by a certain author
  • Any review written by a certain author

Maybe publish a dead-tree copy of the papers in a specific “field” at regular intervals. Google ads could possibly also work, although I’m not sure how applicable they would be. (Papers by definition don’t usually talk about things people have products for yet…)

Now who should do this? The people that run Citeseer would be an obvious group; they have the database more or less already there. They just don’t seem to be well enough funded to make good use of it (adding a new paper to their database seems ad hoc at best). All they’d need to add is “comments” on a paper, some login system, and the “rating” system. Google would be another excellent option, probably starting by purchasing Citeseer. The NZDL people are big on search too; they could join the fray. Maybe someone else?

The world is ready for a revolution in this space; it’s begging for someone to do it. Where are they?