[Warning: Long long rant]
I'm scratching a rather irritating itch. I'd like to travel around in Bangalore, but I don't have the money to move around in rickshaws, nor can I read Kannada to figure out where a particular bus goes (no dual-language boards here!). After having been pampered by BEST, I'm finding the BMTC to be a royal pain.
I've bought maps, guide books, route maps, time tables, but they've all been "designed" without any consideration for the most common use case. Just know the stop names? You're out of luck. No source lists stop names, only stage names (which is what the BEST also does, BTW).
Well, not exactly no source, for there is one source that has stop-level granularity. The superbly detailed large-format BMTC route map (50 bucks!) is the only such source, but nobody intended it to be used that way. In fact, it's a mystery to me how people are supposed to use this thing. Only a laborious search through the entire map will reveal the locations of the stops, and once that is done, correlating that information to route numbers is next to impossible. In fact, the map wastes precious space on "segment numbers" (which I guess actually constitute routes), but that information in its present format is only intelligible to somebody on the inside; nothing else on the map refers to it.
Well, this is Bangalore, the IT capital of India, and so the BMTC provides a web-based interface to search. Except that it sucks big time. There are two versions (and I'm guessing here): a "professional" JSP edition, and a "final-year project" PHP version. Forget the technical mistakes (using POST for what are plain, idempotent search queries); the UI design is horrid. It's what most UI designers would call the perfect example of the programmer-designed interface: letting the UI reflect your code, and not the other way round. Both the "search engines" present drop-downs, presenting stage-to-stage information. Wow. I imagine that everybody travels from stage to stage only.
So what do I do? I do what I do best. I'm now writing a search engine for finding information about buses. It's called busfinder, and will initially provide "similar" functionality to the existing "search engines". Except, of course, it won't be tied down to the BMTC. I intend it to be used by any bus service. It will be small, fast, and featureful. It'll be written with the traveller in mind, and not the backend database structure. I've got a little experience doing this sorta thing, and I'm hoping this will also clear up some of my programming blues.
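To make the stop-versus-stage complaint concrete, here is a minimal sketch of the kind of stop-level lookup I have in mind. It is not busfinder's actual code: the route numbers, stop names, and function names (`build_stop_index`, `direct_routes`) are all made up for illustration, and it only handles direct, no-change journeys.

```python
from collections import defaultdict

# Hypothetical sample data: route number -> ordered list of stop names.
# These routes and stops are invented for illustration, not real BMTC data.
ROUTES = {
    "R1": ["Alpha Circle", "Beta Market", "Gamma Temple", "Delta Depot"],
    "R2": ["Beta Market", "Epsilon Park", "Delta Depot"],
    "R3": ["Delta Depot", "Gamma Temple", "Alpha Circle"],
}


def build_stop_index(routes):
    """Map each lowercased stop name to {route: first position of the stop on it}."""
    index = defaultdict(dict)
    for route, stops in routes.items():
        for position, stop in enumerate(stops):
            index[stop.lower()].setdefault(route, position)
    return index


def direct_routes(index, origin, destination):
    """Return the sorted routes that visit origin before destination."""
    from_routes = index.get(origin.lower(), {})
    to_routes = index.get(destination.lower(), {})
    return sorted(
        route
        for route, position in from_routes.items()
        if route in to_routes and position < to_routes[route]
    )


STOP_INDEX = build_stop_index(ROUTES)
print(direct_routes(STOP_INDEX, "Beta Market", "Delta Depot"))  # ['R1', 'R2']
```

The point is that the index is keyed on what the traveller knows (a stop name), not on how the operator happens to store its stages.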
Which brings me to the real topic of this post. Which is data. Everyone who's spent some time in this industry knows that programs are a dime a dozen. It's your data that's actually valuable. That's why vendors prefer to use proprietary data formats: once locked in, they know you'll be at their mercy forever. And that's why there exists even a data conversion sector.
But for my search engine, I need data. The BMTC could do me a wonderful service by giving me access to their data in a machine-readable form (maybe they will, if I ask, but I haven't and won't). But for now, I had to scrape their HTML pages. And as I have found, their data sucks. There are missing routes, incorrect stop names, and inconsistent information that all stink to high heaven of five-buck-an-hour typists and even cheaper database architects (normalization, anyone?).
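For the curious, scraping of this sort can be sketched with nothing but Python's standard library. The HTML below is a made-up stand-in: I'm assuming a table with one row per route and a comma-separated stop list in the second cell, which is not the real layout of the BMTC pages, and `TableCellParser` and `scrape_routes` are names I invented for the example.

```python
from html.parser import HTMLParser


class TableCellParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Collapse runs of whitespace left behind by sloppy typing.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip header rows, which contain only <th> cells
                self.rows.append(self._row)
            self._row = None


def scrape_routes(html):
    """Return {route number: [stop names]} from the assumed table layout."""
    parser = TableCellParser()
    parser.feed(html)
    parser.close()
    routes = {}
    for row in parser.rows:
        if len(row) < 2:
            continue  # malformed row: no stop list
        number, stops = row[0], row[1]
        routes[number] = [s.strip() for s in stops.split(",") if s.strip()]
    return routes


# Hypothetical page fragment, with the kind of stray whitespace real pages have.
SAMPLE_HTML = """
<table>
  <tr><th>Route</th><th>Stops</th></tr>
  <tr><td>R1</td><td>Alpha Circle,  Beta   Market , Gamma Temple</td></tr>
  <tr><td>R2</td><td>Beta Market, Delta Depot</td></tr>
</table>
"""

print(scrape_routes(SAMPLE_HTML))
```

Whitespace is the easy part. Missing routes and misspelt stop names can't be fixed by a parser, which is exactly why clean source data matters.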
And then I had this idea, inspired by their application, to make their route maps available via Google Maps (which now has street-level data for Bangalore, but not yet searchable, aargh!), but there seems to be no public-domain (or even freely-licensed) geo-referenced information available for Bangalore. The info exists (see http://traffic.mapunity.org, which is functionally similar to busfinder but has the same UI problems, and http://www.janaagraha.org/jmap/), but no data is available for download. It's all locked up behind the respective applications. No APIs either, so no mash-ups possible.
Which is why I make this appeal. Free your data! Make your data available in machine-readable format with liberal licenses for use by the general public. If you're a government institution or somebody who doesn't necessarily have computer expertise, tap the internet! Release your data and see how the brilliant minds of the 'Net breathe life into your data. Remember, there is always a better program. But you may have the best data.