
At last month’s TDWG2009 conference I was on a panel for a brief discussion at the end of a session. There were around 200 people in the audience and a handful of us up front as lambs for the slaughter.
One of the questions from the floor concerned the automation of the taxonomic process. I don’t recall the precise question but it triggered one of my (probably boring) canned responses.
I pointed out that the usual practice in software engineering, when asked to automate a system, is to produce a Domain Model based on an analysis of some Use Cases that then leads on to some Object Model or implementation model that is actually created in software. The assumption behind this is that whatever was being done was good but needs to be done faster – with computers!
In biodiversity informatics, and particularly in biological taxonomy, this is not such a good idea. Current working practice was developed in the light of the prevailing technology of the time. If computers and the internet had been available from the start, things would probably have been done differently. The worst thing we can do now is automate a paper-based system. We should take the opportunity to re-engineer our working practices, and to ask the dangerous questions (that are usually only asked by those students who drop out and become multi-millionaires) like “Why do we bother doing this bit?”
Imagine my delight when I got this email from David King at the Open University agreeing with me!
You struck a chord with me in your wrap up session. Several decades ago my first job was with British Steel (remember them?). Anyway, I was in the Computer Department looking after a production service mainframe with 8Mb of memory and leading edge technology like that. When we recruited applications programmers they were told in no uncertain terms that to simply take an existing paper based workflow and replace it with one that just mapped one piece of a paper to one screen was tantamount to a sackable offence. If we were going to the expense, bother and risk of changing an existing workflow then we should take the opportunity to review it in the light of what is now possible with a computer… [my emphasis added]
Clearly great minds think alike (as fools seldom differ).
It is a good job nobody in the biodiversity informatics community is working to reproduce paper publications like this Ranunculaceae page on eFloras! I am not picking on eFloras here. There are many similar projects and I am tacitly involved in some of them. Perhaps we are all committing sackable offences. My point is this. If (and it is a big IF) it is necessary to present data like an “old fashioned” printed flora or fauna, then it should only be seen as a byproduct of the taxonomic process. The specimen, character and observational data should be primary, as they can be re-purposed. Data in document form (even if it is hyperlinked) is effectively dead and requires an enormous effort to re-animate – just like Frankenstein’s monster. Perhaps we should stop producing documents like this completely.
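To make the "document as byproduct" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `CharacterRecord` type, the function names and the placeholder taxa and character values are invented for illustration and are not drawn from eFloras or any real dataset. It only shows the shape of the argument: the atomized records are primary, a flora-style paragraph is one view generated from them, and the same records can answer a question the printed paragraph never anticipated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharacterRecord:
    """One atomized observation: a character state recorded for a taxon."""
    taxon: str
    character: str
    value: str


# Placeholder data, invented for illustration only.
RECORDS = [
    CharacterRecord("Taxon A", "petals", "5"),
    CharacterRecord("Taxon A", "leaves", "lobed"),
    CharacterRecord("Taxon B", "petals", "5"),
    CharacterRecord("Taxon B", "leaves", "entire"),
]


def render_description(records, taxon):
    """Byproduct view: a printed-flora style paragraph for one taxon."""
    parts = [f"{r.character} {r.value}" for r in records if r.taxon == taxon]
    return f"{taxon}. " + "; ".join(parts) + "."


def taxa_with(records, character, value):
    """Re-purposed view: which taxa share a given character state?"""
    return sorted(
        {r.taxon for r in records if r.character == character and r.value == value}
    )


if __name__ == "__main__":
    print(render_description(RECORDS, "Taxon A"))
    print(taxa_with(RECORDS, "petals", "5"))
```

Going the other way, from the rendered paragraph back to the records, would mean parsing prose, which is the re-animation effort described above.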
Once again, amen Roger. When do you serve the Kool-Aid?
Roger, tho I know you’re not picking on eFloras, a project that falls under my management, I would like to describe what’s happening behind that page. It’s not a static document; it’s atomized into a rich data model that breaks these data into the pieces you’d like. This particular web display was done so that the atomized data can be used & referenced in ways which the **VAST** majority of practicing taxonomists are comfortable. They want these web pages (just one way of displaying/consuming data) to look like printed works because that’s how they’re used to working with these data. We built these UIs 5 years ago based on user needs. However, we’ve built services on top of these data so that they can be repurposed, with EOL being the biggest (only?) consumer. We can always do better & I’m sure someone out there (Rod Page) will point out how (justifiably) incomplete our attempts here are. I’m in full agreement on your vision and just wanted to describe that we’re trying to make these data more reusable in forms like you & others want.