My post on TDWG 2011, day two, part two, starts at 2pm with the symposium on "Global biodiversity informatics initiatives update". This symposium was hugely useful to me as a biodiversity noob, giving me a broad overview of the major #biodinfo initiatives, which problems they are tackling, and where they are headed. I don't need to repeat that these are all based on my very wonky notes; all errors here are mine, and -- when you spot them -- please let me know in the comments section. The talks in this symposium were:
Eamonn O'Tuama talking about the Global Biodiversity Information Facility (GBIF) and its recent improvements, including:
Switching to using DarwinCore Archives as an input format: spreadsheets are easier for people to work with than XML! It now takes a month for new data to go from upload to being published; GBIF's goal is to get that down to a week.
GBIF has also revamped their taxonomic validation, adding other authority checklists to their current taxonomic authority, the Catalogue of Life. In fact, they're now assembling a database of taxonomic checklists! I wonder if they'll show the changes in checklists over time.
A brand new GBIF Online Resource Center has been set up to make it easier for people to find out about and upload data to GBIF.
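For fellow noobs wondering what a DarwinCore Archive actually is: as I understand it, it's a zip file holding plain delimited text files plus a `meta.xml` descriptor that says which column maps to which Darwin Core term. Here is a minimal Python sketch of reading one. To be clear, this is my own illustration and not GBIF's code; the file name `occurrence.csv`, the two sample records, and the helper names `build_sample_archive` and `read_core_rows` are all made up for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by the meta.xml descriptor in the Darwin Core text guide.
DWC_TEXT_NS = "{http://rs.tdwg.org/dwc/text/}"

# Illustrative descriptor: one core file with an id column and two terms.
META_XML = """<?xml version="1.0" encoding="UTF-8"?>
<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Occurrence"
        fieldsTerminatedBy="," ignoreHeaderLines="1" encoding="UTF-8">
    <files><location>occurrence.csv</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
    <field index="2" term="http://rs.tdwg.org/dwc/terms/country"/>
  </core>
</archive>
"""

# Made-up sample records, only here so the sketch is self-contained.
OCCURRENCES = (
    "id,scientificName,country\n"
    "1,Puma concolor,Costa Rica\n"
    "2,Panthera leo,Kenya\n"
)


def build_sample_archive() -> bytes:
    """Zip the descriptor and the data file together, in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("meta.xml", META_XML)
        zf.writestr("occurrence.csv", OCCURRENCES)
    return buf.getvalue()


def read_core_rows(archive_bytes: bytes) -> list:
    """Return the core file's rows as dicts keyed by short term name."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        core = ET.fromstring(zf.read("meta.xml")).find(DWC_TEXT_NS + "core")
        location = core.find(f"{DWC_TEXT_NS}files/{DWC_TEXT_NS}location").text
        delimiter = core.get("fieldsTerminatedBy", ",")
        skip = int(core.get("ignoreHeaderLines", "0"))
        # Map column index -> name, using the last segment of each term URI.
        columns = {int(core.find(DWC_TEXT_NS + "id").get("index")): "id"}
        for field in core.findall(DWC_TEXT_NS + "field"):
            columns[int(field.get("index"))] = field.get("term").rsplit("/", 1)[-1]
        text = zf.read(location).decode(core.get("encoding", "UTF-8"))
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))[skip:]
    return [{name: row[i] for i, name in columns.items()} for row in rows]


if __name__ == "__main__":
    for record in read_core_rows(build_sample_archive()):
        print(record)
```

The point is just that the data itself is something you could open in a spreadsheet, with the XML confined to one small descriptor file. Real archives can be messier than this (escaped tab delimiters, extension files, and so on), and I haven't tried to handle any of that here.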
William Ulate talked about the Biodiversity Heritage Library, with a presentation explaining how the BHL network looks today, with nodes spread out across the planet. Books, journals and periodicals are uploaded through BHL nodes, and are then distributed to the others (and to the Internet Archive). Prime among these were BHL Europe and BHL-Australia (a part of the Atlas of Living Australia); he particularly noted BHL-Au's fantastic user interface, which will form the basis of UI improvements to the US website. He also mentioned BHL's Egyptian collaborator, Bibliotheca Alexandrina (a project older than BHL itself!).
William also pointed out that anybody could suggest books or journals for BHL to scan, and they would try their best to do it as fast as possible. Also, BHL's source code -- including the code that powers their website, APIs, and CiteBank -- is publicly available.
BHL is also developing CiteBank, a database of biodiversity citations. Put that together with Rod Page's iTaxon idea, and you can see some really amazing -- and amazingly useful! -- interlinks between taxonomic names and literature happening.
Cyndy Parr presented on updates from the Encyclopedia of Life (EOL) project. She pointed out that EOL is all about aggregating and curating data; it's never going to be as smooth as Wikipedia, where all the data is in one place. However, it can have better import processes: all new names on ZooKeys/PhytoKeys are immediately added to EOL; it can also get feedback and corrections back to the original publishers much more efficiently than Wikipedia can.
Cyndy also showed off EOL collections of taxa, across any kingdom of life and collected by any criteria, from birds at the Smithsonian's National Zoo to the colour blue. Collections can have EOL communities form around them.
EOL also plans on moving to DarwinCore Archives flat-files, as an easier way of moving data in and out of their system. They have released the largest possible phylogenetically-organized tree, representing the state of our phylogenetic knowledge.
Donald Hobern spoke about developments at the Atlas of Living Australia (ALAu), focusing -- much to my excitement -- on taxonomic issues.
He began by talking about descriptive taxonomy. He mentioned IdentifyLife, a database of identification keys, including -- get this -- a plan to build a single, gigantic Key to All Life. The idea is to support citizen science "in the broadest sense". ALAu is working on a project to port the DEscriptive Language for TAxonomy (DELTA) software suite into Java. Finally, he mentioned the Taxonomy Research and Information Network (TRIN), which hosts a pretty well-organized wiki.
Donald pointed out that ALAu's taxonomy arm had two primary communities to serve: the one interested primarily in geographical data, and the one interested primarily in taxonomic/collection data. I think he demonstrated this division by comparing the Australian National Herbarium datasets with Australia's Virtual Herbarium, but to be honest, my notes are somewhat in disarray at this point. This is because Donald is an excellent presenter, and so I wasn't focusing on my notes so much. Sorry about that!
The symposium ended with Dave Roberts talking about ViBRANT. He spoke mainly about the exciting new changes coming in Scratchpads 2.0, which -- based on Drupal 7 -- will have a prettier user interface, snappier access times, and much better scalability. There are ambitious plans to plug Scratchpads into biodiversity publishers (such as PenSoft) as well as biodiversity producers and consumers (such as the Encyclopedia of Life and LifeDesks). Even more interestingly, they're setting up a job-processing backend for Scratchpads on the Oxford Batch Operations Engine (OBOE).
A sandbox Scratchpad is available, if you want to play around with the technology in its current state. Their source code is available, including Scratchpads 2.0. And, being on Git, all that code is easy to play with.