My post on TDWG 2011, day two, part two, starts at 2pm with the symposium on "Global biodiversity informatics initiatives update". This symposium was hugely useful to me as a biodiversity noob, giving me a broad overview of the major #biodinfo initiatives, which problems they are tackling, and where they are headed. As always, these summaries are based on my very wonky notes; all errors here are mine, and when you spot them, please let me know in the comments section. The talks in this symposium were:
Eamonn O'Tuama talking about the Global Biodiversity Information Facility (GBIF) and its recent improvements, including:
Switching to using DarwinCore Archives as an input format: spreadsheets are easier for people to work with than XML! It now takes a month for new data to go from upload to being published; GBIF's goal is to get that down to a week.
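To make the format concrete: a Darwin Core Archive is essentially just a zip file bundling a small meta.xml descriptor with one or more delimited text files whose columns map to Darwin Core terms, which is why it's so much friendlier than hand-built XML. Here's a minimal sketch in Python; the file names, the single wombat record, and the exact descriptor fields are invented for illustration, not taken from GBIF's actual pipeline:

```python
import csv
import io
import zipfile

# Hypothetical minimal descriptor: maps the columns of occurrence.csv
# to Darwin Core terms. Real archives are richer than this.
META_XML = """<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core encoding="UTF-8" fieldsTerminatedBy="," linesTerminatedBy="\\n"
        ignoreHeaderLines="1"
        rowType="http://rs.tdwg.org/dwc/terms/Occurrence">
    <files><location>occurrence.csv</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
    <field index="2" term="http://rs.tdwg.org/dwc/terms/country"/>
  </core>
</archive>
"""

# The core data file is a plain spreadsheet-style CSV.
OCCURRENCES = "id,scientificName,country\n1,Vombatus ursinus,Australia\n"

# Build the archive in memory: a zip of meta.xml plus the data file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("meta.xml", META_XML)
    zf.writestr("occurrence.csv", OCCURRENCES)

# Reading it back needs nothing fancier than the csv module.
with zipfile.ZipFile(buf) as zf:
    with zf.open("occurrence.csv") as f:
        rows = list(csv.DictReader(io.TextIOWrapper(f, "utf-8")))

print(rows[0]["scientificName"])
```

The point of the design is that the data itself stays in a format anyone can open in a spreadsheet program, with the XML confined to a thin descriptor layer.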
GBIF has also revamped their taxonomic validation, adding other authority checklists alongside their current taxonomic authority, the Catalogue of Life. In fact, they're now assembling a database of taxonomic checklists! I wonder if they'll show how checklists change over time.
A brand new GBIF Online Resource Center has been set up to make it easier for people to find out about and upload data to GBIF.
William Ulate talked about the Biodiversity Heritage Library, with a presentation explaining how the BHL network looks today, with nodes spread out across the planet. Books, journals and periodicals are uploaded through BHL nodes, and are then distributed to the others (and to the Internet Archive). Prime among these were BHL Europe and BHL-Australia (a part of the Atlas of Living Australia); he particularly noted BHL-Au's fantastic user interface, which will form the basis of UI improvements to the US website. He also mentioned BHL's Egyptian collaborator, Bibliotheca Alexandrina (a project older than BHL itself!).
William also pointed out that anybody could suggest books or journals for BHL to scan, and they would try their best to do it as fast as possible. Also, BHL's source code -- including the code that powers their website, APIs, and Citebank -- is publicly available.
BHL is also developing CiteBank, a database of biodiversity citations. Put that together with Rod Page's iTaxon idea, and you can see some really amazing -- and amazingly useful! -- interlinks between taxonomic names and literature happening.
Cyndy Parr presented on updates from the Encyclopedia of Life (EOL) project. She pointed out that EOL is all about aggregating and curating data; it's never going to be as smooth as Wikipedia, where all the data is in one place. However, it can have better import processes: all new names on ZooKeys/PhytoKeys are immediately added to EOL; it can also get feedback and corrections back to the original publishers much more efficiently than Wikipedia can.
Cyndy also showed off EOL collections of taxa, across any kingdom of life and collected by any criteria, from birds at the Smithsonian's National Zoo to the colour blue. Collections can have EOL communities form around them.
EOL also plans on moving to DarwinCore Archive flat files, as an easier way of moving data in and out of their system. They have also released the largest possible phylogenetically-organized tree, representing the current state of our phylogenetic knowledge.
Donald Hobern spoke about developments at the Atlas of Living Australia (ALAu) focusing -- much to my excitement -- on taxonomic issues.
He began by talking about descriptive taxonomy. He mentioned IdentifyLife, a database of identification keys, including -- get this -- a plan to build a single, gigantic Key to All Life. The idea is to support citizen science "in the broadest sense". ALAu is working on a project to port the DELTA (DEscription Language for TAxonomy) software suite into Java. Finally, he mentioned the Taxonomy Research and Information Network (TRIN), which hosts a pretty well-organized wiki.
Donald pointed out that ALAu's taxonomy arm had two primary communities to serve: the one interested primarily in geographical data, and the one interested primarily in taxonomic/collection data. I think he demonstrated this division by comparing the Australian National Herbarium datasets with Australia's Virtual Herbarium, but to be honest, my notes are somewhat in disarray at this point. This is because Donald is an excellent presenter, and so I wasn't focusing on my notes so much. Sorry about that!
The symposium ended with Dave Roberts talking about ViBRANT. He spoke mainly about the exciting new changes coming in Scratchpads 2.0, which -- based on Drupal 7 -- will have a prettier user interface, snappier access times, and much better scalability. There are ambitious plans to plug Scratchpads into biodiversity publishers (such as PenSoft) as well as biodiversity producers and consumers (such as the Encyclopedia of Life and LifeDesks). Even more interestingly, they're setting up a job-processing backend for Scratchpads on the Oxford Batch Operations Engine (OBOE).
A sandbox Scratchpad is available, if you want to play around with the technology in its current state. Their source code is available, including Scratchpads 2.0. And, being on Git, all that code is easy to play with.
After a short trip to meet Walter Jetz at Yale, I'm back in Boulder, have seen Paul Simon in Broomfield, CO and Wikipedia Loves Libraries! at Norlin last week, and am making a start at putting together my research ideas. Fun times ahoy!
But first, I'm still working through my notes from TDWG 2011. Here's a first installment of my treasure trove from day two; obviously, many awesome presentations and parallel talks were missed, probably because I was too busy enjoying them to take down notes. All errors are mine, and when you spot them, please do let me know in the comments!
Part two will be out as soon as I can get around to it. Also: I'm getting more and more excited for Life and Literature, now just a week away. It'll be great to see familiar faces from TDWG so soon after TDWG 2011 in New Orleans!
I just discovered the #SciFund challenge, a Kickstarter-like project to raise money for scientific projects by promising deliverables created as part of the project. I think it's a great way of getting non-scientists involved in science, to say nothing of getting funded and of creating a concrete science-to-public deliverable system. So: excited!
Here are some of the projects I took a fancy to. Unfortunately for me, as a grad student lacking money, RocketHub (the hosting website) has no mechanism that I can find for "subscribing" to a project without paying. So this post might be the only way for me to keep an eye on them (and maybe make a contribution with my next paycheck).
The most fascinating part of this enterprise for me is the connection between the science projects and the contributors: the deliverables. Rewards for donating ranged from credit (on websites or in publications), to a copy of the research results, to exclusive access to project blog and Twitter feeds (which makes no sense to me: wouldn't you want those feeds to be as public as possible, to encourage more people to join in and contribute?). More interesting deliverables were also on offer, from elephant dung paper and calendars with photos of the research subject, to photos taken during the study, clay sculptures of dolphin fins, lyrics to a Tanzanian children's song, and comics personally drawn for you by Zach Weinersmith of Saturday Morning Breakfast Cereal fame. Allegedly, a comic of "Carl Sagan riding a unicorn" is not out of the question.
One thing I did find odd was the lack of links in the project descriptions. Only a few projects linked to websites, blogs, Facebook pages or Twitter feeds (one project even went so far as to make its Twitter and blog feeds "contributor only"), although just about all of them had really excellent websites hidden away in the "About Me" section. I love Twitter for letting me follow a project without getting too involved, so more Twitter feeds would have been great to see.
Thinking in the longer term, I wonder how well this would work without the #SciFund banner, though. Unlike music or movie crowdfunding, where artists can put up samples beforehand, it might be harder to prove your scientific worth to a broader audience. For instance, how would I differentiate between a serious project to study bee evolution and some nutjob trying to prove that evolution didn't happen? I could look at the project description, sure, but it's not hard to reference many scientific studies to give yourself an air of authenticity. Perhaps the #SciFund banner will eventually be replaced with institutional banners: if the research is being carried out by the Smithsonian, the University of Colorado or is partially sponsored by National Geographic, you can be pretty sure it's been vetted.