This is a fascinating article in which leading researchers in the biological sciences give examples of how applying information management, data curation, and semantic technologies to the scientific literature (as well as to workflows and research data) can facilitate discovery and new research. These are some of the best examples I have seen to date, from the practitioners themselves, of how the new tools can transform the creation of new knowledge. From the intro:
This supplement has focused on progress in text mining applied to biology, and genomics in particular, as reflected in the BioCreative II results. There is a growing demand to 'translate' information from text into more computable forms and to cross-link the information with relevant biological databases. This linkage has the potential to improve the connection between the annotations in biological databases and the supporting evidence contained in the literature. Current biological databases rely heavily on expert human curation, which requires that PhD level biologists read the (relevant) literature carefully, extract specific kinds of information, and encode each snippet of information into an entry in a database using an ontology or controlled vocabulary (see article by Chatr-aryamontri and coworkers in this supplement). Given the growing volume of literature and new high-throughput methods, it is becoming urgent to provide tools that can reduce time and cost of curation, increase consistency of annotation, and provide the linkages to supporting evidence in the literature that make the annotations useful to researchers. Indeed, the distinction between biological databases and the literature is becoming increasingly blurred [2-4], and there is active discussion about whether capture of information from free text can be done before publication or extracted from the literature after publication. In addition, we are seeing the emergence of new tools to aid in massive extraction of information from both literature and biological databases (for example, WikiProfessional) or on-demand extraction of information from the literature (Information Hyperlinked Over Proteins [iHOP]).
These tools will provide improved access to different classes of users, who need different types and granularities of information, ranging from retrieval of relevant articles, to identification of passages or individual sentences, to phrases or biological facts, generally encoded in a controlled vocabulary or ontology.
And thanks to Open Access, this article is available to everyone now, not a year from now, so that research can build on it at a pace faster than any proprietary system would allow.