The Robertson Library at UPEI launched our new institutional repository, IslandScholar, on December 2nd, 2008. Unfortunately, I didn't have time to announce it on my blog until now :-( Bad Blogger... I will say that the launch was a great success, with a keynote from Dorothy Salo that helped set the stage, a presentation from me on the approach of IslandScholar, and presentations from two UPEI faculty members who publish open access journals, Godfrey Baldacchino and Annabel Cohen. We are now in the process of adding full text to the repository, which an upcoming version of IslandScholar will make easier by adding more workflow tools.
IslandScholar is a little different than the average IR:
- at launch we had close to 7,000 records, almost all of which were citation-only data;
- it uses an open source module we developed called Islandora, which is currently a Drupal 5/Fedora 2 implementation (a Drupal 6/Fedora 3 version is imminent);
- it incorporates a live look-up with SHERPA/RoMEO to display (where available) the publisher policy with respect to open access;
- the system is COinS compliant, providing a place for your OpenURL linker icon so your users can find an alternate copy of the full text, including before we have loaded our own;
- automatic conversion of documents submitted to a specific PDF format, with storage of the original file;
- implementation of a simple and flexible security/workflow model that makes it easier for authors/collaborators to submit material.
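To make the COinS point a little more concrete, here is a minimal sketch of how a citation can be embedded in a page as a COinS span, which is what a browser-side OpenURL linker looks for. This is not the Islandora code; the function name `coins_span` and the sample citation are invented for illustration, and only a handful of the possible journal-article keys are used.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title, journal, issn, year, authors):
    """Return a COinS <span> describing one journal article (hypothetical helper)."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),                      # OpenURL ContextObject version
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),  # journal metadata format
        ("rft.genre", "article"),
        ("rft.atitle", title),
        ("rft.jtitle", journal),
        ("rft.issn", issn),
        ("rft.date", str(year)),
    ]
    # rft.au may repeat, once per author
    fields += [("rft.au", name) for name in authors]
    context_object = urlencode(fields)
    # The key/value string goes in the title attribute, HTML-escaped
    return '<span class="Z3988" title="{}"></span>'.format(
        escape(context_object, quote=True)
    )


# Invented sample citation, for illustration only
span = coins_span(
    title="A Sample Article",
    journal="Journal of Examples",
    issn="1234-5678",
    year=2008,
    authors=["Doe, Jane", "Roe, Richard"],
)
print(span)
```

The span itself is empty; a resolver plug-in reads the title attribute and drops the library's linker icon in its place.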
A little more of the philosophy behind IslandScholar and how we designed it might be of use. We decided to harvest the research output of our faculty as a first step, as this would not only provide us with a complete set of records, but also a kernel with which to launch a more efficient and effective workflow. For example, with the harvested citation data we can send a request to SHERPA/RoMEO for a policy record and present it on the fly to the user. We also separate the data from the metadata from the workflow, providing a more modular and flexible architecture. The additional bonus is that we can provide RSS feeds or special widgets that let faculty show their citations (or those of their department, faculty, etc.) in a web mash-up, which is an additional incentive to help us complete the data.
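As a rough illustration of the on-the-fly lookup, the sketch below takes a harvested citation, uses its ISSN to ask for a policy, and turns the answer into a short note for display. Everything here is an assumption for the sake of the example: `policy_note`, the `fetch_colour` callable (which stands in for the real request to SHERPA/RoMEO) and the citation dictionary are hypothetical, and the colour descriptions are my paraphrase of the RoMEO colour categories rather than text returned by the service.

```python
# Paraphrased RoMEO colour categories (an assumption; check against the service)
ROMEO_COLOURS = {
    "green": "can archive pre-print and post-print",
    "blue": "can archive post-print (final draft after refereeing)",
    "yellow": "can archive pre-print (before refereeing)",
    "white": "archiving not formally supported",
}


def policy_note(citation, fetch_colour, cache):
    """Return a display string for a citation's publisher policy, or None.

    fetch_colour is a hypothetical callable that takes an ISSN and returns a
    colour name or None; in a real system it would call the policy service.
    """
    issn = citation.get("issn")
    if not issn:
        return None  # nothing to look up without an ISSN
    if issn not in cache:
        cache[issn] = fetch_colour(issn)  # one request per journal, then reuse
    colour = cache[issn]
    if colour is None:
        return None  # no policy record available
    description = ROMEO_COLOURS.get(colour, "see publisher policy")
    return "RoMEO {}: {}".format(colour, description)


# Stand-in for the remote service, with invented data
calls = []


def fake_fetch(issn):
    calls.append(issn)
    return {"1234-5678": "green"}.get(issn)


cache = {}
print(policy_note({"issn": "1234-5678"}, fake_fetch, cache))
```

Keeping the fetch step separate from the display step is in the same spirit as separating data, metadata and workflow: the citation record never has to store the policy itself.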
In our case we harvest the metadata with RefWorks, using all the advanced features of this rich metadata tool. Generally that means we trawl databases for specific authors and then add the results to RefWorks, or we copy and paste from a CV sent to us. The editing tools in RefWorks provide a very nice and powerful way to enter repeating data, etc. From there we ingest the records into Fedora using the RefWorks XML as the input format, storing the raw RefWorks record in the Fedora object as a datastream. We also do a conversion from RefWorks to Dublin Core, which accommodates our DC/QDC Lucene indexing model. When the user wants to edit the metadata we present a Drupal form for editing the RefWorks record and we also sync the changes to the DC datastream.
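The RefWorks-to-Dublin-Core step can be pictured with a small sketch like the one below. It is not our actual conversion: the mapping table is deliberately tiny, the tag names (`a1`, `t1`, `jf`, `yr`, `k1`, `sn`) are my assumption about the RefWorks XML export and should be checked against a real record, and `refworks_to_dc` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Assumed RefWorks tag -> Dublin Core element mapping (illustrative, not complete)
TAG_TO_DC = {
    "a1": "creator",     # author, repeatable
    "t1": "title",
    "jf": "source",      # journal name
    "yr": "date",
    "k1": "subject",     # keyword, repeatable
    "sn": "identifier",  # ISSN/ISBN
}


def refworks_to_dc(reference_xml):
    """Convert one <reference> element into a dict of Dublin Core value lists."""
    reference = ET.fromstring(reference_xml)
    dc = {}
    for child in reference:
        element = TAG_TO_DC.get(child.tag)
        text = (child.text or "").strip()
        if element and text:
            # Every DC element is repeatable, so always collect into a list
            dc.setdefault(element, []).append(text)
    return dc


# Invented sample record, for illustration only
sample = """
<reference>
  <rt>Journal Article</rt>
  <a1>Doe, Jane</a1>
  <a1>Roe, Richard</a1>
  <t1>A Sample Article</t1>
  <jf>Journal of Examples</jf>
  <yr>2008</yr>
  <sn>1234-5678</sn>
</reference>
"""
dc_record = refworks_to_dc(sample)
print(dc_record)
```

Because the raw RefWorks record stays in the object, the same function can simply be re-run after an edit to bring the DC datastream back in sync.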
On the security side we implemented a simple model that says a user can have one or more of View/Create/Edit/Delete permissions (thanks to Matt Zumwalt for his "simplify" mantra) and we assign those permissions based on their username and home department. So for example, any user from the Biology Department can edit metadata or upload full text to any item where the primary or secondary author is from that department. This simplifies editorial access by assuming that anyone from a specific department can carry out those tasks. The roles are stored in the Drupal user database and are harvested from our institutional LDAP server when the user logs in. We can also create/assign roles in Drupal if needed. A Java servlet filter acts as the security cop, checking roles in Drupal and sending them to Fedora, allowing a comparison to the Fedora XACML policies attached to that object. This provides a robust security layer that can be controlled via Drupal roles and Fedora XACML, becoming as granular as needed: unique security policies at the collection (e.g. Department of Biology) or object (e.g. article) level.
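The department rule is easy to sketch on its own. The example below models only the decision itself, not the servlet filter, LDAP harvest or XACML policies; the class and function names are hypothetical, and treating all four permissions with the same department check is a simplification on my part.

```python
from dataclasses import dataclass

ACTIONS = frozenset({"view", "create", "edit", "delete"})


@dataclass(frozen=True)
class User:
    username: str
    department: str
    permissions: frozenset  # subset of ACTIONS granted to this user


@dataclass(frozen=True)
class Item:
    title: str
    author_departments: frozenset  # departments of primary/secondary authors


def is_allowed(user, action, item):
    """True if the user holds the permission and shares a department with an author."""
    if action not in ACTIONS:
        raise ValueError("unknown action: {}".format(action))
    if action not in user.permissions:
        return False
    return user.department in item.author_departments


# Invented users and item, for illustration only
biologist = User("jdoe", "Biology", frozenset({"view", "edit"}))
chemist = User("rroe", "Chemistry", frozenset({"view", "edit"}))
article = Item("A Sample Article", frozenset({"Biology"}))

print(is_allowed(biologist, "edit", article))  # same department as an author
print(is_allowed(chemist, "edit", article))    # different department
```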
The beauty of the IslandScholar model is that it saves the faculty most of the work: all they have to do (he says, with a wink on the ALL) is upload the appropriate version of their research article. With the help of the SHERPA/RoMEO record and the link to our resolver, that task is fairly simple. Users who log in and have the appropriate permissions automatically see the upload form, which presents buttons/checkboxes for: document version (pre, post, etc.); use permission (contact publisher/author, CC by-nc-nd); and a confirmation that they have the right to upload. When they click the Upload button the system will auto-convert the article (if not PDF) using a headless version of OpenOffice.org 3, which does a great job of transforming most common formats (except for WordPerfect). The system stores both the original file and the PDF, so we can always go back and do a little QA if the conversion was not good. With that the job is done.
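For anyone curious what the convert-if-not-PDF step might look like, here is a minimal sketch. It is an assumption from start to finish: it builds a command line using the `--headless --convert-to pdf --outdir` options of a current LibreOffice `soffice` binary, which may well differ from how our headless OpenOffice.org 3 install is actually driven, and the function names are hypothetical. Nothing is executed here; the code only works out the command and the files that would be kept.

```python
from pathlib import Path


def conversion_command(upload, out_dir, soffice="soffice"):
    """Return the command to convert an upload to PDF, or None if already PDF."""
    upload = Path(upload)
    if upload.suffix.lower() == ".pdf":
        return None  # nothing to convert
    return [
        soffice,
        "--headless",            # run without a GUI
        "--convert-to", "pdf",   # target format
        "--outdir", str(out_dir),
        str(upload),
    ]


def stored_files(upload, out_dir):
    """List the files kept for an upload: the original, plus the PDF if converted."""
    upload = Path(upload)
    files = [upload]
    if upload.suffix.lower() != ".pdf":
        files.append(Path(out_dir) / (upload.stem + ".pdf"))
    return files


print(conversion_command("paper.doc", "converted"))
print(stored_files("paper.doc", "converted"))
```

To actually run the conversion, the returned list could be passed to `subprocess.run(cmd, check=True)` on a machine where `soffice` is installed.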
Over the coming months we will be working with individual faculty to load their articles into IslandScholar and, with the success of the launch, we have lots of volunteers. Over the next 2-3 months we will also be developing a more detailed rights workflow and adding it to a future release of the Islandora code, which is the base of IslandScholar and all our digital projects. We will be announcing more collections and code developments on our IslandArchives.ca site. Many thanks to the entire Robertson Library team for their hard work on this project, but especially: Paul Pound, Gerry Forsythe, Melissa Belvadi, Dawn Hooper, Peter Lux, Grant Johnson, Alexander O'Neill.
On a more personal note, the one thing that makes me incredibly sad, having gone through a sample of the 7,000 citations to look for those allowing deposit of the publisher PDF, is how much of the farm academics have given away. It really comes home when you look at an institution's intellectual output and the restrictions publishers place on other people's hard work and intellectual property. Any academic who sees this should be concerned and immediately cross off any journal publisher that does not allow open access. My rough guess would be that less than 10% of the articles in IslandScholar allow deposit of the publisher PDF, with variation by discipline. Why have we been so lax and keen to give up our publicly-funded research record? Why do so few faculty care about this? I know the answers but felt that I had to ask the questions because I still don't believe that we can be so cavalier about an issue so critical to our success as a society: free and unfettered access to publicly funded knowledge. What a loss.