UPEI's Model for a "Shrink-Wrapped" Institutional Repository
I have been meaning to post on UPEI's approach to building an institutional repository for a number of months, but like so many things it always seemed to go on the back burner. I thought I would take the opportunity of a cloudy day to provide a brief description of our Repository in a Box project (RIB), which we will be launching this Fall.
RIB is built using UPEI's evolving Drupal/Fedora framework, which is also the basis of our VRE project. RIB is based on a series of workflows on top of the repository backbone:
- A collection of citation data in an appropriate collecting database (currently RefWorks, but most likely to migrate to Zotero) which represents as complete a collection of the scholarly output of the campus community (at this point faculty) as we can generate. This is generated by harvesting existing databases and adding metadata from CVs.
- A Fedora content model that defines the nature of the RIB disseminators, citation objects and associated datastreams, including: Qualified Dublin Core record; Original RefWorks XML record; Sherpa-Romeo record; document thumbnail; document PDF
- A script (or as we call it, in a play on Fedora vocabulary, an inseminator) that takes a RefWorks XML file of the complete citation database, converts it to FOXML and inserts it into a Fedora collection, storing each citation as a separate object.
- A special disseminator that performs a live search of the Sherpa-Romeo database of publisher open access policies and adds/updates a Sherpa datastream in the Fedora object for the article being viewed.
- An OpenURL button that sends the citation data to our CUFTS linker and enables discovery of the publisher version of the article.
- A series of XSLTs that convert the metadata and other datastreams for display in the Drupal interface.
- A search interface, using Drupal's built-in search, that searches the complete Fedora collection, returning results to Drupal.
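To make the inseminator step more concrete, here is a minimal sketch of the conversion idea in Python. It is not the actual RIB script. The short RefWorks-style tag names (a1, t1, jf, yr), the tag-to-Dublin-Core mapping, the "rib" PID prefix and the function names are all assumptions made for illustration, and only a DC datastream is generated; the real content model also carries the original RefWorks record, the Sherpa-Romeo record, a thumbnail and the PDF.

```python
import xml.etree.ElementTree as ET

FOXML_NS = "info:fedora/fedora-system:def/foxml#"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("foxml", FOXML_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)

# Assumed mapping from RefWorks-style tags to Dublin Core elements.
TAG_TO_DC = {"t1": "title", "a1": "creator", "jf": "source", "yr": "date"}


def reference_to_foxml(reference, pid):
    """Wrap one citation element in a FOXML object holding a DC datastream."""
    obj = ET.Element(f"{{{FOXML_NS}}}digitalObject", {"VERSION": "1.1", "PID": pid})
    props = ET.SubElement(obj, f"{{{FOXML_NS}}}objectProperties")
    ET.SubElement(
        props,
        f"{{{FOXML_NS}}}property",
        {"NAME": "info:fedora/fedora-system:def/model#state", "VALUE": "Active"},
    )
    ds = ET.SubElement(
        obj,
        f"{{{FOXML_NS}}}datastream",
        {"ID": "DC", "STATE": "A", "CONTROL_GROUP": "X"},
    )
    version = ET.SubElement(
        ds,
        f"{{{FOXML_NS}}}datastreamVersion",
        {"ID": "DC.0", "MIMETYPE": "text/xml", "LABEL": "Dublin Core record"},
    )
    content = ET.SubElement(version, f"{{{FOXML_NS}}}xmlContent")
    dc = ET.SubElement(content, f"{{{OAI_DC_NS}}}dc")
    # Copy every mapped field; repeated tags (e.g. several authors) are kept.
    for field in reference:
        dc_name = TAG_TO_DC.get(field.tag)
        if dc_name and field.text:
            ET.SubElement(dc, f"{{{DC_NS}}}{dc_name}").text = field.text.strip()
    return obj


def convert(refworks_xml, pid_prefix="rib"):
    """Return one FOXML object per <reference>, numbered from 1."""
    root = ET.fromstring(refworks_xml)
    return [
        reference_to_foxml(ref, f"{pid_prefix}:{n}")
        for n, ref in enumerate(root.iter("reference"), start=1)
    ]


# Made-up sample data in the assumed export layout.
SAMPLE = """<refworks>
  <reference><a1>Smith, J.</a1><t1>First paper</t1><jf>Journal A</jf><yr>2007</yr></reference>
  <reference><a1>Jones, K.</a1><a1>Lee, M.</a1><t1>Second paper</t1><yr>2008</yr></reference>
</refworks>"""

for foxml_object in convert(SAMPLE):
    print(ET.tostring(foxml_object, encoding="unicode"))
```

Each printed document would then be ingested into Fedora as its own object; the ingest call itself is left out here because it depends on the Fedora version and client in use.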
The end result will be an IR that launches with an almost complete collection of scholarly output for the institution. All the faculty member has to do is log in and the system will display their publications (this is the final piece we are currently working on before we launch) and all associated data.
By viewing the detailed record, the user can view and edit metadata, look for the online version and add datastreams.
The individual can click the "Get-It @ UPEI" button to retrieve the final version, if desired, read the publisher open access policy (including a link to the full policy) and add the appropriate version of the article (pre-print, post-print or final published version).
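For readers curious what a link-resolver button of this kind sends, here is a rough sketch of building an OpenURL 1.0 (Z39.88-2004) key/encoded-value query from citation data. The resolver address is a placeholder, not the real CUFTS endpoint, and the citation field names and the function name are assumptions for illustration only.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder resolver address (assumption); substitute the local link resolver.
RESOLVER_BASE = "https://resolver.example.org/openurl"

# Hypothetical citation field names mapped to OpenURL journal-format keys.
KEV_KEYS = {
    "article_title": "rft.atitle",
    "journal_title": "rft.jtitle",
    "author_last": "rft.aulast",
    "year": "rft.date",
    "volume": "rft.volume",
    "issue": "rft.issue",
    "start_page": "rft.spage",
    "issn": "rft.issn",
}


def build_openurl(citation, base=RESOLVER_BASE):
    """Build a journal-article OpenURL, skipping fields that are empty."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
    ]
    for field, key in KEV_KEYS.items():
        value = citation.get(field)
        if value:
            params.append((key, str(value)))
    return f"{base}?{urlencode(params)}"


# Made-up citation for demonstration.
citation = {
    "article_title": "First paper",
    "journal_title": "Journal A",
    "author_last": "Smith",
    "year": 2007,
    "volume": "12",
    "start_page": "34",
}
url = build_openurl(citation)
print(url)
# Round-trip check: the resolver would see the decoded values.
print(parse_qs(urlparse(url).query)["rft.atitle"])
```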
With a minimum of training, our hope is that the presentation of a 90% complete IR record will encourage faculty to complete the process. Some future enhancements will include parsing the Sherpa-Romeo record to automatically grab the publisher version where appropriate and implementing disseminators to convert word-processing formats. We will be providing the RIB as an example in our packaged open source Drupal/Fedora module, which will be available at the end of September from SourceForge.
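As a rough illustration of the Sherpa-Romeo parsing idea, the sketch below reads a stored policy record and reports whether the publisher PDF may be archived. The element names (prearchiving, postarchiving, pdfarchiving) and the value "can" are modelled on the legacy SHERPA/RoMEO XML response and should be treated as assumptions to check against the actual datastream; the function names and sample record are made up.

```python
import xml.etree.ElementTree as ET


def archiving_permissions(romeo_xml):
    """Extract archiving permissions from a RoMEO-style XML record.

    Element names are assumed; missing elements are reported as "unknown".
    """
    root = ET.fromstring(romeo_xml)

    def value(tag):
        node = root.find(f".//{tag}")
        if node is not None and node.text:
            return node.text.strip()
        return "unknown"

    return {
        "preprint": value("prearchiving"),
        "postprint": value("postarchiving"),
        "publisher_pdf": value("pdfarchiving"),
    }


def may_grab_publisher_version(romeo_xml):
    """True only when the record explicitly allows archiving the publisher PDF."""
    return archiving_permissions(romeo_xml)["publisher_pdf"] == "can"


# Made-up record in the assumed layout.
SAMPLE_RECORD = """<romeoapi>
  <publishers>
    <publisher>
      <name>Example Press</name>
      <preprints><prearchiving>can</prearchiving></preprints>
      <postprints><postarchiving>can</postarchiving></postprints>
      <pdfversion><pdfarchiving>cannot</pdfarchiving></pdfversion>
    </publisher>
  </publishers>
</romeoapi>"""

print(archiving_permissions(SAMPLE_RECORD))
print(may_grab_publisher_version(SAMPLE_RECORD))
```

Treating anything other than an explicit "can" as a refusal keeps the automatic step conservative: restricted or unknown policies would still go to a person to review.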
Have you done any tests to see the percentage of your institution's output that this workflow would successfully deal with in practice?
Posted by: Les carr | October 01, 2008 at 07:19 AM
Hi Les - with respect to the % of output the workflow would deal with, I would expect the following: metadata only (100% - library staff do this, and it is largely done now); publisher copy deposit allowed (80% of items in this category - an estimate based on the number of publishers not in Sherpa); pre/post-print deposit required (70% - an estimate based on current processes we are implementing to assist faculty with adding content). These are currently estimates, as the second and third categories in particular are still in process. They are based on a couple of faculty we are working with now, so are not yet hard numbers - I will post firm numbers as we do more of these.
This also does not discuss the deposit of raw research data, which we are also starting to work on. I don't expect to have any good numbers on this for 6-12 months, but I suspect it would be in the 10-20% mark for newer publications.
Posted by: mleggott | October 01, 2008 at 08:47 AM
Have there been any recent developments with RIB - Repository in a Box - or has it been abandoned?
Posted by: Allegra Gonzalez | November 25, 2009 at 02:43 PM
Hi Allegra - the RIB model has been released with our open source project, Islandora (http://islandora.ca/). The full set of functionality has been made available in our Institutional Repository solution pack and can be downloaded as a VirtualBox image for testing or via the code repository (http://fedora-commons.org/confluence/display/ISLANDORA/Islandora). You can also see some of the functionality at our IslandScholar site (http://islandscholar.ca/).
We will be releasing more service packs in the coming months, including a version 2 of the RIB, or institutional repository, solution.
Posted by: Mark | November 27, 2009 at 11:03 AM