Thursday, March 03, 2011

Electronic Resources and Libraries Conference - Web Scale Discovery

I just got home from the Electronic Resources and Libraries conference that took place earlier this week in Austin, Texas, where it was 30 degrees Celsius or so yesterday. I went walking yesterday afternoon in the downtown core with the Texas sun beating down on me.

I was happy to see hundreds of people skating on the Rideau Canal as I was riding home in a taxi from the Ottawa International Airport late this afternoon. How in the world do other people survive without a big winter and bracing cold and snow? And outdoor skating? It will always be a mystery to me ...

Anyway, another big theme that emerged at the conference is what is called "web scale discovery" or WSD [brief explanation and link to recent issue of Library Technology Reports on the subject].

Basically, WSD tools claim to offer a unified search of all of a library's offerings through a single interface. Unlike federated search, which sends each query out to many separate sources at search time, WSD tools rely on a pre-harvested, centralized index of an institution's licensed and local collections.

Services such as Serials Solutions Summon, WorldCat Local, Primo Central or EBSCO Discovery pre-index material from subscription databases, library holdings, dissertations, institutional repositories, e-book subscriptions, etc. to allow fast, simultaneous searching.
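To make the difference with federated search concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the source names, the records and the function names are made up, and no vendor's actual product works exactly like this. It simply shows the idea of querying every source at search time versus harvesting everything once into a single index.

```python
from collections import defaultdict

# Hypothetical sources; names and records are invented for illustration.
SOURCES = {
    "catalogue": [
        {"id": "cat-1", "title": "Copyright law in Canada"},
        {"id": "cat-2", "title": "Winter skating on the canal"},
    ],
    "repository": [
        {"id": "rep-1", "title": "Thesis on copyright reform"},
    ],
    "journals": [
        {"id": "jnl-1", "title": "Discovery tools and copyright"},
    ],
}


def federated_search(term):
    """Query each source separately at search time and merge the results."""
    hits = []
    for records in SOURCES.values():
        # In a real federated search, each pass would be a live remote query.
        for record in records:
            if term in record["title"].lower().split():
                hits.append(record["id"])
    return sorted(hits)


def build_unified_index(sources):
    """Harvest every source once, ahead of time, into one inverted index."""
    index = defaultdict(set)
    for records in sources.values():
        for record in records:
            for word in record["title"].lower().split():
                index[word].add(record["id"])
    return index


def unified_search(index, term):
    """Answer a query with a single lookup in the pre-built index."""
    return sorted(index.get(term, set()))


INDEX = build_unified_index(SOURCES)
print(federated_search("copyright"))
print(unified_search(INDEX, "copyright"))
```

Both searches return the same records here. The difference is where the work happens: the federated version touches every source on every query, while the unified version pays the harvesting cost once and then answers from a single index.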

We briefly looked into WSD at my place of work but decided not to pursue things further for a few reasons. In particular, not all vendors of legal research materials play along and allow their content and metadata to be harvested into a unified index. And these tend to be relatively expensive products.

One of the WSD presentations at the conference was by Athena Hoeppner (University of Central Florida) who provided an overview of the features of some of the major WSD products.

Despite their differences, they all offer faceted search to limit results in various ways (material type, date, etc.). All allow filtering to local content only (library print holdings, archives, special collections, etc.). Many offer enrichments such as book jackets, reviews, user ratings and tagging, and citation statistics (how often a scholarly article has been cited).
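For readers who have not seen faceted search up close, here is a rough sketch of the idea in Python. The records, field names and function names are all hypothetical and are not taken from any of the products discussed: a facet is just a count of results per value of a field, and choosing a facet value narrows the result list.

```python
from collections import Counter

# Hypothetical search results; fields and values are invented for illustration.
RESULTS = [
    {"title": "Copyright law in Canada", "material_type": "book", "year": 2009, "local": True},
    {"title": "Review of a copyright text", "material_type": "book review", "year": 2010, "local": False},
    {"title": "Copyright reform debate", "material_type": "newspaper article", "year": 2010, "local": False},
    {"title": "Copyright in the archives", "material_type": "book", "year": 2010, "local": True},
]


def facet_counts(results, field):
    """Count how many results fall under each value of one facet field."""
    return Counter(record[field] for record in results)


def apply_facets(results, **limits):
    """Keep only the results that match every selected facet value."""
    return [
        record
        for record in results
        if all(record[field] == value for field, value in limits.items())
    ]


print(facet_counts(RESULTS, "material_type"))
local_books = apply_facets(RESULTS, material_type="book", local=True)
print([record["title"] for record in local_books])
```

Limiting to `local=True` is the sketch's equivalent of the "local content only" filter mentioned above.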

According to Hoeppner, the user features are amazing: there are alerts, RSS feeds, the ability to export to bibliographic management tools, persistent folders, etc. These are products with very rich functionality and user-friendly interfaces.

Librarians at Montana State presented their experience at another workshop. They tested one product, found it wanting and, less than a year ago, opted to go with another, which they have branded CatSearch. Their experience contained elements they described as the good, the bad and the ugly.

Among the good: interdisciplinary searching is greatly enhanced. As well, e-books are searchable as they become available from the publisher (faster in fact than the MARC cataloguing records for them are released and loaded into the OPAC). They have also noticed that forgotten (or more obscure?) material is revealed and that the tool they use makes local digital collections more "discoverable".

But there is the bad. WSD tools do seem to imply, by their very scope and humongous index size, that they are the be-all and end-all of searching. That is not a message libraries want to be sending to members of the "Google generation" who already assume all information must always be one effortless click away. No tool ever searches 100% of an institution's holdings, but the very term WSD implies that it does.

Another problem Montana State librarians have run into is related to troubleshooting when links to any specific resource are broken. It has become hard to figure out who is in charge of fixing the problem: is the cause at the database provider level, the WSD vendor level, the library proxy server level, etc.?

And the ugly. Lots of noise rises to the top of results lists (book reviews, newspaper articles, open access material). It can all be filtered out using the facet limiters, but it is still a problem. Librarians also dislike the lack of transparency of the default relevancy ranking algorithm. WSD tools create a massive unified index that is searched according to a mysterious proprietary formula and librarians are often puzzled by what pops up. Then, of course, there is the overwhelming number of results, as the default search often pulls in thousands of them. Can you spell "information overload"?

In another presentation, a librarian from Arizona State University (ASU) described how that institution implemented their Library One Search using the Serials Solutions Summon product.

ASU librarians were noticing that students were going to the catalogue assuming it allowed them to search for everything: books, articles etc. when that was obviously not the case. And then, there is the Google Generation phenomenon. One button, one click, all the info you want. WSD was intended to help address that "problem".

At ASU, WSD does not cover everything but it does, for example, cover 97% of the content of ASU's top 100 journals. ASU also calculates that close to 92% of the items in sources with unique ISSNs are indexed. So, that's very "comprehensive".

ASU marketed and branded the heck out of Library One Search, plastering it on the library home page, on the student portal (where students log in to register, choose classes, pay tuition), on course management software, on library research guides.

So, it is clear to me that WSD is a trend, especially in the academic milieu.

But these tools are not cheap, and it is unclear how appropriate they are for the legal research market right now. Content from Westlaw and other major legal info providers cannot be harvested into the unified mega-index these products require if they want to market themselves as "web scale".

posted by Michel-Adrien at 7:18 pm