Presentation: Faceted Searching and Browsing in Scriblio

I was honored to be a panelist at the LITA/ALCTS CCS Authority Control in the Online Environment Interest Group presentation of “Authority Control Meets Faceted Browse.”

What is faceting? Why is it (re)emerging in use? Where can I see it in action? This program is intended to introduce the audience to facet theory, showcase implementations that use faceted approaches for online catalogs, and facilitate discussion on the relationship between structured authority data and this type of navigation.

Kathryn La Barre of University of Illinois at Urbana-Champaign explained the theory, while NCSU’s Charley Pennel, Vanderbilt’s Mary Charles Lasater, and I each described its implementation in Endeca, Primo, and Scriblio respectively. Scriblio is an open source project that has less than one FTE working on it, so it’s an honor to see it compared against commercial offerings, especially NCSU’s groundbreaking work.

My slides are online in QuickTime and PDF form, and I was proud to be able to show off the new public beta of the Lamson Library website and catalog, based on Scriblio.

I should be careful to point out that faceting is a theory of cataloging and classification, while clustering is the technical process of aggregating and reporting relevant metadata in search and browse screens. The difference matters because Scriblio doesn’t impose rules on our cataloging practice; it simply clusters the metadata we already have to make it easier to find the resources we’re looking for.

If anything, the importance of authority control increases in faceted/clustered search and browse systems, but it is a matter of exchanging one set of technological constraints for another. Card catalogs, with their alphabetical access and physical affordances (or limitations), demanded cataloging practice that is in some ways at odds with the very different affordances and limitations of faceted/clustered search and browse.

Among current implementations, clustering does well with subjects, but poorly with authors. Looking at the cardinality of those facets, it’s easy to understand the problem:

[Figure: Scriblio at Lamson: cardinality of selected facets. Statistics from the Lamson Library (beta) catalog.]

The ratio of unique authors to total author entries is very high, so a typical author appears on only a few records, while a small number of unique subject headings accounts for a large number of subject entries. Still, some authors are well suited to faceted browse, and their emergence in a result set could be mined to help users further refine their searches. Example: J. K. Rowling is an obvious top author in searches for both “harry potter” and “j k rowling”. Her statistical “pop” in the results might be worth looking at and worth leveraging elsewhere.
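To make the cardinality comparison concrete, here is a minimal sketch in Python. It is not Scriblio's code, and the records, field names, and the `facet_cardinality` helper are all hypothetical, invented for illustration. It counts unique values against total occurrences for a facet; a ratio near 1 means nearly every value is unique and will cluster poorly.

```python
from collections import Counter

# Hypothetical records; field names and values are illustrative only,
# not data from the Lamson Library catalog.
RECORDS = [
    {"author": ["Rowling, J. K."], "subject": ["Wizards -- Fiction", "Schools -- Fiction"]},
    {"author": ["Rowling, J. K."], "subject": ["Wizards -- Fiction"]},
    {"author": ["Dewey, John"], "subject": ["Educational sociology"]},
    {"author": ["Durkheim, Emile"], "subject": ["Educational sociology"]},
    {"author": ["Bourdieu, Pierre"], "subject": ["Educational sociology"]},
]


def facet_cardinality(records, facet):
    """Return (unique values, total occurrences, unique/total ratio) for a facet."""
    counts = Counter(value for record in records for value in record.get(facet, []))
    total = sum(counts.values())
    unique = len(counts)
    return unique, total, (unique / total if total else 0.0)


for facet in ("author", "subject"):
    unique, total, ratio = facet_cardinality(RECORDS, facet)
    print(f"{facet}: {unique} unique of {total} total (ratio {ratio:.2f})")
```

In this toy data the author facet has a higher unique-to-total ratio than the subject facet, which mirrors the pattern described above, though the real catalog figures will differ.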

As currently implemented, however, clustered results most help the user who doesn’t know the proper terms for her field of interest. A user searching “sociology of education” is likely to be interested in materials cataloged under “educational sociology,” and clustered search navigation works well in that and similar circumstances.
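The “sociology of education” case can be sketched the same way. The following is only an illustration under stated assumptions: the catalog records, the naive title search, and the `search` and `cluster_subjects` helpers are hypothetical and do not reflect how Scriblio actually searches or clusters. The point is that counting subject headings across a keyword result set surfaces the authorized heading the user did not know to type.

```python
from collections import Counter

# Hypothetical records; titles and headings are illustrative only.
CATALOG = [
    {"title": "The Sociology of Education: An Introduction",
     "subject": ["Educational sociology"]},
    {"title": "Readings in the Sociology of Education",
     "subject": ["Educational sociology", "Education -- Social aspects"]},
    {"title": "Sociology of Education Today",
     "subject": ["Educational sociology"]},
    {"title": "A History of Sociology",
     "subject": ["Sociology -- History"]},
]


def search(catalog, query):
    """Naive keyword search: keep records whose title contains every query term."""
    terms = query.lower().split()
    return [r for r in catalog if all(t in r["title"].lower() for t in terms)]


def cluster_subjects(results, limit=5):
    """Count subject headings across a result set, most frequent first."""
    counts = Counter(heading for r in results for heading in r["subject"])
    return counts.most_common(limit)


hits = search(CATALOG, "sociology of education")
top_subjects = cluster_subjects(hits)
for heading, count in top_subjects:
    print(f"{heading} ({count})")
```

Here the user's phrase never appears as a subject heading, yet “Educational sociology” rises to the top of the cluster because it is attached to most of the matching records.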