MaisonBisson.com » opensearch http://maisonbisson.com A bunch of stuff I would have emailed you about. Sat, 14 Nov 2009 20:14:03 +0000 http://wordpress.org/?v=2.8.6 en hourly 1 DeWitt Clinton On The Birth of OpenSearch http://maisonbisson.com/blog/post/11665/dewitt-clinton-on-the-birth-of-opensearch/ http://maisonbisson.com/blog/post/11665/dewitt-clinton-on-the-birth-of-opensearch/#comments Thu, 03 May 2007 16:04:21 +0000 Casey Bisson http://maisonbisson.com/blog/post/11665/#dewitt-clinton-on-the-birth-of-opensearch

OpenSearch is a common way of querying a database for content and returning the results. The idea is that it brings sanity to the proliferation of search APIs, but a realistic view would have to admit that we’ve been trying to do that since before the development of Z39.50 in libraries decades ago, and the hundreds of APIs that have followed have all been well intentioned and purposeful.

So what makes OpenSearch something more than an also-ran in a crowded herd? Part of it is in what it doesn’t do. “Rather than reinventing the wheel, it uses the simple and very popular syndication formats RSS and Atom, along with a document describing the search engine.”

DeWitt Clinton helped create the OpenSearch protocol while working at Amazon’s A9.com. DeWitt is currently at Google, but he’s continuing his work on OpenSearch as an open, Creative Commons-licensed specification, and I caught up with him there to talk about what it takes to develop an open format.

My first questions were about where OpenSearch came from.

DeWitt Clinton: Amazon launched a wholly owned subsidiary called A9. This was in late 2003, and it revealed the first beta site in early 2004. A9’s mission was to explore search and to see where search could be done better.

One of the first things that we launched was the A9 front-end search interface, including search results from Google and a handful of other partners. We integrated the different search results and displayed them to users, which was, I think, relatively novel for the time. It was a multiple column display where you could do one search query and see search results. They weren’t necessarily interleaved, but they were aggregated on screen.

We worked with Google’s search API, Answers.com’s search API, we worked with a few other search APIs and we started talking to additional partners about getting their searches into A9. There were a number of companies that had search engines, but far more often than not, they also had proprietary search APIs.

Basically, if you were a search company (if you were Answers.com or something like that), you would say, “OK, I can accept search requests and I’m going to give you search results back, maybe I’ll use this XML format, maybe it’s going to be SOAP, maybe it’s going to be something else.”

So we worked with a couple more of these proprietary APIs and said, “You know, this is getting silly. We’re doing all this work on our end to integrate search results, maybe there is an easier way.” We looked around to see if there was a standard for search, and didn’t really surface anything specifically for web-based, web-type search. There were formats for more structured search, but web search is at best very loosely structured.

So we started to pick it apart, looking to propose a search format that our partners could use. But what would go into a search format? What are the common traits of search? What are the things that all web-search engines accept as parameters on the request and what are the type of things that they send back?

We started looking at the existing protocols — those that Yahoo!, Google, and even the smaller, more niche search engines had exposed — and asking ourselves what they were doing. We took the common elements from those formats until we found the subset that we could tell, just empirically, was going to cover at least the 80% case of what other people are already doing.

Then there was this moment when we realized we were inventing yet another proprietary format. You know, essentially a closed format. Fortunately, having done a lot of work with RSS in the past, we realized, “You know, search results are just a list. And the whole world is using RSS as a way of syndicating lists. So what if we — instead of trying to invent something completely new — what if we leveraged an existing protocol?”

RSS was already out there, already open, already extremely well-adopted, and had tons of client and server libraries available. Combining RSS-based responses, the extra search result metadata, and our new format for describing search interfaces gave us the common subset, the 80% case we needed for syndicated search. And that became OpenSearch 1.0.

There were three “lightbulb moments” in designing OpenSearch. The first was extracting the common features of web search. The second was leveraging existing formats, such as RSS. The third “lightbulb” was in asking the question: “who benefits if this is a proprietary A9/Amazon solution? Is the world a better place, is even our business better off if this is closed and proprietary?” And the answer, very clearly, was “no.” With that the decision was clear, “You know what, let’s open this protocol. Let’s use the Creative Commons as a way of opening the text of the format of the protocol.”

DeWitt Clinton, OpenSearch, interview, open formats, protocols, search, search syndication, RSS

http://maisonbisson.com/blog/post/11665/dewitt-clinton-on-the-birth-of-opensearch/feed/ 0
The Future Of Library Technology Is Free, Cheap, And Social http://maisonbisson.com/blog/post/11059/the-future-of-library-technology-is-free-cheap-and-social/ http://maisonbisson.com/blog/post/11059/the-future-of-library-technology-is-free-cheap-and-social/#comments Tue, 13 Mar 2007 12:14:28 +0000 Casey Bisson http://maisonbisson.com/blog/?p=11059

delicious = Endeavor’s course content integrator
OpenSearch = metasearch
Flickr = digital collections management

http://maisonbisson.com/blog/post/11059/the-future-of-library-technology-is-free-cheap-and-social/feed/ 0
OpenSearch Progress http://maisonbisson.com/blog/post/11397/opensearch-progress/ http://maisonbisson.com/blog/post/11397/opensearch-progress/#comments Wed, 02 Aug 2006 00:18:27 +0000 Casey Bisson http://maisonbisson.com/blog/post/11397/

I really need to keep better tabs on Michael Fagan, as his June 11 OpenSearch Update is full of goodies.

OpenSearch, OpenSearch referrer extension, extensions, microformats, search suggestions

http://maisonbisson.com/blog/post/11397/opensearch-progress/feed/ 1
OpenSearch In A Nutshell http://maisonbisson.com/blog/post/11384/opensearch-in-a-nutshell/ http://maisonbisson.com/blog/post/11384/opensearch-in-a-nutshell/#comments Wed, 19 Jul 2006 16:19:04 +0000 Casey Bisson http://maisonbisson.com/blog/post/11384/

open search aggregator

OpenSearch is a standard way of querying a database for content and returning the results.

The official docs note simply: “Any website that has a search feature can make their results available in OpenSearch format,” then add: “Publishing your search results in OpenSearch™ format will draw more people to your content, by exposing it to a much wider audience through aggregators such as A9.com.”
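To make "OpenSearch format" a little more concrete: a site describes its search with a URL template, and an aggregator fills in the blanks. Here's a minimal Python sketch of that substitution step. The template URL and the `fill_template` helper are made up for illustration; only the `{searchTerms}` and `{startIndex?}` placeholder style is taken from the OpenSearch 1.1 drafts, and a real client would handle more parameters than this.

```python
import re
from urllib.parse import quote_plus

# Hypothetical template, written in the style of an OpenSearch description document.
TEMPLATE = "http://example.org/search?q={searchTerms}&start={startIndex?}&format=rss"


def fill_template(template, **params):
    """Replace {name} and optional {name?} placeholders with URL-encoded values."""

    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in params:
            return quote_plus(str(params[name]))
        if optional:
            return ""  # optional parameters are simply left empty
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{([A-Za-z:]+)(\??)\}", substitute, template)


url = fill_template(TEMPLATE, searchTerms="harry potter")
print(url)
```

The point is how little the aggregator needs to know: one template per target, and every target can be queried the same way.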

It’s a lot easier to understand OpenSearch once you’ve used it, so take a look at A9.com and do a search. A9 isn’t the only OpenSearch aggregator out there, but it’s a great example. You can query a number of OpenSearch targets by clicking the buttons to add columns (also try resizing the columns), or you can add any of the 422 public search targets listed at A9.

Now, if you’ve got the beta of IE 7, you can see how it’s extending beyond server-side aggregators and into client software. Even better, you can see how this is becoming automagical via autodiscovery.

One of the most exciting features of OpenSearch is its support for complex queries as well as simple keyword searches, and the ability to return intelligent responses to a search, such as alternate search suggestions (think spelling corrections) and facets (hey, any librarians attending this?)
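Here's a rough sketch of what reading those "intelligent responses" might look like on the client side, assuming an RSS response that carries the OpenSearch 1.1 response elements (`totalResults`, and `Query` with `role="correction"` for a spelling suggestion). The sample feed, the example.org URLs, and the `summarize` helper are invented for illustration; facets are not shown because I'm not certain how a given target would encode them.

```python
import xml.etree.ElementTree as ET

# Namespace used by the OpenSearch 1.1 response elements.
OS_NS = "http://a9.com/-/spec/opensearch/1.1/"

# Invented sample response for a misspelled query.
SAMPLE = """<rss version="2.0" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <channel>
    <title>Example catalog search: hary potter</title>
    <opensearch:totalResults>2</opensearch:totalResults>
    <opensearch:Query role="request" searchTerms="hary potter"/>
    <opensearch:Query role="correction" searchTerms="harry potter"/>
    <item>
      <title>Harry Potter and the Goblet of Fire</title>
      <link>http://example.org/record/1</link>
    </item>
    <item>
      <title>Harry Potter and the Half-Blood Prince</title>
      <link>http://example.org/record/2</link>
    </item>
  </channel>
</rss>"""


def summarize(feed_xml):
    """Return (total results, suggested corrections, result titles)."""
    channel = ET.fromstring(feed_xml).find("channel")
    total = int(channel.findtext(f"{{{OS_NS}}}totalResults"))
    corrections = [
        query.get("searchTerms")
        for query in channel.findall(f"{{{OS_NS}}}Query")
        if query.get("role") == "correction"
    ]
    titles = [item.findtext("title") for item in channel.findall("item")]
    return total, corrections, titles


total, corrections, titles = summarize(SAMPLE)
print(total, corrections, titles)
```

Everything outside the `opensearch:` elements is plain RSS, which is why any feed reader can already display the results even if it ignores the suggestions.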

Now, the question for libraries is when are we going to demand OpenSearch interfaces from our information providers? The inclusion of OpenSearch in IE7 more than gives it critical mass, but so far it seems to be just something a few progressive library-types are experimenting with. In the short term, imagine how improved our metasearch tools would be if based on fully-implemented OpenSearch feeds (with the facets and suggestions). In the long term, I can’t imagine any aspect of a library’s online services not touched by this technology.

a9, API, lib20, libraries, library, library 2.0, OpenSearch, search, search aggregator, search api

http://maisonbisson.com/blog/post/11384/opensearch-in-a-nutshell/feed/ 2
Technology Scouts At AALL http://maisonbisson.com/blog/post/11381/technology-scouts-at-aall/ http://maisonbisson.com/blog/post/11381/technology-scouts-at-aall/#comments Tue, 11 Jul 2006 16:59:35 +0000 Casey Bisson http://maisonbisson.com/blog/post/11381/

AALL Presentation

I’m honored to join Katie Bauer, of Yale University Library, in a program coordinated by Mary Jane Kelsey, of Yale Law’s Lillian Goldman Library.

The full title of our program is Technology Scouts: how to keep your library and ILS current in the IT world (H-4, 4PM Tuesday, room 274). My portion of the presentation will focus on how we’re fixing up our catalogs, with a big emphasis on how APIs can be used to continuously reinvent the way we look at — and thus understand and use — the information we have. The big idea here is that as we separate the systems that store and manage our data from the applications that display and manipulate it, we open the door to faster, cheaper development — and make room for a bunch of new ideas along the way.

Because it’s a short program, I’ll only be able to gloss over some of the discussion of what’s wrong with our catalogs and how we’re fixing them, and while there’s a lot to say about WPopac, I’ll have to leave it to Jenny Levine to explain most of it.

My slides are online. As usual, all the underlined text is hotlinked along with all the screenshots, so click them for more information and detail.

AALL, AALL2006, American Association of Law Libraries, api, conference, law libraries, lib20, libraries, library, library 2.0, opensearch, presentation, rss, web 2.0, web20, xml

http://maisonbisson.com/blog/post/11381/technology-scouts-at-aall/feed/ 0
All About OpenSearch and Autodiscovery from Davey P http://maisonbisson.com/blog/post/11197/all-about-opensearch-and-autodiscover-from-davey-p/ http://maisonbisson.com/blog/post/11197/all-about-opensearch-and-autodiscover-from-davey-p/#comments Thu, 09 Mar 2006 21:06:11 +0000 Casey Bisson http://maisonbisson.com/blog/?p=11197

I’ve been meaning to point out (and steal from) Dave Pattern’s post on tipping off IE7 (and other browsers soon too, hopefully) to available OpenSearch targets for some time now. I haven’t had time to do the stealing, so I’ll have to settle for pointing it out while it’s still news.

What’s the trick? As Dave explains, you put a link in the <head> section of your pages like this:

<link rel="search"
     type="application/opensearchdescription+xml"
     title="WPopac Demo"
     href="http://www.plymouth.edu/library/opac/opensearch.xml" />

When IE7 finds that, it’ll offer you a chance to add the new search target. The screenshots at Dave’s site show the whole thing.
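For the curious, the browser's side of autodiscovery can be approximated in a few lines. This is only a sketch of the idea, not how IE7 actually does it: the `SearchLinkFinder` class and the sample page are my own, and it uses nothing beyond Python's standard `html.parser`.

```python
from html.parser import HTMLParser


class SearchLinkFinder(HTMLParser):
    """Collect (title, href) pairs for OpenSearch autodiscovery links."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        is_search = (attributes.get("rel") or "").lower() == "search"
        is_opensearch = attributes.get("type") == "application/opensearchdescription+xml"
        if is_search and is_opensearch:
            self.targets.append((attributes.get("title"), attributes.get("href")))


# Sample page built around the link element shown above.
PAGE = """<html><head>
<title>WPopac Demo</title>
<link rel="search"
     type="application/opensearchdescription+xml"
     title="WPopac Demo"
     href="http://www.plymouth.edu/library/opac/opensearch.xml" />
</head><body></body></html>"""

finder = SearchLinkFinder()
finder.feed(PAGE)
print(finder.targets)
```

A client would then fetch each `href` to read the description document and offer the target to the user.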

autodiscovery, dave pattern, davey p, ie7, opensearch, opensearch autodiscovery, search api

http://maisonbisson.com/blog/post/11197/all-about-opensearch-and-autodiscover-from-davey-p/feed/ 1
Standards Cage Match http://maisonbisson.com/blog/post/11171/standards-cage-match/ http://maisonbisson.com/blog/post/11171/standards-cage-match/#comments Thu, 23 Feb 2006 15:14:00 +0000 Casey Bisson http://maisonbisson.com/blog/?p=11171

The great wall of 'standards,' from my code4lib presentation.

I prefaced my point about how the standards we choose in libraries isolate us from the larger stream of progress driving development outside libraries with the note that I was sure to get hanged for it.

It’s true.

I commented that there were over 140,000 registered Amazon API developers and 365 public OpenSearch targets (hey look, there’s another one already), but that SRW/SRU would always play to a smaller audience. Basing arguments on the popularity of the subjects is dangerous, especially so within the library community, and touching on such inflammatory arguments during a 20 minute presentation is certain to leave people feisty.

It’s also especially dangerous to use an apparently sacred cow as the object of what I wanted to be a general example. My overall argument was (and remains) that we should look for opportunities to break down the barriers that isolate our work and find means to expand our community. Still, I believe a specific argument about SRW/SRU has merit, and I’m willing to carry the flag on this side.

So let’s start with what I believe we can agree on: SRW/SRU, OpenSearch, and Amazon Web Services all serve substantially similar interests: the ability to issue a query, get a list of results, get a detailed record for each result (not possible with OpenSearch). From here, many people seem to argue that XSLT can be used to mutate the results of one schema to the other, or directly to browser-displayable content with ease. On the face of it, this seems to solve many of the incompatibilities while preserving the unique features of each.

Sadly, those XSLT arguments ignore one problem while creating another.

XSLT (and similar techniques) can change the representation of the data in a record, but they can’t change the type or nature of the data and such techniques certainly can’t address differences in the way applications interact with the API. As an example, consider that an XSLT could likely be written to translate Flickr’s schema for a single image into something that looks like Amazon’s schema for a single title, but no XSLT can make an application that interacts with one API properly interact with the other.

The problem that XSLT solutions ignore is that if all these schemas can be translated between each other (either cleanly or not), and if catalogers working with one metadata standard must be aware of the limitations of other standards to which their work might get XSLT’d to, then what’s the value of their differences? Why invest the duplicated time and effort in each?

The rest of this argument assumes that XSLT solves neither the needs of the programmer who must still learn to navigate different APIs nor the cataloger who must either use lowest-common-denominator cataloging standards or write metadata that can’t be cleanly translated to other schemas.

With XSLT out of the picture, it becomes clear that SRW/SRU is indeed among the wall of standards that make it impossible for us within the library to share executable code with anybody outside our community. And because of our low numbers and natural variations in chosen environments (preferred language & database among them), we often find it difficult to share executable code among others within our community.

It’s also worth considering the differences in features between SRW/SRU, OpenSearch, and Amazon Web Services: Both OpenSearch and AWS offer ways to include suggested alternate searches within the search response set (OpenSearch does this especially well). Nothing I’ve seen in SRW/SRU does this (please correct me if I’m wrong), yet considering how much interest there is in developing more human search interfaces and those that allow faceted searching, these are clearly essential components of any useful standard.

Further, AWS supports all aspects of the usage of materials, not just the search and retrieval of them. Are AWS’s shopping cart and checkout features not similar to our circ checkout procedures? Could AWS’s list management features not be used to show patrons what they have checked out now or throughout their history (if we or they wanted that), as well as allowing them to maintain the reading wishlists or personal bibliographies?

And AWS’s support for returning related and recommended items for each record, as well as comments and reviews is outside the scope of SRW/SRU, but required for many of the features we want to add to our applications.

The point here is that while there are substantial differences in the details between SRW/SRU and OpenSearch or AWS, it is not easy to conclude that SRW/SRU is substantially better for the applications we seem to most want to build.

And this is when we have to take note of the recent University of California libraries report and the quote that puts us all in our places: “for the past ten years online searching has become simpler and more effective everywhere, except in library catalogs” (and the same could be said of our online databases).

The problem isn’t that we’ve been bad coders, and we certainly haven’t intentionally built systems that were difficult to use. The problem is that our community has been isolated and unable to leverage advances made elsewhere. Again, my argument is that we need to change this, that we need to find more ways to collaborate not only with those within our community, but with those outside our community.

There’s much we might be able to offer coders outside libraries, but the arguments defending SRW/SRU seem to ignore the lessons we might learn from them.

Final example: it’s pretty obvious to all of us now that chat reference should be done using common and freely available IM tools, but that didn’t stop us from investing huge sums of money in building and buying custom, library specific chat reference tools. Where else will history show we’ve made similar mistakes?

a9, amazon api, amazon web services, argument, AWS, cage match, code4lib, code4lib 2006, future libraries, information retrieval, lib20, libraries, library, library 2.0, library standards, opensearch, search, search and retrieval, search retrieval, sru/srw, srw/sru, web services

http://maisonbisson.com/blog/post/11171/standards-cage-match/feed/ 6
OpenSearch Spec Updated http://maisonbisson.com/blog/post/11028/opensearch-spec-updated/ http://maisonbisson.com/blog/post/11028/opensearch-spec-updated/#comments Tue, 13 Dec 2005 18:39:49 +0000 Casey Bisson http://maisonbisson.com/blog/?p=11028

I just received this email from the A9 OpenSearch team:

We have just released OpenSearch 1.1 Draft 2. We hope to declare it the final version shortly, and it is already supported by A9.com. Upgrading from a previous version should only take a few minutes…

OpenSearch 1.1 allows you to specify search results in HTML, Atom, or any other format (or multiple formats) in addition to just RSS. In addition, OpenSearch 1.1 will be supported by Internet Explorer 7, among other software, so we strongly recommend that you upgrade. Also new is the ability to specify suggested searches, such as spelling suggestions and related queries. (link and emphasis added)

Woot! I’ll be doing something with this soon.

a9, opensearch, open search, amazon, search, libraries, library, opac, library catalog, library catalogs, a9.com, metasearch, aggregated search, search, federated search

http://maisonbisson.com/blog/post/11028/opensearch-spec-updated/feed/ 4
OPAC Web Services Should Be Like Amazon Web Services http://maisonbisson.com/blog/post/10956/opac-web-services-should-be-like-amazon-web-services/ http://maisonbisson.com/blog/post/10956/opac-web-services-should-be-like-amazon-web-services/#comments Wed, 30 Nov 2005 19:25:48 +0000 Casey Bisson http://maisonbisson.com/blog/?p=10956

No, I’m not talking about the interface our users see in the web browser (there’s enough argument about that). I’m talking about web services, the technologies that form much of the infrastructure for Web 2.0.

Once upon a time, the technology that displayed a set of data, let’s say catalog records, was inextricably linked to the technology that stored that set of data. As we started to fill our data repositories, we found it useful to import (and export) the data so that we could benefit from the work others had done and share our contributions with others. These processes were manual, or at least actively managed, and they depended on the notion that we had to have that information in our servers to be used by and displayed for our users.

Then technology evolved. Many applications now separate the components that store and manage the information from the components that display and manipulate it, and a few applications open up their data stores to the public via web services-based APIs. This is the concept that makes HousingMaps, ChicagoCrime, and Flickr Colr Pickr, among so many others, work.

Think about this for a moment: Our ILSs are inventory management systems, but our OPACs are (supposed to be) search and retrieval systems. The difference is obvious from here, but our vendors continue to operate as though you can’t have one without the other.

It might be easier to illustrate this point with an example or two.

Amazon Light is one of hundreds of applications based on Amazon’s web services. It connects Amazon’s inventory system with a custom built search and retrieval system, and it works. The Amazon Light developers at Kokogiak didn’t need to build the inventory system, they only needed to think about ways to make the Amazon inventory more useful to you. Try it out, you might like the ability to search your local library (via some real hacks) or bookmark things via del.icio.us.

Or, you might not. Because Amazon allows anybody to access their catalog data, everybody has the opportunity to build a better, more usable catalog — or any other application that can benefit from the bibliographic details in it.

Take LibraryThing for example. It’s hard to explain what it is about people who read books that makes them want to list the books they own or have read or are interested in reading, but LibraryThing doesn’t worry about the why. It just answers the need. And because listing books, at least making a detailed list of books, can be time consuming, LibraryThing makes it easier by fetching the full details and book jacket from Amazon’s catalog. LibraryThing doesn’t need to “own” that info, it just needs access to it.

And what’s interesting is that LibraryThing is only one of a number of similar applications. Take a look at AllConsuming, Technorati’s popular books, and listal. These services connect Amazon’s catalog data with other data gathered from users or from web crawls, then they share the results. Here’s Ryan Eby’s lists of owned and wanted books, and here they are in RSS. Why RSS? Take a look at how he’s using the listal feed for his current reading list in his blog (lower-right column).

These are not technology demos. These are real applications. They are examples of how the world changes when you open up access to your catalog data. It’s what happens when we realize that the tools that store and manage our information are separate from the tools that display and manipulate that information.

Obviously, I’m about to make the (now-old) argument that we need to open our OPACs like this, but we also need to take the lesson that easy and loose is winning over detailed and difficult, even in XML representations of our catalog data. And after looking at all that’s been done so far, I want to ask: why not adopt Amazon’s web services XML schema?

Is it so bad that it was invented elsewhere? Is it a bad thing that there are perhaps hundreds of applications that are already using data in that format?

Maybe the answer to those questions is yes, but here’s where technology can serve us again: we don’t have to choose. We don’t need to bet on one technology while we watch others progress faster. Our systems can output the same catalog data in any number of different ways. RSS, OpenSearch, MARC-XML, Atom, EAD, or DC are all possible, easy in fact, if the inventory server architecture is open enough to allow it.
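As a small illustration of "we don't have to choose," here's a sketch that takes one catalog record and emits it two ways: as an RSS item and as simple Dublin Core. The record, the example.org URL, the wrapper element name, and both function names are hypothetical; a real ILS would have far richer records, but the shape of the idea is the same.

```python
import xml.etree.ElementTree as ET

# Dublin Core element set namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical catalog record, as it might come out of the inventory system.
record = {
    "title": "Walden",
    "creator": "Thoreau, Henry David",
    "url": "http://example.org/record/42",
}


def as_rss_item(rec):
    """Serialize the record as a bare RSS <item>."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = rec["title"]
    ET.SubElement(item, "link").text = rec["url"]
    return ET.tostring(item, encoding="unicode")


def as_dublin_core(rec):
    """Serialize the same record as simple Dublin Core elements."""
    root = ET.Element("record")
    ET.SubElement(root, f"{{{DC_NS}}}title").text = rec["title"]
    ET.SubElement(root, f"{{{DC_NS}}}creator").text = rec["creator"]
    ET.SubElement(root, f"{{{DC_NS}}}identifier").text = rec["url"]
    return ET.tostring(root, encoding="unicode")


rss_xml = as_rss_item(record)
dc_xml = as_dublin_core(record)
print(rss_xml)
print(dc_xml)
```

The data lives in one place; each output format is just another small function on top of it.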

What do I really mean when I say library web services should be like Amazon web services? I mean they should be that accessible, that usable, that hackable. I mean libraries will benefit when people we’ve never met are spending their evenings building new applications to use our data. People are wondering how to get more programmers in libraries (example one, two), but I’m wondering how to make library systems more programmer friendly.

Fired up? Read more with my library catalogs should be like WordPress post, John Blyberg’s ILS customer bill of rights, and Ryan Eby’s open vs. turnkey discussion.

http://maisonbisson.com/blog/post/10956/opac-web-services-should-be-like-amazon-web-services/feed/ 8
Now Search Lamson Library at A9.com http://maisonbisson.com/blog/post/10907/now-search-lamson-library-at-a9com/ http://maisonbisson.com/blog/post/10907/now-search-lamson-library-at-a9com/#comments Fri, 21 Oct 2005 12:50:10 +0000 Casey Bisson http://maisonbisson.com/blog/?p=10907

A9, the search engine from Amazon.com, does some pretty interesting things that libraries should be aware of. First, any library considering a metasearch product should look at what can be done for free, and second, libraries should take a look at the OpenSearch technology that drives it.

So now, when searching for Harry Potter, you’ll also find relevant results from Plymouth State University’s Lamson Library. We’re not the first library — I think Seattle Public was — and my work mostly follows the cookbook written up by Ryan Eby, of Michigan State University Libraries. Thanks also go to our university IT sysadmins who installed the XSLT extension for PHP5 earlier this week.

http://maisonbisson.com/blog/post/10907/now-search-lamson-library-at-a9com/feed/ 0