MaisonBisson.com » marc (http://maisonbisson.com)
A bunch of stuff I would have emailed you about.

Parsing MARC Directory Info
Casey Bisson, Thu, 16 Nov 2006 17:35:11 +0000
http://maisonbisson.com/blog/post/11513/parsing-marc-directory-info-is-easy/

I expected a record that looked like this:

LEADER 00000nas  2200000Ia 4500
001    18971047
008    890105c19079999mau u p       0uuua0eng
010    07023955 /rev
040    DLC|cAUG
049    PSMM
050    F41.5|b.A64
090    F41.5|b.A64
110 2  Appalachian Mountain Club
245 14 The A.M.C. White Mountain guide :|ba guide to trails in
       the mountains of New Hampshire and adjacent parts of Maine
246 13 AMC White Mountain guide
246 13 White Mountain guide
246 13 A.M.C. White Mountain guide
260    Boston,|bThe Club,
300    v. :|bill., maps (some fold., some col.) ;|c16 cm
362 0  1st-     ed.; 1907-
500    Title varies slightly
651  0 White Mountains (N.H. and Me.)|xGuidebooks

but instead got a record that looked like this:

00939cas  2200265Ia 4500001001300000003000700013005001700020008004100037020001500078040001800093050001600111110003100127245012200158246003000280246002600310246003200336246003000368260005500398300005700453362003600510500002700546650001100573651007300584999001600657
ocm18971047
OCoLC
20020918102844.0
890105c19079999mau u p       0   a0eng
  a0910146489
  aDLCcAUGdNHS
  aF41.5b.A64
2 aAppalachian Mountain Club.
14aThe A. M. C. White Mountain guide :ba guide to trails in the mountains of New Hampshire and adjacent parts of Maine.
13aAMC White Mountain guide.
13aWhite Mountain guide.
13aA.M.C. White Mountain guide
13aAMC White Mountain guide.
  aBoston, Mass. :bAppalachian Mountain Club,c1983.
  a550 p.bill., maps (some fold., some col.) ;c16 cm.
0 a1st- ed.; 1907- ; 25th ed. 1992
  aTitle varies slightly.
  aHiking
0aWhite Mountains (N.H. and Me.)xDescription and travelxGuide-books.
  aCL000018321

(Some of the non-printable characters have been replaced with newlines for readability.)

After staring at that record for entirely too long, forgetting about it for a while, then returning again to think about how unreadable it was, then forgetting about it again, then taking one last look, I had that *duh* moment that made me realize what I should have seen on first glance: this is a MARC record that hasn’t had its directory parsed.

So here’s my short-but-handy-and-hopefully-useful-to-somebody-sometime code to parse the directory and then the rest of the record. It assumes $records is an array of raw MARC records.


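foreach ($records as $record) {
	// The leader and directory end at the first field terminator (0x1E).
	$temp = explode("\x1E", $record);
	$dir = $temp[0];
	// Everything after that terminator is field data; directory offsets count from here.
	$record = substr($record, strlen($dir) + 1);

	// Skip the 24-character leader, then cut the directory into 12-character entries.
	$dir_field = str_split(substr($dir, 24), 12);

	// Show subfield delimiters (0x1F) as pipes; a one-for-one swap keeps the offsets valid.
	$record = str_replace("\x1F", '|', $record);
	$marc = '';
	$marc_array = array();
	foreach ($dir_field as $field) {
		// Skip any chunk that has no digits in it (ereg_replace() is gone from modern PHP).
		if (preg_replace('/[^0-9]/', '', $field)) {
			// Each entry is a 3-character tag, a 4-digit length, and a 5-digit offset.
			$len = (int) substr($field, 3, 4);
			$pos = (int) substr($field, 7, 5);
			$field = substr($field, 0, 3);
			// The length includes the trailing field terminator, so drop it.
			$temp = rtrim(substr($record, $pos, $len), "\x1E");
			// Control fields (001-009) have no indicators, so pad that column.
			if ($field < 10)
				$temp = '  |' . $temp;
			$marc .= trim($field . '|' . $temp) . "\n";
			// Note: a repeated tag (such as 246) overwrites the earlier value here.
			$marc_array[$field] = $temp;
		}
	}
	echo $marc;
}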

The actual output of that code on that record is this:

001|  |ocm18971047
003|  |OCoLC
005|  |20020918102844.0
008|  |890105c19079999mau u p       0   a0eng
020|  |a0910146489
040|  |aDLC|cAUG|dNHS
050|  |aF41.5|b.A64
110|2 |aAppalachian Mountain Club.
245|14|aThe A. M. C. White Mountain guide :|ba guide to trails in the mountains of New Hampshire and adjacent parts of Maine.
246|13|aAMC White Mountain guide.
246|13|aWhite Mountain guide.
246|13|aA.M.C. White Mountain guide
246|13|aAMC White Mountain guide.
260|  |aBoston, Mass. :|bAppalachian Mountain Club,|c1983.
300|  |a550 p.|bill., maps (some fold., some col.) ;|c16 cm.
362|0 |a1st- ed.; 1907- ; 25th ed. 1992
500|  |aTitle varies slightly.
650|  |aHiking
651| 0|aWhite Mountains (N.H. and Me.)|xDescription and travel|xGuide-books.
999|  |aCL000018321

It includes a little bit of fudging that my other MARC parsing code demands, but it works and is readable.
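
For anyone puzzling over the directory itself, here’s a minimal sketch of the two lookups the parser depends on: reading the record length and base address out of the 24-character leader, and splitting one 12-character directory entry into tag, length, and offset. The function names parse_leader() and parse_directory_entry() are made up for this illustration; the leader and entry strings are copied from the raw record above.

```php
<?php
// Hypothetical helper names, for illustration only.

// Pull out the two numbers in the leader that matter for parsing.
function parse_leader($leader) {
	return array(
		'record_length' => (int) substr($leader, 0, 5),  // leader positions 00-04
		'base_address'  => (int) substr($leader, 12, 5), // leader positions 12-16
	);
}

// Split one 12-character directory entry into its three parts.
function parse_directory_entry($entry) {
	return array(
		'tag'    => substr($entry, 0, 3),
		'length' => (int) substr($entry, 3, 4), // includes the field terminator
		'offset' => (int) substr($entry, 7, 5), // counted from the base address
	);
}

// Both strings come from the raw record shown earlier.
$leader = '00939cas  2200265Ia 4500';
$entry  = '245012200158';

print_r(parse_leader($leader));
print_r(parse_directory_entry($entry));
```

For this record the leader gives a record length of 939 and a base address of 265, and the 245 entry says the title field is 122 bytes long, starting 158 bytes past the base address.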

code, libraries, library, marc, marc directory, parsing, php, raw marc

OPAC Web Services Should Be Like Amazon Web Services
Casey Bisson, Wed, 30 Nov 2005 19:25:48 +0000
http://maisonbisson.com/blog/post/10956/opac-web-services-should-be-like-amazon-web-services/

No, I’m not talking about the interface our users see in the web browser (there’s enough argument about that). I’m talking about web services, the technologies that form much of the infrastructure for Web 2.0.

Once upon a time, the technology that displayed a set of data, let’s say catalog records, was inextricably linked to the technology that stored that set of data. As we started to fill our data repositories, we found it useful to import (and export) the data so that we could benefit from the work others had done and share our contributions with others. These processes were manual, or at least actively managed, and they depended on the notion that we had to have that information in our servers to be used by and displayed for our users.

Then technology evolved. Many applications now separate the components that store and manage the information from the components that display and manipulate it, and a few applications open up their data stores to the public via web services-based APIs. This is the concept that makes HousingMaps, ChicagoCrime, and Flickr Colr Pickr, among so many others, work.

Think about this for a moment: Our ILSs are inventory management systems, but our OPACs are (supposed to be) search and retrieval systems. The difference is obvious from here, but our vendors continue to operate as though you can’t have one without the other.

It might be easier to illustrate this point with an example or two.

Amazon Light is one of hundreds of applications based on Amazon’s web services. It connects Amazon’s inventory system with a custom-built search and retrieval system, and it works. The Amazon Light developers at Kokogiak didn’t need to build the inventory system; they only needed to think about ways to make the Amazon inventory more useful to you. Try it out: you might like the ability to search your local library (via some real hacks) or bookmark things via del.icio.us.

Or, you might not. Because Amazon allows anybody to access their catalog data, everybody has the opportunity to build a better, more usable catalog — or any other application that can benefit from the bibliographic details in it.

Take LibraryThing for example. It’s hard to explain what it is about people who read books that makes them want to list the books they own or have read or are interested in reading, but LibraryThing doesn’t worry about the why. It just answers the need. And because listing books, at least making a detailed list of books, can be time consuming, LibraryThing makes it easier by fetching the full details and book jacket from Amazon’s catalog. LibraryThing doesn’t need to “own” that info, it just needs access to it.

And what’s interesting is that LibraryThing is only one of a number of similar applications. Take a look at AllConsuming, Technorati’s popular books, and listal. These services connect Amazon’s catalog data with other data gathered from users or from web crawls, then they share the results. Here’s Ryan Eby’s lists of owned and wanted books, and here they are in RSS. Why RSS? Take a look at how he’s using the listal feed for his current reading list in his blog (lower-right column).

These are not technology demos. These are real applications. They are examples of how the world changes when you open up access to your catalog data. It’s what happens when we realize that the tools that store and manage our information are separate from the tools that display and manipulate that information.

Obviously, I’m about to make the (now-old) argument that we need to open our OPACs like this, but we also need to take the lesson that easy and loose is winning over detailed and difficult, even in XML representations of our catalog data. And after looking at all that’s been done so far, I want to ask: why not adopt Amazon’s web services XML schema?

Is it so bad that it was invented elsewhere? Is it a bad thing that there are perhaps hundreds of applications that are already using data in that format?

Maybe the answer to those questions is yes, but here’s where technology can serve us again: we don’t have to choose. We don’t need to bet on one technology while we watch others progress faster. Our systems can output the same catalog data in any number of different ways. RSS, OpenSearch, MARC-XML, ATOM, EAD, or DC are all possible, easy in fact — if the inventory server architecture is open enough to allow it.
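
As a rough illustration of that point, here’s a minimal sketch that renders one in-memory record two ways: as an RSS 2.0 item and as a fragment of Dublin Core elements. Everything in it is an assumption made for the example: the record array, its example.org link, and the function names are invented, and the output is bare element fragments with no feed wrapper or namespace declarations.

```php
<?php
// Hypothetical record and function names, for illustration only.
$record = array(
	'title'   => 'The A. M. C. White Mountain guide',
	'creator' => 'Appalachian Mountain Club',
	'link'    => 'http://example.org/catalog/18971047', // made-up URL
);

// Escape text for use inside an XML element.
function xml_escape($s) {
	return htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
}

// Render the record as an RSS 2.0 <item> fragment.
function as_rss_item($r) {
	return '<item><title>' . xml_escape($r['title']) . '</title>'
		. '<link>' . xml_escape($r['link']) . '</link></item>';
}

// Render the same record as simple Dublin Core elements.
function as_dublin_core($r) {
	return '<dc:title>' . xml_escape($r['title']) . '</dc:title>'
		. '<dc:creator>' . xml_escape($r['creator']) . '</dc:creator>'
		. '<dc:identifier>' . xml_escape($r['link']) . '</dc:identifier>';
}

echo as_rss_item($record), "\n";
echo as_dublin_core($record), "\n";
```

The stored record never changes; each format is one more small function on top of it, which is the kind of separation between storage and display argued for here.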

What do I really mean when I say library web services should be like Amazon web services? I mean they should be that accessible, that usable, that hackable. I mean libraries will benefit when people we’ve never met are spending their evenings building new applications to use our data. People are wondering how to get more programmers in libraries (example one, two), but I’m wondering how to make library systems more programmer friendly.

Fired up? Read more with my library catalogs should be like WordPress post, John Blyberg’s ILS customer bill of rights, and Ryan Eby’s open vs. turnkey discussion.
