MaisonBisson.com » networked information http://maisonbisson.com A bunch of stuff I would have emailed you about. Sat, 14 Nov 2009 20:14:03 +0000 http://wordpress.org/?v=2.8.5.2 en hourly 1 xFruits: “Compose Your Information System” http://maisonbisson.com/blog/post/12787/xfruits-%e2%80%9ccompose-your-information-system%e2%80%9d/ http://maisonbisson.com/blog/post/12787/xfruits-%e2%80%9ccompose-your-information-system%e2%80%9d/#comments Tue, 21 Oct 2008 01:20:50 +0000 Casey http://maisonbisson.com/?p=12787

Is xFruits a worthy replacement for Yahoo! Pipes?

http://maisonbisson.com/blog/post/12787/xfruits-%e2%80%9ccompose-your-information-system%e2%80%9d/feed/ 2
How Do I Create A Semantic Web Site? http://maisonbisson.com/blog/post/12023/how-do-i-create-a-semantic-web-site/ http://maisonbisson.com/blog/post/12023/how-do-i-create-a-semantic-web-site/#comments Wed, 09 Jan 2008 21:13:14 +0000 Casey Bisson http://maisonbisson.com/blog/post/12023/how-do-i-create-a-semantic-web-site

A member of the Web4lib mail list asked:

How do I create a semantic web site?

I know I have to use either RDF or OWL but do I use either of these to create a mark up language which I then use to create the web site or, with the semantic web do we move away from mark up languages altogether?

Am I right in thinking that OWL and RDF do not contain any information on how the document is to be displayed or presented? They do not seem to allow for style sheets.

Is the creation of a semantic web site completely different from anything that has gone before and I am stuck in an old way of looking at the problem? Are mark up languages a thing of the past as far as the Web is concerned?

Any clarification would be much appreciated.

RDF is certainly among the acronyms most identified with Semantic Web, but it’s not necessarily as complex as all that, and there are things we can do today to answer the question. Among the best of them (and one that will always deliver value), is to make sure our sites are marked up meaningfully. I know this sounds simple, but it’s surprising how few data-rich library sites take advantage of it.

Example: if you want all the titles of works on a page to be bold, don’t use the <b> tag. Instead, put a semantic class name on the element, like <span class="title">, and use CSS to make it look the way you want. Otherwise, our pages are just a jumble of bold and non-bolded stuff (think how much easier printed citations would be to parse if they were marked up that way).

The costs and benefits of semantic markup are frequently argued on a number of lists, but it’s worth noting that we no longer substitute ‘i’ for ‘1’ or ‘O’ for ‘0’ on our keyboards. Binary just doesn’t work as well with i and o.

It’s also worth looking into Microformats, a way of encoding semantic details into the data we use every day, using the tools we already have. Tantek explains them in a recent presentation.

One huge difference between the Microformats crowd and semantic webbers is the issue of human usability. That is, Microformats are built for humans first, machines second, in part because we just don’t have good and well distributed tools to use data that’s not formatted for human use, but also because it helps clear up errors and prevent gaming.

Tantek speaks of Microformats as a cornerstone of the “lower case semantic web” in this presentation from 2004, and ReadWriteWeb directly compares the two.

I’ve been working on some of these challenges myself, and have worked hard to make content presented in Scriblio semantically clear. Take a look at some of the markup in this example. All the bibliographic data is represented inside an unordered list and is parsable as XML. Here’s an excerpt of the ISBNs:

<li class="isbn"><h3 id="12023_isbn_1" >ISBN</h3>
	<ul>
		<li>1586421158</li>
		<li>9781586421151</li>
	</ul>
</li>
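
To show what “parsable as XML” buys us, here is a minimal sketch in Python that pulls the ISBNs out of the excerpt above using only the standard library. The function name `extract_isbns` is my own for illustration; nothing here is Scriblio code, and it assumes the fragment is well-formed.

```python
import xml.etree.ElementTree as ET

# The excerpt from the post, unchanged apart from whitespace.
EXCERPT = """
<li class="isbn"><h3 id="12023_isbn_1">ISBN</h3>
    <ul>
        <li>1586421158</li>
        <li>9781586421151</li>
    </ul>
</li>
"""


def extract_isbns(fragment: str) -> list[str]:
    """Return the text of every <li> nested under an element with class="isbn".

    Hypothetical helper for illustration; assumes well-formed markup.
    """
    root = ET.fromstring(fragment.strip())
    # The fragment's root may itself be the class="isbn" item, or contain it.
    if root.get("class") == "isbn":
        holders = [root]
    else:
        holders = root.findall(".//*[@class='isbn']")
    return [
        li.text.strip()
        for holder in holders
        for li in holder.findall("./ul/li")
        if li.text
    ]


print(extract_isbns(EXCERPT))
```

Because the class name says what the data is, the parser never has to guess which list on the page holds the ISBNs.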

That’s not to say the Semantic Web folks don’t see a difference. This article at Semantic Focus says they miss the point, but I side with Clay Shirky’s In Praise of Evolvable Systems. Speaking of how HTTP and HTML finally delivered on the promise of hyperlinks envisioned decades earlier, he notes:

Centrally designed protocols start out strong and improve logarithmically. Evolvable protocols start out weak and improve exponentially. It’s dinosaurs vs. mammals, and the mammals win every time. The Web is not the perfect hypertext protocol, just the best one that’s also currently practical. Infrastructure built on evolvable protocols will always be partially incomplete, partially wrong and ultimately better designed than its competition.

http://maisonbisson.com/blog/post/12023/how-do-i-create-a-semantic-web-site/feed/ 1
Not Invented Here http://maisonbisson.com/blog/post/11110/not-invented-here/ http://maisonbisson.com/blog/post/11110/not-invented-here/#comments Mon, 30 Jan 2006 02:07:33 +0000 Casey Bisson http://maisonbisson.com/blog/?p=11110

I couldn’t say it, but Alexander Johannesen could: libraries are the last bastions of the “not invented here syndrome” (scroll down just a bit, you’ll find it).

Between Alex’s post and mine, I don’t think there’s much to say except this: there may be five programmers in the world who know how to work with Z39.50, but several thousand who can build an Amazon API-based application in 15 minutes. What technology do you want to bet on?

library, libraries, standards, not invented here, z39.50, Alexander Johannesen, library standards, data interchange, networked information

http://maisonbisson.com/blog/post/11110/not-invented-here/feed/ 4
The Arrival of the Stupendous http://maisonbisson.com/blog/post/11100/privacy-and-libraries/ http://maisonbisson.com/blog/post/11100/privacy-and-libraries/#comments Tue, 24 Jan 2006 03:02:49 +0000 Casey Bisson http://maisonbisson.com/blog/?p=11100

We can be forgiven for not noticing, but the world changed not long ago.

Sometime after the academics gave up complaining about the apparent commercialization of the internet, and while Wall Street was licking its wounds after the first internet boom went bust, the world changed.

Around the time we realized that over 200 million Americans have internet access, that 94 million Americans use the internet on an average day, and that 80% of them believe the internet is a reliable source of information, we looked around and found that along with doing their banking, their taxes, and booking tickets for travel and movies, those users were making about five billion web searches each month.

Now that over 62 million households (55%) have internet-connected computers at home, and 87% of youth 12-17 are active online, is it any surprise that children may learn to type before they write? Bloggers are changing the way we get news, but it’s Craigslist that’s killing newspapers’ old cash cow.

And perhaps most amazingly, the internet became not simply a market, a bazaar, it became a component of almost every facet of our lives. Facebook and MySpace were born of this simple desire to be human, with other humans, regardless of medium. A desire that drives, to greater or lesser extents, services like Flickr and 43things.

As Kevin Kelly noted in Wired:

“The accretion of tiny marvels can numb us to the arrival of the stupendous.”

It may seem as unlikely as Norman Bel Geddes realizing his Futurama, or Chesley Bonestell achieving interplanetary flight, but what was once science fiction has become a part of our daily lives. The internet age is here. It is now. We just don’t know what it means yet.

And here’s the library connection: We will all struggle with questions of relevancy in this new world. Inevitably, this will require us to examine our core values and change our services, but the results will be magical. Never before has the technology been available to so thoroughly connect questions with answers, patrons with libraries.

library, libraries, future libraries, internet, internet usage, tiny marvels, stupendous, arrival, information age, science fiction, reality, social change, cultural effects, society, culture, networked information

http://maisonbisson.com/blog/post/11100/privacy-and-libraries/feed/ 19
US Census on Internet Access and Computing http://maisonbisson.com/blog/post/11088/us-census-on-internet-access-and-computing/ http://maisonbisson.com/blog/post/11088/us-census-on-internet-access-and-computing/#comments Mon, 16 Jan 2006 22:27:16 +0000 Casey Bisson http://maisonbisson.com/blog/?p=11088

Rebecca Lieb reports for ClickZ Stats that, based on US Census data (report), most Americans have PCs and web access:

Sixty-two million U.S. households, or 55 percent of American homes, had a Web-connected computer in 2003, according to just-released U.S. Census data. That’s up from 50 percent in 2001, and more than triple 1997’s 18 percent figure.

Home Web use continues to skew toward more affluent, younger and educated demographics. Both computer ownership and Web use are lower in households comprised of seniors, among blacks and Hispanics and among households comprised of people with less than a high school education.

Conversely, nearly all households earning over $100,000 — 95 percent — own at least one computer, and 92 percent are online. In homes earning under $40,000, the online figure plummets to 41 percent.

Children have benefited enormously from the growth of home computing. In 1993, only 32 percent of children had access to a computer at home. In 2003, 76 percent of school aged children had access to a home computer, and 83 percent of America’s 57 million schoolchildren used a PC at school. Again, these figures skew when ethnic and economic criteria are applied.

In 1997, only 7 percent of adults said they used the Web to get news, weather and sports. That figure spiked to 40 percent in 2003. Those seeking government or health information grew to 33 percent from 12 percent in 1997, and over half (55 percent) used the Web for e-mail and instant messaging, up from 12 percent 10 years earlier. Eighteen percent banked online; 12 percent looked for a job; nearly half sought product and/or service information and 32 percent purchased online, a radical jump over 2.1 percent in 1993.

Of the 45 percent of households without Web access in 2003, the most common reasons given were: “don’t need it/not interested” (39 percent); and “costs too much” or “no computer/computer inadequate” (each 23 percent). Two percent cited Web access elsewhere. Issues of privacy, child safety and security concerns were rarely cited, each accounting for only one percent of the reasons.

Homes in the West are the most wired at 67 percent, closely followed by the Northeast and Midwest. Southern households had the lowest percentage of online computers at 52 percent.

us census, census, internet usage, statistics, usage statistics, internet access, access, information age, networked information, critical mass, the coming information age

http://maisonbisson.com/blog/post/11088/us-census-on-internet-access-and-computing/feed/ 4
Microformats http://maisonbisson.com/blog/post/10729/microformats/ http://maisonbisson.com/blog/post/10729/microformats/#comments Thu, 08 Dec 2005 15:14:24 +0000 Casey Bisson http://www.maisonbisson.com/blog/?p=10729

Oliver Brown introduced me to microformats a while ago, then Ryan Eby got excited about them, then COinS-PMH showed how useful they could be for libraries, but I still haven’t done anything with them myself (other than beg Peter Binkley to release his COinS-PMH WordPress Plugin).

What are microformats? Garrett Dimon explains the theory:

When writing markup against deadlines and priorities, it’s easy to forget that somebody else will eventually have to maintain it. Conveniently, some of the central ideas behind microformats revolve around the fact that they are designed for humans first and created with simplicity in mind. This means you’ll have markup that is easy to understand and maintain for everyone, including:

  • The engineer integrating your code next week
  • You updating your code next month
  • The new guy taking over your job when you get promoted next year

Basically, microformats suggest the use of common class names for various XHTML elements. As it turns out, the hCard microformat is a convenient way of representing the data from vCards in XHTML. The convenience is by design, of course. Here’s an example:

<div class="vcard">
<a class="url fn" href="http://maisonbisson.com/">Casey Bisson</a>
<div class="org">MaisonBisson</div>
</div>

By standardizing the class names for this content, it’s easier to share and maintain stylesheets, re-use content, and read the content programmatically. Perhaps most importantly, it offers valuable tips to search engines crawling your site about what the data is, making it more findable.
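
Reading the content programmatically can be this small. The sketch below, in Python with only the standard library, walks the hCard above (typed with plain ASCII quotes) and collects the `fn`, `url`, and `org` properties. The `HCardParser` class and `parse_hcard` function are names I made up for illustration, and the code only handles a single, flat card like this one; it is not a full hCard parser.

```python
from html.parser import HTMLParser

HCARD = """
<div class="vcard">
<a class="url fn" href="http://maisonbisson.com/">Casey Bisson</a>
<div class="org">MaisonBisson</div>
</div>
"""


class HCardParser(HTMLParser):
    """Collect fn, url, and org from one flat hCard (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.card = {}
        self._pending = []  # property names waiting for the next text node

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        # On a link, the "url" property comes from href, not the link text.
        if "url" in classes and "href" in attrs:
            self.card["url"] = attrs["href"]
        self._pending = [c for c in classes if c in ("fn", "org")]

    def handle_data(self, data):
        text = data.strip()
        if text and self._pending:
            for name in self._pending:
                self.card[name] = text
            self._pending = []


def parse_hcard(markup):
    """Hypothetical helper: return the card's properties as a dict."""
    parser = HCardParser()
    parser.feed(markup)
    parser.close()
    return parser.card


print(parse_hcard(HCARD))
```

The parser knows nothing about this particular page. It only knows the shared class names, which is the whole point of standardizing them.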

The principles of microformats are such:

  • solve a specific problem
  • design for humans first, machines second
  • reuse building blocks from widely adopted standards
  • modularity / embeddability
  • enable and encourage decentralized and distributed development, content, services

The potential here for libraries is huge, but we should take seriously the caution that microformats be easy to use and the design rule that it be simple.

microformat, networked information, semantic web, microformats, library, libraries, metadata, data standards

http://maisonbisson.com/blog/post/10729/microformats/feed/ 6
Is Search Rank Group-think? http://maisonbisson.com/blog/post/10911/long-tail/ http://maisonbisson.com/blog/post/10911/long-tail/#comments Tue, 01 Nov 2005 16:22:10 +0000 Casey Bisson http://maisonbisson.com/blog/?p=10911

Way back in April 1997, Jakob Nielsen tried to educate us on Zipf Distributions and the power law, and their relationship to the web. This is where discussions of Chris Anderson’s Long Tail start, but the emphasis is on the whole picture, not just the many economic opportunities at the end of the tail.

Long tail.

Here’s how it works with hits to websites:

  • a few sites become popular and form the “big head” at the left
  • a few more sites form the slope
  • a huge number of websites score very low and form the “long tail”

Nielsen adds these examples:

  • a language has a few words (“the”, “and”, etc.) that are used extremely often, and a library has a few books that everybody wants to borrow (current bestsellers)
  • a language has quite a lot of words (“dog”, “house”, etc.) that are used relatively much, and a library has a good number of books that many people want to borrow (crime novels and such)
  • a language has an abundance of words (“Zipf”, “double-logarithmic”, etc.) that are almost never used, and a library has piles and piles of books that are only checked out every few years (reference manuals for Apple II word processors, etc.)
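
The head, slope, and tail described above can be put in rough numbers. This is a minimal sketch that assumes an idealized Zipf curve, where the site at rank r gets hits in proportion to 1/r. The 1,000-site population and the exponent of 1 are my assumptions for illustration, not figures from Nielsen.

```python
def zipf_shares(n_items: int, exponent: float = 1.0) -> list[float]:
    """Share of all hits for ranks 1..n when hits are proportional to 1 / rank**exponent."""
    weights = [1.0 / rank ** exponent for rank in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Assumed population: 1,000 sites on an idealized Zipf curve.
shares = zipf_shares(1000)
head = sum(shares[:10])    # the "big head": ranks 1-10
tail = sum(shares[100:])   # the "long tail": ranks 101-1000

print(f"top 10 sites: {head:.1%} of hits")
print(f"sites ranked 101-1000: {tail:.1%} of hits")
```

Under these assumptions the ten most popular sites collect more hits than the 900 least popular combined, yet the tail still adds up to a sizable share of the total.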

But the point here is about Google (or Yahoo, etc.) search results ranking, which puts enormous value in the number of incoming links to a page. It turns out that these links also follow a power-law distribution, and it is not uncommon to find complaints that Google’s PageRank recognizes popularity over other factors.
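
To make “number of incoming links” concrete, here is a deliberately naive sketch that ranks pages by counting the links pointing at them. It is not PageRank and not how any real search engine works; the link graph, the `.example` hostnames, and the function name are all invented for illustration.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "a.example": ["hub.example", "b.example"],
    "b.example": ["hub.example"],
    "c.example": ["hub.example", "a.example"],
    "hub.example": ["a.example"],
}


def rank_by_incoming_links(links):
    """Return (page, inbound_count) pairs, most-linked first."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):  # count each linking page once
            if target != source:     # ignore self-links
                counts[target] += 1
    # Sort by count (descending), then by name so ties come out stable.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


for page, inbound in rank_by_incoming_links(LINKS):
    print(page, inbound)
```

Even in this toy graph one page soaks up most of the links and ranks first, whatever the quality of its content, which is exactly the complaint.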

So it’s worth wondering: is popularity bad? Are popularity and quality mutually exclusive? Do search rankings represent some sort of global group-think?

Now put this in an academic library context and consider a student Googling for background for a research paper (think university freshmen the night before it’s due). Is it possible that linking patterns work like Wikipedia and tend to favor quality, or do they simply represent lowest common denominator popularity? Do search results reflect the sum of our altruistic linking intentions or our base crudity?

More about search ranking and libraries:


http://maisonbisson.com/blog/post/10911/long-tail/feed/ 1
Tech Tuesdays: Blogs and Blogging http://maisonbisson.com/blog/post/10909/blogs-and-blogging/ http://maisonbisson.com/blog/post/10909/blogs-and-blogging/#comments Tue, 25 Oct 2005 15:30:00 +0000 Casey Bisson http://maisonbisson.com/blog/?p=10909

Note: these are my presentation notes for a brown bag discussion with library faculty and university IT staff today. This may become a series…

More: my presentation slides and the Daily Show video.

Introduction

Public awareness of blogs seems to begin during the years of campaigning leading up to the 2004 election, but many people credit bloggers for swaying news coverage of Senator Trent Lott’s comments at Senator Strom Thurmond’s 100th birthday celebration in December 2002. Blog reaction was strong, and critical of both Lott’s comments and the limited coverage they received at first.

Media attention to blogs has grown since, with political blogs like the top rated Instapundit and Daily Kos among the most visible. A November 2004 episode of The West Wing featured blogs in the plot, and blog coverage has now become so common in cable news that The Daily Show did a piece on it.

Most everybody understands that “blog” is a truncated contraction of “web log,” but there’s little consensus on what a blog is. What is or is not a blog can’t be strictly defined by style, form, content, structure, or even the technology employed.

Types of Blogs

Political blogs get a lot of attention, but preliminary results of an MIT Media Lab survey of bloggers found that 73.62% (28,141) of respondents said that half or more of their posts were “personal.”

The Washingtonienne may be the most (in)famous of personal blogs, but LiveJournal, the blog hosting provider most identified with personal blogs, claims over 8 million user-bloggers (2.5 million “active in some way”). LiveJournal’s media relations page quotes a story that connects LiveJournaling with emo rock, saying:

The impulse to LiveJournal is the same as to go to the show and sing your heart out in front of strangers.

Though LJ blogs are derided by many as “mundane, banal or even primitive, inhabited mainly by teenagers producing thoughtless and valueless babble,” the service has also attracted serious study, including in peer production of popular culture and a mood study by Gilad Mishne of the University of Amsterdam. Danah Boyd’s study of linking patterns noted that personal bloggers are among the least likely to link to other sites in their postings and that there is an assumed familiarity between the blogger and reader.

Starfishncoffee is one LiveJournal blogger, but I would also describe the anonymous Feel-good Librarian as a personal blogger.

Other types of blogs:

Promotion — think “online book tour”

Niche News

Blogmedia — for profit blogs with editors and staff writers

Numbers

Technorati, an online service near the center of the “blogosphere,” claims to track 20 million blogs and 1.6 billion links. Though Technorati is not a blog, it offers services like blog searching, link tracking, and “tag” indexing. They also, of course, rank blogs based on the number of their incoming links.

Alexa, recognized as the Nielsen ratings for websites, allows users to graph site traffic and compare it against other sites. This graph for BoingBoing, the site Technorati lists as their #1 blog, shows they’re ranked #4,195 of all sites in the world. That ranking compares favorably with The Chicago Sun-Times’ #1,233 position.

According to the Jan 2005 Pew Internet & American Life Project report on blogs and blogging, of the 120 million U.S. adults who use the internet…

  • 27% (32 million) read blogs
  • 12% have posted comments or other material on blogs
  • 7% (8 million) say they have created a blog or web-based diary

James Torio’s MA Thesis in Advertising Design discusses the commercial and marketing aspects of blogs:

Blogs are effective for disseminating information because they have similar characteristics to word of mouth. People tend to listen to the recommendations of friends and trusted resources and many Bloggers are viewed this way by readers.

Torio suggests that companies ignore bloggers at their peril, and offers as examples accusations of censorship by Microsoft (handled successfully by acknowledgment, p.74) and the issue of Kryptonite locks that could be hacked with a Bic pen (completely ignored, p.77).

Blogs Are Conversations

Indeed, that personal and conversational nature of blogs seems to be hugely important in their success. Chris Bowers, in an informal study that looked at popularity of political blogs over time and their community-building features, like the ability to comment or contribute, found that such features are vital to growing readership.

Jenny Levine, The Shifted Librarian, points to Ann Arbor District Library as an example of an organization that makes good use of a blog in their relations with their patrons:

The posts are written in the first person and in a conversational tone, with the author’s first name to help stress the people in the library. The staff isn’t afraid to note problems with the new catalog, the web site, or anything else. Full transparency — nice. You can feel the level of trust building online. They respond to every comment that needs it, whether it’s a criticism, question, or suggestion. And some of the comments are fantastic. Users are even helping debug the new catalog.

Risks

The notion that blogging is a risky career move remains persistent. A rather negative story in The Chronicle of Higher Education noted (discussion):

A candidate’s blog is more accessible to the search committee than most forms of scholarly output. It can be hard to lay your hands on an obscure journal or book chapter, but the applicant’s blog comes up on any computer. Several members of our search committee found the sheer volume of blog entries daunting enough to quit after reading a few. Others persisted into what turned out, in some cases, to be the dank, dark depths of the blogger’s tormented soul; in other cases, the far limits of techno-geekdom; and in one case, a cat better off left in the bag.

Wikipedia, in fact, lists a few relatively well-known cases of bloggers fired for their blog postings, including former employees of Delta Airlines (for pictures) and Friendster (for discussing technology decisions).

Though causality can only be inferred, a 2004 MIT Media Lab Blog Survey found:

[T]he frequency with which a blogger writes highly personal things is positively and significantly correlated to how often they get in trouble because of their postings; [...] generally speaking, people have gotten in trouble both with friends and family as well as employers.

Legal

The Electronic Frontier Foundation’s Legal Guide for Bloggers and guide to blogging anonymously are worth a look. Also of relevance is a recent Delaware Supreme Court ruling that establishes precedent that readers are expected to use context to aid their evaluation of meaning.

Cold Water

The Google Economy

Web usability guru Jakob Nielsen describes blogs as “a Web-native content genre,” continuing:

[W]eblogs are part of an ecosystem (often called the Blogosphere) that serves as a positive feedback loop: Whatever good postings exist are promoted through links from other sites. More reader/writers see this good stuff, and the very best then get linked to even more. As a result, link frequency follows a Zipf distribution, with disproportionally more links to the best postings.

Google was quick to understand the value bloggers offered in identifying new resources to index, and what resources to index more often, a fact that led to their purchase of Blogger, recognized as the first blog service, in early 2003.

As it turns out, hyperlinks are among a blog’s most valuable products. Because the web makes it easy to do large-scale citation analysis, and because every popular search engine now uses the technique as a significant component of its search ranking, bloggers collectively hold great power over what we can or can’t find in those search engines.

The Google Economy is a recognition of the role linking and link-ability have on the propagation or success of an idea, product, or service. More discussion of this can be found in Peter Morville’s Ambient Findability, subtitled “what we find changes who we become.”

Blog Technologies

Get Yourself a Blog

tags: plymouth state university, lamson library, tech tuesday, tech tuesdays

http://maisonbisson.com/blog/post/10909/blogs-and-blogging/feed/ 2
Wikipedia and Libraries http://maisonbisson.com/blog/post/10609/wikipedia-and-libraries/ http://maisonbisson.com/blog/post/10609/wikipedia-and-libraries/#comments Fri, 03 Jun 2005 09:28:09 +0000 Casey Bisson http://www.maisonbisson.com/blog/?p=10609

Wikipedia seems to get mixed reviews in the academic world, but I don’t fully understand why. There are those that complain that they can’t trust the untamed masses with such an important task as writing and editing an encyclopedia, then there are others that say you can’t trust the experts with it either. For my part, I’ve come to love Wikipedia, despite having access to EB and other, more traditional sources. Why? Because it takes better advantage of the web than others, and unlike those commercial products, I don’t have to sign in to use it.

In fact, my only criticism of Wikipedia is that I’d like to use it more by integrating it into library resources. One example I use is of putting biography data from Wikipedia into our catalog search results displays. We have three books about Nikola Tesla, but why not include the first few paragraphs from the Wikipedia entry on him?

In my presentations I note the increasing tendency toward self service, even when we know we can get better answers/service by talking with somebody. This is true of travel (when was the last time you booked airfare through a travel agent?), and there are signs that suggest that it’s becoming true in libraries too. What I’m suggesting is that we need to improve our automated systems so that we can continue to serve our patrons even as their needs, expectations, and wants change.

In short, we need to transform our online systems into answer systems. So my criticism of Wikipedia is that there’s a lot of valuable data there that is difficult to automatically link to library data (author names, for instance, are rarely in the Library of Congress’s authoritative form). I don’t have any real solutions for this right now, and I see a lot of benefit to Wikipedia’s open (more human) form, so I haven’t really argued this much.

Still, I was pleased to see this note in TeleRead suggesting that librarians are “infiltrating” Wikipedia. The tip of the spear seems to be at Quaedam cuiusdam, where Peter Binkley is talking about some things, like OpenURL resolution, that could make Wikipedia a better resource for libraries. Good stuff.


http://maisonbisson.com/blog/post/10609/wikipedia-and-libraries/feed/ 5