The first session I joined at THATCamp was Aditi Muralidharan’s text mining boot camp, and the topic seemed to set my agenda for the rest of the event (though I wish Aditi had also hosted her proposed data visualization session).
- Aditi’s blog: mininghumanities.com.
- If I understood correctly, much of Aditi’s presentation and experience is based on the Stanford Parser. Unfortunately, the project’s licensing seems confusing: it’s released under the GPL, but they claim a separate license is required for commercial use.
- BookLamp was named as a site that applies language processing to book texts.
- LingPipe was named as another tool for language processing, though it too is encumbered with confusing licensing terms.
- Complexity Intelligence offers a named entity recognition API, among other APIs, available under a usage license.
- OpenDover offers a sentiment analysis API for a small fee.
- OpenAmplify is also in the sentiment analysis game.
- OpenCalais was discussed, but dismissed because the free usage license seems to claim ongoing ownership of the data derived from submitted texts.