Boolean Searching in WPopac

WPopac takes advantage of MySQL’s full-text indexing and relevance-ranked searching (go ahead, try it), including boolean searching (on MySQL 4.0.1 and later). Here are some details and examples taken wholesale from the MySQL manual:

  • +

    A leading plus sign indicates that this word must be present in each result returned.
     

  • -

    A leading minus sign indicates that this word must not be present in any of the results that are returned.
     

  • > <

    These two operators are used to change a word’s contribution to the relevance value that is assigned to a result. The > operator increases the contribution and the < operator decreases it.
     

  • ( )

    Parentheses group words into subexpressions. Parenthesized groups can be nested.
     

  • ~

    A leading tilde acts as a negation operator, causing the word’s contribution to the result’s relevance to be negative. This is useful for marking “noise” words. A result containing such a word is rated lower than others, but is not excluded altogether, as it would be with the - operator.
     


  • *

    The asterisk serves as the truncation (or wildcard) operator. Unlike the other operators, it should be appended to the word to be affected. Words match if they begin with the word preceding the * operator.
 
  • "

    A phrase that is enclosed within double quote (") characters matches only results that contain the phrase literally, as it was typed.
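
To make the operator list concrete, here is a rough Python sketch that labels each term in a flat search string with the role its operator gives it. This is only an illustration of the rules above: it is not how MySQL parses the string, it ignores parentheses, and the names `describe_terms` and `LEADING_OPERATORS` are made up for this example.

```python
import re

# Leading operators from the list above and the role each one assigns.
LEADING_OPERATORS = {
    "+": "required",
    "-": "excluded",
    ">": "boosted",
    "<": "demoted",
    "~": "penalized",
}

# A quoted phrase (with an optional leading operator), or any run of
# non-space characters.
TOKEN = re.compile(r'[-+<>~]?"[^"]*"|\S+')


def describe_terms(query):
    """Return (term, role) pairs for a flat boolean search string.

    Illustration only: parentheses are not handled.
    """
    described = []
    for token in TOKEN.findall(query):
        role = "optional"  # no operator: the word is optional but adds relevance
        if token[0] in LEADING_OPERATORS:
            role = LEADING_OPERATORS[token[0]]
            token = token[1:]
        if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
            described.append((token.strip('"'), role + " phrase"))
        elif token.endswith("*"):
            described.append((token[:-1], role + " prefix"))
        else:
            described.append((token, role))
    return described
```

For example, `describe_terms('+apple -macintosh')` returns `[('apple', 'required'), ('macintosh', 'excluded')]`.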
     

In short, it supports the quotes and plus/minus operators that people already know from Google and other search engines. The following examples demonstrate some search strings that use boolean operators:

  • apple banana

    Find records that contain at least one of the two words.
     

  • +apple +juice

    Find records that contain both words.
     

  • +apple macintosh

    Find records that contain the word “apple”, but rank records higher if they also contain “macintosh”.
     

  • +apple -macintosh

    Find records that contain the word “apple” but not “macintosh”.
     

  • +apple ~macintosh

    Find records that contain the word “apple”, but if a record also contains the word “macintosh”, rate it lower than if it does not. This is “softer” than a search for ‘+apple -macintosh’, for which the presence of “macintosh” causes the record not to be returned at all.
     

  • +apple +(>turnover <strudel)

    Find records that contain the words “apple” and “turnover”, or “apple” and “strudel” (in any order), but rank “apple turnover” higher than “apple strudel”.
     

  • apple*

    Find records that contain words such as “apple”, “apples”, “applesauce”, or “applet”.
     

  • "some words"

    Find records that contain the exact phrase “some words” (for example, records that contain “some words of wisdom” but not “some noise words”).
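
For the curious, a search string like the ones above ends up inside a `MATCH ... AGAINST (... IN BOOLEAN MODE)` clause. The Python sketch below shows roughly what that query could look like. It is not WPopac's actual code: the table name `catalog`, the columns `title` and `body`, and the function `build_boolean_search` are all hypothetical, and it assumes a FULLTEXT index covering those columns. The `%s` placeholders follow the parameter style used by the common Python MySQL drivers, so the search string is passed as data rather than pasted into the SQL.

```python
def build_boolean_search(search_string, table="catalog", columns=("title", "body")):
    """Build a parameterized MySQL boolean full-text query.

    The table and column names are hypothetical placeholders and must be
    trusted identifiers; only the search string is passed as a parameter.
    """
    column_list = ", ".join(columns)
    sql = (
        f"SELECT *, MATCH ({column_list}) AGAINST (%s IN BOOLEAN MODE) AS relevance "
        f"FROM {table} "
        f"WHERE MATCH ({column_list}) AGAINST (%s IN BOOLEAN MODE) "
        # Boolean mode does not sort by relevance on its own, so ask for it.
        "ORDER BY relevance DESC"
    )
    # The search string appears twice: once for scoring, once for filtering.
    return sql, (search_string, search_string)
```

With a driver cursor this would be run as something like `cursor.execute(*build_boolean_search('+apple -macintosh'))`.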
     

Now I really need to configure my own version of MySQL without the over-reaching stopword list.