SimpleQueryParser
SimpleQueryParser is used to parse human readable query syntax.

The main idea behind this parser is that a person should be able to type whatever they want to represent a query, and this parser will do its best to interpret what to search for, no matter how poorly composed the request may be. Tokens are considered to be any of a term, phrase, or subquery for the operations described below. Whitespace, including ' ', '\n', '\r' and '\t', and certain operators may be used to delimit tokens: ( ) + | "

Any errors in query syntax will be ignored and the parser will attempt to decipher what it can; however, this may mean odd or unexpected results.

Query Operators

- '+' specifies AND operation: token1+token2
- '|' specifies OR operation: token1|token2
- '-' negates a single token: -token0
- '"' creates phrases of terms: "term1 term2 ..."
- '*' at the end of terms specifies prefix query: term*
- '~N' at the end of terms specifies fuzzy query: term~1
- '~N' at the end of phrases specifies near query: "term1 term2"~5
- '(' and ')' specify precedence: token1 + (token2 | token3)

The default operator is OR if no other operator is specified. For example, the following will OR token1 and token2 together:

  token1 token2

Normal operator precedence will be simple order from right to left. For example, the following will evaluate token1 OR token2 first, then AND with token3:

  token1 | token2 + token3

Escaping

An individual term may contain any possible character, with certain characters requiring escaping using a '\'. The following characters will need to be escaped in terms and phrases: + | " ( ) ' \

The '-' operator is a special case. On individual terms (not phrases), a '-' that is the first character of a term must be escaped; however, any '-' characters beyond the first character do not need to be escaped. For example:

  -term1   -- Specifies NOT operation against term1
  \-term1  -- Searches for the term -term1
  term-1   -- Searches for the term term-1
  term\-1  -- Searches for the term term-1

The '*' operator is a special case. On individual terms (not phrases), a '*' that is the last character of a term must be escaped; however, any '*' characters before the last character do not need to be escaped:

  term1*   -- Searches for the prefix term1
  term1\*  -- Searches for the term term1*
  term*1   -- Searches for the term term*1
  term\*1  -- Searches for the term term*1

Note that the above examples consider the terms before text processing.
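The escaping rules above can be sketched as a small helper that prepares a literal term before it is placed in a query string. This is only an illustration under stated assumptions: the class name SimpleQueryEscaper and the method escapeTerm are hypothetical and are not part of SimpleQueryParser. The sketch covers only the rules listed in this description (the always-escaped characters, a leading '-', and a trailing '*'), and it applies to individual terms, not phrases.

```java
// Hypothetical helper, not part of the parser's API: escapes one literal term
// according to the escaping rules described in the text above.
final class SimpleQueryEscaper {

    // Characters the description lists as always needing a backslash: + | " ( ) ' \
    private static final String ALWAYS_ESCAPED = "+|\"()'\\";

    private SimpleQueryEscaper() {}

    static String escapeTerm(String term) {
        StringBuilder out = new StringBuilder(term.length() + 4);
        int last = term.length() - 1;
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            // '-' is only an operator as the first character of a term.
            boolean leadingMinus = (c == '-' && i == 0);
            // '*' is only an operator as the last character of a term.
            boolean trailingStar = (c == '*' && i == last);
            if (ALWAYS_ESCAPED.indexOf(c) >= 0 || leadingMinus || trailingStar) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }
}
```

With this sketch, escapeTerm("-term1") yields \-term1 and escapeTerm("term1*") yields term1\*, while term-1 and term*1 pass through unchanged, matching the examples above. Escaping only where the rules require it keeps the resulting query string close to what the user typed.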