Overview
|
The search feature in the MDK allows you to enter a search string and
start the search by clicking the "Search" button.
All results are displayed with the title of the document, which serves
as a link, and a short description. Click the link to view the document.
|
Note: When you use the search
for the first time, the search index is built. This takes about 5-10
seconds. After that, the response to your search request is immediate.
|
Fields
|
The MDK search supports two fields. The "title" field contains
the title of the document and the "summary" field contains
the short description of the document.
You can search any field by typing the field name followed by a colon
":" and then the term you are looking for.
As an example, let's assume a Lucene index contains two fields, title
and text, and text is the default field. If you want to find the document
entitled "The Right Way" which contains the text "don't go this way",
you can enter:
title:"The Right Way" AND summary:go
or
title:"Do it right" AND right
Since text is the default field, the field indicator is not required.
Note: The field is only valid for the term that it directly precedes,
so the query
title:Do it right
will only find "Do" in the title field. It will find "it" and "right"
in the default field (in this case the text field).
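The quoting rule above can be illustrated with a small helper that builds a fielded clause. This is only a sketch: the function name field_term is hypothetical and is not part of the MDK; it simply assembles query strings in the syntax described in this section.

```python
def field_term(field: str, text: str) -> str:
    """Return a fielded query clause (hypothetical helper, not an MDK API)."""
    # A field name applies only to the term directly after it, so a
    # multi-word value is wrapped in double quotes to keep the whole
    # phrase inside that field.
    needs_quotes = " " in text.strip()
    value = f'"{text}"' if needs_quotes else text
    return f"{field}:{value}"


query = field_term("title", "The Right Way") + " AND " + field_term("summary", "go")
print(query)
```

Without the quotes, only the first word would be searched in the title field and the remaining words would fall back to the default field.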
|
|
Term Modifiers
|
The MDK search supports modifying query terms to provide a wide range
of searching options.
Wildcard Searches
|
Single and multiple character wildcard searches are supported.
To perform a single character wildcard search use the "?"
symbol.
To perform a multiple character wildcard search use the "*"
symbol.
The single character wildcard search looks for terms that
match the pattern with the single character replaced. For example,
to search for "text" or "test" you can use the search:
te?t
Multiple character wildcard searches look for 0 or more
characters. For example, to search for test, tests or tester,
you can use the search:
test*
You can also use wildcards in the middle of a
term.
Note: You cannot use a * or ? symbol as the first character
of a search.
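To get a feel for which terms a wildcard pattern matches, the following sketch uses Python's standard fnmatch module, whose "?" and "*" happen to have the same meaning as described above. It is an illustration of the matching rules only, not the MDK search implementation, and the list of terms is made up. (Unlike the MDK syntax, fnmatch also treats "[" as special, so the sketch avoids it.)

```python
from fnmatch import fnmatchcase

# Made-up index terms, used only for illustration.
terms = ["test", "text", "tests", "tester", "toast"]


def matching_terms(pattern, candidates):
    """Return the candidates that match a ?/* wildcard pattern."""
    return [term for term in candidates if fnmatchcase(term, pattern)]


single = matching_terms("te?t", terms)   # "?" stands for exactly one character
multi = matching_terms("test*", terms)   # "*" stands for zero or more characters
print(single)
print(multi)
```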
|
|
Fuzzy Searches
|
Fuzzy searches are based on the Levenshtein Distance, or
Edit Distance, algorithm. To do a fuzzy search use the tilde,
"~", symbol at the end of a single word term. For example,
to search for a term similar in spelling to "roam" use the
fuzzy search:
roam~
This search will find terms like foam and roams.
Note: Terms found by the fuzzy search will automatically get
a boost factor of 0.2.
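The Levenshtein distance counts the minimum number of single-character insertions, deletions and substitutions needed to turn one word into another. The sketch below is a textbook implementation, shown only to explain why "foam" and "roams" count as similar to "roam"; it makes no claim about how the MDK search computes or thresholds the distance.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between two strings."""
    # previous[j] holds the distance between the first i-1 characters
    # of a and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


print(levenshtein("roam", "foam"))   # one substitution
print(levenshtein("roam", "roams"))  # one insertion
```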
|
|
Proximity Searches
|
MDK search supports finding words that are within a specific
distance of each other. To do a proximity search use the tilde, "~",
symbol at the end of a phrase. For example, to search for
"apache" and "jakarta" within 10 words of each other in a
document use the search:
"jakarta apache"~10
|
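The idea behind a proximity search can be sketched as a check on word positions. The function below is a simplified illustration: it assumes whitespace-separated words and measures distance as the difference between word positions, which may not match the exact counting rule the MDK search uses. The name within_distance and the sample sentence are hypothetical.

```python
def within_distance(text: str, first: str, second: str, max_distance: int) -> bool:
    """Return True if both words occur at most max_distance positions apart."""
    words = text.lower().split()
    first_positions = [i for i, word in enumerate(words) if word == first]
    second_positions = [i for i, word in enumerate(words) if word == second]
    # Any pair of occurrences that is close enough counts as a match.
    return any(
        abs(i - j) <= max_distance
        for i in first_positions
        for j in second_positions
    )


document = "apache hosts many projects and jakarta was one of them"
print(within_distance(document, "apache", "jakarta", 10))
print(within_distance(document, "apache", "jakarta", 3))
```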
|
Boosting a Term
|
MDK search provides the relevance level of matching documents
based on the terms found. To boost a term use the caret, "^",
symbol with a boost factor (a number) at the end of the term
you are searching. The higher the boost factor, the more relevant
the term will be.
Boosting allows you to control the relevance of a document
by boosting its term. For example, if you are searching for
jakarta apache
and you want the term "jakarta" to be more relevant, boost
it using the ^ symbol along with the boost factor next to
the term. You would type:
jakarta^4 apache
This will make documents with the term jakarta appear more
relevant. You can also boost phrase terms as in the example:
"jakarta apache"^4 "jakarta lucene"
By default, the boost factor is 1. Although the boost factor
must be positive, it can be less than 1 (e.g. 0.2).
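The effect of a boost factor can be shown with a toy scoring function. This is deliberately simplistic and is not the formula the MDK search uses: it assumes a document's score is just the sum of each term's occurrence count multiplied by its boost. The function name toy_score and the sample texts are made up.

```python
def toy_score(text: str, boosts: dict) -> float:
    """Toy relevance score: occurrences of each term times its boost factor."""
    words = text.lower().split()
    return sum(boost * words.count(term) for term, boost in boosts.items())


doc_a = "jakarta news"
doc_b = "apache news"

# Without boosting (factor 1 for both terms) the two documents tie.
unboosted = {"jakarta": 1, "apache": 1}
# Equivalent in spirit to the query: jakarta^4 apache
boosted = {"jakarta": 4, "apache": 1}

print(toy_score(doc_a, unboosted), toy_score(doc_b, unboosted))
print(toy_score(doc_a, boosted), toy_score(doc_b, boosted))
```

With the boost applied, the document containing "jakarta" scores higher than the one containing only "apache", so it would be ranked first.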
|
|
|
|
Boolean operators
|
Boolean operators allow terms to be combined through logical operators.
MDK search supports AND, "+", OR, NOT and "-" as Boolean operators.
Note:
Boolean operators must be written in capital (uppercase) letters.
OR
|
The OR operator is the default conjunction operator. This
means that if there is no Boolean operator between two terms,
the OR operator is used. The OR operator links two terms and
finds a matching document if either of the terms exists in
a document. This is equivalent to a union using sets. The
symbol || can be used in place of the word OR.
To search for documents that contain either "jakarta apache"
or just "jakarta" use the query:
"jakarta apache" jakarta
or
"jakarta apache" OR jakarta
|
|
AND
|
The AND operator matches documents where both terms exist
anywhere in the text of a single document. This is equivalent
to an intersection using sets. The symbol && can be
used in place of the word AND.
To search for documents that contain "jakarta apache" and
"jakarta lucene" use the query:
"jakarta apache" AND "jakarta lucene"
|
|
+
|
The "+" or required operator requires that the term after
the "+" symbol exists somewhere in the field of a single
document.
To search for documents that must contain "jakarta" and may
contain "lucene" use the query:
+jakarta lucene
|
|
NOT
|
The NOT operator excludes documents that contain the term
after NOT. This is equivalent to a difference using sets.
The symbol ! can be used in place of the word NOT.
To search for documents that contain "jakarta apache" but
not "jakarta lucene" use the query:
"jakarta apache" NOT "jakarta lucene"
Note: The NOT operator cannot be used with just one term.
For example, the following search will return no results:
NOT "jakarta apache"
|
|
-
|
The "-" or prohibit operator excludes documents that contain
the term after the "-" symbol.
To search for documents that contain "jakarta apache" but
not "jakarta lucene" use the query:
"jakarta apache" -"jakarta lucene"
|
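The set analogies used for OR, AND and NOT can be made concrete with Python sets. The sketch below uses a made-up collection of four documents and a plain substring test in place of real phrase matching, so it illustrates only the logic of the operators, not the MDK search itself.

```python
# Made-up documents keyed by an id, for illustration only.
docs = {
    1: "jakarta apache",
    2: "jakarta lucene",
    3: "jakarta apache and jakarta lucene",
    4: "tomcat",
}


def containing(phrase):
    """Return the ids of documents whose text contains the phrase."""
    return {doc_id for doc_id, text in docs.items() if phrase in text}


apache_docs = containing("jakarta apache")
lucene_docs = containing("jakarta lucene")

either = apache_docs | lucene_docs        # OR: union
both = apache_docs & lucene_docs          # AND: intersection
apache_only = apache_docs - lucene_docs   # NOT or "-": difference

print(either, both, apache_only)
```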
|
|
|
Grouping
|
The MDK search supports using parentheses to group clauses to form
sub queries. This can be very useful if you want to control the boolean
logic for a query.
To search for either "jakarta" or "apache" and "website" use the
query:
(jakarta OR apache) AND website
This eliminates any confusion and ensures that website must
exist and that either jakarta or apache may exist.
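Why the parentheses matter can be seen by treating each term as the set of documents that contain it. The document ids below are invented, and the second expression shows only one other way the same three terms could be combined; it is not a statement about how the MDK search reads an ungrouped query.

```python
# Invented document ids for each term.
jakarta = {1, 2}
apache = {3}
website = {2, 3, 4}

# (jakarta OR apache) AND website
grouped = (jakarta | apache) & website

# jakarta OR (apache AND website): a different grouping of the same terms
other_grouping = jakarta | (apache & website)

print(grouped)
print(other_grouping)
```

Document 1 contains jakarta but not website, so it appears only in the second result. The explicit grouping is what guarantees that website is required.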
|
|
Escaping Special Characters
|
The MDK search supports escaping special characters that are part
of the query syntax. The current list of special characters is:
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \
To escape these characters use the \ before the character. For example,
to search for (1+1):2 use the query:
\(1\+1\)\:2
|
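If search strings are generated by a script, the escaping rule can be applied automatically. The helper below is a sketch, not an MDK function: the name escape_query is hypothetical, and it assumes that escaping a single & or | is harmless, even though the list above names them only as the pairs && and ||.

```python
# Characters that are part of the query syntax, as listed above.
SPECIAL_CHARACTERS = set('+-&|!(){}[]^"~*?:\\')


def escape_query(text: str) -> str:
    """Put a backslash in front of every special character (hypothetical helper)."""
    return "".join(
        "\\" + ch if ch in SPECIAL_CHARACTERS else ch
        for ch in text
    )


print(escape_query("(1+1):2"))
```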
|
|