Knowledge Base: Advanced Search

About Advanced Search

Resolute's advanced search query syntax draws from the syntax used by the Apache Lucene project, an industry standard in the field of information retrieval.

The following syntactical features can be used directly in any free text query input box, such as the one at the top of the main search UI. (Currently, this syntax cannot be used in any facet search boxes, such as tags or categories.)

These features are also accessible in a more user-friendly manner using the Advanced Search UI. To access the Advanced Search UI, click "Search by" underneath the free text query input box, then click "Advanced".

Terms

A query is broken up into terms and operators. There are two types of terms: Single Terms and Phrases.

A Single Term is a single word such as "test" or "hey".

A Phrase is a group of words surrounded by double quotes such as "hey you".

Multiple terms can be combined together with Boolean operators to form a more complex query (see below).
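
To make the distinction between Single Terms and Phrases concrete, here is a minimal Python sketch that splits a simple query into terms. It is only an illustration, not Resolute's actual parser: the name split_terms is hypothetical, and the sketch ignores field prefixes, operators and escaping.

```python
import re

# Illustrative assumption: a double-quoted run is one Phrase; any other
# whitespace-delimited run is one Single Term.
TERM_PATTERN = re.compile(r'"[^"]*"|\S+')


def split_terms(query: str) -> list[str]:
    """Split a simple query into Single Terms and quoted Phrases."""
    return TERM_PATTERN.findall(query)


print(split_terms('test "hey you" hey'))
# prints: ['test', '"hey you"', 'hey']
```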

Fields

Fielded data is supported. When performing a search you can either specify a field, or search all fields.

You can search any field by typing the field name followed by a colon ":" and then the term you are looking for.

As an example, let's assume there are two fields, title and abstract. If you want to find the document entitled "Resolute Innovation" which contains the text "the revolution is upon us", you can enter:

title:"resolute innovation" AND abstract:revolution

or

title:"resolute innovation" AND revolution

Since all fields are included by default, the field indicator is not required.

Note: The field is only valid for the term that it directly precedes, so the query

title:resolute innovation

will only find "resolute" in the title field. It will find "innovation" across all fields.
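
Because a field applies only to the term that directly follows it, multi-word values need to be quoted as a Phrase. If you build query strings in code, a small helper can take care of this. The sketch below is in Python; field_clause is a hypothetical name, not part of any Resolute API.

```python
def field_clause(field: str, term: str) -> str:
    """Return field:term, quoting the term as a Phrase when it has spaces."""
    if " " in term and not term.startswith('"'):
        term = f'"{term}"'
    return f"{field}:{term}"


query = " AND ".join([
    field_clause("title", "resolute innovation"),
    field_clause("abstract", "revolution"),
])
print(query)
# prints: title:"resolute innovation" AND abstract:revolution
```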

Term Modifiers

Term modifiers can be used to modify query terms to provide a wide range of searching options.

Wildcard Searches

Single and multiple character wildcard searches are supported within single terms (not within phrase queries).

To perform a single character wildcard search use the "?" symbol.

To perform a multiple character wildcard search use the "*" symbol.

The single character wildcard matches terms in which exactly one character takes the place of the "?". For example, to search for "text" or "test" you can use the search:

te?t

The multiple character wildcard matches 0 or more characters. For example, to search for test, tests or tester, you can use the search:

test*

You can also use wildcards in the middle of a term:

te*t

Note: You cannot use a * or ? symbol as the first character of a search.
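
One way to reason about which terms a wildcard pattern matches is to translate it into a regular expression. The Python sketch below does this under the rules described above ("?" is exactly one character, "*" is zero or more, and neither may come first). The function names are hypothetical, and the search engine itself may implement matching differently.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a wildcard term into an equivalent compiled regex."""
    if pattern[:1] in ("*", "?"):
        raise ValueError("a wildcard cannot be the first character")
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")       # exactly one character
        elif ch == "*":
            parts.append(".*")      # zero or more characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def matches(pattern: str, term: str) -> bool:
    """True when the whole term matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None


print(matches("te?t", "text"), matches("test*", "tester"), matches("te?t", "tet"))
# prints: True True False
```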

Fuzzy Searches

Fuzzy searches are supported, based on the Damerau-Levenshtein distance (a form of edit distance). To do a fuzzy search, use the tilde, "~", symbol at the end of a Single Term. For example, to search for a term similar in spelling to "roam" use the fuzzy search:

roam~

This search will find terms like foam and roams.

This uses the Damerau-Levenshtein distance to find all terms with a maximum of two changes, where a change is the insertion, deletion or substitution of a single character, or transposition of two adjacent characters.

The default edit distance is 2, but an edit distance of 1 should be sufficient to catch 80% of all human misspellings. To set the edit distance, add it after the tilde:

roam~1
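
To see how edits are counted, here is a minimal Python sketch of the distance calculation. It implements the "optimal string alignment" variant of Damerau-Levenshtein, which counts insertions, deletions, substitutions and transpositions of adjacent characters. The function names are hypothetical, and the search engine's own implementation may differ in detail.

```python
def edit_distance(a: str, b: str) -> int:
    """Optimal-string-alignment variant of Damerau-Levenshtein distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i                      # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j                      # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def fuzzy_match(term: str, candidate: str, max_edits: int = 2) -> bool:
    """True when candidate is within max_edits changes of term."""
    return edit_distance(term, candidate) <= max_edits


print(edit_distance("roam", "foam"), edit_distance("roam", "roams"))
# prints: 1 1
```

With these definitions, fuzzy_match("roam", "dream", 1) is False because "dream" is two changes away from "roam", while the default of 2 accepts it.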

Proximity Searches

Proximity searches are used to find words that are within a specific distance of each other. To do a proximity search, use the tilde, "~", symbol at the end of a Phrase. For example, to search for "resolute" and "innovation" within 10 words of each other in a document use the search:

"resolute innovation"~10

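The idea behind a proximity search can be sketched in Python as a check on word positions. This is a simplification for illustration only: within_distance is a hypothetical name, the text is split on whitespace, and the search engine's own distance calculation may count positions somewhat differently.

```python
def within_distance(text: str, first: str, second: str, max_distance: int) -> bool:
    """True when the two words occur at most max_distance positions apart."""
    tokens = text.lower().split()
    first_positions = [i for i, t in enumerate(tokens) if t == first]
    second_positions = [i for i, t in enumerate(tokens) if t == second]
    return any(
        abs(i - j) <= max_distance
        for i in first_positions
        for j in second_positions
    )


text = "Resolute drives steady innovation"
print(within_distance(text, "resolute", "innovation", 10))
# prints: True
print(within_distance(text, "resolute", "innovation", 2))
# prints: False
```
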
Boosting a Term

The relevance level of matching documents is calculated based, in part, on the terms found. To boost a term use the caret, "^", symbol with a boost factor (a number) at the end of the term you are searching. The higher the boost factor, the more relevant the term will be.

Boosting lets you control the relevance of a document by boosting one of the terms it matches. For example, if you are searching for

resolute innovation

and you want the term "resolute" to be more relevant, boost it using the ^ symbol along with the boost factor next to the term. You would type:

resolute^4 innovation

This will make documents with the term "resolute" appear more relevant. You can also boost Phrase Terms as in the example:

"resolute innovation"^4 "artificial intelligence"

By default, the boost factor is 1. Although the boost factor must be positive, it can be less than 1 (e.g. 0.2).
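
The effect of a boost on ranking can be illustrated with a toy scoring function in Python. This is not the relevance formula the search engine uses (that formula is not described here). It simply assumes a score equal to the sum of each term's boost multiplied by how often the term occurs, and toy_score is a hypothetical name.

```python
def toy_score(document: str, boosts: dict[str, float]) -> float:
    """Toy relevance: sum of (boost * occurrences) over the query terms."""
    tokens = document.lower().split()
    return sum(boost * tokens.count(term) for term, boost in boosts.items())


doc_a = "resolute plan"
doc_b = "innovation lab innovation"

# Query: resolute innovation (both boosts default to 1)
plain = {"resolute": 1, "innovation": 1}
# Query: resolute^4 innovation
boosted = {"resolute": 4, "innovation": 1}

print(toy_score(doc_a, plain), toy_score(doc_b, plain))
# prints: 1 2
print(toy_score(doc_a, boosted), toy_score(doc_b, boosted))
# prints: 4 2
```

Under this toy model, the boost moves the document containing "resolute" ahead of the one that mentions "innovation" twice.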

Boolean Operators

Boolean operators allow terms to be combined through logic operators. The operators AND, "+", OR, NOT and "-" are supported. (Note: Boolean operators must be ALL CAPS.)

OR

The OR operator links two terms and finds a matching document if either of the terms exists in a document. This is equivalent to a union using sets. The symbol || can be used in place of the word OR.

To search for documents that contain either "resolute innovation" or just "innovation" use the query:

"resolute innovation" OR innovation

AND

The AND operator is the default conjunction operator. This means that if there is no Boolean operator between two terms, the AND operator is used. The AND operator matches documents where both terms exist anywhere in the text of a single document. This is equivalent to an intersection using sets. The symbol && can be used in place of the word AND.

To search for documents that contain "resolute innovation" and "artificial intelligence" use the query:

"resolute innovation" "artificial intelligence"

or

"resolute innovation" AND "artificial intelligence"

+

The "+" or required operator requires that the term after the "+" symbol exist somewhere in a field of a single document.

To search for documents that must contain "resolute" and may contain "innovation" use the query:

+resolute innovation

NOT

The NOT operator excludes documents that contain the term after NOT. This is equivalent to a difference using sets. The symbol ! can be used in place of the word NOT.

To search for documents that contain "resolute innovation" but not "artificial intelligence" use the query:

"resolute innovation" NOT "artificial intelligence"

Note: The NOT operator cannot be used with just one term. For example, the following search will return no results:

NOT "resolute innovation"
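
The set comparisons used to describe OR, AND and NOT can be tried directly with Python sets. In the sketch below, the document IDs are made up for illustration; each set stands for the documents that contain a given term.

```python
# Hypothetical data: IDs of the documents that contain each term.
matches_for = {
    "resolute innovation": {1, 2},
    "innovation": {1, 2, 3, 4},
    "artificial intelligence": {2, 4, 5},
}

# "resolute innovation" OR innovation  -> union
either = matches_for["resolute innovation"] | matches_for["innovation"]

# "resolute innovation" AND "artificial intelligence"  -> intersection
both = matches_for["resolute innovation"] & matches_for["artificial intelligence"]

# "resolute innovation" NOT "artificial intelligence"  -> difference
excluded = matches_for["resolute innovation"] - matches_for["artificial intelligence"]

print(sorted(either), sorted(both), sorted(excluded))
# prints: [1, 2, 3, 4] [2] [1]
```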

-

The "-" or prohibit operator excludes documents that contain the term after the "-" symbol.

To search for documents that contain "resolute innovation" but not "artificial intelligence" use the query:

"resolute innovation" -"artificial intelligence"

Grouping

Parentheses can be used to group clauses or to form sub queries. This can be very useful if you want to control the boolean logic for a query.

To search for documents that contain "intelligence" and either "resolute" or "innovation", use the query:

(resolute OR innovation) AND intelligence

This eliminates any confusion and makes sure that "intelligence" must exist and that at least one of "resolute" or "innovation" must also exist.

Field Grouping

Parentheses can also be used to group multiple clauses to a single field.

To search for a title that contains both the word "resolute" and the phrase "artificial intelligence" use the query:

title:(+resolute +"artificial intelligence")
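
If you generate field groups in code, the pattern above can be produced by a short helper. The Python sketch below uses a hypothetical function name, field_group, and assumes every term in the group is required.

```python
def field_group(field: str, terms: list[str]) -> str:
    """Group several required terms under a single field."""
    required = " ".join(f"+{term}" for term in terms)
    return f"{field}:({required})"


print(field_group("title", ["resolute", '"artificial intelligence"']))
# prints: title:(+resolute +"artificial intelligence")
```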

Escaping Special Characters

Special characters that are part of the query syntax must be escaped. The current list of special characters includes

+ - && || ! ( ) { } [ ] ^ " ~ * ? : \

To escape one of these characters, put a \ before it. For example, to search for (1+1):2 use the query:

\(1\+1\)\:2
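
When search text comes from user input or another system, escaping can be automated. The Python sketch below puts a \ before every special character from the list above. The name escape_query is hypothetical, and the sketch assumes that escaping each & and | individually (rather than only the pairs && and ||) is acceptable.

```python
# Characters from the list above; & and | are escaped one at a time
# (an assumption of this sketch).
SPECIAL_CHARACTERS = set('+-&|!(){}[]^"~*?:\\')


def escape_query(text: str) -> str:
    """Put a backslash before every special query character in text."""
    return "".join("\\" + ch if ch in SPECIAL_CHARACTERS else ch for ch in text)


print(escape_query("(1+1):2"))
# prints: \(1\+1\)\:2
```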

Logical Forms

The basic unit of the advanced search interface is the query clause. A query clause is composed of a field selection, operator, modifier, and query string. Each query clause appears in the UI as a single line of inputs.

To generate a complete query string from multiple query clauses, the clauses must be combined. The method used to combine query clauses is specified by the "Logical Form" option.

Conjunctive

All query clauses are required. For a document to match the query, it must match all of the individual query clauses.

Disjunctive

All query clauses are optional. For a document to match the query, it must match at least one of the query clauses.

Conjunctive Normal Form

An "AND of ORs." Clauses are grouped by field. Each field must have at least one clause that matches.

Disjunctive Normal Form

An "OR of ANDs." Clauses are grouped by field. For a field to match, all of its clauses must match. For the entire query to match a document, at least one field must match.
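
The four logical forms can be summarized with a short Python sketch that combines clauses into one query string. This is an illustration of the combination rules only, not the code behind the Advanced Search UI: the function name combine and the form names are hypothetical, and each clause is reduced to a (field, query string) pair, leaving out its operator and modifier.

```python
from collections import defaultdict


def combine(clauses: list[tuple[str, str]], logical_form: str) -> str:
    """Combine (field, query string) clauses according to a logical form."""
    rendered = [f"{field}:{query}" for field, query in clauses]
    if logical_form == "conjunctive":
        return " AND ".join(rendered)        # every clause required
    if logical_form == "disjunctive":
        return " OR ".join(rendered)         # any clause may match

    # The normal forms first group clauses by field.
    by_field = defaultdict(list)
    for field, query in clauses:
        by_field[field].append(f"{field}:{query}")

    if logical_form == "cnf":                # AND of ORs
        inner, outer = " OR ", " AND "
    elif logical_form == "dnf":              # OR of ANDs
        inner, outer = " AND ", " OR "
    else:
        raise ValueError(f"unknown logical form: {logical_form}")
    groups = ["(" + inner.join(group) + ")" for group in by_field.values()]
    return outer.join(groups)


clauses = [("title", "resolute"), ("title", "innovation"), ("abstract", "revolution")]
print(combine(clauses, "cnf"))
# prints: (title:resolute OR title:innovation) AND (abstract:revolution)
print(combine(clauses, "dnf"))
# prints: (title:resolute AND title:innovation) OR (abstract:revolution)
```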