Search in Reveal 11: Reveal Query Language, Part 2

George Socha

Today’s Search in Reveal 11 post continues with an examination of Reveal Query Language (RQL), picking up where Search in Reveal 11: Reveal Query Language, Part 1 left off. New with Reveal 11, RQL enables users to construct powerful and complex queries that can search across both metadata and text.

In earlier posts in this series, I wrote about combining keyword and concept searches (Search in Reveal 11: Keyword, Concept, or Both), searching with just a few clicks (Search in Reveal 11: What a Click Can Do), using the new Term List Search function in Reveal 11 (Search in Reveal 11: Term List Search), filtering on metadata (Search in Reveal 11: Metadata Filtering), and RQL (Search in Reveal 11: Reveal Query Language, Part 1).

In the first post about RQL, I looked at three fundamental aspects of RQL: searching with terms, phrases, and special characters; moving past stop words; and searching metadata.

In this post, I take a look at search validation; basic RQL operators, including searching for content in a metadata field; additional RQL operators, in particular wildcards and dates and times; and combining operators.

By using the RQL capabilities, you can tailor your search quickly, easily, and precisely; get previews of totals; and be notified if there are issues with the query you formulated.

Validation

RQL validates keyword search syntax before you submit a search. It parses the query as you entered it in the search bars, identifying errors and letting you know what is wrong.

The following search contains an extra greater-than symbol and is not valid: sent_date>>2001-01-01
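To make the idea of a pre-submit syntax check concrete, here is a minimal Python sketch. It is not Reveal's validator, whose internals are not described in this post. The function name is_valid_comparison and the regular expression are assumptions made purely for illustration, and the sketch handles only simple field-operator-value comparisons.

```python
import re

# Hypothetical pattern for illustration only: a field name, exactly one
# comparison operator (>=, <=, > or <), then a value that does not begin
# with another operator character or whitespace.
COMPARISON = re.compile(r"\w+(>=|<=|>|<)[^<>=\s]\S*")


def is_valid_comparison(query: str) -> bool:
    """Return True if the query looks like field, operator, value."""
    return COMPARISON.fullmatch(query.strip()) is not None


for q in ("sent_date>>2001-01-01", "sent_date>2001-01-01"):
    print(q, "->", "valid" if is_valid_comparison(q) else "not valid")
```

The doubled greater-than symbol fails because nothing in the pattern allows a second operator character right after the first.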

If I enter that query and add it to my search, I get this message:

If I delete the extra greater-than symbol, the system lets me know right away whether I have properly reformulated the search:

Basic RQL Operators

RQL uses “operators”, as do most query languages. An operator is a word, a character, or a string of characters that is used in a search query to narrow the focus of the search. Familiar examples include such Boolean search operators as AND, OR, NOT, and AND NOT.

A key difference between RQL operators in Reveal 11 and operators in many other eDiscovery platforms is that RQL operators can be used not just with text – other platforms allow for that – but with metadata as well as with combinations of text and metadata. I will demonstrate this in the examples below.

RQL uses the following operators. RQL uses them in this order, with the top operator in the list getting the highest priority and the bottom one the lowest:

 

Searching for Content in a Metadata Field

To start, here is a search for content in a metadata field:

With this query, I asked the platform to search for all documents where the subject line (a metadata field) contains the word congrats. The search I used was subject::congrats. The search returned 187 documents out of a total population of 1,216,199 documents.

Let’s break down the query. The first part of the query, subject, tells the platform what metadata field to search.

The second part of the query, the double colons ::, instructs the platform to perform an RQL search.

The third and final part of the query, congrats, tells the platform what string of characters to search for in the designated field. Here, I have told the system to search for any documents where the subject metadata field contains the characters congrats. If I wanted to search for a phrase such as re: congrats instead of a single word, I would have put the phrase in quotation marks like this: subject::“re: congrats”.
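The three-part breakdown above can be sketched in a few lines of Python. This is only an illustration of how such a query divides into a field and a search string. It is not how Reveal parses RQL, and the names FieldQuery and split_field_query are hypothetical.

```python
from typing import NamedTuple


class FieldQuery(NamedTuple):
    field: str  # the metadata field to search, e.g. "subject"
    value: str  # the word or phrase to look for in that field


def split_field_query(query: str) -> FieldQuery:
    """Split 'field::value' at the first double colon (illustrative only)."""
    field, sep, value = query.partition("::")
    field, value = field.strip(), value.strip()
    if not sep or not field or not value:
        raise ValueError(f"expected field::value, got {query!r}")
    # Drop one pair of surrounding straight or curly quotation marks.
    if len(value) >= 2 and value[0] in '"“' and value[-1] in '"”':
        value = value[1:-1]
    return FieldQuery(field, value)


print(split_field_query("subject::congrats"))
print(split_field_query('subject::"re: congrats"'))
```

Splitting at the first double colon keeps the single colon inside the phrase re: congrats intact.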

First Variation

Next, I searched for all documents where the subject line contained only the word congrats. This time, the search returned a smaller group, just 62 documents:

Second Variation

After that, I searched for all documents where the subject line contained the word congrats and where it did not contain only that word. With this search, I got back 125 documents, all documents where the subject line contained both congrats and something else, such as:

Third Variation

Finally, I asked the system to search both text and metadata, looking for documents where (1) the document was sent before 2005 (searching a metadata field) and (2) the word raptor is found anywhere in the document (searching the text of the document). I got back 626 documents:

Additional RQL Operators

In addition to the operators discussed above, RQL lets you use operators to do the following:

Perform a Wildcard Search

With RQL, you can use an asterisk to perform wildcard searches. Here are four examples:

  • word*: By placing an asterisk at the end of a word, you can search for any document that contains a string of characters beginning with that word. A search for trad* in the Enron data returned documents containing trade, tradewave.com, traded, trademark, trademarks, traders, trades, tradewave, tradeweil, trading, tradingday.com, tradition, tradition’s, traditional, traditionally, traditions, and so on.
  • *word: By placing an asterisk at the beginning of a word, you can search for any document that contains a string of characters ending with that word. A search for *trad in the Enron data returned documents containing amtrad, energytrad, fin-trad, firmtrad, magistrad, physicaltrad, strad, etc.
  • w*rd: By replacing a letter in the middle of a word with an asterisk, you can search for any document that contains that string of characters with a different letter or set of letters where the asterisk is located. A search for tr*d in the Enron data returned documents containing transferred, treated, trend, trimmed, etc.
  • *word*: By placing asterisks at both the beginning and the end of a word, you can search for any document containing that word, even where the characters fall in the middle of some other word. A search for *trad* in the Enron data returned documents containing words or strings of characters such as ameritrade, contradicting, estrada, simtrader1, and hemptrade.com.
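For readers who want to experiment with the four patterns, the following Python sketch applies them to a handful of the words listed above. It uses the standard library's shell-style matcher, fnmatch, as a stand-in. That is an assumption: Reveal's own matching rules may differ, for example in whether an asterisk can stand for zero characters. The helper name wildcard_hits is hypothetical.

```python
from fnmatch import fnmatchcase

# A few words taken from the Enron results listed above.
WORDS = ["trade", "trademark", "amtrad", "strad",
         "trend", "trimmed", "contradicting", "estrada"]


def wildcard_hits(pattern, words):
    """Return the words matching a shell-style pattern, ignoring case."""
    return [w for w in words if fnmatchcase(w.lower(), pattern.lower())]


for pattern in ("trad*", "*trad", "tr*d", "*trad*"):
    print(pattern, "->", wildcard_hits(pattern, WORDS))
```

In this stand-in, the position of the asterisk decides whether the fixed characters must start the word, end it, bracket it, or appear anywhere inside it.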

Searching Dates and Times

Using RQL, you can be quite precise with date and time searches.

First, you can define date and time ranges (and ranges for other numeric fields) using four operators: > (greater than), < (less than), >= (greater than or equal to), and <= (less than or equal to).

Next, you decide how granular you want to make your search. You can search by year, month, day, hour, minute, or second.
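One way to think about granularity is that a coarser value, such as a single day, stands for a range of finer-grained timestamps. The Python sketch below shows that idea using the range operators described above. It is a conceptual illustration only and does not reproduce RQL syntax or Reveal's date handling. The function names day_bounds and sent_on are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Tuple


def day_bounds(day: str) -> Tuple[datetime, datetime]:
    """Return the first instant of the day and the first instant of the next."""
    start = datetime.strptime(day, "%Y-%m-%d")
    return start, start + timedelta(days=1)


def sent_on(sent: datetime, day: str) -> bool:
    """True if the timestamp is >= the start of the day and < the next day."""
    start, end = day_bounds(day)
    return start <= sent < end


print(sent_on(datetime(2001, 11, 29, 14, 30), "2001-11-29"))
print(sent_on(datetime(2001, 11, 30, 0, 0), "2001-11-29"))
```

Using a greater-than-or-equal lower bound with a strict less-than upper bound captures every second of the day without also matching midnight of the following day.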

Here is an RQL search for any documents that were sent at any time on November 29, 2001:

Here are the results of the search, displayed on a one-day timeline:

Combining Operators

You can, of course, combine operators. Here, I did a search looking for all documents that:

  • Were sent between 1/1/2001 and 12/31/2001 inclusive,
  • Contained any word starting with “preserv” in the subject line, and
  • Contained the words “raptor” or “ljm” anywhere in the document.

The first two parts of the search (date and subject) query data in metadata fields. The third part (raptor or ljm) queries the text of the document.
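The logic of this combined search can be sketched in Python as three conditions joined with AND, two on metadata and one on text. This is a sketch of the logic only, not of RQL syntax or of how Reveal evaluates queries. The dictionary keys, the matches function, and the sample documents are all invented for illustration.

```python
import re
from datetime import date


def matches(doc: dict) -> bool:
    """Apply the three conditions described above to one document."""
    # Metadata: sent between 1/1/2001 and 12/31/2001 inclusive.
    in_2001 = date(2001, 1, 1) <= doc["sent_date"] <= date(2001, 12, 31)
    # Metadata: any word in the subject line starting with "preserv".
    subject_words = re.findall(r"\w+", doc["subject"].lower())
    subject_hit = any(w.startswith("preserv") for w in subject_words)
    # Text: the word "raptor" OR the word "ljm" anywhere in the document.
    text_words = set(re.findall(r"\w+", doc["text"].lower()))
    text_hit = bool({"raptor", "ljm"} & text_words)
    return in_2001 and subject_hit and text_hit


# Made-up sample documents, not taken from the Enron data.
docs = [
    {"sent_date": date(2001, 11, 29), "subject": "Preservation of voice mail",
     "text": "Please retain anything relating to LJM."},
    {"sent_date": date(2002, 1, 15), "subject": "Preserve records",
     "text": "Raptor files attached."},
]
print([matches(d) for d in docs])
```

The second sample fails only on the date condition, which shows how each added operator narrows the result set.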

The first document returned by the search is an 11/29/2001 email regarding efforts to preserve voice mails relating in any way to LJM or Chewco:

And There’s More

What I showed today is just a bit more about the new Reveal Query Language (RQL) and how it enables you to better search your data, helping you enjoy the highest quality speed to insight in the industry.

In posts to come, I’ll continue exploring RQL and other ways you can use Reveal 11 and its greatly enhanced search capabilities.

For more information about how Reveal can empower your organization, contact us for a demo.