October 27th

All In On AI

George Socha

For over a year I have been writing about the power AI can bring to eDiscovery. Today’s post pulls those materials together in one place, giving an overview of the AI content discussed so far.

I started by looking at some of the underlying challenges in 7 Signs It’s Time to Upgrade Your eDiscovery Solution. I then laid out an AI framework and the potential benefits that can accrue from the effective use of AI, first in Legal AI Software: Taking Document Review to the Next Level, next in AI in the Legal Sector – the Obvious Choice, and then in After 15 Years, Has the eDiscovery EDRM Model Been Realized?

I then turned to specific challenges and how you can use AI to address them. These challenges include:

Content requiring translation: How to find content in need of translation, translate that content, and work with the results of the translation.

Images: How to make the substantive content of image files such as photographs searchable.

Visualization: Presenting eDiscovery content and the context around that content visually instead of just in rows and columns.

Patterns: Finding and presenting meaningful patterns in data.

Higher precision in active learning: How to classify the content of documents at a deeper level.

Anomalies: Finding and working with data that indicates deviations from “normal” activities, patterns, or behaviors.

Emotional signals: Identifying and working with sentiment in ESI.

Stylometry: Finding information about situations where someone is being evasive or attempting to avoid the truth.

AI models: Understanding what they are, how to build them, and how to make effective use of them.


Foreign language document review has become a fact of life for many working on litigation matters. While human translation may be required in certain situations, machine translation can be an efficient and cost-effective approach that gives reviewers and analysts access to content written in languages they do not understand.

I looked at how AI can be used to identify text in different languages and translate that text, first in How Many Hurdles are in Your Foreign Language Document Review Process? and then in Are You Spending Too Much For Foreign Language Document Review?


Image Recognition and Labeling

We find images and AI image recognition everywhere we turn in our personal lives, and yet when it comes to eDiscovery, pictures, photographs, and drawings seem to be largely ignored. Although too often overlooked, AI image detection and labeling are ready and available for use in lawsuits and investigations if you just know where to look.

In AI Image Recognition: The eDiscovery Feature You Didn't Know Existed, I discussed what image recognition is, how it can be used in eDiscovery, and what to look for in image recognition technology. In Image Recognition and Classification During Legal Review, I talked about what you can do with pictures once you have image recognition capabilities.


Visually Analyzing Content Concepts 

The AI-powered Cluster Wheel is the most widely used visual analytics feature in Reveal's Brainspace technology. With it, you can quickly and easily find the content that really matters. The Cluster Wheel is especially powerful when you are new to your data, not yet sure what you are looking for, and need tools that help guide you to the documents and communications that can make or break your matter.

With the Cluster Wheel, you get your data organized from day one and can quickly drill into it. Using color as your guide and drilling into deeper levels of clusters, you can range wide and deep and find common high-level concepts quickly. To explore the Cluster Wheel, read 11 Reasons Lawyers Love Reveal's Brainspace Cluster Wheel.


Visually Analyzing Connections between Communications

In almost every investigation and lawsuit, we need to analyze communications. We look at the content of the communications, of course, but we should also look at the communications’ context, in particular the connections between the communications. We analyze this information to figure out who was communicating with whom, when, where, how, about what, and why. And, often, we can do that most effectively with strong visualization tools, as discussed in Visualize: Analyzing Connections Between Communications.


Pattern Recognition and Machine Learning

An important form of AI regularly used in eDiscovery is pattern recognition. AI comes in various flavors, including machine learning. Machine learning, in turn, comes in different subsets, including supervised learning and unsupervised learning. Both supervised and unsupervised learning have subcategories. One subcategory, which falls under both these forms of machine learning, is pattern recognition, a topic explored in Pattern Recognition Software for Legal Compliance.


A process where humans train software to identify documents that meet certain criteria, active learning is a powerful tool to help classify documents. Offered in different forms and going by various names such as technology assisted review (TAR) and continuous active learning (CAL), active learning is a capability lawyers and related professionals have been using in eDiscovery for over a decade. For an overview of active learning, go to How Important is Active Learning for eDiscovery?, and for a discussion about TAR, go to What is Technology Assisted Review?
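For readers who want to see the shape of that human-in-the-loop process, here is a minimal sketch in Python. It is an illustration only, not how Reveal or any particular TAR product works: the word-count scoring, the function names (`train`, `score`, `most_uncertain`, `review_round`), and the choice of uncertainty sampling as the selection strategy are all assumptions made for the example.

```python
from collections import Counter

def train(labeled):
    """Count the words seen in relevant and in not-relevant documents."""
    relevant, other = Counter(), Counter()
    for text, is_relevant in labeled:
        (relevant if is_relevant else other).update(text.lower().split())
    return relevant, other

def score(model, text):
    """Rough relevance score between 0 and 1; 0.5 means no evidence either way."""
    relevant, other = model
    words = text.lower().split()
    pos = sum(relevant[w] for w in words)
    neg = sum(other[w] for w in words)
    return (pos + 1) / (pos + neg + 2)

def most_uncertain(model, unlabeled):
    """Uncertainty sampling: choose the document scored closest to 0.5."""
    return min(unlabeled, key=lambda text: abs(score(model, text) - 0.5))

def review_round(labeled, unlabeled, ask_reviewer):
    """Train, ask the reviewer about one document, and fold the answer back in."""
    model = train(labeled)
    doc = most_uncertain(model, unlabeled)
    labeled = labeled + [(doc, ask_reviewer(doc))]
    unlabeled = [d for d in unlabeled if d != doc]
    return labeled, unlabeled
```

Each call to `review_round` stands in for one pass of the loop: the software picks a document, a human reviewer (the hypothetical `ask_reviewer` callback) codes it, and the next round trains on the larger labeled set. Other selection strategies, such as surfacing the highest-scoring documents first, fit the same loop.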

With supervised machine learning, you can filter junk, prioritize review, find what matters most, locate privileged communications, solve complex problems, and do so much more. For details, read 5 Things You Can Do with Supervised Machine Learning.




High Precision Active Learning

On the “what’s coming next” front, I wrote in Legal Document Review's New BFF: High Precision Active Learning about an upcoming feature in Reveal AI, high precision classification. HPC takes classification to a new level. Traditional classification uses a document as its base unit. High precision classification is much more focused, using selected text. 


Working with Anomalies

"Something different, abnormal, peculiar, or not easily classified," anomalies matter because they are powerful pieces of information that attorneys and investigators can use to accomplish one of their key tasks: figuring out what happened and why.

Anomaly detection is a powerful tool for anyone seeking to understand the nuances of complex datasets. While many eDiscovery and machine learning tools offer some basic ability to detect data anomalies, few tap into the possibilities opened up when a wide array of anomaly detection capabilities is incorporated into the search, data analysis, and display functions of an eDiscovery platform.

To better understand anomalies, read What is an Anomaly? For thoughts on how to make effective use of anomalies, go to The Exquisite eDiscovery Magic of Data Anomaly Detection.
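To make the idea of a deviation from "normal" concrete, here is a minimal sketch of one very basic form of anomaly detection: flagging days on which a custodian's message volume sits far from their usual level. The function name `find_anomalies`, the two-standard-deviation threshold, and the sample data are assumptions made for illustration; they do not describe any specific platform's method.

```python
from statistics import mean, stdev

def find_anomalies(daily_counts, threshold=2.0):
    """Return (day, count) pairs lying more than `threshold`
    standard deviations away from the mean daily count."""
    counts = list(daily_counts.values())
    mu, sigma = mean(counts), stdev(counts)
    if sigma == 0:
        return []  # every day looks the same, so nothing stands out
    return [(day, n) for day, n in daily_counts.items()
            if abs(n - mu) / sigma > threshold]

# Hypothetical message counts for one custodian over ten days.
messages_per_day = {
    "2023-03-01": 20, "2023-03-02": 22, "2023-03-03": 19, "2023-03-04": 21,
    "2023-03-05": 20, "2023-03-06": 23, "2023-03-07": 18, "2023-03-08": 21,
    "2023-03-09": 20, "2023-03-10": 95,
}
unusual_days = find_anomalies(messages_per_day)
```

A flagged day is a prompt for a closer look, not a conclusion. A simple rule like this also needs a reasonable number of data points before any single value can stand out.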


Using Emotional Signals in eDiscovery

Our written communications can be heavily larded with sentiment, “the emotional significance of a passage or expression as distinguished from its verbal context.” As litigators, investigators, and the people who support them, if we can find and analyze sentiment in discovery documents, then we greatly improve our ability to figure out who did what, when, where, how, and most notably why.

Reveal AI uses unsupervised machine learning to look for language expressing seven types of sentiment: intent, opportunity, pressure, rationalization, sentiment alternation, positivity, and negativity. When Reveal AI encounters content containing any of these sentiments, it assigns a score to that content.

In Getting Sentimental: Using Emotional Signals in eDiscovery, we focused on how to find and use emotional signals in written communications.


Using Stylometry to Find Indications of Fraud and Similar Behaviors

An ongoing challenge for those of us who handle litigation and investigations is to cost-effectively find documents containing indicia of fraud. Rarely does one who is setting out to commit fraud write, “I am setting out to commit fraud.” To the contrary, fraudsters are far more likely to use language that is intentionally vague.

We need to sidle up to that content, if you will, using peripheral vision to notice that which a direct gaze does not reveal. Sidling up to the content means looking for indicators such as emotional signals suggesting that someone was under pressure, presented with an opportunity, and attempting to rationalize their decision. With Reveal AI, you can search for these emotional signals today. For a more detailed discussion, go to Stylometry and the Fraud Triangle.


AI Models & Model Library

Generally, an AI model is a software program that has been trained on a set of data to perform specific tasks like recognizing certain patterns. Artificial intelligence models use decision-making algorithms to learn from the training data and apply that learning to achieve specific pre-defined objectives. For a look at supervised, unsupervised, and semi-supervised machine learning models and how you can use them, see What Is An AI Model?


Layering Legal AI Models

With layering, multiple pre-existing models are combined to deliver a result that, like a brick wall, is greater than the output from any of the individual models used alone: quicker, less expensive access to targeted content.

You can pack and stack similar models from previous cases, combining them into new uber models that take advantage of what you and your AI models have learned over time. You also can combine disparate models, beginning with different pre-built models that you fit together to accomplish a specific goal. For more on layering AI models, go to Layering Legal AI Models for Faster Insights.
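One simple way to picture layering is as a weighted average of the scores that several models give the same document. The sketch below assumes each model is just a function that returns a score between 0 and 1. The names `layer_models`, `pressure_model`, and `privilege_model` are hypothetical stand-ins invented for this example, and the keyword checks are placeholders for real trained models.

```python
def layer_models(models, weights=None):
    """Combine several scoring functions into one weighted-average scorer."""
    weights = weights or [1.0] * len(models)
    total = sum(weights)

    def combined(text):
        return sum(w * m(text) for m, w in zip(models, weights)) / total

    return combined

# Hypothetical stand-ins for pre-built models (keyword checks, for illustration only).
def pressure_model(text):
    return 1.0 if "deadline" in text.lower() else 0.0

def privilege_model(text):
    return 1.0 if "counsel" in text.lower() else 0.0

# Equal weighting, and a variant that trusts the first model three times as much.
layered = layer_models([pressure_model, privilege_model])
weighted = layer_models([pressure_model, privilege_model], weights=[3.0, 1.0])
```

Averaging is only one possible design. Models could also be chained so that one filters the documents the next one scores, which is closer to the "fit together to accomplish a specific goal" idea.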


What Data Scientists Do

None of this happens all by itself. Data scientists are a core group involved in almost every facet of the use of AI in discovery. To learn more about what a data scientist is and what data scientists do, read What Do Data Scientists Do?


No matter what AI you use and how you use it, you should keep defensibility in mind. While the defensibility of AI generally has gotten increased attention of late, there has been less focus on the defensibility of AI specifically when used for eDiscovery.

When assessing defensibility, four key criteria to consider are functionality, reasonableness, reliability, and understandability. For a discussion on the state of AI defensibility, check out Defensibility of eDiscovery AI in Court.

All In On AI

If you and your organization would like to go all in on AI – or even just begin to explore the power AI can bring to the matters you work on – and want to learn how Reveal uses AI as an integral part of its AI-powered end-to-end legal document review platform, contact us to learn more.