Going for Gold in the eDiscovery Olympics
With the Tokyo Olympics in full swing, what better time to take a cheeky look at the events that might make up an eDiscovery Olympics, and at just what it would take for practitioners to take the Gold? Curious how to become the greatest of all time (GOAT): the Michael Phelps of Privilege Log-Rolling, the Usain Bolt of Anomaly Detection dashing, or the Mary Lou Retton of AI Gymnastics? Since we cannot possibly cover all 339 events across 33 different sports in which the 2021 Summer Olympics will award gold medals, let’s take a look at the top events that might make up an eDiscovery Olympics.
Document Review Marathon
While it is true that eDiscovery and document review are a marathon and not a sprint, practitioners increasingly have to run as if they were both. The document review race is facing truncated timelines (like the 100-day MID discussed previously) and increased pressure outside of these pilot programs to demonstrate expediency and cost efficiency. As a result, many professionals are turning to unsupervised and supervised machine learning to uncover the key facts of a case earlier in the review process.
Although a review may still take weeks or months, the practitioners winning the eDiscovery marathon are sprinting at the outset, using AI-powered tools to quickly identify the key actors, facts, and context in a case. Tools like Brainspace’s Cluster wheel help practitioners uncover key concepts and eliminate superfluous material first. They then turn to human review and to communication analysis tools that let legal teams identify communication patterns between custodians in order to prioritize custodians for review or, in some cases, eliminate them from it.
AI-powered technology, especially active learning, which brings together context, insights from the case team, and powerful linguistic analytics, helps accelerate completion of the doc review marathon no matter the data volume. Surfacing key issues, concepts, and documents earlier in the process, and increasing review speed and accuracy, allows practitioners to complete the entire review in a fraction of the time and expense of more manual approaches.
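To make the idea of prioritizing custodians by communication pattern concrete, here is a minimal sketch. It is not how Brainspace or any other product works internally; the function names, the message format (a sender plus a list of recipients), and the sample custodians are all assumptions made for illustration. It simply counts the messages exchanged between each pair of custodians and ranks custodians by total traffic.

```python
from collections import Counter

def pair_volumes(messages):
    """Count messages exchanged between each unordered pair of custodians.

    `messages` is a hypothetical list of (sender, [recipients]) tuples.
    """
    counts = Counter()
    for sender, recipients in messages:
        for recipient in recipients:
            if recipient != sender:
                counts[frozenset((sender, recipient))] += 1
    return counts

def rank_custodians(messages):
    """Rank custodians by total messages sent or received, busiest first."""
    totals = Counter()
    for pair, n in pair_volumes(messages).items():
        for custodian in pair:
            totals[custodian] += n
    return totals.most_common()

# Illustrative data only.
messages = [
    ("alice", ["bob"]),
    ("bob", ["alice", "carol"]),
    ("alice", ["bob"]),
    ("dave", ["carol"]),
]

for custodian, volume in rank_custodians(messages):
    print(custodian, volume)
```

A ranking like this is only a starting point: a custodian with little traffic may still hold the key document, so low volume alone is not a reason to drop someone from review.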
As the sheer volume and variety of data have continued to explode, showing no signs of abating, the amount of heavy lifting required to identify, preserve, and collect it has steadily increased. Long gone are the days when practitioners could look into three main buckets of paper, email, and stored documents and call it a day when scoping for an ESI collection.
Today there is a myriad of devices, both business and personal. Applications range from on-prem to public cloud, and even the enterprise data itself may run the gamut from company-owned on-prem storage to the cloud. Practitioners must cast their net wide enough to capture both the traditional and the emerging data sources through which people now communicate. At the same time, if they want to avoid boiling the ocean, they must remain targeted enough that they do not break their back or the bank on the heavy lifting of collection and subsequent doc review.
Effective scoping at the outset can help limit overall data creep, but it is best combined with robust, AI-powered Early Case Assessment (ECA) to further refine the universe of data progressing along the EDRM. Well-deployed ECA, powered by Reveal’s industry-leading advanced AI, can quickly organize data by concept, flag PII and junk email, and promote or remove low-value data. This helps the case team understand the facts of a legal matter much earlier in the timeline and can inform legal strategy to minimize costs and significantly reduce risk to your organization. This checklist is a great resource to help frame your ECA process.
In-House, Law Firm, Vendor & AI Relay
In matters from large to small, it is critically important that the entire case team collaborate to reach the shared goal of an accurate and effective eDiscovery process. As in an actual relay, each leg of the race (in this case law firm, service provider, corporate client, and the technology they are leveraging) must go beyond executing their function expeditiously. They must also seamlessly hand off to each other. Communication and collaboration throughout the process are key to ensuring no one is retracing any steps or likely to stumble once they grab the baton.
Frequent status updates, working sessions, and transparency about each step along the EDRM are necessary to ensure that no one member of the team falls behind, fails to execute on their leg of the race, or misses a critical handoff. In eDiscovery there will always be things to stumble over and various minor snafus. The key to gold medal status is collaboration across all the main stakeholders and a steadfast “no finger pointing” approach.
Weird data is the new normal, and businesses and individuals are drowning in it. As a result, the amount of data that practitioners must wade through in order to identify responsive, non-privileged electronically stored information (ESI) has dramatically increased. And yet, at the same time, the deadlines and budgets for many matters are continuing to tighten.
Practitioners are taking a leap of faith and diving headfirst into this tsunami of data. Luckily, they have robust AI-powered tools: social network analysis to prioritize custodians, concept clustering to quickly identify the concepts most likely to be relevant, and active learning built on deep linguistic and behavioral AI to amplify their decisions and shorten the time to a production set. AI-powered precision is helping practitioners go deep to surface key evidence dramatically faster than ever before.
Privilege Log-Rolling
As with log-rolling in real life, there is a high degree of balance required in quickly identifying privileged information and creating an appropriate log to send along with your production. While privilege reviews and the attendant log generation may be the bane of many a case team’s existence, the exercise is explicitly mandated in FRCP 26(b)(5).
(5) Claiming Privilege or Protecting Trial-Preparation Materials.
(A) Information Withheld. When a party withholds information otherwise discoverable by claiming that it is privileged or subject to protection as trial preparation material, the party must:
(i) expressly make the claim; and
(ii) describe the nature of the documents, communications, or tangible things not produced or disclosed -- and do so in a manner that, without revealing information itself privileged or protected, will enable other parties to assess the claim.
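As an illustration of what a log entry might hold, here is a minimal sketch. The rule does not prescribe columns, so the field names, the `write_privilege_log` helper, and the sample entry below are all assumptions for illustration, not a required or court-approved format. The point is that the description identifies the nature of the document without disclosing the privileged content itself.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields

@dataclass
class PrivilegeLogEntry:
    # Hypothetical columns; real logs follow the court's and parties' agreed format.
    doc_id: str
    date: str
    author: str
    recipients: str
    doc_type: str
    privilege_claimed: str   # the express claim, e.g. "Attorney-Client"
    description: str         # nature of the document, without revealing its content

def write_privilege_log(entries):
    """Render privilege log entries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(PrivilegeLogEntry)]
    )
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()

# Illustrative entry only.
entries = [
    PrivilegeLogEntry(
        doc_id="DOC-0001",
        date="2021-03-15",
        author="In-house counsel",
        recipients="VP Operations",
        doc_type="Email",
        privilege_claimed="Attorney-Client",
        description="Email providing legal advice regarding a contract dispute",
    )
]

print(write_privilege_log(entries))
```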
Case teams trying to ease the balancing act of privilege identification and log generation are increasingly turning to advanced AI models. These models help surface potentially privileged information and reduce the time and expense associated with privilege review. Reveal AI’s Model Marketplace, like a Netflix queue for AI models, includes multiple discrete models aimed at privilege detection and a comprehensive multi-model solution for bridging the privilege gap.
Anomaly Detection 100 Yard Dash
In an investigation or litigation, it is often the unexpected or even missing data that tells the most compelling, and potentially relevant, story. Quickly locating communication patterns outside the norm, gaps or surges in the volume of communication, and even unexpected topics or interactions can all be extremely powerful in building the narrative of your case or investigation. Put simply, the vast majority of communication and ESI in a data set is frankly boring, to be expected, and not likely to be terribly material to a case. By contrast, aberrant trading behavior, unusual sentiment, and gaps or surges in communication can all be indicative of behavior that might be relevant to an investigation or litigation.
Quickly identifying aberrant behavior or data points accelerates time to fact-driven case development and/or legal strategy in dealing with regulators. The absence of expected anomalies can also be equally powerful.
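For a sense of what spotting a gap or surge in communication volume can look like in its simplest form, here is a minimal sketch. It is a plain z-score check over daily message counts, not the method used by any particular product. The function name, the threshold of 2.0, and the sample counts are assumptions for illustration.

```python
from statistics import mean, stdev

def flag_volume_anomalies(daily_counts, threshold=2.0):
    """Flag days whose message volume sits far from the average.

    Returns (day_index, count, "surge" or "gap") for each day whose z-score
    has an absolute value of at least `threshold`.
    """
    mu = mean(daily_counts)
    sigma = stdev(daily_counts)
    if sigma == 0:
        return []  # perfectly flat volume: nothing stands out
    flagged = []
    for day, count in enumerate(daily_counts):
        z = (count - mu) / sigma
        if abs(z) >= threshold:
            flagged.append((day, count, "surge" if z > 0 else "gap"))
    return flagged

# Illustrative data: one custodian's daily message counts.
daily_counts = [20, 22, 19, 21, 20, 23, 18, 21, 95, 20, 22, 19]

print(flag_volume_anomalies(daily_counts))
```

A simple global average like this is easily skewed by the very outliers it is looking for, and it ignores weekends and holidays. It is a first pass for deciding where to look, not a finding in itself.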
A few years ago I had a large financial client come to me with some concerns about a team of traders they had recently hired. The traders came from the group implicated in a large market manipulation scheme at a competing financial institution that had just received a billion-dollar fine. Things did not look good.
Using Brainspace’s communication analysis, I uncovered that the recent hires were in communication with the named bad actors at the other bank. After applying sentiment analysis, we also uncovered anomalous sentiment and language being used on the Bloomberg chats the bankers shared. But when I went a level deeper and dug into the structured data of the trade databases, no pattern of illegal or outlier trading behavior existed within the organization. By quickly identifying the absence of trading anomalies and reporting it to the regulator, the client avoided potentially billions of dollars in fines and costly reputational damage. If that is not a gold medal worthy outcome, I am not sure what is!
AI Gymnastics
The variety of AI-powered tools at the disposal of legal professionals today is vast, and still expanding. In crafting the perfect eDiscovery routine, practitioners must demonstrate flexibility and an ability to balance the power of AI with the graceful execution of the rule of law. People reaching for the Gold in this event demonstrate an ability to deploy widely varying flavors of AI depending on the requirements of the case. They must gracefully weave social network analysis with robust linguistic and sentiment analysis to power insights across their data sets. A perfect ten AI gymnastics performance is far from routine and should demonstrate precision, speed to insight, and efficiency.
In the Olympics, a steeplechase event can best be described as a chaotic train wreck of a distance race consisting of dozens of obstacles and plunging water jumps that generally results in one, or several, mid-leap collisions. Talk about the perfect analogy for navigating the high-velocity and ever-changing digital ecosystem facing legal practitioners today!
Successful racers must demonstrate endurance and an ability to navigate the many obstacles posed by large data volumes and the proliferation of atypical data sources. On top of that, they still must get to key evidence faster than the competition, no matter how many unexpected twists, turns, or midair collisions they face along the way.
Fellow legal tech thought leader and founding partner of the Morgan Lewis eData practice Scott Milner described this event as “the new (or better) way to analyze disparate data sources to get to the ‘facts’ quicker rather than looking at everything in silos. Tying mobile, email and structured data along with tons of other sources to see the real story.” Inspired by Dan Regard, Founder and CEO of IDS, this concept is about creating an interrelated, cohesive story across the many, many methods of communication that populate the digital ecosystem to more quickly identify key concepts, context, and custodians.
The winner of this event must excel at extracting insights and key evidence across social media data, short-format communication, traditional email and documents, and more to create a complete narrative of the who, what, when, where, and why of a case. And as the volume and variety of data continue to grow at an exponential pace, this comprehensive view, powered by technology, is the only way to detect patterns across the varied sources and make connections between custodians, concepts, and data sources that are beyond what the human mind can do unassisted.
How do you become the GOAT?
There are several consistent themes across all events for people taking home the eDiscovery gold: a willingness to embrace the full spectrum of technology available to accelerate time to evidence, the flexibility and adaptability to pivot as a matter evolves, and a crystal clear understanding that what got them to the big event might not be what they need to reach the finish line first.
The Olympic season is still going strong and there are still more eDiscovery events to compete in, so check in next week for Part 2 of the eDiscovery Olympics!