eDiscovery AI Ethics, Part 1: Definitions and Related Resources
There has been much discussion of late about whether and to what extent it is ethical to use artificial intelligence (AI) for eDiscovery.
As I have read posts and articles on the topic and listened to what is said during conferences and webinars, I have had a hard time pinning down the concerns and their implications.
In this series of posts, I will try to make sense of the issues and figure out where we as an industry stand – and should stand – on these issues.
In this post, the first in the series, I will look at definitions of key terms we need to understand if we are to make sense of the ethical issues related to the use of AI in eDiscovery. In the next post, I will examine ethical frameworks that may provide guidance.
Ethics and Legal Ethics
According to Britannica, ethics, “also called moral philosophy, [is] the discipline concerned with what is morally good and bad and morally right and wrong. The term is also applied to any system or theory of moral values or principles.”
Merriam-Webster draws a distinction: “Morals often describes one's particular values concerning what is right and what is wrong.… While ethics can refer broadly to moral principles, one often sees it applied to questions of correct behavior within a relatively narrow area of activity…”
Legal ethics is more focused, as noted by Cornell Law School’s Legal Information Institute. Also known as professional responsibility, it is the law governing lawyers:
Because of their role in society and their close involvement in the administration of law, lawyers are subject to special standards, regulation, and liability. Sometimes called legal ethics, sometimes professional responsibility, the topic is perhaps most comprehensively described as the law governing lawyers.
For lawyers in the US, the ABA Model Rules of Professional Conduct “serve as models for the ethics rules of most jurisdictions”.
Discovery and eDiscovery
Discovery, as defined by the ABA Division for Public Education, is “the formal process of exchanging information between the parties about the witnesses and evidence they’ll present at trial.” In US Federal courts, forms of discovery available to parties include:
- Initial disclosures (FRCP 26),
- Depositions to perpetuate testimony (FRCP 27),
- Depositions by oral examination (FRCP 30),
- Depositions by written questions (FRCP 31),
- Interrogatories to parties (FRCP 33),
- Requests for production of documents, electronically stored information, and tangible things and requests for permission for entry onto designated lands or other property (FRCP 34),
- Physical and mental examinations (FRCP 35), and
- Requests for admission (FRCP 36).
eDiscovery, a subset of discovery, focuses on electronically stored information (ESI), as opposed to information stored on paper, in people’s heads, or in tangible documents. eDiscovery is the process of finding, using, and managing that information, narrowly in the context of litigation and more broadly for any form of investigation or dispute resolution. The general eDiscovery process is described in the EDRM model, a conceptual depiction of the key steps in that process and the relationships between those steps. Federal rules that implicate eDiscovery include:
- Federal Rule of Civil Procedure 1. Scope and Purpose
- FRCP 16. Pretrial Conferences; Scheduling; Management
- FRCP 26. Duty to Disclose; General Provisions Governing Discovery
- FRCP 27. Depositions to Perpetuate Testimony
- FRCP 28. Persons Before Whom Depositions May Be Taken
- FRCP 29. Stipulations About Discovery Procedure
- FRCP 30. Depositions by Oral Examination
- FRCP 32. Using Depositions in Court Proceedings
- FRCP 33. Interrogatories to Parties
- FRCP 34. Producing Documents, Electronically Stored Information, and Tangible Things, or Entering onto Land, for Inspection and Other Purposes
- FRCP 36. Requests for Admission
- FRCP 37. Failure to Make Disclosures or to Cooperate in Discovery; Sanctions
- FRCP 45. Subpoena
- Federal Rule of Evidence 502. Attorney-Client Privilege and Work Product; Limitations on Waiver
- FRE 902. Evidence That Is Self-Authenticating
Artificial Intelligence
Definitions of artificial intelligence abound.
John McCarthy, a Dartmouth and Stanford computer scientist widely credited with having coined the phrase “artificial intelligence” in 1955, defined AI as “the science and engineering of making intelligent machines, especially intelligent computer programs.”
Merriam-Webster defines artificial intelligence as a “branch of computer science dealing with the simulation of intelligent behavior in computers” and “the capability of a machine to imitate intelligent human behavior.”
The ABA House of Delegates, in Resolution 112, adopted in 2019, offered this definition (citations removed):
Artificial intelligence has been defined as “the capability of a machine to imitate intelligent human behavior.” Others have defined it as “cognitive computing” or “machine learning.” Although there are many descriptive terms used, AI at its core encompasses tools that are trained rather than programmed. It involves teaching computers how to perform tasks that typically require human intelligence such as perception, pattern recognition, and decision-making.
AI can be divided into categories. One common view divides AI into four subsets: Reactive, limited memory, theory of mind, and self-aware. Reactive AI is where AI “is programmed to provide a predictable output based on the input it receives.” Limited memory AI “learns from the past and builds experiential knowledge by observing actions or data.” Theory of mind AI, still a theory rather than a product, will be available when “machines will acquire true decision-making capabilities that are similar to humans.” Self-aware AI also has not yet been achieved: “When machines can be aware of their own emotions, as well as the emotions of others around them, they will have a level of consciousness and intelligence similar to human beings.”
A more pragmatic approach organizes AI into four functional groups: machine learning, natural language processing, computer vision, and robotics. I discussed this in an earlier post, Legal AI Software: Taking Document Review to the Next Level.
Perhaps the easiest way of understanding what AI means in the context of eDiscovery is to consider ways in which AI has been deployed in platforms such as Reveal’s. Examples include:
- Anomaly detection. Anomaly detection, also called outlier detection, is a means of finding unexpected, hopefully useful patterns in data. See Using AI for Privilege Review, What is an Anomaly? and The Exquisite eDiscovery Magic of Data Anomaly Detection.
- Concept searching and clustering. With the right AI tools, you can search for concepts and then use tools such as cluster wheels to find high-level concepts quickly, drill in for greater details, and ultimately get to potentially key messages and other content. See Search in Reveal 11: Keyword, Concept, or Both, Using AI for Privilege Review, Using AI to Prepare the Answer to a Complaint, 11 Reasons Lawyers Love Reveal's Brainspace Cluster Wheel, and Introducing the Brain Explorer.
- Communications maps and analysis. With communications visualization and similar tools, you can better understand the content and context of communications. See Visualize: Analyzing Connections Between Communications.
- Entity extraction. Entity extraction is a form of unsupervised machine learning. An entity can be a person, place, thing, event, category, or even a piece of formatted data such as a credit card number. Entities are identified by the AI system. Having identified an entity, the system then extracts information about that entity and makes it available for you to use as you review and analyze data. See Getting to Know You: Entity Extraction in Action and Using AI to Respond to Written Discovery Requests - Part 2.
- Image labeling (computer vision). AI analyzes the pixels in images, detects objects in the images, labels those objects (“construction site”, “baby”, “mold”, etc.) and assigns a confidence level to each label. See Image Recognition and Classification During Legal Review, AI Image Recognition: The eDiscovery Feature You Didn't Know Existed, and Testing the Efficacy of Image Labeling.
- AI models. Generally, an AI model is a software program that has been trained on a set of data to perform specific tasks like recognizing certain patterns. Artificial intelligence models use decision-making algorithms to learn from the training data and apply that learning to achieve specific pre-defined objectives. See What Is An AI Model?, Using AI for Privilege Review, and BERT, MBERT, and the Quest to Understand.
- Sentiment analysis. Written communications can be heavily larded with sentiment – “the emotional significance of a passage or expression as distinguished from its verbal context”. Unsupervised machine learning can be used to look for language expressing sentiment. See Getting Sentimental: Using Emotional Signals in eDiscovery.
- Supervised machine learning (aka predictive coding or TAR). See Using AI for Privilege Review and Legal Document Review's New BFF: High Precision Active Learning.
- Translation. AI can be used to identify languages in text and translate that content from one language to another. See How Many Hurdles are in Your Foreign Language Document Review Process? and Are You Spending Too Much For Foreign Language Document Review?
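For readers who want a concrete feel for one item in the list above, here is a minimal sketch of extracting a piece of formatted data, a credit card number, from text. It is a simple rule-based stand-in that I wrote for illustration, not machine learning and not how Reveal's platform or any other product works. The pattern, the function names, and the sample sentence are all hypothetical. The sketch finds candidate digit runs and keeps only those that pass the standard Luhn checksum.

```python
import re

# Hypothetical pattern (an assumption for this sketch): 13 to 16 digits,
# optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def extract_card_numbers(text: str) -> list[str]:
    """Find candidate card numbers in text and keep those that pass Luhn."""
    found = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())  # strip separators
        if luhn_valid(digits):
            found.append(digits)
    return found


# Illustrative sample: a well-known test card number and an unrelated ID.
sample = "Please charge 4111 1111 1111 1111 and ignore ticket 1234567890123456."
print(extract_card_numbers(sample))
```

The checksum step matters because a document set is full of long digit strings, such as ticket and invoice numbers, that merely look like card numbers. A production system would go much further than this, for example by recognizing people, places, and organizations, which simple patterns cannot do.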
In the next post, I will examine ethical frameworks and related materials that may provide guidance, including the ABA Resolution 112 mentioned above, the ABA Model Rules of Professional Conduct, and the White House’s “Blueprint for an AI Bill of Rights: Making Automated Systems Work for the American People”.