October 13th

Visualize: Analyzing Connections Between Communications

George Socha

Let no one’s work evade your eyes
Remember why the good Lord made your eyes
So don’t shade your eyes
But visualize, visualize, visualize

(With apologies to Tom Lehrer and Lobachevsky.)

In almost every investigation and lawsuit, we need to analyze communications. We look at the content of the communications, of course, but we should also look at their context, in particular the connections between them. We analyze this information to figure out who was communicating with whom, when, where, how, about what, and why. And, often, we can do that most effectively with strong visualization tools.

Today’s post is about visualizing connections, but first let’s turn to content.

The Standard Approach to Understanding Communications

In analyzing communications, we tend to focus on their content. We examine their text first and then, increasingly, their metadata as well. To accomplish this, we turn to a bevy of capabilities. Tools and techniques we might use include active learning, supervised machine learning, technology assisted review (TAR), AI models, anomaly detection, high precision active learning, image recognition, pattern recognition, sentiment analysis, stylometry, and translation.

Sometimes we are on an exploratory mission which can be open-ended. We might, for example, be trying to identify affirmative defenses to assert, unearth additional witnesses to contact, or develop a different explanation for what caused the alleged harm. That mission might also have a pre-set list of objectives. Perhaps the complaint sets forth three causes of action, each cause of action has several elements that the plaintiff needs to prove, and we are looking for evidence to support or refute each of those elements.

Other times our objective is to find “more like this”. A regulator or the opposing counsel has sent us a list of document requests. They want every document we have that meets the requirements set forth in each request. We agree to make reasonable efforts to comply. Then we start the hunt. First, we try to find at least one document that is responsive, then a second, and a third, and so on, typically continuing until we run out of time, money, or some other critical resource.

Gaining Understanding by Visualizing Connections

To get a truly full picture, I posit that we need not just content but context. We need to look at the various pieces of information that surround the content of, say, an email message. That surrounding information can throw new light on the message’s meaning. As a first step, let’s look at some of the types of contextual information that can be useful. We will use Brainspace to illustrate this discussion.

Let No One's Work Evade Your Eyes

It helps to get a visual view of contextual information. A rows-and-columns view, such as the one below, can be useful in many ways. Conveying context, however, generally is not its forte.

[insert rows-and-columns2.png]

Visualizations can be much more effective. If you go to Brainspace, search in the Enron database for the concept “raptor”, and select the “Communications” option, you see a visualization of communications and the connections between them. Here you see the People view, which depicts communications between individuals. (There also is a Domain view, which displays web domains instead of people.)

[insert brainspace-people.png]

I have a friend in Minsk

I have a friend in Minsk
Who has a friend in Pinsk….

Each circle in the picture denotes a person, perhaps the protagonist of Tom Lehrer’s Lobachevsky. In the screen capture above, the central person is Sara Shackleton, one of the many people in the Enron data.

In Brainspace these circles, called nodes, come in two forms. Some nodes display a person icon. That means the node consists of a collection of different email addresses. Other nodes display an “@” symbol, signifying that the dataset has only one email address for the person.

[insert brainspace-nodes2.png]

Click on a person, and you can find out how many unique emails the person sent and how many unique emails the person received.

[insert node-details1.png]

Click on the down arrow, and you can see further details including “Sends Most To”, “Receives Most From”, and “Top Terms”.

[insert node-details2.png]
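
To make the node details concrete, here is a minimal Python sketch of how figures like these could be tallied from raw messages. It is not Brainspace's implementation. The message tuples, the example.com addresses, and the node_details function are all assumptions invented for illustration, and "Top Terms" is left out.

```python
from collections import Counter

# Hypothetical sample messages: (message_id, sender, recipients).
# Invented for illustration; not drawn from the Enron data.
MESSAGES = [
    ("m1", "alice@example.com", ["bob@example.com", "carol@example.com"]),
    ("m2", "alice@example.com", ["bob@example.com"]),
    ("m3", "bob@example.com", ["alice@example.com"]),
    ("m4", "carol@example.com", ["alice@example.com"]),
    ("m5", "carol@example.com", ["alice@example.com", "bob@example.com"]),
]


def node_details(person, messages):
    """Summarize one person's traffic: unique sent/received and top contacts."""
    sent_ids, received_ids = set(), set()
    sends_to, receives_from = Counter(), Counter()
    for message_id, sender, recipients in messages:
        if sender == person:
            sent_ids.add(message_id)
            # Count each recipient of this message, ignoring self-copies.
            sends_to.update(r for r in recipients if r != person)
        if person in recipients:
            received_ids.add(message_id)
            if sender != person:
                receives_from[sender] += 1
    return {
        "unique_sent": len(sent_ids),
        "unique_received": len(received_ids),
        "sends_most_to": sends_to.most_common(1),
        "receives_most_from": receives_from.most_common(1),
    }


details = node_details("alice@example.com", MESSAGES)
print(details)
```

Keying the sent and received sets on a message ID is what makes the counts "unique": a message copied to several people is still one message sent.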

Who has a friend in Pinsk

I have a friend in Minsk
Who has a friend in Pinsk
Whose friend in Omsk
Has friend in Tomsk
With friend in Akmolinsk
His friend in Alexandrovsk
Has friend in Petropavlovsk
Whose friend somehow is solving now
The problem in Dnepropetrovsk

The real power of visualizing communications comes from the ability to examine the communications between people. These appear as the lines connecting people. In Brainspace, these lines are called edges.

[insert edge2.1.png]

Clicking on the edge displays the edge details. In this example, you see the numbers of to, cc, and bcc messages that were exchanged by Sara Shackleton and Mary Cook. Shackleton sent 37 messages to Cook, cc’d Cook on 10 more, but did not bcc Cook. Cook, in turn, sent 90 messages to Shackleton, cc’d her on 24, and did not bcc her on any. You also can see the top terms found in those communications.

[insert edge2.2.png]
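
An edge, in other words, is a pair of directional tallies. The sketch below shows one way to compute them in Python, assuming each record has already been flattened to a sender, a recipient, and the field the recipient appeared in. The records, the addresses, and the edge_details function are hypothetical and are not taken from Brainspace or from the Enron data.

```python
from collections import Counter

# Hypothetical records: (sender, recipient, field), field is "to", "cc", or "bcc".
RECORDS = [
    ("alice@example.com", "bob@example.com", "to"),
    ("alice@example.com", "bob@example.com", "to"),
    ("alice@example.com", "bob@example.com", "cc"),
    ("bob@example.com", "alice@example.com", "to"),
    ("bob@example.com", "carol@example.com", "to"),
]


def edge_details(person_a, person_b, records):
    """Count to/cc/bcc messages in each direction between two people."""
    counts = {
        (person_a, person_b): Counter({"to": 0, "cc": 0, "bcc": 0}),
        (person_b, person_a): Counter({"to": 0, "cc": 0, "bcc": 0}),
    }
    for sender, recipient, field in records:
        # Records involving anyone outside the pair are ignored.
        if (sender, recipient) in counts:
            counts[(sender, recipient)][field] += 1
    return {direction: dict(tally) for direction, tally in counts.items()}


edge = edge_details("alice@example.com", "bob@example.com", RECORDS)
for direction, tally in edge.items():
    print(direction, tally)
```

Keeping the two directions separate preserves the asymmetry that the edge details expose, such as one person sending far more than the other.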

Controlling Scope

Whatever visualization tool you choose to use, you should be able to control the scope of the information you see on the screen. You should be able to do such things as specify the maximum number of people shown on the screen. I can, for example, change the maximum number of nodes (people) from the default of 1,000 to a much smaller number such as 5.

[insert communications-filter2.png]

If you limit your view to 5 people, this is what you see. 

Another way to control scope is to filter by incoming and outgoing messages as well as by to, cc, and bcc.
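
Both kinds of scope control can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how any particular tool works: the records and the limit_scope function are hypothetical, and ranking people by total message volume is my own choice of rule for deciding which nodes survive the cut.

```python
from collections import Counter

# Hypothetical records: (sender, recipient, field). Invented for illustration.
RECORDS = [
    ("alice", "bob", "to"),
    ("alice", "bob", "to"),
    ("bob", "alice", "cc"),
    ("alice", "carol", "to"),
    ("dave", "alice", "bcc"),
]


def limit_scope(records, max_nodes, fields=("to", "cc", "bcc")):
    """Keep allowed fields, then only edges among the busiest max_nodes people."""
    filtered = [r for r in records if r[2] in fields]
    volume = Counter()
    for sender, recipient, _field in filtered:
        volume[sender] += 1
        volume[recipient] += 1
    # Assumption: "busiest" means most messages sent plus received.
    keep = {person for person, _count in volume.most_common(max_nodes)}
    return [r for r in filtered if r[0] in keep and r[1] in keep]


top_two = limit_scope(RECORDS, max_nodes=2)
top_two_to_only = limit_scope(RECORDS, max_nodes=2, fields=("to",))
print(top_two)
print(top_two_to_only)
```

Filtering by field before ranking means the node limit applies to the traffic you actually asked to see, not to the full dataset.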

Meaningful Color

Nodes have colors assigned to them. The colors of nodes may indicate people or groups of people that are in frequent communication with each other. Nodes’ colors also can be a function of the type of label that has been assigned to that person or email address. If, for example, a label is created for “Account Executives” and the color blue is assigned to that label, any person or email address labeled “Account Executives” will appear as a blue circle. Red might be assigned to “Managers”, so that managers appear in red.

What You Can Do with These Tools

With the power to look at communication connections, you can literally get a big picture view by combining search capabilities, starting with a broad overview and refining as needed. Here, for example, I asked the platform to show me the map of communications for 1,000 people.

Next I zoomed in on a section, to see greater detail. 

Wondering whether Sara Shackleton might be of interest, I clicked on her name. 

Noticing a thicker line of communications, I clicked on that. 

Starting like this, you could further refine what you see using the concept search, people search, and so on.

You could use approaches such as this one for all manner of exploratory activities. You could, for example:

    • Look for areas with higher volumes of communication. While quantity does not necessarily mean quality, quantity can be important in its own right. You might want to scan the display for clusters where one person was communicating with a large number of people, or an instance where a large number of communications went back and forth between two people.
    • Search for gaps. Look for gaps in communications, such as situations where you would expect to see messages between two people but none show up, where you would expect communications from someone but you only see communications sent to them, or where you would expect to find communications between two people discussing specific topics but none appear.
    • Seek out anomalous behaviors. Look for the person who stands apart from others, the thin line amongst thick ones, the email address that doesn’t fit with all the others but gets a lot of traffic.
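
The first two suggestions above can also be approximated programmatically once you have per-direction message counts. The Python sketch below is a rough illustration only: the names, the counts, and the three helper functions are hypothetical, and a one-way edge or a missing pair is a prompt for a closer look, not a finding in itself.

```python
from collections import Counter

# Hypothetical directed message counts: (sender, recipient) -> messages.
EDGES = Counter({
    ("alice", "bob"): 40,
    ("bob", "alice"): 35,
    ("alice", "carol"): 12,
    ("dave", "alice"): 1,
})


def busiest_pairs(edges, top_n):
    """Rank pairs of people by combined traffic in both directions."""
    totals = Counter()
    for (sender, recipient), count in edges.items():
        totals[frozenset((sender, recipient))] += count
    return [(tuple(sorted(pair)), total) for pair, total in totals.most_common(top_n)]


def one_way_edges(edges):
    """Find directed edges with no traffic in the reverse direction."""
    return sorted(
        (sender, recipient)
        for (sender, recipient), count in edges.items()
        if count > 0 and edges.get((recipient, sender), 0) == 0
    )


def missing_pairs(edges, expected_pairs):
    """Report pairs expected to communicate who exchanged nothing."""
    return [
        (a, b)
        for a, b in expected_pairs
        if edges.get((a, b), 0) == 0 and edges.get((b, a), 0) == 0
    ]


print(busiest_pairs(EDGES, 1))
print(one_way_edges(EDGES))
print(missing_pairs(EDGES, [("bob", "carol"), ("alice", "dave")]))
```

The list of expected pairs has to come from you, for example from an organization chart or a witness list. The data alone cannot tell you who should have been talking.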

These are just a few suggestions for how you can make effective use of the visual display of communications connections. With an interface like this one, you can be quick and nimble in your explorations. And by doing that, you up your chances of finding content, developing context, and ultimately building a stronger case.

Learn More

If your organization is interested in visual communications analytics and in how Reveal uses AI as an integral part of its end-to-end legal document review platform, contact us to learn more.