“We rushed into the captain’s cabin . . . there he lay with his brains smeared over the chart of the Atlantic . . . while the chaplain stood with a smoking pistol in his hand.”
Used to mean indisputable evidence, the “smoking pistol” first appeared in the Sherlock Holmes story “The Gloria Scott” (1893). Modern parlance refers to it as a “smoking gun,” and as NY Times columnist William Safire noted, the phrase was used during the Watergate scandal and the Iraqi nuclear arms controversy.
As a document investigator in litigation, I live and die by the concept of the smoking gun. What trial lawyer doesn’t rub his or her hands with glee when presented with a “smoking gun” document…the next best thing to an on-the-stand admission (maybe even better, hmmm?). It’s that tangible thing, the witness’s own words coming back to haunt her.
Over the years, I’ve trained a number of lawyers and legal professionals on the best way to search in a large document collection. I go through my presentation and then set them to it. Their training task is to locate something “of interest” that relates to their case. (They are the lawyers – they decide what’s key.) They are searching a produced custodial document set of a few thousand emails. There’s some finger cracking, friendly wagers are made. They get to it. Can you guess what most folks type into that search box first? It splits about 50/50 into (1) the searcher’s name or (2) curse words.
Of course this is simply human nature. When I sit down to crack open a brand new corpus, I go straight for the negative sentiment and red flag language as well. But there are better, much more productive words and phrases (language markers) to start with than F*** you. My favorites include: a bad situation, the problem with, will have an issue with.
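For readers who want to try this on their own collection, here is a minimal sketch of a language-marker search in Python. It assumes each email is available as plain text; the document IDs and email bodies are invented for illustration, and the marker list contains only the three phrases mentioned above.

```python
import re

# Marker phrases taken from the paragraph above; illustrative, not exhaustive.
MARKERS = ["a bad situation", "the problem with", "will have an issue with"]


def find_markers(text, markers=MARKERS):
    """Return the marker phrases found in text (case-insensitive, whole phrases)."""
    hits = []
    for phrase in markers:
        # \b anchors the phrase at word boundaries; \s+ tolerates line wraps
        # and doubled spaces between the words of the phrase.
        pattern = r"\b" + r"\s+".join(re.escape(w) for w in phrase.split()) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(phrase)
    return hits


# Hypothetical emails, invented for this example.
emails = {
    "msg-001": "Lunch on Friday?",
    "msg-002": "The problem with  the Q3 numbers is that nobody reviewed them.",
    "msg-003": "Legal will have an issue with this. It is a bad situation.",
}

# Keep only the documents that contain at least one marker.
flagged = {}
for doc_id, body in emails.items():
    hits = find_markers(body)
    if hits:
        flagged[doc_id] = hits

for doc_id, hits in flagged.items():
    print(doc_id, hits)
```

A plain phrase search like this is only a starting point. It surfaces candidates for a human reviewer and says nothing about whether a hit matters to the case.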
Search and retrieval require a level of language expertise and subject matter familiarity that can prove difficult and elusive for those who do not routinely work with huge natural language datasets, or in the particular genre of business communication, or subgenres therein. In traditional document review settings, finding smoking gun documents often rests on chance and the ability to dedicate resources to the task. However, with the huge collections of text typical of today’s productions, collections that are growing exponentially every day, leaving things to chance and simply assigning a team of contract lawyers to read millions of pages of text is neither a valid nor a reliable approach.
Let’s consider how a linguist, or in this case an applied linguist specializing in legal document collections and a lawyer trained in linguistics, could apply their expertise to finding smoking gun documents. (Yes, I am referring to my partner Dr. Barry and yours truly.) Over the years, we have found dozens upon dozens of smoking gun documents. Over time we learned that it isn’t just the content of the documents that makes them important, and it isn’t just the combination of linguistic features or patterns. It’s also context and extra-linguistic variables, such as when the document was produced and who wrote it and to whom. All of these things together make a document smoking gun material. For example, a whistleblower’s memo wouldn’t be so impressive if it were written after the DOJ started its investigation of the company, would it? It would just look like some self-serving CYA, something easily explained away by opposing counsel. Likewise, an admission by a pharmaceutical sales rep that “drug X has some problematic and concerning side effects” isn’t very crucial if it comes AFTER the FDA has withdrawn the drug from the market. It’s the intersection of linguistic and extra-linguistic variables that has legal import, and it requires language AND legal expertise to engage in this process.
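The timing point can be made concrete with a small Python sketch. It assumes each document carries a sent date in its metadata; the `Document` fields, the document IDs, and the investigation date are all hypothetical, chosen only to mirror the whistleblower example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    doc_id: str   # hypothetical identifier
    author: str
    sent: date    # date the document was written or sent
    text: str


def predates(doc, key_event):
    """True when the document was sent before the key event."""
    return doc.sent < key_event


# Invented date standing in for the start of a government investigation.
investigation_opened = date(2010, 6, 1)

docs = [
    Document("memo-A", "employee", date(2009, 11, 3), "I am concerned about ..."),
    Document("memo-B", "employee", date(2010, 8, 20), "I am concerned about ..."),
]

# Same wording in both memos; only the one written before the event stays.
early = [d.doc_id for d in docs if predates(d, investigation_opened)]
print(early)
```

The same comparison works for any key date a legal team cares about, such as a product withdrawal. Which dates matter is a legal judgment, not something the code can decide.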
We have had the good fortune of combining our expertise and informing each other’s research objectives for a decade now, and what’s more, we’ve worked with countless corpora consisting of business communications and documentation produced specifically in the context of civil litigation. Here’s what we know: Who better to develop and implement investigative methods in large natural language datasets of unstructured text than linguists, particularly forensic linguists who understand and are comfortable working within a legal setting? Here is another important point to make: Language is complex, to say the least, but investigating Language in a legal setting adds another layer of complexity to the task. Remember, the linguist isn’t making the call about what is or is not a smoking gun. The linguist is leveraging their expertise about Language and patterns of communication in tandem with a legal professional’s expert opinion, oftentimes a lawyer’s, in order to wring every bit of critical information out of a collection of ESI.
For example, when we see patterns of overlapping content, like financial language and risk/benefit language that includes talk of fatalities in a document devoid of personal opinion and emotive language, we understand this is something our client should look at immediately. Likewise, when we see a communication in which some higher-ups email back and forth, expressing negative sentiment and using a lot of informal, personal language, co-occurring with business strategy-related language and unique terms of art, we understand this is also something the legal team should look at immediately. We then distill these linguistic patterns of communication into an algorithm and refer to this algorithm going forward.
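To give a feel for what a co-occurrence check might look like, here is a deliberately simple Python sketch of the first pattern: financial, risk/benefit, and fatality language together, with emotive language absent. It is not the algorithm described above. The category word lists are hypothetical stand-ins, and a real investigation would rely on far richer linguistic features than single-word lookups.

```python
import re

# Hypothetical category lexicons, invented for illustration only.
LEXICONS = {
    "financial": {"cost", "revenue", "margin", "budget"},
    "risk": {"risk", "benefit", "exposure", "liability"},
    "fatality": {"death", "deaths", "fatality", "fatalities"},
    "emotive": {"furious", "terrible", "worried", "hate"},
}


def categories_present(text):
    """Return the names of the categories with at least one word in text."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return {name for name, words in LEXICONS.items() if tokens & words}


def matches_pattern(text, required, absent=frozenset()):
    """True when every required category occurs and no absent category does."""
    found = categories_present(text)
    return set(required) <= found and not (set(absent) & found)


# Invented sample texts.
memo = "Projected cost per unit is lower than the liability from an estimated 12 fatalities."
rant = "I am furious about the budget and worried about the risk."

pattern = {"financial", "risk", "fatality"}
print(matches_pattern(memo, pattern, absent={"emotive"}))  # True
print(matches_pattern(rant, pattern, absent={"emotive"}))  # False
```

The dry memo is flagged and the angry email is not, which matches the first pattern in the paragraph above. The second pattern, informal and negative language alongside strategy terms, could be expressed the same way with different required categories.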
So yes, a smoking gun document is the Holy Grail of document investigation. And you should certainly put yourself in the best position to find one. But like the cup, the smoking gun can be elusive, and perhaps expert investigators should work with legal teams on more productive areas, like identifying material that proves or disproves the elements of a case, that solidly supports crucial case themes, or that is the foundation for the story you want to tell the jury. Sure, it would be great to have a smoking gun as the centerpiece. But more on that next week. Until then, good luck.