Reed Smith and its lawyers have used machine-assisted case preparation tools for many years (and the firm launched its Gravity Stack subsidiary) to apply legal technology that cuts costs, saves labor and surfaces key questions faster for senior lawyers to review. Partners David Cohen, Anthony Diana and Therese Craparo discuss how generative AI is creating powerful new options for legal teams using machine-assisted legal processes in case preparation and e-discovery. They discuss how the field of e-discovery, with the help of emerging AI systems, is gaining wider acceptance as a cost and quality improvement.
Transcript:
Intro: Hello, and welcome to Tech Law Talks, a podcast brought to you by Reed Smith's Emerging Technologies Group. In each episode of this podcast, we will discuss cutting-edge issues on technology, data, and the law. We will provide practical observations on a wide variety of technology and data topics to give you quick and actionable tips to address the issues you are dealing with every day.
David: Hello, everyone, and welcome to Tech Law Talks and our new series on AI. Over the coming months, we'll explore the key challenges and opportunities within the rapidly evolving AI landscape. Today, we're going to focus on AI in eDiscovery. My name is David Cohen, and I'm pleased to be joined today by my colleagues, Anthony Diana and Therese Craparo. I head up Reed Smith's Records & eDiscovery practice group, a big practice group, 70-plus lawyers strong, and we're very excited to be moving into AI territory. We've been using some AI tools, and we're testing new ones. Therese, I'm going to turn it over to you to introduce yourself.
Therese: Sure. Thanks, Dave. Hi, my name is Therese Craparo. I am a partner in our Emerging Technologies Group here at Reed Smith. My practice focuses on eDiscovery, digital innovation, and data risk management. And like all of us, I'm seeing a significant uptick in interest in using AI across industries, and particularly in the legal industry. Anthony?
Anthony: Hello, this is Anthony Diana. I am a partner in the New York office, also part of the Emerging Technologies Group. And similarly, my practice focuses on digital transformation projects for large clients, particularly financial institutions, and I've also been dealing with e-discovery issues for more than 20 years, basically as long as e-discovery has existed. I think all of us on this call have. So looking forward to talking about AI.
David: Thanks, Anthony. And my first question is, the field of e-discovery was one of the first to make practical use of AI in the form of predictive coding and document analytics. Predictive coding has now been around for more than two decades. So, Therese and Anthony, how's that been working out?
Therese: You know, I think it's a dual answer, right? It's been working out incredibly well, and yet it's not used as much as it should be. I think that at this stage, the use of predictive coding and analytics in e-discovery is pretty standard, right? As you said, Dave, two decades ago it was very controversial, and there was a lot of debate and dispute about the appropriate use and the right controls and the like going on in the industry, and a lot of discovery fights around that. But I think at this stage, we've really gotten to a point where this technology is well understood and used incredibly effectively to appropriately manage and streamline e-discovery and to improve on discovery processes and the like. I think it's far less controversial in terms of its use. And frankly, the e-discovery industry has done a really great job at promoting it and finding ways to use this advanced technology in litigation. I think that one of the remaining challenges is that, while the lawyers who are using it are using it incredibly effectively, still not enough people have adopted it. And I think there are still lawyers out there that haven't been using predictive coding or document analytics in ways that they could be to improve their own processes. I don't know, Anthony, what are your thoughts on that?
Anthony: Yeah, I mean, I think to reiterate this: the predictive coding that everyone's used to is machine learning, right? So it's AI, but it's machine learning. And I think it was particularly helpful just in terms of workflow and what we're trying to accomplish in eDiscovery. When we're trying to produce relevant information, machine learning made a lot of sense. And I think I was a big proponent of it. I think a lot of people are, because it gave a lot of control. The big thing was it allowed, I would call, senior attorneys to have more control over what is relevant. So the whole idea is you would train the model by looking at relevant documents, and then you would have senior attorneys basically get involved and say, okay, what are the edge cases? The basic stuff was easy. You had the edge cases, you could have senior attorneys look at them, make that call, and then basically the technology would take whatever you're thinking in your brain, the senior attorney, and use that to help determine relevance. And you're not relying as much on the contract attorneys in the workflow. So it made a whole host of sense, frankly, from a risk perspective. I think one of the issues that we saw early on is everyone was saying it was going to save lots of money. It didn't really save a lot of money, right? Partly because the volumes went up too much, partly because of the process. But from a risk perspective, I thought it was really good, because I think you were getting better quality, which I think was one of the things that's most important, right? And I think this is going to be important as we start talking about AI generally. In terms of processes, it was a quality play, right? This is better. It's a better process. It's better at managing the risks than just having manual review. So that was the key to it, I think. As we talked about, there was lots of controversy about it.
The controversy often stemmed from, I'll call it, the validation. We had lots of attorneys saying, I want to see the validation set. They wanted to see how the model was trained: you have to give us all the documents used to train it. And I think generally that fell by the wayside. That didn't really happen. One of the keys, though, and I think this is also true for all AI, is the validation testing, which Therese touched upon. That became critical. I think people realized that one of the things you had to do, as you were training the model and you started seeing things, was always do some sampling and validation testing to see if the model was working correctly. And that validation testing was the defensibility that courts, I think, latched onto. And I think when we start talking about Gen AI, that's going to be one of the issues. People are comfortable with machine learning and understand the risks. One of the other big risks that we all saw was that the data set would change, right? You have 10 custodians, you train the model, then you get another 10 custodians. Sometimes it didn't matter; sometimes it really made a big difference, and you had to retrain the model. So I think we're all comfortable with that. As Therese said, it's still not as prevalent as you would have imagined, given how effective it is, but that's partly because it's a lot of work, right? And often it's a lot of work by, I'll say, senior attorneys up front, when it's still a lot easier to say, let's just use search terms, negotiate them, throw a bunch of contract attorneys on it, and review what comes back. It works, but I think that's still one of the impediments to it being used as much as we thought.
Therese: And to pick up on what Anthony is saying, what I think is really important is we do have 20 years of experience using AI technology in the e-discovery industry. So much has been learned about how you use those models, the appropriate controls, how you get quality validation and the like. And there's so much to draw from that as the use of AI increases in e-discovery, in the legal field in general, and even across organizations. There's a lot of value in leveraging those lessons learned and applying them to the emerging types of AI that we're seeing. The legal field needs to keep in mind that we know how to use this technology, we know how to understand it, and we know how to make it defensible. As we move forward, those lessons are going to serve us really well in facilitating more advanced uses of AI. So, thinking about the changes that may happen going forward: how do we think that generative AI based on large language models is going to change e-discovery in the future?
Anthony: In terms of how generative AI is going to work, I have my doubts, frankly, about how effective it's going to be. We all know that these large language models are basically based on billions, if not trillions, of data points or whatever, but it's generic. It's all public information. That's what the model is based on. One of the things that I want to see, as people start using generative AI, is how that's going to play when we're talking about confidential information. Almost everything our clients are dealing with in e-discovery is confidential. It's not stuff that's public. So I understand the concept: a large language model built on billions and billions of data points can seem exact, but it's a probability calculation, right? It's basically guessing what the next word is going to be based on that general population, not necessarily on some very esoteric area that you may be focused on for a particular case. So I think it remains to be seen whether it's going to work. The other area where I have concerns is the validation point. How do we show it's defensible? If you're going in and telling a court, oh, I used Gen AI and ran the tool, here's the relevant stuff based on prompts, what does that mean? How are we going to validate that? I think one of the keys is going to be coming up with a validation methodology that will be defensible, that people will be comfortable with. Again, machine learning was intuitive: I'm training the model on what a human being deemed responsive. Frankly, that's easier to argue to a court. It's easier to explain to a regulator.
When you say, I came up with prompts based on the allegations of the complaint or whatever, it's a little bit more esoteric, and I think it's a little bit harder for someone to get their head around. How do you know you're getting relevant information? So I think there are some challenges there. I don't know how that's going to play out. I don't know, Dave, because I know you're testing a lot of these tools: what are you seeing in terms of how this is actually going to work, using generative AI and these large language models and moving away from machine learning?
David: Yeah, I agree with you on the to-be-determined part, but I think I come in a little bit more optimistic, and part of it might be actually starting to use some of these tools. I think that predictive coding has really paved the way for these AI tools, because what held up predictive coding to some extent was people weren't sure that courts were going to accept it. Until the first opinions came out, Judge Peck's Da Silva Moore decision and subsequent case decisions, there was concern about that. But once that precedent came out, and it's important to emphasize that the precedent wasn't just approving predictive coding, it was approving technology-assisted review. And this generative AI is really just another form of technology-assisted review. What it basically said is you have to show that it's valid. You have to do this validation testing. But the same validation testing that we've been doing to support predictive coding will work on large language model, generative AI-assisted coding. Essentially, you do the review, then you take a sample and you ask: was this review done well? Did we hit a high accuracy level? The early testing we're doing is showing that we are hitting even better accuracy levels than with predictive coding alone. And I should say that it's even improved in the six months or so that we've been testing. The companies that are building the software are continuing to improve it. So I am optimistic in that sense. But many of these products are still in development. The pricing is still either high or, in some cases, to be announced. And it's not clear yet that it will be cost-effective beyond current models of using human review and predictive coding and search terms. And they're not all mutually exclusive. I can see ultimately getting to a hybrid model where we still start with search terms to cut down on volume, and then use some predictive coding, some human review, and some generative AI.
Ultimately, I think we'll get to the point where the price comes down and it will make review better and cheaper. But I also want to mention that I see a couple of other areas of application in eDiscovery as well. Generative AI is really good at summarizing single large documents or even groups of documents. It's also extremely helpful in more quickly identifying key documents. You can ask questions about a whole big document population and get answers. So I'm really excited to see this evolution. I don't know when we're going to get there or what the price-effectiveness point is going to be, but I would say that in the next year or two, we're going to see it creep in and be used more and more effectively, and more and more cost-effectively, as we go forward.
Anthony: Yeah, that's fascinating. I can see that even in terms of document review. If AI is summarizing the document, a human can make the relevance determination based on the summary. Again, we can all talk about whether that's appropriate or not, but it would probably help quite a bit. And I do think that's fascinating. Another thing I hear about is privilege logs. Using generative AI to draft privilege logs in concept sounds great, because obviously that's a big cost factor and the like. But as we've talked about, Dave and Therese, there are already tools available, meaning you can negotiate metadata logs and some of these other things that cut the cost down. So I think it remains to be seen. Again, I think this is going to be another arrow in your quiver, a tool to use, and you just have to figure out when you want to use it.
Therese: Yeah. And I think one of the things is not limiting ourselves to only thinking about document review. There's a lot of possibility with generative AI: witness kits, putting together witness outlines for depositions and the like, right? Not that we would ever just rely on that, but there's a huge opportunity there as a starting point, if you're using it appropriately and, to Dave's point, the price point is reasonable. You can do initial research. There are a lot of things it can do in the discovery realm, even outside of document review, that we should keep our minds open to, because it's a way of getting to a starting point more quickly, more efficiently and, frankly, more cost-effectively. Then a person can take a look at that and augment it or build upon it to make sure it's accurate and appropriate for that particular litigation or that particular witness and the like. But I do think Dave really hit the nail on the head. I don't think we're only going to be moving to generative AI and abandoning other types of AI. The reason there are different types of AI is that they do different things. What we are most likely to see is a hybrid: some tools being used for one thing, some tools being used for others. And eventually, as Dave already highlighted, the combination of different types of AI in the e-discovery process, even within the same tool, will get us to a better place. I think that's where we're most likely heading. And as Dave said, that's where a lot of the vendors are actually focusing: adding this additional AI into their workflow to improve the process.
David: Yeah. And it's interesting that some of the early versions are not really replacing the human review. They are predicting where the human review is going to come out. So when the reviewer looks at the document, it already tells you what the software says: is it relevant or not relevant? And it goes one step beyond: it not only gives you the prediction of whether it's relevant or not, it also gives you a reason. So it can accelerate the review, and that can create great cost savings. But it's not just document review. There are already e-discovery tools out there that allow you to ask questions and query databases, but also build chronologies. And again, with the benefit of referencing you to certain documents, in some cases with hyperlinks. So it'll tell you facts, or it'll tell you answers to a question, and it'll link back to the documents that support those answers. So I think there's great potential as this continues to grow and improve.
Anthony: Yeah. And I would say also, let's think about the whole EDRM model, right? Preservation. We'll see what enterprises do, but on the enterprise side, using AI bots and the like for preservation, collection and so on, it'll be very interesting to see if these tools can be used to automate some of the standard workflows before we even get to the review. The other thing that I think will be interesting, and this is one of the areas where we still have not seen broad adoption, is on the privilege side. We know, and we've done some analysis for clients, that privilege review, or looking for highly sensitive documents and the like, is still something most lawyers aren't comfortable using AI for. I don't know why; I've done it, and it worked effectively. But that is still an area where lawyers have been hesitant. And it'll be interesting to see if generative AI and the tools there can help with privilege, right? Whether it's the privilege logs, whether it's identifying privileged documents. To your point, Dave, having the ability to say it's privileged, and here are the reasons, would be really helpful in doing privilege review. So it'll be interesting to see how AI works in that sphere as well, because it is an area where we haven't seen wide adoption of predictive coding or TAR for identifying privilege. And that's still a major cost for a lot of clients. All right, so then I guess where this all leads, and this is more future-oriented: do we think, now that we have generative AI, that there's a paradigm shift coming? We didn't see that paradigm shift, bluntly, with predictive coding, right? Predictive coding came out, everyone said, oh my God, discovery is going to change forever. We don't need contract attorneys anymore.
You know, associates aren't going to have anything to do, because you're just going to train the model and off it goes. And that clearly hasn't happened. Now people are making similar predictions with the use of generative AI: we're not going to need to do document review anymore, whatever. And I think there is concern, and this is concern just generally in the industry: is this an area, since we're already using AI, where AI can take over basically the discovery function? Where we're not necessarily using lots of lawyers, and we're relying almost exclusively on AI, whether it's a combination of machine learning or just generative AI, doing lots of work with very little input from lawyers? So I'll start with Dave there. What are your thoughts on where we'll be in the next three to five years? Are we going to see some tipping point?
David: Yeah, it's interesting. Historically, there's no question that predictive coding did allow lawyers to get through big document populations faster, and there were predictions that it was going to replace all human review. It really hasn't. But part of that has been the proliferation of electronic data. There's just more data than ever before, more sources of data. It's not just email now. It's Teams and texts and Slack and all these different collaboration tools. So that increase in volume has partially offset the increase in efficiency, and we haven't seen any loss of attorneys. I do think that over the longer run there is more potential for Gen AI to replace attorneys who do e-discovery work and, frankly, to replace lawyers and other professionals and all other kinds of workers eventually. I mean, it's just going to get better and better. A lot of money is being invested in it. I'm going to go out on a limb and say that I think we may be looking at a whole paradigm shift in how disputes are resolved in the future. Right now, there's so much duplication of effort. If you're in litigation against an opposing party, you have your document set that your people are analyzing at some expense. The other side has their document set that their people are analyzing at some expense. You're all looking for those key documents, the needles in the haystack. There's a lot of duplicative effort going on. Picture a world where you could just take all of the potentially relevant documents, throw them into the pot of generative AI, and have the generative AI predetermine what's possibly privileged, with lawyers confirming those decisions. But then let everyone, both sides and the court, query that pot of documents to ask: what are the key questions? What are the key factual issues in the case?
Please tell us the answers, and the documents that go to those answers, and cut through a lot of the document review and document production that's going on now, which frankly uses up most of the cost of litigation. I think we're going to be able to resolve disputes more efficiently, less expensively, and a lot faster. I don't know whether that's five years into the future or 10 years into the future, but I'll be very surprised if our dispute resolution procedure isn't greatly affected by these new capabilities. Pretty soon, and when I say pretty soon, I don't know if it's five years or 10, I think judges are going to have their AI assistants helping them resolve cases, and maybe even drafting first drafts of court opinions as well. And I don't think it's all that far off into the future that we're going to start to see that.
Therese: I think I'm a little bit more skeptical than Dave on some of this, which is probably not surprising to either Dave or Anthony. Look, I don't see AI as a general rule replacing lawyers. I think it will change what lawyers do. And it may replace some lawyers who don't keep pace with technology. It's very simple: it's going to make us better, faster, more efficient, right? So that's a good thing. It's a good thing for our clients. It's a good thing for us. But the idea that AI will replace the judgment and the decision-making of lawyers is, to me, maybe way out there in the future when the robots take over the world. I do think it may mean fewer lawyers, or lawyers doing different things. Lawyers who are well versed in technology and can use it are going to be more effective and faster. You're going to see situations where it's expected to be used, right? If AI can draft an opinion or a brief in the first instance and save hours and hours of time, that's a great thing, and that's going to be expected. But I don't see that ever being the thing that gets sent out the door, because you're still going to need lawyers looking at it, making sure it's right, updating it, making sure it's unique to the case, and making sure all the judgments that go into those things are appropriate. I do find it difficult to imagine a world, having been a litigator for so many years, where everyone says, sure, throw all the documents in the same pot and we'll all query it together. Maybe we'll get to that point someday, but I find it really difficult to imagine. There's too much concern about the data, and control over the data, and sensitivity and privilege and all of those things.
You know, we've seen pockets of making data available through secure channels, so that you're not transferring it, where it's the same pool of data that would otherwise be produced, so maybe you're saving costs there. But again, I think it'll be a paradigm shift eventually, a paradigm shift that's been a long time coming, though, right? We started using technology to improve this process years ago, and it's getting better. I think we will get to a point where everyone routinely and more heavily relies on AI for discovery, where it's not, like predictive coding or TAR, something only for the people who know how to use it, but the standard that everybody uses. Like I said, I do think it will make us better and more efficient. I don't see it entirely replacing lawyers, or a world where all the data just goes in and gets spit out and you need one lawyer to look at it and it's fine. But I do think it will change the way we practice law, and in that sense, I do think it'll be a paradigm shift.
Anthony: My final thought is, I tend to be sort of in the middle. But I would say, generally, we know lawyers have big egos, and they will never think that a computer, an AI tool or whatever, is smarter than they are in terms of determining privilege or relevance, right? Part of it is, you have two lawyers in a room, they're going to argue about whether something is relevant. You have two lawyers in a room, they're going to argue about whether something is privileged. So it's not objective, right? There's subjectivity. And I think that's going to be one of the challenges. We've also seen it already: everyone thought every lawyer who's a litigator would have to be really well versed in e-discovery and all the issues that we deal with. That has not happened, and I don't see that changing. So for those reasons, I'm less concerned about a paradigm shift putting all of us out of work.
David: Well, I think everyone needs to tune back in on July 11th, 2029, when we come back together again and see who was right.
Anthony: Yes, absolutely. All right. Thanks, everybody.
David: Thank you.
Outro: Tech Law Talks is a Reed Smith production. Our producers are Ali McCardell and Shannon Ryan. For more information about Reed Smith's emerging technologies practice, please email techlawtalks@reedsmith.com. You can find our podcasts on Spotify, Apple Podcasts, Google Podcasts, reedsmith.com and our social media accounts.
Disclaimer: This podcast is provided for educational purposes. It does not constitute legal advice and is not intended to establish an attorney-client relationship, nor is it intended to suggest or establish standards of care applicable to particular lawyers in any given situation. Prior results do not guarantee a similar outcome. Any views, opinions, or comments made by any external guest speaker are not to be attributed to Reed Smith LLP or its individual lawyers.
All rights reserved.
Transcript is auto-generated.