Coding with ChatGPT: A Journey to Create A Dynamic Legal Research Aid

I haven’t quite gotten this whole ChatGPT thing. I’ve attended the webinars and the AALL sessions. I generally understand what it’s doing under the hood. But I haven’t been able to find a need in my life for ChatGPT to fill. The most relevant sessions for me were in the AALS Technology Law Summer Webinar Series with Tracy Norton of Louisiana State University. She has real-world, day-to-day examples of when she has been able to utilize ChatGPT, including creating a writing schedule and getting suggestions on professional development throughout a career. Those still just didn’t tip the balance for me.

A few weeks ago, I presented to one of our legal clinics and demonstrated a form that our Associate Director, Tara Mospan, created for crafting an efficient search query. At its heart, the form is a visual representation of how terms and connectors work with each other: five columns of five boxes each, where each column holds variations of a single term and connectors sit between the columns. For a drunk driving case, the term in the first box could be car, and below it we would put synonyms like vehicle or automobile. The second column could include drunk, inebriated, and intoxicated. And we would choose the connector between the columns, whether it be AND, w/p, w/s, or w/#. Then, we write out the whole search query at the bottom: (car OR vehicle OR automobile) w/s (drunk OR inebriated OR intoxicated).

Created years ago by Tara Mospan, this worksheet is loved by ASU Law students who frequently request copies from the law librarians even years after they use it for Legal Research and Writing.

After the presentation, I offered a student some extra copies of the form. She said no; I had presented to her legal writing class the year before, and she had been so taken with the form that she recreated it in Excel. Not only that, she used macros to transform the entered terms into a final query. I was impressed and asked her to send me a copy. It was exactly as she had described, using basic commands to put the terms together, with OR between terms within a column and drop-downs for the connectors. She had taken our static form and transformed it into a dynamic utility.

An ASU Law student recreated the Crafting an Efficient Search PDF using Excel so that it had drop-downs.

Now I was inspired: What if I could combine the features of her Excel document with the clean layout of our PDF form? Finally, I saw a use for ChatGPT in my own life. I had read about how well ChatGPT does with programming and it seemed like the perfect application. It could help me create a fillable PDF, with nuanced JavaScript code to make it easy to use and visually appealing.

I went into ChatGPT and wrote out my initial command:

I am trying to create a fillable PDF. It will consist of five columns of text boxes, and each column will have five boxes. Search terms will be placed in the boxes, although not necessarily in every box. There will be a text box at the bottom where the terms from the boxes above will be combined into a string. When there are entries in multiple boxes in a column, I want the output to put a set of parentheses around the terms and the word OR between each term.

ChatGPT immediately gave me a list of steps, including the JavaScript code for the results box. I excitedly followed the directions to the letter, saved my document, and tested it out. I typed car into the first box and…nothing. It didn’t show up in the results box. I told ChatGPT the problem:

The code does not seem to be working. When I enter terms in the boxes, the text box at the bottom doesn’t display anything.

And this began our back and forth. The whole process took around four hours. I would explain what I wanted, it would provide code, and I would test it. When there were errors, I would note the errors and it would try again. A couple times, the fix to a minor error would start snowballing into a major error, and I would need to go back to the last working version and start over from there. It was a lot like having a programming expert working with you, if they had infinite patience but sometimes lacked basic understanding of what you were asking.

For many things, I had to go step-by-step to work through a problem. Take the connectors, for example. I initially just had AND between the columns as a placeholder. I asked it to replace the AND with a drop-down menu to choose the connector. The first implementation ended up replacing the OR between the synonyms within a column rather than the placeholder between the columns. We went back and forth until the connector option worked between the first two columns of terms. Then we worked through the connector between columns two and three, and so on.

At times it was slow going, but it was still much faster than learning enough JavaScript to program it myself. ChatGPT was also able to easily program minor changes that made the form much more attractive, like not having parentheses appear unless a column contains at least two terms, and not displaying a connector unless terms are entered on both sides of it. And I was able to add a “clear form” button at the end that clears all of the boxes and reverts the connectors back to the AND option, with only one exchange with ChatGPT.
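For readers curious what ended up under the hood, here is a minimal sketch of the kind of custom calculation script the finished form relies on. It is not the exact code ChatGPT produced, and the field names (Term1_1 through Term5_5 for the boxes, Conn1 through Conn4 for the drop-downs) are hypothetical stand-ins for whatever your form’s fields are actually called. In Acrobat, a script like this is attached to the results field as its custom calculation script:

    // Sketch of a custom calculation script for the results field.
    // Field names are hypothetical: term boxes "Term<col>_<row>" and
    // connector drop-downs "Conn1" through "Conn4".
    var parts = [];
    var haveEarlierColumn = false;

    for (var c = 1; c <= 5; c++) {
        // Gather the non-empty terms in this column
        var terms = [];
        for (var r = 1; r <= 5; r++) {
            var v = this.getField("Term" + c + "_" + r).valueAsString;
            if (v !== "") terms.push(v);
        }
        if (terms.length === 0) continue; // skip empty columns entirely

        // Parentheses only appear when a column holds two or more synonyms
        var column = (terms.length > 1) ? "(" + terms.join(" OR ") + ")" : terms[0];

        // Show a connector only when terms appear on both sides of it;
        // this uses the drop-down immediately to the left of the column
        if (haveEarlierColumn) {
            parts.push(this.getField("Conn" + (c - 1)).valueAsString);
        }
        parts.push(column);
        haveEarlierColumn = true;
    }

    event.value = parts.join(" ");

The “clear form” button is even simpler: its mouse-up action can call resetForm(), which empties every field and restores the drop-down defaults in one step.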

Overall, it was an excellent introduction to at least one function of AI. I started with a specific idea and ended up with a tangible product that functioned as I initially desired. It was a bit more labor intensive than the articles I’ve read led me to believe, but the end result works better than I ever would have imagined. And more than anything, it has gotten me to start thinking about other projects and possibilities to try with ChatGPT.

Unlocking the Power of Semantic Searches in the Legal Domain

The language of law has many layers. Legal facts are more than objective truths; they tell the story and ultimately decide who wins or loses. A statute can have multiple interpretations, and those interpretations depend on factors like the judge, context, purpose, and history of the statute. Legal language has distinct features, including rare legal terms of art like “restrictive covenant,” “promissory estoppel,” “tort,” and “novation.” This complex legal terminology poses challenges for normal semantic search queries.  

Vector databases represent an exciting new trend, and for good reason. Rather than relying on traditional Boolean logic, semantic search leverages word associations by creating embeddings and storing them in a vector database. In machine learning and natural language processing, embeddings represent words or sentences as dense vectors of real numbers in a continuous vector space. This numerical representation of text is typically generated by a model that tokenizes the text and learns embeddings from the data. Vectors capture the contextual and semantic meaning of each word. When a user makes a semantic query, the search system works to interpret their intent and context. The system then breaks the query into individual words or tokens, converts them into vector representations using embedding models, and returns ranked results based on their relevance. Unlike Boolean search, which requires specific syntax (“AND”, “OR”, etc.), semantic search allows for queries in natural language and opens up a whole new world of potential when searches are not constrained by the rules of exact text matching.
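To make the ranking step concrete, here is a minimal sketch of that pipeline in JavaScript. The embed() function is a hypothetical stand-in for whatever embedding model or API a given system uses; the rest is just the vector math described above:

    // Cosine similarity between two embedding vectors: close to 1 means
    // very similar meaning, close to 0 means unrelated.
    function cosineSimilarity(a, b) {
        let dot = 0, normA = 0, normB = 0;
        for (let i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Rank documents against a natural-language query.
    // embed() is hypothetical: it would call an embedding model and
    // return a dense vector (an array of numbers) for the text.
    async function semanticSearch(query, documents) {
        const queryVector = await embed(query);
        const scored = [];
        for (const doc of documents) {
            const docVector = await embed(doc.text);
            scored.push({ doc, score: cosineSimilarity(queryVector, docVector) });
        }
        return scored.sort((a, b) => b.score - a.score);
    }

In a production system the document vectors are computed once and stored in the vector database, so only the query needs to be embedded at search time.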

However, legal language differs from everyday language. The large number of technical terms, the careful precision, and the fluid interpretations inherent in law mean that semantic search systems may fail to grasp the context and nuances of legal queries. The interconnected and evolving nature of legal concepts poses challenges in neatly mapping them into an embedding space representation. One potential way to improve semantic search in the legal domain is by enhancing the underlying embedding models. Embedding models are often trained on generalized corpora like Wikipedia, giving them a broad but shallow understanding of law. This surface-level comprehension proves insufficient for legal queries, which may seem simple but have layers of nuance. For example, when asked to retrieve the key facts of a case, an embedding model might struggle to discern what facts are relevant versus extraneous details.  

The model may also fail to distinguish between majority and dissenting opinions because it lacks the legal background needed to make such differentiations. Training models on domain-specific legal data represents one promising approach to overcoming these difficulties. By training on in-depth legal corpora, embeddings could better capture the subtleties of legal language, ideas, and reasoning. For example, Legal-BERT (BERT stands for Bidirectional Encoder Representations from Transformers) was pre-trained on the CaseHOLD dataset. This corpus is large (37GB), representing 3,446,187 legal decisions across all federal and state courts, bigger than the BookCorpus/Wikipedia corpus originally used to train the BERT model. When tested on LexGLUE, a benchmark dataset for evaluating the performance of NLP methods on legal tasks, Legal-BERT performed better than ChatGPT.

Semantic search shows promise for transforming legal research, but realizing its full potential in the legal domain poses challenges. Legal language is complex and can make it difficult for generalized embedding models to grasp the nuances of legal queries. However, recent optimized legal embedding models indicate these hurdles can be overcome by training on ample in-domain data. Still, comprehensively encoding the interconnected, evolving nature of legal doctrines into a unified embedding space remains an open research problem. Hybrid approaches combining Boolean and vector models are a promising new frontier that many researchers are exploring. 

Realizing the full potential of semantic search for law remains an ambitious goal requiring innovative techniques. But the payoff could be immense: responsive, accurate AI assistance for case law research and analysis. While still in its infancy, the continued maturation of semantic legal search could profoundly augment the capabilities of legal professionals, and the shift from generic to domain-specific models is a promising place to start.

Audit Trails for AI in Legal Research

LLMs have come a long way even in the time since I wrote my article in June.  Three months of development time with this technology feels like three years – or maybe that’s just me catching up.  Despite that, there are still a couple of nagging issues that I would like to see implemented to improve their usage to legal researchers.  I’m hoping to raise awareness about this so that we can collectively ask vendors to add quality-of-life features to these tools for the benefit of our community. 

Audit Trails

Right now, the tools do not have a way for us to easily check their work. Law librarians have made a version of my argument for over a decade now. The legendary Susan Nevelow Mart famously questioned the opacity of search algorithms in legal research databases and evaluated their impact on research results. More recently, I was in the audience at AALL 2023 when the tenacious and brilliant Debbie Ginsburg from Harvard asked Fastcase, BLaw, Lexis, and Westlaw how we (law librarians) could evaluate the inclusivity of the dataset of cases that the new AI algorithms are searching. How do we know if they’ve missed something if we don’t know what they’re searching and how complete it is?

As it stands, the legal research AI tools that I’ve demoed do not give you a summary of where they have gone and what they have done. An “audit trail” (as I’m using this expression) is a record of which processes were used to achieve a specific task, the totality of the dataset searched, and why the results presented to the user were chosen. That way, if something goes wrong, you can go back and look at what steps were taken to get the results. This would provide an extra layer of security and confidence in the process.
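Concretely, an audit trail could be as simple as a structured record returned alongside the results. The sketch below is purely illustrative; every field name is hypothetical and does not correspond to any vendor’s actual output:

    // A hypothetical audit-trail record, not any vendor's real API.
    // The point is what a researcher could verify after the fact.
    const auditTrail = {
        query: "duty to mitigate damages in commercial leases",
        timestamp: "2023-09-28T15:42:00Z",
        steps: [
            { step: 1, action: "Rewrote the question into search terms" },
            { step: 2, action: "Searched state and federal case law" },
            { step: 3, action: "Ranked candidates by relevance to the query" }
        ],
        datasetDescription: "State and federal cases; scope per coverage map",
        coverageMapUrl: "https://example.com/coverage", // placeholder link
        resultsReturned: 20,
        selectionRationale: "Top 20 candidates by relevance score"
    };

Tucked behind a drop-down arrow, a record like this would let a researcher confirm the scope and method of a search without cluttering the interface.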

Why Do We Need This?

These tools have introduced an additional layer of abstraction that separates legal researchers from the primary documents they are studying, altering how legal research is conducted. While the new AI algorithms can be seen as a step forward, they can undermine the precision that Boolean expressions once offered, which allowed researchers to predict with more certainty the type of results they would encounter. Coverage maps are still available to identify gaps in the data for some of these platforms, but there is a noticeable shift toward less control over the search process, calling for a thoughtful reassessment of the evolving dynamics of legal research techniques.

More importantly, we (law librarians) are deep enough into these processes and this technology to be highly skeptical and to evaluate the output with a critical eye. Many students and new attorneys are not. I have told this story at some of my presentations: a recent graduate called me with a Pacific Reporter citation for a case that they could not find on Westlaw. This person was absolutely convinced that they were doing something wrong and had spent around an hour searching for the case because “this was THE PERFECT case” for their situation. It ended up being a fabrication from ChatGPT, but the graduate had to call me to discover that. This is obviously a somewhat outdated worry, since Rebecca Fordon has brought us all up to speed on the steps being taken to reduce hallucinations (and OpenAI got a huge amount of negative publicity from the now-infamous ChatGPT Lawyer).

My point is less about the technology and more about the incentives set in place – if there is a fast, easy way to do this research then there will inevitably be people who are going to uncritically accept those results.  “That’s their fault and they should get in trouble,” you say?  Probably, but I plan to write about the duty of technological competency and these tools in a future post, so we’ll have to hash that out together later.  Also, what if there was a fast, easy way to evaluate the results of these tools…

What Could Be Done

Summarizing the steps involved in research seems like a feasible task for Westlaw, Lexis, BLaw, et al. to implement. They already have to use prompting to tell the LLM where to go and how to search; we’re just asking for a summary of those steps to be recorded somewhere so that we can double-check it. Could they wrap that same prompting in a prompt that says something to the effect of “Summarize the steps taken in bullet points,” and then place the output behind a drop-down arrow so that we could check it? Could they include hyperlinks to coverage maps in instances where it would be useful for the researcher to know how inclusive the search is? In instances where they’re using RAG, could they include a prompt that says something to the effect of “Summarize how you used the underlying documents to generate this text”?
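As a sketch of how little this might take, here is one way such a wrapper could look. The wording and names are mine, purely illustrative, and not any vendor’s actual prompting:

    // Illustrative only: wraps an existing RAG prompt with an audit
    // request. Nothing here reflects a real vendor implementation.
    function buildAuditablePrompt(userQuery, retrievedDocs) {
        return [
            "Answer the research question using only the documents below.",
            "Question: " + userQuery,
            "Documents:",
            retrievedDocs.join("\n---\n"),
            "",
            "Then, under a separate 'Audit' heading:",
            "1. Summarize the steps taken in bullet points.",
            "2. Summarize how you used the underlying documents to generate this text."
        ].join("\n");
    }

The “Audit” section of the response could then be split off and placed behind the drop-down arrow suggested above.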

As someone who has tinkered with technology, all of these seem like reasonable requests that are well within the ability of these tools. I’m interested to hear if there are reasons why we couldn’t have these features or if people have other features they would like. Please feel free to post your ideas in the comments or email me.

Why Law Librarians?

Some of you reading this may be skeptical that these new AI technologies are 1) within your skillset and/or 2) worth the effort to learn. I’m the congenital optimist who is here to win you over. These tools are on the verge of revolutionizing the field of law (once they get out of their prototype phase) and I can’t think of a better group of people on law school campuses, in government organizations, and in law firms to evaluate and implement these technologies. Law Librarians (traditionally) have two crucial skill sets that make us well-suited to take the lead here:

  • We understand how information is organized and
  • We understand how information is used in the research and practice of law.

This is an AI YouTuber with ~70k subscribers who develops and trains LLMs from scratch. Do you see what he has listed as the number one discipline that people need to learn to use these tools? Computer science skills rank third on his list, while “Librarianship and Information Science” sits at #1.

This dude gets it.

Many of the tips that David Shapiro provides in that video for people creating custom LLMs will be absolutely obvious to law librarians because we live and breathe them every day at our jobs: taxonomies, data organization, “source of truth,” etc. Whether in the tech services department or research instruction, we are well-versed in organizing and finding information.

We already have many of the data structures in place that could be easily used by these technologies. Besides constructing the initial models, our role will be pivotal in continuously updating and assessing their effectiveness. Moreover, we will provide vital guidance on the proper utilization of these tools.

Does this list look like something your Technical Services department does? Can you think of anyone else in your organization who would be better at making knowledge graphs, indexes, or tables of contents for legal materials? Who would be better suited than your Research and Instruction team to teach newcomers how to interact with these tools to get the information that they need? Who in your organization is best positioned to teach (or already teaches) information literacy? I would argue that nobody can do it better than law librarians (not even computer science people).

Now What?

Let’s mobilize a push to collaborate on these tools. We need to get groups of law librarians together who are interested in rolling up their sleeves and digging into the nitty-gritty of creating, auditing, and using LLMs. I am a member of LIT-SIS in AALL, and maybe we need a special caucus to address this specific technology. Additionally, we can get consortia of schools together in each state to develop our own LLMs – outside of the subscription-based products that will roll out from Lexis and Westlaw. Anything we build ourselves will have the needs of our community at the forefront. We can build in all of the transparency, privacy, and accuracy that may be lacking in commercial models. Schools can build tools that would not be commercially viable at firms. Firms and courts could build specialized tools to support their unique workflows. It opens up many options that are not available if we’re stuck with the one-size-fits-all nature of Lexis and Westlaw subscriptions.

This is an open-source model that is close to competing with GPT-4 (ChatGPT’s underlying model). There are many of these, and new models show up every day.

There are many options to create, train, and locally run custom LLMs as long as you have the data. As David Shapiro said in the video, “data is the oil of the information age,” and law libraries are deep wells of the type of data that could be used to accurately train these services. Additionally, when you are locally hosting an LLM, many of the concerns surrounding privacy, permissions, and student data completely evaporate because you are in control of what information is being sent and stored.
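To give a sense of how approachable local hosting has become, here is a sketch of querying a locally running model over HTTP. The endpoint and payload follow the pattern of local LLM servers such as Ollama, but treat the details as assumptions to adapt to whatever server you actually run:

    // Sketch: query a locally hosted LLM over HTTP. The endpoint and
    // payload follow Ollama's /api/generate convention; adjust both
    // for your own server. The prompt never leaves your machine.
    async function askLocalModel(prompt) {
        const response = await fetch("http://localhost:11434/api/generate", {
            method: "POST",
            headers: { "Content-Type": "application/json" },
            body: JSON.stringify({
                model: "llama2",  // whichever local model you have installed
                prompt: prompt,
                stream: false     // return a single JSON object, not a stream
            })
        });
        const data = await response.json();
        return data.response;
    }

    askLocalModel("Explain promissory estoppel in two sentences.")
        .then(console.log);

Because nothing is sent to a third party, questions about who sees student queries or client data come down to your own server configuration.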

To do all of this, we need organization, collaboration, and funding. Individually this could be difficult, but if we band together in consortia, we can get a lot done.

Students

Students are an incredible resource in this area. Many of them come to law school with computer science and data science backgrounds and can help with the creation and development of these models. They need mentors and organizers to help focus their efforts, provide resources, and nurture their creativity. In addition, they provide a deep reservoir of diverse voices and experiences that may not occur to people who have spent decades in academia, the public sector, or law firms. We can bring in students to have competitions to create their own LLM apps for law practice and access-to-justice initiatives. We can fund fellowships to do work at schools, courts, and firms. We can bring them under our wing to usher in the next generation of tech-savvy law librarians. We can leverage the excitement and energy associated with these new tools to attract new talent into our field – when I skimmed TikTok, the #ChatGPT hashtag had around 7.7 billion views. To do that, we need to brainstorm together so that we can get these programs in place.

In Sum

As the torchbearers in this promising venture, it’s time for us, the law librarians, to step up and show the world our unmatched prowess in harnessing the potential of LLMs in law, weaving our expert knowledge in information science, law, and emerging technology. Let us band together, utilizing the rich data reserves at our disposal, and carve out a future where legal technology is not just efficient and transparent, but also a collaborative masterpiece fostered by our relentless pursuit of innovation and excellence.