Iain Carmichael and Michael Kim Present at PyData Carolinas Conference

Iain Carmichael and Michael Kim recently gave a presentation at the PyData Carolinas conference on the topic of Networks and the Law. For their talk they analyzed data from CourtListener, applying a variety of network algorithms to identify important or influential cases.

Here’s the video from the talk:

And their slides are available too.


What does network science have to say about the law? Can we determine which are the most influential cases in our legal system? Can we understand how legal doctrine evolves? Using tools from network statistics and data provided by Court Listener (an open legal data project), we analyze the network of law case citations.

Citation networks have recently been a topic of interest to network scientists. Court Listener, an open data initiative, provides the network of law case citations as well as the text of (almost every) court case in the US. This network data set provides a rich array of questions that are of interest to legal scholars as well as network scientists.

Can we determine which cases are the most influential in our legal system? Can we understand how legal doctrine evolves? We will discuss what we learned about how …

more ...

Extracting Text from Our Collection of PACER Documents

We’re getting ready to launch a brand new search engine for PACER content. When it launches, one of the big features it will have is full-text search for the millions of documents that people have submitted using our RECAP system. To our knowledge, this will be the first free system for searching PACER content in this way, allowing you to look up documents by any word they might contain.

The big problem with this goal? We have about a million PDFs that consist only of images. Some of these are actually quite beautiful:

Handwritten Motion

A beautiful handwritten motion. It goes on like this for 46 pages.

But others are hideous:

Log from 1957

An 84-page log from 1957. It’s come a long way just to appear on this blog today.

But no matter how a document looks, we want to extract the text so that we can make it searchable. This is done using a system called Optical Character Recognition (OCR), which looks at each pixel in each page of each document and tries to figure out what letter it is a part of. As you might expect, this can take a while when you’re processing millions of documents averaging …

more ...

CourtListener.com Now Supports Oral Arguments from the Second Circuit

Seal for Second Circuit

We are happy to share that as of today, oral argument recordings from the Second Circuit Court of Appeals are finally available on CourtListener.com. This means that you can search these recordings, create email alerts for them, listen to them on our site, and even include them in custom podcasts. Of course, we also provide enhanced versions of these recordings for download, and for developers or researchers they’re also available as bulk data or via our APIs.

Before today, we were unable to provide these features for the Second Circuit because they didn’t post their oral argument recordings on their website, so we’re thrilled that they’ve begun doing so. At this point, only the Tenth and Eleventh Circuits do not post their oral argument recordings, but we are hopeful that they will follow the lead of the other circuits and begin doing so soon.

more ...

CourtListener’s SCOTUS Data Gets Even Better with Legacy Data from the Supreme Court Database

We’re excited to share that as of today, we have added the latest data from the Supreme Court Database (SCDB) into CourtListener. This update adds SCDB IDs, parallel citations, vote counts, and decision direction data to about 20,000 Supreme Court cases. Each of these enhancements enables some great functionality.

For example, now that we have vote counts for older cases, you can create visualizations of older topics, like the “Separate but Equal” doctrine or the Commerce Clause. Colin Starger, the creator of SCOTUS Mapper, has been working with this early data and has created a variety of fascinating historical Supreme Court network graphs. If you want to experiment with this, the place to start is at the SCOTUS visualization homepage.

Here’s a taste, showing Katz v. U.S. plotted to Olmstead. In this graph you can see that over time the vote went from a divided conservative vote in 1928 to a divided liberal vote in 1967:

The other big enhancement that we’re excited about is that we were able to add about 60,000 parallel citations to the cases we have in CourtListener. This enables our citation parser to find these old citations and …

more ...

Launching the Next Version of CourtListener.com

Today we’re launching a new version of CourtListener.com that focuses on making the site more useful and intuitive.

The big new feature in this version is the addition of a new navigation bar at the top of every screen, as you can see below:

New Navigation Bar

The new navigation bar on every page.

This should be a big improvement over our old site, making it much easier and more intuitive to use the oral arguments, judges, and visualizations sections of CourtListener.

The other big feature in this version is new advanced search pages for opinions, oral arguments, and judges. For example, here’s a screenshot of the new judicial advanced search page:

Screenshot of the Judge Advanced Search page

The new judicial advanced search page.

This page should make it much easier to understand and query our judge database, and we expect the pages for advanced oral argument search and advanced opinion search will be similarly valuable.

The final enhancement we’re excited about is a layer of polish across the entire site. This cleans up some old issues, adds explanations to areas that were somewhat unclear before, and makes the site more accessible to people with certain physical disabilities.

This kind of work doesn’t sound like …

more ...

Retiring and Consolidating RECAP Websites

We’re talking a lot about RECAP lately and we’ve realized that it’s a good time to retire the recapthelaw.org website. Free Law Project took over RECAP back in May of 2014 and since then there have been two places where we wrote blog posts, two places where you could get information about PACER and RECAP, and two places we had to maintain on a day-to-day basis. By winding down this site, we’ll be able to focus more clearly on the task at hand: liberating documents from PACER.

As of now, all the old content has been moved to this site, and the new home for RECAP is https://free.law/recap/. You can go check it out; we spent some time on it, and it should be a great homepage for the project.

If you have any thoughts or notice anything broken please let us know. We’ll have more announcements about RECAP soon.

more ...

Judge Profiles on CourtListener Now Have Campaign Finance Information from the National Institute on Money In State Politics

We’re proud to share that as of today we’ve added campaign finance data to our database of judges. This update links judges in the CourtListener system to their fundraising profiles in the FollowTheMoney.org database, allowing researchers and members of the public a new way to understand judges elected in State Supreme Court jurisdictions. This work was made possible by a prototype grant from the John S. and James L. Knight Foundation.

Using this system, you can easily see the sources of money that a judge received as part of an election, and you can put it side by side with all of the data that we have already gathered about that judge, such as the decisions they’ve written, the positions they’ve held professionally and in the judiciary, and their biographical information.

For example, on the page for Judge Tom Parker, there is a new section that looks like this:

Example screenshot

Tom Parker has raised approximately $2.1M.

To our knowledge, it has never previously been possible to research the decisions written by a judge side by side with the money they’ve received. We invite researchers and journalists to use this information to uncover interesting …

more ...

Judge Profiles on CourtListener Now Show the Cases Authored by Each Judge

When we launched our judicial database, we shared our plan to show the cases written by each judge. As of today, we’re pleased to share that we’ve launched the first iteration of that endeavor. If you pull up any judge, say, Sonia Sotomayor, you’ll see a new section at the bottom that looks like this:

This listing provides the five most important opinions by the judge, and you can click the button at the bottom to see all of the cases they wrote or participated in. Clicking the button takes you to our search results, where you can slice and dice the data, choosing, for example, to see only their opinions from the Second Circuit, or their Supreme Court cases.

In the search results and in the list on the judge profile page, the opinions are ordered by relevance, using our CiteGeist relevance engine. This highlights the cases that have been cited the most frequently by the most important cases.
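To make the idea of "cited the most frequently by the most important cases" concrete, here is a minimal sketch of a PageRank-style score on a toy citation network. This is an assumption for illustration only: the post doesn't describe how CiteGeist works internally, and the function name `rank_cases`, the damping value, and the case labels below are all hypothetical.

```python
from collections import defaultdict


def rank_cases(citations, damping=0.85, iterations=50):
    """Score cases so that citations from highly scored cases count for more.

    `citations` is a list of (citing_case, cited_case) pairs. This is a
    generic PageRank-style iteration, not CiteGeist's actual algorithm.
    """
    cases = sorted({case for pair in citations for case in pair})
    cited_by_case = defaultdict(list)
    for citing, cited in citations:
        cited_by_case[citing].append(cited)

    n = len(cases)
    scores = {case: 1.0 / n for case in cases}
    for _ in range(iterations):
        # Every case starts each round with a small baseline score.
        new_scores = {case: (1.0 - damping) / n for case in cases}
        for case in cases:
            targets = cited_by_case.get(case)
            if targets:
                # A case splits its score evenly among the cases it cites.
                share = damping * scores[case] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A case that cites nothing spreads its score over all cases.
                share = damping * scores[case] / n
                for other in cases:
                    new_scores[other] += share
        scores = new_scores
    return scores


# Hypothetical toy network: each pair reads "citing case -> cited case".
citations = [("C", "A"), ("D", "A"), ("D", "B"), ("E", "C"), ("F", "C")]
scores = rank_cases(citations)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy network, A and C are each cited twice, but A ranks first because one of its citations comes from C, which is itself well cited.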

Finally, you can now get an RSS feed for any active judge in our system, enabling you to keep up with anything they write. To do so, click the RSS icon, and configure it with your RSS …

more ...

Twenty-nine New Jurisdictions and an Improved Interface Coming Soon to CourtListener

At CourtListener, we’re currently working on one of the biggest upgrades yet, and in doing that we’re adding many new jurisdictions that we didn’t have previously:

  • Attorney General opinions from Arkansas, California, Colorado, Florida, Kansas, Louisiana, Maryland, Missouri, Montana, Nebraska, New York, Texas, Washington, and Wisconsin
  • Arkansas Workers’ Compensation Commission
  • Appellate Division of the Superior Court of California
  • Colorado Industrial Claim Appeals Office
  • Connecticut Compensation Review Board
  • Massachusetts Department of Industrial Accidents
  • New Jersey Court of Chancery
  • Superior Court of North Carolina
  • North Carolina Industrial Commission
  • Civil Court of the City of New York
  • Criminal Court of the City of New York
  • Oregon Tax Court
  • Superior Court of Rhode Island
  • Tennessee Superior Court for Law and Equity
  • Texas Judicial Panel on Multidistrict Litigation
  • Court of King’s Bench

This brings the number of jurisdictions to more than 400, and it is getting hard to tell which courts are important. For example, some of the courts above have been terminated, and the King’s Bench is actually an English jurisdiction, or was, until it was terminated in 1873.

To address the confusion that is caused by so many jurisdictions (a good problem to have), we’ve identified the …

more ...

Second Circuit Court of Appeals to Finally Place Oral Arguments Online by Default: Write to the Court with Your Suggestions

At the end of last week there was some excellent news coming out of the Second Circuit:

That’s right, years after the other circuits put their oral arguments online, the Second Circuit has decided to join the party. According to the Court’s announcement (presently on the homepage; will eventually be in their archive):

At its quarterly meeting on May 23, 2016, the judges of the United States Court of Appeals for the Second Circuit approved the posting of audio recordings of oral arguments to the Court’s website, commencing August 15, 2016, the first day of the 2016 Term.

A few months ago, we calculated that this content would cost $300,000 to purchase, so this is great news for historians, scholars, legal practitioners, and everybody in between.

To make this change, the Court has proposed a change to its local rules, and there is a 30-day period ending July 15th for the public to make comments on the change. The change the Court has proposed is quite minor, simply stating that the website should now have …

more ...

Colin Starger’s Talk about Citation Visualizations at CALI Conference

Last week was a busy one for Free Law Project. While I was at the LTDCA conference in San Diego, presenting on our database of judges, Colin Starger of University of Baltimore School of Law was at the annual CALI conference talking about our new Supreme Court citation visualizations.

While my talk wasn’t recorded, Colin’s was, and you can watch it now. He’s also got a great post on his blog that has lots more information about the talk and his process. Both are a great way to get started with our Supreme Court visualizations. Enjoy!

more ...

Milestone: CourtListener has 365 Days of Continuous Oral Argument Listening

You said you liked listening to oral argument recordings, and we heard you. Back in 2014, we began collecting oral argument recordings, and we’re happy to share that as of today we have more than 365 days of continuous oral argument listening — a full year. You can sit down today, start listening to oral arguments, and 365 days later, you’ll have finished listening to what we currently have. (Of course, by then, we’ll have thousands more minutes to listen to!)

Lots of people like binge watching TV shows. So, for comparison, this much oral argument audio is similar to watching:

  • Every episode of The Simpsons…40 times
  • Every episode of Law & Order…19 times
  • Every episode of Sesame Street…2 times
  • About half of the episodes of General Hospital!
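As a rough back-of-the-envelope check, the comparisons above imply an approximate total running time for each show. The sketch below only divides the 365-day figure by the multipliers in the list; the variable names are ours, and the results are estimates implied by the post, not official episode runtimes.

```python
# 365 days of continuous audio, expressed in hours.
TOTAL_HOURS = 365 * 24  # 8,760 hours

# How many times each show could be watched in that time, per the list above.
# "About half of the episodes" is treated as 0.5 of one full viewing.
times_watched = {
    "The Simpsons": 40,
    "Law & Order": 19,
    "Sesame Street": 2,
    "General Hospital": 0.5,
}

# Implied length of one full viewing of each show, in hours.
implied_hours = {
    show: TOTAL_HOURS / times for show, times in times_watched.items()
}

for show, hours in implied_hours.items():
    print(f"{show}: roughly {hours:,.0f} hours per full viewing")
```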


Listening to lawyers argue for this much time is not recommended, but we’ve seen demand for this material and we’re very pleased to offer it as oral argument podcasts or directly on CourtListener.com.

We’re also working on and investigating a few new projects to enhance oral argument recordings:

  • Removing dead air at the beginning and ends of oral argument recordings and doing volume …

more ...

New CSV of Reporters of Decisions

One of the projects we maintain at Free Law Project is a database of reporters of judicial decisions. This database has been popular among developers, but we’ve heard that the data was hard to work with.

To fix this, we created a new CSV of the data that is available now. It currently has 440 reporters.

For each reporter, we collect the following information:

  • Any series that the reporter has, for example, a 2d or 3d.
  • Any variations that the abbreviation for the reporter may have. For example, Kentucky Reports can be cited variously as, “B. Mon.”, “Ky.(B.Mon.)”, “Mon.”, “Mon.B.”, or “Monroe, B.” We have nearly 1,000 of these so far.
  • The start and end dates for each series of each reporter.
  • The jurisdictions covered by each reporter.

Together, this information is vital for creating citators and for identifying what decision a citation actually refers to.
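As an illustration of how the variations can be used, here is a minimal sketch that maps an abbreviation found in a citation back to its reporter. The record layout, the field names, and the `identify_reporter` helper are assumptions made up for this example and are not the actual format of the CSV; only the Kentucky Reports variations come from the list above, and "Example Reports" is invented.

```python
# Hypothetical records: field names are illustrative, not the real CSV columns.
REPORTERS = [
    {
        "name": "Kentucky Reports",
        "variations": ["B. Mon.", "Ky.(B.Mon.)", "Mon.", "Mon.B.", "Monroe, B."],
    },
    {
        "name": "Example Reports",  # made-up reporter for contrast
        "variations": ["Ex.", "Ex. Rep."],
    },
]


def normalize(abbreviation):
    """Drop whitespace and case so that 'B. Mon.' and 'b.mon.' compare equal."""
    return "".join(abbreviation.split()).lower()


def build_variation_index(reporters):
    """Map each normalized variation to the reporters that use it."""
    index = {}
    for reporter in reporters:
        for variation in reporter["variations"]:
            index.setdefault(normalize(variation), []).append(reporter["name"])
    return index


def identify_reporter(abbreviation, index):
    """Return the candidate reporters for an abbreviation (possibly none)."""
    return index.get(normalize(abbreviation), [])


index = build_variation_index(REPORTERS)
print(identify_reporter("B.Mon.", index))   # candidates for a spacing variant
print(identify_reporter("Unknown", index))  # no match returns an empty list
```

The lookup returns a list because an abbreviation can be ambiguous; the start and end dates and the jurisdictions described above are the kind of information that could narrow such a list to a single reporter.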

More information about the database can be found on its page here. We have used this database in production on CourtListener for years and we believe the collection of reporters is nearly complete. However, we do need help getting the start and end dates for each reporter series. If …

more ...

Ending our PACER Drainage Initiative and Stopping our Email Lists

We at Free Law Project have been working towards our goals for a few years now, and as a result we’ve accumulated a bit of cruft. Today, we shed just a bit of it, as we simultaneously end one of our smaller projects and deprecate our list server.

The project we are ending is our PACER Drainage project. The idea of the project was to put pressure on PACER by having lots of people use their $15 fee waiver to download PACER content to the RECAP archive.

The main reason we’re ending this project is that PACER is vast, and we never got the kind of uptake we needed for this program to be successful. We knew that we’d never make a big impact on PACER unless thousands of people started using their fee waivers to download content, but we went ahead with this project anyway for two reasons. First, we wanted to be part of the effort last year to apply pressure to PACER, and second, we wanted to raise awareness about the PACER issue via a direct call to action.

While I don’t think we succeeded in applying the pressure we wanted (the AO …

more ...

More Information about our Judicial Database and Some Responses to Feedback

Robert Ambrogi recently wrote an article about our new judicial database for his LawSites blog. In his article, he makes a few concrete observations about our judicial database, and I want to use these observations as a launching point to talk some more about what we have made, why it is useful, and what we are working on next.

The two observations Robert makes are:

  1. Our page for Justice Robert Cordy is sparse compared to the same page on Ballotpedia.

  2. Scalia’s end date was not set for his time on the Supreme Court, and his education data was not quite correct.

These kinds of observations are really important to us, and they show that we still have work to do building and explaining our work.

On Sparseness

To the first observation about Robert Cordy, our response is that we’re building a database, not a more free-form wiki. Unlike the incredible work Ballotpedia is doing, which allows almost any kind of information, our work is focused on gathering specific facts about judges and appointing officials. This approach has pros and cons, and Robert is fair to point out that our data about this important judge is fairly sparse. He …

more ...

Notes and Sketches from Making SCOTUS Network Visualizations

A design sketch with case names

A sketch showing links between cases (click for enlarged view)

In February we announced our Supreme Court Citation Network tool that we developed with University of Baltimore School of Law. We haven’t had a chance until now to comment on some of the technical difficulties that came up while we were working on it. If you’re not familiar with this tool, you should take a moment now to go check it out (gallery, homepage).

In this post I’ll be talking about the challenges that we overcame in order to efficiently generate these visualizations. If you like what you read here, you might want to vote (hint, hint) for Colin Starger’s talk at the CALI Conference.

In the Beginning…

A goal at the start was to create a system that could quickly generate these diagrams in response to a user’s request, without resorting to any kind of “please wait” mechanism such as a spinner or any other tricks that might frustrate our users. This would turn out to be a very difficult goal because of the nature of citation networks.

In a database like ours, the data is organized into tables, much like in an Excel …

more ...

CourtListener Podcasts Now on Google Play

Google Play Logo

Just a quick post today to share that our oral argument podcasts are now available on Google Play Music.

If you are a user of Google Play Music, you can easily subscribe to our podcasts by searching for “Free Law Project”, “CourtListener”, or simply, “oral arguments”. Once you subscribe, the podcasts will download to your device if you use one, or will be playable via the website.

These podcasts contain all of the oral argument audio for a given court or for a search that you create. This means that in 2016, you can literally pipe the audio from the Supreme Court and Federal Circuit Courts directly to your pocket.

In honor of this announcement we’ve created a new page on our site that lists our existing, pre-made podcasts, explains how to make custom ones, and explains how to subscribe to them in Google Play Music or Stitcher Radio.

We hope you’ll enjoy these podcasts. Who doesn’t want the Supreme Court piped to their pocket?

more ...

Free Law Project and Princeton/Columbia Researchers Launch First-of-its-Kind Judicial Database

A screenshot of President, Judge Taft

President Taft’s Biography Page

Today we’re extremely proud and excited to be launching a comprehensive database of judges and the judiciary, to be linked to CourtListener’s corpus of legal opinions authored by those judges. We hope that this database, its APIs, and its bulk data will become a valuable tool for attorneys and researchers across the country. This new database has been developed with support from the National Science Foundation and the John S. and James L. Knight Foundation, in conjunction with Elliott Ash of Princeton University and Bentley MacLeod of Columbia University.

At launch, the database has nearly 8,500 judges from federal and state courts, all of whom are available via our APIs, in bulk data, and via a new judicial search interface that we’ve created.

The database aims to be comprehensive, including as many facts about as many judges as possible. At the outset, we are collecting the following kinds of information about the judges:

  • Biographical information including their full name, race, gender, birth and death dates and locations, and any aliases or nicknames that a judge may have.

  • Their educational information including which schools they went to, when they went, and …

more ...

It’s easier than ever to contribute to CourtListener and Free Law Project

Working on complex software with a lot of dependencies can be difficult, and over the years many people have struggled to install and configure all the complex components CourtListener requires. Our previous solution to this was to create and maintain the Free Law Virtual Machine, which allowed people to work in a VM with all the right software installed. Alas, keeping the VM maintained was a real burden and we weren’t the best at it. It fell into decay and without realizing it we informally stopped recommending people use it.

Today we have a new, more modern solution using Vagrant. We’re getting the last of the kinks out of the new system, but already a number of people are using it in their daily process. It has all of the dependencies installed and configured. Best of all, although your code will be running in a virtual machine like before, when you’re using this setup, you still get to use all of your favorite tools on your local machine. If you haven’t used Vagrant before, you’ll find it’s a rather magical experience.

If you’re a contributor to CourtListener it’s definitely worth checking this …

more ...