Uploading PACER Dockets and Oral Argument Recordings to the Internet Archive

At Free Law Project, we collect a lot of legal information. In our RECAP initiative, we collect (or receive as donations) around one hundred thousand items from PACER every day. Separately, in our collection of oral argument recordings, we have gathered more than 1.4 million minutes of legal recordings, more than anywhere else on the web. All of this content comes from a variety of sources, and we merge it together to make a searchable collection of PACER dockets and a huge archive of oral argument recordings.

Part of our mission at Free Law Project is to share this information and to ensure its long-term distribution and preservation. A great way to do that is to give it to a neutral third party so that no matter what happens, the information will always be available. For years, we have been lucky to partner with the Internet Archive for this purpose and today we are pleased to share two pieces of news about how we give them information.

more ...


Announcing PACER Docket Alerts for Journalists, Lawyers, Researchers, and the Public

Today we are thrilled to announce the general availability of PACER Docket Alerts on CourtListener.com. Once enabled, a docket alert will send you an email whenever there is a new filing in a case in PACER. We started CourtListener in 2010 as a circuit court monitoring tool, and we could not be more excited to continue expanding on those roots with this powerful new tool.

The best way to get started with Docket Alerts is to make one. Try loading a popular case like U.S. v. Manafort or The District of Columbia v. Trump. Once the case is open, press the “Get Alerts” button near the top, then wait for your first alert.

We believe PACER Docket Alerts will be a valuable resource to journalists, researchers, lawyers, and the public as they grapple with staying up to date with the latest PACER filings.

Our goal with docket alerts is to make them as simple as possible to use. Once you have found a case you are interested in, a single click is all it takes to turn on an alert for that docket. From then on, we will send you an email …

more ...

The Next Version of RECAP is Now Live

The original RECAP extension for Firefox was launched eight years ago. Today we launch an all-new version. Since the original launch in 2009, we’ve kept the system running smoothly, added a Chrome extension, and, with your help, collected and shared information about tens of millions of PACER documents.

Today we’re announcing the future of RECAP. If you’re an existing Firefox or Chrome user, you should automatically get this update over the next 24 hours. If you’re a new user, just learning about RECAP, you can find links for Firefox or Chrome on the right, and you can learn more on the RECAP homepage.

As this new system rolls out, these are the big changes:

  1. As you’re using PACER, the extensions will stop providing links to the Internet Archive, and will instead provide links to CourtListener and the RECAP Archive, where dockets and documents are fully text searchable.

  2. Links to CourtListener will be available very soon after an upload from PACER is complete — possibly within seconds or minutes. This has been the most-requested enhancement we’ve heard over the years, and we’re really happy to be bringing this …

more ...

We Have Every Free PACER Opinion on CourtListener.com

Free Opinion Report Dropdown

At Free Law Project, we have gathered millions of court documents over the years, but it’s with distinct pride that we announce that we have now completed our biggest crawl ever. After nearly a year of work, and with support from the U.S. Department of Labor and Georgia State University, we have collected every free written order and opinion that is available in PACER. To accomplish this we used PACER’s “Written Opinion Report,” which provides many opinions for free.

This collection contains approximately 3.4 million orders and opinions from approximately 1.5 million federal district and bankruptcy court cases dating back to 1960. More than four hundred thousand of these documents were scanned and required OCR, amounting to nearly two million pages of text extraction that we completed for this project.

All of the documents amassed are available for search in the RECAP Archive of PACER documents and via our APIs. New opinions will be downloaded every night to keep the collection up to date.

With this additional collection, the RECAP Archive now has information about more than twenty million PACER documents.

As a backup and permanent repository, we are continuing our partnership with the Internet …

more ...

More Details on the PACER Vulnerability We Shared with the Administrative Office of the Courts

PACER/ECF is a system of 204 websites that is run by the Administrative Office of the Courts (AO) for the management of federal court documents. The main function of PACER/ECF is for lawyers and the public to upload and download court documents such as briefs, memos, orders, and opinions.

In February we reported that we disclosed a major vulnerability in PACER/ECF to the AO. The proof of concept and disclosure/resolution timeline are available here.

We are pleased to share that this issue is now properly addressed, and that we are now able to report more details about it. Throughout the process of researching, disclosing, and resolving this vulnerability, the AO has been prompt and professional, something that we greatly appreciate given the considerable constraints and complexities they are facing. However, despite their skill in dealing with this issue, after discovering it we have lingering concerns about the security of PACER/ECF on the whole.

In this post, we discuss three topics. First, we outline what the vulnerability was and how to identify if you were a victim of it. Second, we discuss why the vulnerability is troubling for a system of PACER/ECF’s size and …

more ...

A Complete Chronology of PACER Fees and Policies

Today, the PACER system contains millions of court filings for the federal district, circuit, and bankruptcy courts, most of which are sold at a dime per page with a three-dollar cap per document. But content in PACER was not always priced this way, and indeed the PACER system goes back all the way to the early 1990s, before computers were generally connected to the Internet.
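To make the current pricing concrete, here is a minimal Python sketch of the per-document charge described above: ten cents per page, capped at three dollars. The function and constant names are our own illustration rather than anything PACER publishes, and the sketch assumes the cap applies uniformly to every document and ignores search fees.

```python
# Illustrative constants taken from the pricing described in this post.
PRICE_PER_PAGE_CENTS = 10   # "a dime per page"
DOCUMENT_CAP_CENTS = 300    # three-dollar cap per document


def document_cost_cents(pages: int) -> int:
    """Return the charge, in cents, for one document of the given length.

    Hypothetical helper: assumes the cap applies to every document.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    # Charge per page, but never more than the per-document cap.
    return min(pages * PRICE_PER_PAGE_CENTS, DOCUMENT_CAP_CENTS)


if __name__ == "__main__":
    for pages in (5, 30, 200):
        print(f"{pages} pages: ${document_cost_cents(pages) / 100:.2f}")
```

Under these assumptions the cap takes effect at 30 pages, so a 200-page filing costs the same as a 30-page one.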

Fees for using PACER are set by the Judicial Conference of the Administrative Office of the Courts, which scrupulously keeps notes from its bi-annual proceedings going back to its creation in 1922. In this post, we have gone through all of the relevant proceedings, and we present what we believe is a complete history of PACER fees and changes.

During the 27-year history outlined below, technology has changed significantly, and the Administrative Office of the Courts has done its best to keep up. Over the years, PACER has offered a variety of ways to get court information. These include a 1-900 number, a search service available via a regular phone call, the ability to connect your own computer directly to the courts’, and the websites that we know today.

But regardless of …

more ...

Why We Are Downloading all Free Opinions and Orders from PACER

Today we are launching a new project to download all of the free opinions and orders that are available on PACER. Since we do not want to unduly impact PACER, we are doing this process slowly, giving it several weeks or months to complete, and slowing down if any PACER administrators get in touch with issues.

In this project, we expect to download millions of PDFs, all of which we will add both to the RECAP Archive that we host and to the Internet Archive, which will serve as a publicly available backup. In the RECAP Archive, we will be immediately parsing the contents of all the PDFs as we download them. Once that is complete, we will extract the content of scanned documents, as we have done for the rest of the collection.

This project will create an ongoing expense for Free Law Project (hosting this many files costs real money), so we want to explain two major reasons why we believe it is important. The first reason is that these documents have monumental value, and until now they have not been easily available to the public. These documents are a critical …

more ...

Parties, Attorneys, and Firms are Now Searchable in the RECAP Archive

Today we are launching party, attorney, and firm search for the RECAP Archive of PACER documents. This unlocks powerful new ways to do your research.

For example, consider the following queries:

Click any of the above queries to see how they were made.

To use this new feature, type the name of the party or attorney into the fields on the RECAP Archive homepage or in the sidebar to the left of any search results. These boxes also accept advanced query syntax, and there are several new fields that can be queried from the main search box including party, attorney, and firm.

For example, in the main box you can search for attorney:"eric holder"~2 firm:covington. This query shows the cases where the attorney has the word “Eric” within two words of “Holder” (thus allowing his middle name) and which were handled at the firm “Covington & Burling”.
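As a rough illustration of how a query like the one above could be assembled and placed in a link, here is a short Python sketch. The helper names are hypothetical, and the base URL and the "q" parameter are assumptions on our part, so check the address bar after running a real search before relying on them.

```python
from urllib.parse import urlencode

# Assumption: the search page lives at the site root and reads the query
# from a "q" parameter. Verify against a real search before relying on it.
BASE_URL = "https://www.courtlistener.com/"


def build_query(attorney: str, firm: str, slop: int) -> str:
    """Build a fielded query with a proximity limit on the attorney name.

    The ~N suffix allows the quoted words to appear within N words of
    each other, which is what lets a middle name or initial match.
    """
    return f'attorney:"{attorney}"~{slop} firm:{firm}'


def build_search_url(query: str) -> str:
    """Percent-encode the query so quotes and colons survive in a URL."""
    return BASE_URL + "?" + urlencode({"q": query})


if __name__ == "__main__":
    query = build_query("eric holder", "covington", 2)
    print(query)
    print(build_search_url(query))
```

Note that the query uses plain straight quotes; curly quotes pasted from a word processor are different characters and may not be treated as phrase delimiters.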

Demo of Eric Holder at Covington & Burling

A search for Eric X Holder while …

more ...

Free Law Project has Notified the Administrative Office of the Courts about a Major Security Vulnerability in the PACER/ECF System

Recently, as part of our routine business practices, we discovered what we believe is a major vulnerability in the PACER system of websites, one that appears to affect both the electronic case filing and public access portals.

At this time, as part of a responsible disclosure process, we have notified the appropriate parties at The Administrative Office of the Courts, the agency that runs PACER. In keeping with industry norms, we have given them a broad 90-day window to resolve the vulnerability.

After the 90 days are up or the issue is resolved, we plan to publish the details of what we discovered, the ramifications of the discovery, and the solution that they have put in place, if any.

Further questions about the vulnerability can be directed to our contact form where you can find our GPG key, if needed.

more ...

Free Law Project to Serve as PACER Data Provider to Department of Labor Grantees at Georgia State University

EMERYVILLE, CA — Free Law Project is proud to announce that it has been selected by researchers at Georgia State University to provide PACER data for their research on employment misclassification lawsuits. The purpose of their research is to gain an understanding of how courts distinguish between employees and independent contractors, and the factors influencing those decisions across federal jurisdictions. This research will be funded by a two-year grant from the U.S. Department of Labor, and will be conducted by primary researchers Charlotte S. Alexander and Mohammad Javad Feizollahi of Georgia State University’s J. Mack Robinson College of Business.

Free Law Project’s role in this grant will be to acquire court opinions and orders from PACER, and to provide them to Alexander and Feizollahi for their research. Because PACER is not optimized for automated access, a key outcome of the grant will be to develop tools and infrastructure to enable other researchers to utilize PACER data through future grants.

“PACER data is too difficult for researchers to access, and it’s high time that a centralized service be created by a non-profit to gather this kind of data for researchers,” says Michael Lissner, Founder and Executive Director of …

more ...

Roundup of House Judiciary Committee’s PACER Review

The House Judiciary Committee held a hearing today on the topic of “the effectiveness of the PACER service and use of audio and video recordings of courtroom procedures.” Three witnesses were invited by the committee to speak at the hearing, including our board member, Thomas Bruce, who spoke at length on the topic of reforming the PACER system. His written testimony can be found here.

Bruce framed his testimony by providing an overview of the things that PACER is and is not. In his words, these are the characteristics that define PACER:

  1. First, PACER charges fees for access to public records.
  2. Second, PACER became outmoded two years after it was built, and in some ways has never caught up.
  3. Third, PACER suffers from a split personality. On one hand, it is an electronic filing and case management system that supports the Federal courts […] On the other […] it is a data publishing system that offers the work of the Federal courts, both documents and metadata, to a very wide range of people […]

And these are the things, in his words, that it is not:

  1. It is not transparent in its business model or operations.
  2. PACER is not an adequate facility …

more ...

Free Law Project Receives “Le Hackie” Award from D.C. Legal Hackers for PACER Research and Blogging

On Tuesday we were proud and humbled to receive a Le Hackie award from the D.C. Legal Hackers group for a top ten legal hack of the year:

The “hack” that we received this award for was our series of blog posts about PACER:

And our older pieces:

D.C. Legal Hackers is an amazing group, and we’re really proud to get this award from them.

more ...

Free Law Project Re-Launches RECAP Archive, a New Search Tool for PACER Dockets and Documents

After months of development, we are thrilled to share a from-scratch re-launch of the RECAP Archive. Our new archive, available immediately at https://www.courtlistener.com/recap/, contains all of the content currently in RECAP and makes it all fully searchable for the first time. At launch, the collection contains information about more than ten million PACER documents, including the extracted text from more than seven million pages of scanned documents.

RECAP Advanced Search Screen

The new advanced search interface for the RECAP Archive.

The search capabilities of this new system empower researchers in new ways. For example:

more ...

Downloading Important Cases on PACER Costs More than a Brand New Car

By now most readers of this blog know that PACER brings in a lot of money by selling public domain documents at a dime per page. What people might not realize is how these costs can add up for individual researchers or journalists. Looking through our database, we realized that we have quite a few really big cases.

All of the cases below have more than ten thousand entries that we know about. There are some names you might recognize:

At the top of the above list is the Lehman Brothers bankruptcy case …

more ...

How Much Money Does PACER Make?

PACER is the system that the public and various organizations use to access electronic records in the federal district and appeals courts. When PACER is used, it charges for certain activities, like downloading a PDF or making a search query. Raising funds this way was authorized by Congress in the E-Government Act to the extent that the revenue paid for running PACER.

In the beginning the revenue from these charges was fairly modest, but the revenue has risen for many years, culminating in revenue of $145M in 2015 (the last year that’s available).

This chart shows the trends in PACER revenue since 1995:

PACER Revenue Timeline

In total, that’s $1.2B that PACER has brought in over 21 years, with an average revenue of $60.7M per year. The average for the last five years is more than twice that: $135.2M/year. These are remarkable numbers, and they point to one of two conclusions. Either PACER is creating a surplus, which is illegal according to the E-Government Act, or PACER is costing $135M/year to run.

Whichever the case, it’s clear that something has gone terribly wrong. If the justice system is turning a profit selling public domain …

more ...

What is a “Page” of PACER Content?

As most readers of this blog know, PACER is a system run by the Administrative Office of the Courts (AO) that hosts over a billion documents from the Federal District and Circuit courts. The system was created in the nineties and was set up with a paywall so that you pay for every “page” of data that you receive. The idea of the fees, as established by the E-Government Act, is that the AO could use them to recoup the cost of running PACER, but the pricing of the content has always been a bit odd. In my last post I talked about how these fees result in an outrageous cost for PACER data. In this post, I do a deep dive into the core unit of PACER’s pricing and attempt to answer the question, what is a “page” of PACER data?

The size of PACER’s fees has varied over the years, but they’ve always gone up, and they’ve always been assessed roughly as follows:

  1. If you download a PDF from PACER, you pay by the page.

  2. If you do a search, you pay by the number of search results returned. Because you don’t …

more ...

The Cost of PACER Data? Around One Billion Dollars.

Recently, we started a new project to analyze a few million PACER documents that we acquired through the RECAP Project. As we began working with the data, one thing we did was count how many pages every document had so that we could calculate the average length of a PDF in PACER. Fairly quickly we learned that based on our sample, the average length of a PACER document is 9.1 pages.

This is a really interesting statistic. Another is that there are more than one billion documents in PACER:

CM/ECF currently contains, in aggregate, more than one billion retrievable documents spread among the 13 courts of appeals, 94 district courts, 90 bankruptcy courts, and other specialized tribunals.
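The two figures above (an average of 9.1 pages per document and more than one billion documents) can be combined with PACER's ten cents per page into a back-of-the-envelope estimate. The Python sketch below is only that: the function name is our own, it treats one billion as a lower bound, it applies our sample average to the whole collection, and it ignores the per-document cap and search fees.

```python
from decimal import Decimal

# Inputs taken from the figures in this post; exact decimals avoid
# floating-point rounding in the dollar amounts.
NUM_DOCUMENTS = 1_000_000_000        # "more than one billion" (lower bound)
AVG_PAGES = Decimal("9.1")           # average length from our sample
PRICE_PER_PAGE = Decimal("0.10")     # dollars per page


def estimated_total_cost(num_documents: int,
                         avg_pages: Decimal,
                         price_per_page: Decimal) -> Decimal:
    """Rough cost, in dollars, of buying every document once.

    Hypothetical helper: assumes every document has the average length
    and that no per-document cap or fee waiver applies.
    """
    return num_documents * avg_pages * price_per_page


if __name__ == "__main__":
    total = estimated_total_cost(NUM_DOCUMENTS, AVG_PAGES, PRICE_PER_PAGE)
    print(f"Estimated cost: ${total:,.0f}")
```

With these inputs the estimate comes to about $910 million, which is where the "around one billion dollars" in the title comes from.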

With these two statistics and the knowledge that downloading a document costs ten cents per page, we can once again see how PACER, the biggest paywall the world has ever known, is a deeply troubling system. At this price, purchasing the …

more ...

Extracting Text from Our Collection of PACER Documents

We’re getting ready to launch a brand new search engine for PACER content. When it launches, one of the big features it will have is full-text search for the millions of documents that people have submitted using our RECAP system. To our knowledge, this will be the first free system for searching PACER content in this way, allowing you to look up documents by any word they might contain.

The big problem with this goal? We have about a million PDFs that consist only of images. Some of these are actually quite beautiful:

Handwritten Motion

A beautiful handwritten motion. It goes on like this for 46 pages.

But others are hideous:

Log from 1957

An 84-page log from 1957. It’s come a long way just to appear on this blog today.

But no matter how a document looks, we want to extract the text so that we can make it searchable. This is done using a system called Optical Character Recognition (OCR), which looks at each pixel in each page of each document and tries to figure out what letter it is a part of. As you might expect, this can take a while when you’re processing millions of documents averaging …

more ...

Ending our PACER Drainage Initiative and Stopping our Email Lists

We at Free Law Project have been working towards our goals for a few years now, and as a result we’ve accumulated a bit of cruft. Today, we shed just a bit of it, as we simultaneously end one of our smaller projects and deprecate our list server.

The project we are ending is our PACER Drainage project. The idea of the project was to put pressure on PACER by having lots of people use their $15 fee waiver to download PACER content to the RECAP archive.

The main reason we’re ending this project is that PACER is vast, and we never got the kind of uptake we needed for this program to be successful. We knew that we’d never make a big impact on PACER unless thousands of people started using their fee waivers to download content, but we went ahead with this project anyway for two reasons. First, we wanted to be part of the effort last year to apply pressure to PACER, and second, we wanted to raise awareness about the PACER issue via a direct call to action.

While I don’t think we succeeded in applying the pressure we wanted (the AO …

more ...

The Right to Read Anonymously

In 1996, in the Connecticut Law Review, legal scholar Julie Cohen wrote what has become a landmark article in Internet Law entitled “A Right to Read Anonymously: A Closer Look at ‘Copyright Management’ In Cyberspace.” She began by stating,

A fundamental assumption underlying our discourse about the activities of reading, thinking, and speech is that individuals in our society are guaranteed the freedom to form their thoughts and opinions in privacy, free from intrusive oversight by governmental or private entities.

Cohen notes that, in the past, our right to read anonymously has been protected by libraries and librarians. See, for example, the American Library Association’s Freedom to Read statement, adopted in 1953. Our American experience has generally been that one is able to walk into a public library, take almost any book off the shelf, sit down, and read without ever identifying oneself or asking anyone’s permission. Most libraries, as vigorous defenders of reader privacy, only maintain information about which books you check out until you return them and then they destroy any record connecting your identity to the books checked out. It was, in 1996, the growing prevalence of electronic dissemination of information and technologies to monitor …

more ...