UniCourt CEO & Co-Founder Josh Blandi Joins the Geek In Review Podcast
In the legal industry, there are countless blogs, podcasts, and news outlets focused on the intersection of law and technology, but few of them cut through the noise, and fewer still can hold a candle to the eclectic focus of Three Geeks and a Law Blog.
Created by Greg Lambert, Chief Knowledge Services Officer of Jackson Walker LLP, Toby Brown, Chief Practice Management Officer at Perkins Coie LLP, and Sophia Lisa Salazar, Senior Digital Marketing Manager at Norton Rose Fulbright US LLP, Three Geeks and a Law Blog focuses on the administrative side of the large law firm environment.
Along with their well-read blog posts, Three Geeks and a Law Blog also has an ongoing podcast, the Geek In Review, hosted by Greg Lambert and Marlene Gebauer, Director of Knowledge Management at Locke Lord LLP. We’re excited to share that UniCourt’s CEO and Co-Founder, Josh Blandi, was recently invited to join them on the Geek In Review to talk about all things legal data.
In their wide-ranging interview, Josh, Greg, and Marlene talk about UniCourt’s work on behalf of Public Resource for the Code Improvement Commission GitHub, the origin story of UniCourt, the importance of APIs and normalization, pain points connected to court data accessibility, UniCourt’s upcoming Apollo APIs release, the PACER Collective, and much more.
The Code Improvement Commission GitHub
After the Supreme Court’s historic ruling that no one can copyright the law, Carl Malamud and Public.Resource.Org worked with our team at UniCourt to transform what were once ugly .rtf files of the Georgia Annotated State Codes into beautiful HTML, and we created the Code Improvement Commission GitHub on behalf of Public.Resource.Org as a repository where the public can access those codes.
The Code Improvement Commission is spearheaded by Carl Malamud and Public.Resource.Org, but there are several contributors to the effort, including UniCourt, Justia, Fastcase, and Cornell’s Legal Information Institute, all unified behind the mission to make state laws publicly available and accessible in a usable form with no rights reserved.
As Josh shared with Greg and Marlene, while UniCourt is developing the structure behind the state codes as a contractor for Public.Resource.Org, Carl Malamud and others connected to the Code Improvement Commission are writing letters to legislators and judges to convince more states to reconsider their copyright positions and to work with us to make their law more accessible and usable. For example, Berkeley Law Clinic prepared a letter to the California Judicial Council and Vanderbilt Law School clinic sent a letter to the Tennessee Code Authority.
In addition to providing the state codes in beautiful HTML, we’ve also provided the code behind the .rtf parser we created to transform the official codes from .rtf files into usable HTML. Our parser puts structure, metadata, and accessibility back into the code, and we’ve made it available for everyone to use, so you can structure the codes exactly how you want and add the metadata most important to you.
So far we’ve posted the state codes for Arkansas, Georgia, Mississippi, Tennessee, and Kentucky. As for what’s next in the pipeline, we’re working on transforming the codes for Colorado and Idaho over the next couple of months, and we’re also developing a “redline” capability to show differences in the codes from previous versions.
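A redline is essentially a line-by-line comparison of two versions of the same text. The following is a minimal sketch of that idea using Python's standard `difflib` module. It does not describe the capability under development, and the function name `redline` and the sample statute text are made up for illustration.

```python
import difflib


def redline(old_text: str, new_text: str) -> list[str]:
    """Return unified-diff lines: '-' marks removed text, '+' marks added text."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="previous",
            tofile="current",
            lineterm="",
        )
    )


# Invented sample text, not taken from any real state code.
old = "Section 1. Fees are due in 30 days.\nSection 2. Appeals go to the clerk."
new = "Section 1. Fees are due in 45 days.\nSection 2. Appeals go to the clerk."

for line in redline(old, new):
    print(line)
```

Lines prefixed with `-` appear only in the previous version, lines prefixed with `+` appear only in the current one, and unchanged lines are shown with a leading space for context.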
UniCourt is a strong proponent of open access to the law, because it belongs to us all. The law in all its forms belongs to the People, regardless of whether it’s a court record or a state code. We’re thrilled to keep working on behalf of Public.Resource.Org and with our other partners to ensure the law is accessible to all.
UniCourt’s Origin Story: A Business Development Tool from the Beginning
In 2008, Josh Blandi and others founded the debt relief company, CountryWide Debt Relief, which went on to develop an incredibly successful high-volume, low cost legal services model through partnerships with law firms across the country to help consumers consolidate and ultimately eliminate their debt.
After seeing success in helping consumers with debt relief, in 2012, Josh and team started looking at court data as a potential way to find more consumers involved in collection lawsuits to market to them and help them get the debt relief they needed.
In the beginning, they looked at the available offerings from LexisNexis, Westlaw, and Bloomberg Law to try getting bulk access to the court data they needed, but soon found there were no real options for acquiring court data in bulk. Further, even in the instances where data was available, it was exorbitantly expensive and there was no opportunity for data streaming in real-time to allow the team at CountryWide to consume the data and leverage it for business development.
Having found no viable options for bulk access to the court data they needed, in 2012, they started acquiring data by building extractors to grab it from online court portals. By 2014, what had started out as a project to gain more business had morphed into its own venture.
Seeing the “tremendous need” in the market for Legal Data as a Service (LDaaS), and for the ability to get court data in a streamed, real-time, structured fashion, the team officially formed UniCourt in 2014. Seven years later, UniCourt is now a leading provider of LDaaS with industry-standard APIs and artificial intelligence normalization that cleans and structures court data.
APIs and Normalization: “Let’s talk about fractured data.”
To start the conversation on the fractured nature of state court docket data during their chat on the Geek In Review, Marlene set the stage by saying that “The state courts are just all over the place in how data is gathered, how it’s processed, and how it’s displayed.”
Josh echoed Marlene, noting that the foundational problem with court data is that it’s siloed and each court does things its own way. Each court stores and organizes records differently, uses different technologies, and has its own nomenclature.
Another key problem Josh articulated is that the handful of large technology vendors providing the software courts use to make court data available online largely focus on offering courts customization instead of standardization. More specifically, these large vendors have a vested interest in allowing courts to tailor things to their own jurisdictional peculiarities, instead of pushing for universal court data standards, because customization keeps courts operating in silos and dependent on these vendors.
A lot of UniCourt customers, especially AmLaw firms and Fortune 500 companies, are interested in getting court data on a large scale in an automated fashion, and in a standard format. To do that, they use our APIs or “application programming interfaces” to integrate court data and other public data with their systems.
“One of the advantages of using an API,” Josh notes, “is it provides a standardized way to access court data across all state and federal courts.” Once UniCourt ingests data from state court portals, we standardize key data points like case types, party types, case statuses, representation types, docket entries, documents, and more to remove the unnecessary idiosyncrasies from individual courts. This data is then made available for download in bulk via our APIs.
On top of the standardization and structure we add to court data, we also normalize it with artificial intelligence. In court data, there are countless names and variations of names of attorneys, law firms, parties, and judges, but they are just names in court records until they are connected to the entities behind them.
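To make the idea of standardization concrete, here is a minimal sketch of mapping court-specific case type labels onto one shared vocabulary. The labels, the `CASE_TYPE_MAP` table, and the `standardize_case_type` function are all hypothetical and only illustrate the general technique, not UniCourt's actual taxonomy or code.

```python
# Hypothetical court-specific labels mapped to one shared vocabulary.
CASE_TYPE_MAP = {
    "cv - contract": "Contract",
    "breach of contract": "Contract",
    "civ-collections": "Debt Collection",
    "limited civil collections": "Debt Collection",
}


def standardize_case_type(raw_label: str) -> str:
    """Map a court's own label to a standard case type, or 'Other' if unknown."""
    # Lowercase and collapse whitespace so trivial variations map to the same key.
    key = " ".join(raw_label.lower().split())
    return CASE_TYPE_MAP.get(key, "Other")


print(standardize_case_type("  CV -  Contract "))
print(standardize_case_type("Limited Civil Collections"))
print(standardize_case_type("Small Claims"))
```

A lookup table like this only covers labels someone has already seen. Unknown labels fall back to "Other" here so they can be reviewed rather than silently misclassified.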
We connect those surface names to real-world entities by using baseline datasets from public records like Bar Data and Secretary of State records to tell exactly which John Smith, Esq. or corporation is involved in litigation. This also allows for deeper levels of research, creating custom reporting on individual entities of interest, and locating and tracking all of the litigation involving those entities.
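As a simplified illustration of linking surface names to a baseline dataset, the sketch below matches name variants against a tiny set of bar records. It uses plain rule-based key matching rather than artificial intelligence, so it is a stand-in for the concept and not the normalization described above. The bar numbers, names, and helper functions are invented for the example.

```python
import re

# Hypothetical bar records keyed by made-up bar numbers.
BAR_RECORDS = {
    "100001": "John A. Smith",
    "100002": "Joan Smith",
}

SUFFIXES = {"esq", "jr", "sr"}


def name_key(name: str) -> tuple[str, str]:
    """Reduce a name to (first, last), ignoring case, punctuation,
    middle names, and common suffixes."""
    name = name.lower()
    if "," in name:
        # Move the text before the first comma to the end, which turns
        # "smith, john" into "john smith".
        before, _, after = name.partition(",")
        name = f"{after} {before}"
    tokens = [t for t in re.findall(r"[a-z]+", name) if t not in SUFFIXES]
    if not tokens:
        return ("", "")
    return (tokens[0], tokens[-1])


def resolve(surface_name: str) -> list[str]:
    """Return the bar numbers whose official name shares the same key."""
    key = name_key(surface_name)
    return [
        bar_no
        for bar_no, official in BAR_RECORDS.items()
        if name_key(official) == key
    ]


print(resolve("SMITH, JOHN A., ESQ."))
print(resolve("John Smith, Esq."))
print(resolve("Jane Doe"))
```

`resolve` returns a list because a first and last name alone can match more than one person. Telling two attorneys named John Smith apart would need additional signals, such as firm, jurisdiction, or bar admission dates.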
Pain Points Connected to Court Data Accessibility
After touching on the importance of APIs and normalization when it comes to court data, Greg Lambert asked Josh whether courts were becoming more open, whether they are making data more accessible, and whether the pandemic made a difference in ramping up any accessibility efforts.
The central problem Josh alluded to in answering Greg is that courts don’t have the flexibility in their technology to make data more accessible and useful for the public. Vendors like Tyler Technologies and Journal Technologies do not provide APIs to access court data because they have a vested interest in not having competitors pop up and provide court data in a more usable format.
From Josh’s perspective, one of the only ways court data truly gets freed from court tech vendor monetization is through advancements in open source software (OSS) for court case management tools, where courts can leverage OSS for the majority of what’s needed to form the basis of their case management tools, and build the last mile on their own to tailor it to their needs.
If courts own their own code built off of OSS, they’re no longer dependent on tech vendors and can also move from one vendor to the next without fear of losing access to their data or the ability to share it with the public.
Josh was also asked to share some good examples of courts that are doing things right, and mentioned Miami-Dade as doing a good job with providing a Court Data API. However, Miami-Dade is also a good example of some of the common problems with these forward-facing courts.
While Miami-Dade provides APIs for docket-level data, they do not provide access to court documents through their API. This means that if you want bulk access to court documents for Miami-Dade, you need to aggregate those documents on your own and cobble together the docket data and the court documents to build a court records “Frankenstein,” as Greg so aptly termed it.
In stark contrast to state courts, Josh shared with Greg and Marlene that, “As much as we all talk bad about PACER, PACER is the most forward system out there.” However, “There is a ton of room for improvement.” A great example of this is that even though PACER has APIs available for use, UniCourt does not use them because they don’t provide the full data that’s available in PACER’s database. As we’ve found, much more information can be gathered through parsing HTML files from PACER, so we parse instead to provide our clients with the full range of available data.
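Pulling structured data out of an HTML page can be sketched with Python's standard `html.parser` module. The markup below is invented for illustration and is not PACER's actual page structure, and the `TableRowParser` class is a hypothetical helper, not UniCourt's parser.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: list[list[str]] = []
        self._row: list[str] | None = None
        self._cell: list[str] | None = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a <td> cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # Skip rows with no <td> cells, such as header rows.
                self.rows.append(self._row)
            self._row = None


# Invented sample markup for a docket table.
SAMPLE_HTML = (
    "<table>"
    "<tr><th>Date</th><th>#</th><th>Docket Text</th></tr>"
    "<tr><td>01/02/2021</td><td>1</td><td>COMPLAINT filed.</td></tr>"
    "<tr><td>01/15/2021</td><td>2</td><td>SUMMONS issued.</td></tr>"
    "</table>"
)

parser = TableRowParser()
parser.feed(SAMPLE_HTML)
for row in parser.rows:
    print(row)
```

Real court pages are far messier than this sample, so a production parser would need to handle nested tables, links to documents, and layout changes over time.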
Upcoming Release of Apollo APIs
When we initially released our first suite of Legal Data APIs, we wanted to test the market on the real need for accessing structured court data in bulk, and we found an incredible appetite for not only the data, but also the ability to stream it in real-time.
As Josh shared in the Geek In Review podcast, since the release of our v1 APIs, we’ve leveraged two years’ worth of feedback from AmLaw firms and Fortune 500 companies in insurance, finance, background checks and due diligence, and news agencies, and we’ve spent the better part of a year and a half developing v2 of our APIs, codenamed Apollo.
These new APIs integrate our analytics and normalization technology on top of our original APIs to combine docket data and documents with analytical data, so you can hop from data point to data point and uncover critical connections and insights more easily.
With our original APIs, you might have needed a multi-step workflow to get the data you needed across our different databases. However, with Apollo, all of our data is integrated together in one place, so you can seamlessly move from pulling data for a specific case, to seeing the bar data for the Plaintiff’s attorney involved in that case, to getting the court data for all of the cases that attorney has handled, to seeing which judges and/or opposing counsel that attorney has faced.
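The hop from case to attorney to related cases can be pictured as a walk over linked records. The sketch below runs on small in-memory dictionaries with invented identifiers and field names. It does not call or describe the Apollo APIs, whose query language and schema are not covered here.

```python
# Hypothetical, hand-made records; none of these identifiers are real.
CASES = {
    "case-1": {"plaintiff_attorney": "atty-1", "defense_attorney": "atty-2", "judge": "Judge A"},
    "case-2": {"plaintiff_attorney": "atty-1", "defense_attorney": "atty-3", "judge": "Judge B"},
    "case-3": {"plaintiff_attorney": "atty-4", "defense_attorney": "atty-2", "judge": "Judge A"},
}

ATTORNEYS = {
    "atty-1": {"name": "Example Attorney One", "bar_number": "000001"},
    "atty-2": {"name": "Example Attorney Two", "bar_number": "000002"},
    "atty-3": {"name": "Example Attorney Three", "bar_number": "000003"},
    "atty-4": {"name": "Example Attorney Four", "bar_number": "000004"},
}


def attorney_profile(case_id: str) -> dict:
    """Start at a case, hop to the plaintiff's attorney, then to every case
    that attorney appears in, and collect the judges and opposing counsel."""
    attorney_id = CASES[case_id]["plaintiff_attorney"]
    handled = {
        cid: case
        for cid, case in CASES.items()
        if attorney_id in (case["plaintiff_attorney"], case["defense_attorney"])
    }
    judges = sorted({case["judge"] for case in handled.values()})
    opposing = sorted(
        {
            other
            for case in handled.values()
            for other in (case["plaintiff_attorney"], case["defense_attorney"])
            if other != attorney_id
        }
    )
    return {
        "attorney": ATTORNEYS[attorney_id],
        "cases": sorted(handled),
        "judges": judges,
        "opposing_counsel": opposing,
    }


print(attorney_profile("case-1"))
```

When the records live in separate systems, each of these hops is a separate lookup that the caller has to stitch together, which is the multi-step workflow that an integrated dataset is meant to remove.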
We’re also switching to a very powerful query language that will allow you to construct far more detailed queries than in our v1 API to pinpoint the exact data you need for your particular use case. And, one of the most important aspects of Apollo is that we will also be releasing sample code bases for the most common use cases for court data that can be downloaded to make integration with our APIs faster and easier for our clients. This will greatly reduce the dev time required to make full use of our APIs and speed up innovation projects dependent on access to data.
So, where are we now? We’ve released our Alpha documentation to key stakeholders and are taking comments and feedback from them until August 1st. While we’re gathering feedback and seeking input from clients, we’re finishing the development of the APIs, and plan on an Alpha release in the first part of the third quarter this year, with a general availability release in early 2022.
The PACER Collective: “There’s a better way to do this.”
To close out our conversation with Greg and Marlene, we were happy that Josh was able to share more about our PACER Collective and how UniCourt and others are making strides to increase access to the public court data locked behind PACER’s paywall.
While we would love to see the Open Courts Act pass in Congress and become a reality, we’re not holding our breath, as there is serious entrenchment within the federal judiciary, which has become reliant on collecting PACER fees. We’re pushing forward, and the PACER Collective is our way of normalizing the costs of accessing PACER data.
The PACER Collective was initially established as a partnership between Justia and UniCourt to share the costs of paying for the same PACER data that we were both aggregating, to allow for increased frequencies at which we update cases to get new information, and to standardize the libraries of information we collect and schedule.
Originally, we were both paying hundreds of thousands of dollars to buy PACER data every year, and overnight we reduced our PACER costs by 50%. We very much share Greg’s sentiments when he said, “Anything to reduce my PACER costs would be great.”
What has been a successful partnership between Justia and UniCourt has now grown to include other leaders in the Open PACER Data movement, including Fastcase and DocketAlarm, as well as others in the legal research and legal analytics space.
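The 50% figure follows directly from two partners splitting the cost of the same data. A minimal sketch of that arithmetic is shown below, under the simplifying assumptions that every member needs exactly the same data and that costs are split equally. Both function names and the dollar figure are illustrative and do not reflect the Collective's actual terms.

```python
def cost_per_member(annual_cost: float, members: int) -> float:
    """Each member's share when one purchase is split equally."""
    if members < 1:
        raise ValueError("members must be at least 1")
    return annual_cost / members


def savings_fraction(members: int) -> float:
    """Fraction saved per member compared with buying the data alone."""
    if members < 1:
        raise ValueError("members must be at least 1")
    return 1 - 1 / members


# Illustrative figure only, not an actual PACER bill.
print(cost_per_member(200_000, 2))
print(savings_fraction(2))
print(savings_fraction(4))
```

Under these assumptions, each additional member lowers everyone's share, although in practice members are unlikely to need perfectly overlapping data.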
The main goals of the PACER Collective are to:
- Reduce PACER costs – each company is already paying for the same PACER data separately, and there is no real competitive advantage to pulling separately vs. together.
- Standardize the way PACER data is pulled and structured to ensure data integrity and reduce extraction costs.
- Increase the availability of PACER data, as access to more PACER data will help fuel change and innovation.
As a group, we can make a serious impact on changing the current PACER situation. We are not looking for any member to gain a major competitive advantage as a result of the PACER Collective, but rather we want to use our combined force to spur innovation and change.
So who can join? The PACER Collective is open to everyone who is looking to access PACER data in bulk, including legal tech companies, law libraries, law firms, law schools, legal aid organizations, and nonprofits.
Get In Touch With Us and Learn More
Big thanks to Greg and Marlene for having Josh on Geek In Review. We really enjoyed getting the opportunity to share more about UniCourt, and some of our projects in the works. You can read the full transcript of the interview on their website, and also check out the many other great interviews they’ve done over the years.
If you’re interested in learning more about the PACER Collective, the Code Improvement Commission, the upcoming Apollo release, or Legal Data as a Service, contact us and tell us what you’re interested in.