-
Google BERT for Patents
A BERT (bidirectional encoder representation from transformers) model pretrained on over 100 million patent publications from the U.S. and other countries using open-source tooling. The trained model can be used for a number of use cases, includin...
-
Citation Chaser
In systematic reviews, we often want to obtain lists of references from across studies: forward citation chasing looks for all records citing one or more articles of known relevance; backward citation chasing looks for all records referenced in one...
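The two directions of chasing can be sketched on a toy citation graph. This is a minimal illustration only, not how Citation Chaser itself is implemented; the `CITES` mapping, the record IDs, and both function names are hypothetical.

```python
# Hypothetical toy data: each record ID maps to the set of records it references.
CITES = {
    "A": {"B", "C"},
    "B": {"C"},
    "D": {"A", "C"},
    "E": {"B"},
}


def backward_chase(seeds, cites=CITES):
    """Records referenced by at least one seed article (seeds themselves excluded)."""
    found = set()
    for seed in seeds:
        found |= cites.get(seed, set())
    return found - set(seeds)


def forward_chase(seeds, cites=CITES):
    """Records that cite at least one seed article (seeds themselves excluded)."""
    seeds = set(seeds)
    return {record for record, refs in cites.items() if refs & seeds} - seeds
```

With seed article `A`, backward chasing returns the records `A` references and forward chasing returns the records that cite `A`.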
-
CiteSpace
CiteSpace generates interactive visualizations of structural and temporal patterns and trends of a scientific field. It facilitates a systematic review of a knowledge domain through an in-depth visual analytic process. It can process citation data...
-
Mediawiki Citation API
Citoid is an auto-filled citation generator which automatically creates a citation template from online sources based on a URL or some academic reference identifiers like DOIs, PMIDs, PMCIDs and ISBNs. MediaWiki hosts a Citoid API, which it's poss...
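A request to a hosted Citoid service might look like the sketch below. The base URL is an assumption (the citation route of the Wikimedia REST API on English Wikipedia), and the helper names and `User-Agent` string are hypothetical; check the current API documentation before relying on it.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumed endpoint: the Citoid route of the Wikimedia REST API.
CITOID_BASE = "https://en.wikipedia.org/api/rest_v1/data/citation"


def citoid_url(identifier, fmt="mediawiki"):
    """Build the request URL; the identifier (URL, DOI, PMID, ...) is percent-encoded."""
    return f"{CITOID_BASE}/{fmt}/{quote(identifier, safe='')}"


def fetch_citation(identifier, fmt="mediawiki"):
    """Fetch citation metadata as parsed JSON (requires network access)."""
    request = Request(
        citoid_url(identifier, fmt),
        headers={"User-Agent": "citoid-example/0.1"},  # hypothetical client name
    )
    with urlopen(request, timeout=30) as response:
        return json.load(response)
```

The identifier is encoded with `safe=''` so that the slash inside a DOI does not get interpreted as a path separator.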
-
Claim Breadth Model
We demonstrate a machine learning (ML) based approach to estimating claim breadth, which has the ability to capture more nuance than a simple word count model. While our approach may be an improvement over simpler methods, it is still imperfect an...
-
Claim Text Extraction
Imagine you're analyzing a subset of patents and want to do some text analysis of the first independent claim. To do this, you'd need to be able to join your list of patent publication numbers with a dataset containing the patent text. Additionall...
-
Cooperative Patent Classification Scheme
CPC is the outcome of an ambitious harmonization effort to bring the best practices from the EPO and USPTO together. In fact, most U.S. patent documents are already classified in ECLA. The conversion from ECLA to CPC at the EPO will ensure IPC com...
-
Frictionless Framework
Frictionless is a framework to describe, extract, validate, and transform tabular data, available as a Python library. It supports working with data in a standardised and reproducible way by improving data quality and consistency.
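The idea of validating tabular data against a declared schema can be illustrated with the standard library alone. This sketch deliberately does not use the Frictionless API; the `SCHEMA` mapping and `validate_rows` function are hypothetical stand-ins for a schema descriptor and a validation report.

```python
import csv
import io

# Hypothetical schema: field name -> Python type used to parse each cell.
SCHEMA = {"id": int, "name": str, "score": float}


def validate_rows(text, schema=SCHEMA):
    """Return a list of (row_number, field, message) errors for CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    errors = []
    # The header must match the schema's field names, in order.
    if reader.fieldnames != list(schema):
        errors.append((1, None, f"expected header {list(schema)}, got {reader.fieldnames}"))
        return errors
    # Data rows start on line 2, after the header.
    for row_number, row in enumerate(reader, start=2):
        for field, cast in schema.items():
            try:
                cast(row[field])
            except (TypeError, ValueError):
                errors.append(
                    (row_number, field, f"cannot parse {row[field]!r} as {cast.__name__}")
                )
    return errors
```

An empty error list means every row conforms to the schema; otherwise each error names the row and field that failed.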
-
Google Patents match API
Resolves messy patent publication and application numbers to DOCDB publication number format.
-
Grobid
GROBID (or Grobid, but not GroBid nor GroBiD) means GeneRation Of BIbliographic Data.
GROBID is a machine learning library for extracting, parsing and re-structuring raw documents such as PDF into structured XML/TEI encoded documents with a parti...
-
Tools for Harmonizing County Boundaries
This tool creates the CSV tables that allow county boundaries to be synchronized to a base year and exports them to the directory you run it from. While this code takes shapefiles of any type and performs an intersect, it was written to follow the met...
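The intersect step behind such a crosswalk table can be sketched as follows. This is an illustration under stated assumptions, not the tool's code: axis-aligned rectangles stand in for real shapefile geometries, area-share weighting is assumed, and all county names and function names are hypothetical.

```python
# Hypothetical boundaries as axis-aligned rectangles: (xmin, ymin, xmax, ymax).
COUNTIES_1900 = {"Old": (0.0, 0.0, 4.0, 2.0)}
COUNTIES_BASE = {"West": (0.0, 0.0, 1.0, 2.0), "East": (1.0, 0.0, 4.0, 2.0)}


def area(box):
    """Area of a rectangle; degenerate (non-overlapping) boxes count as zero."""
    xmin, ymin, xmax, ymax = box
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def intersect(a, b):
    """Overlap of two rectangles (the result may have zero area)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def crosswalk(source, base):
    """Rows of (source county, base county, share of source area in base county)."""
    rows = []
    for s_name, s_box in source.items():
        for b_name, b_box in base.items():
            overlap = area(intersect(s_box, b_box))
            if overlap > 0:
                rows.append((s_name, b_name, overlap / area(s_box)))
    return rows
```

Each row gives the weight used to reallocate a source-year county's values to a base-year county; the weights for one source county sum to 1 when the base-year counties cover it completely.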
-
Manual of Patent Examining Procedure
This Manual is published to provide U.S. Patent and Trademark Office (USPTO) patent examiners, applicants, attorneys, agents, and representatives of applicants with a reference work on the practices and procedures relative to the prosecution of pa...
-
OpenRefine
OpenRefine is a desktop application that uses your web browser as a graphical interface. It is described as “a power tool for working with messy data”. OpenRefine is most useful where you have data in a simple tabular format such as a spreadsheet,...
-
Automated Patent Landscaping
Patent landscaping is the process of finding patents related to a particular topic. It is important for companies, investors, governments, and academics seeking to gauge innovation and assess risk. However, there is no broadly recognized best appr...
-
PatentsView API
The PatentsView platform is built on a newly developed database that longitudinally links inventors, organizations, locations, and patenting activity since 1976. The data visualization tool, query tool, and flexible API enable a broad spectrum of ...
-
Trademark Manual of Examining Procedure
The Manual is published to provide trademark examining attorneys in the USPTO, trademark applicants, and attorneys and representatives for trademark applicants with a reference work on the practices and procedures relative to prosecution of applic...
-
Wellcome Trust data tools
Machine learning tools and other scripts that the Wellcome Trust uses to analyze and visualize grant proposals and outcomes from its public data.
-
WIPO Guidelines for Preparing Patent Landscape Reports
These Guidelines are designed both for general users of patent information and for those involved in producing Patent Landscape Reports (PLRs). They provide step-by-step instructions on how to prepare a PLR, as well as background informati...