
Reliance on Science in Patenting


contributors: Matt Marx, Aaron Fuegi

tags: citation, scholarly literature, front-page, error metrics

related projects:


timeframe: 1834-2019

terms of use: Open Data Commons Attribution License v1.0


description: We introduce an open-access dataset of references from the front pages of patents granted worldwide to scientific papers published since 1800. Each patent-paper linkage is assigned a confidence score, and the scores are characterized on a random sample in terms of false negatives versus false positives. All matches are available for download. We outline several avenues for strategy research enabled by these new data. The dataset contains citations from the front pages of worldwide patents to articles in the Microsoft Academic Graph (MAG) from 1800-2020.

last edit: Mon, 19 Jun 2023 16:35:24 GMT


Reliance on Science links U.S. Patent & Trademark Office data to a broad set of scientific articles not limited by industry or field. These linkages draw not only on proprietary article databases, which cannot be shared, but also on the Microsoft Academic Graph, which permits us to post the resulting patent citations to science (PCS) for public use. Based on a third-party assessment, we estimate that our algorithm can capture up to 93% of patent citations to science with an accuracy rate of 99% or higher. We believe this to be the longest publicly available panel of patent-to-paper citations (spanning more than seven decades) that is accompanied by rigorous performance metrics. We also provide matches from worldwide patents to PubMed.
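
To illustrate how a per-link confidence score trades false positives against false negatives, here is a minimal sketch in Python. It is not the authors' matching algorithm or evaluation code. The record layout (`patent_id`, `paper_id`, `confidence`), the integer score scale, and the toy data are all assumptions made for illustration; the real files may use different field names and values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One patent-to-paper citation link (hypothetical record layout)."""
    patent_id: str
    paper_id: str
    confidence: int  # assumed integer scale; higher means more certain


def filter_links(links, min_confidence):
    """Keep only the links whose confidence meets the threshold."""
    return [link for link in links if link.confidence >= min_confidence]


def precision_recall(links, truth):
    """Compare kept links against a hand-labeled set of true pairs.

    `truth` is a set of (patent_id, paper_id) pairs known to be correct.
    By convention here, an empty prediction set has precision 1.0.
    """
    predicted = {(link.patent_id, link.paper_id) for link in links}
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(truth) if truth else 1.0
    return precision, recall


# Toy sample: four candidate links and four hand-labeled true citations.
SAMPLE = [
    Link("P1", "A", 10),  # correct match
    Link("P1", "B", 4),   # false positive
    Link("P2", "C", 8),   # correct match
    Link("P3", "D", 3),   # correct match, but low confidence
]
TRUTH = {("P1", "A"), ("P2", "C"), ("P3", "D"), ("P4", "E")}

strict = precision_recall(filter_links(SAMPLE, 5), TRUTH)
loose = precision_recall(filter_links(SAMPLE, 3), TRUTH)
print("threshold 5:", strict)
print("threshold 3:", loose)
```

In this toy sample, raising the threshold removes the one false positive but also discards a correct low-confidence match, so precision rises while recall falls. Users of the dataset can make the same choice by selecting a minimum confidence score suited to their tolerance for each kind of error.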