Artificial Intelligence Patent Dataset


contributors: Alexander Giczy, Nicholas Pairolero, Andrew Toole

tags: AI, validation, patents


timeframe: 1976-2020

description: The Artificial Intelligence Patent Dataset consists of two data files, both released by the Office of the Chief Economist (OCE) of the United States Patent and Trademark Office. The first file identifies United States (U.S.) patents issued between 1976 and 2020, and pre-grant publications (PGPubs) published through 2020, that contain one or more of several AI technology components (including machine learning, natural language processing, computer vision, speech, knowledge processing, AI hardware, evolutionary computation, and planning and control). OCE generated this file with a machine learning (ML) approach that analyzed patent text and citations to identify AI in U.S. patent documents. The second file contains the patent documents used to train the ML models.

last edit: Sun, 11 Feb 2024 16:27:14 GMT