tags: global, patents

timeframe: 1976-2008

terms of_use: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License

description: MAREC Data is a static collection of over 19 million patent applications and granted patents from EP, WO, US, and JP sources, covering 1976 to June 2008. Documents from the different countries and sources are normalized to a common XML format with a uniform patent numbering scheme and citation format. The standardized fields include dates, countries, languages, references, person names, and companies, as well as rich subject classifications. MAREC is a comparable corpus: many documents are also available in similar versions in other languages.

