Google Patent Phrase Similarity Dataset

location: https://www.kaggle.com/datasets/google/google-patent-phrase-similarity-dataset

contributors: Grigor Aslanyan, Ian Wetherbee

tags: phrases, similarity, semantic matching, validation

documentation: https://www.kaggle.com/datasets/google/google-patent-phrase-similarity-dataset

code: https://www.kaggle.com/competitions/us-patent-phrase-to-phrase-matching/data

terms of use: Please cite the paper if you use the dataset.

description: This is a human-rated, contextual phrase-to-phrase matching dataset focused on technical terms from patents. In addition to the similarity scores that are typically included in other benchmark datasets, we include granular rating classes similar to WordNet, such as synonym, antonym, hypernym, hyponym, holonym, meronym, and domain related. The dataset was used in the U.S. Patent Phrase to Phrase Matching competition.
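
example: As a rough illustration of how the similarity scores and rating classes described above might be used together, the sketch below averages the score within each rating class. The column names (anchor, target, context, rating, score) and the sample rows are assumptions made up for illustration; they are not taken from the dataset, so check the actual file headers before adapting it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows for illustration only; NOT taken from the dataset.
# The column names (anchor, target, context, rating, score) are assumptions.
SAMPLE_CSV = """anchor,target,context,rating,score
phrase a,phrase b,X01,synonym,0.75
phrase a,phrase c,X01,domain related,0.25
phrase d,phrase e,Y02,synonym,0.5
"""


def mean_score_by_rating(csv_file):
    """Return {rating class: mean similarity score} for a CSV file-like object."""
    scores = defaultdict(list)
    for row in csv.DictReader(csv_file):
        scores[row["rating"]].append(float(row["score"]))
    return {rating: sum(vals) / len(vals) for rating, vals in scores.items()}


# Summarize the toy sample; swap in open("train.csv", newline="") for a real
# file (the filename is also an assumption).
summary = mean_score_by_rating(io.StringIO(SAMPLE_CSV))
```

A summary like this is one way to check whether the granular classes line up with the numeric scores, though the real class labels may be encoded differently.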

last edit: Mon, 19 Jun 2023 16:47:03 GMT