Google Patent Phrase Similarity Dataset
location: https://www.kaggle.com/datasets/google/google-patent-phrase-similarity-dataset
contributors: Grigor Aslanyan, Ian Wetherbee
tags: phrases, similarity, semantic matching, validation
documentation: https://www.kaggle.com/datasets/google/google-patent-phrase-similarity-dataset
code: https://www.kaggle.com/competitions/us-patent-phrase-to-phrase-matching/data
terms of use: Please cite the paper if you use the dataset.
description: This is a human-rated, contextual phrase-to-phrase matching dataset focused on technical terms from patents. In addition to the similarity scores that are typically included in other benchmark datasets, we include granular rating classes similar to those in WordNet, such as synonym, antonym, hypernym, hyponym, holonym, meronym, and domain-related. The dataset was used in the U.S. Patent Phrase to Phrase Matching competition.
last edit: Mon, 19 Jun 2023 16:47:03 GMT