A Distant Supervision Method based on Paradigmatic Relations for Learning Word Embeddings

Jianquan Li, Renfen Hu, Xiaokang Liu, Prayag Tiwari, Hari Pandey, Wei Chen, Yaohong Jing, Kaicheng Yang, Benyou Wang

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Word embeddings learned from external resources have succeeded in improving many NLP tasks. However, existing embedding models still face challenges in situations where fine-grained semantic information is required, e.g. distinguishing antonyms from synonyms. In this paper, a distant supervision method is proposed that guides the training process by introducing semantic knowledge from a thesaurus. Specifically, the proposed method shortens the distance between a target word and its synonyms by controlling their movements, either unidirectionally or bidirectionally, yielding three models: the Unidirectional Movement of Target model (UMT), the Unidirectional Movement of Synonyms model (UMS) and the Bidirectional Movement of Target and Synonyms model (BMTS). Extensive computational experiments show that the proposed models not only capture the semantics of antonyms efficiently but also achieve significant improvements on both intrinsic and extrinsic evaluation tasks. To validate their performance, the proposed models (UMT, UMS and BMTS) are compared against well-known models, namely Skip-gram, JointRCM, WE-TD and dict2vec, on four benchmark tasks: word analogy (intrinsic), synonym-antonym detection (intrinsic), sentence matching (extrinsic) and text classification (extrinsic). A case study illustrates how the proposed models work in practice. Overall, a distant supervision method based on paradigmatic relations is proposed for learning word embeddings, and it outperforms the existing models it is compared against.
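
This record omits the paper's formal objective, but the three movement strategies named in the abstract can be illustrated concretely. The following is a minimal, hypothetical sketch, not the authors' implementation: the toy vocabulary, the thesaurus, the embedding matrix E, the learning rate lr, and the interpolation-style update are all assumptions made here for illustration. It pulls a target word and its synonyms closer by moving only the target (UMT), only the synonyms (UMS), or both (BMTS).

import numpy as np

# Toy setup: a vocabulary, a thesaurus of paradigmatic (synonym) relations,
# and a randomly initialised embedding matrix standing in for skip-gram output.
rng = np.random.default_rng(0)
vocab = {"happy": 0, "glad": 1, "joyful": 2, "sad": 3}
thesaurus = {"happy": ["glad", "joyful"]}
E = rng.normal(scale=0.1, size=(len(vocab), 50))

def synonym_step(target, mode, lr=0.05):
    # Shorten the distance between `target` and each of its synonyms,
    # controlling which vectors are allowed to move.
    t = vocab[target]
    for s in (vocab[w] for w in thesaurus.get(target, [])):
        delta = E[s] - E[t]              # direction from target to synonym
        if mode == "UMT":                # move only the target word
            E[t] += lr * delta
        elif mode == "UMS":              # move only the synonym
            E[s] -= lr * delta
        elif mode == "BMTS":             # move both toward each other
            E[t] += 0.5 * lr * delta
            E[s] -= 0.5 * lr * delta

def dist(a, b):
    return float(np.linalg.norm(E[vocab[a]] - E[vocab[b]]))

print("before:", dist("happy", "glad"))
synonym_step("happy", mode="BMTS")
print("after: ", dist("happy", "glad"))

In an actual training run such updates would be interleaved with the usual skip-gram context objective; which of the three movement strategies works best is the empirical question the paper's four benchmarks address.
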
Original language: English
Journal: Neural Computing and Applications
Early online date: 21 Feb 2019
DOI: 10.1007/s00521-019-04071-6
Publication status: E-pub ahead of print - 21 Feb 2019

Cite this

@article{5d138c95ad414f34a3bbbeba5afa6156,
title = "A Distant Supervision Method based on Paradigmatic Relations for Learning Word Embeddings",
abstract = "Word embeddings learned from external resources have succeeded in improving many NLP tasks. However, existing embedding models still face challenges in situations where fine-grained semantic information is required, e.g. distinguishing antonyms from synonyms. In this paper, a distant supervision method is proposed that guides the training process by introducing semantic knowledge from a thesaurus. Specifically, the proposed method shortens the distance between a target word and its synonyms by controlling their movements, either unidirectionally or bidirectionally, yielding three models: the Unidirectional Movement of Target model (UMT), the Unidirectional Movement of Synonyms model (UMS) and the Bidirectional Movement of Target and Synonyms model (BMTS). Extensive computational experiments show that the proposed models not only capture the semantics of antonyms efficiently but also achieve significant improvements on both intrinsic and extrinsic evaluation tasks. To validate their performance, the proposed models (UMT, UMS and BMTS) are compared against well-known models, namely Skip-gram, JointRCM, WE-TD and dict2vec, on four benchmark tasks: word analogy (intrinsic), synonym-antonym detection (intrinsic), sentence matching (extrinsic) and text classification (extrinsic). A case study illustrates how the proposed models work in practice. Overall, a distant supervision method based on paradigmatic relations is proposed for learning word embeddings, and it outperforms the existing models it is compared against.",
author = "Jianquan Li and Renfen Hu and Xiaokang Liu and Prayag Tiwari and Hari Pandey and Wei Chen and Yaohong Jing and Kaicheng Yang and Benyou Wang",
year = "2019",
month = "2",
day = "21",
doi = "10.1007/s00521-019-04071-6",
language = "English",
journal = "Neural Computing and Applications",
issn = "0941-0643",
publisher = "Springer",

}

A Distant Supervision Method based on Paradigmatic Relations for Learning Word Embeddings. / Li, Jianquan; Hu, Renfen; Liu, Xiaokang; Tiwari, Prayag; Pandey, Hari; Chen, Wei; Jing, Yaohong; Yang, Kaicheng; Wang, Benyou.

In: Neural Computing and Applications, 21.02.2019.

Research output: Contribution to journal › Article › Research › peer-review

TY - JOUR

T1 - A Distant Supervision Method based on Paradigmatic Relations for Learning Word Embeddings

AU - Li, Jianquan

AU - Hu, Renfen

AU - Liu, Xiaokang

AU - Tiwari, Prayag

AU - Pandey, Hari

AU - Chen, Wei

AU - Jing, Yaohong

AU - Yang, Kaicheng

AU - Wang, Benyou

PY - 2019/2/21

Y1 - 2019/2/21

N2 - Word embeddings learned from external resources have succeeded in improving many NLP tasks. However, existing embedding models still face challenges in situations where fine-grained semantic information is required, e.g. distinguishing antonyms from synonyms. In this paper, a distant supervision method is proposed that guides the training process by introducing semantic knowledge from a thesaurus. Specifically, the proposed method shortens the distance between a target word and its synonyms by controlling their movements, either unidirectionally or bidirectionally, yielding three models: the Unidirectional Movement of Target model (UMT), the Unidirectional Movement of Synonyms model (UMS) and the Bidirectional Movement of Target and Synonyms model (BMTS). Extensive computational experiments show that the proposed models not only capture the semantics of antonyms efficiently but also achieve significant improvements on both intrinsic and extrinsic evaluation tasks. To validate their performance, the proposed models (UMT, UMS and BMTS) are compared against well-known models, namely Skip-gram, JointRCM, WE-TD and dict2vec, on four benchmark tasks: word analogy (intrinsic), synonym-antonym detection (intrinsic), sentence matching (extrinsic) and text classification (extrinsic). A case study illustrates how the proposed models work in practice. Overall, a distant supervision method based on paradigmatic relations is proposed for learning word embeddings, and it outperforms the existing models it is compared against.

AB - Word embeddings learned from external resources have succeeded in improving many NLP tasks. However, existing embedding models still face challenges in situations where fine-grained semantic information is required, e.g. distinguishing antonyms from synonyms. In this paper, a distant supervision method is proposed that guides the training process by introducing semantic knowledge from a thesaurus. Specifically, the proposed method shortens the distance between a target word and its synonyms by controlling their movements, either unidirectionally or bidirectionally, yielding three models: the Unidirectional Movement of Target model (UMT), the Unidirectional Movement of Synonyms model (UMS) and the Bidirectional Movement of Target and Synonyms model (BMTS). Extensive computational experiments show that the proposed models not only capture the semantics of antonyms efficiently but also achieve significant improvements on both intrinsic and extrinsic evaluation tasks. To validate their performance, the proposed models (UMT, UMS and BMTS) are compared against well-known models, namely Skip-gram, JointRCM, WE-TD and dict2vec, on four benchmark tasks: word analogy (intrinsic), synonym-antonym detection (intrinsic), sentence matching (extrinsic) and text classification (extrinsic). A case study illustrates how the proposed models work in practice. Overall, a distant supervision method based on paradigmatic relations is proposed for learning word embeddings, and it outperforms the existing models it is compared against.

U2 - 10.1007/s00521-019-04071-6

DO - 10.1007/s00521-019-04071-6

M3 - Article

JO - Neural Computing and Applications

JF - Neural Computing and Applications

SN - 0941-0643

ER -