이커머스 후기글 평가분석 트리플에 기반한 평가주석 데이터셋 EVAD 구축 방법론
A Methodology of Building Evaluation-Annotated Datasets (EVAD) Based on the Evaluation-Triple in E-commerce Reviews
최수원 (Hankuk University of Foreign Studies); 남지순 (Hankuk University of Foreign Studies)
No. 99, pp. 245-272
Abstract
This study introduces a methodology for building evaluation-annotated datasets based on an evaluation-triple schema extracted from the review texts of fashion e-commerce apps. The evaluation triple defined in this study consists of ‘TARGET’, ‘ASPECT’, and ‘VALUE’. We classified the {ASPECT-VALUE} pairs into 35 categories, which were grouped into three types: ‘Information-related’, ‘Judgment-related’, and ‘Suggestion-related’. The triple elements were represented as a set of Local Grammar Graphs based on the domain-specific dictionary DECO-DOM. By applying these linguistic resources to the target review texts, we generated a large-scale annotated dataset. The Semi-automatic Symbolic Propagation (SSP) methodology proposed by Nam (2021) was adopted to generate the EVAD dataset of about 200,000 review texts. An evaluation of the SSP approach yielded an F1-score of 0.91, which supports the reliability of the proposed process.
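As a rough illustration of the triple schema and the reported metric, the following Python sketch models one annotation and computes an F1-score over sets of triples. It is not the authors' implementation: the class name `EvaluationTriple`, the `category_type` field, the glosses in the comments, the function `f1_score`, and the sample review fragments are all hypothetical, and the exact-match scoring is an assumption, since the abstract does not say how matches were counted.

```python
from dataclasses import dataclass

# The three top-level types named in the abstract.
CATEGORY_TYPES = ("Information", "Judgment", "Suggestion")


@dataclass(frozen=True)
class EvaluationTriple:
    """Hypothetical container for one TARGET / ASPECT / VALUE annotation."""
    target: str         # TARGET element (assumed: the item being reviewed)
    aspect: str         # ASPECT element (assumed: the property discussed)
    value: str          # VALUE element (assumed: the evaluation expressed)
    category_type: str  # one of the three types; field name is illustrative

    def __post_init__(self):
        if self.category_type not in CATEGORY_TYPES:
            raise ValueError(f"unknown category type: {self.category_type}")


def f1_score(gold: set, predicted: set) -> float:
    """Harmonic mean of precision and recall under exact-match scoring."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Made-up annotations, for illustration only.
t1 = EvaluationTriple("jacket", "size", "too small", "Judgment")
t2 = EvaluationTriple("jacket", "delivery", "arrived in two days", "Information")
t3 = EvaluationTriple("jacket", "color", "add more options", "Suggestion")

gold = {t1, t2}
predicted = {t1, t3}
score = f1_score(gold, predicted)  # precision 0.5, recall 0.5
```

Freezing the dataclass makes each triple hashable, so gold and predicted annotations can be compared with plain set operations.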
- Publisher: 언어과학회
- Classification: Linguistics