N-gram을 활용한 중국 특허 명세서의 정형 표현 분석: KIPRIS 중한 코퍼스를 중심으로
N-Gram Analysis of Formulaic Expression in Chinese Patent Specifications: Focusing on the KIPRIS Patent Chinese-Korean Corpus
정미선 (Ewha Womans University); 김혜림 (Ewha Womans University)
Vol. 22, No. 2, pp. 157-187
Abstract
This study analyzes and categorizes formulaic expressions in Chinese patent specifications to produce basic data for improving machine translation quality. The analysis focuses on Section G, which accounts for the largest share of the Korean-Chinese corpus data (Sections A to H) released by KIPRIS. It covers examples of free combination-based formulaic expressions and expression patterns ranging from 2-grams to 5-grams. Accurately extracting these recurrent formulaic expressions from Chinese patent specifications and building them into a database for machine translation is expected to significantly improve the quality of Korean-Chinese patent machine translation. A comparative analysis of a large-scale Korean-Chinese specification corpus across Sections A to H, as classified under the IPC, is expected to yield more objective results. It is hoped that this study will prompt further research on Korean-Chinese patent translation.
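As a rough illustration of the kind of 2-gram to 5-gram extraction the abstract describes, the following minimal sketch counts recurrent n-grams over token lists. It is not the authors' procedure: the function names, the `min_freq` threshold, and the assumption that the Chinese text has already been word-segmented are all hypothetical, and the two toy sentences are invented for illustration rather than taken from the KIPRIS corpus.

```python
from collections import Counter


def extract_ngrams(tokens, n):
    """Return every contiguous n-token tuple in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_recurrent_ngrams(sentences, n_min=2, n_max=5, min_freq=2):
    """Count n-grams (n_min..n_max) and keep those seen at least min_freq times.

    `sentences` is a list of token lists; segmentation is assumed to have
    been done beforehand (hypothetical preprocessing step).
    """
    counts = Counter()
    for tokens in sentences:
        for n in range(n_min, n_max + 1):
            counts.update(extract_ngrams(tokens, n))
    return {gram: freq for gram, freq in counts.items() if freq >= min_freq}


# Invented toy input, not corpus data.
sentences = [
    ["根据", "权利", "要求", "1", "所述", "的", "方法"],
    ["根据", "权利", "要求", "2", "所述", "的", "装置"],
]
recurrent = count_recurrent_ngrams(sentences)
```

In this toy input, sequences shared by both sentences survive the frequency filter, while sequences containing the differing claim number or final noun do not. A real study would also need a segmentation step and a principled frequency or dispersion threshold.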
- Publisher: 한국통번역교육학회
- Category: Interpretation and Translation