Improving Reliability and Consistency in Korean–Chinese Legal Translation: Terminology Control and Error Analysis in Neural Machine Translation
장아남 (Department of Computer and Information Engineering, Youngsan University); 소길자 (Youngsan University)
Vol. 6, No. 3, pp. 119-131
Abstract
Legal translation between Korean and Chinese poses unique challenges due to the complexity of legal terminology, strict requirements for consistency, and the scarcity of domain-specific bilingual resources. Although recent advances in neural machine translation (NMT) have improved overall translation performance, terminology reliability and error distribution remain unresolved issues, which limits the practical applicability of such systems in legal contexts. This study proposes a terminology-aware translation framework that integrates a bilingual legal glossary into both the decoding and post-editing stages of an mBART-based NMT model. It also introduces an error analysis framework that classifies and quantifies common legal translation errors, including semantic inconsistency, grammatical inaccuracies, and terminological deviations. Experiments on a Korean–Chinese legal corpus show that the proposed approach not only improves scores on conventional evaluation metrics such as BLEU and TER, but also achieves higher terminology accuracy and consistency than baseline systems. These findings highlight the importance of domain-specific terminology control and systematic error analysis in improving the reliability of machine-assisted legal translation, thereby contributing to more trustworthy applications in judicial and legislative contexts.
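The abstract reports terminology accuracy but does not define how it is computed. As a minimal sketch under stated assumptions, one plausible reading is the share of glossary terms found in the source whose prescribed Chinese rendering also appears in the system output. The glossary entries, the sentence pairs, and the function name `terminology_accuracy` below are hypothetical illustrations, not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical glossary (Korean term -> required Chinese term); illustrative only.
GLOSSARY: Dict[str, str] = {
    "계약": "合同",
    "손해배상": "损害赔偿",
}


def terminology_accuracy(
    pairs: List[Tuple[str, str]], glossary: Dict[str, str]
) -> Tuple[int, int, float]:
    """Count glossary hits in (source, hypothesis) pairs.

    A glossary term counts once per sentence in which it occurs in the
    source; it is matched if the required target term occurs in the
    hypothesis. Plain substring matching is an assumption of this sketch.
    """
    matched = 0
    total = 0
    for source, hypothesis in pairs:
        for ko_term, zh_term in glossary.items():
            if ko_term in source:
                total += 1
                if zh_term in hypothesis:
                    matched += 1
    accuracy = matched / total if total else 0.0
    return matched, total, accuracy


# Hypothetical system outputs: the second one paraphrases the term
# instead of using the glossary rendering, so it counts as a miss.
pairs = [
    ("계약을 해지한다", "解除合同"),
    ("손해배상을 청구한다", "请求赔偿损失"),
]

matched, total, accuracy = terminology_accuracy(pairs, GLOSSARY)
print(f"{matched}/{total} glossary terms rendered as required ({accuracy:.0%})")
```

Substring matching keeps the sketch short; a real evaluation would likely need word segmentation and handling of overlapping or inflected terms.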
- Publisher: Korean Association of Artificial Intelligence Education (한국인공지능교육학회)
- Classification: Education