Academic paper · 언어과학 (Journal of Language Sciences) · Published November 2022

법률상담 도메인의 자연어이해 모델 학습을 위한 언어자원 구축 방법론

A Methodology for Building Linguistic Resources for Natural Language Understanding Model Training in a Legal Counseling Domain

황창회 (Hankuk University of Foreign Studies); 남지순 (Hankuk University of Foreign Studies)

Vol. 29, No. 4, pp. 181-212

Abstract

This study proposes a methodology for constructing linguistic resources to train Natural Language Understanding (NLU) models for legal counseling services. A dataset based on the proposed linguistic resources is essential for developing non-face-to-face legal services that provide information about legal problems. The linguistic resources were constructed through a bottom-up analysis of the linguistic patterns of legal expressions, background descriptions, and discourse types found in online legal counseling texts. In addition, we analyzed the hierarchical classification of keywords in existing legal service systems and newly defined 20 keywords belonging to 4 representative legal categories. Local Grammar Graphs (LGGs), which are effective for describing local linguistic phenomena, were adopted to describe the various linguistic patterns in this domain. These local language patterns, modularized in LGG format, are converted into Finite State Transducers (FSTs), which generate the datasets required to train a language model for NLU. To evaluate this process, we trained an NLU model of the open-source chatbot architecture Rasa on our dataset. The model achieves an F1-score of 0.91, which indicates that the linguistic resources and the methodology proposed in this study can be practically applied to developing legal counseling chatbot systems.
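
The pipeline described in the abstract (pattern graphs expanded into training utterances for Rasa) can be illustrated with a minimal sketch. This is not the authors' LGG/FST tooling: the toy graph, the `deposit_return` intent name, the English example phrases, and the helper functions `expand`, `generate_utterances`, and `to_rasa_nlu_yaml` are all hypothetical. The sketch simply enumerates every path through a small acyclic graph and writes the results in the YAML layout commonly used for Rasa NLU training data.

```python
# Hypothetical pattern graph standing in for a compiled local grammar.
# Each state maps to a list of (emitted phrase, next state) arcs.
GRAPH = {
    "start": [("my landlord", "verb"), ("the landlord", "verb")],
    "verb": [("will not return", "obj"), ("refuses to return", "obj")],
    "obj": [("my deposit", "end")],
}


def expand(graph, state="start", end="end"):
    """Yield every path from `state` to `end` as a list of phrases."""
    if state == end:
        yield []
        return
    for phrase, next_state in graph[state]:
        for rest in expand(graph, next_state, end):
            yield [phrase] + rest


def generate_utterances(graph):
    """Join the phrases on each path into one training utterance."""
    return [" ".join(path) for path in expand(graph)]


def to_rasa_nlu_yaml(intent, utterances):
    """Format utterances as a single-intent NLU training-data document."""
    lines = ['version: "3.1"', "nlu:", f"- intent: {intent}", "  examples: |"]
    lines += [f"    - {u}" for u in utterances]
    return "\n".join(lines) + "\n"


utterances = generate_utterances(GRAPH)
print(to_rasa_nlu_yaml("deposit_return", utterances))
```

A real local grammar would also carry output annotations (for example entity labels) on its arcs and may contain loops, which this acyclic toy graph deliberately omits.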

Publisher: 한국언어과학회 (The Korean Association of Language Sciences)
DOI: http://dx.doi.org/10.14384/kals.2022.29.4.181
Classification: Linguistics
