Journal article, Knowledge Management Research (지식경영연구), published 2024.12, KCI citations: 1

Detecting Fake News in Korean Using Large Language Models: Limitations and Possibilities

고상훈 (Graduate School of Business IT, Kookmin University); 안현철 (Graduate School of Business IT, Kookmin University)

Vol. 25, No. 4, pp. 113-127

Abstract

Fake news spreads rapidly through digital platforms and social media and has become a serious problem that negatively affects social trust and public discourse. Recent advances in large language models (LLMs) have opened new possibilities for natural language processing, and these models are increasingly applied to socially important problems such as fake news detection. This study analyzes the possibilities and limitations of LLM-based fake news detection in Korean, focusing on misinformation among the various types of fake news. To this end, we built a benchmark dataset from 500 Korean news articles collected from SNU FactCheck (Seoul National University), and additionally constructed extractive and generative datasets by summarizing the articles. The study addresses three research questions: (1) Is LLM-based fake news detection effective in Korean? (2) Which summarization method is more effective for LLM-based fake news detection? (3) What is the best way to improve detection performance? The results show that LLM-based detection on the Korean dataset achieved lower accuracy (59.8%) than that reported in English-centered studies. In the experiments with summarized text, detection performance decreased as the text became shorter, and generative summaries performed slightly better than extractive summaries. Finally, an improved prompt that incorporates seven reasons serving as criteria for judging fake news raised detection accuracy slightly, to 62.1%. This study extends research on LLM-based fake news detection in Korean and demonstrates that more sophisticated prompt design and an approach that accounts for contextual factors can improve detection performance. The findings suggest that LLMs can be used not only for fake news detection but also across diverse linguistic and cultural contexts, with important implications for future research and practical applications.
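The evaluation setup described in the abstract (prompt an LLM with an article or its summary, read off a fake/real label, and report accuracy) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the prompt wording, and the toy data are hypothetical, and `ask_model` stands in for whatever LLM call the authors actually used. The abstract does not list the seven reasons, so `criteria` is left as a parameter rather than filled in.

```python
from typing import Callable, Sequence


def build_prompt(article: str, criteria: Sequence[str] = ()) -> str:
    """Assemble a zero-shot prompt; `criteria` optionally lists judgment reasons."""
    lines = [
        "Decide whether the following Korean news article is fake or real.",
        "Answer with exactly one word: FAKE or REAL.",
    ]
    if criteria:
        lines.append("Consider these criteria:")
        lines.extend(f"{i}. {c}" for i, c in enumerate(criteria, start=1))
    lines.append("Article:")
    lines.append(article)
    return "\n".join(lines)


def parse_label(response: str) -> str:
    """Naive parser: any response containing 'FAKE' counts as fake, else real."""
    return "fake" if "FAKE" in response.upper() else "real"


def accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty dataset")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def evaluate(
    articles: Sequence[str],
    gold: Sequence[str],
    ask_model: Callable[[str], str],
    criteria: Sequence[str] = (),
) -> float:
    """Prompt the model once per article and score the parsed labels."""
    predicted = [parse_label(ask_model(build_prompt(a, criteria))) for a in articles]
    return accuracy(gold, predicted)


def dummy_model(prompt: str) -> str:
    """Toy stand-in for an LLM call (hypothetical keyword rule, not a real model)."""
    return "FAKE" if "miracle" in prompt else "REAL"


# Toy data, invented for this sketch only.
toy_articles = [
    "A miracle pill cures every disease.",
    "The city council approved next year's budget.",
    "Officials confirmed the miracle of zero inflation.",
]
toy_gold = ["fake", "real", "real"]
toy_accuracy = evaluate(toy_articles, toy_gold, dummy_model)
```

Swapping `dummy_model` for a real LLM client and passing the full article, an extractive summary, or a generative summary as `articles` would give one accuracy figure per condition, which is the kind of comparison the abstract reports.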

Publisher: Knowledge Management Society of Korea (한국지식경영학회)
DOI: http://dx.doi.org/10.15813/kmr.2024.25.4.006
Classification: Business Administration