Academic paper · 한국산업융합학회논문집 (Journal of the Korean Society of Industry Convergence) · Published October 2025

LLM 기반 의미론적 특허 데이터 노이즈 필터링 방법론 연구

LLM-Based Semantic Noise Filtering Method for Patent Text Data

임진성 (Department of Technology Management, Graduate School, Gyeongsang National University); 송지훈 (Gyeongsang National University)

Vol. 28, No. 5, pp. 1379-1389

Abstract

Patent data are essential for tracking technological progress, assessing competitiveness, and forecasting future developments. However, the rapid evolution of technology and the rise of convergent fields make filtering irrelevant data a persistent challenge. Traditional statistical models and manual preprocessing by researchers require substantial time and effort, prompting continuous research on efficient information structuring. In particular, filtering methods based on statistics or keywords have limitations in fully capturing subtle technical nuances and complex contexts. To address these limitations, this study proposes a semantic noise filtering methodology for patent data leveraging the contextual understanding capabilities of large language models (LLMs). The approach integrates LLM-based classification, statistical stability analysis, and cross-LLM review procedures to enhance the consistency and reliability of the filtering results. Applied to 1,930 domestic patents in the bio-artificial organ domain from 2000 to 2024, the method identified 55.4% as noise. The results demonstrate the method’s potential as an effective tool for technology policy formulation and strategic decision-making support.
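
The abstract names three stages (LLM-based classification, statistical stability analysis, and cross-LLM review) without giving their details. The following is only a minimal sketch of how such a pipeline could be wired together, not the authors' implementation. It assumes that stability is measured as the agreement ratio across repeated classifications of the same text, and that low-agreement cases are passed to a second model. The names `stability_vote`, `filter_patents`, `primary`, `reviewer`, `n_runs`, and `min_agreement` are all hypothetical, and the two classifiers are plain callables standing in for real LLM calls.

```python
from collections import Counter
from typing import Callable, Dict, List

Label = str                          # assumed label set: "relevant" or "noise"
Classifier = Callable[[str], Label]  # stand-in for a call to an LLM


def stability_vote(text: str, classify: Classifier, n_runs: int = 5) -> Dict[str, object]:
    """Classify the same text n_runs times and report the majority label
    together with the share of runs that agreed with it."""
    labels = [classify(text) for _ in range(n_runs)]
    label, count = Counter(labels).most_common(1)[0]
    return {"label": label, "agreement": count / n_runs}


def filter_patents(
    texts: List[str],
    primary: Classifier,
    reviewer: Classifier,
    n_runs: int = 5,
    min_agreement: float = 0.8,   # assumed threshold, not taken from the paper
) -> List[Dict[str, object]]:
    """Label each text with the primary classifier; when repeated runs
    disagree too often, defer to a second (reviewer) classifier."""
    results = []
    for text in texts:
        vote = stability_vote(text, primary, n_runs)
        needs_review = vote["agreement"] < min_agreement
        final_label = reviewer(text) if needs_review else vote["label"]
        results.append({
            "text": text,
            "label": final_label,
            "agreement": vote["agreement"],
            "reviewed": needs_review,
        })
    return results
```

Using an odd `n_runs` with two labels avoids ties in the majority vote. The share of records whose final label is "noise" corresponds to the kind of figure the abstract reports, although the paper's actual statistics and review rules may differ from this sketch.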

Publisher:
한국산업융합학회 (Korean Society of Industry Convergence)
Classification:
Other general engineering
