Academic paper · 정보시스템연구 (The Journal of Information Systems) · Published 2025.03

트랜스포머 기반 BERT를 활용한 비특허 문헌 자동 분류의 성능 향상 방안 연구

Using Transformer-Based BERT for Improving the Performance of Automatic Non-Patent Literature Classification

김성원 (Gyeongsang National University); 안민영 (Gyeongsang National University); 유동희 (Gyeongsang National University)

Vol. 34, No. 1, pp. 155-170

Abstract

Purpose: Non-Patent Literature (NPL) plays a crucial role in patent examination but is difficult to classify due to its vast volume and diverse formats. This study proposes an approach utilizing BERT-based Natural Language Processing (NLP) techniques to automatically classify NPL and assign Cooperative Patent Classification (CPC) codes.

Design/methodology/approach: NPL abstracts cited in U.S. patents were collected from KIPRIS Plus. The study applied vectorization techniques such as TF-IDF, SBERT, and anferico/bert-for-patents, and compared classification performance using Logistic Regression, XGBoost, LightGBM, BERT, RoBERTa, and anferico/bert-for-patents models.

Findings: The anferico/bert-for-patents model, specialized for patent documents, achieved the highest classification accuracy (56.3%) and effectively captured the semantic representation of NPL. This study contributes to improving NPL search and classification efficiency, enhancing the prior art search process in patent examination.
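
To make the baseline end of this comparison concrete, the following is a minimal sketch of TF-IDF vectorization followed by classification into CPC codes. It is not the authors' implementation: to stay dependency-free it swaps the study's Logistic Regression classifier for nearest-centroid cosine matching, and the toy abstracts, the CPC labels (`G06N`, `A61K`), and every function name are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(docs):
    """Smoothed inverse document frequency for each term in the training docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(doc, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unseen terms are dropped."""
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}


def fit_centroids(docs, labels, idf):
    """Sum the TF-IDF vectors per label; cosine similarity ignores the scale."""
    centroids = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        centroids[label].update(tfidf(doc, idf))
    return centroids


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def predict(doc, idf, centroids):
    """Return the label whose centroid is most similar to the document."""
    vec = tfidf(doc, idf)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical NPL abstracts and CPC labels, invented for this sketch.
train_docs = [
    "neural network model for image recognition and machine learning",
    "machine learning algorithm for text classification with neural network",
    "clinical trial of a drug for treatment of tumor patients",
    "drug delivery therapy for cancer treatment in patients",
]
train_labels = ["G06N", "G06N", "A61K", "A61K"]

idf = fit_idf(train_docs)
centroids = fit_centroids(train_docs, train_labels, idf)

print(predict("a deep neural network for learning image features", idf, centroids))
print(predict("treatment of cancer patients with a new drug", idf, centroids))
```

A sparse bag-of-words pipeline like this one matches documents on shared vocabulary only, which is the limitation that contextual encoders such as SBERT or anferico/bert-for-patents are meant to address.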

Publisher: 한국정보시스템학회
Category: Business Administration
