Academic paper | 지능정보연구 (Journal of Intelligence and Information Systems) | Published 2014.03 | KCI citations: 2

Smarter Classification for Imbalanced Data Set and Its Application to Patent Evaluation

권오병 (Kyung Hee University); 이상연 (Kyung Hee University)

Vol. 20, No. 1, pp. 15-34

Abstract

Overall accuracy as a performance measure does not fully account for modular accuracy: the accuracy of classifying 1 (or true) as 1 is not the same as that of classifying 0 (or false) as 0. A smarter classification algorithm would optimize the classification rules to match the goals set for the modular accuracies according to the nature of the problem. Correspondingly, smarter algorithms must be both more generalized with respect to the nature of problems and free from discretization, which may distort the real performance. Hence, in this paper, we propose a novel vertical boosting algorithm that improves modular accuracies. Rather than discretizing items, we use simple classifiers, such as a regression model, that accept continuous data types. To improve generalization and to select a classification model that is well suited to the nature of the problem domain, we developed a smart model selection algorithm. To show the soundness of the proposed method, we performed an experiment with a real-world application: predicting the intellectual properties of e-transaction technology, using a data set of more than 47,000 records.
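
The gap between overall accuracy and modular accuracy can be illustrated with a minimal sketch. It assumes that "modular accuracy" means the per-class accuracy described in the abstract (1 classified as 1, 0 classified as 0). The function name `modular_accuracies` and the sample labels are hypothetical and are not taken from the paper, and the sketch does not implement the proposed vertical boosting algorithm.

```python
def modular_accuracies(y_true, y_pred):
    """Return (overall, positive-class, negative-class) accuracy for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # 1 classified as 1
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # 0 classified as 0
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    overall = (tp + tn) / len(y_true)
    acc_pos = tp / n_pos if n_pos else float("nan")
    acc_neg = tn / n_neg if n_neg else float("nan")
    return overall, acc_pos, acc_neg


# Hypothetical imbalanced sample: 95 negatives and 5 positives.
y_true = [0] * 95 + [1] * 5
# A trivial classifier that always predicts the majority class 0.
y_pred = [0] * 100

overall, acc_pos, acc_neg = modular_accuracies(y_true, y_pred)
print(overall, acc_pos, acc_neg)  # 0.95 0.0 1.0
```

In this made-up sample the classifier scores 95% overall while never identifying a positive case, which is the kind of imbalance a single overall figure hides.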

Publisher:
Korea Intelligent Information Systems Society (한국지능정보시스템학회)
DOI:
http://dx.doi.org/10.13088
Classification:
Industrial Engineering
