Applications of Big Data and AI-Driven Technologies in High-Dimensional Data Analysis: Taiwanese Bankruptcy Prediction Using Machine Learning Models with Factor Analysis
Ko Ju-yong (Korea University); Lee Jae-woo (Korea University)
Vol. 28, No. 4, pp. 286-302
Abstract
Artificial intelligence techniques for predicting corporate bankruptcy have developed steadily over time. The first step in analyzing real-world data on bankrupt companies is to account for multicollinearity, which biases estimation and can cause large errors. Studies in finance have described the problems that arise from a strong association between the features and the outcome in a dataset, but research on the role of factor analysis techniques in handling bias in corporate data is still at a nascent stage. Integrative big data analytics can combine unsupervised learning, which uncovers the structure of high-dimensional data, with supervised learning, which classifies the target outcome efficiently. In this study, the results show that random forest classification with factor analysis outperforms the other big data analysis techniques considered in terms of predictive accuracy. The goal of this study is to narrow the gap between the theoretical concepts of artificial intelligence techniques and the analysis of real-world financial data, and to develop big data analytics methods for high-dimensional data with strongly associated corporate features. The proposed method can be applied to similarly structured data and may contribute to understanding the interplay between corporate bankruptcy and financial features.
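The combination described in the abstract, an unsupervised factor analysis step feeding a supervised random forest classifier, can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation: the synthetic data from `make_classification`, the choice of 10 factors, the 100 trees, and the helper name `build_factor_forest` are all assumptions made for illustration, and the Taiwanese bankruptcy data is not used here.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import FactorAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_factor_forest(n_factors=10, seed=0):
    """Hypothetical helper: scale features, extract latent factors, then classify."""
    return make_pipeline(
        StandardScaler(),  # put all features on a comparable scale
        FactorAnalysis(n_components=n_factors, random_state=seed),  # unsupervised step
        RandomForestClassifier(n_estimators=100, random_state=seed),  # supervised step
    )


# Synthetic stand-in for correlated corporate features (assumption, not real data):
# 24 of the 40 features are linear combinations of the 8 informative ones.
X, y = make_classification(
    n_samples=1000,
    n_features=40,
    n_informative=8,
    n_redundant=24,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_factor_forest(n_factors=10)
model.fit(X_train, y_train)

# Held-out accuracy of the full pipeline.
accuracy = model.score(X_test, y_test)

# Factor scores seen by the forest: every step except the final classifier.
factors = model[:-1].transform(X_test)

print(f"factor score matrix shape: {factors.shape}")
print(f"test accuracy: {accuracy:.3f}")
```

Placing the factor analysis inside the pipeline means the factors are estimated on the training split only, so the held-out accuracy is not inflated by information from the test rows. The number of factors would normally be chosen from the data rather than fixed in advance.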
- Publisher:
- Korean Society for Industrial and Applied Mathematics
- Classification:
- Mathematics