Corporate bankruptcy prediction is an important research direction in finance. A robust bankruptcy prediction scheme can benefit several stakeholders, including management organizations, governments, and stockholders. Ensemble learning is a well-known technique for improving the predictive performance of classification algorithms by decreasing the generalization error and increasing classification accuracy, and it is well established in bankruptcy prediction. Diversity among ensemble members plays an essential role in constructing robust ensemble classification schemes. This paper presents a clustering-based classifier ensemble approach for corporate bankruptcy prediction. In this scheme, the k-means algorithm is used to obtain diversified training subsets. Each base learning algorithm is trained on these subsets, and the predictions of the base learning algorithms are combined by majority voting. In the empirical analysis, four classification algorithms (C4.5, k-nearest neighbour, support vector machines, and logistic regression) and three ensemble learning methods (Bagging, AdaBoost, and Random Subspace) are evaluated.
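The pipeline outlined in the abstract (k-means to build diversified training subsets, one base learner per subset, majority voting) can be illustrated with a minimal sketch. It is not the paper's implementation. The abstract does not say how clusters are turned into subsets, so the sketch assumes each class is clustered separately and subset i collects cluster i of every class. For brevity it also uses a single base learner, 1-nearest neighbour, instead of the four algorithms evaluated in the paper. All function names, the two "financial ratio" features, and the toy data are hypothetical.

```python
import random
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point.
    Requires k <= len(points)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return assign

def clustered_subsets(X, y, k, seed=0):
    """ASSUMED scheme (not stated in the abstract): cluster each class
    separately; subset i holds cluster i of every class."""
    subsets = [[] for _ in range(k)]
    for label in sorted(set(y)):
        pts = [x for x, t in zip(X, y) if t == label]
        for p, c in zip(pts, kmeans(pts, k, seed=seed)):
            subsets[c].append((p, label))
    return [s for s in subsets if s]  # drop subsets that ended up empty

def predict_1nn(train, x):
    """Base learner: label of the nearest training example in one subset."""
    return min(train, key=lambda pair: sq_dist(pair[0], x))[1]

def ensemble_predict(subsets, x):
    """Majority vote over the base learners, one per training subset."""
    votes = Counter(predict_1nn(s, x) for s in subsets)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two made-up financial ratios per firm,
# label 0 = healthy, label 1 = bankrupt.
rng = random.Random(42)
healthy = [(rng.uniform(1.0, 3.0), rng.uniform(1.0, 3.0)) for _ in range(30)]
bankrupt = [(rng.uniform(-3.0, -1.0), rng.uniform(-3.0, -1.0)) for _ in range(30)]
X = healthy + bankrupt
y = [0] * 30 + [1] * 30

subsets = clustered_subsets(X, y, k=3)
print(ensemble_predict(subsets, (1.8, 2.1)))    # query inside the healthy region
print(ensemble_predict(subsets, (-2.2, -1.7)))  # query inside the bankrupt region
```

Because each subset is drawn from a different region of the feature space, the base learners see different training data, which is one way to encourage the diversity the abstract emphasizes. A fuller sketch would train several different base algorithms on each subset before voting.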
A Clustering-Based Classifier Ensemble Approach for Corporate Bankruptcy Prediction
Abstract (translated from Turkish)
Predicting firm failure is an important research direction in finance. Developing reliable failure prediction models can be highly beneficial to many different stakeholders, including management organizations, government institutions, and shareholders. Ensemble learning is an important technique that improves the predictive performance of classification algorithms by reducing the generalization error and increasing the rate of correct classification. It is widely used in predicting firm failure. Diversity plays an important role in building high-performance classifier ensembles. This study presents a clustering-based classifier ensemble approach for predicting firm failure. In the proposed design, the k-means algorithm is used to create diversified training subsets. Each base learning algorithm in the classifier ensemble is trained on these training subsets, and the individual outputs of the base learners are combined by majority voting. In the experimental analyses, four classification algorithms (C4.5, k-nearest neighbour, support vector machines, and logistic regression) and three ensemble learning methods (Bagging, AdaBoost, and random subspace) were evaluated.
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence, which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Alphanumeric Journal has been published as an international peer-reviewed journal every six months since 2013. It serves as a vehicle for researchers and practitioners in the field of quantitative methods and supports the sharing of work across all fields related to operations research, statistics, econometrics, and management information systems, in order to enhance quality on a global scale.