As technology advances, readers increasingly prefer digital journalism to print journalism. Digital journalism reaches more readers because it delivers news quickly, keeps it up to date, and is accessible everywhere. Alongside these advantages, it also poses some difficulties compared to print journalism: preparing and delivering news to the user requires more technological knowledge and equipment. In the preparation phase, the processes of title selection, text creation, photo selection, and determination of the appropriate news category are designed to be both faster and more user-friendly than in print publishing. A news item prepared for a target audience may belong to one or more categories, such as economy, politics, sports, technology, and health. Assigning news to the appropriate category makes it easier to reach the right audience and to archive the news correctly. In this study, news texts were classified by category using machine learning methods. The dataset consisted of 10,500 news items from five newspapers in three different categories. Bayesian classifier and decision tree methods were used to classify the news. The results showed that the Bayesian classifier classified the news by category more successfully than the decision tree.
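To make the idea of a Bayesian text classifier concrete, the following is a minimal sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing, written in plain Python. It is an illustration only, not the implementation used in the study: the function names `train_naive_bayes` and `predict`, the whitespace tokenization, the toy headlines, and the choice of the three categories (economy, sports, health) are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)       # number of documents in each category
    word_counts = defaultdict(Counter)   # word frequencies within each category
    vocab = set()                        # all distinct words seen in training
    for text, label in zip(docs, labels):
        tokens = text.lower().split()    # assumption: simple whitespace tokenization
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the category with the highest log posterior score."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        # Log prior: share of training documents in this category
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # words never seen in training are ignored
            # Add-one smoothing avoids zero probabilities for rare words
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy headlines, invented for illustration (not the study's data)
train_docs = [
    "central bank raises interest rates",
    "stock market falls as inflation rises",
    "team wins championship final match",
    "striker scores twice in league match",
    "new vaccine reduces flu infections",
    "doctors recommend regular exercise for heart health",
]
train_labels = ["economy", "economy", "sports", "sports", "health", "health"]

model = train_naive_bayes(train_docs, train_labels)
print(predict(model, "bank cuts interest rates"))
```

A real pipeline would typically add preprocessing (stop-word removal, stemming), a held-out test set, and a comparison against a decision tree; those steps are omitted here to keep the sketch short.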
Keywords: Category, Classification, Machine Learning, News
JEL Classification: C46
Kayakuş, M., Yiğit Açıkgöz, F. (2022). Classification of News Texts by Categories Using Machine Learning Methods. Alphanumeric Journal, 10(2), 155-166. https://doi.org/10.17093/alphanumeric.1149753
Received: July 27, 2022
Accepted: Oct. 20, 2022
Published: Dec. 31, 2022
© 2022 Kayakuş, M., Yiğit Açıkgöz, F.
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence, which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
School of Transportation and Logistics, Istanbul University
Avcilar Campus 34320 Avcilar/Istanbul/TURKEY
Bahadır Fatih Yıldırım, Ph.D.
+ 90 (212) 473 70 00 - 19263
Alphanumeric Journal has been published as an international peer-reviewed journal every six months since 2013. It serves as a vehicle for researchers and practitioners in the field of quantitative methods and enables sharing across all fields related to operations research, statistics, econometrics, and management information systems in order to enhance quality on a global scale.