Optimizing Text Classification Using Artificial Neural Networks and K-Fold Cross-Validation in High-Volume Data Processing

Callum Revere

Published: Nov 1, 2024

Abstract

With the rapid expansion of internet resources and advances in big data and cloud computing, effective text classification has become essential for managing and extracting meaningful information from very large text collections. Traditional, labor-intensive categorization methods cannot keep up with modern data volumes. This study develops an automatic text classification method based on artificial neural networks (ANN) to improve accuracy and efficiency. Using web-crawling techniques, 1,600 categorized articles were gathered across four categories (technology, finance, culture, and politics), and a vector space was built from them with the bag-of-words model. The ANN model was trained on these features, and K-fold cross-validation was applied to assess its performance. The resulting classifier reached a 15% error rate (roughly 85% accuracy), which the study considers acceptable, and the results suggest that further refinement guided by K-fold cross-validation could improve reliability. The approach has implications for large-scale, automated text processing across sectors, reducing manual intervention and improving information retrieval.
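The pipeline summarized in the abstract (bag-of-words features, a neural classifier, K-fold cross-validation) can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not state the network architecture, the tokenizer, or the value of K, so the code assumes whitespace tokenization, a single-layer softmax network with no hidden layer as a stand-in for the ANN, and K = 4. All function names and the toy documents are hypothetical and exist only for illustration.

```python
import math
import random
from collections import Counter

def build_vocabulary(docs):
    """Map each distinct token to a column index (assumed whitespace tokenization)."""
    vocab = {}
    for doc in docs:
        for token in doc.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def bag_of_words(doc, vocab):
    """Term-count vector for one document; tokens outside the vocabulary are ignored."""
    vec = [0.0] * len(vocab)
    for token, count in Counter(doc.lower().split()).items():
        if token in vocab:
            vec[vocab[token]] = float(count)
    return vec

def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and deal it into k test folds (assumes k <= n)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def predict_proba(W, b, x):
    """Softmax over one linear layer; subtracting the max keeps exp() stable."""
    scores = [sum(w * xj for w, xj in zip(Wc, x)) + bc for Wc, bc in zip(W, b)]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_softmax(X, y, n_classes, epochs=100, lr=0.1):
    """Stand-in for the ANN: single-layer softmax network, per-sample gradient descent."""
    W = [[0.0] * len(X[0]) for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, label in zip(X, y):
            probs = predict_proba(W, b, x)
            for c in range(n_classes):
                grad = probs[c] - (1.0 if c == label else 0.0)  # cross-entropy gradient
                b[c] -= lr * grad
                for j, xj in enumerate(x):
                    if xj:
                        W[c][j] -= lr * grad * xj
    return W, b

def cross_validate(docs, labels, k=4, seed=0):
    """Return the test error rate of each fold."""
    n_classes = max(labels) + 1
    errors = []
    for test_idx in k_fold_indices(len(docs), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(docs)) if i not in held_out]
        # The vocabulary is built from the training fold only, to avoid leakage.
        vocab = build_vocabulary([docs[i] for i in train_idx])
        X_train = [bag_of_words(docs[i], vocab) for i in train_idx]
        y_train = [labels[i] for i in train_idx]
        W, b = train_softmax(X_train, y_train, n_classes)
        wrong = 0
        for i in test_idx:
            probs = predict_proba(W, b, bag_of_words(docs[i], vocab))
            wrong += probs.index(max(probs)) != labels[i]
        errors.append(wrong / len(test_idx))
    return errors

# Toy data, invented for illustration: 0=technology, 1=finance, 2=culture, 3=politics.
docs = [
    "new chip boosts cloud computing", "cloud software chip release",
    "stock market bank rates", "bank raises market rates",
    "museum art film festival", "film festival art award",
    "election vote parliament law", "parliament passes election law",
]
labels = [0, 0, 1, 1, 2, 2, 3, 3]
fold_errors = cross_validate(docs, labels, k=4)
print("mean error rate:", sum(fold_errors) / len(fold_errors))
```

Averaging the per-fold error rates gives one estimate that does not depend on a single train/test split, which is the role K-fold cross-validation plays in the abstract. With only eight toy documents the number printed here is not meaningful; the sketch only shows how the pieces fit together.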

How to Cite

Revere, C. (2024). Optimizing Text Classification Using Artificial Neural Networks and K-Fold Cross-Validation in High-Volume Data Processing. Journal of Computer Science and Software Applications, 4(7), 15–18. Retrieved from https://mfacademia.org/index.php/jcssa/article/view/169

Issue

Vol. 4 No. 7 (2024)

Section

Articles

This work is licensed under a Creative Commons Attribution 4.0 International License.

Mind forge Academia also operates under the Creative Commons Licence CC-BY 4.0. This licence allows anyone to copy and redistribute the material in any medium or format for any purpose, even commercially, provided that appropriate citation information is given.
