Optimizing Text Classification Using Artificial Neural Networks and K-Fold Cross-Validation in High-Volume Data Processing

Callum Revere

Abstract

With the rapid growth of internet resources and advances in big data and cloud computing, effective text classification has become essential for managing and extracting meaningful information from very large text collections. Traditional, labor-intensive categorization methods can no longer keep pace with this volume. This study develops an automatic text classification method based on artificial neural networks (ANN) to improve accuracy and efficiency. Using web-crawling techniques, 1600 categorized articles spanning technology, finance, culture, and politics were collected, and each article was represented as a feature vector using the bag-of-words model. The ANN was trained on these features, and K-fold cross-validation was applied to assess its performance. The resulting classifier reached an error rate of 15% (85% accuracy), which the study regards as acceptable, and the results suggest that further refinement guided by K-fold cross-validation could enhance reliability. The approach has implications for large-scale automated text processing across sectors, reducing manual intervention and improving information retrieval.
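The pipeline summarized in the abstract (bag-of-words features, a neural classifier, and a K-fold estimate of the error rate) can be illustrated with a minimal sketch. This is not the study's implementation: the toy documents, the function names, and the hyperparameters are all hypothetical, and a single-layer softmax network stands in for the ANN, whose architecture the abstract does not specify. Only the Python standard library is used.

```python
import math
import random
from collections import Counter

# Hypothetical toy corpus; the study used 1600 crawled articles instead.
DOCS = [
    "new chip speeds up cloud software", "software update fixes chip bug",
    "bank raises interest rates again", "stock market and bank shares fall",
    "museum opens modern art exhibition", "film festival celebrates art and music",
    "parliament votes on election law", "minister wins election after vote",
]
LABELS = ["technology", "technology", "finance", "finance",
          "culture", "culture", "politics", "politics"]

def build_vocabulary(docs):
    """Map each distinct lowercase token to a column index."""
    vocab = {}
    for doc in docs:
        for token in doc.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def vectorize(doc, vocab):
    """Bag-of-words: count occurrences of each vocabulary term in doc."""
    vec = [0.0] * len(vocab)
    for token, n in Counter(doc.lower().split()).items():
        if token in vocab:  # tokens unseen during training are ignored
            vec[vocab[token]] = float(n)
    return vec

def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k disjoint test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def predict_proba(W, b, x):
    """Softmax over one linear score per class."""
    scores = [sum(w * xj for w, xj in zip(row, x)) + bc for row, bc in zip(W, b)]
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_softmax(X, y, n_classes, epochs=100, lr=0.1):
    """Single-layer softmax network trained by per-sample gradient descent."""
    W = [[0.0] * len(X[0]) for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, label in zip(X, y):
            p = predict_proba(W, b, x)
            for c in range(n_classes):
                g = p[c] - (1.0 if c == label else 0.0)  # cross-entropy gradient
                b[c] -= lr * g
                for j, xj in enumerate(x):
                    if xj:
                        W[c][j] -= lr * g * xj
    return W, b

def cross_validate(docs, labels, k):
    """Return the mean test error rate over k folds."""
    classes = sorted(set(labels))
    y = [classes.index(label) for label in labels]
    errors = []
    for test_idx in k_fold_indices(len(docs), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(docs)) if i not in held_out]
        # The vocabulary is built from the training fold only, to avoid leakage.
        vocab = build_vocabulary([docs[i] for i in train_idx])
        X_train = [vectorize(docs[i], vocab) for i in train_idx]
        W, b = train_softmax(X_train, [y[i] for i in train_idx], len(classes))
        wrong = 0
        for i in test_idx:
            p = predict_proba(W, b, vectorize(docs[i], vocab))
            wrong += p.index(max(p)) != y[i]
        errors.append(wrong / len(test_idx))
    return sum(errors) / len(errors)

print("mean error rate over 4 folds:", cross_validate(DOCS, LABELS, k=4))
```

Each fold is held out once while the model is trained on the remaining folds, so the averaged figure estimates the error on unseen articles rather than on the training data. Note that K-fold cross-validation measures performance; any improvement comes from using that estimate to choose a better vocabulary, architecture, or training setup.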

How to Cite
Revere, C. (2024). Optimizing Text Classification Using Artificial Neural Networks and K-Fold Cross-Validation in High-Volume Data Processing. Journal of Computer Science and Software Applications, 4(7), 15–18. Retrieved from https://mfacademia.org/index.php/jcssa/article/view/169