Efficient document streaming requires robust preprocessing and semantic modeling to handle noise, redundancy, and morphological variation in large-scale text data. Existing stemming and document-processing techniques often fail to preserve contextual relevance, which reduces classification and retrieval performance. To overcome this drawback, this paper proposes a Rule-based Pre-processing Iterative Stripping (RPIS) model coupled with a Deep Siamese GRU-BiLSTM model. RPIS systematically removes affixes according to linguistic rules, while the Siamese GRU-BiLSTM model captures the bidirectional semantic dependencies between text segments. Experiments on benchmark datasets show that the proposed model achieves 95% training accuracy and 93% validation accuracy, outperforming traditional stemmers and standalone deep learning models. The error statistics are also markedly lower, with a mean squared error (MSE) of 0.012, a mean absolute error (MAE) of 0.008, and a root mean squared error (RMSE) of 0.109. These findings indicate that rule-based preprocessing and deep semantic learning complement each other in improving the accuracy and resilience of document streaming, which makes the method suitable for large-scale document management systems.
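To make the idea of iterative affix stripping concrete, the following is a minimal sketch and not the RPIS rule set itself. The rule tables, the minimum stem length, and the function name `strip_affixes` are illustrative assumptions. The sketch only shows the general pattern of applying one rule per pass and repeating until no rule fires.

```python
# Hypothetical rule tables for illustration; the actual RPIS rules are not listed here.
SUFFIX_RULES = ("ness", "less", "ing", "ly", "ed")
PREFIX_RULES = ("un",)
MIN_STEM_LEN = 3  # assumed guard against stripping short words down to nothing


def strip_affixes(word: str) -> str:
    """Strip one affix per pass and repeat until no rule applies."""
    stem = word.lower()
    while True:
        # Try suffix rules first; stop at the first rule that fires.
        for suffix in SUFFIX_RULES:
            if stem.endswith(suffix) and len(stem) - len(suffix) >= MIN_STEM_LEN:
                stem = stem[: -len(suffix)]
                break
        else:
            # No suffix fired, so try prefix rules.
            for prefix in PREFIX_RULES:
                if stem.startswith(prefix) and len(stem) - len(prefix) >= MIN_STEM_LEN:
                    stem = stem[len(prefix):]
                    break
            else:
                # Neither a suffix nor a prefix rule fired: the stem is stable.
                return stem


if __name__ == "__main__":
    for token in ("streaming", "carelessness", "unkindness", "sing"):
        print(token, "->", strip_affixes(token))
```

Under these assumed rules, "carelessness" is reduced in two passes ("ness", then "less"), while "sing" is left unchanged because stripping "ing" would violate the minimum stem length. A purely rule-based stripper of this kind can still over-stem (for example, "under" would lose its "un"), which is the sort of error a downstream semantic model would have to absorb.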