Original scientific article

RULE-BASED ITERATIVE PREPROCESSING WITH DEEP SIAMESE GRU–BILSTM FOR EFFICIENT DOCUMENT STREAMING

By
K. Ranjit Kumar

Assistant Professor, Department of Computer and Information Science, Annamalai University, Annamalai Nagar, Chidambaram, Tamil Nadu, India

S. Thirumaran

Assistant Professor, Department of Computer Application, Alagappa Government Arts College, Karaikudi, Tamil Nadu, India

Abstract

Efficient document streaming requires robust preprocessing and semantic modeling to handle noise, redundancy, and morphological variation in large-scale text data. Existing stemming and document processing techniques often fail to preserve contextual relevance, which reduces classification and retrieval performance. To overcome this drawback, this paper proposes a Rule-based Pre-processing Iterative Stripping (RPIS) model coupled with a Deep Siamese GRU-BiLSTM model. RPIS systematically removes affixes according to linguistic rules, while the Siamese GRU-BiLSTM model captures bidirectional semantic dependencies between text segments. Experiments on benchmark datasets show that the proposed model achieves 95% training accuracy and 93% validation accuracy, outperforming traditional stemmers and standalone deep learning models. Error statistics are also much lower, with an MSE of 0.012, an MAE of 0.008, and an RMSE of 0.109. These findings indicate that rule-based preprocessing and deep semantic learning complement each other in improving the accuracy and resilience of document streaming, which makes the method suitable for large-scale document management systems.
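The abstract does not list the RPIS rules, so the following is only a minimal sketch of what iterative rule-based affix stripping could look like. The suffix table, the minimum stem length, and the function names (`strip_once`, `iterative_strip`) are all assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical suffix rules (suffix, replacement), checked in order.
# These are illustrative assumptions, not the rule set used by RPIS.
SUFFIX_RULES = [
    ("ness", ""),
    ("ing", ""),
    ("ed", ""),
    ("ly", ""),
    ("ss", "ss"),  # guard: keep a double "s" so the plain "s" rule does not fire
    ("s", ""),
]
MIN_STEM_LEN = 3  # assumed lower bound to avoid over-stripping short words


def strip_once(word):
    """Apply the first matching rule whose result is long enough."""
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            candidate = word[: len(word) - len(suffix)] + replacement
            if len(candidate) >= MIN_STEM_LEN:
                return candidate
    return word


def iterative_strip(word, max_passes=5):
    """Repeat single-pass stripping until the word stops changing."""
    current = word.lower()
    for _ in range(max_passes):
        stripped = strip_once(current)
        if stripped == current:
            break
        current = stripped
    return current


if __name__ == "__main__":
    for token in ["streamings", "documents", "kindness", "mindlessly", "red"]:
        print(token, "->", iterative_strip(token))
```

The loop is what makes the stripping iterative: a token such as "streamings" loses its plural marker on the first pass and the "ing" suffix on the second, while the length check leaves short words such as "red" untouched.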


This is an open access article distributed under the Creative Commons Attribution Non-Commercial (CC BY-NC) License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

