Original Article

Scalable Email Spam Detection Using BiLSTM with Large-Scale Hybrid Datasets

Patinavalasa Durga Prasad¹, Suneel Kumar Duvvuri²
¹ Student, M.Sc. (Computer Science), Government College (Autonomous), Rajahmundry, Andhra Pradesh, India.
² Assistant Professor, Department of Computer Science, Government College (Autonomous), Rajahmundry, Andhra Pradesh, India.

Published Online: March-April 2026

Pages: 96-105
