Paper 01
A Novel Approach for Phishing Detection Using NLP
University of Alabama | 2025
Abstract - Phishing emails remain a pressing threat to global cybersecurity. This paper investigates the effectiveness of various natural language processing and machine learning techniques for detecting phishing emails. Three NLP techniques (n-grams, bag-of-words, and term frequency-inverse document frequency) are evaluated across three machine learning models: logistic regression, random forest, and support vector machine. We apply 18 unique model-vectorization combinations to a dataset of almost 82,500 emails and evaluate them on accuracy, precision, recall, and F1 score. The strongest configuration combines TF-IDF with a (1,2) n-gram range and an SVM, reaching 99.19% accuracy and reducing the error rate by 38.17% compared with the second-best setup.
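To make the feature and metric terminology concrete, the following is a minimal standard-library sketch of TF-IDF weighting over a (1,2) word n-gram range, together with the four evaluation metrics named in the abstract. It is not the paper's implementation: the function names (`ngram_features`, `tfidf_vectors`, `binary_metrics`), the whitespace tokenizer, and the smoothed IDF formula are all assumptions chosen for illustration, and the classifier itself (for example, an SVM) is deliberately left out.

```python
import math
from collections import Counter


def ngram_features(text, ngram_range=(1, 2)):
    """Return word n-grams for every length in ngram_range (inclusive).

    Assumption: lowercase + whitespace tokenization; the paper's
    preprocessing may differ.
    """
    tokens = text.lower().split()
    lo, hi = ngram_range
    return [
        " ".join(tokens[i:i + n])
        for n in range(lo, hi + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf_vectors(docs, ngram_range=(1, 2)):
    """Map each document to a sparse {ngram: tf * idf} dictionary.

    Assumption: a smoothed IDF, ln((1 + N) / (1 + df)) + 1, is used here;
    other TF-IDF variants exist.
    """
    counts = [Counter(ngram_features(d, ngram_range)) for d in docs]
    # Document frequency: number of documents containing each n-gram.
    doc_freq = Counter(term for c in counts for term in c)
    n_docs = len(docs)
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

In a full pipeline, the sparse vectors returned by `tfidf_vectors` would be passed to a classifier, and its predictions on held-out emails would be scored with `binary_metrics`, treating the phishing class as the positive label.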
Index Terms - cybersecurity, detection, feature extraction, logistic regression, machine learning, natural language processing, phishing email, random forest, spam, support vector machines
Paper 02
Rapid Literature Review of Reinforcement Learning and Large Language Model Techniques for Software Engineering Testing and Bug Detection
University of Alabama | 2025
Abstract - As software systems become more complex, effective testing requires methods that are both efficient and adaptable. This literature review synthesizes recent work on reinforcement learning and large language models for software testing and bug detection. Reinforcement learning excels in dynamic environments such as continuous integration, autonomous driving, and cybersecurity because it can prioritize tests and uncover subtle faults through adaptive exploration. Large language models, including GPT-3 and Llama-2, are strong in code-centric tasks such as test generation, oracle support, and bug detection. The review compares both approaches across environment dynamics, data availability, interpretability, and delivery constraints, and argues that hybrid strategies may deliver the strongest practical outcomes.
Index Terms - software testing methods, reinforcement learning, large language models, test automation, test case generation, test coverage, test oracle generation, bug detection, fuzz testing, mutation testing, vulnerability detection