Robust Phishing URL Detection Through Deep Learning and Domain Shift Mitigation
Access status:
Open Access
Type
Thesis
Thesis type
Doctor of Philosophy
Author/s
Rashid, Fariza
Abstract
This thesis addresses the challenge of phishing URL detection by focusing on domain shift to improve performance. We show that state-of-the-art classifiers struggle when URLs differ from their training datasets and identify features with distribution shifts through statistical analysis. To address this, we propose an Unsupervised Domain Adaptation (UDA) framework that aligns features between source and target datasets, enhancing detection accuracy. Additionally, we leverage the reasoning capabilities of large language models (LLMs) to develop a one-shot phishing URL classification framework, demonstrating improved performance under domain shifts. Finally, we integrate these advancements into a federated learning framework, enabling secure, distributed training on private datasets from multiple organisations while overcoming domain shift and leveraging LLMs' contextual understanding.
Date
2025
Rights statement
The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School
Faculty of Engineering, School of Computer Science
Awarding institution
The University of Sydney