Career Profile
Experiences
- Designed and implemented the first secure and usable Audio Adversarial CAPTCHA system to improve the security of real-world audio CAPTCHAs against state-of-the-art Automatic Speech Recognition (ASR) models
- Published and presented research at various top-tier computer security workshops and conferences
- Assisted students with coding assignments and projects for the Introduction to Cyber Security (CMPS 315) course
- Assisted students with coding assignments and projects for the Linux System Administration and Maintenance (INFX 450) course
- Conducted research on the robustness of widely used image CAPTCHAs against current machine learning and deep learning technologies
- Published and presented research at various top-tier computer security workshops and conferences
Projects
Privacy-Preserving Federated Learning for Minimized fNIRS Data (funded by Facebook)
- Leading a 3-person team to develop data loaders, models, and training strategies to learn from limited fNIRS data in a Federated Learning setting
- Implementing differential privacy and other privacy-preserving machine learning techniques to prevent the privacy leakage of training data
- Tech stack: PyTorch, PySyft, Opacus
Deep Learning attack against the hCaptcha system
- hCaptcha is the second most widely used CAPTCHA service in the US
- Developed a system employing deep learning models and computer vision tools to break hCaptcha
- Evaluated the effectiveness and efficiency of the system by solving hCaptcha challenges automatically from live websites with over 95% accuracy
- Tech stack: Python, PyTorch, JavaScript, Puppeteer
Deep Learning attack against Google's image reCAPTCHA v2
- reCAPTCHA v2 is the most popular CAPTCHA service on the entire Internet
- Developed a fully automated system utilizing web automation software and an object detection model to break Google’s reCAPTCHA v2 with over 83% accuracy
- Tech stack: JavaScript, Selenium/Puppeteer, Python, TensorFlow, C, Darknet
Automated Extraction of Cyber Threat Intelligence from unstructured data
- Developed a pipeline for scraping, preprocessing, and cleaning unstructured data from online hackers’ forums
- Implemented both ML (Logistic Regression, Random Forest, Decision Tree, k-NN, etc.) and DL models to classify forum posts into different threat categories. Utilized NLP and topic modeling to uncover current cyber threats
- Tech stack: Scikit-learn, Word2Vec, NLTK, SpaCy, Gensim, Keras
Publications
2022 IEEE European Symposium on Security and Privacy (EuroS&P), 2022
15th IEEE Workshop on Offensive Technologies (WOOT), 2021
23rd International Symposium on Research in Attacks, Intrusions, and Defenses (RAID), 2020
BMC Medical Informatics and Decision Making, 2021