Proposal Number: NPRP10-0208-170408
Program Cycle: NPRP 10
Submitting Institution Name: Hamad Bin Khalifa University
Project Status: Award Active
Start Date: 07 Jul 2018
Lead Investigator: Dr. Yin Yang
Project Duration: 3 Year(s)
End Date: 07 Jul 2021
Submission Type: New
Proposal Title: Detecting Fraudulent Messages with Strong Privacy Protection
This project aims to build a smartphone app that can automatically detect deceptive text messages (such as “You have won a free flight to Dubai!! Claim it on etihad.com.xyz”) and raise the awareness of vulnerable customers who are at increased risk of falling into such traps. A flood of such messages may cause a sudden burst of network traffic, which burdens the network infrastructure of the mobile operator, such as Ooredoo, our main industrial partner and co-funding provider. Further, vulnerable customers can become victims of such an attack, which often ends up as a fraud case.

However, it is not easy to determine on a single device whether a text message is authentic. As anecdotal evidence, in 2015 a scientist at HBKU received a seemingly fake SMS notification about winning a lucky draw for two business-class flights to Istanbul, and that message turned out to be genuine: the scientist and his wife happily claimed the award from QNB, a major bank in Qatar, and spent a nice weekend in Istanbul. In addition, senders of fraudulent messages often adjust their behavior in an effort to defeat AI-based fraud detectors, especially lightweight ones running on a mobile device. Deceptive text messages can be recognized more accurately if we can detect that they are sent to a large number of users simultaneously. To do so, we need to collect and combine information from multiple users; this, however, will probably raise privacy concerns, as users do not want to reveal the contents of their text messages. Note that although the mobile operator (Ooredoo in this project) possesses the content of certain types of text messages such as SMSes, it has no access to the contents of other messaging services such as WhatsApp, which encrypts messages during transmission. Further, SMSes are highly sensitive data and are rarely accessed even inside Ooredoo, in order to protect customers’ privacy.
This project proposes a novel solution based on local differential privacy technology. In particular, this solution does not collect sensitive personal information such as users’ exact text contents, and yet it is capable of analyzing the authenticity of a message through users’ collaborative efforts. For instance, each user may send to a server randomized versions of n-grams from his or her text messages. An n-gram is a sequence of n characters extracted from a piece of text: for instance, the 3-grams for the message “Dubai” are “Dub”, “uba” and “bai”. If common n-grams are reported by multiple users, it is likely that a text message containing many such n-grams was broadcast to these users. This information, in combination with analysis based on modern natural language processing techniques such as deep learning, can be used to predict more accurately whether a text message is fraudulent or authentic.

Simultaneously, we plan to analyze Ooredoo’s data records and interview customers who filed complaints about fraudulent messages, which will help us identify vulnerable customers who could become victims of deceptive messages. For instance, customers of a certain segment (such as seniors) may be more susceptible to text message fraud. By gathering customers’ smartphone usage data (such as on-screen behavior, phone and text records) and cross-referencing it with complaints related to deceptive messages, we can profile each user and assess her or his vulnerability to deceptive text messages. Such profiling and assessment can help Ooredoo run awareness campaigns targeted at specific groups of vulnerable customers. In addition, the app can incorporate relevant parts of Qatar’s recently announced data privacy law and educate users about their rights. The proposed differential privacy technology can be applied to eliminate privacy concerns in the user profiling and vulnerability assessment activities.
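The n-gram reporting idea can be illustrated with a short Python sketch. It is only a minimal illustration and not the project's actual protocol: the function names (`char_ngrams`, `randomized_response`, `report_for_ngram`) are hypothetical, and binary randomized response is assumed here as one simple mechanism that satisfies local differential privacy for a single reported bit. The privacy parameter `epsilon` is likewise an assumed input.

```python
import math
import random


def char_ngrams(text, n=3):
    """Return all contiguous character n-grams of `text`, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def randomized_response(bit, epsilon, rng):
    """Keep `bit` with probability e^eps / (1 + e^eps), otherwise flip it.

    For a single bit this satisfies epsilon-local differential privacy
    (assumed mechanism; the project may use a different one).
    """
    p_keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_keep else 1 - bit


def report_for_ngram(message, target, epsilon, rng, n=3):
    """Client side: noisy answer to "does my message contain `target`?"."""
    true_bit = 1 if target in char_ngrams(message, n) else 0
    return randomized_response(true_bit, epsilon, rng)


if __name__ == "__main__":
    print(char_ngrams("Dubai"))  # ['Dub', 'uba', 'bai']
    rng = random.Random()
    # The server only ever sees this noisy bit, never the message text.
    print(report_for_ngram("You have won a free flight to Dubai!!", "Dub", 1.0, rng))
```

A smaller `epsilon` flips the bit more often and so gives stronger privacy at the cost of noisier reports. Reporting on many n-grams per user would consume more privacy budget, which this sketch does not account for.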
The proposed project thus aims to build a client-side mobile app and server-side data analytic tools with a high technological readiness level, and to evaluate the system in Ooredoo’s operational environment. Once the proposed system is validated, it can be provided as a free service to Ooredoo’s customers, which will improve Ooredoo’s customer experience, reduce text-message fraud cases, and contribute novel differential privacy technologies to the telco industry as well as to the broader data privacy research community.
Research Area Keywords
Privacy Protection; Mobile Data Analytics; Mobile HCI; Distributed Machine Learning; Fraud Detection
| Institution | Country | Institution Role |
|---|---|---|
| Hamad Bin Khalifa University | Qatar | Submitting Institution |
| ADSC - Advanced Digital Sciences Center | Singapore | Collaborative Institution |
| Ooredoo | Qatar | Collaborative Institution |
| National University of Singapore | Singapore | Collaborative Institution |
Review and Notes on Prof. Yin Yang’s Research
- Build a smartphone app that can automatically detect deceptive text messages.
- Deceptive text messages can be recognized more accurately if we can detect that they are sent to a large number of users simultaneously. BUT HOW? Would the app have to stay running in the background indefinitely?
- Local differential privacy technology: the key innovation of this project.
- Modern natural language processing to predict whether a text message is fraudulent or authentic.
- Analyze Ooredoo’s data records.
- The DIFFERENTIAL privacy technology can be applied to eliminate privacy concerns in the user profiling.
- FINAL OUTCOME: CLIENT SIDE MOBILE APP and **SERVER SIDE DATA ANALYTIC TOOL**.
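- A personal comment (translated from Chinese): The core algorithm is differential privacy. In addition, analyzing the mobile operator's customer data and the data collected through the differential privacy algorithm may involve AI algorithms such as neural networks and deep neural networks. The final engineering deliverables are a server-side program and a mobile app. Both fit my major and background well, so they deserve focused preparation.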
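On the "BUT HOW?" question above, one possible answer is that the server never needs exact reports: if each phone sends a randomly flipped bit about whether it saw a given n-gram, the server can still correct for the noise in aggregate. The Python sketch below shows that correction under stated assumptions. The names `perturb` and `estimate_true_count` are hypothetical, binary randomized response is assumed as the client-side mechanism, and the user counts in the demo are made up for illustration only.

```python
import math
import random


def keep_probability(epsilon):
    """Probability that a client reports its true bit unchanged."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def perturb(bit, epsilon, rng):
    """Client side: binary randomized response (assumed mechanism)."""
    return bit if rng.random() < keep_probability(epsilon) else 1 - bit


def estimate_true_count(noisy_bits, epsilon):
    """Server side: unbiased estimate of how many clients truly hold a 1.

    With t true ones among n users, E[sum of reports] = t*p + (n - t)*(1 - p),
    so t is recovered as (observed - n*(1 - p)) / (2p - 1).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    n = len(noisy_bits)
    p = keep_probability(epsilon)
    observed = sum(noisy_bits)
    return (observed - n * (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random()
    # Made-up scenario: 3,000 of 10,000 users received a message with the n-gram.
    true_bits = [1] * 3000 + [0] * 7000
    reports = [perturb(b, 1.0, rng) for b in true_bits]
    print(round(estimate_true_count(reports, 1.0)))  # close to 3000, varies per run
```

The estimate is noisy for any single n-gram, but an unusually high estimated count across many users is the kind of signal that could indicate a broadcast message.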
Key Abilities To Master
- Android programming with cloud computing. (Very Important for HBKU Application.)
- Applied Statistics and Data Science basics. Hands-on experience will be preferred.