Detecting Fraudulent Messages with Strong Privacy Protection

This project aims to build a smartphone app that can automatically
detect deceptive text messages (such as “You have won a free flight
to Dubai!! Claim it on etihad.com.xyz”) as well as raise the
awareness of vulnerable customers who are at increased risk of
falling into such traps. A flood of such messages may lead to a sudden
burst of network traffic, which burdens the network infrastructure of
the mobile operator, such as Ooredoo, our main industrial partner and
co-funding provider. Further, vulnerable customers can become victims
of such an attack, which often ends up as a fraud case.

Header Information

Proposal Number: NPRP10-0208-170408
Program Cycle: NPRP 10
Submitting Institution Name: Hamad Bin Khalifa University
Project Status: Award Active
Start Date: 07 Jul 2018
Lead Investigator: Dr. Yin Yang
Project Duration: 3 Year(s)
End Date: 07 Jul 2021
Submission Type: New
Proposal Title: Detecting Fraudulent Messages with Strong Privacy Protection

Proposal Description

This project aims to build a smartphone app that can automatically
detect deceptive text messages (such as “You have won a free flight
to Dubai!! Claim it on etihad.com.xyz”) as well as raise the
awareness of vulnerable customers who are at increased risk of
falling into such traps. A flood of such messages may lead to a sudden
burst of network traffic, which burdens the network infrastructure of
the mobile operator, such as Ooredoo, our main industrial partner and
co-funding provider. Further, vulnerable customers can become victims
of such an attack, which often ends up as a fraud case. However, it
is not easy to determine whether a text message is authentic on
a single device. As anecdotal evidence, in 2015 a scientist at HBKU
received a seemingly fake SMS notification about winning a lucky draw
for two business-class flights to Istanbul, and that message turned
out to be genuine: the scientist and his wife happily claimed the
award from QNB, a major bank in Qatar, and spent a nice weekend in
Istanbul. In addition, senders of fraudulent messages often adjust
their behavior in an effort to defeat AI-based fraud detectors,
especially lightweight ones running on a mobile device. A deceptive text
message can be recognized more accurately if we can detect that it was
sent to a large number of users simultaneously. To do so, we need to
collect and combine information from multiple users; this, however,
will probably raise privacy concerns as users do not want to reveal
the contents of their text messages. Note that although the mobile
operator (Ooredoo in this project) possesses the content of certain
types of text messages such as SMS, it has no access to content in
other messaging services such as WhatsApp, which encrypts messages
during transmission. Further, SMS messages are highly sensitive data and are
rarely accessed even inside Ooredoo, in order to protect customers’
privacy. This project proposes a novel solution based on local
differential privacy technology. In particular, this solution does not
collect sensitive personal information such as users’ exact text
contents, and yet it is capable of analyzing the authenticity of a
message through users’ collaborative efforts. For instance, each user
may send to a server randomized versions of n-grams from her or his text
messages. An n-gram is a sequence of n characters extracted from a
piece of text: for instance, 3-grams for the message “Dubai” include
“Dub”, “uba” and “bai”. If common n-grams are reported from
multiple users, it is likely that a text message containing many such
n-grams was broadcast to these users. This information, in combination
with analysis based on modern natural language processing techniques
such as deep learning, can be used to more accurately predict whether
a text message is fraudulent or authentic. Simultaneously, we plan to
analyze Ooredoo’s data records and interview customers who filed
complaints about fraudulent messages, which will help us identify
vulnerable customers who could become victims of deceptive messages. For
instance, customers of a certain segment (such as seniors) may be more
susceptible to text message fraud. By gathering customers’ smartphone
usage data (such as on-screen behavior, phone and text records) and
cross-referencing with complaints related to deceptive messages, we
can profile each user and assess her or his vulnerability to deceptive
text messages. Such profiling and assessment can help Ooredoo run
targeted awareness campaigns to specific groups of vulnerable
customers. In addition, the app can incorporate relevant parts of
Qatar’s recently announced data privacy law, and educate users about
their rights. The proposed differential privacy technology can be
applied to eliminate privacy concerns in the user profiling and
vulnerability assessment activities. The proposed project thus aims to
build a client-side mobile app and server-side data analytic tools
with a high technological readiness level, and evaluate the system in
Ooredoo’s operational environment. Once the proposed system is
validated, it can be provided as a free service to Ooredoo’s
customers, which will improve Ooredoo’s customer experience, reduce
text-message fraud cases, and contribute novel differential privacy
technologies to the telecom industry as well as to the broader data privacy
research community.
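
The n-gram reporting idea described above can be illustrated with a small
sketch. The proposal does not say which randomization mechanism is used, so
everything below is an assumption made for illustration: each user samples one
character n-gram from a message, hashes it into one of num_buckets buckets,
and perturbs the bucket index with k-ary randomized response, a standard local
differential privacy mechanism. The function names, the bucket count, and the
epsilon parameter (which must be positive) are hypothetical and do not come
from the project.

```python
import hashlib
import math
import random
from collections import Counter


def char_ngrams(text, n=3):
    """Return the character n-grams of text in order ('Dubai' -> Dub, uba, bai)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def bucket_of(ngram, num_buckets):
    """Map an n-gram to a bucket index with a hash that is stable across devices."""
    digest = hashlib.sha256(ngram.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def randomized_report(text, epsilon, num_buckets=1024, n=3, rng=random):
    """Client side (assumed design): report one bucket index, perturbed by
    k-ary randomized response so the server never sees the true n-gram."""
    grams = char_ngrams(text, n)
    if not grams:
        # Message shorter than n characters: send pure noise.
        return rng.randrange(num_buckets)
    true_bucket = bucket_of(rng.choice(grams), num_buckets)
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + num_buckets - 1)
    if rng.random() < p_truth:
        return true_bucket
    # Otherwise report one of the other buckets, chosen uniformly at random.
    other = rng.randrange(num_buckets - 1)
    return other if other < true_bucket else other + 1


def estimate_frequencies(reports, epsilon, num_buckets=1024):
    """Server side (assumed design): unbiased estimate, per bucket, of the
    fraction of users whose sampled n-gram fell into that bucket."""
    total = len(reports)
    e = math.exp(epsilon)
    p = e / (e + num_buckets - 1)    # probability of reporting the true bucket
    q = 1.0 / (e + num_buckets - 1)  # probability of reporting any one other bucket
    counts = Counter(reports)
    return [(counts.get(b, 0) / total - q) / (p - q) for b in range(num_buckets)]
```

In this sketch, a server would collect one randomized_report value per user
and pass the list to estimate_frequencies. Buckets whose estimated frequency
is unusually high suggest that many users received messages sharing the same
n-grams, which is the broadcast signal the proposal describes. A smaller
epsilon gives stronger privacy but noisier estimates.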

Research Area Keywords

Privacy Protection; Mobile Data Analytics; Mobile HCI; Distributed Machine Learning; Fraud Detection

Institution

Institution                              Country    Institution Role
Hamad Bin Khalifa University             Qatar      Submitting Institution
ADSC - Advanced Digital Sciences Center  Singapore  Collaborative Institution
Ooredoo                                  Qatar      Collaborative Institution
National University of Singapore         Singapore  Collaborative Institution

Review and Notes on Prof. Yin Yang’s Research

  • Build a smartphone app that can automatically detect deceptive
    text messages.
  • A deceptive text message can be recognized more accurately if we
    can detect that it was sent to a large number of users
    simultaneously. BUT HOW? Does the app have to stay running in the
    background indefinitely?
  • Local differential privacy technology: the key innovation of
    this project.
  • Modern natural language processing to predict whether a text
    message is fraudulent or authentic.
  • Analyze Ooredoo’s data records.
  • The DIFFERENTIAL privacy technology can be applied to
    eliminate privacy concerns in the user profiling.
  • FINAL OUTCOME: CLIENT-SIDE MOBILE APP and SERVER-SIDE DATA
    ANALYTIC TOOL.
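  • Personal comment: the core algorithm is differential privacy. In
    addition, analyzing the mobile operator’s customer data and the
    data collected through differential privacy may call for AI
    algorithms such as neural networks and deep neural networks. The
    final engineering deliverables are a server-side program and a
    mobile app. Both fit my own specialty and background well, so
    they deserve focused preparation.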

Key Abilities To Master

  • Android programming with cloud computing. (Very important for the
    HBKU application.)
  • Applied statistics and data science basics. Hands-on experience
    is preferred.