TY - JOUR
T1 - Email Classification using Behavior and Time Features
AU - Shao, Yequin
AU - Shi, Quan
AU - Xiao, Yanghua
AU - Bessis, Nik
AU - Norrington, Peter
PY - 2017/5/31
Y1 - 2017/5/31
N2 - The various forms, flexible sending tricks and tremendous number of spam emails have brought great challenges to accurate email classification. In this paper, we present a behavior- and time-feature-based email classification method. Based on email logs, email social networks are built through the extraction of entities and relations from the email records using the MapReduce model. By combining behavior features from social networks and time features from email sending intervals, we adopt a Support Vector Machine-based classifier to identify spammers and non-spammers. Compared with the current email classification methods, the advantages of our method are: 1) in addition to the behavior-based features, our method integrates the time feature to facilitate email classification; 2) to efficiently handle the vast number of emails, we employ the MapReduce model to extract the behavior- and time-based features on the email social network. Experiments on real email data of three years show that the proposed method achieves better classification accuracy.
AB - The various forms, flexible sending tricks and tremendous number of spam emails have brought great challenges to accurate email classification. In this paper, we present a behavior- and time-feature-based email classification method. Based on email logs, email social networks are built through the extraction of entities and relations from the email records using the MapReduce model. By combining behavior features from social networks and time features from email sending intervals, we adopt a Support Vector Machine-based classifier to identify spammers and non-spammers. Compared with the current email classification methods, the advantages of our method are: 1) in addition to the behavior-based features, our method integrates the time feature to facilitate email classification; 2) to efficiently handle the vast number of emails, we employ the MapReduce model to extract the behavior- and time-based features on the email social network. Experiments on real email data of three years show that the proposed method achieves better classification accuracy.
KW - Classification
KW - Email spam
KW - Social network
KW - Support vector machine
UR - http://www.scopus.com/inward/record.url?scp=85020428909&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85020428909&partnerID=8YFLogxK
U2 - 10.6138/JIT.2017.18.3.20140403
DO - 10.6138/JIT.2017.18.3.20140403
M3 - Article (journal)
SN - 1607-9264
VL - 18
SP - 463
EP - 472
JO - Journal of Internet Technology
JF - Journal of Internet Technology
IS - 3
ER -