TY - JOUR
T1 - A behaviour biometrics dataset for user identification and authentication
AU - Nnamoko, Nonso
AU - Barrowclough, Joseph
AU - Liptrott, Mark
AU - Korkontzelos, Ioannis
N1 - Funding Information:
This research has been carried out as part of the CyberSignature Project, which received two funding rounds from Innovate UK under the Cyber Security Academic Startup Accelerator Programme (CyberASAP) with Project Reference Nos. 10017354 and 10002115. Special thanks to KTN who facilitated the project delivery and to the Computer Science department at Edge Hill University, for providing time and resources to complete research. The authors would also like to acknowledge the 88 participants for their willingness to provide the KMT dynamics dataset.
Publisher Copyright:
© 2022 The Authors
PY - 2022/12/1
Y1 - 2022/12/1
AB - As e-Commerce continues to shift our shopping preference from the physical to the online marketplace, we leave behind digital traces of our personally identifiable details. For example, the merchant keeps a record of your name and address; the payment processor stores your transaction details, including account or card information; and every website you visit stores other information such as your device address and type. Cybercriminals constantly steal and use some of this information to commit identity fraud, with devastating consequences for the victims, but also for the card issuers and payment processors, with whom the financial liability most often lies. To this end, we recognise that data is generally compromised in this digital age, and personal data such as card number, password, personal identification number and account details can be easily stolen and used by someone else. However, there is a plethora of data relating to a person's behaviour biometrics that is almost impossible to steal, such as the way they type on a keyboard, move the cursor, or whether they normally do so via a mouse, touchpad or trackball. This data, commonly called keystroke, mouse and touchscreen dynamics, can be used to create a unique profile for the legitimate card owner, which can serve as an additional layer of user authentication during online card payments. Machine learning is a powerful technique for analysing such data to gain knowledge, and has been used successfully in many sectors for profiling, e.g., genome classification in molecular biology and genetics, where predictions are made for one or more forms of biochemical activity along the genome. Similar techniques are applicable in the financial sector to detect anomalies in user keyboard and mouse behaviour when entering card details online, such that they can be used to distinguish between a legitimate and an illegitimate card owner.
In this article, a behaviour biometrics (i.e., keystroke and mouse dynamics) dataset, collected from 88 individuals, is presented. The dataset holds a total of 1760 instances categorised into two classes (i.e., legitimate and illegitimate card owners’ behaviour). The data was collected to facilitate an academic start-up project (called CyberSignature) which received funding from Innovate UK, under the Cyber Security Academic Startup Accelerator Programme. The dataset could be helpful to researchers who apply machine learning to develop applications using keystroke and mouse dynamics, e.g., in cybersecurity to prevent identity theft. The dataset, entitled ‘Behaviour Biometrics Dataset’, is freely available on the Mendeley Data repository.
KW - Behaviour biometrics
KW - Cybersecurity
KW - Digital identity
KW - Identity fraud detection
KW - Keyboard and mouse behaviour
KW - KMT dataset
KW - Machine learning
KW - Payment authentication
UR - http://www.scopus.com/inward/record.url?scp=85141795567&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85141795567&partnerID=8YFLogxK
U2 - 10.1016/j.dib.2022.108728
DO - 10.1016/j.dib.2022.108728
M3 - Article (journal)
AN - SCOPUS:85141795567
SN - 2352-3409
VL - 45
JO - Data in Brief
JF - Data in Brief
M1 - 108728
ER -