Posted on 2024-12-18, 02:50, authored by Kalu Kaluarachchi
Data privacy and device security are critical concerns for the general public because large amounts of sensitive and private data are stored on personal devices such as desktops, tablets and smartphones. The most common way to protect this data is to secure devices with a login authentication system based on passwords, tokens or biometrics. However, one-time authentication has two limitations. First, it is cumbersome for users to provide authentication factors on their personal devices whenever access is needed. Second, and more critically for security, an attacker can take control of a device once it has been unlocked through one-time authentication. Continuous authentication is the process of regularly monitoring a person's behaviour during device usage in order to verify their identity. User behaviour, such as typing, swiping or moving with the device, can be monitored continuously using the sensors available in modern personal computing devices. Such systems add a layer of security beyond login authentication by protecting the device throughout the user's entire session.
In this thesis, I use the typing patterns of individuals on their personal computing devices, also called keystroke dynamics, to build continuous authentication models for users on a single device and across multiple devices. This modality has been studied for the last four decades and is a viable basis for authentication on personal computing devices.
I propose a new set of features based on the distance between keys on the keyboard, a concept that has not been considered before in keystroke dynamics. The proposed Distance Enhanced Flight Time (DEFT) features, combined with traditional keystroke dynamic features, give a more comprehensive analysis of typing behaviour. The DEFT model is designed to be device-agnostic, allowing me to evaluate its effectiveness across three commonly used devices: desktop, smartphone and tablet. The DEFT model outperforms existing state-of-the-art methods on the SU-AIS-BB-MAS and Buffalo datasets. The model achieves accuracy rates exceeding 99% and equal error rates below 10% on all three devices.
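To make the idea of distance-aware flight time concrete, the following minimal Python sketch pairs the flight time of each consecutive key pair with the physical distance between the two keys. It is an illustration only, not the DEFT definition used in the thesis: the keyboard geometry (`ROWS`), the helper names (`key_distance`, `digraph_features`) and the `flight_per_unit` combination are all assumptions introduced here.

```python
from math import hypot

# Assumed QWERTY geometry: each row is shifted right by a stagger offset,
# measured in key widths. Real keyboards differ; this is illustrative only.
ROWS = [("qwertyuiop", 0.0), ("asdfghjkl", 0.25), ("zxcvbnm", 0.75)]
KEY_POS = {
    ch: (col + offset, row)
    for row, (keys, offset) in enumerate(ROWS)
    for col, ch in enumerate(keys)
}


def key_distance(a, b):
    """Euclidean distance between two key centres, in key widths."""
    (x1, y1), (x2, y2) = KEY_POS[a], KEY_POS[b]
    return hypot(x2 - x1, y2 - y1)


def digraph_features(events):
    """Build one feature dict per consecutive key pair.

    events: list of (key, press_ms, release_ms) tuples in typing order.
    """
    features = []
    for (k1, _p1, r1), (k2, p2, _r2) in zip(events, events[1:]):
        flight = p2 - r1  # release of the first key to press of the second
        dist = key_distance(k1, k2)
        features.append({
            "pair": k1 + k2,
            "flight_ms": flight,
            "distance": dist,
            # Hypothetical combination: flight time per key width travelled.
            # Undefined when the same key is pressed twice (distance 0).
            "flight_per_unit": flight / dist if dist > 0 else None,
        })
    return features
```

For example, `digraph_features([("q", 0, 80), ("w", 150, 230)])` yields a single entry for the pair `qw` with a flight time of 70 ms over a distance of one key width.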
Since cross-device keystroke dynamics are under-explored in the literature, I introduce the TEDxBC model, which utilises an inductive transfer encoder to authenticate tablet users based on their smartphone keystroke patterns. I leverage over 20 keystroke dynamic features, including the newly introduced DEFT features, to train a transfer encoder with substantial source domain (smartphone) data and minimal target domain (tablet) data. I evaluate the proposed approach on participants from the publicly available SU-AIS-BB-MAS dataset. This method achieves an average Equal Error Rate (EER) of 14%. I further apply the biometric menagerie classification, adapted for the transfer learning paradigm, to analyse user performance. Notably, the Doves category demonstrates superior performance in transfer learning, achieving an EER approximately 9% lower than the overall TEDxBC benchmark. This is the first application of the biometric menagerie approach to study the performance of transfer learning for keystroke dynamics.
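The Equal Error Rate reported above is the operating point at which the false acceptance rate (impostors accepted) equals the false rejection rate (genuine users rejected). The sketch below shows one simple way to estimate it from two lists of match scores. It assumes that higher scores mean "more likely genuine" and that thresholds are swept over the observed scores; the function names are hypothetical and this is not the evaluation code used in the thesis.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) when scores >= threshold are accepted."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping every observed score as a threshold.

    Returns (eer, threshold), where eer is the mean of FAR and FRR at the
    threshold that minimises |FAR - FRR|.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]
```

With only a handful of scores the two error curves rarely cross exactly, which is why the sketch averages FAR and FRR at the closest point; with larger score sets, interpolating between neighbouring thresholds is a common refinement.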
I also examine how well keystroke dynamic features can predict demographic traits of users, namely age, gender, height, ethnicity, typing style and university major. To that end, I study the predictive capability of all four keystroke dynamic feature categories for different types of text (fixed text, free text and their combination) and types of devices (desktops, smartphones and tablets). I select the keystroke features using two techniques, Mutual Information and Random Forest. Mutual Information feature selection performs much better than Random Forest selection across all text and device combinations for demographic classification. To address the class imbalance in demographic data, I follow an approach that preserves the classifier's integrity by preventing any exposure to the test set before classification. Seven machine learning classifiers are employed for demographic prediction. The models achieve notable accuracy and F1 scores: 77% for gender prediction on desktops, 83% on smartphones and 71% on tablets. University major is predicted with approximately 80% accuracy across all devices. Typing style prediction reaches 68% accuracy on desktops and 77% on smartphones and tablets. Ethnicity prediction surpasses 90% accuracy for all devices. When identifying the top five discriminating features for each demographic trait, the newly introduced DEFT features consistently emerge as significant predictors.
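As a rough illustration of Mutual Information feature selection, the sketch below scores each feature by how much information it shares with a class label and keeps the highest-scoring ones. It assumes features have already been discretised (for example, by binning timing values), which the abstract does not specify; the estimator actually used in the thesis may differ, and the names `mutual_information` and `rank_features` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def rank_features(columns, labels, top_k):
    """Return the names of the top_k features by mutual information.

    columns: dict mapping feature name -> list of discretised values.
    labels:  list of class labels, aligned with each column.
    """
    scores = {
        name: mutual_information(values, labels)
        for name, values in columns.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

A feature that perfectly separates two balanced classes scores 1 bit, while a feature that is independent of the label scores 0, so ranking by this score favours the features that carry the most class information.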
In summary, this work is the first to enhance keystroke dynamic performance on a single device using distance-enhanced keystroke features (DEFT), to develop a cross-device keystroke authentication framework (TEDxBC) with a biometric menagerie performance analysis, and to provide a comprehensive analysis of the predictive power of keystroke features for six different demographic traits.