Recently Coursera, the massive open online course site, announced it will be using keystroke dynamics (KD) to validate the identities of users who have successfully completed a course and want to obtain a certificate of completion. That single announcement has, in my opinion, generated more interest in keystroke dynamics than I have seen over the last 10 years. Having researched keystroke dynamics back in 2004 for my Master's thesis, and being a proponent of this technology, I wrote up this post as a primer on KD technology.
What is it?
KD is a behavioral biometric technology; the underlying hypothesis is that everyone acquires a unique typing rhythm over time, and this typing rhythm remains relatively stable once it is internalized. Even though KD has not received as much news coverage as other biometric technologies, it has been researched since the early 1980s, with published reports and experiments publicly available. Handwriting recognition and signature recognition are other examples of behavioral biometric technologies.
How does it work?
KD can work in two modes:
1. Fixed word/phrase: A user can only be verified against the specific word/phrase used during enrollment. Coursera is testing systems that are based on fixed word verification.
2. Free text: User verification is not dependent on what word(s) were used during enrollment.
There are two basic measurements that can be extracted when a user is typing:
1. Keystroke press time: The period of time a key is held down (also called dwell time in the literature).
2. Keystroke latency time: The period of time between releasing one key and pressing the next key (also called flight time in the literature).
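As a rough illustration of these two measurements, here is a minimal Python sketch. It assumes each keystroke has already been captured as a press timestamp and a release timestamp in milliseconds; the `KeyEvent` type, the function names, and the sample timings are all made up for this example and do not come from any particular KD product.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KeyEvent:
    """One keystroke (hypothetical structure, for illustration only)."""
    key: str
    press_ms: int    # time the key went down
    release_ms: int  # time the key came back up


def dwell_times(events: List[KeyEvent]) -> List[int]:
    """Keystroke press time: how long each key was held down."""
    return [e.release_ms - e.press_ms for e in events]


def flight_times(events: List[KeyEvent]) -> List[int]:
    """Keystroke latency: release of one key to press of the next.

    The value can be negative when the next key is pressed before
    the previous one is released (overlapping keystrokes).
    """
    return [nxt.press_ms - cur.release_ms
            for cur, nxt in zip(events, events[1:])]


# Made-up timings for someone typing "cat".
sample = [
    KeyEvent("c", 0, 90),
    KeyEvent("a", 150, 230),
    KeyEvent("t", 310, 400),
]

print(dwell_times(sample))   # [90, 80, 90]
print(flight_times(sample))  # [60, 80]
```

A word of n characters yields n dwell times and n - 1 flight times, which is why longer phrases give a recognition algorithm more to work with.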
Using these two basic measurements you can extract a secondary level of detail, e.g. the time between pressing the first key and the third key, the time between pressing the first key and the fourth key, the speed of typing, etc. Several research papers ([1], [2]) have tested techniques that build additional details based on the two basic measurements.
Other features that can be collected include the average number of mistakes while typing, but that does not depend on the acquired behavior that traditional KD algorithms are based on.
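To make the idea of secondary features concrete, here is a small Python sketch under my own assumptions: it takes a plain list of key press timestamps in milliseconds and derives press-to-press intervals across a gap of several keys, plus an overall typing speed. The function names and sample numbers are hypothetical, and this is not the specific method used in the cited papers.

```python
from typing import List


def press_to_press(press_ms: List[int], gap: int) -> List[int]:
    """Time from pressing key i to pressing key i + gap.

    gap=2 covers "first key to third key", gap=3 covers
    "first key to fourth key", and so on.
    """
    return [press_ms[i + gap] - press_ms[i]
            for i in range(len(press_ms) - gap)]


def typing_speed(press_ms: List[int]) -> float:
    """Key-to-key transitions per second over the whole sample."""
    if len(press_ms) < 2:
        return 0.0
    elapsed_ms = press_ms[-1] - press_ms[0]
    if elapsed_ms <= 0:
        return 0.0
    return (len(press_ms) - 1) / (elapsed_ms / 1000)


# Made-up press timestamps for a four-character word.
presses = [0, 150, 310, 500]

print(press_to_press(presses, 2))  # [310, 350]
print(press_to_press(presses, 3))  # [500]
print(typing_speed(presses))       # 6.0
```

Each of these values is derived purely from the raw timestamps, so they add no new capture requirements beyond what is needed for the two basic measurements.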
Why KD?
Apart from the oft-stated advantages of biometrics in general, a truly unique advantage of KD is that it is entirely software based; there is no additional hardware required. This reduces the total cost of ownership for the solution. You can use physiological biometric technologies without having to remember a password or an identifier; that is not true for KD (at least if you are using fixed word mode, as Coursera plans to do). KD should not be viewed as a replacement for passwords, but as multifactor authentication that relies on secret knowledge as well as a behavioral trait. KD does not require a user to deviate from his/her traditional login process, which reduces the burden of learning a new technique for users. The non-intrusive nature of KD also increases its user acceptability.
Challenges
KD is not without its challenges, as is true with all biometric technologies that are deployed in a complex environment.
1. Interoperability. A user's typing rhythm is acquired over time and is also dependent on the type of keyboard he/she is used to typing on. Changing keyboards can disrupt this acquired typing rhythm and can lead to false rejects. With the increasing use of mobile devices more
2. Behavioral Aspect. Typing rhythm is an acquired behavioral trait, which means that it can be affected by a variety of physiological and psychological factors. Tiredness, anxiety or even a change in body posture can alter a person's typing rhythm.
3. Forgotten password. KD is not immune to the problem of forgotten passwords. If a user forgets his/her password, they still have to go through the help desk or another process to re-enroll.
4. Password length. Password length can have an impact on the efficacy of the recognition algorithm. Is a 7-character password weaker than a 12-character password for KD?
It will be interesting to see how effective KD is for Coursera's implementation, and this will probably be one of the largest uncontrolled pilots of this technology. One of the things sorely missing in the body of knowledge for KD is results from such implementations, which could be a significant contribution to advancing the state of the art.