Pages

Friday, January 11, 2013

You Are How You Type

Recently Coursera, the massive open online course site, announced it will be using keystroke dynamics (KD) to validate the identities of users who have successfully completed a course and want to obtain a certificate of completion. That single announcement has, in my opinion, generated more interest in keystroke dynamics than I have seen over the last 10 years. Having researched keystroke dynamics back in 2004 for my Master's thesis, and being a proponent of this technology, I wrote up this post as a primer on KD technology.

What is it? 
KD is a behavioral biometric technology; the underlying hypothesis is that everyone acquires a unique typing rhythm over time, and this typing rhythm remains relatively stable once it is internalized. Even though KD has not received as much news coverage as other biometric technologies, it has been researched since the early 1980s, with published reports and experiments publicly available. Handwriting recognition and signature recognition are other examples of behavioral biometric technologies.

How does it work?
KD can work in two modes:
1. Fixed word/phrase: A user can only be verified against the specific word/phrase used during enrollment. Coursera is testing systems that are based on fixed word verification.
2. Free text: User verification does not depend on which word(s) were used during enrollment.

There are two basic measurements that can be extracted when a user is typing:
1. Keystroke press time: The period of time a key is held down (also called dwell time in the literature).
2. Keystroke latency time: The period of time between releasing one key and pressing the next (also called flight time in the literature).

Using these two basic measurements you can derive secondary features: for example, the time between pressing the first key and the third key, the time between pressing the first key and the fourth key, the overall speed of typing, etc. Several research papers ([1], [2]) have tested techniques that build additional features on top of the two basic measurements.
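The two basic measurements can be sketched in code. Below is a minimal example that assumes a hypothetical event stream of (key, action, timestamp-in-ms) tuples; real keystroke loggers differ in event format.

```python
def extract_features(events):
    """Compute dwell times (press -> release of the same key) and
    flight times (release of one key -> press of the next)."""
    dwell, flight = [], []
    pending = {}        # key -> press timestamp
    last_release = None
    for key, action, t in events:
        if action == "down":
            pending[key] = t
            if last_release is not None:
                flight.append(t - last_release)  # latency/flight time
        else:  # "up"
            dwell.append(t - pending.pop(key))   # press/dwell time
            last_release = t
    return dwell, flight

# Typing "ab": a down at 0 ms, a up at 95, b down at 180, b up at 260
events = [("a", "down", 0), ("a", "up", 95),
          ("b", "down", 180), ("b", "up", 260)]
dwell, flight = extract_features(events)
# dwell = [95, 80], flight = [85]
```

Secondary features such as first-to-third-key latencies can then be built by combining entries of these two lists.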

Other features that can be collected include the average number of mistakes made while typing, but that does not depend on the acquired behavior that traditional KD algorithms are based on.
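As a rough illustration of how such timing features feed a fixed-phrase verifier: enrollment averages several repetitions of the phrase into a template, and verification scores a new attempt by its average deviation from that template. The scoring rule and threshold below are purely illustrative, not any vendor's actual algorithm.

```python
def enroll(samples):
    """Build a template of per-feature mean and mean absolute
    deviation from several repetitions of the same phrase."""
    n = len(samples)
    cols = list(zip(*samples))
    means = [sum(c) / n for c in cols]
    mads = [sum(abs(x - m) for x in c) / n for c, m in zip(cols, means)]
    return means, mads

def verify(template, attempt, threshold=1.5):
    """Scaled Manhattan distance: average deviation measured in units
    of each feature's spread. Lower scores mean a closer match."""
    means, mads = template
    score = sum(abs(a - m) / max(s, 1.0)
                for a, m, s in zip(attempt, means, mads)) / len(means)
    return score <= threshold, score

# Three enrollment repetitions of one phrase, timings in ms (made up)
samples = [[95, 85, 110], [100, 80, 120], [105, 90, 115]]
template = enroll(samples)
ok, _ = verify(template, [98, 84, 112])   # similar rhythm -> accepted
bad, _ = verify(template, [40, 200, 30])  # different rhythm -> rejected
```

Published systems use more sophisticated classifiers, but the enrollment/verification split shown here is the common structure.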

Why KD?
Apart from the oft-stated advantages of biometrics in general, a truly unique advantage of KD is that it is entirely software based; no additional hardware is required. This reduces the total cost of ownership of the solution. You can use physiological biometric technologies without having to remember a password or an identifier; that is not true for KD (at least in fixed word mode, as Coursera plans to use it). KD should not be viewed as a replacement for passwords, but as multifactor authentication that relies on secret knowledge as well as a behavioral trait. KD does not require a user to deviate from his/her traditional login process, which reduces the burden of learning a new technique. The non-intrusive nature of KD also increases its user acceptability.

Challenges
KD is not without its challenges, as is true with all biometric technologies that are deployed in a complex environment.
1. Interoperability. A user's typing rhythm is acquired over time and is also dependent on the type of keyboard he/she is used to typing on. Changing keyboards can disrupt this acquired typing rhythm and lead to false rejects. With the increasing use of mobile devices, users type on more keyboard form factors than ever, making this challenge more pronounced.
2. Behavioral Aspect. Typing rhythm is an acquired behavioral trait, which means that it can be affected by a variety of physiological and psychological factors. Tiredness, anxiety or even a change in body posture can alter a person's typing rhythm.
3. Forgotten password. KD is not immune to the problem of forgotten passwords. If a user forgets his/her password, they still have to go through the help desk or another re-enrollment process.
4. Password length. Password length can have an impact on the efficacy of the recognition algorithm. Is a 7-character password weaker than a 12-character password for KD?

It will be interesting to see how effective KD is in Coursera's implementation; this will probably be one of the largest uncontrolled pilots of this technology. One of the things sorely missing from the body of knowledge for KD is results from such implementations, which could be a significant contribution to advancing the state of the art.

Friday, September 28, 2012

Common Vulnerabilities and Exposures (CVE) for Biometrics

Biometric technologies are attracting increasing attention as a means of enhancing security (implying authentication & authorization) for mobile devices and applications. The obvious question that comes up is how secure these biometric technologies are and what the scope of vulnerabilities in such technologies is. Several researchers have published articles and frameworks on this topic for over a decade, but I could not find a standardized framework for representing vulnerabilities across fingerprint, face, iris and other recognition technologies, or for evaluating their severity.

US-CERT uses the Common Vulnerabilities and Exposures (CVE) listing to standardize descriptions and evaluation of cyber security vulnerabilities, and I searched their vulnerabilities database for biometrics vulnerabilities. The search did not return a single hit, which indicates the need for a CVE-style listing for biometric products and technologies. A biometrics CVE would not only serve as a public resource of known vulnerabilities, but also allow vulnerability assessments to use a common identifier and allow end users to identify patching information.

The Common Vulnerability Scoring System (CVSS) is another important component, necessary to communicate the severity of a vulnerability in a standardized manner. An industry-accepted framework for generating CVSS scores would be extremely valuable for organizations as they consider the management and upkeep aspects of large scale biometric systems. A well understood and widely accepted framework for enumerating vulnerabilities and prioritizing them based on severity is critical to widespread adoption of biometric technologies. Over the next few postings I will write about how the CVE and CVSS frameworks can be applied to biometric technologies. Comments, suggestions and any inputs on these topics are welcome!
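To make the idea concrete, here is a sketch of what a CVE-style entry for a biometric vulnerability might look like. The ID scheme, field names, product, and example values are entirely hypothetical; the CVSS vector simply reuses standard CVSS v2 base-metric notation.

```python
from dataclasses import dataclass, field

@dataclass
class BiometricVulnEntry:
    """A CVE-style record for a biometric vulnerability.
    Illustrative only; no such registry currently exists."""
    entry_id: str          # e.g. "BVE-2012-0001" (hypothetical scheme)
    product: str
    modality: str          # fingerprint, face, iris, keystroke, ...
    subsystem: str         # sensor, feature extraction, matcher, storage
    description: str
    cvss_vector: str = ""  # standard CVSS base vector, if scored
    references: list = field(default_factory=list)

entry = BiometricVulnEntry(
    entry_id="BVE-2012-0001",
    product="ExampleCorp IrisMatch 2.0",   # fictional product
    modality="iris",
    subsystem="matcher",
    description="Matcher exposes a fine-grained similarity score, "
                "enabling hill-climbing reconstruction of templates.",
    cvss_vector="AV:N/AC:M/Au:S/C:C/I:N/A:N",  # CVSS v2 notation
)
```

The value of such a record lies in the shared identifier and the modality/subsystem fields, which let assessments and patch advisories refer to the same flaw unambiguously.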

Monday, August 27, 2012

Information Flow Mapping & Detection

In my previous blog posting I discussed the applicability of the information flow model (IFM) for assessing security and privacy policies of biometric systems. Implementing this model requires the ability to monitor the exchange of content between hosts, which can be on either internal or external networks. Once implemented, the IFM can be used for: 
  • testing compliance of information exchange policies for biometric information
  • monitoring information flow path of biometric information
  • detecting unauthorized leakage of biometric information
Recently I had an opportunity to use Fidelis XPS, which is designed and used for malware threat detection and prevention. One of the core product capabilities allows a user to set up a rule for detecting string patterns in the information being exchanged between two hosts. If the rule detects the presence of the string pattern in any information flowing between two hosts, an alarm is generated and further preventive action can be taken. To test the functionality of monitoring biometric information flow between two hosts, I downloaded the INCITS 378 dataset from the NIST website, consisting of standardized finger minutiae templates. All templates conforming to this standard have the string “FMR” embedded in them, and a rule was set up to detect any files containing this string pattern.

The files were downloaded in gzip format over HTTP, and Fidelis XPS successfully detected 100% of the files. Although this was quite a simple experiment, it highlights existing technical capabilities for creating and implementing information flow models. Such products can also be deployed to prevent leakage of personally identifiable information to unauthorized recipients.
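The detection logic behind such a rule can be sketched in a few lines; the gzip handling mirrors how the files in this experiment were transferred. This is an illustration of the technique, not Fidelis XPS's implementation, and a substring match like this can of course produce false positives on ordinary text containing "FMR".

```python
import gzip

MAGIC = b"FMR"  # format identifier embedded in INCITS 378 records

def contains_minutiae_template(payload: bytes) -> bool:
    """Flag a payload that appears to carry an INCITS 378 finger
    minutiae template, decompressing gzip content first (as an
    inspection device would for HTTP-transferred files)."""
    if payload[:2] == b"\x1f\x8b":  # gzip magic number
        try:
            payload = gzip.decompress(payload)
        except OSError:
            return False            # corrupt gzip stream
    return MAGIC in payload

# A fabricated record header followed by filler bytes
template = MAGIC + b"\x00\x20\x30" + b"\x00" * 16
assert contains_minutiae_template(template)
assert contains_minutiae_template(gzip.compress(template))
assert not contains_minutiae_template(b"ordinary web traffic")
```

A production rule would anchor the identifier at the start of the record and parse the header to reduce false positives.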

There are a few challenges that need to be addressed for a comprehensive IFM, including: a language for expressing exchange policies, getting buy-in from all entities in the ecosystem, and automating the enforcement of policies.

Comments and discussions are welcome!

Sunday, August 5, 2012

Sensor Level Security

Last week an attack on iris recognition systems was described at the Black Hat conference. The attack made several assumptions: access to the registered template, access to a highly granular similarity score from the matching subsystem, the absence of managerial controls against multiple attempts, and no liveness detection at the acquisition sensor. The attack described is a classic "hill climbing" attack: it goes through several iterations of generating a synthetic iris image, verifying it against the registered template, and utilizing the similarity score to generate a new synthetic iris image that will eventually be close enough to the registered template.

Although the underlying idea employed by this attack is not new, it has brought attention to the need to design security controls against such attacks. The first is to use liveness detection to ensure that a synthetic sample cannot be provided to the sensor; a hill climbing attack can only succeed if the synthetic sample is accepted by the sensor. Liveness detection techniques include measuring biological features to ascertain that the source is live, as well as simple challenge-response actions that only a live human can complete. Another technique to prevent hill climbing attacks is to provide only a coarse response for each verification attempt. For example, the UID system implemented in India, which utilizes biometric technologies, provides only a "yes/no" response for each verification attempt.

From a practitioner's standpoint, these types of attacks can be mitigated by using appropriate operational controls and providing only the amount of information necessary for the system to operate. Please feel free to leave your comments!

Wednesday, June 27, 2012

Security Analysis Using the Biometric Information Lifecycle


Irrespective of the type of information being stored in cloud systems, there is a growing demand to give users control over their information. What typically used to involve assurance of the confidentiality, integrity and availability of systems now has an added dimension: the flow of information among various systems, and the ability of these systems to use third-party information for a range of services. When you view biometric information, which will eventually be used for identity and access management services, through this construct, the need to give users control over their PII, or at least assurance of how the data is being used, becomes evident. The biometric template protection standard (ISO/IEC 24745) has proposed a framework to provide information privacy to end users by defining a five-stage information lifecycle as a basis for enforcement. The lifecycle comprises the following stages: data collection, data storage, data usage, data archiving and data disposal. Applying this lifecycle to a state transition model provides a basis for a variety of complex privacy and security discussions. This idea is discussed in depth in this paper, and given below is a simple example of how to use the state transition model for analyzing biometric information privacy and security.

Consider an OpenID solution based on biometrics where the relying party, or consuming service, is a financial institution and the identity provider is a separate service. Also assume there exists a policy that does not allow the identity service provider to store the data collected for verification purposes, but does allow it to store data collected during enrollment. The normal use case for such a solution requires the user to first enroll with the identity service provider, where the biometric information is collected and then stored for future use. When the user wants to access his financial institution, he is redirected to the identity service provider to verify his credentials. The identity service provider compares the biometric samples and provides a pass/fail answer to the financial institution, which then decides whether the user can access its services. The state transition model for this use case is shown in Figure 1.
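The policy in this example can be sketched as a transition check over the five lifecycle stages; the encoding below is illustrative, not part of ISO/IEC 24745 itself.

```python
# Allowed transitions between the five lifecycle stages (ISO/IEC 24745)
ALLOWED = {
    ("collection", "storage"),   # enrollment data may be stored
    ("collection", "usage"),     # verification data used directly
    ("storage", "usage"),
    ("storage", "archiving"),
    ("usage", "disposal"),       # verification data must be disposed of
    ("archiving", "disposal"),
}

# Policy from the OpenID example: the identity provider may store
# enrollment data but must not store data collected for verification.
FORBIDDEN_BY_PURPOSE = {
    "verification": {("collection", "storage")},
}

def check_transition(purpose, src, dst):
    """Return True if moving data from stage src to stage dst is
    permitted for the given purpose."""
    if (src, dst) in FORBIDDEN_BY_PURPOSE.get(purpose, set()):
        return False
    return (src, dst) in ALLOWED

assert check_transition("enrollment", "collection", "storage")
assert not check_transition("verification", "collection", "storage")
assert check_transition("verification", "collection", "usage")
```

A runtime monitor built on such rules could raise an alarm the moment verification data transitions into the storage stage.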


No doubt this is an extremely simple example, but more complex solutions can use the model to analyze the effectiveness of security controls and identify vulnerabilities by mapping the flow of biometric information through a state transition model. This model could be useful when the technical architecture of a solution is being designed, or as a real-time detection and prevention control based on the state transitions that are allowed for the biometric information. How to implement this framework in an automated manner that is efficient and practical is an open challenge, and it would be great to hear thoughts about a practical way of implementing it.
As always, thoughts and comments are much appreciated!

Thursday, June 7, 2012

Biometric Credentials in Federated Systems

There is an enormous amount of sensitive information exchanged and stored among internet services for a variety of purposes: ecommerce, internet banking, social networking, etc. It has become extremely easy to share this sensitive information indirectly with other services. In monolithic systems users would have to establish a new identity credential for each service, but digital identities can now be shared among services, sometimes without the consent or knowledge of the owner of the digital identity. Federated identity management frameworks serve to increase the trusted portability of identity information across multiple domains, a notion that has been embraced by initiatives such as the National Strategy for Trusted Identities in Cyberspace (NSTIC), the Kantara Initiative, and OpenID, among others.

Biometric systems use information like fingerprints and face images, which is considered to be personally identifiable information (PII), and this raises some very important questions in the context of federated identity management.  
  1. Where is the biometric data going to be stored?
  2. Who is the eventual owner of this information?
  3. With whom will this information be shared?
  4. How will this information be used?

Reducing the risk of unwitting exposure of PII in a federated framework requires alignment of security controls and privacy policies. Security analysis in the biometrics domain is a well-researched area, focused mainly on the sub-components of the biometric system (acquisition, feature extraction, feature storage, feature matching and decision making), the transmission of data between these sub-components, and the biometric processes (enrollment and recognition). This analysis is sufficient for self-contained monolithic systems, but not for biometric systems that are part of an identity ecosystem. Such analysis is excellent for creating security controls that protect “data at rest” and “data in use”, but it does not lend context to data. Questions such as whether a matching operation on biometric data is authorized, or whether processed biometric data can be stored in multiple locations, including mobile devices, cannot be answered efficiently and in a scalable manner using existing security analysis techniques. In light of such questions, biometric system security analysis is necessary, but no longer sufficient.

Information Lifecycle Management (ILM) provides a means of enforcing security controls on each phase of the information lifecycle. ILM has been used effectively to manage enterprise information assets in distributed systems. Security analysis of all stages of the information lifecycle can provide insight into the vulnerabilities at each stage and the appropriate control objectives that should be applied. The information flow model can be extended to map transitions from one stage to the next as well as transfers from one system to another. There are several benefits to creating a security analysis framework based on such a model:


  1. Discovery: Mapping out possible information flow routes as information transitions between stages of the information lifecycle and between different systems
  2. Compliance: Translating policy statements into enforceable security controls thereby ensuring compliance
  3. Monitoring: Identifying current state of various information assets
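The Discovery benefit, for instance, can be sketched as path enumeration over a flow graph of (system, stage) nodes. The graph below, with an identity provider ("idp") and a relying party ("rp"), is illustrative.

```python
# Possible flows of biometric information between (system, stage)
# nodes; terminal nodes have no outgoing edges.
FLOWS = {
    ("idp", "collection"): [("idp", "storage"), ("idp", "usage")],
    ("idp", "storage"):    [("idp", "usage"), ("idp", "archiving")],
    ("idp", "usage"):      [("rp", "usage"), ("idp", "disposal")],
    ("rp", "usage"):       [("rp", "disposal")],
    ("idp", "archiving"):  [("idp", "disposal")],
}

def discover_routes(start, route=None):
    """Depth-first enumeration of every flow path from a starting
    (system, stage) node to a terminal node."""
    route = (route or []) + [start]
    nexts = FLOWS.get(start, [])
    if not nexts:
        return [route]
    paths = []
    for n in nexts:
        paths.extend(discover_routes(n, route))
    return paths

routes = discover_routes(("idp", "collection"))
# Compliance check: every route must end in the disposal stage
assert all(r[-1][1] == "disposal" for r in routes)
```

Enumerating routes this way supports all three benefits above: it discovers where information can travel, checks each route against policy, and identifies which node an asset currently occupies.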


Privacy is another key element that has to be addressed in a federated framework. The goal should be to allow users to determine what information is revealed, to whom and for what purposes, and to prevent function creep on a user’s PII. Dr. Ann Cavoukian, Information and Privacy Commissioner of Ontario, Canada, has proposed Privacy by Design, which seeks to give users greater control over their personal information while allowing businesses to achieve their objectives.

To fully realize the potential of distributed identity services, especially ones that use PII to establish a digital identity, these services will need to earn their users’ complete confidence and trust. Over the next few weeks I will expand on the various themes brought up in this posting. Please feel free to leave a comment here or on Twitter.