In a recent blog entry, I publicized my Final Quadrilogy of articles on Java SSL. This article is a much shortened version of the rambling Local CA article in that series. 

Overview

CA-signed certificates are used for public server authentication. In this article, we consider a private network of devices connecting to our servers, i.e. via client-authenticated SSL. In this case, the devices have client certificates which we explicitly trust (and no others). Let's set up a private CA to sign certificates with our own CA key. We will sign our server certificates, and consider signing client certificates as well. 

What is a digital certificate?

Firstly we note that public key cryptography requires a private and a public key. These are mathematically linked, such that the public key is used to encrypt data which can be decrypted only using the private key. Moreover, the private key can create a digital signature, which is verified using the public key. According to Wikipedia:
A public key certificate (also known as a digital certificate or identity certificate) is an electronic document used to prove ownership of a public key. The certificate includes information about the key, information about its owner's identity, and the digital signature of an entity that has verified the certificate's contents are correct. If the signature is valid, and the person examining the certificate trusts the signer, then they know they can use that key to communicate with its owner.
So a certificate is a document which contains a public key, and its related information such as its "subject." This is the name assigned by its creator, who is the sole holder of the corresponding private key. This document is digitally signed by the "issuer." If we trust the issuer, then we can use the public key to communicate with the subject. Cryptographically speaking, we can use the public key to encrypt information, which can be decrypted only by the holder of the corresponding private key. Finally, X.509 is a standard for Public Key Infrastructure (PKI) that specifies formats for public key certificates, revocation lists, etc. 

Root CA certificate

By definition, a root certificate (e.g. a CA certificate) is self-signed, and so has the same "Issuer" and "Subject." For example, inspect a GoDaddy root certificate as follows: 
$ curl -s https://certs.godaddy.com/repository/gd-class2-root.crt | 
    openssl x509 -text | grep 'Issuer:\|Subject:'

        Issuer: ... OU=Go Daddy Root Certificate Authority
        Subject: ... OU=Go Daddy Root Certificate Authority
A self-signed certificate is a public key and its subject name, which is digitally signed using its corresponding private key. We can verify its signature using the public key, but have no other inherent assurances about its authenticity. We trust it explicitly via a "truststore." 

Keystore vs truststore

A "keystore" contains a private key, which has a public key certificate. Additionally the keystore must contain the certificate chain of that key certificate, through to its root certificate (which is self-signed by definition). A "truststore" contains peer or CA certificates which we trust. By definition we trust any peer certificate chain which includes any certificate which is in our truststore. That is to say, if our truststore contains a CA certificate, then we trust all certificates issued by that CA. Note that since the keystore must contain the certificate chain of the key certificate, whereas the truststore must not contain the certificate chain of included trusted certificates, they differ critically in this respect. 

Client certificate management

In order to review active credentials, we require a perfect record of all issued certificates. If a certificate is signed but not recorded, or its record is deleted, our server is forever vulnerable to that "rogue" certificate. We could record our signed certificates into a keystore file as follows: 
$ keytool -keystore server.issued.jks -importcert -alias client -file client.pem 

Certificate was added to keystore
where this is not a truststore per se, but just a "database" of issued certificates. Interestingly, we consider signing our client certificates to avoid having such a truststore containing all our clients' self-signed certificates, but nevertheless end up with one - which is telling. We could similarly record revoked client certificates. However, for private networks where the number of certificates is relatively small, it is simpler and more secure to trust clients explicitly, rather than implicitly trusting all client certificates signed by our CA and managing a revocation list. If the number of clients is large, then we probably need to automate enrollment, which is addressed in the companion article Client Authentication in this series, which proposes a dynamic SQL truststore for client certificates. Alternatively we might use a client certificate authentication server, e.g. see my experimental Node microservice github.com/evanx/certserver, which uses Redis to store certificates and their revocation list. 
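As an aside, we can review that record of issued certificates at any time by simply listing the keystore:
$ keytool -keystore server.issued.jks -list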

Self-signed client certificates

We prefer self-signed client certificates which are explicitly imported into our server truststore, where they can be reviewed. In this case, they are "revoked" by removing them from the truststore. However, self-signed client keys are effectively CA keys, and so rogue certificates can be created using compromised client keys, e.g. using keytool -gencert. So we implement a custom TrustManager for our server - see the Explicit Trust Manager article in this series. 

Private CA

Consider that we must detect when our server has been compromised, and then generate a new server key. If using a self-signed server certificate, we would then have to update every client's truststore. To avoid such a burden, our server certificate must be signed using a CA key which our clients trust. However, our clients must trust only our private server, and not, for example, any server with a GoDaddy certificate. So we generate a private CA key. This key controls access to our server. While our server naturally resides in a DMZ accessible from the Internet, its CA key should be isolated on a secure internal machine. In fact, it should be generated offline, where it can never be compromised (except by physical access). We transfer the "Certificate Signing Request" (CSR) to the offline CA computer, and return its signed certificate, e.g. using a USB stick.

In the event that our server is compromised, we generate a new server key, and sign it using our offline CA key. Our clients are unaffected, since they trust our CA, and thereby our new server key. However, our clients must no longer trust the old compromised server key, as it could be used to perpetrate a man-in-the-middle (MITM) attack. So we must support certificate revocation. For example, we could publish a certificate revocation list to our clients, or provide a revocation query service, e.g. an OCSP responder. Alternatively, we could publish the server certificate that our clients should explicitly trust. Before connecting, our clients read this certificate, verify that it is signed by our CA, and establish it as their explicit truststore for the purposes of connecting to our server. In general, it is better to be explicit rather than implicit, for the sake of clarity. Explicit trust enables a comprehensive review of active credentials.

Consider a scenario where the above "revocation" service and our server both suffer a simultaneous coordinated MITM attack. Generally speaking, our architecture should make such an attack expensive and detectable. At the very least, our revocation service should be divorced from our server infrastructure, to make the attack more challenging. One approach to avoid managing revocation ourselves is to use a public CA-signed certificate for our server, cross-signed by our private CA. In this case, the standard SSL trust manager validates the certificate, e.g. via OCSP to GoDaddy. Our clients then chain the standard trust manager to an explicit trust manager, to verify that the server certificate is cross-signed by our private CA. 
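The signing commands below assume a CA keystore ca.jks and its exported certificate ca.pem. As a rough sketch (the alias, subject and validity shown here are illustrative), we might generate that offline CA key pair and export its certificate as follows:
$ keytool -keystore ca.jks -genkeypair -alias ca -noprompt \
      -dname "CN=ca" -keyalg rsa -keysize 2048 -validity 3650 \
      -ext BasicConstraints:critical=ca:true,pathlen:0

$ keytool -keystore ca.jks -alias ca -exportcert -rfc -file ca.pem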

Server certificate signing

We create a keystore containing a private key and its self-signed certificate (for starters) using keytool -genkeypair. 
$ keytool -keystore server.jks -genkeypair -alias server -noprompt \
      -dname "CN=server.com" -keyalg rsa -keysize 2048 -validity 365
Naturally the common name of a server certificate is its domain name. The client, e.g. the browser, validates that the certificate's "Common Name" matches the host name used to look up the server's IP address. We export a "Certificate Signing Request" (CSR) using -certreq. 
$ keytool -keystore server.jks -alias server -certreq -rfc \
      -file server.csr
We can sign the CSR using -gencert. 
$ keytool -keystore ca.jks -alias ca -gencert -infile server.csr \
      -dname "CN=server.com" \
      -validity 365 -rfc -outfile server.signed.pem \
      -ext BasicConstraints:critical=ca:false,pathlen:0 \
      -ext KeyUsage:critical=keyEncipherment \
      -ext ExtendedKeyUsage:critical=serverAuth
where we set the X509v3 extensions to restrict the key usage for good measure, as we see for certificates we buy from a public CA. We import this signed certificate reply into our server keystore. But keytool will not allow a signed certificate to be imported unless its parent certificate chain is already present in the keystore. So we must import our CA cert first. 
$ keytool -keystore server.jks -alias ca -importcert -file ca.pem

$ keytool -keystore server.jks -alias server -importcert -file server.signed.pem

Certificate chain

We can list the certificate chain as follows: 
$ keytool -keystore server.jks -alias server -list -v
...
Certificate chain length: 2

Certificate[1]:
Owner: CN=server.com
Issuer: CN=ca

Certificate[2]:
Owner: CN=ca
Issuer: CN=ca
The first certificate of the chain is our key certificate, and the last certificate is the root CA certificate. By definition the "root" certificate of a chain is self-signed. 

openssl

We can use openssl to connect to our SSLServerSocket and inspect its key certificate chain as follows: 
$ openssl s_client -connect localhost:4444
...
Certificate chain
 0 s:/CN=server.com
   i:/CN=ca
 1 s:/CN=ca
   i:/CN=ca
This demonstrates why the keystore requires a certificate chain, i.e. to send to the peer for validation. The peer validates the chain, and checks it against our trusted certificates. It stops checking as soon as it encounters a certificate in the chain that it trusts. Therefore the chain for a trusted certificate need not be stored in the truststore, and actually must not be - otherwise we trust any certificate issued by that trusted certificate's root, irrespective of the trusted certificate itself. Consider that our clients must trust only our server, whose certificate happens to be issued by GoDaddy - we don't want those private clients to trust any server with a certificate issued by GoDaddy! 

Client keystore

We create the private keystore on each of our clients. 
$ keytool -keystore client.jks -genkeypair -keyalg rsa -keysize 2048 \
      -validity 365 -alias client -dname "CN=client"
We print our certificate as PEM text using `-exportcert -rfc` as follows: 
$ keytool -keystore client.jks -alias client -exportcert -rfc 
-----BEGIN CERTIFICATE-----
MIIDxTCCAq2gAwIBAgIBADANBgkqhkiG9w0BAQsFADCBgzELMAkGA1UEBhMCVVMx
...
We inspect the certificate using openssl. 
$ keytool -keystore client.jks -alias client -exportcert -rfc |
    openssl x509 -text
Enter keystore password:
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 345747950 (0x149bb1ee)
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=client
        Validity
            Not Before: Feb 14 11:27:19 2015 GMT
            Not After : Feb 14 11:27:19 2016 GMT
        Subject: CN=client
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
...
Finally, we import each client's self-signed certificate into our server truststore. 
$ keytool -keystore server.trust.jks -alias client -importcert -file client.pem
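Putting the pieces together, a minimal sketch of our client-authenticated server socket might look as follows, reusing the illustrative SSLContexts helper sketched earlier (the port matches the openssl example above; the passwords are assumptions).

import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLServerSocket;
import javax.net.ssl.SSLSocket;

public class PrivateServer {
    public static void main(String[] args) throws Exception {
        // server key and chain from server.jks, explicitly trusted client certs from server.trust.jks
        SSLContext context = SSLContexts.createContext(
                "server.jks", "keypass".toCharArray(),
                "server.trust.jks", "trustpass".toCharArray());
        SSLServerSocket serverSocket = (SSLServerSocket)
                context.getServerSocketFactory().createServerSocket(4444);
        serverSocket.setNeedClientAuth(true); // require a trusted client certificate
        while (true) {
            try (SSLSocket socket = (SSLSocket) serverSocket.accept()) {
                socket.startHandshake(); // fails unless the client presents a certificate we trust
                // ... handle the authenticated connection
            }
        }
    }
}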

Conclusion

Public CA certificates are typically used for public server authentication. However, we are primarily concerned with private client authentication for access to a private server, i.e. a virtual private network. Our clients should trust only our server, and not any server certificate issued by some public CA. We sign the server certificate using an offline CA key which our clients solely trust. When our server is compromised, we can change our server key without changing our clients' truststores. However, we must somehow invalidate the old server certificate. We might publish our current server certificate, which our clients load from multiple sources that we control before connecting. They check these copies for consistency, and thereby combat a MITM attack. Our clients then connect to our server, and verify that it presents the certificate we expect, and that it is signed by our offline CA key. We prefer self-signed client certificates, which are explicitly trusted. However, we note that self-signed certificates are effectively CA certificates, and so a compromised private key can be used to create rogue certificates. So we should implement a custom "explicit trust manager" to ensure that the peer's key certificate itself is explicitly included in the truststore, i.e. disregarding its chain of signing certificates. 

Further reading

See my experimental Node microservice github.com/evanx/certserver, which uses Redis to store certificates and a revocation list. See the companion article Explicit Trust Manager. This is part of my Final Quadrilogy on Java crypto. @evanxsummers

We consider client devices (e.g. Android) connecting to a Java server for secure networking. Say we have stringent authentication requirements, and so decide to use client-authenticated SSL sockets, using self-signed client certificates.

Explicit Trust Manager: Safer self-signed client certificates for private servers. Part of "The Final Quadrilogy."

Preamble: In a previous blog entry, I announced the Final Quadrilogy on Java SSL, covering keytool, KeyStore, and X509TrustManager. This article is a shortened version of the Explicit Trust Manager part of that quadrilogy.

Server

CA-signed certificates are typically used for the authentication of public servers. However, our clients must trust our private server exclusively, and not any server certificate issued by some CA. So we need our own CA specific to our server, and none other.

Client

Actually we are primarily concerned with client authentication, and we want to automate the deployment and enrollment of new devices. Self-signed certificates enable automation, but is this at the cost of security?

Rogue certificates

Self-signed peer certificates in a truststore are effectively root CA certificates. Consequently, a rogue certificate can be issued using a compromised client's keystore, by abusing the client key as a trusted CA key, to exploit the server's trust, as follows.

$ keytool -keystore keystore.jks -alias mykey -gencert -validity 365 -rfc \
      -infile rogue.csr -outfile rogue.signed.pem -dname "CN=clientx"

where keytool is used to sign a CSR for a rogue certificate, using our compromised keystore. Alternatively, one could use openssl. The rogue certificate would be trusted by virtue of being signed by a trusted certificate. Let's fix that!

Custom trust manager

So for a private server, we should trust only client certificates that are explicitly imported into our truststore, irrespective of their certificate chain. For this purpose, we implement a custom X509TrustManager.

public class ExplicitTrustManager implements X509TrustManager {
    private final Map<String, X509Certificate> certificateMap = new HashMap<>();

    public ExplicitTrustManager(KeyStore trustStore) throws GeneralSecurityException {
        for (String alias : Collections.list(trustStore.aliases())) {
            X509Certificate certificate = (X509Certificate) trustStore.getCertificate(alias);
            String commonName = Certificates.getCommonName(certificate.getSubjectDN());
            certificateMap.put(commonName, certificate);
        }
    }
    ...
}

where we build a map of our trusted certificates for lookup by their common name. We implement the methods required for an X509TrustManager.

@Override
public X509Certificate[] getAcceptedIssuers() {
    return new X509Certificate[0];
}

@Override
public void checkClientTrusted(X509Certificate[] chain, String authType)
        throws CertificateException {
    checkTrusted(chain);
}

@Override
public void checkServerTrusted(X509Certificate[] chain, String authType)
        throws CertificateException {
    throw new CertificateException("Server authentication not supported");
}

where we return an empty list of accepted issuers, so our trust manager applies to all certificates. We delegate checkClientTrusted to the following method.

private void checkTrusted(X509Certificate[] chain) throws CertificateException {
    if (chain.length == 0) {
        throw new CertificateException("Invalid cert chain length");
    }
    X509Certificate trustedCertificate = certificateMap.get(
            Certificates.getCommonName(chain[0].getSubjectDN()));
    if (trustedCertificate == null) {
        throw new CertificateException("Untrusted peer certificate");
    }
    if (!Arrays.equals(chain[0].getPublicKey().getEncoded(),
            trustedCertificate.getPublicKey().getEncoded())) {
        throw new CertificateException("Invalid peer certificate");
    }
    trustedCertificate.checkValidity();
}

where we look up the trusted certificate with the same common name as the peer's key certificate, i.e. the first certificate in the chain. We check the equality of this peer certificate to our trusted certificate, via its public key. This prevents a spoofing attack using a rogue certificate with the same common name. We note that certificates are "revoked" by removing them from the truststore, and restarting our server. Disclaimer: This trust manager is an untested prototype and should not be used in production without further review and thorough testing ;)
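To sketch how this trust manager might be wired in, we install it into the server's SSLContext in place of the default trust manager. The keyStore and trustStore variables and the password are illustrative assumptions, e.g. loaded as in the Local CA article.

// keyStore holds the server key and its chain; trustStore holds the explicitly trusted client certificates
KeyManagerFactory kmf = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
kmf.init(keyStore, "keypass".toCharArray());
SSLContext context = SSLContext.getInstance("TLS");
context.init(kmf.getKeyManagers(),
        new TrustManager[] {new ExplicitTrustManager(trustStore)},
        new SecureRandom());

The resulting context is then used to create the client-authenticated SSLServerSocket, with setNeedClientAuth(true).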

Server key change

When we discover that our server has been compromised, we must generate a new server key. If our clients trust the server certificate specifically, then we must update each client's truststore accordingly. This might be quite a burden. Moreover, we might wish to change our server key periodically anyway. We want a server key change not to affect its clients. So we sign its certificate using an offline CA key, which we create for this purpose. Our private clients trust this CA certificate, whose offline key cannot be compromised (except by physical access). In the event that our server is compromised, we generate a new server key, which we sign by transferring its CSR to our CA machine, and returning its signed certificate reply, via USB key.

$ keytool -keystore ca.jks -alias ca -gencert -infile server.csr \
      -dname "CN=server.com" \
      -validity 365 -rfc -outfile server.signed.pem \
      -ext BasicConstraints:critical=ca:false,pathlen:0 \
      -ext KeyUsage:critical=keyEncipherment \
      -ext ExtendedKeyUsage:critical=serverAuth

In this case, we can perform a server key change whereby our clients are unaffected.

Conclusion

CA-signed certificates are typically used for the authentication of public servers by any browser. However, we are concerned with client-authenticated devices, connecting to a private server. Our server and each of its clients should exclusively trust each other, and certainly not trust any certificate issued by some public CA.

We note that a self-signed client certificate in our server truststore is effectively a CA certificate. Therefore a compromised client keystore can be abused to sign rogue certificates. These are likewise trusted, by virtue of being issued by a trusted certificate. So we implement a custom trust manager that trusts only certificates that are explicitly imported into our truststore, irrespective of their certificate chain.

We note that a compromised client keystore can itself be used to spoof that client. Such an attack can be detected by analyzing logs for simultaneous connections, ostensibly from the same client, but from an alternate IP address. We note that rogue certificates can spoof any chosen certificate name, and so present a higher risk. If our server is compromised, the attacker might inject a fraudulent client certificate into our server truststore. However such tampering is overt, and detectable by monitoring our truststore.

When we discover that our server is compromised, we must of course change our server key. Since our clients specifically trust our server's certificate, we must update every client's truststore. To avoid this burden, we should use an offline CA key to sign our server certificate. Our clients trust this local CA certificate, and so are unaffected by a server key change.

Resources

https://github.com/evanx/vellum/wiki - see ExplicitTrustManager.java.

Further reading

See the Final Quadrilogy on Java SSL. Besides treatment of Public Key Infrastructure from a Java perspective, this series features a minimal Java implementation of Dual Control to satisfy PCI DSS compliance.

In 2013, I was a Linux sysadmin, PostgreSQL DBA, and erstwhile Java developer for a payment switching company, which was preparing for its first PCI assessment. Besides securing and virtualising their infrastructure - with KVM, virtual firewalls, and ssh forced commands - which kept me quite busy, there was this PCI requirement for "dual control" and "split knowledge" which was a show-stopper. 

https://github.com/evanx/vellum/wiki/FinalQuadrilogy

"The Final Quadrilogy" series: Featuring Dual Control encryption, SSL, keytool, Trust Managers etc
"Dual Control" requires that your sysadmin cannot access the keys used to encrypt credit card numbers. Because otherwise the sysadmin could "go rogue" and decrypt all the credit card numbers into plain text for nefarious purposes. I guess it must have happened a few times, hence the requirement. The general solution seems to be expensive "key management servers," but we decided to roll up our sleeves and put a minimal solution in place that would satisfy this PCI requirement. For this I developed an opensource dualcontrol package. We used this code and passed our PCI assessment - yay! In conjunction with developing and testing the code, I drafted the article "Dual Control" on my github wiki. But the problem with the "Dual Control" draft article, is that I got carried away with it, and it became too long, with various tangential ventures in aspects of Java crypto. So I began breaking it up into a series. I split out two smaller articles, namely Local CAand Explicit Trust Manager. The Dual Control article was still a bit too long, and I wanted to subdivide it further, but I never got around to that. Now, after a year, I decided to name and publicise this series... 

The Final Quadrilogy

And so, ladies and gentlemen, I give you, in one fell swoop, The Final Quadrilogy with its four articles, namely Dual Control, Explicit Trust Manager, Client Authentication and Local CA, and hope that you never have need to read any of it, as it's very dry crypto stuff, and rather drawn out. Especially Dual Control, and most of all Local CA, which delves into Java SSL crypto quite deeply. Explicit Trust Manager and Client Authentication are at least relatively "short and sweet." In brief, the four articles are as follows: "Dual Control" presents our "dualcontrol" package for satisfying the PCI DSS requirements for "dual control" and "split knowledge" with respect to protecting data-encrypting keys. "Local CA" discusses how one could set up a local private CA facility to sign certificates. "Explicit Trust Manager" presents a custom X509TrustManager that only trusts client certificates that are explicitly imported into our truststore, for a secure "private network" of clients connecting to a "private" Java server. Finally, in "Client Authentication," we propose a dynamic trust manager to automate certificate enrollment, e.g. where new clients' certificates are imported into an SQL-based dynamic truststore. These four articles are listed on the wiki page: The Final Quadrilogy. 

Move over Java, here comes Node

Having completed that PCI project by the end of 2013, I got another job in 2014, and had a very busy year, introducing myself to various new technologies, notably AngularJS, Node.js, and Redis, and trying to reinvent myself as a "web developer" of sorts. Consequently I guess my love affair with Java is cooling off after 15 great years. I'm now also loving JavaScript and Node microservices. In short, I've become one of the following types ;) Update: I've implemented a dual control cryptoserver prototype using... Node and Redis. And similarly a certserver, for managing client certificates. 

Whither this blog

I can't bear to think that this is my last posting on this blog. If so, then hopefully I saved the best for last, with this crypto quadrilogy, at least for some. Actually in the coming months I may even post a blog article on each of the four articles in this quadrilogy, to make those specific resources easier to find from an SEO perspective. Also I might blog sometime about my last Java pet project, Chronic - a monitoring server - which is abandonware, but has some sweet crypto. It is something I wish to re-implement in Node, Redis and... ReactJS ;) Until then, find me on twitter: @evanxsummers

What is Redis?

Posted by evanx Dec 31, 2014
Redis is good for prototyping, shared memory, messaging, caching and maximum performance. It might be used orthogonally and/or complementary to your SQL relational store, and/or NoSQL document store. For example, Redis might be used to cache dimensional aggregates of relational data for analytical purposes. Its use-case as "persistent shared memory" is important especially for microservices and distributed systems in general. I spent the last few weeks of this holiday season applying Redis to some use-cases. Herewith my write-up for my colleagues on Redis. You can find the original (and likely updated) article on my github wiki: https://github.com/evanx/vellum/wiki/redis_intro. So let's get into it...

Redis is a persistent in-memory key-value store. It was developed by Salvatore Sanfilippo. (Thank you so much!) The values of keys can be strings, lists, sets, sorted sets or hash tables.
- Hash tables can store structured data, i.e. different fields of a record (or columns of a row).
- Lists are queues with left and right sides (or head and tail), from which members can be pushed and popped.
- Sets have unique members, i.e. no duplicates.
- Sorted sets' members have a "score" which is a floating point number. Members are ranked by this score, and/or the member (string) in the case of equal scores. They can be fetched by rank, or by a range of their scores. All scores might be set to 0 to rank via the member string lexicographically.

Keys, and "members" of the above data types, are strings. However strings are "binary-safe" and so can be any type of data. Often strings (as keys, values and members) are stringified integer ids, or serialized objects e.g. JSON. Incidentally, there is an operation to treat a string as an integer to be incremented - e.g. for generating sequential ids.

The whole "database" is stored in memory. This deliberate "feature" gives it fantastic performance compared to on-disk databases. But it's also its "bug" that limits the database size. Considering this limitation, trimming the data set is critical - i.e. removing older data we don't actually need in memory for our use-case. (This might be archived into a disk-based database.) Besides snapshots, Redis has an "append-only log" which is configurable to sync to disk every second, or after every write (slow). This enables one to trade off the level of performance vs durability, e.g. potentially losing the last second of writes in the event of a power failure or crash.

Actually Redis has various use-cases, besides a database per se:
- memory cache in place of memcached: Redis supports expiry times on keys for this purpose.
- message queue: Redis' list data type can be used as a queue, since it offers blocking pop operations. Redis also supports pubsub, i.e. for asynchronous decoupled message passing.

An important use-case, especially for microservices and distributed systems in general, is using Redis for "persistent shared memory."
- "persistent" because the data is still available when our application is restarted, or indeed Redis is restarted, e.g. after reboot or power failure.
- "shared" because it's a server which can be accessed remotely by multiple application instances.

Redis is typically used to share data required by microservices running in different application containers, which is effectively like different hosts with different filesystems. For web applications, user session state might be stored in Redis, so that it is accessible by various microservices, e.g. by a session ID which is contained in the HTTP request. (Authentication and authorisation could happen before the request reaches this service.) If not using Redis, one might use an RDBMS for persistent session state and queues. However, extra management and integration effort is required when using an RDBMS and SQL, compared to Redis, for such use-cases. The killer feature is that Redis might be 100x or 500x faster than an RDBMS, considering the latency of hard disk access compared to RAM.

In general, web developers are increasingly finding NoSQL technologies attractive, in terms of ease of integration, performance and/or scalability. Nevertheless, if one's data is relational, with foreign key relationships that must exist, and strict ACID compliance is required, then clearly an RDBMS is an appropriate technology. However, a mix of SQL/NoSQL technologies is potentially advantageous. Redis is good for prototyping, shared memory, messaging, caching and maximum performance. It might be used orthogonally and/or complementary to your SQL relational store, and/or NoSQL document store. For example, Redis might be used to cache dimensional aggregates of relational data for analytical purposes. For comparative information on various NoSQL databases, see http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis. If you want to play with Redis, it's easy to install and start without any configuration required. Then just use the `redis-cli` command-line. See http://redis.io.  
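By way of illustration, here is a quick redis-cli session exercising the data types mentioned above; the key names are arbitrary, and the outputs shown assume an empty database.
$ redis-cli
127.0.0.1:6379> SET session:abc123 "some-session-state"
OK
127.0.0.1:6379> EXPIRE session:abc123 3600
(integer) 1
127.0.0.1:6379> INCR user:id
(integer) 1
127.0.0.1:6379> LPUSH queue:tasks "task1"
(integer) 1
127.0.0.1:6379> RPOP queue:tasks
"task1"
127.0.0.1:6379> SADD clients "client1" "client1"
(integer) 1
127.0.0.1:6379> ZADD scores 100 "alice" 200 "bob"
(integer) 2
127.0.0.1:6379> HSET user:1 name "Evan"
(integer) 1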

Password Rehash

Posted by evanx Apr 30, 2013
We continue our Password Salt adventures with PBKDF2, in order to store multiple revisions of the crypto parameters, and migrate hashes on the fly.   

https://github.com/evanx/vellum/wiki

 Password Rehash: The second part of a trilogy, part of "The Enigma Posts"
In the Password Salt prequel, we suggested that the most secure passwords are no passwords, e.g. using Google Login, Facebook Connect, Mozilla Persona or what-have-you. Such an approach is simpler for developers and more convenient for end-users. However for internal enterprise apps, those identity services might not be suitable, so... In said prequel, we presented an implementation using PBKDF2WithHmacSHA1, with a high number of iterations. In this article, we cater for multiple revisions of the number of iterations and key size, and migrate hashes to the latest revision when the user logs in. 

PBKDF2 recap

Paraphrasing an earlier version of the OWASP Password Storage Cheat Sheet (before they gone done changed the whole thing, after this series was drafted), 
General hashing algorithms (e.g. MD5, SHA) are not recommended for password storage. Instead an algorithm specifically designed for the purpose should be used such as PBKDF2 or scrypt.
As presented in Password Salt, we salt, hash and match our passwords using PBKDF2 as follows. 
public class Passwords {
    public static final String ALGORITHM = "PBKDF2WithHmacSHA1";
    public static final int ITERATION_COUNT = 30000;
    public static final int KEY_SIZE = 160;

    public static byte[] hashPassword(char[] password, byte[] salt)
            throws GeneralSecurityException {
        return hashPassword(password, salt, ITERATION_COUNT, KEY_SIZE);
    }

    public static byte[] hashPassword(char[] password, byte[] salt,
            int iterationCount, int keySize) throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterationCount, keySize);
        SecretKeyFactory factory = SecretKeyFactory.getInstance(ALGORITHM);
        return factory.generateSecret(spec).getEncoded();
    }

    public static boolean matches(char[] password, byte[] passwordHash, byte[] salt) 
            throws GeneralSecurityException {
        return matches(password, passwordHash, salt, ITERATION_COUNT, KEY_SIZE);
    }

    public static boolean matches(char[] password, byte[] passwordHash, byte[] salt,
            int iterationCount, int keySize) throws GeneralSecurityException {
        return Arrays.equals(passwordHash, hashPassword(password, salt, 
                iterationCount, keySize));
    }
}
where we check if the supplied password, and its salt, matches our hash, using the given PBKDF2 parameters. 
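For instance, when a user chooses a new password, we might generate a salt and hash the password for storage, along the lines of the following sketch (using the PasswordSalts and Base64 helpers from the Password Salt prequel; how the columns are written is up to the application):
char[] password = "12345678".toCharArray();
byte[] salt = PasswordSalts.nextSalt();
byte[] hash = Passwords.hashPassword(password, salt);
// store these in the credential table for this login, together with the
// current ITERATION_COUNT and KEY_SIZE
String encodedHash = Base64.encode(hash);
String encodedSalt = Base64.encode(salt);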

Measuring time

According to the aforecited earlier version of the OWASP Password Storage Cheat Sheet, 
One should measure the time required and make sure that it's as large as possible without providing a significantly noticeable delay when users authenticate.
Let's measure the time required, albeit on our dev machine. 
    @Test
    public void testMatchesEffort() throws Exception {
        char[] password = "12345678".toCharArray();
        byte[] saltBytes = PasswordSalts.nextSalt();
        long startMillis = System.currentTimeMillis();
        byte[] hashBytes = Passwords.hashPassword(password, saltBytes, 30000, 160);
        System.out.printf("hash duration (30k, 160bit): %dms\n", Millis.elapsed(startMillis));
        startMillis = System.currentTimeMillis();
        assertTrue(Passwords.matches(password, hashBytes, saltBytes, 30000, 160));
        System.out.printf("matches duration (30k, 160bit): %dms\n", Millis.elapsed(startMillis));
        startMillis = System.currentTimeMillis();
        Passwords.hashPassword(password, saltBytes, 100000, 160);
        System.out.printf("100k hash duration: %dms\n", Millis.elapsed(startMillis));
        startMillis = System.currentTimeMillis();
        Passwords.hashPassword(password, saltBytes, 300000, 160);
        System.out.printf("300k hash duration: %dms\n", Millis.elapsed(startMillis));
        assertFalse(Passwords.matches(password, hashBytes, saltBytes, 30001, 160));
        assertFalse(Passwords.matches(password, hashBytes, saltBytes, 30000, 128));
        assertFalse(Passwords.matches("wrong".toCharArray(), 
                hashBytes, saltBytes, 30000, 160));
    }
which prints the duration of the hashing and matching operations, as follows. 
hash duration (30k, 160bit): 146ms
matches duration (30k, 160bit): 141ms
100k hash duration: 542ms
300k hash duration: 1711ms
which shows the duration in milliseconds for 30k iterations, and also 10x that, for a key size of 160 bits. Naturally we expect the duration to be linearly dependent on the number of iterations. Also, as matches() invokes hashPassword(), we expect those durations to be similar. The above test indicates that iteration counts in the range 30k to 100k take less than 500ms, which might be our chosen time limit. But this is on our dev machine, and we should measure this in production, under varying loads. 

Storage

In order to revise the algorithm parameters e.g. if we migrate to a faster host, we need to store these parameters, together with the salt and the password hash, for each user. We might extend our SQL credential table to include the PBKDF2 parameters as follows. 
   LOGIN VARCHAR(100) PRIMARY KEY,
   PASSWORD VARCHAR(32),
   SALT VARCHAR(32),
   PASSWORD_ITERATION_COUNT INTEGER,
   PASSWORD_KEY_SIZE INTEGER,   
In the next sequel we'll consider encrypting the salt, which would require a further field named SALT_IV (for the AES "initialization vector"). But let's try to migrate to salted passwords without changing our database schema, by packing the password hash, salt and parameters into one field to be stored in the PASSWORD column. We can differentiate our legacy hash stored therein by its shorter length, and migrate. 
public class PasswordHash {
    private static final int VERSION = 255;
    int iterationCount;
    int keySize;
    byte[] hash;
    byte[] salt;
    byte[] iv;

    public PasswordHash(byte[] hash, byte[] salt, byte[] iv, 
            int iterationCount, int keySize) {
        this.hash = hash;
        this.salt = salt;
        this.iv = iv;
        this.iterationCount = iterationCount;
        this.keySize = keySize;
    }
    ...
Given a new password and specific algorithm parameters, we generate salt, and hash the password as follows. 
    public PasswordHash(char[] password, int iterationCount, int keySize) 
            throws GeneralSecurityException {
        this.iterationCount = iterationCount;
        this.keySize = keySize;
        this.salt = PasswordSalts.nextSalt(); // e.g. random 16 byte array
        this.hash = Passwords.hashPassword(password, salt, iterationCount, keySize);
        this.iv = new byte[0];
    }
We mash the hash et al into a byte array as follows, for storage purposes. 
    public byte[] getBytes() throws IOException {
        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        stream.write(VERSION);
        writeObject(new ObjectOutputStream(stream));
        return stream.toByteArray();
    }
                    
    private void writeObject(ObjectOutputStream stream) throws IOException {
        stream.writeInt(iterationCount);
        stream.writeShort(keySize);
        if (hash.length > 255) {
            throw new IOException("Hash length not supported");
        }
        stream.write(hash.length);
        stream.write(salt.length);
        stream.write(iv.length);
        stream.write(hash);
        stream.write(salt);
        stream.write(iv);        
        stream.flush();
    }
where we introduce a writeObject() method as per Serializable, for the sake of conformity. This byte array will be encoded using Base64 and stored in our PASSWORD column in the database. In order to authenticate a user's password, we unpack the byte array retrieved from the database. 
       
    public PasswordHash(byte[] bytes) throws IOException {
        InputStream stream = new ByteArrayInputStream(bytes);
        int version = stream.read();        
        if (version != VERSION) {
            throw new IOException("Version mismatch: " + version);
        }
        readObject(new ObjectInputStream(stream));
    }

    private void readObject(ObjectInputStream stream) throws IOException {
        iterationCount = stream.readInt();
        keySize = stream.readShort();
        hash = new byte[stream.read()];
        salt = new byte[stream.read()];
        iv = new byte[stream.read()];
        stream.read(hash);
        stream.read(salt);
        stream.read(iv);        
    }
We provide a method to authenticate a password against this hash, using its accompanying parameters. 
    public boolean matches(char[] password) throws GeneralSecurityException {
        millis = System.currentTimeMillis();
        try {
            return Arrays.equals(hash, Passwords.hashPassword(password, salt, iterationCount, keySize));
        } finally {
            millis = Millis.elapsed(millis);
        }
    }
where we record the time taken to authenticate the password, for monitoring purposes, in order to assess the chosen iterationCount parameter in our production environment. 

Testing

Let's test this and see what we end up with. 
    @Test
    public void testPasswordHashMinimumKeySize() throws Exception {
        testPasswordHash(30000, 128);
    }
    
    private void testPasswordHash(int iterationCount, int keySize) throws Exception {
        char[] password = "12345678".toCharArray();
        PasswordHash passwordHash = new PasswordHash(password, iterationCount, keySize);
        byte[] hashBytes = passwordHash.getBytes();
        String encodedString = Base64.encode(hashBytes);
        passwordHash = new PasswordHash(hashBytes);
        assertEquals("iterationCount", iterationCount, passwordHash.getIterationCount());
        assertEquals("keySize", keySize, passwordHash.getKeySize());
        assertTrue(PasswordHash.verifyBytes(hashBytes));
        assertFalse(passwordHash.matches("wrong".toCharArray()));
        assertTrue(passwordHash.matches(password));
        System.out.printf("iterationCount: %d\n", iterationCount);
        System.out.printf("keySize: %d\n", keySize);
        System.out.printf("byte array length: %d\n", hashBytes.length);
        System.out.printf("encoded string: %s\n", encodedString);
        System.out.printf("encoded length: %d\n", encodedString.length());
        System.out.printf("millis: %d\n", passwordHash.getMillis());
    }
which prints, 
iterationCount: 30000
keySize: 128
byte array length: 48
encoded string: /6ztAAV3KQAAdTAAgBAQADaOT6lY7axaXF56GJvOPjKMrHGQhyihJQWVzeqjyK+6
encoded length: 64
millis: 137ms
We note the array length is 48, which should be larger than our legacy hashes, so we can differentiate those by their shorter length. Seeing that the Base64-encoded length is 64 characters, our PASSWORD column capacity should be altered to VARCHAR(64) at least. 

Migration

We provide a static method to confirm that the bytes are what our PasswordHash constructor would expect. 
public class PasswordHash {
    ...
    public static boolean verifyBytes(byte[] bytes) {
        return bytes.length >= 48;
    }   
} 
We assume that the length of our legacy unsalted password hashes is less than that of our PasswordHash serialized array, and so use the above method to differentiate those. Finally we migrate to salty passwords, on the fly, as follows. 
    public boolean matches(String user, char[] password, byte[] packedBytes) throws Exception {
        if (PasswordHash.verifyBytes(packedBytes)) {
            PasswordHash passwordHash = new PasswordHash(packedBytes);
            if (passwordHash.matches(password)) {
                monitor(passwordHash.getMillis());
                if (passwordHash.getIterationCount() != Passwords.ITERATION_COUNT
                        || passwordHash.getKeySize() != Passwords.KEY_SIZE) {
                    passwordHash = new PasswordHash(password,
                            Passwords.ITERATION_COUNT, Passwords.KEY_SIZE);
                    persistRevisedPasswordHash(user, passwordHash.getBytes());
                }
                return true;
            }
            return false;
        }
        if (matchesUnsalted(password, packedBytes)) {
            packedBytes = PackedPasswords.hashPassword(password);
            persistRevisedPasswordHash(user, packedBytes);
            return true;
        }
        return false;
    }
where if the password is correct, but not at the latest revision, or still a legacy unsalted hash, we take the opportunity of migrating that user's password hash to the latest salty non-cracker. As PCI compliance requires passwords to be changed every 90 days, admittedly it's not really necessary to migrate existing passwords to higher parameters, because a new password hash will be created soon enough :) Finally, we monitor() the time taken to authenticate the password, e.g. to log hints to revise our iteration count. An upcoming article in the Timestamped series will illustrate how we can gather time-series stats. We will record the sample size, minimum, maximum, average, and distribution of the duration of the hashing operation, for different iteration counts, aggregated by hour and day. Of course, this must be nicely graphed so we can visualise the state of affairs, at a quick glance :) 

Salt cipher

The aforementioned version of the OWASP cheat sheet suggested "salt isolation" as follows, 
An additional password storage defense mechanism involves storing the salt in a different location than the password hash. Use of the server's filesystem is one commonly used mechanism for salt isolation, assuming the password hashes are stored in a different location such as a database.
As discussed in the prequel, it'd be simpler to encrypt the salt in the database, and store just the salt-encrypting key on the filesystem. Indeed, the iv field of PasswordHash is an "initialization vector" as required for AES encryption of the salt. In the finale of this trilogy, we'll present that aspect. However, is "salt isolation" really necessary? Please share your opinions in the comments section below :) 

Summary

While SHA-2 is recommended these days for general hashing, we should use computationally expensive algorithms for password hashing, so that the passwords are harder to crack. In the prequel, we presented an implementation using PBKDF2WithHmacSHA1, with a high number of iterations. Since the hashing operation should take as long as we are willing to make the user wait, the algorithm parameters must be tweaked according to our host's CPU. We should store the number of iterations and the key size, to enable revision thereof. We decide to encapsulate the hash, salt and parameters into a PasswordHash object. We serialize this object into a byte array, and encode it with Base64 to store in our SQL database. Migration from legacy hashes is somewhat simplified thereby. We can migrate to revised hashes when the user logs in and their password is thus on hand. Having said that, and seeing that PCI compliance requires passwords to be changed every 90 days, it is probably not necessary to migrate existing passwords to higher parameters when a new password hash will be created soon enough. 

Coming up

In "Password Cipher," the finale of this sub-trilogy, we'll encrypt the password salt in our database, using... password-based encryption :) Also, an upcoming article in the Timestampedseries will illustrate how we can gather time-series stats, including the time taken to authenticate the password. Such stats are required to revise our iteration count. We really should try out bcrypt and scrypt, and hopefully we will :) 

Resources

https://github.com/evanx/vellum/wiki - see PasswordHash.java. @evanxsummers
This morning I enjoyed an article entitled "There Is Ubuntu, There Is Linux And Then There Are Others", and here I rehash my comment there. Chris Jones' article discusses how Microsoft and Canonical haven't listened to what their users want, which is the traditional "Start" desktop a la previous versions of Windows, GNOME 2 etc, not to mention KDE. Hence Windows 8 is not breaking any sales records, and Linux Mint's Cinnamon is more popular than Unity.   

The Wider Picture: Unity rocks

Actually, I find that the Unity launcher is very similar to the Windows 7 dock, except that it is on the left, which to my mind makes better use of today's wide screens. Vertical space is best used by applications rather than the shell, in my blinkered view ;) My Windows 7 laptop screen is wide enough, but lacking in the vertical department. So I put the dock on the left, just like Unity :) My screen at work is too wide for all intents and purposes, to be honest. But at least Unity makes best use of the available vertical real-estate. I hasten to add that the Unity search and notifications are top-notch, and Ubuntu update and security is great. So, to my mind, it's the best desktop OS and shell, bar none.

I believe that Unity is designed for end-users by a team of UX experts hired by Canonical in London, rather than designed by developers. So I expect it to be "better" than GNOME and KDE for the "mainstream." Although I have used both KDE and GNOME extensively since their genesis (and I wrote this article on me mum's GNOME 2 pewter), these days I'm a Unity convert. For sure, there are some things not yet 100% for me, but that might just be me, or might just be "not yet." I know they have a decisive leader, professional designers, and a passionate community in attendance. When I started using bleeding-edge Unity a year or so ago (Ubuntu 11.10), it was falling over every day, which didn't surprise me. What was surprising is how quickly it stabilised for 12.04. I suspect this is a consequence of their prescription for testable code. Interestingly, this is one of the primary reasons offered for developing Mir, and abandoning their Wayland plans. Other reasons appear to be agility for shell development, and supporting Android graphics drivers. So here we have a brand new OpenGL-based shell, and unified display server, written from the ground up for a converged 3D-graphics world, with a dogged insistence on automated testing. I bow in admiration of their efforts, as much as I enjoy the fruits of their labour. 

The Legacy: Windows

Having said all that, I'm still too lazy to reinstall my Windows laptop, as per my cheeky 2006 article, My Desktop OS: Windows XP ;) Incidentally the person in that picture recently contacted me out of the blue. He is a most affable chap named Leslie Wilcock, and provided some background information which I posted as a comment :) Certainly the Linux desktop, in particular Ubuntu, has come a long way since then. In another 2006 blog, A tale of two CDs, I gave very low scores to Linux desktops of the day ;) Last January I tried Fedora with GNOME3 and KDE, but reverted to Ubuntu with Unity, despite its instability at the time. Fedora's installer looks and feels so dated, which is off-putting from the word go. KDE also felt dated, while GNOME3 seemed too fandangled, so there's no pleasing me ;) For my tastes, nothing is as slick as Ubuntu, from the installer through to the Unity shell. It looks beautiful and feels modern. 2013 is the year of the Linux desktop, by my personal metric, which is: I really want to reinstall my Windows laptop with Ubuntu, at last ;) And in full confidence that I won't be missing a thing, but gaining a lot. Besides all the strides made by Linux and streamlined by Ubuntu, the public's insatiable appetite for cloud apps like Gmail, Google Talk, Google Docs et al further enables the Linux desktop. Hey Microsoft, we don't need your Windows no more! Microsoft are being pummelled by Linux from both sides, on mobile and the cloud, while ChromeOS and Ubuntu are taking jabs at their desktop market in the middle. 

The Bigger Picture: Unification

And now please allow me to opine on the big picture. The benefits of a unified interface across phone/phablet and desktop in particular will accrue. MS and Canonical are betting on this. That's not to mention the three other "screens of our lives," namely tablet, TV and IVI. MS and Ubuntu desktop users are rankling at the change, but as users inevitably grow more attached to their phones than desktops, this strategy will appreciate. Consumers want commonality between their desktop and phone, and producers want a unified platform for both. On the Linux front, Canonical is more agile and community-based than Google, and best espouses this unified vision. Canonical could replicate the success of Android and ChromeOS, with the crucial advantage of a unified platform. I hope they can induce developers of mobile apps. QML, being JavaScript-based, is well-chosen, in my opinion. A big part of the big picture is cloud apps and services. Their integration with mobile and the desktop can make or break those platforms, going forward. Google currently own the cloud with Gmail and Google Docs, so they're sitting in the driver's seat right now. But nothing stays the same in this wonderful industry of ours, as exemplified by Microsoft and Apple's rollercoasting, and the rise of Google. The future is open, and opensource, so everyone can play. Android is dominant. ChromeOS is rising. Tizen and FirefoxOS are exciting. But Ubuntu will be the first unified post-PC OS. 

The Other Story

I was about to publish the sequel to Password Salt today, but now that will have to wait for next week, so stay tuned :) On a related note, we read this week that a recent Cisco update "leaves users considerably more susceptible to password cracking," since "Cisco's new method for converting passwords into one-way hashes uses a single iteration of the SHA256 function with no cryptographic salt." I don't believe it! ;)  

Password Salt

Posted by evanx Jan 24, 2013
This provides a long overdue update to "Password Hash" from the Enigma Prequels (2007), where that article neglected to add salt, which is embarrassing for whoever wrote that article... which was unfortunately me.   

https://github.com/evanx/vellum/wiki

Password Salt: The first part of a hashed cryptrilogy, part of "The Enigma Posts"
We know that passwords should be hashed, and hear they should be salted. SHA-2 is recommended these days for general hashing, but we'll read that we should use computationally expensive algorithms for password hashing, so that the passwords are harder to crack. Just ask LinkedIn, who learnt this lesson a few months ago (2012). The most secure passwords are no passwords, e.g. using Google Login, Facebook Connect, Mozilla Persona or what-have-you. Such an approach simplifies our implementation effort, improves security, and makes registration and login more convenient for the end-user. So that's surely the way forward for consumer sites. However for internal enterprise apps, those login services might not be suitable. Perhaps the reader can recommend an opensource identity solution which handles passwords in a PCI-compliant fashion? Having said that, since so many of us devs are lumbered with password management, let's get stuck in. 

Background reading

From https://www.owasp.org/index.php/Password_Storage_Cheat_Sheet
General hashing algorithms (eg, MD5, SHA) are not recommended for password storage. Instead an algorithm specifically designed for the purpose should be used such as bcrypt, PBKDF2 or scrypt.
From http://en.wikipedia.org/wiki/Bcrypt
bcrypt is a key derivation function for passwords based on the Blowfish cipher.
From http://en.wikipedia.org/wiki/PBKDF2
PBKDF2 (Password-Based Key Derivation Function 2) is a key derivation function that is part of RSA Laboratories' Public-Key Cryptography Standards (PKCS) series.
PBKDF2 applies a pseudorandom function to the input password along with a salt value, and repeats the process many times.
The added computational work makes password cracking much more difficult.
Having a salt added to the password reduces the ability to use precomputed hashes (rainbow tables) for attacks.
From http://en.wikipedia.org/wiki/Scrypt
A password-based key derivation function (password-based KDF) is generally designed to be computationally intensive, so that it takes a relatively long time to compute (say on the order of several hundred milliseconds).
Legitimate users only need to perform the function once per operation (e.g., authentication), and so the time required is negligible.
However, a brute force attack would likely need to perform the operation billions of times at which point the time requirements become significant and, ideally, prohibitive.
From https://www.owasp.org/index.php/Hashing_Java,  
If each password is simply hashed, identical passwords will have the same hash. This has two drawbacks:
1. Due to the birthday paradox (http://en.wikipedia.org/wiki/Birthday_paradox), the attacker can find a password very quickly especially if the number of passwords in the database is large.
2. An attacker can use a list of precomputed hashes (http://en.wikipedia.org/wiki/Rainbow_table) to break passwords in seconds.
In order to solve these problems, a salt can be concatenated to the password before the digest operation.
A salt is a random number of a fixed length. This salt must be different for each stored entry. It must be stored as clear text next to the hashed password.
In this configuration, an attacker must handle a brute force attack on each individual password. The database is now birthday attack and rainbow crack resistant.
Note that the sources, erm, "quoted" above, have been surreptitiously paraphrased in some places, to improve readability to suit ourselves ;) 

Base64

Since we store password hashes and their salts in an SQL database, we encode those bytes into text using Base64. For convenience, we introduce methods which delegate to our Base64 codec of choice e.g. from Apache commons, or the built-in Sun one. 
import sun.misc.BASE64Decoder;
import sun.misc.BASE64Encoder;

public class Base64 {
    
   public static String encode(byte[] bytes) {
      return new BASE64Encoder().encode(bytes);
   }

   public static byte[] decode(String string) {
      try {
         return new BASE64Decoder().decodeBuffer(string);
      } catch (Exception e) {
         throw new RuntimeException(e);
      }
   }
}
Actually Apache Commons Codec is a better choice as the Sun one gives compilation warnings to the effect that the "sun.misc.BASE64Decoder is a Sun proprietary API and may be removed in a future release." But we take that with a pinch of salt, so to speak. 
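For what it's worth, here's a minimal sketch of the same wrapper delegating to Commons Codec instead (assuming commons-codec 1.4 or later is on the classpath), which spares us that warning.
public class Base64 {

   public static String encode(byte[] bytes) {
      return new String(org.apache.commons.codec.binary.Base64.encodeBase64(bytes));
   }

   public static byte[] decode(String string) {
      return org.apache.commons.codec.binary.Base64.decodeBase64(string);
   }
}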

Pseudo salt

We read about SecureRandom vs Random on infosecinstitute.com, whilst enjoying Dilbert's comic RNG :) So apparently we must use SecureRandom to generate our salt, and not java.util.Random.
public class PasswordSalts {
   public static final int SALT_LENGTH = 16;

   public static byte[] nextSalt() {
      byte[] salt = new byte[SALT_LENGTH];
      SecureRandom random = new SecureRandom();
      random.nextBytes(salt);
      return salt;
   }
}
where our salt is a 16-byte random number. Sooo easy :) Let's immerse ourselves in Base64 salts. 
   @Test
   public void testSaltEncoding() throws Exception {
      byte[] saltBytes = PasswordSalts.nextSalt();
      String encodedSalt = Base64.encode(saltBytes);
      System.out.println(encodedSalt);
      assertEquals(encodedSalt.length(), 24);
      assertEquals(encodedSalt.substring(22, 24), "==");
   }
r2tWqOrfKpr64rpOwoRlcw==
So apparently a 16-byte array encoded with Base64 yields a 22-character string followed by two characters of padding, i.e. 24 characters in total, since Base64 encodes each 3 bytes as 4 characters and 16 bytes pad out to 6 groups of 4. Sold! 

Salt per password

When the user chooses a new password, we generate some salt for this specific password, and hash them together. The OWASP example presents an SQL credential table with the following columns: 
   LOGIN VARCHAR (100) PRIMARY KEY,
   PASSWORD VARCHAR (32),
   SALT VARCHAR (32)
where we store each login's password and salt. 
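As a rough sketch of writing such a row when the user chooses a new password - the connection, login and CREDENTIAL table names are hypothetical, and hashPassword() is the method we define in the next section - we Base64-encode the hash and salt before storing them.
void savePassword(Connection connection, String login, char[] password)
      throws SQLException, GeneralSecurityException {
   byte[] salt = PasswordSalts.nextSalt();
   byte[] hash = Passwords.hashPassword(password, salt);
   PreparedStatement statement = connection.prepareStatement(
         "UPDATE CREDENTIAL SET PASSWORD = ?, SALT = ? WHERE LOGIN = ?");
   try {
      statement.setString(1, Base64.encode(hash));
      statement.setString(2, Base64.encode(salt));
      statement.setString(3, login);
      statement.executeUpdate();
   } finally {
      statement.close();
   }
}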

Crypto parameters

So let's try PBKDF2. We'll leave Bcrypt and Scrypt for another day. (Your opinions on these options, parameters and what-not, are invited, so please come to the party!) 
public class Passwords {
   public static final String ALGORITHM = "PBKDF2WithHmacSHA1";
   public static final int ITERATION_COUNT = 8192;
   public static final int KEY_SIZE = 160;

   public static byte[] hashPassword(char[] password, byte[] salt)
          throws GeneralSecurityException {
      return hashPassword(password, salt, ITERATION_COUNT, KEY_SIZE);
   }

   public static byte[] hashPassword(char[] password, byte[] salt,
          int iterationCount, int keySize) throws GeneralSecurityException {
      PBEKeySpec spec = new PBEKeySpec(password, salt, iterationCount, keySize);
      SecretKeyFactory factory = SecretKeyFactory.getInstance(ALGORITHM);
      return factory.generateSecret(spec).getEncoded();
   }
   ...
where we give a PBEKeySpec to a SecretKeyFactory specified with the PBKDF2WithHmacSHA1 algorithm. Let's test that this salting, hashing and matching actually works. 
   @Test
   public void test() throws Exception {
      char[] password = "12345678".toCharArray();
      byte[] salt = PasswordSalts.nextSalt();
      byte[] hash = Passwords.hashPassword(password, salt);
      assertTrue(Passwords.matches(password, hash, salt));
      byte[] otherSaltBytes = Arrays.copyOf(salt, salt.length);
      otherSaltBytes[0] ^= 0xff; // flip the first byte so that the salt differs
      assertFalse(Passwords.matches(password, hash, otherSaltBytes));
      assertFalse(Passwords.matches("wrong".toCharArray(), hash, salt));
   }
where we use the following method to authenticate a supplied password, having retrieved the hash and salt from our database. 
public class Passwords {
   ...
   public static boolean matches(char[] password, byte[] passwordHash, 
          byte[] salt) throws GeneralSecurityException {
      return matches(password, passwordHash, salt, ITERATION_COUNT, KEY_SIZE);
   }

   public static boolean matches(char[] password, byte[] passwordHash, 
          byte[] salt, int iterationCount, int keySize) 
          throws GeneralSecurityException {
      return Arrays.equals(passwordHash, hashPassword(password, salt, 
          iterationCount, keySize));
   }
}
where we must specify the PBKDF2 parameters used to create the hash in the first place. Note that we use char[] so that provided passwords can be cleared from memory, e.g. using Arrays.fill() to zero the char array. 
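For instance, a hypothetical caller might wipe the password array once it has been checked, assuming the hash and salt have already been fetched from the database.
   boolean authenticated;
   try {
      authenticated = Passwords.matches(password, storedHash, storedSalt);
   } finally {
      Arrays.fill(password, (char) 0); // clear the password chars from memory
   }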

Computational effort

According to the aforecited OWASP Password Storage Cheat Sheet
You should measure the time required and make sure that it's as large as possible without providing a significantly noticeable delay when your users authenticate.
Perhaps the time required to hash the password should be less than 100ms for consumer sites? And in the case of secure admin sites, a tad longer? 
   @Test
   public void testEffort() throws Exception {
      char[] password = "12345678".toCharArray();
      long startMillis = System.currentTimeMillis();
      byte[] saltBytes = PasswordSalts.nextSalt();
      Passwords.hashPassword(password, saltBytes);
      System.out.println("time " + Millis.elapsed(startMillis));
      if (Millis.elapsed(startMillis) < 10) {
         System.out.println("Ooooooo.... i'm not sure");
      } else if (Millis.elapsed(startMillis) > 500) {
         System.out.println("Mmmmmmm.... i don't know");
      }
   }
Given that CPU power is increasing every year, surely we need a dynamic solution where we can revise the parameters at a future date? So let's extend OWASP's SQL credential table to include the PBKDF2 parameters. 
   LOGIN VARCHAR (100) PRIMARY KEY,
   PASSWORD VARCHAR (32),
   SALT VARCHAR (32),
   PBKDF2_ITERATION_COUNT INTEGER,
   PBKDF2_KEY_SIZE INTEGER
This would facilitate rehashing passwords on-the-fly to higher parameters when a user logs in and their actual password is thus on hand. This will be presented in the upcoming sequel entitled "Password Rehash" :) 
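As a hypothetical sketch against such a row (column names as above), authentication would simply pass the stored parameters through to matches().
boolean authenticate(char[] password, ResultSet row)
      throws SQLException, GeneralSecurityException {
   byte[] hash = Base64.decode(row.getString("PASSWORD"));
   byte[] salt = Base64.decode(row.getString("SALT"));
   int iterationCount = row.getInt("PBKDF2_ITERATION_COUNT");
   int keySize = row.getInt("PBKDF2_KEY_SIZE");
   return Passwords.matches(password, hash, salt, iterationCount, keySize);
}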

Secret salt

According to the oft aforementioned OWASP Password Storage Cheat Sheet
An additional password storage defense mechanism involves storing the salt in a different location than the password hash.
Use of the server's filesystem is one commonly used mechanism for salt isolation, assuming the password hashes are stored in a different location such as a database.
Is this really necessary? Well then rather than using the filesystem for salt storage per se, it'd be simpler to encrypt the salt in the database, and store just the salt-encrypting key on the filesystem. Our app then uses this key to decrypt the salts retrieved from the database. Moreover, rather than employing a KeyStore file, it would be simpler to specify a password in our application configuration on the filesystem, for the purpose of password-based encryption of the salt. This will be presented in the finale of this sub-trilogy, entitled "Password Cipher." Then we'd have a salt hard-coded in our source code, for a password stored on the filesystem, for the encryption of the password salts stored in our database. All these things would have to be stolen to have a crack at our password hashes. Hopefully this will buy us just enough time to alert our users that, erm, their data is potentially amiss, and they should change their passwords rather urgently ;) 

Summary

Before hashing passwords, we must add salt, to protect against rainbow attacks and what-not. While SHA-2 is recommended these days for general hashing, we should use computationally expensive algorithms for password hashing, so that the passwords are harder to crack. We present an implementation using PBKDF2WithHmacSHA1, with a high number of iterations. We read that other recommended alternatives are bcrypt and scrypt. The hashing operation should take as long as we are willing to make the user wait, perhaps a few hundred milliseconds. We tweak the algorithm parameters accordingly. 

Sequels

In Password Rehash, we cater for multiple revisions of the number of iterations and key size in the database, and migrate hashes on the fly to the latest revision when the user logs in. Finally, in "Password Cipher," we'll encrypt the password salt in our database, using... password-based encryption :) 

Resources

https://github.com/evanx/vellum/wiki - see Passwords.java
evanx

Timestamped Millis Blog

Posted by evanx Nov 21, 2012
We present a minuscule Millis utility class for handling intervals, in milliseconds, not least because we record timestamps as per System.currentTimeMillis, i.e. the number of milliseconds since the Unix epoch. As such we can skirt around the issue of the time as seen on clocks, with their time zones and calendars and what-not.    

https://code.google.com/p/vellum/wiki/home

http://upload.wikimedia.org/wikipedia/commons/1/12/Gnome-appointment-new.svg Timestamped Millis: A part of "Timestamped: a trilogy in a few parts."
Isn't TimeUnit the best thing since sliced bread?! Indeed anything and everything from Doug Lea is always thus :) That said, for the purposes of some Timestamped thingies, we find ourselves cobbling together a Millis util class. 
public class Millis {
    
    public static long elapsedMillis(long startMillis) {
        return System.currentTimeMillis() - startMillis;
    }

    public static boolean isElapsed(long startMillis, long millis) {
        return (System.currentTimeMillis() - startMillis) > millis;
    }
    ...
This util class primarily deals with time intervals. We often express time intervals as a long value without an explicit time unit, which is then assumed to be milliseconds. Furthermore we often treat the timestamp of an event as the time interval since the Unix epoch, thanks to System.currentTimeMillis(). Time without its time zone, like measurement without units, will cause us problems at some stage. Remember what happened to that Mars orbiter when the units got mixed up? So we have to be careful with numbers without qualification. Hence Doug Lea's introduction of TimeUnit in his superlative java.util.concurrent package is sooo good, providing safety and convenience. 

Time and tide

While the "epochal time" is absolute, the time of an event as recorded on our clock and calendar is relative to theTimeZone for which that clock is configured. And the problem is there are so many clocks in the world, and so few of them have the same time! ;) System#currentTimeMillis javadocs says, 
See the description of the class Date for a discussion of slight discrepancies that may arise between "computer time" and coordinated universal time (UTC).
We will certainly do that, but not today! 

Time conversion

Of course we often want to convert to and from millis. 
 
    public static long toSeconds(long millis) {
        return millis/1000;
    }

    public static long toMinutes(long millis) {
        return millis/1000/60;
    }

    public static long toHours(long millis) {
        return millis/1000/60/60;
    }

    public static long toDays(long millis) {
        return millis/1000/60/60/24;
    }
    
    public static long fromSeconds(long seconds) {
        return seconds*1000;
    }

    public static long fromMinutes(long minutes) {
        return minutes*60*1000;
    }

    public static long fromHours(long hours) {
        return hours*60*60*1000;
    }
    
    public static long fromDays(long days) {
        return days*24*60*60*1000;
    }
Not exactly rocket science - but when rocket scientists get these wrong, their rockets tend to explode. Actually the above interval conversions are comprehensively and beautifully handled by TimeUnit, as the following rewrite illustrates. 
    public static long fromDays(long days) {
        return TimeUnit.DAYS.toMillis(days);
    }
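Presumably the to* conversions could delegate likewise, e.g. as a quick sketch:
    public static long toMinutes(long millis) {
        return TimeUnit.MILLISECONDS.toMinutes(millis);
    }

    public static long toDays(long millis) {
        return TimeUnit.MILLISECONDS.toDays(millis);
    }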

Format

We roll in a format method. 
       
    public static String format(long millis) {
        if (millis == 0) return "00:00:00,000";
        long hour = millis/Millis.fromHours(1);
        long minute = (millis % Millis.fromHours(1))/Millis.fromMinutes(1);
        long second = (millis % Millis.fromMinutes(1))/Millis.fromSeconds(1);
        long millisecond = millis % Millis.fromSeconds(1);
        return String.format("%02d:%02d:%02d,%03d", hour, minute, second, millisecond);        
    }
where this is used for logging and stuff, just to make millis time intervals more readable. Let's test. 
public class MillisTest {

    @Test
    public void testIntervalMillis() {
        Assert.assertEquals(Millis.format(1001), "00:00:01,001");
        Assert.assertEquals(Millis.format(60888), "00:01:00,888");
        Assert.assertEquals(Millis.format(3600999), "01:00:00,999");
    }    
But now let's do the wrong thing. 
    @Test
    public void breakingBad() {
        System.out.println(new SimpleDateFormat("HH:mm:ss").format(new Date(0)));
        System.out.println(new SimpleDateFormat("HH:mm:ss").format(new Date(System.currentTimeMillis())));
        System.out.println(Millis.format(System.currentTimeMillis() % Millis.fromDays(1)));
    }    
We see the folly of using SimpleDateFormat and pretending that a time interval is a time of day, on the day of the Epoch. When using Date you're putting your clock on the block. 
02:00:00
22:43:50
20:43:50,437
where my default time zone seems to be 2 hours ahead of Greenwich. Recall that at the moment the Unix epoch happened, way back around 1970-01-01, the time was actually 00:00:00 only up there in Greenwich. So for most of us, our clocks were not 00:00:00 at the time of the Epoch. Gutting. In fact for a good deal of us, it wasn't even 1970 yet - it was still 1969 - yeah baby! 

In parsing

We implement some parsing. 
    public static long parse(String string) {
        int index = string.indexOf(" ");
        if (index > 0) {
            return TimeUnit.valueOf(string.substring(index + 1)).toMillis(
                Long.parseLong(string.substring(0, index)));
        } else if (string.length() >= 2 &&
                Character.isLowerCase(string.charAt(string.length() - 1)) && 
                Character.isDigit(string.charAt(string.length() - 2))) {            
            long value = Long.parseLong(string.substring(0, string.length() - 1));    
            if (string.endsWith("d")) {
                return TimeUnit.DAYS.toMillis(value);
            } else if (string.endsWith("h")) {
                return TimeUnit.HOURS.toMillis(value);
            } else if (string.endsWith("m")) {
                return TimeUnit.MINUTES.toMillis(value);
            } else if (string.endsWith("s")) {
                return TimeUnit.SECONDS.toMillis(value);
            }
        }  
        throw new ParseRuntimeException(string);
    }
The test case below illustrates the functionality of this method. 
    @Test
    public void testParse() {
        Assert.assertEquals(Millis.parse("1 SECONDS"), 1000);
        Assert.assertEquals(Millis.parse("1m"), 60000);
        Assert.assertEquals(Millis.parse("60m"), 3600000);
        Assert.assertEquals(Millis.parse("60m"), Millis.parse("1h"));
        Assert.assertEquals(Millis.parse("24h"), Millis.parse("1d"));
    }
So this is used for configuration settings of time intervals e.g. in our app's context.xml
  <parameter name="interval" value="45s"/>
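For example, a hypothetical servlet might read that parameter and schedule a task accordingly, where servletContext, executorService and task are assumed to be at hand:
    long intervalMillis = Millis.parse(servletContext.getInitParameter("interval")); // "45s" yields 45000
    executorService.scheduleAtFixedRate(task, 0, intervalMillis, TimeUnit.MILLISECONDS);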

Conclusion

In this Timestamped series, we'll sometimes make use of a Millis convenience class which is presented here. This class is a utility for time intervals, rather than "time" per se. We express time intervals in milliseconds, and timestamps as the time interval since the Unix epoch, in order to steer clear of time zones (and Date and Calendar), for now. 

Further treatment

At some stage we should gloss over that unrepentant Date, crucial TimeZone, cardinal UTC, stupendous Calendar, and serendipitous JSR 310. 

Resources

https://github.com/evanx/vellum/wiki - see the Millis class. @evanxsummers

The Google Authenticator mobile apps implement an IETF time-based one-time-password standard. This hashes the time, with a shared secret using the HMAC-SHA1 algorithm, to generate a one-time password. But besides enabling multi-factor authentication for our personal Google account, how would we employ it for multi-factor authentication on websites that we build?

Prequels

http://upload.wikimedia.org/wikipedia/commons/5/55/Gnome-security-medium.svg Google Authenticator: Part of "The Enigma Posts" series

https://github.com/evanx/vellum Probably you've heard of this Google Authenticator thing? So let's install this app on our phone, perhaps add the Chrome extension to our desktop browser, and change our Google account to require 2-step authentication on https://www.google.com/settings/security.

What is this Google Authenticator?

The Google Authenticator is a client-side implementation of a "time-based one-time password algorithm" (TOTP), in particular IETF RFC6238. Each account we configure on our Google Authenticator has a stored secret, shared with some web account. The app displays the time-based code for each secret, which changes every 30 seconds. This code is computed from the number of 30 second intervals since the UNIX time epoch, hashed with that shared secret using the HMAC-SHA1 algorithm. Sooo simple! :) We see on http://code.google.com/p/google-authenticator that it also supports the counter-based HMAC-Based One-time Password (HOTP) algorithm specified in RFC 4226, which we ignore here and leave for another article perhaps. So the question arises, can we support Google Authenticator clients for multi-factor authentication on websites that we build? Let's explore what this would entail...

Random secret in Base32

For each user account, we generate a random secret for the Google Authenticator client.

 void test() {
    byte[] buffer = new byte[10];
    new SecureRandom().nextBytes(buffer);
    String secret = new String(new Base32().encode(buffer));
    System.out.println("secret " + secret);
 }

where the secret is a 10-byte random number, which is encoded using Base32 (RFC 3548). This produces a string that is 16 characters long (10 bytes = 80 bits = 16 five-bit Base32 characters), in particular the characters A-Z and 2-7.

 secret OVEK7TIJ3A3DM3M6 

The user creates a new account on their Google Authenticator app, entering this secret. For entering codes into mobiles, we find that mixing alpha and numeric is not uber-convenient, and so one might reorder that secret e.g. OVEKTIJADMM73336, albeit thereby losing some of its randomness. But take any of my suggestions with a pinch of salt. Heh heh, a crypto pun!

QR code

Alternatively, one can generate a QR barcode e.g. using the Google Chart API service, for the user to scan into their Google Authenticator app.

 String secret = "OVEK7TIJ3A3DM3M6"; String user = "evanx"; String host = "beethoven"; void test() throws Exception { System.out.println(getQRBarcodeOtpAuthURL(user, host, secret)); System.out.println(Strings.decodeUrl(getQRBarcodeURLQuery(user, host, secret))); System.out.println(getQRBarcodeURL(user, host, secret)); } public static String getQRBarcodeURL(String user, String host, String secret) { return "https://chart.googleapis.com/chart?" + getQRBarcodeURLQuery(user, host, secret); } public static String getQRBarcodeURLQuery(String user, String host, String secret) { return "chs=200x200&chld=M%7C0&cht=qr&chl=" + Strings.encodeUrl(getQRBarcodeOtpAuthURL(user, host, secret)); } public static String getQRBarcodeOtpAuthURL(String user, String host, String secret) { return String.format("otpauth://totp/%s@%s&secret=%s", user, host, secret); } 

where the QR code encodes the following URL.

 otpauth://totp/evanx@beethoven?secret=OVEK7TIJ3A3DM3M6 

The QR code can be rendered using the following Google Chart request.

 https://chart.googleapis.com/chart?chs=200x200&chld=M%7C0&cht=qr&chl= otpauth%3A%2F%2Ftotp%2Fevanx%40beethoven%26secret%3DOVEK7TIJ3A3DM3M6 

Just for clarity, the following shows the decoded URL query.

 chs=200x200&chld=M|0&cht=qr&chl=otpauth://totp/evanx@beethoven&secret=OVEK7TIJ3A3DM3M6 

So we use this Google Chart service to render the QR code onto our computer screen so we can scan it into our phone's Google Authenticator app. Otherwise we have to be poking those tiny buttons with our fat fingers again. https://chart.googleapis.com/chart?chs=200x200&chld=M%7C0&cht=qr&chl=otpauth%3A%2F%2Ftotp%2Fevanx@beethoven%3Fsecret%3DOVEK7TIJ3A3DM3M6 You can right-click on the above link, and scan the barcode into your Google Authenticator, to test it. Wow, it works! So cool... :) You might get prompted to install a barcode scanner first, which is so easy ;) Having scanned this into our Google Authenticator app, it will display the time-varying code for this account, together with our other accounts. Woohoo!

Authentication

So when users login to our site, they enter the current 6-digit OTP code displayed on their phone for their account on our website, where this OTP changes every 30 seconds. To authenticate this on the server-side, first we get the current time index i.e. the number of 30s intervals since the UNIX time epoch.

 public static long getTimeIndex() {
    return System.currentTimeMillis()/1000/30;
 }

So far, soooo easy :) We can then calculate the authentication code, for this time interval, using the user's secret.

private static long getCode(byte[] secret, long timeIndex)
      throws NoSuchAlgorithmException, InvalidKeyException {
   SecretKeySpec signKey = new SecretKeySpec(secret, "HmacSHA1");
   ByteBuffer buffer = ByteBuffer.allocate(8);
   buffer.putLong(timeIndex);
   byte[] timeBytes = buffer.array();
   Mac mac = Mac.getInstance("HmacSHA1");
   mac.init(signKey);
   byte[] hash = mac.doFinal(timeBytes);
   int offset = hash[19] & 0xf;
   long truncatedHash = hash[offset] & 0x7f;
   for (int i = 1; i < 4; i++) {
      truncatedHash <<= 8;
      truncatedHash |= hash[offset + i] & 0xff;
   }
   return truncatedHash % 1000000;
}
where...
  • The time index is put into an 8-byte array.
  • We use HmacSHA1to hash this array into 20 bytes.
  • The first nibble (4 bits) of the last byte is taken as an offset (from 0 to 15) into the 20-byte array.
  • Four bytes from the offset are then extracted, with the highest-order bit zero'ed. So at this stage, I guess we have an unsigned zero-based 31-bit number with a maximum value of 2^31 minus 1, i.e. Integer.MAX_VALUE, i.e. 2,147,483,647.
  • We take the lower 6 decimal digits as our one-time password. Voila!
Now the only problem is that the clocks might be out of whack by a minute or two (or indeed on our workstation which we are using as a test server). So we need to check codes for a few time indexes before and after the supposed time. Problem solved! Hopefully it goes without saying that servers always have the correct time thanks to NTP, although you'd be surprised... ;)
public static boolean verifyCode(String secret, int code, long timeIndex, int variance)
      throws Exception {
   byte[] secretBytes = new Base32().decode(secret);
   for (int i = -variance; i <= variance; i++) {
      if (getCode(secretBytes, timeIndex + i) == code) {
         return true;
      }
   }
   return false;
}
So let's test this!
long testTimeIndex = 45064605;
int testCode = 111070;

void test() throws Exception {
   System.out.println("time: " + getTimeIndex());
   System.out.println("code: " + getCode(secret, getTimeIndex()));
   System.out.println("codes: " + getCodeList(secret, getTimeIndex(), 5));
   System.out.println("verify: " + verifyCode(secret, testCode, testTimeIndex, 5));
}

static long getCode(String secret, long timeIndex)
      throws NoSuchAlgorithmException, InvalidKeyException {
   return getCode(new Base32().decode(secret), timeIndex);
}

static List getCodeList(String secret, long timeIndex, int variance)
      throws NoSuchAlgorithmException, InvalidKeyException {
   List list = new ArrayList();
   for (int i = -variance; i <= variance; i++) {
      list.add(getCode(secret, timeIndex + i));
   }
   return list;
}
The following output is observed.
 time: 45076085
 code: 766710
 codes: [192262, 720538, 629431, 92289, 937348, 766710, 74053, 425245, 738189, 469760, 486815]
 verify: true 
We can add the GAuth Chrome extension to our browser to compare, and of course the mobile app :)

Multi-factor authentication

According to http://en.wikipedia.org/wiki/Multi-factor_authentication,
Existing authentication methodologies involve three basic "factors":
  • Something the user knows (e.g., password, PIN);
  • Something the user has (e.g., ATM card, smart card); and
  • Something the user is (e.g., biometric characteristic, such as a fingerprint).
and
Authentication methods that depend on more than one factor are more difficult to compromise than single-factor methods.
Clearly the only way to generate the correct OTP code is by having the secret key, so then if the user has entered the correct password (as the thing they "know"), as well as the correct OTP (proving that they "have" the secret), then two-factor authentication is "thus enabled" :) Finally, the question arises how to handle the event of a user losing their phone (and OTP secret)? So I guess in addition to a "Forgot Password" facility, one needs a "Lost phone" button ;) That is, we need to enable users to reset their OTP secret, without that becoming the weak link. We'll address this another day ;)

Conclusion

The Google Authenticator mobile apps implement the IETF RFC6238 time-based one-time-password standard. This hashes the time since the epoch with a shared secret using the HMAC-SHA1 algorithm. Besides enabling multi-factor authentication for our personal Google account, we can easily employ Google Authenticator for multi-factor authentication on our websites. We hash the number of 30s time intervals since the epoch with a secret using the HMAC-SHA1 algorithm. A slight complication is that the clocks might be a bit out of sync, so we allow some leniency for that.

Coming up

 We should investigate the counter-based HOTP, since we might be missing a trick there?!

Resources

 https://github.com/evanx/vellum/wiki - see the vellumdemo.totp package.

Link list

 http://en.wikipedia.org/wiki/Multi-factor_authentication
 http://en.wikipedia.org/wiki/Hash-based_message_authentication_code
 http://en.wikipedia.org/wiki/Time-based_One-time_Password_Algorithm
 http://en.wikipedia.org/wiki/HOTP
 http://tools.ietf.org/html/rfc6238
 http://tools.ietf.org/html/rfc4226
 http://code.google.com/p/google-authenticator
evanx

Deque shredder Blog

Posted by evanx Aug 1, 2012

Last time we introduced the trivial namesake Timestamped interface, and used the excellent ArrayDeque of Java6 to collect such things, imposing a time-based capacity and some external synchronization. Now let's test this with some threads.

http://upload.wikimedia.org/wikipedia/commons/1/12/Gnome-appointment-new.svg Deque shredder: A part of "Timestamped: a trilogy in a few parts."

In case you missed the update to our previous installment, our generic TimestampedElement now has a compareTo() for natural ordering by the timestamp.

 public class TimestampedElement<T> implements Timestamped, Comparable<Timestamped> {
    T element;
    long timestamp;

    public TimestampedElement(T element, long timestamp) {
       this.element = element;
       this.timestamp = timestamp;
    }

    public T getElement() {
       return element;
    }

    @Override
    public long getTimestamp() {
       return timestamp;
    }

    @Override
    public int compareTo(Timestamped other) {
       if (timestamp < other.getTimestamp()) return -1;
       if (timestamp > other.getTimestamp()) return 1;
       else return 0;
    }
 }

For completeness herewithin, our deque collector for our Timestamped thingies follows.

 public class TimestampedDequer<T extends Timestamped> {
    long capacityMillis;
    long lastTimestamp;
    ArrayDeque<T> deque = new ArrayDeque();

    public TimestampedDequer(long capacityMillis) {
       this.capacityMillis = capacityMillis;
    }

    public synchronized void addLast(T element) {
       if (element.getTimestamp() == 0 || element.getTimestamp() < lastTimestamp) {
          deque.clear(); // throw our toys out the cot exception
       } else {
          lastTimestamp = element.getTimestamp();
          prune(lastTimestamp);
          deque.addLast(element);
       }
    }

    private void prune(long latestTimestamp) {
       while (deque.size() > 0 &&
             deque.getFirst().getTimestamp() <= latestTimestamp - capacityMillis) {
          deque.removeFirst();
       }
    }

    public synchronized Deque<T> snapshot(long lastTimestamp) {
       prune(lastTimestamp);
       return deque.clone();
    }

    public synchronized Deque<T> tail(int size) {
       Deque<T> tail = new ArrayDeque();
       Iterator<T> it = deque.descendingIterator();
       for (int i = 0; i < size && it.hasNext(); i++) {
          tail.addFirst(it.next());
       }
       return tail;
    }
 }

where we use the efficient ArrayDeque implementation of Java6. As discussed last time, we remove expired elements from the head when we add the latest element to the tail, to make it self-pruning. And we provide a synchronized snapshot() and tail() for a couple of use-cases as follows... Decisively, we will use snapshot() to analyse the latest records for the desired interval e.g. for an automated status check every minute. Furthermore, we will use a size-based tail() e.g. to display the latest so-many records in a status web page, for informational purposes. Now let's do us some "heavy-dropping" with threads, using a ScheduledExecutorService to regularly add records to the deque.

 public class TimestampedDequerTest {
    long capacityMillis = 90;
    long scheduledInterval = 10;
    long scheduledDelay = 0;
    final TimestampedDequer<TimestampedElement<String>> dequer =
          new TimestampedDequer(capacityMillis);
    boolean verbose = false;

    Runnable scheduledRunnable = new Runnable() {
       @Override
       public void run() {
          addLast();
       }
    };

    private void addLast() {
       long timestamp = System.currentTimeMillis();
       String value = "record at " + timestamp;
       dequer.addLast(new TimestampedElement(value, timestamp));
       if (verbose) {
          System.out.println(value);
       }
    }

    @Test
    public void testConcurrently() throws Exception {
       ScheduledExecutorService executorService = Executors.newScheduledThreadPool(10);
       ScheduledFuture future = executorService.scheduleAtFixedRate(
             scheduledRunnable, scheduledDelay, scheduledInterval, TimeUnit.MILLISECONDS);
       checkConcurrently();
       checkConcurrently();
       future.cancel(true);
    }

where we check twice, just to make sure! ;) Actually we want to make sure of the prune()'ing, following the sleep for capacityMillis.

private void checkConcurrently() throws Exception {
   long startMillis = System.currentTimeMillis();
   System.out.println("startMillis " + startMillis);
   verbose = true;
   Thread.sleep(capacityMillis);
   int expectedCapacity = (int) (capacityMillis / scheduledInterval);
   verbose = false;
   long stopMillis = System.currentTimeMillis();
   System.out.println("stopMillis " + stopMillis);
   Deque<TimestampedElement<String>> deque = dequer.snapshot(stopMillis);
   long firstTimestamp = deque.getFirst().getTimestamp();
   long lastTimestamp = deque.getLast().getTimestamp();
   System.out.println("size " + deque.size());
   System.out.println("first " + firstTimestamp);
   System.out.println("last " + lastTimestamp);
   Assert.assertTrue("first time", firstTimestamp >= startMillis);
   Assert.assertTrue("last time", lastTimestamp >= firstTimestamp);
   Assert.assertTrue("capacityMillis min", lastTimestamp - firstTimestamp >= 0);
   Assert.assertTrue("capacityMillis max", lastTimestamp - firstTimestamp <= capacityMillis);
   Assert.assertTrue("size min", deque.size() > 0);
   Assert.assertTrue("size max", deque.size() <= expectedCapacity);
}
which prints...
 scheduledInterval 10
 record at 1340231378158
 record at 1340231378168
 record at 1340231378178
 ...
 record at 1340231378228
 record at 1340231378238
 startMillis 1340231378157
 stopMillis 1340231378247
 size 9
 first 1340231378158
 last 1340231378238
 ...
 startMillis 1340231378249
 stopMillis 1340231378339
 size 9
 first 1340231378258
 last 1340231378338 
We survey this output, eyeing the timestamps, and nod ponderously. Just for good measure, we add the records to a SortedSet, and check that the first and last timestamps match.
 private void checkSet(Deque<TimestampedElement<String>> deque) throws Exception {
    SortedSet<TimestampedElement<String>> set = new TreeSet();
    set.addAll(deque);
    Assert.assertEquals("size", deque.size(), set.size());
    Assert.assertEquals("first", deque.getFirst().getTimestamp(), set.first().getTimestamp());
    Assert.assertEquals("last", deque.getLast().getTimestamp(), set.last().getTimestamp());
 }
where our TimestampedElement's compareTo() method enables the natural ordering, and forgoes having to use our TimestampedComparator to construct the SortedSet. Let's vary the scheduledInterval.
 @Test
 public void testScheduledIntervals() throws Exception {
    while (--scheduledInterval > 0) {
       ScheduledFuture future = Executors.newScheduledThreadPool(10).scheduleAtFixedRate(
             scheduledRunnable, scheduledDelay, scheduledInterval, TimeUnit.MILLISECONDS);
       Thread.sleep(capacityMillis);
       int expectedCapacity = (int) (capacityMillis / scheduledInterval);
       long stopMillis = System.currentTimeMillis();
       Deque deque = dequer.snapshot(stopMillis);
       Woohoo.assertEquals("interval " + scheduledInterval, expectedCapacity, deque.size());
       future.cancel(true);
       Thread.sleep(scheduledInterval);
    }
 }
where we loop the scheduledInterval down to 1ms.
 interval 9: Woohoo! 10 == 10
 interval 8: Woohoo! 11 == 11
 interval 7: Woohoo! 12 == 12
 interval 6: D'oh! 15 != 14
 interval 5: D'oh! 18 != 17
 interval 4: Woohoo! 22 == 22
 interval 3: D'oh! 30 != 29
 interval 2: D'oh! 45 != 44
 interval 1: D'oh! 90 != 89 
Given how unpredictable time is, ironically, with those threads and what-not, we can't exactly predict the size of the list. D'oh! So for that we have used the following util class to see if the size is more or less what we expect...
 public class Woohoo {
    public static void assertEquals(String message, Object expected, Object actual) {
       if (actual.equals(expected)) {
          System.out.printf("%s: Woohoo! %s == %s\n", message, expected, actual);
       } else {
          System.out.printf("%s: D'oh! %s != %s\n", message, expected, actual);
       }
    }
 }
Selectively using the above drop-in replacement for Assert, we get our tests to pass 100%, woohoo! ;) To further increase the intensity, we allow the scheduled threads to coalesce.
 @Test
 public void testCumulativeScheduledIntervals() throws Exception {
    Deque<ScheduledFuture> futures = new ArrayDeque();
    int expectedCapacity = 0;
    while (--scheduledInterval > 0) {
       expectedCapacity += (int) (capacityMillis / scheduledInterval);
       ScheduledFuture future = Executors.newScheduledThreadPool(100).scheduleAtFixedRate(
             scheduledRunnable, scheduledDelay, scheduledInterval, TimeUnit.MILLISECONDS);
       futures.add(future);
       Threads.sleep(capacityMillis);
       long stopMillis = System.currentTimeMillis();
       Deque deque = dequer.snapshot(stopMillis);
       Woohoo.assertEquals("interval " + scheduledInterval, expectedCapacity, deque.size());
       Threads.sleep(scheduledInterval);
    }
    for (ScheduledFuture future : futures) {
       future.cancel(true);
    }
 }
where we only cancel all the scheduled threads when all is said and done. As expected, we get some D'oh's and Woohoo's!
 interval 9: Woohoo! 10 == 10
 interval 8: Woohoo! 21 == 21
 interval 7: D'oh! 33 != 34
 interval 6: Woohoo! 48 == 48
 interval 5: Woohoo! 66 == 66
 interval 4: D'oh! 88 != 89
 interval 3: Woohoo! 118 == 118
 interval 2: D'oh! 163 != 165
 interval 1: D'oh! 253 != 254 
Let's wrap this up by asserting that our Deque collector is reasonably robust!? Next time, we'll use our timestamped deque to capture log4j records, and as alluded to previously, regularly analyse the latest deque of logs to detect when our app's wheels are wobbling or even coming off altogether, and notify ourselves thereof.

Resources

 https://github.com/evanx/vellum/wiki
evanx

Timestamped Deque Blog

Posted by evanx Jun 27, 2012

In this here second part of the "Timestamped" series, we introduce its namesake Timestamped interface, and use a Deque from Java6 to collect such things, and impose a time-based capacity for that.

Ultimately we gonna hook up a remote Log4j appender, to digest logs in order to monitor our app. What fun! But lemme not get ahead of myself.

Prequels

Incidentally we kicked off this no-hit wonder series with Counter Map, which does not feature further yet, so ignore that for now.

http://upload.wikimedia.org/wikipedia/commons/1/12/Gnome-appointment-new.svg Timestamped Deque: A part of "Timestamped: a trilogy in a few parts."

https://code.google.com/p/vellum/wiki/home
Without further ado, I give you the namesake interface of this series.

 public interface Timestamped {
    public long getTimestamp();
 }

which returns the timestamp in "millis" a la System.currentTimeMillis(). Also, here's an adapter for Log4j's LoggingEvent.

 public class TimestampedLoggingEventAdapter implements Timestamped {
    LoggingEvent loggingEvent;

    public TimestampedLoggingEventAdapter(LoggingEvent loggingEvent) {
       this.loggingEvent = loggingEvent;
    }

    @Override
    public long getTimestamp() {
       return loggingEvent.getTimeStamp();
    }
 }

And a generic wrapped element.

 public class TimestampedElement<T> implements Timestamped, Comparable<Timestamped> {
    T element;
    long timestamp;

    public TimestampedElement(T element, long timestamp) {
       this.element = element;
       this.timestamp = timestamp;
    }

    public T getElement() {
       return element;
    }

    @Override
    public long getTimestamp() {
       return timestamp;
    }

    @Override
    public int compareTo(Timestamped other) {
       if (timestamp < other.getTimestamp()) return -1;
       if (timestamp > other.getTimestamp()) return 1;
       else return 0;
    }
 }

where we implement compareTo() for natural ordering by the timestamp. Since duplicate timestamps are possible i.e. where two or more events occur at the same millisecond, and indeed duplicate log messages at the same time, we forgo implementing hashCode() and equals(). Imagine we add such elements to a Set, whose javadoc describes it thus:

Set - A collection that contains no duplicate elements. More formally, sets contain no pair of elements e1 and e2 such that e1.equals(e2).

Therefore we defer to the default hashCode() and equals() from Object, which are based on object identity. We might construct a SortedSet of Timestamped elements.

SortedSet - The elements are ordered using their natural ordering, or by a Comparator typically provided at sorted set creation time.

So if we have not implemented compareTo(), then we will need a comparator.

 public class TimestampedComparator implements Comparator<Timestamped> {
    @Override
    public int compare(Timestamped o1, Timestamped o2) {
       if (o1.getTimestamp() < o2.getTimestamp()) return -1;
       if (o1.getTimestamp() > o2.getTimestamp()) return 1;
       else return 0;
    }
 }

Our inclination might be to collect Timestamped elements in a List, or a Queue perhaps.

Queue - A collection designed for holding elements prior to processing.

That sounds rather appropriate for our digestive purposes, to find the ghost in the machine.

Deque collector

So let's introduce the namesake of this article, a collector of timestamped thingies - a circular buffer, some might call it - and impose a time-based capacity thereupon. So we use the java.util.Deque found in Java 1.6, courtesy of those most excellent gentlemen, Doug Lea and Josh Bloch. Its javadoc describes it thus:

Deque - A linear collection that supports element insertion and removal at both ends. The name deque is short for "double ended queue" and is usually pronounced "deck."

We use the efficient ArrayDeque implementation.

ArrayDeque - This class is likely to be faster than Stack when used as a stack, and faster than LinkedList when used as a queue.

Fantastic. Let's get us some of that.

 public class TimestampedDequer<T extends Timestamped> {
    long capacityMillis;
    long lastTimestamp;
    ArrayDeque<T> deque = new ArrayDeque();

    public TimestampedDequer(long capacityMillis) {
       this.capacityMillis = capacityMillis;
    }

    public synchronized void addLast(T element) {
       if (element.getTimestamp() == 0 || element.getTimestamp() < lastTimestamp) {
          deque.clear(); // throw our toys out the cot exception
       } else {
          lastTimestamp = element.getTimestamp();
          prune(lastTimestamp);
          deque.addLast(element);
       }
    }

    private void prune(long lastTimestamp) {
       while (deque.size() > 0 &&
             deque.getFirst().getTimestamp() <= lastTimestamp - capacityMillis) {
          deque.removeFirst();
       }
    }
    ...

 where we compose an ArrayDeque and synchronize it "externally" for our purposes, considering that it will be digesting log records continually, whilst being under interrogation by RMX, HTTP requests and what-not. When we add an element onto the tail, we prune() expired elements from the head. Observe that the above implementation assumes that elements are added in chronological order. However we expect our host's time to be adjusted by NTP occasionally - hence we clear() i.e. when we encounter an eventuality that we don't play nicely with. If we are digesting logs from multiple servers or what-have-you, the above so-called "dequer" aint gonna work, baby - it's gonna come up empty, baby. Don't shuffle this deque, baby. (As Elvis might have said.) We'll deal with such a handful another time. Now, in order to analyse the contents for the desired interval, we take a snapshot().
 public synchronized Deque<T> snapshot(long lastTimestamp) {
    prune(lastTimestamp);
    return deque.clone();
 }
which returns a defensive copy, and so is a relatively costly operation. Perhaps you could recommend an alternative strategy? Perhaps we could implement a special concurrent deque implementation in a future episode, as an exercise? Taking inspiration from that Disruptor thingymajig, perchance, as well as ArrayDeque itself? One that supports aggregating from multiple servers, methinks. Another use-case is to get the tail i.e. the latest so-many elements, for informational purposes e.g. to display via a servlet, or attach to an email alert.
 public synchronized Deque<T> tail(int size) {
    Deque<T> tail = new ArrayDeque();
    Iterator<T> it = deque.descendingIterator();
    for (int i = 0; i < size && it.hasNext(); i++) {
       tail.addFirst(it.next());
    }
    return tail;
 }
where we use descendingIterator() to read from the tail of the deque, and addFirst() to rectify the order. Let's test this thing.
 public class TimestampedDequerTest {
    long capacityMillis = 90;
    TimestampedDequer dequer = new TimestampedDequer(capacityMillis);
    boolean verbose = false;

    private void addLast() {
       long timestamp = System.currentTimeMillis();
       String value = "record at " + timestamp;
       dequer.addLast(new TimestampedElement(value, timestamp));
       if (verbose) {
          System.out.println(value);
       }
    }

    @Test
    public void test() throws Exception {
       check();
       check();
    }
where we check() twice... just to make sure. Of prune(), that is.
 private void check() throws Exception {
    Thread.sleep(capacityMillis);
    Assert.assertEquals(0, dequer.snapshot(System.currentTimeMillis()).size());
    Assert.assertEquals(0, dequer.tail(4).size());
    addLast();
    Assert.assertEquals(1, dequer.tail(4).size());
    Assert.assertEquals(1, dequer.snapshot(System.currentTimeMillis()).size());
    Thread.sleep(capacityMillis/2);
    Assert.assertEquals(1, dequer.snapshot(System.currentTimeMillis()).size());
    Assert.assertEquals(1, dequer.tail(4).size());
    addLast();
    Assert.assertEquals(2, dequer.tail(4).size());
    Assert.assertEquals(2, dequer.snapshot(System.currentTimeMillis()).size());
    Thread.sleep(capacityMillis/2);
    Assert.assertEquals(2, dequer.tail(4).size());
    Assert.assertEquals(1, dequer.snapshot(System.currentTimeMillis()).size());
    Assert.assertEquals(1, dequer.tail(4).size());
 }

 where we expect the final snapshot() to lose an element to prune()'ing, given the two half-capacityMillis sleeps since the first addLast(). Considering that the purpose of this Timestamped series is reducing information overload, we'll tail off here for now, and leave the "heavy-dropping" for next week, namely, testing with threads. Thereafter, we'll see about using our TimestampedDequer to analyse the latest deque of logs e.g. every minute, to detect when our app might be coming down like a house of cards.

Resources

https://github.com/evanx/vellum/wiki

Credits

 Thanks to my colleague Zach Visagie at BizSwitch.net, for his kind reviews and indispensable input!
evanx

Counter map Blog

Posted by evanx Jun 7, 2012

We herewith begin a saga about monitoring with this series entitled "Timestamped: a trilogy in a few parts," this being the first part, where we introduce a map to count key things, and ensure we can order it by its integer values.

http://upload.wikimedia.org/wikipedia/commons/1/12/Gnome-appointment-new.svg Counter map: A part of "Timestamped: a trilogy in a few parts."

We will be analysing logs in this unwinding series. Ultimately we gonna hook up a remote Log4j appender to digest our logs in order to gather stats, and make some judgement calls as to the rapidly changing status of our app.

As you can imagine we will need to count so many key things like the number of successes and failures of one thing or another. For this purpose, we introduce the following so-called "counter map."

 public class IntegerCounterMap<K> extends TreeMap<K, Integer> {
    private int totalCount = 0;

    public int getInt(K key, int defaultValue) {
       if (!containsKey(key)) {
          return defaultValue;
       } else {
          return get(key);
       }
    }

    public int getInt(K key) {
       return getInt(key, 0);
    }

    public void add(K key, int augend) {
       totalCount += augend;
       put(key, new Integer(getInt(key) + augend));
    }

    public void increment(K key) {
       add(key, 1);
    }

    public int getTotalCount() {
       return totalCount;
    }

    public int calculateTotalCount() {
       int total = 0;
       for (K key : keySet()) {
          total += getInt(key);
       }
       return total;
    }

    public LinkedList<K> descendingValueKeys() {
       return Maps.descendingValueKeys(this);
    }
 }

where since we are often counting things one by one e.g. as we digest log messages, we provide that increment() convenience method. Our typical key type is String e.g. the name of something we are counting e.g. "ERROR".

 @Test
 public void testCounterMap() {
    IntegerCounterMap<String> counterMap = new IntegerCounterMap();
    counterMap.increment("INFO");
    counterMap.increment("ERROR");
    counterMap.increment("ERROR");
    Assert.assertEquals(100*counterMap.getInt("ERROR")/counterMap.getTotalCount(), 66);
    Assert.assertEquals(counterMap.getInt("WARN"), 0);
    Assert.assertEquals(counterMap.getInt("INFO"), 1);
    Assert.assertEquals(counterMap.getInt("ERROR"), 2);
    Assert.assertEquals(counterMap.size(), 2);
    Assert.assertEquals(counterMap.getTotalCount(), 3);
    Assert.assertEquals(counterMap.getTotalCount(), counterMap.calculateTotalCount());
 }

where getTotalCount() is used to get a percentage of the total e.g. a 66% error rate, D'oh!

Note that since WARN is not incremented like the others, it's not put into the map, and so the size of the map is only two, namely the keys INFO and ERROR.

We have a descendingValueKeys() method for when we wanna display counters in descending order, to see the biggest numbers and/or worst culprits. This delegates to the following util class.

 public class Maps {

    public static <K, V extends Comparable<V>> LinkedList<K> descendingValueKeys(Map<K, V> map) {
       return keyLinkedList(descendingValueEntrySet(map));
    }

    public static <K, V extends Comparable<V>> NavigableSet<Entry<K, V>> descendingValueEntrySet(Map<K, V> map) {
       TreeSet<Entry<K, V>> set = new TreeSet<Entry<K, V>>(new Comparator<Entry<K, V>>() {
          @Override
          public int compare(Entry<K, V> o1, Entry<K, V> o2) {
             return o2.getValue().compareTo(o1.getValue());
          }
       });
       set.addAll(map.entrySet());
       return set;
    }

    public static <K, V> LinkedList<K> keyLinkedList(NavigableSet<Entry<K, V>> entrySet) {
       LinkedList<K> keyList = new LinkedList<K>();
       for (Map.Entry<K, V> entry : entrySet) {
          keyList.add(entry.getKey());
       }
       return keyList;
    }

    public static <K, V extends Comparable<V>> V getMinimumValue(Map<K, V> map) {
       return getMinimumValueEntry(map).getValue();
    }
    ...
 }

where courtesy of a TreeSet with a value comparator, we sort the map's entries by value, and put the thus ordered keys into a LinkedList to iterate over e.g. in a foreach loop.

See also stackoverflow.com which helped me with that.

Let's test this ordering.

 @Test
 public void testDescendingMap() {
    IntegerCounterMap<String> counterMap = new IntegerCounterMap();
    counterMap.add("BWARN", 0);
    counterMap.add("DEBUG", 1000000);
    counterMap.add("ERROR", 5000);
    counterMap.add("INFO", 1);
    Assert.assertEquals(counterMap.size(), 4);
    Assert.assertEquals(counterMap.descendingValueKeys().getFirst(), "DEBUG");
    Assert.assertEquals(counterMap.descendingValueKeys().getLast(), "BWARN");
    Assert.assertEquals(Maps.getMinimumValue(counterMap).intValue(), 0);
    Assert.assertEquals(Maps.getMaximumValue(counterMap).intValue(), 1000000);
    for (String key : Maps.descendingValueKeys(counterMap)) {
       System.out.printf("%d %s\n", counterMap.get(key), key);
    }
 }

where we use "BWARN" instead of "WARN" to be sure we aren't picking up the natural ordering. (Yes, that was hiding a bug that bit me in the bum!)

After some DEBUG'ing (with the help of a lot of logging), we eventually get what we would have expected...

 1000000 DEBUG
 1001 ERROR
 1 INFO
 0 BWARN 

As so often seems to be the case, we have a million DEBUG messages, a thousand and one ERRORs, with very little INFO to go on, and no bloody warning! "Put that in your internet, put that in your twitter right there."

Resources

https://code.google.com/p/vellum/ - where i will collate these articles and their code.

Coming soon

In the next installment, we'll properly introduce the namesake of this series, a so-called Timestamped interface, for our "time series" or what-have-you.

 public interface Timestamped {
    public long getTimestampMillis();
 }
evanx

The Prodigal Android Blog

Posted by evanx Aug 18, 2010

I enjoyed reading the Android=Java blog just now, and agree with Osvaldo's sentiments.

Android is clearly a Java knock-off - using javac and Apache class libraries, but implementing a new VM designed for better performance in the resource-constrained environments of smartphones - hence the dex class file format. Otherwise they could have adopted JamVM. What approach would the reader take in their situation?

I thought that the Java language was an open standard, allowing clean-room implementations of the compiler, runtime libraries and VM? And if your implementation doesn't pass the TCK, then sorry you can't call it Java(tm), but good luck to you and yours.

Compared to what Microsoft did with .NET, Google took a very sensible route by not developing a new java'esque language and toolchain - but rather leveraging existing Java skills and familiar tools to hasten adoption. This is better for us Java developers than an unfamiliar language, IDE etc?

Isn't the problem that Sun hasn't really innovated in the smartphone space, as Google has done so impressively with Android? While Google was leveraging opensource Linux and Java goodies, and building a better VM for smartphones, ironically Sun was off building a new language and toolset (JavaFX).

The Java trademark and promise of compatibility is great, but Android is great too apparently - according to developers and the market.

In fairness, I don't see why Oracle should win this lawsuit, because there's no copyright infringement (in Apache Harmony), no infringement of the Java trademark (it's Android not JavaTM), and software patents are (as always) ridiculous. So maybe Google is taking this opportunity to stand up against software patents? Well in my book, fighting software patents is always to be lauded...

evanx

Betwixt the Brackets Blog

Posted by evanx Jul 15, 2010

So we need to convert objects into XML and back again, e.g. to store some data in the database in XML format, because otherwise maybe we just gonna have too many tables and joins and what-not.

Actually this is for a PNR, which is "Passenger Name Record" or something of that nature, as used in the travel industry for any booking e.g. flights but also car hire, hotels, cruises etc. The other columns in the PNR table will be search terms for an agent to find a PNR, but otherwise all the gazillion details that passengers seem to require are stuffed into the XML, from meal preferences, seat preferences, who's got whose baby on their lap, and when was this baby born, not to mention various comments and remarks communicated betwixt the airline and the agents via this PNR.

Having used XStream before - which I must say is really great - I thought, let's try getting in and amongst Betwixt, not least because most of our third-party jars are from the Apache behemoth.

For a start, let's say we have an entity as follows.

public class Traveler extends IdEntity {    
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    Long id;

    String firstNames;

    String surname;

    String email;

    String telephone;
    
    @Enumerated(EnumType.STRING)
    Salutation salutation;

    @Temporal(TemporalType.DATE)
    Date dateOfBirth;
    ...
}

So we write and read as follows.

    @Test
    public void testWrite() throws Exception {
        StringWriter outputWriter = new StringWriter();
        BeanWriter beanWriter = new BeanWriter(outputWriter);
        beanWriter.getXMLIntrospector().getConfiguration().setAttributesForPrimitives(false);
        beanWriter.getXMLIntrospector().getConfiguration().setTypeBindingStrategy(
            new CustomTypeBindingStrategy());
        beanWriter.getBindingConfiguration().setMapIDs(false);
        beanWriter.getBindingConfiguration().setObjectStringConverter(
            new CustomObjectStringConverter());
        beanWriter.enablePrettyPrint();
        Traveler traveler = new TravelerBean();
        traveler.setDateOfBirth(new GregorianCalendar(1969, Calendar.DECEMBER, 31).getTime());
        traveler.setEmail("test@test.com");
        traveler.setSalutation(Salutation.MR);
        beanWriter.write("traveler", traveler);
        String xml = outputWriter.toString();
        outputWriter.close();
        System.out.println(xml);
        read(xml);
    }

    protected void read(String xml) throws Exception {
        StringReader xmlReader = new StringReader(xml);
        BeanReader beanReader = new BeanReader();
        beanReader.getXMLIntrospector().getConfiguration().setAttributesForPrimitives(false);
        beanReader.getXMLIntrospector().getConfiguration().setTypeBindingStrategy(
            new CustomTypeBindingStrategy());
        beanReader.getBindingConfiguration().setMapIDs(false);
        beanReader.getBindingConfiguration().setObjectStringConverter(
            new CustomObjectStringConverter());
        beanReader.registerBeanClass("traveler", Traveler.class);
        System.out.println(beanReader.parse(xmlReader));
    }

Googling the interwebs, we got the hint to use the following custom converters and strategies e.g. to handle enum's and what-not.

public class CustomObjectStringConverter extends ObjectStringConverter {

    final DateFormat dateFormat = new SimpleDateFormat("dd/MM/yyyy");

    @Override
    public String objectToString(Object object, Class type, Context context) {
        if (object == null) {
            return null;
        }
        if (type == Date.class) {
            return dateFormat.format((Date) object);
        }
        if (type.getSuperclass() == Enum.class) {
            return object.toString();
        }
        return super.objectToString(object, type, context);
    }

    @Override
    public Object stringToObject(String string, Class type, Context context) {
        if (string == null || string.length() == 0) {
            return null;
        }
        if (type == Date.class) {
            return formatDate(string);
        }
        if (type.getSuperclass() == Enum.class) {
            return Enum.valueOf(type, string);
        }
        return super.stringToObject(string, type, context);
    }

    protected Date formatDate(String string) {
        try {
            return dateFormat.parse(string);
        } catch (ParseException e) {
            throw new IllegalArgumentException(string, e);
        }
    }
}

What we probably should do above is check for annotations like @Temporal to see if our Date property is a date only, or a timestamp.

By the way, we can set an extended PropertySuppressionStrategy to suppress writing properties, and for reading XML, an ElementSuppressionStrategy etc.

Finally, in order to make enum's work, we specify a BindingType.PRIMITIVE as follows.

public class CustomTypeBindingStrategy extends TypeBindingStrategy {

    @Override
    public BindingType bindingType(Class type) {
        if (isPrimitive(type)) {
            return BindingType.PRIMITIVE;
        }
        return BindingType.COMPLEX;
    }

    protected boolean isPrimitive(Class type) {
        if (type.isPrimitive()) {
            return true;
        } else if (type.equals(Object.class)) {
            return false;
        } else if (type.getName().startsWith("java.lang.")) {
            return true;
        } else if (type == Date.class) {
            return true;
        } else if (Enum.class.isAssignableFrom(type)) {
            return true;
        }
        return false;
    }
}

What else? Ja, you need quite a few commons jars like beanutils and digester 'cos they all build on each other apparently.

evanx

Trivial Templating Blog

Posted by evanx Jul 9, 2010

Notwithstanding the fact that anyone in their right mind (which rules me out) would use Apache Velocity or FreeMarker for templating, we present a trivial templating helper class, where for instance we have an HTML template as follows to send a confirmation email to a customer.

Hey ${displayName}

Your Travelstart reference: ${bookingReference}

${travelerName}

where the above is used to compose an HTML email, and/or render a PDF document using FlyingSaucer.

So we invoke the "templating engine" as follows.

 public String generate(String resourceName) throws Exception {
    HtmlTemplate template = new HtmlTemplate(getClass().getResourceAsStream(resourceName));
    template.setProperty("displayName", "Evan");
    template.setProperty("bookingReference", "555555");
    template.setProperty("bookingDate", "16 Jul 2010");
    template.setProperty("amount", "R 1,200.00");
    template.setProperty("travelerRow", 0, "travelerName", "Evan Summers");
    template.setProperty("travelerRow", 1, "travelerName", "Tootie Milburn");
    return template.compose();
 }

This uses the trivial templating class below to read the HTML template, and substitute the given values into the template, etc.

 public class HtmlTemplate {
    List lineList = new ArrayList();
    InputStream inputStream;
    StringBuilder templateBuilder = new StringBuilder();
    String trClass = null;
    StringBuilder rowBuilder = new StringBuilder();
    String line;
    EntryList entryList = new EntryList();

    public HtmlTemplate(InputStream inputStream) {
       this.inputStream = inputStream;
    }

    public void setProperty(String name, Object value) {
       entryList.getList().add(new Entry(null, 0, name, value));
    }

    public void setProperty(String parent, int index, String name, Object value) {
       entryList.getList().add(new Entry(parent, index, name, value));
    }
    ...
 }

Once the properties to substitute have been given using setProperty() above, we invoke compose() below, which handles multiple rows by looking for <tr> and </tr> elements with the appropriate CSS class.

  public String compose() throws Exception {
     BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
     while (true) {
        line = reader.readLine();
        if (line == null) {
           return templateBuilder.toString();
        }
        line += "\n";
        boolean trClosed = line.trim().equals("</tr>");
        boolean tableClosed = line.trim().equals("</table>");
        if (trClass != null) {
           if (trClosed) {
              rowBuilder.append(line);
              templateBuilder.append(replaceRow());
              rowBuilder.setLength(0);
              line = null;
           } else if (tableClosed) {
              trClass = null;
           } else if (rowBuilder.length() > 0) {
              rowBuilder.append(line);
              line = null;
           }
        } else {
           trClass = getRow();
           if (trClass != null) {
              rowBuilder.setLength(0);
              rowBuilder.append(line);
              line = null;
           }
        }
        if (line != null) {
           templateBuilder.append(replace());
        }
     }
  }

which relies on the following methods to do the donkey-work.

 protected String getRow() {
    for (String name : entryList.getParentList()) {
       if (line.indexOf("<tr class=\"" + name + "\"") > 0) {
          return name;
       }
    }
    return null;
 }

 protected String replace() {
    for (Map.Entry<String, Object> mapEntry : entryList.getMap(0).entrySet()) {
       String pattern = "${" + mapEntry.getKey() + "}";
       Object value = mapEntry.getValue();
       int index = line.indexOf(pattern);
       if (index > 0) {
          return line.substring(0, index) + value + line.substring(index + pattern.length());
       }
    }
    return line;
 }

We use the following rough-shod classes to coddle the data to substitute into the template.

 class EntryList {
    List<Entry> list = new ArrayList();

    public List<Entry> getList() {
       return list;
    }

    public int getListSize(String parent) {
       int maxIndex = 0;
       for (Entry entry : list) {
          if (entry.parent != null && entry.parent.equals(parent)) {
             if (entry.index > maxIndex) {
                maxIndex = entry.index;
             }
          }
       }
       return maxIndex + 1;
    }

    public Map<String, Object> getMap(String parent, int index) {
       Map<String, Object> map = new HashMap();
       for (Entry entry : list) {
          if (entry.parent != null && entry.parent.equals(parent) && entry.index == index) {
             map.put(entry.name, entry.value);
          }
       }
       return map;
    }

    public Map<String, Object> getMap(int index) {
       Map<String, Object> map = new HashMap();
       for (Entry entry : list) {
          if (entry.parent == null && entry.index == index) {
             map.put(entry.name, entry.value);
          }
       }
       return map;
    }

    public List<String> getParentList() {
       List<String> parentList = new ArrayList();
       for (Entry entry : list) {
          if (entry.parent != null && !parentList.contains(entry.parent)) {
             parentList.add(entry.parent);
          }
       }
       return parentList;
    }
 }

where the above handles a list of the following tuples, allowing for a parent and index for rows.

 class Entry {
    String parent;
    int index;
    String name;
    Object value;

    public Entry(String parent, int index, String name, Object value) {
       this.parent = parent;
       this.index = index;
       this.name = name;
       this.value = value;
    }
 }

But of course one should keep it real with Apache Velocity or FreeMarker. I just decided to have some trivial coding fun :)