Integrating Gmail

One of our clients, an office full of heavy Gmail users, recently asked for the ability to drag-and-drop messages and attachments into an application we’d developed for them, which would be a potentially massive time-saver. The user interface we initially went for presented the user with a drop-down view of their Gmail, which they could use to search, browse and, most importantly, move content into the application. Excellent!

The application in question used their Google logins to gain entry via OAuth. Luckily, Gmail had extended the IMAP protocol to support authentication via OAuth too, which mattered because Gmail doesn’t offer any other kind of web-friendly API (as far as I could find anyway; please let me know if there’s a massive hole we’ve missed!).

Implementing this, though, presented a number of challenges…

Our first attempt

My first instinct when faced with a problem I haven’t tackled before is to call on some colleagues, do a bit of Googling, and try to get a proof-of-concept working – just to determine that it’s even possible and we’re not wasting our time. The web app we’d developed was primarily PHP so I saw no reason to not use this for the prototype.

An hour or two later I’d found that Zend Framework had the only available implementation of OAuth over IMAP (apart from some guy on GitHub who’d coded up his own pure-PHP IMAP implementation and long ago abandoned the project). So I forged ahead and got something that seemed, on the face of it, to be working pretty nicely, if I do say so myself!

But when I fired up a few accounts for testing and threw even the most modest refresh load at my development environment, it was slow and started producing some annoying “connection limit reached” errors. Hmm…

Rethinking things

With ‘it’s possible’ proven, it was now time to engineer something that would actually work. The first option we considered was to mirror users’ email locally, so we had fast scripting-language-happy access to it (only a database query or something away). But this introduced some problems, the most troublesome being:

  • A potentially large task of first-time account sync
  • Keeping the local copy consistent and up-to-date with new/moved/deleted emails (users will expect pretty timely access to new emails)
  • Having to re-implement Gmail’s search
  • How this would scale to a system with a large number of accounts

The second option was to introduce a proxy between our PHP application and Gmail. I briefly looked at ImapProxy, but it doesn’t support TLS. You can add Perdition into the mix for that, but then it won’t work with XOAUTH. So I decided to write a small JSON API in a language that runs as a resident process and could maintain persistent IMAP connections, essentially presenting a fast HTTP interface to what we needed from Gmail.

The result is a small Java/Clojure application called Groxy (Gmail-proxy, obviously…), which gave us these benefits:

  • It’s fast (though still not as fast as it can be; IMAP is slow…) and, through pooled connections, no more errors
  • There’s full Gmail search
  • It’s always up-to-date
  • Users have access to their entire Gmail archive
  • Easy handling of new accounts

On to some of the more interesting techy details…

Brief overview

The application does pretty much exactly what I just mentioned we needed so I won’t go over it again in great detail. It runs as a stateless service (allowing us to install it on all our web servers for redundancy), exposing a JSON HTTP API which clients then send their Gmail requests to, along with the required OAuth details. It implements Gmail’s IMAP extensions (X-GM-RAW and X-GM-THRID) so it has proper search, and you can fetch messages and attachments in a web-API-friendly way with a simple client (we’re using a thin wrapper around Guzzle).

You can of course check it out and give it a go on GitHub.

Compojure web application

The web part of the application is written in Clojure using Compojure on Jetty. I knew I’d be using JavaMail (more on that next), so given this, Clojure’s seamless access to Java, and how well-suited I’ve found it to small API services like this, I was reasonably confident it would be a good fit. I initially coded the app up using http-kit as the Ring adapter, but it turned out that it didn’t handle streaming large attachments very well, requiring them to be realised in memory all at once. The Jetty adapter, however, streams them properly.

The Clojure part acts as the glue to tie together our access to JavaMail, our extensions for Gmail, and the Google OAuth code, then adding our layers of connection handling, caching, concurrency control, and message formatting for JSON.

JavaMail API

I don’t know about other developers, but I found JavaMail…a bit obscure. Personally, I think striving to support many different protocols has made it all a bit too abstract. But it’s what Google’s OAuth code is written for, and it is pretty battle-hardened.

As good as Clojure’s Java interop is, it couldn’t hide all the quirks of parsing messages with JavaMail. One area where it gets particularly tricky is multipart messages (probably due to multipart as much as JavaMail, to be honest). So I broke out the message parsing part into a separate library called Cail.

This left bootstrapping the Google OAuth code, which is easily initialised…

;; inside the namespace declaration
(:import
  (com.google.code.samples.oauth2
    OAuth2SaslClientFactory OAuth2Authenticator))

;; register Google's OAuth2 SASL provider once, at startup
(OAuth2Authenticator/initialize)

…then when requested we open the store, and do some connection handling by trying to keep users’ folders open, which significantly speeds up subsequent requests.
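
To make the idea of keeping folders open a little more concrete, here’s a minimal sketch in plain Java (it is not Groxy’s actual code). The OpenFolders class, its opener function and its isAlive check are all hypothetical names I’ve made up for illustration. In a real application the opener would authenticate and open a JavaMail folder, and the check would ask whether that folder is still open.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical helper: keeps one open "folder" per user and reuses it
// across requests, reopening only when the old one has gone away.
final class OpenFolders<F> {
    private final ConcurrentMap<String, F> open = new ConcurrentHashMap<>();
    private final Function<String, F> opener;  // assumed: opens a folder for an email address
    private final Predicate<F> isAlive;        // assumed: true while the folder is still usable

    OpenFolders(Function<String, F> opener, Predicate<F> isAlive) {
        this.opener = opener;
        this.isAlive = isAlive;
    }

    // Returns the cached folder for this user, or opens a new one.
    // compute() runs atomically per key, so two simultaneous requests
    // for the same user cannot both open a connection.
    F get(String email) {
        return open.compute(email, (key, current) ->
                current != null && isAlive.test(current) ? current : opener.apply(key));
    }
}
```

The slow part (connecting and authenticating) then only happens on the first request for a user, or after the connection has dropped.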

Gmail extensions

As I mentioned, Google has added a number of custom extensions to Gmail’s IMAP implementation, and the one we most wanted to take advantage of was X-GM-RAW, which provides the full power of Gmail search. Even for the default inbox view we use the search term label:inbox to give us up-to-date results.

There appear to be versions of JavaMail that include some support for these extensions, but I couldn’t for the life of me find an available distribution of them. And, looking into their implementation, I was baffled by the complexity. Luckily, though, JavaMail makes it quite straightforward to write your own custom commands that act directly on the protocol. So given the command for X-GM-RAW…

SEARCH X-GM-RAW label:inbox

…which simply returns a list of message IDs like…

34 12 34 17 143

…it was easy to add our own Gmail search command, which we can then access from the Clojure web application:

(let [folder (imap/folder email token FOLDER_ALL_MAIL)
      command (GmailSearchCommand. FOLDER_ALL_MAIL search-term)
      results (.doCommand folder command)]
  (.. use results ..))

Next, we use the results to fetch the associated messages and map them into Clojure data structures.
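
To illustrate what goes over the wire, here’s a small dependency-free Java sketch of building the search command and reading the IDs back out of the reply. The GmailSearchWire class and its method names are hypothetical, and it is not how Groxy’s GmailSearchCommand is written (that works through JavaMail’s protocol classes). I’m assuming an ASCII-only search term sent as an IMAP quoted string, and a reply that arrives as a single untagged “* SEARCH” line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing the wire format of an X-GM-RAW search.
final class GmailSearchWire {

    // Builds a tagged command such as: a1 SEARCH X-GM-RAW "label:inbox"
    // Backslashes and double quotes are escaped, as IMAP quoted strings require.
    static String command(String tag, String query) {
        String escaped = query.replace("\\", "\\\\").replace("\"", "\\\"");
        return tag + " SEARCH X-GM-RAW \"" + escaped + "\"";
    }

    // Parses "* SEARCH 34 12 17" (or just "34 12 17") into a list of IDs.
    // Long is used because IMAP numbers are unsigned 32-bit values.
    static List<Long> parseIds(String line) {
        String body = line.trim();
        if (body.startsWith("* SEARCH")) {
            body = body.substring("* SEARCH".length()).trim();
        }
        List<Long> ids = new ArrayList<>();
        if (body.isEmpty()) {
            return ids; // no matches
        }
        for (String token : body.split("\\s+")) {
            ids.add(Long.parseLong(token));
        }
        return ids;
    }
}
```

Quoting the term matters as soon as a search contains spaces, which most real Gmail searches do.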

We default to using the All Mail folder even for inbox queries, as the message IDs returned are folder-specific. This gives us a consistent view onto the user’s email (rather than losing our position if we chopped and changed folders). An annoying quirk is this folder isn’t named consistently across all versions of Gmail, but for our use case it isn’t a problem.

Worker middleware

One of the main issues we had with our initial PHP-only version was that if the user opened multiple tabs or just happened to make a bunch of queries all at the same time we would get multiple connections to Gmail being opened up to perform the exact same task (which is somewhat tricky to effectively synchronise via PHP over an arbitrary number of web boxes).

To deal with this more efficiently in Clojure we wrote some Ring middleware which can handle simultaneous tasks, called Worker. To use this just give it some long-running task (like fetching email via IMAP) and an ID:

(worker
 :some-unique-id
 (fetch-email-via-imap))

If you call this again before the task has completed Worker just waits on the result of the original task, returning both at the same time (it uses Futures internally, and is very simple). The middleware wraps this library, generating the unique ID from the request. Add it like normal middleware of course…

(-> #'app-routes
  (wrap-worker)
  (handler/api))

… and this will then control all our API requests: lovely.
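
The de-duplication idea is simple enough to sketch in plain Java using futures. This is an illustration of the technique rather than Worker’s actual source, and the InFlightTasks class and its submit method are hypothetical names. I’m also assuming it’s fine to run tasks on Java’s common thread pool.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Hypothetical helper: callers that submit the same ID while a task is
// still running all share that task's future instead of starting another.
final class InFlightTasks<T> {
    private final ConcurrentMap<String, CompletableFuture<T>> inFlight = new ConcurrentHashMap<>();

    CompletableFuture<T> submit(String id, Supplier<T> task) {
        CompletableFuture<T> fresh = new CompletableFuture<>();
        CompletableFuture<T> existing = inFlight.putIfAbsent(id, fresh);
        if (existing != null) {
            return existing; // same ID already running: wait on its result
        }
        // We won the race, so run the task once in the background.
        CompletableFuture.supplyAsync(task).whenComplete((result, error) -> {
            // Forget the ID before completing, so that a request made
            // after the result is visible starts a brand new task.
            inFlight.remove(id, fresh);
            if (error != null) {
                fresh.completeExceptionally(error);
            } else {
                fresh.complete(result);
            }
        });
        return fresh;
    }
}
```

Two requests for the same ID made while the first is still running get the very same future back, so the expensive IMAP fetch only happens once.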

Caching

Email is immutable and fetching message data via IMAP is slow, so we take advantage of the immutability and cache parsed messages indefinitely. To do this we use Clojure’s core.cache library, and our own database-backed durable cache:

(defcache DatabaseCache [db]
 CacheProtocol
 …)

We’re using the new 0.3alpha version of clojure.java.jdbc, which provides a cleaned-up API with a mini Korma-esque SQL DSL:

(query db
       (select :data
               cache-table
               (where {:id (name id)})))

This part really was trivial to implement; apart from the database lookup we do a write for new cache data…

(defn store [db id data]
 (insert! db cache-table
          {:id (name id)
           :data (pr-str data)}))

…and that’s pretty much it. When we first deployed Groxy we only used in-memory caching (with an LRU strategy capped at 1000 messages). But that had the problem of blowing away the cache with each of our RPM-based deployments. So, moving on to using a datastore for the cache allows us to deploy updates much more frequently if needed.
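
For comparison, the in-memory LRU strategy we started with is easy to sketch in plain Java. We actually used core.cache for this, so the LruCache class below is a hypothetical stand-in that only shows the eviction behaviour. Note that it is not thread-safe as written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache: once the capacity is exceeded, the entry that
// was used least recently is dropped.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: iteration order goes from least to most
        // recently accessed, which is what LRU eviction needs.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true
    // removes the eldest (least recently used) entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

Something like new LruCache<String, String>(1000) matches the cap mentioned above, and makes the drawback obvious: everything in it is lost whenever the process restarts.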

Metrics

The application we developed Groxy for is using Librato for metrics visualisation, with StatsD as the collecting agent. This allows us to easily get some idea of how it’s performing.

Figure: graph showing number of requests, with deployments as red lines

The library we’re using from Clojure is clj-statsd. It’s simple, and works really well for us.

Deployment

I’ve written previously about how we’re managing packaging and deploying our Clojure applications, so I won’t duplicate it here.

Conclusion

I think being pragmatic about technology is important. While of course it will never guarantee success, I think choosing the right tool for each little job (even if unfamiliar at first) can improve overall solutions, and lead to longer-term benefits in the development of products.

As for this tool, there’s still a lot of work to do to make it as good as it can be (while general usage is ok, worst-case uncached queries on new connections can still take up to 20 seconds, for example) but I feel that, now we’ve got a solid foundation in place, we can take the features built on top of it forward to more interesting places.
