The Case of the Trailing Space

Solve the case of the trailing space with IMVU’s Senior Engineer, Michael Slezak.

In this post, I want to discuss a problem we recently ran into with REST authentication in the IMVU client application, which ultimately boiled down to small discrepancies between JSON encoders. I’m going to focus on Python’s json library and JavaScript’s JSON encoder. Before I dive into the problem, I want to provide some quick background on the client application architecture.

Client Architecture

The IMVU client application is built in several layers: rendering, business logic, and the front-end. We use C++ for rendering and other low-level functionality such as windows, call stacks, and interfacing with the business logic. Our business/client logic layer is all in Python. We use Python for communicating with the front-end, pinging our servers, and maintaining advanced chat logic and chat state, among many other things. Finally, we use the Gecko SDK (a.k.a. XULRunner) to handle all of our UI needs. This means we can write our front-end using HTML, JavaScript, and CSS. We also have our own library that lets the front-end call out to Python for things such as user data. Using web technologies for client UI development lets us share technology with our site, such as jQuery, Underscore.js, and even a few in-house libraries, resulting in increased engineering productivity.

With that said, we’ve recently hand-rolled our own implementation of Promises, as specified in the upcoming ECMAScript 6 proposal, in our open-sourced imvu.js library. I decided to drop this implementation into the client for our immediate use. With this change, I was also able to drop in a new REST client to help with chaining our requests in a more synchronous-looking fashion (even though they are still asynchronous). Hooray! Asynchronous programming has just gotten easier for the client! However, that wasn’t the case…

XMLHttpRequest, are you there?

We got our feet wet using this exciting change with a new feature that my team is developing. There was one issue: Not authorized for request.

Uh oh.

Looks like we need an auth token for these requests. Specifically, the POST and DELETE requests. Simple enough. I found that we handle authentication in our Python code within this “securePostRaw” function. Every single request ever made in the client goes through this. To my knowledge, POSTing with XMLHttpRequest has never been used in the client. Ever. Needless to say, this was news to me…

It’s Dangerous to go Alone…

Looking at the securePostRaw function, we take the auth token that the server initially gave us, hash it together with the rest of the request, and use the result as the new auth token. For example, if there is a request body, we JSON encode it and then UTF-8 encode it. We then take the customer ID, the original auth token, the JSON-encoded body, and the query parameters (if they exist), and run a hash function over the whole concatenated blob.
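The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual securePostRaw code: the function name compute_auth_signature, the concatenation order, and the choice of SHA-256 are all assumptions, since the post does not name the hash function or the exact layout of the blob.

```python
import hashlib
import json


def compute_auth_signature(customer_id, auth_token, body=None, query=""):
    """Hypothetical sketch of a request signature; not IMVU's real code."""
    # JSON-encode the body (if any), then UTF-8 encode it, as described above.
    encoded_body = json.dumps(body).encode("utf-8") if body is not None else b""
    # Concatenate the pieces into one blob. The order here is an assumption.
    blob = b"".join([
        str(customer_id).encode("utf-8"),
        auth_token.encode("utf-8"),
        encoded_body,
        query.encode("utf-8"),
    ])
    # SHA-256 stands in for whatever hash function the client actually uses.
    return hashlib.sha256(blob).hexdigest()
```

Note that the JSON text of the body is an input to the hash, so any byte-level difference in how the body is encoded changes the resulting token.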

OK… a slightly odd way of securing a POST request, since we are running over HTTPS. But this is legacy code! So, it’s understandable.

I took this hashing function and copied it into another file for our front-end to call directly. I don’t need Python to send off the request; I just want to set the correct auth headers via XMLHttpRequest so that we can use this new REST client. You might be wondering, “Why not just let Python handle it then?” Partly because we have a bigger vision for the near future, where we bring in a bigger, hand-rolled front-end library that is now ubiquitous in how we write front-end software at IMVU. To achieve this, I need XHR to work so that it’s easier to just “plop it in”.

Anyways, I finally get the right tokens to put in our headers. We’re on our way! All is right with the world, so we test again and: Not authorized for request.

…Take This!

Wait, but I did what you told me. I did all the right things. The old and new client tokens match up! This isn’t fair!!

Luckily, I can run the IMVU client off a local server and dig into what the server is seeing. Through our REST middleware stack on the server, we’re failing authentication! When we attempt to grab the logged in user, it fails to identify us! But the old and new client tokens are the same! I’m in parity! Not so fast…

Here is a dump from the server of when Python makes the request using the securePostRaw function:

[Image: slezakblog1 (server log for the request made through securePostRaw)]

And here is the dump when XHR makes the request:

[Image: slezakblog2 (server log for the request made through XHR)]

“sign” is what the server comes up with and “sent” is what the client sent (obviously). Why in the hell are they different? The server is playing tricks on us… So, I decide to log the final string the server builds before it runs the hash function on it.

The Case of the Trailing Space

The log looked something like this:

[Image: slezakblog3 (the string the server hashes, as logged)]

…There is an extra space right before the "http://" starts!!! It turns out JSON.stringify doesn’t leave any spaces like this. Since we let Python compute the hash, Python also encodes the request body into JSON, and its encoder is what adds the spacing! Since we do xhr.send(JSON.stringify(body)), we have a mismatch between what the client calculates and what the server calculates, because the server technically has a different request body (by one single space!).
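To make the mismatch concrete, here is a small Python illustration. The payload and URL are made up, and SHA-256 is an assumption standing in for the real hash. The js_style string is written out by hand to match what JSON.stringify produces for this payload.

```python
import hashlib
import json

# Made-up request body, for illustration only.
body = {"id": 1, "url": "http://example.com/room"}

# Python's default separators are (', ', ': '), so a space follows
# every comma and colon.
python_default = json.dumps(body)

# What JSON.stringify(body) produces in JavaScript for the same data,
# written out by hand: no whitespace between tokens.
js_style = '{"id":1,"url":"http://example.com/room"}'

print(python_default)  # {"id": 1, "url": "http://example.com/room"}
print(js_style)        # {"id":1,"url":"http://example.com/room"}

# The same data yields different bytes, and so different digests.
digest_python = hashlib.sha256(python_default.encode("utf-8")).hexdigest()
digest_js = hashlib.sha256(js_style.encode("utf-8")).hexdigest()
print(digest_python == digest_js)  # False
```

Both strings decode to the same object, which is why the problem is invisible until the raw bytes are compared.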

Fortunately, the json library in Python has a keyword argument in its dumps function called separators. So, the code now looks like json.dumps(body, separators=(',', ':')), which gives us a more compact version of the encoding. We are now matching with JSON.stringify.
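A minimal sketch of the fix, assuming a hypothetical helper name compact_json. The sample body is made up, and the string in the comment is what JSON.stringify returns for the same object.

```python
import json


def compact_json(body):
    """Encode body with no whitespace after ',' or ':' (hypothetical helper)."""
    return json.dumps(body, separators=(",", ":"))


# Made-up payload for illustration.
body = {"ids": [1, 2], "url": "http://example.com/room"}

print(compact_json(body))  # {"ids":[1,2],"url":"http://example.com/room"}
```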

After this change, we finally had a working solution!

Conclusion

Several lessons learned from this:

  • Hashing things that are dependent on variable data (which is also encoded) can be problematic. Since JSON is flexible in its encoding and allows for spacing, it can throw off the whole hash. Things that are more in our control such as integer values and constant strings are probably better to use.
  • Not all JSON encoders/decoders are created equal.
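Whitespace is not the only place where encoders can disagree. As a hedged illustration (these are defaults of Python's standard json module, not cases we actually hit in this bug), the snippet below shows two more differences from JSON.stringify. The JavaScript results are noted in the comments.

```python
import json

# Non-ASCII text: json.dumps escapes it by default (ensure_ascii=True),
# while JSON.stringify("café") returns "café" unescaped.
print(json.dumps("café"))                      # "caf\u00e9"
print(json.dumps("café", ensure_ascii=False))  # "café"

# Whole-number floats: Python keeps the ".0", while JavaScript has a
# single number type and JSON.stringify(1.0) returns 1.
print(json.dumps(1.0))  # 1.0
```

Any of these would change a hash computed over the encoded body in the same way the extra space did.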

If you enjoyed reading this article and are excited about solving problems such as unifying web technologies across multiple platforms, you’re in luck! We’re hiring!
