All posts by chadimvu

Code Reviews: Follow the Data

After years of reviewing other people’s code, I’d like to share a couple of practices that improve the effectiveness of code reviews.

Why Review Code?

First, why review code at all? There are a few reasons:

  • Catching bugs
  • Enforcing stylistic consistency with the rest of the code
  • Finding opportunities to share code across systems
  • Helping new engineers spin up with the team and project
  • Helping API authors see actual problems that API consumers face
  • Maintaining the overall health of the system

It seems most people reach for one of two standard techniques when reviewing code; they either review diffs as they arrive or they review every change in one large chunk. We’ve used both techniques, but I find there’s a more effective way to spend code review time.

Reviewing Diffs

It seems the default code review technique for most people is to sign up for commit emails or proposed patches and read every diff as it goes by. This has some benefits – it’s nice to see when someone is working in an area or that a library had a deficiency needing correction. However, on a large application, diffs become blinding. When all you see is streams of diffs, you can lose sight of how the application’s architecture is evolving.

I’ve ended up in situations where every diff looked perfectly reasonable but, when examining the application at a higher level, key system invariants had been broken.

In addition, reviewing diffs tends to emphasize small, less important details over more important integration and design risks. I’ve noticed that, when people only review diffs, they tend to point out things like whitespace style, how iteration over arrays is expressed, and the names of local variables. In some codebases these are important, but higher-level structure is often much more important over the life of the code.

Reviewing diffs can also result in wasted work. Perhaps someone is iterating towards a solution; the code reviewer may waste time reviewing code that its author intends to rework anyway.

Reviewing Everything

Less often, I’ve seen another code review approach similar to reviewing diffs, but on entire bodies of work at a time. This approach can work, but it’s often mind-numbing. See, there are two types of code: core, foundational code and leaf code. Foundational code sits beneath and between everything else in the application. It’s important that it be correct, extensible, and maintainable. Leaf code is the specific functionality a feature needs. It is likely to be used in a single place and may never be touched again. Leaf code is not that interesting, so most of your code review energy should go towards the core pieces. Reviewing all the code in a project or user story mixes the leaf code in with the foundational code and makes it harder to see exactly what’s going on.

I think there’s a better way to run code reviews. It’s not as boring, tends to catch important changes to core systems, and is fairly efficient in terms of time spent.

Follow the Data

My preferred technique for reviewing code is to trace data as it flows through the system. This should be done after a meaningful, but not TOO large, body of work. You want about as much code as you can review in an hour: perhaps more than a user story, but less than an entire feature. Start with a single piece of data, perhaps some text entered on a website form. Then, trace that data all the way through the system to the output. This includes any network protocols, transformation functions, text encoding, decoding, storage in databases, caching, and escaping.

Following data through the code makes its high-level structure apparent. After all, code only exists to transform data. You may even notice scenarios where two transformations can be folded into one. Or perhaps eliminated entirely — sometimes abstraction adds no value at all.
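As a contrived sketch of the kind of thing that falls out of following the data (the function names here are hypothetical, not from any real codebase), imagine tracing a chat message from the input form to the rendered page and noticing that it gets HTML-escaped twice along the way:

// Hypothetical sketch: tracing a chat message from input to output.
function escapeHtml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function handleChatInput(raw: string): string {
  return escapeHtml(raw);                          // escaped once at the boundary...
}

function renderChatMessage(message: string): string {
  return "<li>" + escapeHtml(message) + "</li>";   // ...and escaped again at render time
}

Neither function looks wrong in isolation, and neither diff would have raised an eyebrow, but following the data end to end reveals that "<" comes out as "&amp;lt;" and that the two escaping steps should be folded into one.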

This style of code review frequently covers code that wasn’t written by the person or team that initiated the code review. But that’s okay! It helps people get a bigger picture, and if the goal is to maintain overall system health, new code and existing code shouldn’t be treated differently.

It’s also perfectly fine for the code review not to cover every new function. You’ll likely hit most of them while tracing the data (otherwise the functions would be dead code) but it’s better to emphasize the main flows. Once the code’s high-level structure is apparent, it’s usually clear which functions are more important than others.

After experimenting with various code review techniques, I’ve found this approach to be the most effective and reliable over time. Make sure code reviews are somewhat frequent, however. After completion of every “project” or “story” or “module” or whatever, sit down for an hour with the code’s authors and appropriate tech leads and review the code. If the code review takes longer than an hour, people become too fatigued to add value.

Handling Code Review Follow-Up Tasks

At IMVU in particular, as we’re reviewing the code, someone will rapidly take notes into a shared document. The purpose of the review meeting is to review the code, not discuss the appropriate follow-up actions. It’s important not to interrupt the flow of the review meeting with a drawn-out discussion about what to do about one particular issue.

After the meeting, the team leads should categorize follow-up tasks into one of three categories:

  1. Do it right now. Usually tiny tweaks, for example: rename a function, call a different API, delete some commented-out code, move a function to a different file.
  2. Do it at the top of the next iteration. This is for small or medium-sized tasks that are worth doing. Examples: fix a bug, rework an API a bit, change an important but not-yet-ubiquitous file format.
  3. Would be nice someday. Delete these tasks. Don’t put them in a backlog. Maybe mention them to a research or infrastructure team. Example: “It would be great if our job scheduling system could specify dependencies declaratively.”

Nothing should float around on an amorphous backlog. If a task is important, it will come up again. Plus, it’s very tempting to say “we’ll get to it”, but you never will, and even if you find the time, nobody will have the context. So either get it done right away or be honest with yourself and consciously drop it.

Now go and review some code! 🙂

Optimizing WebGL Shaders by Reading D3D Shader Assembly

We are optimizing WebGL shaders for the Intel GMA 950 chipset, which is basically the slowest WebGL-capable device we care about. Unfortunately, it’s a fairly common chipset too. On the plus side, if we run well on the GMA 950, we should basically run well anywhere. 🙂

When you’re writing GLSL in WebGL on Windows, your code is three layers of abstraction away from what actually runs on the GPU. First, ANGLE translates your GLSL into HLSL. Then, D3DX compiles the HLSL into optimized shader assembly bytecode. That shader bytecode is given to the driver, where it’s translated into hardware instructions for execution on the silicon.

Ideally, when writing GLSL, we’d like to see at least the resulting D3D shader assembly.

With a great deal of effort and luck, I have finally succeeded at extracting Direct3D shader instructions from WebGL. I installed NVIDIA Nsight and Visual Studio 2013 in Boot Camp on my Mac. To use Nsight, you need to create a dummy Visual Studio project. Without a project, it won’t launch at all. Once you have your dummy project open, configure Nsight (via Nsight User Properties under Project Settings) to launch Firefox.exe instead. Firefox is easier to debug than Chrome because it runs in a single process.

If you’re lucky — and I’m not sure why it’s so unreliable — the Nsight UI will show up inside Firefox. If it doesn’t show up, try launching it again. Move the window around, open various menus. Eventually you should have the ability to capture a frame. If you try to capture a frame before the Nsight UI shows up in Firefox, Firefox will hang.

Once you capture a frame, find an interesting draw call, make sure the geometry is from your WebGL application, and then begin looking at the shader. (Note: again, this tool is unreliable. Sometimes you get to look at the HLSL that ANGLE produced, which you can compile and disassemble with fxc.exe, and sometimes you get the raw shader assembly.)
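For reference, if Nsight hands you the HLSL, the fxc.exe invocation to produce an assembly listing looks something like this (the file names and entry point are made up, and the target profile depends on what ANGLE emitted):

fxc.exe /T vs_3_0 /E main /Fc skinning.asm skinning.hlsl

Here /T selects the shader profile, /E the entry point, and /Fc writes the assembly listing to a file.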

Anyway, check this out. We’re going to focus on the array lookup in the following bone skinning GLSL:

attribute vec4 vBoneIndices;
uniform vec4 uBones[3 * 68];

ivec3 i = ivec3(vBoneIndices.xyz) * 3;
vec4 row0 = uBones[i.x];

ANGLE translates the array lookup into HLSL, adding a bounds check for security: the index is clamped to 203, the last valid index of the 3 * 68 = 204-element array. (Why? D3D claims it already bounds-checks array accesses.)

int3 _i = (ivec3(_vBoneIndices) * 3);
float4 _row0 = _uBones[int(clamp(float(_i.x), 0.0, 203.0))];

This generates the following shader instructions:

# NOTE: v0 is vBoneIndices
# The actual load isn't shown here.  This is just index calculation.

def c0, 2.00787401, -1, 3, 0
def c218, 203, 2, 0.5, 0

# r1 = v0, truncated towards zero
slt r1.xyz, v0, -v0
frc r2.xyz, v0
add r3.xyz, -r2, v0
slt r2.xyz, -r2, r2
mad r1.xyz, r1, r2, r3

mul r2.xyz, r1, c0.z # times three

# clamp
max r2.xyz, r2, c0.w
min r2.xyz, r2, c218.x

# get ready to load, using a0.x as index into uBones
mova a0.x, r2.y

That blob of instructions that implements truncation towards zero? Let’s decode that.

r1.xyz = (v0 < 0) ? 1 : 0
r2.xyz = v0 - floor(v0)
r3.xyz = v0 - r2
r2.xyz = (-r2 < r2) ? 1 : 0
r1.xyz = r1 * r2 + r3

Simplified further:

r1.xyz = (v0 < 0) ? 1 : 0
r2.xyz = (floor(v0) < v0) ? 1 : 0
r1.xyz = (r1 * r2) + floor(v0)

In short, r1 = floor(v0), UNLESS v0 < 0 and floor(v0) < v0, in which case r1 = floor(v0) + 1. (For example, v0 = -2.5 gives floor(v0) = -3; since v0 is negative and -3 < -2.5, the result is -3 + 1 = -2, which is -2.5 truncated towards zero.)

That’s a lot of instructions just to calculate an array index. After the index is calculated, it’s multiplied by three, clamped to the array boundaries (securitah!), and loaded into the address register.

Can we convince ANGLE and HLSL that the array index will never be negative, and thereby avoid a bunch of generated code? (The compiler should be able to prove this, since it’s already clamping later on, but whatever.) Let’s tweak the GLSL a bit.

ivec3 i = ivec3(max(vec3(0), vBoneIndices.xyz)) * 3;
vec4 row0 = uBones[i.x];

Now the instruction stream is substantially reduced!

def c0, 2.00787401, -1, 0, 3
def c218, 203, 1, 2, 0.5

# clamp v0 positive
max r1, c0.z, v0.xyzx

# r1 = floor(r1)
frc r2, r1.wyzw
add r1, r1, -r2

mul r1, r1, c0.w # times three

# bound-check against array
min r2.xyz, r1.wyzw, c218.x
mova a0.x, r2.y

By clamping the bone indices against zero before converting to integer, the shader optimizer eliminates several instructions.

Can we get rid of the two floor instructions? We know that the mova instruction rounds to the nearest integer when converting a float to an index. Given that knowledge, I tried to eliminate the floor by making my GLSL match mova semantics, but the HLSL compiler didn’t seem smart enough to elide the two floor instructions. If you can figure this out, please let me know!

Either way, I wanted to demonstrate that, by reading the generated Direct3D shader code from your WebGL shaders, you may find small changes that result in big wins.

Web Platform Limitations, Part 1 – XMLHttpRequest Priority

The web is inching towards being a general-purpose applications platform that rivals native apps for performance and functionality. However, to this day, API gaps make certain use cases hard or impossible.

Today I want to talk about streaming network resources for a 3D world.

Streaming Data

In IMVU, when joining a 3D room, hundreds of resources need to be downloaded: object descriptions, 3D mesh geometry, textures, animations, and skeletons. The size of this data is nontrivial: a room might contain 40 MB of data, even after gzip. The largest assets are textures.

To improve load times, we first request low-resolution thumbnail versions of each texture. Then, once the scene is fully loaded and interactive, high-resolution textures are downloaded in the background. High-resolution textures are larger and not critical for interactivity. That is, each resource is requested with a priority:

High priority: skeletons, meshes, low-resolution textures, animations
Low priority: high-resolution textures

Browser Connection Limits

Let’s imagine what would happen if we opened an XMLHttpRequest for each resource right away. What happens depends on whether the browser is using plain HTTP or SPDY.

HTTP

Browsers limit the number of simultaneous TCP connections to each domain. That is, if the browser’s limit is 8 connections per domain, and we open 50 XMLHttpRequests, the 9th would not even be submitted until one of the first 8 finished. (In theory, HTTP pipelining helps, but browsers don’t enable it by default.) Since there is no way to indicate priority in the XMLHttpRequest API, you would have to be careful to issue XMLHttpRequests in order of decreasing priority, and hope no higher-priority requests arrive later. (Otherwise, they would be blocked by low-priority requests.) This assumes the browser issues requests in sequence, of course. If not, all bets are off.

There is a way to approximate a prioritizing network atop HTTP XMLHttpRequest. At the application layer, limit the number of open XMLHttpRequests to 8 or so and have them pull the next request from a priority queue as requests finish.
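A minimal sketch of that approach, with names of my own invention (a real implementation would also need error handling, retries, and cancellation):

type PendingRequest = {
  url: string;
  priority: number;                              // lower number = more important
  onLoad: (xhr: XMLHttpRequest) => void;
};

const MAX_CONCURRENT = 8;                        // roughly the per-domain connection limit
const queue: PendingRequest[] = [];
let active = 0;

function enqueue(request: PendingRequest): void {
  queue.push(request);
  queue.sort((a, b) => a.priority - b.priority); // a real priority heap would be better
  pump();
}

function pump(): void {
  while (active < MAX_CONCURRENT && queue.length > 0) {
    const next = queue.shift()!;
    const xhr = new XMLHttpRequest();
    active += 1;
    xhr.onloadend = () => {                      // fires on success, failure, or abort
      active -= 1;
      pump();                                    // pull the next pending request off the queue
    };
    xhr.onload = () => next.onLoad(xhr);
    xhr.open("GET", next.url);
    xhr.send();
  }
}

Each asset loader would then call enqueue with its priority instead of creating XMLHttpRequests directly.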

Soon I’ll show why this doesn’t work that well in practice.

SPDY

If the browser and server both support SPDY, then there is no theoretical limit on the number of simultaneous requests to a server. The browser could happily blast out hundreds of HTTP requests, and the responses will make full use of your bandwidth. However, a low-priority response might burn bandwidth that could otherwise be spent on a high-priority response.

SPDY has a mechanism for prioritizing requests. However, that mechanism is not exposed to JavaScript, so, like HTTP, you either issue requests from high priority to low priority or you build the prioritizing network approximation described above.

However, the prioritizing network reduces effective bandwidth utilization by limiting the number of concurrent requests at any given time.

Prioritizing Network Approximation

Let’s consider the prioritizing network implementation described above. Besides the fact that it doesn’t make good use of the browser’s available bandwidth, it has another serious problem: imagine we’re loading a 3D scene with some meshes, 100 low-resolution textures, and 100 high-resolution textures. Imagine the high-resolution textures are already in the browser’s disk cache, but the low-resolution textures aren’t.

Even though the high-resolution textures could be displayed immediately (and would trump the low-resolution textures), because they have low priority, the prioritizing network won’t even check the disk cache until all low-resolution textures have been downloaded.

That is, even though the customer’s computer has all of the high-resolution textures on disk, they won’t show up for several seconds! This is an unnecessarily poor experience.

Browser Knows Best

In short, the browser has all of the information needed to effectively prioritize HTTP requests. It knows whether it’s using HTTP or SPDY. It knows what’s in cache and not.

It would be super fantastic if browsers let you tell them the priority of each request. I’ve seen some discussions about adding priority hints, but they seem to have languished.

tl;dr Not being able to tell the browser about request priority prevents us from making effective use of available bandwidth.

FAQ

Why not download all resources in one large zip file or package?

Each resource lives at its own URL, which maximizes utilization of HTTP caches and data sharing. If we downloaded resources in a zip file, we wouldn’t be able to leverage CDNs and the rest of the HTTP ecosystem. In addition, HTTP allows trivially sharing resources across objects. Plus, with protocols like SPDY, per-request overhead is greatly reduced.

HTML5 – New UI Library for Games [Annotated Slides]

HTML5: The New UI Library for Games

At GDC 2011, on behalf of IMVU Engineering, I presented our thesis that HTML is the new UI library for developing desktop applications and games.

HTML is Winning

The browser wars are hotter than ever. We have four major browser vendors, each with significant market shares, competing in earnest. Features such as SVG, Canvas, GPU acceleration, FAST JavaScript engines, and even WebGL are widely available.

It’s a great time to be a web developer!

Some Terminology

When I refer to HTML, I’m really referring to the entire stack of web technologies: markup, style sheets, scripts, canvas, websockets, etc. Also, Mozilla’s branding is confusing: Firefox the browser is built by Mozilla the company and is powered by Gecko/XULRunner the technology. Internally, we use these terms interchangeably.

History of IMVU’s UI

2004-2007: Win32, OpenGL, and C++

In 2004, when the company was started, IMVU’s entire UI was written in Win32, OpenGL, and C++. This is a fairly traditional, though old school, way to build UIs. However, our framework was extremely rigid, and it was hard to hire people to work on it. We tell legendary stories of “I just wanted to move that button from here to there, but I couldn’t figure it out in a few hours, so I gave up”.

In addition, our iteration speeds were terrible. You had to recompile, maybe add some new functionality to our UI framework, and then restart the client. This made UI development hard enough that we simply didn’t do it. The product didn’t change much for 3 years.

2007-2009: Flash

In 2007, a couple engineers realized we had a problem and spent a weekend integrating Adobe Flash. Flash is widely available and, if you get permission from Adobe, you can distribute Flash with your application. We implemented support for Flash UI panels and rendering Flash to translucent overlays atop the 3D scene.

Using Flash to develop UI was definitely a huge improvement. We used the Flex UI library and its standard components to get a head start. We were suddenly able to iterate on our UI, and we could add capabilities like animation and video to our client.

That said, Flash wasn’t all roses. It consumed a great deal of memory, especially address space, and mxmlc compiles were sloooow. Finally, even though Flex is open source, we found several bugs.

2009+: HTML

In 2009, we began a project to rebuild our entire experience from first principles. Our very talented design team spent months iterating with paper prototypes and user tests, and produced a very snazzy client architecture. At that point, the engineering team had some implementation choices:

1) Should we build this UI with a custom framework?
2) Should we use our existing Flash technology?
3) Or should we try integrating a browser and use HTML?

We decided to give a browser a shot, and we started with Firefox, since a couple of our people had familiarity with the Gecko codebase.

Within days we had something up and running, and the performance was amazing! It loaded quickly, was super-responsive, and used hardly any memory. We instantly decided to use HTML for the entire project.

The UI we built in HTML matched the intended design almost to the pixel. Iteration was nearly live, and we could even do our development in Firefox the browser with the Firebug add-on.

We patched Gecko to support rendering documents into textures so that we could overlay them on top of the 3D scene.

This let us build sophisticated UI elements across our entire product.

Benefits of HTML

Lingua Franca

IMVU is a Silicon Valley company, and we hire people from all backgrounds. We don’t necessarily hire people for the skills they already have; instead, we assume they’ll be able to learn whatever we need. That said, it’s valuable for people to have some background knowledge, and almost everyone these days knows a bit of the web stack. You can’t expect expertise in the nuances of the box model, but everyone knows a bit of HTML and JavaScript.

Amazing Tools

The web stack brings with it a plethora of tools like Firebug and Firebug Lite. We use these tools every day to rapidly iterate on our markup and style sheets. We can even integrate Firebug Lite into our downloadable client and work with live data.

Ecosystem of Libraries

In the last few years, we’ve seen a proliferation of amazing JavaScript libraries such as jQuery and YUI. If you build your UI with web technologies, you can leverage these libraries directly.

Advertising

When we integrated web technologies into our application, we discovered an unintended benefit. We could now directly benefit from the Internet advertising industry, worth tens of billions of dollars, which has built an entire advertising pipeline around HTML, JavaScript, and Flash. One engineer integrated ads into our product in less than one week.

Demo

At this point in the presentation, I gave two mini demonstrations. First, I showed what Firebug Lite was capable of: JavaScript commands, real-time editing of attributes and styles.

Then I demonstrated the process by which we iterate on our UI by, in 5 minutes, adding a jQuery accordion to an HTML overlay and demonstrating that the change had already taken effect.

Performance?

The biggest concern we hear from others is “Wouldn’t the web stack be slow and take a lot of RAM?” There was a time when browsers were comparatively slow and heavy, but with today’s fast CPUs and the continual pressure to make browsers faster (rich web experiences, tabbed browsing), we’ve found that HTML has never been a bottleneck for us.

Gecko consumes less than 1 MB of RAM for each loaded document, and we even think we’d be able to make every seat node in the scene into an HTML overlay.

You do need to be careful with DOM structures that have huge numbers of children. Many DOM operations are linear in the number of children or siblings, so you’ll want to structure your DOM like a B-tree, nesting large child lists inside intermediate container elements.
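A rough sketch of the idea (the helper and constant are hypothetical, not from our codebase): instead of appending thousands of siblings to one parent, bucket them into nested containers so no element ends up with more than a handful of children.

// Sketch: append items under nested <div> buckets so no node has more than
// BUCKET_SIZE children, which keeps per-node child lists short.
const BUCKET_SIZE = 32;

function appendBucketed(parent: HTMLElement, items: HTMLElement[]): void {
  if (items.length <= BUCKET_SIZE) {
    for (const item of items) {
      parent.appendChild(item);
    }
    return;
  }
  const perBucket = Math.ceil(items.length / BUCKET_SIZE);
  for (let i = 0; i < items.length; i += perBucket) {
    const bucket = document.createElement("div");
    appendBucketed(bucket, items.slice(i, i + perBucket));
    parent.appendChild(bucket);
  }
}

The extra wrapper elements do interact with styling and layout, so this is only worth doing for very large, flat lists.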

You also need to be careful with image tag and canvas memory usage. Avoid allocating images and canvases for elements that aren’t immediately visible.
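For images, a minimal sketch of that idea (hypothetical helper) is to defer assigning src, since the fetch and decode don’t happen until the attribute is set:

// Defer the network fetch and decode by not assigning src until the
// thumbnail is actually about to be shown.
function showThumbnail(img: HTMLImageElement, url: string): void {
  if (!img.getAttribute("src")) {
    img.src = url;                 // setting src triggers the fetch and, eventually, the decode
  }
  img.style.display = "";
}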

Drawbacks

Flash is still better at animation, but I think this will change. This may simply be a tool-chain issue.

3D on the web is possible now, but WebGL isn’t quite prime time. This year we expect WebGL penetration to reach 50% of the market (Chrome and Firefox), but we still depend on our software renderer for our customers without strong or reliable GPUs.

Tracing JITs are still RAM-hungry and don’t match native code performance yet. Core game components will still be written in close-to-the-metal C++.

Who else uses HTML for UI?

The independent game studio Wolfire uses WebKit for an in-game material and shader editor. They used the off-the-shelf editor component, CodeMirror, for their text editors.

Electronic Arts ported WebKit to PlayStation 3 for Skate 3. About 50% of their UI was written in HTML and JavaScript.

Netflix on PlayStation 3 uses WebKit so they can split-test their UI after they’ve shipped. That way they can improve their UI without shipping new software.

Several MMOs have added browsers so they can include forums, wiki, and help within their games.

How do I get started?

There are several open source and commercial libraries to help you get started. I’ve included a list in the slides above.

Let me know how it goes!

I’d like to conclude by saying that HTML has worked very well for us, and I strongly encourage you to give it a shot. Send me a message at chad@imvu.com to let me know if it works for you or not!

Q&A

I’d like to clarify some points I made during the Q&A session:

A man from Khronos approached me and said I was a bit hard on WebGL. If I came across that way, I didn’t mean it. I love WebGL and the work that Khronos has done, but I don’t think it’s prime time _yet_. We can’t port our existing product, which often runs on machines without a reasonable GPU, to WebGL yet. We do, however, intend to write a WebGL backend eventually. By the end of the year, we expect 50% WebGL adoption.

Someone asked about initial startup overhead. I said we probably consumed 30 MB of RAM and address space in base Gecko initialization, but that we’d done absolutely no work to reduce that. On the console, I expect it would be worth optimizing further.

There was some discussion around how our designers and engineers interacted. I mentioned that our teams are often cross-functional: our designers typically know enough CSS to make changes if they want and some of our engineers have strong design sense. We find it more productive to just sit down together and have an engineer and a designer work together over an hour to polish up the experience.

Thanks!

Thanks to Chuck Rector, Ben McGraw, Chris Dolezalek, Enno Rehling, Andy Friesen, James Birchler, Brett Durrett, Jon Watte, Laura Austin, Ben and Staci Scott, and Michael and Samantha Van Waardhuizen for providing valuable feedback during the preparation of these materials.