Above and Beyond the Affirm Job Puzzle

The latest startup from internet luminary Max Levchin recently launched, and they have a very entertaining programming puzzle up on their jobs page.

You should read the page for some background, but in summary the problem is to find the distances between any two cells in a hexagonal grid numbered in the following manner:

The hexagonal grid

Approach

The problem is interesting because the problem seems like a traditional graph theory exercise, but Affirm hints that your solution should work at the scale of ten billion nodes. These two ideas don’t really jive, so one wonders if there’s an analytic solution that doesn’t involve any graph traversal.

It seems plausible that there is one, since the problem seems very similar to calculating the Manhattan Distance between any two nodes in a rectangular grid, except in this case we have a hexagonal grid where there are six ways to move between neighboring nodes instead of four. These six neighboring cells lie along three axes of movement.

The grid with axes

If you play around with the axes in your head, you can see that you can represent any hexagon in terms of any two of the three axes of movement. The corollary to that conclusion is that any translation down one axis of movement can be thought of as some combination of the other two. Essentially, we have one almost-unnecessary axis.

This suggests an approach:

  • Turn a hex’s number into its grid coordinates
  • Figure out the distance between two arbitrary grid coordinates

Translating a hexagon number into coordinates

Going from an arbitrary hexagon number to coordinates seems a bit tricky at first. You can’t modulo or divide by anything obvious to get some aspect of the geometry. The hex at position 1000 could be almost anywhere. What does seem obvious, though, is that higher numbers must be on larger “rings” of hexagons. Closer examination shows that each larger ring of hexagons has 6 more nodes than the last. Therefore:

$$ MaxNumOnRing(n) = ‎1 + ‎\sum\limits_{ring=0}^n 6ring \\ ... = 1 + 6\sum\limits_{ring=0}^n ring \\ ... = 1 + 6 \frac{n(n + 1)}{2} \\ ... = 3n^2 + 3n + 1 $$

The formula seems to check out, since the 0^th ring ends in 1, the 1^st ring in 7, the 2^nd ring in 19, and so on. Programmatically, you could tell which ring a hex falls on by finding which two “max-ring” numbers the hex is between. In the case of say, 12, it would be the 2^nd ring, since it is greater than 7 and less than 19. However, we can do better mathematically by simply inverting the previous formula, to get one that takes a number and outputs a ring:

$$ num = 3n^2 + 3n + 1 \\ num = 3(n^2 + n) + 1 \\ num = 3(n^2 + n + \tfrac{1}{4} - \tfrac{1}{4}) + 1 \\ num = 3((n + \tfrac{1}{2})^2 - \tfrac{1}{4}) + 1 \\ \frac{num - 1}{3} + \tfrac{1}{4} = (n + \tfrac{1}{2})^2 \\ \frac{4num - 1}{12} = (n + \tfrac{1}{2})^2\ \\ \sqrt\frac{4num - 1}{12} - \tfrac{1}{2} = n $$

Plugging in num = 10 gets us a ring number of ~1.3, which we round up to 2. Looks good! Now that we have the ring, we have to do the hard part: figure out where exactly on the ring we are. It also means we need to finalize our axes. I’ve illustrated the coordinate system I went with below. Why I choose this particular configuration should become clear later:

The grid with coordinates

Following the pattern of coordinates here, you can see the largest number on each ring occurs on the negative X axis. Essentially we can say that if a number is the largest number on ring n, its position will be (0, -n). After a bit more thought, I figured that you could represent the various corners of the unit hexagon as six vectors that all pointed to hexagons exactly one away from the origin, like so:

require 'matrix'

UNIT_HEXAGON = {
  0 => Vector[-1,  0],
  1 => Vector[ 0,  1],
  2 => Vector[ 1,  1],
  3 => Vector[ 1,  0],
  4 => Vector[ 0, -1],
  5 => Vector[-1, -1]
}

You can get any number’s position along its hexagon ring by figuring out which of the six sides of the hexagon it is on, and its distance from the last corner of the hexagon.

def ring_to_max_num ring
  3 * ring**2 + 3 * ring + 1
end

def num_to_ring num
  (Math.sqrt((4 * num - 1) / 12.0) - 0.5).ceil
end

def num_to_coords num
  return Vector[0, 0] if num == 1

  # The length of a side is also the ring number
  side_length = ring = num_to_ring num

  # How far away am I from the end of my ring?
  offset = ring_to_max_num(ring) - num

  side_number = offset / side_length
  side_offset = offset % side_length

  corner = UNIT_HEXAGON[side_number]

  # The direction to the next corner is just the position of the
  # next corner minus the position of the current one.
  direction = UNIT_HEXAGON[(side_number + 1) % 6] - corner

  (corner * ring) + (direction * side_offset)
end

Solving for distance

So now we can easily get the grid coordinates of the start and end node. We still have to solve for the distance between them. A quick observation shows that the distance necessary to travel any delta vector is the same regardless of starting and ending nodes. That is to say, moving (+3, +2) units is the same distance whether you start at hex 1 or 1000. Luckily, with the axes I’ve chosen, calculating this distance isn’t too hard. At any given hexagon, you can move one unit along either the X or Y axes. You can also move one unit up or down the third axis, which is equivalent to moving by either (+1, +1) or (-1, -1). Our choice of axes has made that bit simple.

The process for finding the distance of a delta is:

  • If the coordinates of the delta share a sign (both positive or negative), the distance is just the maximum of the absolute values of both coordinates. In moving (+3, +5), for instance, you move first to (+3, +3) in three moves, and then to (+3, +5) in another two for a total distance of five. The same is true of (-3, -5), with reversed directions.
  • If they do not share a sign, merely add both of their absolute values together. For instance, moving (+2, -6) takes eight moves, because you have to move +2 in the X direction and -6 in the Y. Moving along the Z axis cannot aid you here.
def length_of_delta delta
  delta = -1 * delta if delta.all? {|i| i < 0}
  delta.all? {|i| i > 0} ? delta.max : delta.max - delta.min
end

def distance_between num1, num2
  delta = num_to_coords(num1) - num_to_coords(num2)
  length_of_delta delta
end

num1 = ARGV[0].to_i
num2 = ARGV[1].to_i
puts "distance between #{num1} and #{num2} is #{distance_between num1, num2}"

And viola, a constant-time algorithm that works fine at 10 billion nodes:

~/misc/affirm $ time ruby honeycomb.rb 1 100
distance between 1 and 100 is 6

real  0m0.030s
user  0m0.023s
sys 0m0.005s
~/misc/affirm $ time ruby honeycomb.rb 1 10000000000
distance between 1 and 10000000000 is 57735

real  0m0.030s
user  0m0.024s
sys 0m0.005s

Extra Credit

Solving for distance was nice, but honestly a bit anticlimactic. Wouldn’t it be more impressive if we could actually output the number of each hexagon on the way to our destination?

def path_between pos1, pos2
  delta = pos1 - pos2

  path = []

  while delta != Vector[0, 0]
    path.push(pos2 + delta)

    move = if delta.all? {|i| i < 0}
      Vector[1, 1]
    elsif delta.all? {|i| i > 0}
      Vector[-1, -1]
    elsif delta[0] != 0
      Vector[delta[0] > 0 ? -1 : 1, 0]
    else
      Vector[0, delta[1] > 0 ? -1 : 1]
    end

    delta += move
  end

  path.push pos2

  path
end

This function constructs a path of coordinates from pos1 to pos2, by attempting to move one hex at a time, favoring the Z axis where possible and otherwise just moving down the X or Y axes if there exists a remaining delta on either axis.

However, we still only have a path in coordinates. To get back to hexagon numbers, we need a translation function. This article is getting a bit long, so I’ll just leave it here as an exercise for the reader to puzzle out:

def coords_to_num pos
  return 1 if pos == Vector[0, 0]

  ring = [pos.max - pos.min : pos.map(&:abs).max].max

  side = if pos[0] == ring then 2
    elsif pos[0] == -ring then 5
    elsif pos[1] == ring then 1
    elsif pos[1] == -ring then 4
    elsif pos[0] > pos[1] then 3
    else 0
  end

  max_num = ring_to_max_num ring
  corner = UNIT_HEXAGON[side] * ring
  offset = side * ring + length_of_delta(pos - corner)
  offset == ring * 6 ? max_num : max_num - offset
end

Bring all the pieces together and you get…

~/misc/affirm $ ruby ../misc/affirm/honeycomb.rb 1 100
distance between 1 and 100 is 6
path is [1, 2, 9, 22, 42, 68, 100]

Magic! The runtime here is $O(\sqrt{n})$, because the path-finding algorithm is linear on the length of the path, and the path can only be as long as the square root of the largest hexagon number on the path. Recall that the ring of a hexagon number is a function of the square root of that number.

Thanks for the fun times, Levchin and co.

The Riders and Hunters of Men

Posit that humans are just the reproductive organs of ideas, and our minds little more than churning pools of interchanging notions. In short, standard meme theory. We might be hosts whose carefully formulated egos are nothing more than the emergent terrain of an ancient memetic battleground - one imagines long wars between the myriad incarnations of fear and laziness for control of our weary brains. The memes with the most dominant survival characteristics must have long since evolved - likely candidates being caution, hope, and solidarity. In this paradigm, the most prolific ideas are the hardiest survivors of thousands of years of crossbreeding, their properties now the building blocks of more complicated and fragile abstractions like justice, ambition, and melancholy. The simplest ideas might occupy the lowest and most important rungs of an intellectual ecosystem, akin to our humble meat-space phytoplankton.

However, how do ideas like self-destruction and asocialism breed, if their character makes their owners shun reproducing? The answer may lie in the non- intuitive fact that the reproduction of ideas need not be necessarily coupled with the reproduction of humans. Have you found yourself slighted when cut off on the highway, and thus spurred to cut another off in the same way? Some ideas may predispose their hosts towards not mating, and yet be potent enough to spread from individual to individual. Biologists have even identified similar processes in the field, where individual bacteria swap genetic material without reproducing. They call this act Horizontal Gene Transfer, and it may be a more widely applicable concept than they know.

More broadly, the reproduction criteria for a meme have little to do with the host at all, beyond causing the host to express behaviors that cause others to adopt that same meme. In Biology, in order for a lethal virus to survive, it must assure that its genetic code is passed on before it destroys its host. This seemingly ironclad Darwinian rule of the natural world need not apply to the spread of ideas. Consider again that for ideas to be exchanged, humans do not even have to meet face to face, or even be alive simultaneously. How many suicides were born decades before they occurred, spread posthumously by another carrier, like an oak desperately dropping acorns in the death throes of a fire. The striking tragedy of a suicide is more than just sadness, but the idea’s resilience in the minds of its witnesses. The kernel of the idea sits idly for years in the intellectual field of a man’s mind until the fire of anger or loneliness causes the undergrowth to clear enough for the seed of suicide to sprout. Perhaps I erred, then, in suggesting that the most popular memes are hardier than their less successful cousins.

What is the nature of our relationship with ideas? In some cases it seems parasitic, especially with self-destructive ideas. The will to self-destruct manifests itself at great personal cost to its host. Other ideas seem symbiotic, like tribalism and love. Through expressing themselves in us, these ideas not only heighten their own odds at survival, but bring us happiness and safety as well. Still others seem hard to classify, like asceticism and other highly abstract pursuits. It seems evident that humans and ideas need each other to survive, but beyond that, their relationship is murky.

Perhaps the binary of parasitism and symbiosis is unhelpful. What if ideas owe each other more allegiance than they owe us, and vice versa? We readily accept the notion that the wolves who prey on the slowest of deer place a selection pressure on the herd for speed. Might not some ideas prey on the most susceptible of humans to ensure the fitness of the rest for peaceful occupation by their memetic brethren? In that same vein, it could be said that humans choose which ideas to pollinate, optimizing for ideas that appeal to ourselves the most.

Ideas cooperating to produce a fitter human herd seems pat, but could the truth be darker? What if our mostly Darwinian universe decreed that there must be wolves to cull the weak human symbiotes and untenable ideas equally from the herds of both humans and ideas. These wolves scent out their human prey, the infirm harborers of unfit ideas, and stalk them until they fall. The complete destruction of their prey has a threefold result: the thinning of the least fit humans, the purging of divergent ideas remaining in those hosts, and the birth of an ever so slightly deadlier predator from the wreckage of its victim.

The API for Google's most valuable resource sucks

Disclosure: I recently worked for Google for about a year. It was alright.

Recently, Vic Gundotra of Google+ fame made a bold statement. He proclaimed that the lack of write API access to Google+ is born not out of lack of foresight, planning, or even bandwidth, but out of trepidation, caution, and the desire to do right by developers.

This is raw, barely-refined bullshit and I regret not being able to respond to the thread with a pithier snarky comment. The truth is simpler: Google is full-stop terrible at APIs.

Vic Gundotra

Don’t look at me, it says so right there

As a case study, let’s examine what I had to do in order to use the most powerful collection of user-generated content that Google makes available to developers: a user’s email contact list. People who work for consumer- facing web companies probably know how deeply important having a user’s email address is for marketing. Despite being invented before I was born, email has yet to be outmoded as the most effective way to push content to users on demand.

The feature

Recently here at Everlane, we thought it might be a good idea to have a button a user could hit to view a list of their Gmail contacts with portraits, select some of them, and then have us send those users an invitation email. Simple, obvious stuff to want to do, right?

import screen 1

import screen 2

What we want to implement

Well, let’s hit the getting started page of the contacts API documentation. Reading closely seems to reveal that there might already be a library for doing what we want, which is really nothing more than getting a bunch of names, faces, and emails. Those geniuses at Google have to have something already made for this, right? We head over to the libraries and samples page. Cool, a Javascript library! This could be easy! Nope, they seem to support every Google API but Contacts. The Google+ read API doesn’t seem to have a good way to grab emails, either.

That’s fine – we’re pretty decent engineers and can call the Contacts API on our own. We register our application with the API Console and start reading about OAuth. Google provides a few authorization schemes. We probably want the one titled “Client-side Applications”, which saves us the complexity of an application server having to be aware of any sensitive information.

API woes

Now we can finally ask the Contacts API for a list of contacts! Easy enough, right? We’ll just do a JSONP request to get around the cross-domain restrictions.

var authParams = { access_token: ..., token_type: ... }; // from Google oAuth

$.ajax({
  url: 'https://www.google.com/m8/feeds/contacts/default/full',
  dataType: 'jsonp',
  data: authParams,
  success: function(data) { console.log(data); }
});

What does this give us? To our slow and creeping horror, we get back something like this for every contact:

<entry gd:etag='&quot;QHc_fDVSLit7I2A9WhJXFUkDQQ0.&quot;'>
  <id>http://www.google.com/m8/feeds/contacts/your%40gmail.com/base/13659b580fe5686d</id>
  <updated>2012-08-09T22:01:31.944Z</updated>
  <app:edited xmlns:app='http://www.w3.org/2007/app'>2012-08-09T22:01:31.944Z</app:edited>
  <category scheme='http://schemas.google.com/g/2005#kind' term='http://schemas.google.com/contact/2008#contact'/>
  <title>Diane Duane</title>
  <link
    rel='http://schemas.google.com/contacts/2008/rel#photo'
    type='image/*'
    href='https://www.google.com/m8/feeds/photos/media/your%40gmail.com/13659b580fe5686d?v=3.0'
    gd:etag='&quot;UWlBIlclWit7I2A9AFQKRg9YFXoHL0oQSQA.&quot;'/>
  <link
    rel='self'
    type='application/atom+xml'
    href='https://www.google.com/m8/feeds/contacts/your%40gmail.com/full/13659b580fe5686d?v=3.0'/>
  <link
    rel='edit'
    type='application/atom+xml'
    href='https://www.google.com/m8/feeds/contacts/your%40gmail.com/full/13659b580fe5686d?v=3.0'/>
  <gd:name>
    <gd:fullName>Diane Duane</gd:fullName>
    <gd:givenName>Diane</gd:givenName>
    <gd:familyName>Duane</gd:familyName>
  </gd:name>
  <gd:email
    rel='http://schemas.google.com/g/2005#other'
    address='diane.duane@gmail.com'
    primary='true'/>
</entry>

You take several deep breaths. You ignore the fact that you are fetching roughly 1kb of data per contact (out of potentially thousands) to get a name, an email, and the URL of an image. “Okay”, you think to yourself, “this is still salvageable. I can parse XML on the client. In fact, jQuery can probably do it for me.” You take a quick stab at grabbing names and emails.

success: function(data) {
  var xml = $.parseXML(data);
  $(xml).find('entry').each(function() {
    var entry = $(this);
    var name  = entry.find('title').text();
    var email = entry.find('email').attr('address');
  });
}

A quick browser test shows that this only appears to work in Chrome. A bit more digging turns up, to your chagrin, that jQuery has trouble finding namespaced elements like “" in XML documents on different browsers. You Google around and find a fix:

var email = entry.find('email, gd\\:email').attr('address');

For now, you ignore how inefficient this is, hoping merely to reach functionality. It works! Now you want to add images. It looks like one of the link elements under <entry> appears to point to an image for that contact. You fiddle around on the console:

entry.find('link[type="image/*"]').attr('href')
// => https://www.google.com/m8/feeds/photos/media/your%40gmail.com/13659b580fe5686d?v=3.0

Attempting to load this in your browser gives you a 401. Taking a look a the photo management section of the docs seems to suggest you need to additionally apply auth credentials to this url. You amend your code:

var href = entry.find('link[type="image/*"]').attr('href');
var imageUrl = href + '&' + $.param(authParams);

This seems to produce a real photo in your browser. Success! Let’s extrapolate:

$(xml).find('entry').each(function() {
  var entry = $(this);
  var href = entry.find('link[type="image/*"]').attr('href');
  var imageUrl = href + '&' + $.param(authParams);
  $('body').append('<img src="' + imageUrl + '" />');
});

Image contortions

Unfortunately, only a few images load. What’s going on? You squint at the docs more closely.

Note: If a contact does not have a photo, then the photo link element has no gd:etag attribute.

Great, so some image links have a magic attribute on them that says they’re real images. You wonder why Google even bothers returning image links for photos that don’t exist. You try something like the following:

.find('link[type="image/*"][gd\\:etag]') // or even
.find('link[type="image/*"][gd:etag]') // nope? how about
.find('link[type="image/*"][etag]')

But no, jQuery can’t deal with namespaced attribute selection, at all, so you arrive at:

var link = entry.find('link[type="image/*"]');
if (entry.attr('gd:etag')) {
  var imageUrl = link.attr('href') + '&' + $.param(authParams);
  $('body').append('&lt;img src="' + imageUrl + '" /&gt;');
}

This time, a few more photos load, but then they stop coming. The console shows a few successful image loads, but most of the requests for images returned with 503 (Service Unavailable) errors. You realize, after an hour or so, that each image load is being counted as an API call against you, and that there must be some rate limiting in place.

Naturally, this fact is completely undocumented. Playing around with the code, you find that Google doesn’t like it if you have more than one in-flight API request at a time. You come up with essentially the opposite of an image preloader to stagger image loading:

var imageUrls = ['https://...', ...];
function staggerImages(index) {
  if (index == imageUrls.length) return;
  var img = new Image();
  img.onload = function() {
    setTimeout(function() {
      // Load the next image 100ms after this one finishes loading
      staggerImages(index + 1);
    }, 100);
  };
  img.src = imageUrls[index];
  $('body').append(img);
}
staggerImages(0);

Whew, that was fun, right? At this point it seems like a good idea to try to sort contacts by some sort of relevance metric. Unfortunately, the Contacts API doesn’t support this at all. Oh well. You give up, having reached something approximating your original goals.

What have we learned?

  • Google doesn’t get JSON.
  • Google can’t design clean APIs or document them well.
  • Despite their browser-first mantra, Google doesn’t put out first-party Javascript libraries for the browser.
  • Vic Gundotra is soooooo lying.

Developers waiting for Google+ to deliver on its full, API-wonderland potential: you should probably just give up. What are the odds that this whole time, they’ve been cooking up the perfect write API, replete with features, libraries, and documentation? I’m betting that they’re doing exactly what I did when I worked for Google: absolutely nothing of importance.

Lastly, we’re always on the lookout for talented developers who are interested in the fashion space here at Everlane. You don’t need to own more than one pair of shoes. My love of fashion is born of William Gibson novels and is almost entirely academic. If you’re interested, check out our jobs page or send me an email at me@vincentwoo.com.

P.S. Yes, I am aware there is an older gdata library that can potentially handle contacts. I might have used it if it wasn’t deprecated or Google had made any mention of it whatsoever on their Contacts API page.

Love in the Time of MapReduce

This piece was written for an internal Google fiction contest, for the 100th edition of the engineering newsletter. The call to arms arrived in my inbox like so:

For this special Eng Newsletter issue, we’re running a “google eng-y” short fiction contest. You can write about anything, but the story must begin with these two words: “The MapReduce”.

Please note that some meaning may be lost on non-Googler’s, notably the bits concerning company hierarchy. All the opinions expressed are my own and obviously do not constitute the workings of an actual Google plan, etc. Jeff Dean is a very nice man. This is a piece of fiction in almost every sense.

The MapReduce was a piece of technology whose existence its steward, Jeff Dean, sometimes begrudged. It was glamorous, in a way, to be the public face of the algorithm that had essentially rewritten interpersonal contact, but it was also draining and surreal.

In one of Jeff’s increasingly common attacks of perspective, he realized that his daughters, too, had been completely swept up by a thing that he himself had designed, built, and evangelized. They were, of course, perfectly happy with the product. Jeff noted this with a tinge of grim pride, remembering the long nights of trial runs. Victoria and Natalie were a bit too happy, Jeff mused, so completely satisfied with something they could never understand (indeed, that he himself no longer understood well), that he found their lack of doubt troubling. Why didn’t they care that it probably shouldn’t work, that time and computation could twist statistics in such a fundamentally disturbing way? It was probably due to both of them being so preoccupied with Natalie’s wedding, he concluded wearily.

Later, outside of his office, Irina was waiting for him.

“Jeff, you have a visitor waiting for you in your office,” she said. Something in her tone gave away the urgency of the situation, and Jeff nodded, having long grown used to trusting Irina to manage his calendar more deftly than he could tie his shoes.

His suited visitor was a trim man of about sixty, which was unusual enough for the Googleplex in terms of both age and dress. He wore his graying hair swept back and neatly cropped. With a start, Jeff realized that his visitor was none other than a senator of Iowa.

“I’m Robert Graves, and sorry about showing up so unannounced, Mr. Dean,” said the man, with a smile. Jeff paused for a moment to admire how finely countenanced the man was, and to feel a small thrill at being so delightfully underdressed, himself.

“You’re the senator pushing for patent reform. I don’t watch TV much, but I’ve seen you on when my wife watches the news.” Jeff shook his visitor’s hand and seated himself behind his desk.

“The very same. Look, I’ll spare you the pleasantries and get right to why I’ve come. I’m told that engineers prize truth and directness.” Jeff lifted an eyebrow at this, having found that lately he valued being left well enough alone better than both of those things. “As you well know, MapReduce is proving problematic, socially. FOX is filming a reality TV show at this very moment about an engaged couple who are convinced that after trying out their MapReduce partners, they’ll still want to get married.”

“Jesus. How’s it looking for the couple?”

“Not good. Even worse, they’re filming it in my hometown.” Graves massaged his temples.

Jeff was not surprised. MapReduce rarely erred. Though it had begun as a general purpose framework for parallelizing search index updates, it eventually lent itself to analyzing the massive amounts of user generated social data Google+ collected. In time, this would become all that MapReduce was known for (at least externally of Google), in a queer reversal of how the words escalator and aspirin came to describe all such contrivances, though they were once only brands.

“Basically the right is getting as much fuel as it wants for its eternal fire of shouting about our perpetual moral decay. On top of that, MapReduce is having a powerful economic impact, which doesn’t help. We’re having an employment problem, as you’ve doubtless inferred by now, since you must have all the numbers on how many people are using MapReduce to pair up.”

The first Jeff had heard about the phenomenon the media had dubbed as the “honeymoon effect,” had been from the news itself, but he nodded anyway. “My citizens are up and leaving jobs they’ve worked at for a decade to meet their dreamboat on the other side of the world. I mean, great for them, but our coffers weren’t in great shape before, and your invention is a drain we can’t possibly afford right now, never mind the bad press. As much as I am for the future, I desperately need you to stop operating in my state.”

Jeff suppressed the urge to tell the man to just contact press@google.com, and instead reluctantly launched into a narrative he had delivered many times before. “I’m sure you’ve seen and read all the press releases about this. What we do isn’t terribly new. We provide a service that users want. In a sense, we provide nothing more than what eHarmony and Match.com have been providing for years, just with much less uncertainty and a bigger candidate pool.”

Robert snorted. “I’d hardly call ‘every Google user on Earth’ a bigger pool. Your operation is different, too. You know all the things that people have searched for, and all the things they’re too ashamed to search for. You know why some actresses draw men to them, and which men women will wait hours to receive texts from. Those bankrupt dating sites had only the constructed personas of the desperate to work with. There’s a case that could be made here for unlawful invasion of privacy and monopolistic abuse of information.”

A Googler rode past Jeff’s window on a small yellow bicycle. Jeff focused on the bright colors to briefly escape his current uncomfortable tension.

If Graves was right about anything, it was that MapReduce was uncannily effective. Through what some people might call sorcery, or what Jeff’s team leads described as “massively parallel Bayesian-adapted machine learning plus deep social mining,” it was able to identify, with nearly 97% confidence, a lifetime romantic partner for any given user. The algorithm could even supply just the right amount of shared interests as conversation starters, while leaving enough unsaid for the nascent couple to discover independently, leaving them feeling as if they had come to know each other intimately of their own volition. Some people found this deeply unnerving.

Even those people commonly derided by society could find love in this way, though MapReduce might take weeks instead of seconds to produce a suitable pairing. People of every sexual deviancy and every personal vice were being matched up, to the horror of the many people alienated by the brutal efficiency of MapReduce’s perfect lack of bias.

In short, romantic fulfillment was, for most people, little more work than clicking “I’m feeling lucky” and buying a plane ticket. This is what the people wanted more than anything else. Graves knew it, and Jeff knew that Graves knew it. Furthermore, Jeff knew that Graves was powerless to do anything about it, so strongly did the public crave MapReduce’s presence in the world. Yet Jeff felt sympathy for Grave’s willingness to shoulder the impossible task of squaring the budget against falling revenues and changing social tides.

“Mr. Graves, I understand your dilemma. The last thing you need right now is the income rug pulled out from under you. But look at it this way: about half of those people who have gone and paired off will probably come back to their hometown, bride or husband in tow, so your population will probably end up about even. After these couples outgrow their honeymoon period, they’ll settle down, work, have kids, and spend with an intensity that only the truly content can bring to bear. In the coming decade, your books might even make it into the black.”

Robert was not easily placated. “Can you say for certain that this is the way it’s going to play out? The world has never seen this kind of mass social movement. What if the people become complacent instead of motivated? What if your algorithms can’t guarantee long term stability?”

Jeff had an inner conflict. As usual, the side favoring the least amount of social friction won out. “We’re the ones who managed to pair everyone up so well in the first place, aren’t we? The models say the population will eventually converge on a higher level of stable productivity. I can’t promise you it’s going to happen, of course, but here at Google we have pretty high hopes for the future.”

The two men talked in this way for some time. The elder statesman pushed and the younger (but not exactly young) engineer deflected until the senator grew weary or satisfied enough to defer discussion to a later date. Jeff had managed to end the meeting with only vague promises, a surprising talent that had earned him his relative autonomy from Larry Page’s inner circle. Later he would have to file a report, naturally, detailing the intricacies of his conversation with the senator, but for now Page trusted him to keep third parties at arm’s length on his own.

Later that evening, after a quiet supper with Heidi, Jeff lay in bed thinking. The models actually didn’t say much about the economic reality of the future. The social data that allowed his team to pair people so effectively seemed to shrug mutely at the problem of what the future might be like. He had assured Graves that everything would be fine, but by the time Jeff could be proved wrong, he would be long retired.

Sleep took him. He dreamt, which was not unusual (though he didn’t know it), but he also remembered his dreams from that night, which was. He dreamt of a young man smashing a perfect chalice in a decrepit hallway, and of women who laughed while they danced away from their homes.

When he woke, Jeff knew what he had to do.

Sanjay probably could be trusted, but Jeff couldn’t take the chance. He would split his change into pieces, and sneak them into other, tangentially related changelists. The other developers on his team would probably rubber stamp these, anyway, since Jeff was one of the most prolific programmers there was. Who would look at yet another Jeff Dean code review too closely?

What did it mean to adhere to Google’s famed “Don’t be Evil” policy, when it came to arranging marriages? The standard Google answer would be to make the user as happy as possible without violating their trust. But what trust was there to violate if users themselves didn’t know what they wanted in relationships, or what would truly make them happy? Marriages are long lived beasts, Jeff reasoned, subject to slowly building changes in the macroeconomic climate. If marriages affect the economy, and the economy affects all marriages, what should you optimize for, and how?

Jeff’s changelists were approved, as a matter of course. Years later, he retired. The day he first started noticing what might have been the fruits of his subversion ripen, he remembered a thing that his old mentor Urs loved to say, before Urs had left him in charge.

“It is better to ask forgiveness than permission,” Hölzle would often chuckle, in a particularly German way. Jeff chuckled now, too.

Wired was doing a bio piece on a recently minted tech millionaire. The man was one of the few people for whom MapReduce’s pairing hadn’t worked out in the long term. When asked what had motivated him to start the company he had just sold, the man somewhat abashedly said that he wanted to prove to his ex-wife that dumping him was a mistake.

Ambition and talent sometimes survive contact with love, Jeff mused, but are more often dulled by it. MapReduce could identify those individuals who are defined by intelligence, drive, and pride. In other words, the archetypal entrepreneur. A few modified terms in a complex linear algebra equation could yield surprising results, Jeff had discovered, like optimizing for romantic partners that would net the largest increase in a person’s ambition, rather than happiness. A lot of the unfortunate people of talent singled out by Jeff’s modification would probably yield little value, but one, he hoped, would build the next Google. Jeff longed to see that day.