Facebook Puzzle: peaktraffic

Last week I was working on Facebook’s peak traffic puzzle, which was a pretty entertaining and informative exercise. The idea is basically to parse a log file and generate an undirected graph that represents mutual friendships between Facebook users. Then you are to find every group of friends where each friend in the group is friends with every other person in the group.

I wrote a naive implementation at first that looked something like

# $seen is a hash to menoize previously seen sets
# $sparse is a hash of usernames to a list of neighboring usernames
# $set is the list of output clusters

$seen = {}
def subgraph(set, adj)
    hash = (set + adj).sort
    return if $seen[hash]
    $sets.push set.sort.join(", ") if adj.empty? and set.size > 2
    adj.each {|node| subgraph(set + [node], $sparse[node] & adj)}
    $seen[hash] = true
end

$sparse.keys.each do |vertex|
    subgraph([vertex], $sparse[vertex])
end

This appeared to work pretty well on my tests, but continued to fail on Facebook puzzle submission for some reason I couldn’t track down at the time. In my frustration, I went to the internet. Turns out, this problem is actually known as list maximal cliques, a category of clique problem. The prior link has more information about cliques, which are subgraphs that have the friend property desired by this problem. Additionally, there is a known “pretty good” strategy for solving this problem, the Bron Kerbosch algorithm .

I went all out and applied a couple optimizations, namely (warning: abstract pdfs) pivoting and degeneracy ordering, which are two techniques used to cut down on the number of recursive calls. At this point my code looked something like

def generate_degeneracy_ordering
    d = []  #degree buckets
    dw = {} #degree for each vertex
    $sparse.each_pair do |vertex, neighbors|
        deg = neighbors.size
        d[deg] ||= []
        d[deg].push vertex
        dw[vertex] = deg
    end
    d.each_index {|i| d[i] ||= []}
    $sparse.size.times do
        vertex = d.find {|x| !x.empty?}.pop
        $degen.push vertex
        for neighbor in $sparse[vertex]
            if d[dw[neighbor]].delete neighbor
                dw[neighbor] -= 1
                d[dw[neighbor]].push neighbor
            end
        end
    end
end

def bron_kerbosch(set, points, exclude, pivot_neighbors=nil)
    if points.empty?
        $sets.push set.sort.join(', ') if set.size > 2 and exclude.empty?
        return
    end

    pivot_neighbors ||= (exclude.empty? or $sparse[points.last].size > $sparse[exclude.last].size) ?
        $sparse[points.last] : $sparse[exclude.last]

    points.each_with_index do |vertex, i|
        next if pivot_neighbors.include? vertex
        points[i] = nil
        bron_kerbosch(set + [vertex],
                      points & $sparse[vertex],
                      exclude & $sparse[vertex])
        exclude.push vertex
    end
end

exit unless ARGV.size == 1
ingest(ARGV[0])

generate_degeneracy_ordering
before = []
after = $degen[1..$degen.size-1]
$degen.each do |vertex|
    intersect = after & $sparse[vertex]
    bron_kerbosch([vertex],
                  intersect,
                  before & $sparse[vertex],
                  $sparse[intersect.last]) #last elements in $degen have highest degrees
    before.push vertex
    after.shift
end

Which let me parse a 120MB input file in 1 minute flat. However, for some reason this was slower than my naive solution, which could do it, ironically, in 48s. It did pass the Facebook puzzle checker, though, so at least I had that. If anyone has any insight into why my naive solution appears to outperform bron_kerbosch I would greatly appreciate it, as it does not seem correct. My guess is slow implementations in Ruby of Array “set-like” operators (&, +).

final thoughts

I spent 2-3 days stuck on bugs that amounted to

  • parsing input off by one whitespace
  • output missing a final newline
  • mysterious bug that turned out to be RUBY 1.8.6 NOT HAVING MAX_BY AND ME TESTING IN 1.8.7

Facebook absolutely does not provide you any clues as to what went wrong besides there being a syntax error in your build. In this case it falls on you to both duplicate their runtime environment and generate workable tests. But finally seeing the confirmation email is definitely worth it. Here are the results after finally getting my naive solution running.

Dear submitter,

Thank you for your submission of a puzzle solution to Facebook! After running your solution to peaktraffic (received on February 28, 2011, 4:07 pm), I have determined it to be correct. Your solution ran for 1162.186 ms on its longest test case.

Here are my two solutions to this problem. The codepath that executes the Bron Kerbosch algorithm is not active but fairly obvious. They both work.

Cheap Toto Pagination

Toto is a great tiny blogging platform for Ruby/Rack. However, it doesn’t expose much in the way of a MVC structure and it can be just annoying enough when you want to add some feature that isn’t there. In this case, I wanted to add some simple older/newer pagination to the front page. To my chagrin, I couldn’t find a way to pass variables to a Toto page without using the GET variable syntax (i.e. ?page=1) and I still wanted to hold onto the rails RESTful paradigm of /page/1, so I monkey patched the Toto::Site dispatcher, like so:

# in config.ru
class Toto::Site
    alias_method :old_go, :go

    def go route, env = {}, type = :html
        if not route.first =~ /\d{4}/ and route.size == 2 and route.last =~ /(\d+)/
            @config[:id] = route.last.to_i
            route.pop
        end
        ret = old_go route, env, type
        @config.delete :id
        ret
    end
end
...
# and in the config initializer block:
set :root, "page"                                           # page to load on /

You can see that we intercept routes that look like name/1234 and pass the numeric portion of the route in @config[:id], and then clear @config[:id] (because @config is persistent). This is pretty hacky and only really acceptable in the context of Heroku caching everything.

and in templates/pages/page.rhtml…

<%
    page = @config[:id]
    per_page = @config[:articles_per_page]
    page = 1 if (page.nil? or (page-1) * per_page > @articles.length) or page < 1
    page_results = @articles[(page-1) * per_page .. page * per_page - 1]
    prev_page = page > 1 ? page - 1 : nil
    next_page = @articles.length > page * per_page ? page + 1 : nil
%>
...
<p id="footer">
<% if prev_page %>
    <a href="/page/<%=prev_page%>">&laquo; newer</a>
    <% if next_page  %>|<% end %>
<% end %>
<% if next_page %>
    <a href="/page/<%=next_page%>">older &raquo;</a>
<% end %>
</p>

Also, syntax highlighting brought to you by highlight.js.

Interactive Google Doodle

Google’s take on an interactive ball logo was interesting enough to me that I felt that I needed to write something at least as fun. My take is backed by HTML5, so you will need a suitably compliant browser to view it.

Features

  • Mass based ball collisions
  • Mouse input to scatter balls
  • Balls “anchored” to positions with dampened springs
  • Typing creates additional interactable balls
  • Searching cues exit animation

Technical information

The code for this demo (as well this entire blog) is on github. Here is the primary javascript engine.

In google’s original doodle, the balls would react with changes in size and velocity to the presence of the mouse. They would not, however, collide with anything. I thought this was a shame, and set out to make a HTML5 canvas app that could handle collisions between a large number of objects.

A naive initial implementation worked well enough. It looked like this (pseudocode is rubyish):

for i in 0 to circles.length - 2
for j in i + 1 to circles.length - 1
handleCollision(circles[i], circles[j]);
end
end

which served me well enough on Canary on my admittedly beefy desktop. It wasn’t until I tried to run the same on Firefox 3.x on my work laptop that I realized I probably wasn’t performant on a wide range of configurations.

The thing to do, then, was to set up a 2D grid covering the entire canvas. Each cell is about the diameter of a circle, and knows about any circles within itself, as well as up to four of its neighbors. Then, collision code becomes something like

for circle in circles
for potentialCollider in grid.getPotentialColliders(circle.pos)
handleCollision(circle, potentialCollider);
end
end

def grid.getPotentialColliders(pos)
cell = getCell(pos)
ret = []
for neighbor in cell.neighbors
ret = ret.concat(neighbor.objects)
end
ret
end

which takes the runtime complexity down from n^2 to about n. Neat!

Probably the hardest part was making circles appear for the query text. Understanding whether the user added characters, deleted, replaced, pasted, etc. is kind of difficult. I honestly hacked together a solution that works in 95% of use cases, but the best solution would to be to implement a full fledged dynamic programming algorithm to find the minimum edit distance.

And finally, a big thanks to Rob Hawke’s recreation of the google ball logo for both reference and initial source data points.

The Gustav: A BC2 Hack

The Gustav is a closed source C++ hack for Battlefield: Bad Company 2. It incorporates realtime physics simulation to determine potential targets, as well as unique D3D hooking to provide visual cues for the user in the game world itself.

It has taken me about three months of tinkering in my time off to put together and I am more or less satisfied with what it has taught me about physics and reverse engineering, which are two subjects that are very easy to unlearn without practice.

I have written up a couple pieces of interest encountered while developing the Gustav, namely how to draw in 3D within the game world while respecting Z-depth and a short primer on how to write a tree-recursive menu, the latter of which will probably be nothing new to any experienced dev.

on resilience

Russell Bennett:
i’m worried about steve man
all this resilience is getting to his head
he keeps trying to get people to hurt him in real life
talking about how it’s “going to hurt 40% less”
i think you need to talk to him

Vincent Woo
hahahaha
shit man
have you talked to him about how your health pool is still more important

Russell Bennett
yeah man
still running around with 96k hp, gemming everything res
saying “no it’s better i don’t need socket bonuses i read it on ej”