February 28th 2011
Last week I was working on Facebook’s peak traffic puzzle, which was a pretty entertaining and informative exercise. The idea is basically to parse a log file and generate an undirected graph that represents mutual friendships between Facebook users. Then you are to find every group of friends where each friend in the group is friends with every other person in the group.
I wrote a naive implementation at first that looked something like
This appeared to work pretty well on my tests, but continued to fail on Facebook puzzle submission for some reason I couldn’t track down at the time. In my frustration, I went to the internet. Turns out, this problem is actually known as list maximal cliques, a category of clique problem. The prior link has more information about cliques, which are subgraphs that have the friend property desired by this problem. Additionally, there is a known “pretty good” strategy for solving this problem, the Bron Kerbosch algorithm .
I went all out and applied a couple optimizations, namely (warning: abstract pdfs) pivoting and degeneracy ordering, which are two techniques used to cut down on the number of recursive calls. At this point my code looked something like
Which let me parse a 120MB input file in 1 minute flat. However, for some reason this was slower than my naive solution, which could do it, ironically, in 48s. It did pass the Facebook puzzle checker, though, so at least I had that. If anyone has any insight into why my naive solution appears to outperform bron_kerbosch I would greatly appreciate it, as it does not seem correct. My guess is slow implementations in Ruby of Array “set-like” operators (&, +).
I spent 2-3 days stuck on bugs that amounted to
- parsing input off by one whitespace
- output missing a final newline
- mysterious bug that turned out to be RUBY 1.8.6 NOT HAVING MAX_BY AND ME TESTING IN 1.8.7
Facebook absolutely does not provide you any clues as to what went wrong besides there being a syntax error in your build. In this case it falls on you to both duplicate their runtime environment and generate workable tests. But finally seeing the confirmation email is definitely worth it. Here are the results after finally getting my naive solution running.
Thank you for your submission of a puzzle solution to Facebook! After running your solution to peaktraffic (received on February 28, 2011, 4:07 pm), I have determined it to be correct. Your solution ran for 1162.186 ms on its longest test case.
Here are my two solutions to this problem. The codepath that executes the Bron Kerbosch algorithm is not active but fairly obvious. They both work.