开发者

"The Social Network" programming puzzle

There's a neat sequence in the movie The Social Network in which the character writes a perl script to grab images from sorority web servers on campus. His goal is to get a picture for ev开发者_JS百科ery member of each sorority with a minimum of missed members. Typically, this just involves him grabbing it from a public directory or other little hoops like an empty search which returns all members, but he describes one really interesting set up and never gives a solution for it.

One sorority's site allows for searching and returns the pictures for matching members. However, if a search returns more than 20 matches, nothing is displayed.

Assuming no other way to access the pictures and without a list of the names of sorority members, is there an elegant way to get at least a majority of member pictures in this case? Or any way at all?

Edit: Here's a link to the scene from the movie, slightly cut up to show only the coding parts.


Pick up the phone, ask for a campus directory and feed those names into the sorority's search to get members back one at a time.

This is, after all, the social network.


I haven't seen the movie, but let me state some assumptions:

  • You have a search field which searches by name.
  • You don't know the names of any sorority members.
  • There is no other way to access the pictures (except the search box).

In this scenario, I do not think there is an elegant answer. This may be one of those cinematic "I translate the ancient tablet in an unknown language" sort of moments. My guess is your best bet would be a brute-force search.

  1. If you search by common names (and last names), you could get a majority of the members.
  2. If you have the time and will, an actual brute force (letter by letter, etc) would eventually fill up the gaps missed by item 1.

Edit: Also, in theory, if the sorority is large enough and everyone is named "Jane Smith", there would be no solution.


findall(prefix):
   res = set()
   for char in alphabet:
      sresults = search(prefix + char)
      if len(sresults) == 0 and len(prefix) < ABORT_SIZE:
          res += findall(prefix + char)
      else:
          res += sresults
   return res
findall("")

Notice the solution will take a long time if the distribution of names is not approximately equal because it will endlessly enumerate all suffixes for "ab" if there are 20 people matching "abc" and 1 matching "abd". You can modify ABORT_SIZE to balance time and completeness. It should be higher than log|alphabet|(n), where n is the (unknown) number of final results.


I guess you could use a kind of a "dictionary attack" - did you know that you can download all last and/or first names from US census bureau? My other option was what @phihag proposed.


I've got the book "The Accidental Billionaires" here and it quotes directly from his LiveJournal, and ... that didn't happen. You do know that the guy who wrote the script knows absolutely nothing about code, right?

From what I'm reading, he used something LWP to spider 'Lowell' and 'Adams', then he got to 'Dunster' which is the one with the 20-results problem, and he writes "I'll come back later". So for all I can see, he may have just given up.

He also writes that "It's taking a few tries to compile the script".

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜