A researcher and startup founder saw that he could crawl all Facebook pages because nothing in the site's robots.txt prohibited it. So he did: he collected the data from Facebook and began his research (who talked to whom, for example).
When it was all ready to go public, he got a call from Facebook's legal department.
Their contention was robots.txt had no legal force and they could sue anyone for accessing their site even if they scrupulously obeyed the instructions it contained. The only legal way to access any web site with a crawler was to obtain prior written permission. http://petewarden.typepad.com/searchbrowser/2010/04/how-i-got-sued-by-facebook.html
But since a lone founder would have ended up broke, he couldn't test this in court (assuming there was even a judge who would understand it correctly).
So he complied (facing a lawyer from such an international company makes you very nervous indeed, especially if you have responsibilities).
But the main question remains: if robots.txt has no legal status, and the only rights that count are those set out in a 'user agreement', what does this mean for Google and all the other search engines and research tools?
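
For readers who have never looked at what "obeying robots.txt" means in practice, here is a minimal sketch of the check a well-behaved crawler typically performs before fetching a page. It uses Python's standard urllib.robotparser module. The robots.txt content, the "research-bot" user agent and the example.com URLs are made up for illustration; they are not Facebook's actual file or the researcher's actual crawler. Note that this only answers the technical question of whether the file permits a fetch, which is exactly the thing Facebook's lawyers said carries no legal weight.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only
# (not the real file of any site mentioned above).
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/public/page",
                "https://example.com/private/data"):
        verdict = "allowed" if is_allowed(EXAMPLE_ROBOTS_TXT, "research-bot", url) else "disallowed"
        print(url, "->", verdict)
```

A real crawler would fetch the live file first (RobotFileParser also offers set_url() and read() for that) and run this check before every request.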