Sure...
Go there yourself and look at the disallow list:
http://whitehouse.gov/robots.txt

u-238:/home/jamest %wget http://whitehouse.gov/robots.txt
.
.
.
u-238:/home/jamest %wc -l robots.txt
1734 robots.txt
u-238:/home/jamest %grep -c Disallow robots.txt
1729
You'll see which paths a well-behaved archiver/search engine is disallowed from crawling.
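(For reference, each of those directives follows the standard robots.txt syntax; the path below is a made-up placeholder, not an entry from the actual file:

User-agent: *
Disallow: /example/private/

Each Disallow line tells compliant crawlers not to fetch anything under that path.)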
Now, that's not stopping anyone from using wget's* mirror function and grabbing everything from a site, but it keeps Google et al. from archiving and/or indexing the content.
*: Of course, the site could still have an IP filter in place to block any request from certain locales.
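For the curious, such a mirror run would look roughly like this; note that wget honors robots.txt by default, so you have to switch the check off explicitly (illustrative invocation only):

u-238:/home/jamest %wget --mirror -e robots=off http://whitehouse.gov/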
--JamesT