Sure...

Go there yourself and look at the disallow list:

http://whitehouse.gov/robots.txt

u-238:/home/jamest %wget http://whitehouse.gov/robots.txt
.
.
.
u-238:/home/jamest %wc -l robots.txt
1734 robots.txt
u-238:/home/jamest %grep -c Disallow robots.txt
1729

You'll see which paths the site asks archivers and search engines not to crawl.

Now, that's not stopping anyone from using wget's* mirror function to grab everything from a site, but it keeps Google, et al. from archiving and/or indexing the content. robots.txt is purely advisory; well-behaved crawlers honor it voluntarily.

*: Of course, the site could still have an IP filter in place to block any request from certain locales.
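To make the "advisory only" point concrete, here's a minimal sketch using Python's stdlib `urllib.robotparser`. The Disallow rules and URLs below are made up for illustration (not the actual whitehouse.gov file); a polite crawler checks `can_fetch` before requesting a page, while something like `wget --mirror` simply never asks.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
rules = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# A polite crawler consults the rules before fetching each URL.
print(rp.can_fetch("*", "http://example.com/private/report.html"))  # False
print(rp.can_fetch("*", "http://example.com/index.html"))           # True
```

Nothing enforces the result: the parser just tells a cooperating client what the site operator asked for, which is why server-side measures (IP filters, auth) are the only real barrier.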

--JamesT


>--------------< --Chemguru 99 CSVT Frost /Mid. Blue 00 Suzuki SV650 Red, Naked