#849344 01/21/04 08:49 PM
Joined: Feb 2001
Posts: 750
T
Veteran CEG'er
OP Offline
WWW gurus out there, does this story about the robots.txt stuff make sense?
I find it hard to believe, and even harder to explain if it is true.

T.

#849345 01/21/04 09:09 PM
Joined: Jun 2000
Posts: 1,231
S
Hard-core CEG'er
Offline
Makes perfect sense, and it's probably legit.


-06 GTO Torrid Red/M6 -98 LS with BPU -05 Honda Odyssey EX-L mv .zig ..\for\great\.justice
#849346 01/21/04 09:15 PM
Joined: Jun 2001
Posts: 682
D
Veteran CEG'er
Offline
Makes sense for the Bush admin, but how about the rest of the country?


98.5 Contour SVT "Too many OB/GYNs aren't able to practice their love with women all across this country" --US President George W Bush
#849347 01/21/04 09:59 PM
Joined: Dec 2002
Posts: 2,069
D
Hard-core CEG'er
Offline
Very low believability. The listing isn't very official.

#849348 01/21/04 10:05 PM
Joined: Aug 2002
Posts: 68
J
CEG'er
Offline
Yes. That is one way to prevent search engines from indexing certain directories on your site.
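
For anyone who wants to see what such a rule looks like, here is a minimal sketch using Python's standard urllib.robotparser. The sample rules, the example.com URLs, the "ExampleBot" name, and the is_allowed helper are all made up for illustration; they are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are invented for this example.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/private/notes.html"):
        print(url, is_allowed(SAMPLE_ROBOTS_TXT, "ExampleBot", url))
```

A crawler that follows the convention would skip anything under /private/ and /drafts/ here and index the rest.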


AOL AIM: nerdsalot2 05 Mustang GT DropTop 98 ZJ 4.0L Limited
#849349 01/21/04 10:06 PM
Joined: Jul 2001
Posts: 3,037
J
Hard-core CEG'er
Offline
Sounds like x-no-archive for Google Groups/Usenet.


"Think of it, if you like, as a librarian with a G-string under the tweed." Clarkson on the Mondeo.
#849350 01/22/04 02:27 AM
Joined: Dec 2001
Posts: 777
C
Veteran CEG'er
Offline
Sure...

Go there yourself and look at the disallow list:

http://whitehouse.gov/robots.txt

u-238:/home/jamest %wget http://whitehouse.gov/robots.txt
.
.
.
u-238:/home/jamest %wc -l robots.txt
1734 robots.txt
u-238:/home/jamest %grep -c Disallow robots.txt
1729
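
If you don't have wc and grep handy, the same two counts can be sketched in Python. This is only an illustration: the count_disallow_lines helper and the sample text are hypothetical, and it works on text you have already downloaded rather than fetching anything itself.

```python
def count_disallow_lines(robots_txt: str) -> tuple:
    """Return (total_lines, lines_containing_Disallow) for robots.txt text.

    Roughly mirrors `wc -l` and `grep -c Disallow` for a file whose
    last line ends with a newline.
    """
    lines = robots_txt.splitlines()
    # grep -c counts matching lines, not total occurrences, and is case-sensitive.
    matching = sum(1 for line in lines if "Disallow" in line)
    return len(lines), matching


if __name__ == "__main__":
    # Made-up sample; substitute the contents of the file you saved.
    sample = "User-agent: *\nDisallow: /a/\nDisallow: /b/\n"
    print(count_disallow_lines(sample))
```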

You'll see which directories are disallowed from being searched by an archiver/search engine.

Now, that's not stopping anyone from using wget's* mirror function (with its robots.txt handling turned off) and grabbing everything from a site, but it keeps Google et al. from archiving and/or indexing the content.

*: Of course, the site could still have an IP filter in place to block any request from certain locales.
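
To make the "not stopping anyone" point concrete, here is a rough sketch of why robots.txt is only a request: the check happens on the client side, so a tool that skips it fetches everything. The plan_fetches function, the respect_robots flag, the "ExampleBot" name, and the example.com URLs are all hypothetical, made up for this illustration.

```python
from urllib.robotparser import RobotFileParser


def plan_fetches(urls, robots_txt, user_agent="ExampleBot", respect_robots=True):
    """Return the URLs a client would go on to fetch.

    A well-behaved crawler filters its list through robots.txt;
    a client that ignores the file simply keeps every URL.
    """
    if not respect_robots:
        # Nothing on the server side enforces the rules.
        return list(urls)
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if parser.can_fetch(user_agent, u)]


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /private/\n"
    urls = ["http://example.com/", "http://example.com/private/a.html"]
    print(plan_fetches(urls, rules))                        # polite client
    print(plan_fetches(urls, rules, respect_robots=False))  # ignores the file
```

Actually blocking requests takes something on the server, like the IP filter mentioned in the footnote.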

--JamesT


>--------------< --Chemguru 99 CSVT Frost /Mid. Blue 00 Suzuki SV650 Red, Naked
