robots.txt is designed to tell search engines not to crawl certain pages or content on your site, typically content that you do not want people to find through search.

For example, the file below tells every user agent (mostly search engine crawlers) not to crawl any part of the site.

User-agent: *
Disallow: /
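
As a rough illustration of how a well-behaved crawler reads these two lines, here is a minimal sketch using Python's standard `urllib.robotparser` module. The helper name `is_allowed` and the `www.example.com` URLs are assumptions made up for this example, not part of any real site.

```python
from urllib.robotparser import RobotFileParser

# The same two-line robots.txt shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to crawl url.

    Hypothetical helper for illustration; it parses the text locally
    instead of downloading it from a server.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Every path is disallowed for every crawler.
    print(is_allowed(ROBOTS_TXT, "Googlebot", "https://www.example.com/any/page"))
```

Note that this only describes what a polite crawler chooses to do. Nothing in robots.txt stops a client that ignores the file.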

However, this can give an attacker a starting point for probing your site, because the file is public and lists the very paths you wanted to keep out of sight.

Take a look at Facebook’s robots.txt:

# Notice: Crawling Facebook is prohibited unless you have express written
# permission. See: http://www.facebook.com/apps/site_scraping_tos_terms.php

User-agent: baiduspider
Disallow: /ajax/
Disallow: /album.php
Disallow: /checkpoint/
Disallow: /contact_importer/
Disallow: /feeds/
Disallow: /file_download.php
Disallow: /hashtag/
Disallow: /l.php
Disallow: /p.php
Disallow: /photo.php
Disallow: /photos.php
Disallow: /sharer/
Disallow: /topic/

User-agent: Bingbot
Disallow: /ajax/
Disallow: /album.php
Disallow: /checkpoint/
Disallow: /contact_importer/
Disallow: /feeds/
Disallow: /file_download.php
Disallow: /hashtag/
Disallow: /l.php
Disallow: /p.php
Disallow: /photo.php
Disallow: /photos.php
Disallow: /sharer/
Disallow: /topic/

User-agent: Googlebot
Disallow: /ajax/
Disallow: /album.php
Disallow: /checkpoint/
Disallow: /contact_importer/
Disallow: /feeds/
Disallow: /file_download.php
Disallow: /hashtag/
Disallow: /l.php
Disallow: /p.php
Disallow: /photo.php
Disallow: /photos.php
Disallow: /sharer/
Disallow: /topic/

User-agent: ia_archiver
Disallow: /
Disallow: /ajax/
Disallow: /album.php
Disallow: /checkpoint/
Disallow: /contact_importer/
Disallow: /feeds/
Disallow: /file_download.php
Disallow: /hashtag/
Disallow: /l.php
Disallow: /p.php
Disallow: /photo.php
Disallow: /photos.php
Disallow: /sharer/
Disallow: /topic/

User-agent: msnbot
Disallow: /ajax/
Disallow: /album.php
Disallow: /checkpoint/
Disallow: /contact_importer/
Disallow: /feeds/
Disallow: /file_download.php
Disallow: /hashtag/
Disallow: /l.php
Disallow: /p.php
Disallow: /photo.php
Disallow: /photos.php
Disallow: /sharer/
Disallow: /topic/

User-agent: Naverbot
Disallow: /ajax/
Disallow: /album.php
Disallow: /checkpoint/
Disallow: /contact_importer/
Disallow: /feeds/
Disallow: /file_download.php
Disallow: /hashtag/
Disallow: /l.php
Disallow: /p.php
Disallow: /photo.php
Disallow: /photos.php
Disallow: /sharer/
Disallow: /topic/

User-agent: seznambot
Disallow: /ajax/
Disallow: /album.php
Disallow: /checkpoint/
Disallow: /contact_importer/
Disallow: /feeds/
Disallow: /file_download.php
Disallow: /hashtag/
Disallow: /l.php
Disallow: /p.php
Disallow: /photo.php
Disallow: /photos.php
Disallow: /sharer/
Disallow: /topic/

User-agent: Slurp
Disallow: /ajax/
Disallow: /album.php
Disallow: /checkpoint/
Disallow: /contact_importer/
Disallow: /feeds/
Disallow: /file_download.php
Disallow: /hashtag/
Disallow: /l.php
Disallow: /p.php
Disallow: /photo.php
Disallow: /photos.php
Disallow: /sharer/
Disallow: /topic/

User-agent: teoma
Disallow: /ajax/
Disallow: /album.php
Disallow: /checkpoint/
Disallow: /contact_importer/
Disallow: /feeds/
Disallow: /file_download.php
Disallow: /hashtag/
Disallow: /l.php
Disallow: /p.php
Disallow: /photo.php
Disallow: /photos.php
Disallow: /sharer/
Disallow: /topic/

User-agent: Yandex
Disallow: /ajax/
Disallow: /album.php
Disallow: /checkpoint/
Disallow: /contact_importer/
Disallow: /feeds/
Disallow: /file_download.php
Disallow: /hashtag/
Disallow: /l.php
Disallow: /p.php
Disallow: /photo.php
Disallow: /photos.php
Disallow: /sharer/
Disallow: /topic/

User-agent: Yeti
Disallow: /ajax/
Disallow: /album.php
Disallow: /checkpoint/
Disallow: /contact_importer/
Disallow: /feeds/
Disallow: /file_download.php
Disallow: /hashtag/
Disallow: /l.php
Disallow: /p.php
Disallow: /photo.php
Disallow: /photos.php
Disallow: /sharer/
Disallow: /topic/

User-agent: ia_archiver
Allow: /about/privacy
Allow: /full_data_use_policy
Allow: /legal/terms
Allow: /policy.php

User-agent: *
Disallow: /

An attacker could then try requesting these paths directly, e.g. www.facebook.com/topic/ (I tried it, and it shows a "page not available" message).
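
To show how little effort it takes to read the listed paths out of a robots.txt file, here is a small sketch in Python. The function name `disallowed_paths` and the shortened sample text are assumptions for illustration. The sample copies only a few lines from the listing above.

```python
from typing import List

# A short excerpt of the listing above, used as sample input.
SAMPLE = """\
# Notice: Crawling Facebook is prohibited unless you have express written
# permission.

User-agent: Googlebot
Disallow: /ajax/
Disallow: /album.php
Disallow: /checkpoint/

User-agent: *
Disallow: /
"""


def disallowed_paths(robots_txt: str) -> List[str]:
    """Collect the unique Disallow values, in order of first appearance."""
    paths: List[str] = []
    for raw_line in robots_txt.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        if field.strip().lower() == "disallow" and value and value not in paths:
            paths.append(value)
    return paths


if __name__ == "__main__":
    for path in disallowed_paths(SAMPLE):
        print(path)
```

Anyone can run something like this against a public robots.txt, which is why the file should not be treated as a way of hiding anything.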

How to prevent this?

You can build your web application with a modern web framework such as Laravel, where you define the routes yourself, e.g.

<?php
Route::get('foo/bar', function () {
    return 'Hello World';
});

When an attacker requests www.yoursite.com/foo, they will not get anything, because only foo/bar is registered as a route. So be careful when you design your web application.
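
The same idea can be sketched outside Laravel. The following is a hypothetical, minimal router in Python, not how Laravel works internally: the names `ROUTES` and `dispatch` are made up for this example. It only shows that when routes are registered explicitly, a path that was never registered gets no handler.

```python
from typing import Callable, Dict, Tuple

# Explicitly registered routes: only "foo/bar" exists, "foo" does not.
ROUTES: Dict[str, Callable[[], str]] = {
    "foo/bar": lambda: "Hello World",
}


def dispatch(path: str) -> Tuple[int, str]:
    """Return (status code, body) for an exact match against ROUTES."""
    handler = ROUTES.get(path.strip("/"))
    if handler is None:
        return 404, "Not Found"
    return 200, handler()


if __name__ == "__main__":
    print(dispatch("/foo/bar"))
    print(dispatch("/foo"))
```

In this sketch a request for /foo/bar reaches the handler, while a request for the parent path /foo gets a 404 response.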

Update: 4 Apr 2018

You may check your robots.txt here.