#1
12-12-09, 08:02
BOD Member
Join Date: Nov 2009
Posts: 100
Robots.txt

I found a robots.txt file in the root of my web site with the following code in it:

User-agent: *
Disallow:

What does this mean?

#2
12-12-09, 08:29
BOD Member
Join Date: Nov 2009
Posts: 46

Robots.txt is used to control how search engine robots visit your site for indexing purposes.

The code you found says that robots are allowed to crawl and index every part of the site.
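
For anyone who would rather check this than take it on trust, here is a minimal sketch using Python's standard `urllib.robotparser` module. It parses the two lines from the first post and asks whether a crawler may fetch a few pages. The `example.com` host, the sample paths, and the `SomeBot` agent name are placeholders for illustration, not anything from the original site.

```python
from urllib.robotparser import RobotFileParser

# The exact rules quoted in the first post.
FOUND_RULES = [
    "User-agent: *",
    "Disallow:",
]


def is_allowed(rules, url, agent="*"):
    """Return True if `agent` may fetch `url` under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(rules)  # parses in-memory lines; no network request is made
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # example.com, the paths, and SomeBot are placeholders for illustration only.
    for path in ("/", "/about.html", "/private/notes.txt"):
        url = "http://example.com" + path
        print(path, is_allowed(FOUND_RULES, url, agent="SomeBot"))
```

This only shows how Python's parser reads the file; individual search engines may differ in details, but an empty `Disallow:` is conventionally read as "nothing is off limits".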

#3
12-12-09, 08:49
BOD Member
Join Date: Nov 2009
Posts: 47

Actually, you don't even need that code, because search engine robots (robots for short) will index your site by default. But if you put:
User-agent: *
Disallow: /
it will block robots from your entire site.
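
To make the difference concrete, the sketch below compares three cases with Python's standard `urllib.robotparser`: a file with no rules at all, the empty `Disallow:` from the first post, and `Disallow: /`. The `example.com` host and the `/index.html` path are assumptions made for the example, and the result reflects this one parser's interpretation.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_lines, path, agent="*"):
    """Check one path against robots.txt lines held in memory."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    # example.com is a placeholder host used only for illustration.
    return parser.can_fetch(agent, "http://example.com" + path)


# Three variants of the file discussed in this thread.
VARIANTS = {
    "no rules": [],
    "empty Disallow": ["User-agent: *", "Disallow:"],
    "Disallow: /": ["User-agent: *", "Disallow: /"],
}

# Map each variant name to whether /index.html may be crawled.
RESULTS = {
    name: can_crawl(lines, "/index.html") for name, lines in VARIANTS.items()
}

if __name__ == "__main__":
    for name, allowed in RESULTS.items():
        print(f"{name}: {'allowed' if allowed else 'blocked'}")
```

The first two variants come out as allowed and only the third as blocked, which matches the point above: the single `/` is what turns "allow everything" into "block everything".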

#4
12-12-09, 09:04
BOD Member
Join Date: Oct 2009
Posts: 100

Having your site indexed is the most important thing to ensure. Having that code in the root of your site is OK, but if you don't know what the code inside this file does, it's better to delete the file than to keep rules that may prevent robots from indexing the site.

#5
12-12-09, 09:29
BOD Member
Join Date: Nov 2009
Posts: 100

I don't like the clutter, so I took you up on your advice and removed the file completely. And of course I realize that indexing is very important.

#6
12-18-09, 14:36
BOD Member
Join Date: Dec 2009
Posts: 16

Are there any real reasons why someone would not want robots indexing their site? The robots are what help you get the page rank and search stats, right?

#7
12-19-09, 10:21
BOD Member
Join Date: Dec 2009
Posts: 37

I'm curious to know the answer to Holly's question. I have always thought that allowing robots access to every bit of information was critical to getting a good PageRank.

#8
12-26-09, 13:43
BOD Member
Join Date: Dec 2009
Posts: 22

This is done to stop search engines from accessing private or irrelevant sections of your website, or during the early development of a website if the client wants to keep the site secret until the official launch date.
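
As a rough sketch of blocking only some sections, the example below uses Python's standard `urllib.robotparser` with a made-up rule set. The `/admin/` and `/dev/` directories, the sample paths, and the `example.com` host are all hypothetical and chosen only to illustrate the idea.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: /admin/ and /dev/ are invented for this sketch.
PARTIAL_RULES = """\
User-agent: *
Disallow: /admin/
Disallow: /dev/
"""


def blocked_paths(robots_text, paths, agent="*"):
    """Return the subset of `paths` that `agent` is asked not to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # example.com is a placeholder host used only for illustration.
    return [
        p for p in paths
        if not parser.can_fetch(agent, "http://example.com" + p)
    ]


SAMPLE_PATHS = ["/", "/admin/login", "/dev/new-design.html", "/products.html"]

if __name__ == "__main__":
    print(blocked_paths(PARTIAL_RULES, SAMPLE_PATHS))
```

Keep in mind that robots.txt is a request that well-behaved crawlers honor, not access control, and the file itself is publicly readable. Anything that really must stay private should also sit behind a login.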

Last edited by Kratos : 12-26-09 at 14:27.

#9
12-29-09, 06:30
BOD Member
Join Date: Dec 2009
Location: Australia
Posts: 86

That's right.

Google will crawl everything it can.
You may be running a site you don't want advertised; perhaps it's an admin server or the like.
Even something a little on the left of the law :p

It's there so YOU have control over what happens with your website.

#10
12-29-09, 10:20
BOD Member
Join Date: Oct 2009
Posts: 45

Robots.txt contains instructions for the robots that crawl your site. This is very important for a good site.

#11
03-09-10, 01:29
BOD Member
Join Date: Jan 2009
Posts: 172

The code gives all robots access to all of your site. To disallow access instead, I think a "/" must be placed after the Disallow: label.

#12
09-21-10, 02:04
BOD Member
Join Date: Nov 2009
Posts: 100

Quote:
Originally Posted by Kratos
This is done to stop search engines accessing private or irrelevant sections of your website or at the early development of a website if the client wants to keep the site secret until official launch date.

I agree. There are some things that you just don't want to share with the general public.

All times are GMT -6.
Copyright © 1999-2012, BODHost Ltd. All rights reserved.