
Root Directory Files - Robots - Favicon - 404 Pages - htaccess


By Disabled World - Jan 2, 2009 12:49:12 PM

Several small files can make your website look more professional and more friendly to search engines at the same time. The files covered here, all placed in the main directory that holds your index file, are robots.txt, a .htaccess file, a favicon, and a 404 error page.

Once we have finished the tutorial on building your first website for profit or fun we will assemble all the steps we have previously covered into a logical order.

Now that we have covered how to buy a domain name, and what domain name extensions are available for various countries, we thought it would be a good time to prepare some basic root directory files for your new website.

Four Files you Should Add to the Root Directory (public_html or www on some servers)

The four files below all go in your main directory, the one that contains your index file. Each is listed here with an explanation of what it does.

 

1/ Robots.txt

When this text file is placed in your main directory, it lets you tell search engines which pages they may or may not crawl. You can also use it to exclude a particular search engine from crawling your site. The file is easy to make yourself, and this is how to do it.

Open Notepad, create a new file, and save it in your main directory as robots.txt. Some malicious crawlers look for weaknesses in your website or harvest e-mail addresses so that they can send you spam. You can ask such crawlers to stay away with robots.txt, although the file is only a request: well-behaved crawlers honor it, while a malicious one may simply ignore it. The first thing most search engines do before going through your directories is look for a robots.txt file. If they do not find one, they start at your index page and work through the rest of your website. Here are a few examples of what you can do.

User-agent: *
Disallow: /

Writing the preceding lines in your robots.txt file disallows ALL search engines from crawling your site. Your entire site is closed to them. The star (*) after User-agent means that the rules apply to every search engine. You can also do the complete opposite.

User-agent: *
Disallow:

Leaving Disallow empty gives all search engines access, with permission to crawl your entire website. You can also name the search engine that a set of rules applies to, as the following example demonstrates.

User-agent: Googlebot
Disallow:

In this case Google's crawler, which identifies itself as Googlebot, will be able to crawl your entire website.

User-agent: BadBot
Disallow: /

In this case you disallow the crawler that identifies itself as BadBot (a placeholder name for an unwanted search engine) from crawling your website. You can also choose to keep only a few directories of your website private and out of reach of the search engines.

User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
Disallow: /jokes/

In this case the search engines are asked not to crawl your private, cgi-bin, and jokes directories. Each path needs its own Disallow line.
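Before uploading a robots.txt file, it can help to check that the rules mean what you expect. The following is a minimal sketch using Python's standard urllib.robotparser module. The crawler name BadBot, the domain www.mysite.com, and the page names are placeholders chosen for illustration, and each blocked path gets its own Disallow line.

```python
from urllib.robotparser import RobotFileParser

# Example rules: block one unwanted crawler entirely, and keep three
# directories off limits for everyone else. "BadBot" is a placeholder name.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
Disallow: /jokes/
"""


def build_parser(text):
    """Parse robots.txt text and return a ready-to-query parser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def may_crawl(parser, user_agent, url):
    """Return True if the rules allow this crawler to fetch the URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    site = "http://www.mysite.com"  # placeholder domain
    for agent, path in [
        ("Googlebot", "/index.html"),
        ("Googlebot", "/private/notes.html"),
        ("BadBot", "/index.html"),
    ]:
        print(agent, path, may_crawl(rules, agent, site + path))
```

Run as a script, this prints whether each crawler may fetch each page. With the rules above, Googlebot may fetch /index.html but not anything under /private/, and BadBot may fetch nothing. This only checks how a well-behaved crawler would read the file; it does not stop a crawler that ignores robots.txt.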

 

2/ 404 Page

Many ready-made 404 error pages are available on the web, and a quick search on Google will turn some up. A 404 page is what visitors see when they follow a dead link to a page that does not exist on your website: your own page replaces the default error page with a custom message.

For example, you may choose to point visitors back to your home page, or explain that the page has expired or is simply missing. In most cases you can be clearer about the situation than the default error page.

All you have to do is create a simple HTML page like any other page on your website, then save it as 404.html inside your main directory. Some hosting setups pick up an error page in the main directory automatically (often under the name 404.shtml); on other Apache servers, the line ErrorDocument 404 /404.html in your .htaccess file tells the server to use it. Whenever a user reaches a missing page on your website, they will see your 404.html instead, notifying them that the page is missing. Many people have made unique 404 pages; some are funny while others are professional. Keep in mind that it should be a small page explaining that there is nothing at that location. Do not make it too big, and keep it to the point, so that the search engines don't see it as a doorway to your website.

 

3/ .htaccess file

When used properly, a .htaccess file in your main or root directory can make a real difference to the security and behavior of your website. Here we look at one common use that not all webmasters know about.

Many internet users do not type a complete URL. For example, someone going to www.google.com may type google.com in the address bar, without the www. Search engines can treat www.google.com and google.com as different websites. The same thing happens when other websites link to yours both with and without the www (www is really a subdomain of your domain). This can penalize you, because it looks like two separate websites with exactly the same content.

There is, however, a way to add the www automatically in front of your domain name when a user leaves it out, so that nobody can land on your website without it. The search engines then see a single website, and the user always arrives at www.mysite.com even if the leading www was left out.

Here is the code to add to your .htaccess file to redirect mysite.com to www.mysite.com.

RewriteEngine on
RewriteCond %{HTTP_HOST} ^mysite\.com$ [NC]
RewriteRule ^(.*)$ http://www.mysite.com/$1 [L,R=301]

Simply create a new file in Notepad, paste in the code, and replace mysite.com with your own domain. Save the file as htaccess.txt and upload it to the root directory (public_html) of your webspace. Once it is uploaded, rename the file on your server to .htaccess. The rules rely on Apache's mod_rewrite module, so your host must have it enabled.
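To see which requests the redirect affects, the logic of the rule can be mimicked in a few lines of Python. This is only an illustrative sketch, not how Apache works internally: the function name redirect_target is made up for the example, mysite.com is the placeholder domain from the code above, and the host pattern is assumed to be the anchored, escaped form ^mysite\.com$ so that it matches the bare domain and nothing else. Query strings are ignored here.

```python
import re

# Mirrors: RewriteCond %{HTTP_HOST} ^mysite\.com$ [NC]
# re.IGNORECASE plays the role of the [NC] (no case) flag.
BARE_HOST = re.compile(r"^mysite\.com$", re.IGNORECASE)


def redirect_target(host, path):
    """Return the URL a request would be 301-redirected to, or None.

    host is the requested host name, path is the requested path
    starting with "/". Only the bare domain is redirected.
    """
    if BARE_HOST.match(host):
        # In a .htaccess file the rule sees the path without its leading
        # slash, so $1 is "about.html" for a request to /about.html.
        return "http://www.mysite.com/" + path.lstrip("/")
    return None


if __name__ == "__main__":
    for host in ("mysite.com", "MySite.com", "www.mysite.com"):
        print(host, "->", redirect_target(host, "/about.html"))
```

Requests for the bare domain, in any letter case, are sent to the same page on www.mysite.com, while requests that already use www are left alone, which is what prevents a redirect loop.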

 

4/ Favicon

Something that can add a nice touch to a website is the well-known favicon.ico. It replaces the browser's default icon next to your website's name with an image of your choice, so that your logo appears next to your URL and when your website is bookmarked. It is easy to do: upload the desired image to an online favicon generator and put the resulting favicon.ico in your main directory.

Our next article in our building a website tutorial covers Search Engine Optimization Tips for High Rankings

