Robots, Spiders and Crawlers

A robot, spider, or crawler is a piece of software programmed to “crawl” from one web page to another by following the links on those pages. As the crawler makes its way around the Internet, it collects content (such as text and links) from web sites and saves it in a database, which is then indexed and ranked according to the search engine’s algorithm.

When a crawler is first released on the Web, it’s usually seeded with a few web sites, and it begins on one of those sites. The first thing it does on that first site is take note of the links on the page. Then it “reads” the text and begins to follow the links that it collected. The set of links still waiting to be visited is called the crawl frontier; it’s the territory that the crawler is exploring in a very systematic way.
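
The seeding and link-following steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the LINKS dictionary is a made-up stand-in for the Web, the page names are hypothetical, and a real crawler would fetch pages over the network and use a far more elaborate visiting policy than this simple first-in, first-out queue.

```python
from collections import deque

# Hypothetical link graph standing in for the Web: page -> links found on it.
LINKS = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["c.example/"],
    "c.example/": [],
}

def crawl(seeds, links):
    """Visit pages breadth-first and return them in the order crawled."""
    frontier = deque(seeds)  # links discovered but not yet visited
    seen = set(seeds)        # everything ever queued, so no page is visited twice
    visited = []
    while frontier:
        page = frontier.popleft()
        visited.append(page)
        # "Take note of the links on the page" and queue the new ones.
        for link in links.get(page, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited

print(crawl(["a.example/"], LINKS))
# ['a.example/', 'a.example/about', 'b.example/', 'c.example/']
```

The seen set is what keeps the crawl systematic: the link from the about page back to the home page is noticed but not queued a second time.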

The crawler sends a request to the web server where the web site resides, asking for pages to be delivered to it in the same way that your web browser requests the pages you view. The difference between what your browser shows and what the crawler sees is that the crawler works with the pages entirely as text. No graphics or other types of media files are displayed. It’s all text, and it’s encoded in HTML, so to you it might look like gibberish.
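
To make the text-only view concrete, here is a rough sketch that uses Python’s standard html.parser module to pull the links and the readable text out of a page. The PageReader class, the read_page helper, and the SAMPLE page are all made up for illustration; real crawlers handle far messier HTML than this.

```python
from html.parser import HTMLParser

class PageReader(HTMLParser):
    """Collects link targets and visible text, ignoring images and other media."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags contribute links; an <img> tag is simply skipped.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Keep non-empty runs of text found between tags.
        if data.strip():
            self.text.append(data.strip())

def read_page(html):
    """Return (links, text) for one HTML document given as a string."""
    reader = PageReader()
    reader.feed(html)
    reader.close()
    return reader.links, " ".join(reader.text)

# Hypothetical page: the logo image carries nothing the reader collects.
SAMPLE = (
    '<html><body><h1>Welcome</h1>'
    '<p>Read our <a href="/guide">guide</a> first</p>'
    '<img src="logo.png"></body></html>'
)

links, text = read_page(SAMPLE)
print(links)
print(text)
```

The image tag is present in the markup, but nothing is rendered from it; the reader is left with one link and a short string of words.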

Here’s a quick list of some of the crawler names that you’re likely to see in your web server log:

  • Google: Googlebot
  • MSN: MSNbot
  • Yahoo! Web Search: Yahoo SLURP or just SLURP
  • Ask: Teoma
  • AltaVista: Scooter
  • LookSmart: MantraAgent
  • WebCrawler: WebCrawler
  • SearchHippo: Fluffy the Spider
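
As a rough sketch of how you might spot these names yourself, the Python snippet below searches a log line for each crawler name from the list. The identify_crawler function and the sample log lines are hypothetical and heavily simplified; real log formats vary by server, and a plain substring match like this can produce false positives.

```python
# Crawler names from the list above, lowercased, mapped to their search engine.
CRAWLER_NAMES = {
    "googlebot": "Google",
    "msnbot": "MSN",
    "slurp": "Yahoo! Web Search",
    "teoma": "Ask",
    "scooter": "AltaVista",
    "mantraagent": "LookSmart",
    "webcrawler": "WebCrawler",
    "fluffy the spider": "SearchHippo",
}

def identify_crawler(log_line):
    """Return the search engine whose crawler name appears in the line, or None."""
    lowered = log_line.lower()
    for name, engine in CRAWLER_NAMES.items():
        if name in lowered:
            return engine
    return None

# Made-up, simplified log lines for illustration only.
sample_lines = [
    '192.0.2.10 "GET /index.html" 200 "Googlebot"',
    '192.0.2.11 "GET /index.html" 200 "Yahoo SLURP"',
    '192.0.2.12 "GET /index.html" 200 "Some Desktop Browser"',
]

for line in sample_lines:
    print(identify_crawler(line))
```

Matching is case-insensitive because the same crawler name can show up with different capitalization from one log to the next.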

The “nofollow” Tag and SEO

rel="nofollow" is not a tag of its own. It is an attribute that you add to a link’s <a> tag to tell a search engine crawler not to follow that particular link on your web site.

For example, if you want to point readers to an example of a bad site (like a hacker’s site or an SEO spam site), you may want to show that link on your web site. However, that link could reduce your search engine ranking because it’s a known bad site, and when you link to it the crawler may conclude that you’re endorsing the site.

A link with nofollow in it looks like this:
<a href="" rel="nofollow">Bad Site</a>
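
To show how a crawler might act on this, here is a minimal Python sketch that sorts the links on a page into those it would follow and those marked nofollow. The LinkSorter class, the sort_links helper, and the example.com-style addresses are assumptions made for illustration; how any real search engine treats nofollow internally is not something this sketch claims to reproduce.

```python
from html.parser import HTMLParser

class LinkSorter(HTMLParser):
    """Separates links a crawler would follow from those marked nofollow."""

    def __init__(self):
        super().__init__()
        self.follow = []
        self.skip = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel may hold several space-separated values, e.g. "nofollow noopener".
        rel_values = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_values:
            self.skip.append(href)
        else:
            self.follow.append(href)

def sort_links(html):
    """Return (links_to_follow, links_to_skip) for one HTML string."""
    sorter = LinkSorter()
    sorter.feed(html)
    sorter.close()
    return sorter.follow, sorter.skip

# Hypothetical page with one ordinary link and one nofollow link.
PAGE = (
    '<a href="http://good.example/">Good Site</a> '
    '<a href="http://bad.example/" rel="nofollow">Bad Site</a>'
)

follow, skip = sort_links(PAGE)
print(follow)  # ['http://good.example/']
print(skip)    # ['http://bad.example/']
```

Both links still appear on the page for human visitors; only the crawler’s treatment of them differs.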

The nofollow attribute isn’t essential to your SEO efforts. However, it could help prevent your site’s ranking from being reduced, and maybe even increase your ranking a little. Anything that keeps your ranking from falling is a good measure to take.