Are Search Engine Spammers Exploiting Your Web Pages?
Spammers Use Sitemaps File To Scrape Sites
Originally Published: May 15, 2007
A recent thread in a webmaster forum indicated that some search engine spammers might exploit the new XML sitemaps files. Has your sitemaps file been abused by spammers? Can using a sitemaps file harm your search engine rankings?
What Is A Sitemaps XML File?
The big search engines (Google, Yahoo, MSN and Ask) introduced the Sitemaps protocol earlier this year. In its simplest form, a sitemap is an XML file that lists the URLs of a site along with additional metadata about each URL: when it was last updated, how often it usually changes, how important it is relative to other URLs in the site, etc. See our previous article [Sitemaps Protocol Supported By Google, Yahoo, MSN and Ask].
That information helps search engines crawl your site more intelligently. The Sitemaps protocol is a standard that makes it easier to create a sitemap that can be parsed by all search engines.
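Here is a minimal example of such a file, following the format published at sitemaps.org (the URL and the metadata values are placeholders):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <!-- One <url> entry per page; only <loc> is required. -->
      <url>
        <loc>http://www.example.com/</loc>
        <lastmod>2007-05-01</lastmod>      <!-- when the page was last updated -->
        <changefreq>weekly</changefreq>    <!-- how often it usually changes -->
        <priority>0.8</priority>           <!-- importance relative to other URLs -->
      </url>
    </urlset>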
How Can A Sitemaps File Harm Your Rankings?
Some webmasters reported problems with duplicate content after adding a sitemaps XML file to their websites.
Their content appeared on dubious websites that had nothing to do with the original sites. Because the same text was duplicated on many other sites, the original sites might have received ranking penalties for duplicate content.
How Did This Duplication Happen?
Some search engine spammers used the sitemaps XML files to easily find content for their scraper sites.
A scraper site is a website that pulls all of its information from other websites using automated tools. The scraper software pulls content from other websites to create new web pages that are built around specific keywords. The scraped pages usually show AdSense ads with which the spammers hope to make money.
The new sitemaps XML files make it very easy for scraper tools to find content-rich pages. Although the original intention of the sitemaps files was to inform search engines about every single page of your website, they can also be used to inform spam bots about your pages.
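To illustrate how little effort this takes, here is a minimal sketch in Python of how an automated tool can harvest every URL a public sitemaps file exposes (example.com is a placeholder; a real scraper would then download each listed page):

    import urllib.request
    import xml.etree.ElementTree as ET

    # Namespace used by all files that follow the Sitemaps protocol.
    SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

    def fetch_sitemap_urls(sitemap_url):
        """Download a sitemap and return every page URL it lists."""
        with urllib.request.urlopen(sitemap_url) as response:
            tree = ET.parse(response)
        # Every <loc> element holds the address of one page.
        return [loc.text for loc in tree.iter(SITEMAP_NS + "loc")]

    if __name__ == "__main__":
        for url in fetch_sitemap_urls("http://www.example.com/sitemap.xml"):
            print(url)  # a scraper bot would fetch this page's content here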
What Can You Do To Avoid Problems With Your Sitemaps File?
One possible solution is not to use a sitemaps file at all. In that case, scraper bots can still crawl your site by following the normal links on your web pages, but that is more work for them than simply reading your sitemaps file.
Another solution is to set up a sitemaps file and delete it as soon as search engines have indexed that file.
Do not use free sitemap generator tools. You don't know what they will do with your data, and they might even use it to create scraper sites with your content.
(Editor's Note: Our free spider map creator is a quick and easy way to link to every page on your site so the search engine spiders can find them. Users link to it in different ways, so it is not as easy for the spammers to find. We do not retain any data from your site.)
Unfortunately, there's not much that you can do to stop spammers from abusing your content. Use a tool such as CopyScape to find sites that have duplicated your content.
Copyright by Axandra GmbH, publishers of SEOProfiler, a complete SEO software solution. Try SEOProfiler for free.
All product names, copyrights and trademarks mentioned in this newsletter are owned by their respective trademark and copyright holders.