How Content Scraping Occurs
Content scraping is usually performed by scripts that extract content from original web sources and republish it on another site. A typical scraper sets up an attractive WordPress site, then installs plugins that pull content from specified blogs and publish it on his or her own site. People scrape content from other websites for ulterior motives. Some want to game the system and make money by passing off another site’s original content as their own, diverting that site’s traffic to theirs. Once the extra traffic has made the scraper’s site popular, the scraper earns money by placing advertisements on it.
How to Identify Your WordPress Site’s Scrapers
Identifying a site that scrapes your content is a tedious, time-consuming task. One way is to search Google for your post titles, which is especially slow when a post covers a very popular topic that many other pages also discuss. Another way is to use trackbacks by adding internal links to your blog posts. If a site steals your content, the copied links generate trackbacks to your site. If you use Akismet in WordPress, many of these trackbacks show up in the spam folder. Remember that this only works if your posts contain internal links. Google Webmaster Tools also reports links from scraper sites to your site. Look under “Traffic”, where “Links to Your Site” leads to a page that displays links from scraper sites. Finally, if you have set up FeedBurner for your WordPress blog, you can identify scraper sites there: open the Analyze tab and, under Feed Stats, check “Uncommon Uses”, which contains a list of scraper sites.
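The title search can be partly automated. The following is a minimal Python sketch, not a definitive tool: it builds exact-match Google search URLs for a list of post titles and excludes your own domain with the `-site:` operator. The function name `scraper_search_url`, the domain `example.com`, and the sample titles are hypothetical, and you still have to open each URL and review the results by hand.

```python
from urllib.parse import urlencode


def scraper_search_url(title: str, own_domain: str) -> str:
    """Build a Google search URL for an exact post title, excluding own_domain."""
    # Quoting the title requests an exact phrase match;
    # the -site: operator hides results from your own domain.
    query = f'"{title}" -site:{own_domain}'
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Hypothetical post titles; replace them with titles from your own blog.
    titles = ["How Content Scraping Occurs", "Dealing with Content Scrapers"]
    for post_title in titles:
        print(scraper_search_url(post_title, "example.com"))
```

Pasting each printed URL into a browser shows pages other than your own that carry the exact title, which narrows down the manual checking.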
Approaches for Dealing with Content Scrapers
The easiest approach is to take no action, since fighting content scraping takes a lot of time. For sites that Google treats as authorities, this does no harm, but other sites risk being flagged as scraper sites when Google mistakes their original content for scraped content. This usually happens during a Panda update. Alternatively, you can contact the scraper and ask them to remove your content from their site. If they refuse, file a Digital Millennium Copyright Act (DMCA) takedown notice with their hosting provider. You can also block their IP address. The last approach is to take advantage of scrapers: add internal links to your posts so that scraped copies link back to your site and increase your audience, or automatically link keywords to affiliate links.
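Blocking an IP is normally done in the web server or firewall configuration, and the exact steps depend on your host. To illustrate only the check itself, here is a minimal Python sketch, assuming you have already found the scraper’s addresses in your server logs. The names `BLOCKED_NETWORKS` and `is_blocked` are hypothetical, and the addresses come from ranges reserved for documentation.

```python
import ipaddress

# Hypothetical blocklist of single addresses (/32) and CIDR ranges.
# These values are from reserved documentation ranges, not real scrapers.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.7/32"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)
```

A request handler could call `is_blocked` on the visitor’s address and refuse to serve the page when it returns `True`. Keep in mind that scrapers can change addresses, so a blocklist needs occasional review.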