Content Scraping: What It Is And What To Do About It

As content marketing grows into a force to be reckoned with, so does content scraping.

Content scraping is when a third party copies the content from your website or blog and reposts it without consent – and often without attribution. Surprisingly, there are benefits to content scraping, and opinion differs on whether it’s a good or bad thing.

On the pro side is Joan Muschamp, founder and CEO of LemonZest Marketing, who says, “The nature of the blog is that it is internet content, and designed for sharing. Blogs are a form of social media and by definition are meant to be shared to benefit social reach.”

On the other side is Bethany Gonzalez Moreno, founder of B.EcoChic. “Stealing content does not help build a business,” she says. “And it does not provide any real value to your reader or customer.”

So what are the pros and cons of getting your own content scraped?

Pro: scraping can broaden your reach: When a scraper posts your content, you or your company may gain visibility with a niche audience you would not have reached otherwise. Of course, this only applies if the scraper gives you attribution and your internal links are intact.

Pro: it can help your SEO: Scraped content can improve your website or blog’s search ranking. If your posts link internally to other posts on your blog and the scraper copies those links intact, each scraped copy gives you free backlinks, which can drive traffic to your site.

Con: it can also hurt your SEO: If the scraped article ranks higher than your original version, that’s bad for your Google ranking.

Con: decreased brand awareness: Scraping can make it hard for Google to tell which article is the original. For smaller companies this could be crippling.

“As a company that relies on organic search traffic, we are constantly battling content thieves. Scraped content can result in fewer visitors coming to our site and thus fewer purchases are made,” says Leslie Handmaker, senior marketing manager at Next Day Flyers.

Con: lack of attribution: Scrapers often remove the author or company’s name from the content, negating the hard work you put into creating it and, in some cases, opening you up to damaging accusations of plagiarizing your own content.

Don’t want your content scraped? Here are six ways to catch it as it happens and protect yourself from it:

  1. Google Alerts: HARO’s Peter Shankman suggests taking one line from every blog post and creating a Google alert for it. When the whole post is stolen, the alert will be triggered.
  2. Canonical Links: Add a rel="canonical" link element to each post, pointing to the post’s original URL. If a scraper copies the page markup along with the text, the element tells Google which version is the original, so your site can get credit for the post and the scraper site could potentially get penalized.
  3. Copyscape: This service provides a free plagiarism checker to find copies of your content online.
  4. CAPTCHA: Most scraping is carried out by bots. A CAPTCHA requires a human to type in a few jumbled letters and numbers. This helps to ensure that your content does not get scraped by a computer and will also reduce the amount of spam on your site.

  5. Pinging: Pinging notifies search engines that your content has been published before your post or article goes out on an RSS feed, where scrapers often pick it up. Various services offer this capability, such as Ping-O-Matic.
  6. Internal Links: Adding internal links to your posts can generate trackbacks to your site when a scraper republishes your content with the links intact, alerting you to the copy.
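
Tips 1 and 2 lend themselves to a little automation. The Python sketch below is one way to approach it, not a prescribed method: the helper names canonical_link and alert_phrase are made up for illustration, and choosing the longest sentence as the alert phrase is an assumption (any distinctive line from the post would do).

```python
import html
import re


def canonical_link(url: str) -> str:
    """Build a <link rel="canonical"> element for the <head> of a post.

    The URL is HTML-escaped so characters such as & stay valid in the attribute.
    """
    return f'<link rel="canonical" href="{html.escape(url, quote=True)}">'


def alert_phrase(post_text: str, min_words: int = 8) -> str:
    """Pick one distinctive sentence from a post to use as an exact-match alert.

    Assumption: the longest sentence with at least `min_words` words is
    distinctive enough. Falls back to the longest sentence overall.
    """
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", post_text.strip())
    candidates = [s for s in sentences if len(s.split()) >= min_words]
    if not candidates:
        candidates = sentences
    longest = max(candidates, key=len)
    # Drop embedded double quotes, then wrap the sentence in quotes so the
    # alert matches the exact phrase rather than the individual words.
    return '"' + longest.replace('"', "") + '"'
```

The string returned by canonical_link goes inside the head section of the post's HTML, and the string returned by alert_phrase can be pasted into the query box when creating a Google alert.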

If you realize your content has already been stolen, first contact the website owner/scraper and ask them to take your content down. If the scraper ignores you or refuses to take down your stolen content, you can file a DMCA (Digital Millennium Copyright Act) takedown notice with their host. Be warned: this is time-consuming. It’s easier to take preventative measures like points 2 and 4 above.
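
Before contacting a site owner or filing a notice, it can help to measure how much of your post actually appears on the suspect page. Here is a rough Python sketch, assuming you have already saved both texts as plain strings. The function names and the five-word window are illustrative assumptions, not a standard, and what counts as "too much overlap" is your call.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences ("shingles") in a text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original: str, suspect: str, size: int = 5) -> float:
    """Fraction of the original's shingles that also appear in the suspect text.

    1.0 means every five-word run of the original shows up on the suspect
    page; 0.0 means none do (or the original is shorter than one shingle).
    """
    source = shingles(original, size)
    if not source:
        return 0.0
    return len(source & shingles(suspect, size)) / len(source)
```

A value close to 1.0 suggests the post was lifted wholesale, while a low value may just mean a short quotation, which is worth knowing before you send a takedown request.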


Image: Caseorganic (Creative Commons)


