How to deal with crawl errors in Google Search Console (Google Webmaster Tools)

Last updated on Oct 20, 2016

Has this happened to you? You check the “Crawl Errors” report in Google Search Console (formerly known as Webmaster Tools) and you see so many crawl errors that you don’t know where to start. Loads of 404s, 500s, “Soft 404s”, 400s, and many more… Here’s how I deal with large numbers of crawl errors.

If you don’t find a solution to your problem in this article, feel free to leave me a comment at the bottom of this page. I normally reply within a couple of days.

Contents

Here’s an overview of what you will find in this article:

Don’t panic!
First, mark all crawl errors as fixed
Check your crawl errors report once a week
The classic 404 crawl error
404 errors caused by faulty links from other websites
404 errors caused by faulty internal links or sitemap entries
404 errors caused by Google crawling JavaScript and messing it up 😉
Mystery 404 errors
What are “Soft 404” errors?
What to do with 500 server errors?
Other crawl errors: 400, 503, etc.
List of all crawl errors I have encountered in “real life”
Crawl error peak after a relaunch
Summary

So let’s get started. First of all:

Don’t panic!

Crawl errors are something you normally can’t avoid, and they don’t necessarily have an immediate negative effect on your SEO performance. Nevertheless, they are a problem you should tackle. Keeping the number of crawl errors in Search Console low is a positive signal for Google, as it reflects good overall website health. And if the Google bot encounters fewer crawl errors on your site, users are also less likely to run into broken pages and server errors.

First, mark all crawl errors as fixed

This may seem like a stupid piece of advice at first, but it will actually help you tackle your crawl errors in a more structured way. When you first look at your crawl errors report, you might see hundreds or thousands of crawl errors dating back a long time. It will be very hard for you to find your way through these long lists of errors.

(Screenshot: lots of crawl errors in Google Search Console)

Does this screenshot make you feel better? I bet you’re better off than these guys 😉

My approach is to mark everything as fixed and then start from scratch: irrelevant crawl errors will not show up again, and the ones that really need fixing will soon be back in your report. So, after you have cleaned up your report, here is how to proceed:

Check your crawl errors report once a week

Pick a fixed day every week and go to your crawl errors report. Now you will find a manageable amount of crawl errors. As they weren’t there the week before, you will know that they have recently been encountered by the Google bot. Here’s how to deal with what you find in your crawl errors report once a week:

The classic 404 crawl error

This is probably the most common crawl error across websites and also the easiest to fix. For every 404 error the Google bot encounters, Google lets you know where it is linked from: Another website, another URL on your website, or your sitemaps. Just click on a crawl error in the report and a lightbox like this will open:

(Screenshot: see where crawl errors are linked from)

Did you know that you can download a report with all of your crawl errors and where they are linked from? That way you don’t have to check every single crawl error manually. Check out this link to the Google API explorer. Most of the fields are already prefilled, so all you have to do is add your website URL (the exact URL of the Search Console property you are dealing with) and hit “Authorize and execute”. Let me know if you have any questions about this!
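
If you are comfortable with a little scripting, you can also pull this data directly from the Search Console API (webmasters v3) instead of using the API explorer. Here is a rough Python sketch, under the assumption that you have already set up OAuth credentials for the API; the site URL, category and file names are placeholders you would need to adapt:

```python
# Rough sketch: export crawl error samples together with their "linked from"
# URLs via the Search Console API (webmasters v3). Assumes OAuth credentials
# were created beforehand and stored in credentials.dat; SITE_URL, CATEGORY
# and the output file name are placeholders.
import csv
import httplib2
from googleapiclient.discovery import build
from oauth2client.file import Storage

SITE_URL = 'http://www.example.com/'   # exact URL of your Search Console property
CATEGORY = 'notFound'                  # e.g. 'notFound', 'serverError', 'soft404'
PLATFORM = 'web'

credentials = Storage('credentials.dat').get()
service = build('webmasters', 'v3', http=credentials.authorize(httplib2.Http()))

# The list call returns a sample of error URLs for the chosen category.
samples = service.urlcrawlerrorssamples().list(
    siteUrl=SITE_URL, category=CATEGORY, platform=PLATFORM
).execute().get('urlCrawlErrorSample', [])

with open('crawl-errors.csv', 'w') as f:
    writer = csv.writer(f)
    writer.writerow(['pageUrl', 'lastCrawled', 'linkedFrom'])
    for sample in samples:
        # The get call for a single sample includes urlDetails with linkedFromUrls.
        details = service.urlcrawlerrorssamples().get(
            siteUrl=SITE_URL, url=sample['pageUrl'], category=CATEGORY, platform=PLATFORM
        ).execute()
        linked_from = details.get('urlDetails', {}).get('linkedFromUrls', [])
        writer.writerow([sample['pageUrl'], sample.get('last_crawled', ''), '; '.join(linked_from)])
```

This is only a sketch, not a polished tool, but it gives you a CSV you can filter and forward to whoever fixes the links.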

Now let’s see what you can do about different types of 404 errors.

If the false URL is linked to from another website, you should simply implement a 301 redirect from the false URL to a correct target. You might be able to reach out to the webmaster of the linking page to ask for an adjustment, but in most cases it will not be worth the effort.

If the false URL that caused the 404 error for the Google bot is linked from one of your own pages or from a sitemap, you should fix the link or the sitemap entry. In this case it is also a good idea to 301 redirect the 404 URL to the correct destination to make it disappear from the Google index and pass on the link power it might have.
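
If your site runs on Apache, a single 301 redirect rule in your .htaccess file is usually all it takes. Here is a minimal sketch with hypothetical URLs (other servers and most CMS redirect plugins offer equivalent options):

```apache
# Minimal sketch, assuming Apache with mod_alias and hypothetical URLs:
# permanently redirect a removed or misspelled URL to its correct target.
Redirect 301 /old-page.html http://www.example.com/new-page.html
```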

404 errors caused by Google crawling JavaScript and messing it up 😉

Sometimes you will run into weird 404 errors that, according to Google Search Console, several or all of your pages link to. When you search for the links in the source code, you will find they are actually relative URLs that are included in scripts like this one (just a random example I’ve seen in one of my Google Search Console properties):

(Screenshot: Google crawls the URLs in this script)

According to Google, this is not a problem at all and this type of 404 error can just be ignored. Read paragraph 3) of this post by Google’s John Mueller for more information (and also the rest of it, as it is very helpful).


I am currently trying to find a solution that is more satisfying than just ignoring this type of error. I will update this post if I come up with anything.

Mystery 404 errors

In some cases, the source of the link remains a mystery. I get the impression that the data Google provides in the crawl error reports is not always 100% reliable. For example, I have often seen URLs listed as sources of links to 404 pages even though those source URLs no longer existed themselves. In such cases, you can still set up a 301 redirect for the false URL.

Remember to always mark all 404 crawl errors that you have taken care of as fixed in your crawl error report. If there are 404 crawl errors that you don’t know what to do about, you can still mark them as fixed and collect them in a “mystery list”. Should they keep showing up again, you know you will have to dig deeper into the problem. If they don’t show up again, all the better.
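
By the way, if you are already using the Search Console API as described above, you can also mark the sampled errors of a category as fixed programmatically. This is a rough sketch with the same assumptions as the export script (existing OAuth credentials, placeholder site URL and category):

```python
# Rough sketch: mark all sampled crawl errors of one category as fixed via the
# Search Console API (webmasters v3). Same assumptions as the export sketch:
# OAuth credentials stored in credentials.dat, placeholder SITE_URL/CATEGORY.
import httplib2
from googleapiclient.discovery import build
from oauth2client.file import Storage

SITE_URL = 'http://www.example.com/'
CATEGORY = 'notFound'
PLATFORM = 'web'

credentials = Storage('credentials.dat').get()
service = build('webmasters', 'v3', http=credentials.authorize(httplib2.Http()))

samples = service.urlcrawlerrorssamples().list(
    siteUrl=SITE_URL, category=CATEGORY, platform=PLATFORM
).execute().get('urlCrawlErrorSample', [])

for sample in samples:
    service.urlcrawlerrorssamples().markAsFixed(
        siteUrl=SITE_URL, url=sample['pageUrl'], category=CATEGORY, platform=PLATFORM
    ).execute()
    print('Marked as fixed: ' + sample['pageUrl'])
```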

Let’s have a look at the strange species of “Soft 404 errors” now.


What are “Soft 404” errors?

This is something Google invented, isn’t it? At least I’ve never heard of “Soft 404” errors anywhere else. A “Soft 404” error is a page that the Google bot considered empty or broken, but that returned a 200 status code instead of a 404.

So it’s basically a page that Google THINKS should be a 404 page, but that isn’t. In 2014, webmasters started getting “Soft 404” errors for some of their actual content pages. This is Google’s way of letting us know that we have “thin content” on our pages.

Dealing with “Soft 404” errors is just as straightforward as dealing with normal 404 errors:

  • If the URL of the “Soft 404” error is not supposed to exist, 301 redirect it to an existing page. Also make sure you fix the underlying problem of non-existent URLs not returning a proper 404 status code (the sketch below shows a quick way to check this).
  • If the URL of the “Soft 404” page is one of your actual content pages, this means that Google sees it as “thin content”. In this case, make sure you add valuable content to that page.
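
To check whether the URLs from your “Soft 404” report (or any other list of URLs) really return the status codes you expect, a quick script can save you a lot of manual clicking. This is a minimal sketch assuming Python with the requests library and hypothetical URLs:

```python
# Minimal sketch: print the HTTP status code each URL returns, so you can spot
# pages that answer with 200 although they should return 404 or 410.
# The URLs below are hypothetical placeholders.
import requests

urls = [
    'http://www.example.com/deleted-page/',
    'http://www.example.com/old-product.html',
]

for url in urls:
    response = requests.get(url, allow_redirects=False, timeout=10)
    print(response.status_code, url)
```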

After working through your “Soft 404” errors, remember to mark them all as fixed. Next, let’s have a look at the fierce species of 500 server errors.

What to do with 500 server errors?

500 server errors are probably the only type of crawl error you should be slightly worried about. If the Google bot regularly encounters server errors on your site, this is a very strong signal to Google that something is wrong, and it will eventually result in worse rankings.

This type of crawl error can show up for various reasons. Sometimes it might be a certain subdomain, directory or file extension that causes your server to give back a 500 status code instead of a page. Your website developer will be able to fix this if you send him or her a list of recent 500 server errors from Google’s Webmaster Tools.

Sometimes 500 server errors show up in Google’s Search Console due to a temporary problem. The server might have been down for a while due to maintenance, overload, or force majeure. This is normally something you will be able to find out by checking your log files and speaking to your developer and website host. In a case like this you should try to make sure that such a problem doesn’t occur again in future.
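
Your raw access logs are the most reliable place to see when and where the Google bot actually ran into server errors. Here is a small sketch, assuming a standard combined log format and a hypothetical log file path, that prints all 5xx responses served to Googlebot:

```python
# Small sketch: list all 5xx responses served to Googlebot. Assumes a combined
# log format (Apache/nginx default) and a hypothetical log file path.
import re

LOG_FILE = '/var/log/apache2/access.log'  # adjust to your server setup

# combined log format: IP - - [date] "METHOD path HTTP/x.x" status size "referer" "user-agent"
pattern = re.compile(r'\[(?P<date>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3})')

with open(LOG_FILE) as f:
    for line in f:
        if 'Googlebot' not in line:
            continue
        match = pattern.search(line)
        if match and match.group('status').startswith('5'):
            print(match.group('date'), match.group('status'), match.group('path'))
```

A list like this, together with the export from Search Console, is usually all your developer or host needs to narrow the problem down.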

Pay attention to the server errors that show up in your Google Webmaster Tools and try to limit their occurrence as much as possible. The Google bot should always be able to access your pages without any technical barriers.

Let’s have a look at some other crawl errors you might stumble upon in your Google Webmaster Tools.

Other crawl errors: 400, 503, etc.

We have dealt with the most important and common crawl errors in this article: 404, “Soft 404” and 500. Once in a while, you might find other types of crawl errors, like 400, 503, “Access denied”, “Faulty redirects” (for smartphones), and so on.

In many cases, Google provides some explanations and ideas on how to deal with the different types of errors.

In general, it is a good idea to deal with every type of crawl error you find and try to avoid it showing up again in future. The fewer crawl errors the Google bot encounters, the more Google trusts your site’s health. Sites that constantly cause crawl errors are likely to be seen as providing a poor user experience and risk being ranked lower than healthy websites.

You will find more information about different types of crawl errors in the next part of this article:

List of all crawl errors I have encountered in “real life”

I thought it might be interesting to include a list of all of the types of crawl errors I have actually seen in Google Search Console properties I have worked on. I don’t have much info on all of them (except for the ones discussed above), but here we go:

Server error (500)
In this report, Google lists URLs that returned a 500 error when the Google bot attempted to crawl the page. See above for more details.

Soft 404
These are URLs that returned a 200 status code, but that should be returning a 404 error, according to Google. I suggested some solutions for this above.

Access denied (403)
Here, Google lists all URLs that returned a 403 error when the Google bot attempted to crawl them. Make sure you don’t link to URLs that require authentication. You can ignore “Access denied” errors for pages that you have included in your robots.txt file because you don’t want Google to access them. It might be a good idea though to use nofollow links when you link to these pages, so that Google doesn’t attempt to crawl them again and again.

Not found (404 / 410)
“Not found” is the classic 404 error that has been discussed above. Read the comments for some interesting information about 404 and 410 errors.

Not followed (301)
The error “not followed” refers to URLs that redirect to another URL, but whose redirect the Google bot could not follow properly (for example because of a redirect loop or an overly long redirect chain). Fix these redirects!

Other (400 / 405 / 406)
Here, Google groups everything it doesn’t have a name for: I have seen 400, 405 and 406 errors in this report and Google says it couldn’t crawl the URLs “due to an undetermined issue”. I suggest you treat these errors just like you would treat normal 404 errors.

Flash content (Smartphone)
This report simply lists pages with a lot of Flash content that won’t work on most smartphones. Get rid of Flash!

Blocked (Smartphone)
This error refers to pages that could be accessed by the Google bot, but were blocked for the mobile Google bot in your robots.txt file. Make sure you let all of Google’s bots access the content you want indexed!
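
If you are not sure whether your robots.txt treats Google’s mobile crawler differently, a quick look at the user agent sections helps. This is a minimal, hypothetical example of what you want to avoid and what you want instead:

```
# Problematic: a separate section that locks the mobile crawler out of content
# the desktop crawler is allowed to see.
# User-agent: Googlebot-Mobile
# Disallow: /

# Better (hypothetical path): one set of rules that applies to all crawlers,
# so every Google bot can reach the content you want indexed.
User-agent: *
Disallow: /internal-search/
```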

Please let me know if you have any questions or additional information about the crawl errors listed above or other types of crawl errors.

Crawl error peak after a relaunch

You can expect a peak in your crawl errors after a website relaunch. Even if you have done everything in your power to prepare your relaunch from an SEO perspective, it is very likely that the Google bot will encounter a large number of 404 errors after the relaunch.

If the number of crawl errors in your Google Webmaster Tools rises after a relaunch, there is no need to panic. Just follow the steps that have been explained above and try to fix as many crawl errors as possible in the weeks following the relaunch.

Summary

  • Mark all crawl errors as fixed.
  • Go back to your report once a week.
  • Fix 404 errors by redirecting false URLs or changing your internal links and sitemap entries.
  • Try to avoid server errors and ask your developer and server host for help.
  • Deal with the other types of errors and use Google’s resources for help.
  • Expect a peak in your crawl errors after a relaunch.

If you have any additional ideas on how to deal with crawl errors in Google Webmaster Tools, I would be grateful for your comments.


98 Comments

  1. Nikkhil Shirodkar
    21. March 2017

    Hi Eoghan. Wonderful article on crawl errors. I’m getting a whole lot of “no sentences found” news errors but when I test the article in news tool I get a success. How does one fix this? Also when I do a fetch and render it only renders the bot view. The visitor view is blank.

    Reply
    • Eoghan Henn
      22. March 2017

      Hello Nikkhil,

      Thanks a lot for your comment. I’m very sorry, but I do not have a lot of experience with Google News. As far as I know, the error “no sentences found” can be triggered by an unusual formatting of articles – too few or too many sentences per paragraph.

      If Google has problems rendering your page, there might be other technical problems. You should definitely check this out. Does the problem occur with all articles or just the ones that also have a “no sentences found” error?

      I’m sorry I can’t give you a better reply. Let me know if you have any additional questions.

      Eoghan

      Reply
      • Nikkhil Shirodkar
        22. March 2017

        Hi Eoghan,

        Thank you for the response. Our site is built on a MEAN stack. We use pre-render IO for the google bot to crawl since the site is in Angular js. There are about 600 articles in the error list with no sentences found. All of them have content! eg http://www.spotboye.com/television/television-news/after-sunil-grover-is-navjot-sidhu-the-next-to-quit-kapil-sharma-s-show/58d11aa18720780166958dc3

        Reply
        • Eoghan Henn
          23. March 2017

          Hello Nikkhil,

          The Google Cache for the example you provided looks fine. I’m not sure if prerendering is still the preferred way of dealing with Angular JS websites though, as Google is now a lot better at rendering JavaScript. Also, I do not know if the Google News bot treats JS differently (although it shouldn’t). The fact that the visitor view in the fetch and render report is not working is something you should probably dig into deeper.

          Sorry again for not having any solid responses for you, but this might help you with your further research. Let me know if you have any other questions!

          Eoghan

          Reply
  2. mido
    20. March 2017

    I’ve got a problem and do not know what to do
    Google appears some of my website pages in search as (https) but i dont have https on my site
    I do not want https just simple http
    plz help me

    Reply
    • Eoghan Henn
      21. March 2017

      Hello Mido,

      Thanks a lot for your comment. My first idea was to suggest that you redirect all https URLs to their http equivalents, but that would probably still cause problems for most users if you don’t have a valid SSL certificate: a warning would be displayed before the redirect is processed. I’m not sure how the Google bot would deal with a situation like this (whether it would process the redirects or not), but a missing SSL certificate will most likely cause problems in other areas.

      I think your best bet would be to switch to https completely. This is something all webmasters should be doing anyhow. You can get a free SSL certificate from Let’s Encrypt: https://letsencrypt.org/

      Here’s a great resource for making sure you get everything right from an SEO perspective when switching to https: http://www.aleydasolis.com/en/search-engine-optimization/http-https-migration-checklist-google-docs/

      Please let me know if you have any other questions.

      Eoghan

      Reply
  3. Surojit
    26. February 2017

    Hi Eoghan
    Great article! On or around Feb. 19, 2017 our webmaster account saw a spike in 500, 502, 503 errors (‘server error’) and our programmer checked and found an issue with database and got it fixed. Accordingly we checked all the 500/502/503 errors as fixed in webmaster. However, soon thereafter, webmaster again began receiving server errors (mostly 502s, some 500s) and the number of errors keep climbing steadily everyday. We’re not sure now why we’re still getting the server error messages and I’ll be grateful if you can help out in this regard.

    PS – ever since we started getting the server error messages, our traffic got badly hit as well as overall search rank positions.

    Reply
    • Eoghan Henn
      2. March 2017

      Hello Surojit,

      Thanks a lot for your comment. If the errors keep coming back after you marked them as fixed, it looks like the issue with the database was not the only cause for the errors. There are probably more issues you need to fix.

      You can export a list of all errors through the Google Search Console API Explorer including information on where the URLs that cause the errors are linked from. This might help finding the root of the problem.

      Feel free to send me some more information so I can have a closer look.

      Best regards,

      Eoghan

      Reply
  4. Johan Watson
    17. February 2017

    Good Day,

    I need help with all my crawl errors. I will pay you in advance if you could help me to clear all my crawl errors.

    Kind Regards

    Johan Watson

    Reply
    • Eoghan Henn
      22. February 2017

      Hello Johan,

      Thanks a lot for your comment. I will help you for free if you provide me with more info.

      Best regards,

      Eoghan

      Reply
  5. Faniso
    6. February 2017

    Hi there! Thanks for this post.

    I’m not sure if this question has been asked already.

    I recently went into webmaster tools to check for crawl errors. Under the Smartphone tab, I noticed that most of them were for pages with either a m/pagename.html or mobile/pagename.html.

    We have built these pages without any sub-directories. So you will not find
    http://www.victoriafalls-guide.net/mobile/art-from-zimbabwe.html or
    http://www.victoriafalls-guide.net/m/art-from-zimbabwe.html

    Only such pages as http://www.victoriafalls-guide.net/art-from-zimbabwe.html

    What am I missing here?

    Reply
    • Eoghan Henn
      9. February 2017

      Hello Faniso,

      I have seen a similar problem in several other Google Search Console properties. Sometimes it is very difficult to understand where the Google bot initially found the URLs.

      Have you checked the “linked from” information in the detail view of each error URL? This might help you find the source of the link, but often there is no information available.

      There is also an unconfirmed but pretty credible theory that the Googlebot just checks the m/ and mobile/ directories to see if there is a mobile version of a page when it’s not mobile-friendly: https://productforums.google.com/forum/#!topic/webmasters/56CNFxZBFwE

      I recommend you mark the errors as fixed and set up 301 redirects from the non-existent URLs to the correct versions, although the redirects are probably not even necessary.

      I hope this helps!

      Reply
      • Keith
        16. March 2017

        Hi Eoghan

        I’m having a unresolved issue with the ‘link from’ source being pages that haven’t existed for up to 10 years.

        All from recent crawls, both link and ‘linked from’ are asp urls that haven’t existed for a decade. In that time, the site (same root url) underwent three hosting company moves and several complete site rebuilds (no css, scripts, etc. were carried over).

        I can see external sites keeping these old urls in their archives, etc. but how does google come up with phantom internally ‘linked from’ urls that just haven’t existed for this amount of time? Have you any thoughts on this perplexing problem Thanks!

        Reply
        • Eoghan Henn
          20. March 2017

          Hi Keith,

          I’ve encountered the exact same problem with several different websites.

          Here’s my explanation: I’m pretty sure that the “linked from” info is normally not up-to-date. The info that is displayed here is often from previous crawls and it is not updated every time the linked pages are crawled. That would explain why, even years later, pages still show up as linked from pages that no longer exist.

          Also, I have noticed that these errors often don’t come back after you have marked them as fixed for a couple of times and made sure that the pages really aren’t linked to from anywhere any longer. These errors normally won’t harm your SEO performance in any way and thus aren’t a reason to be worried.

          I hope this helps! Please let me know if you have any other questions.

          Eoghan

          Reply
          • Keith
            20. March 2017

            Thanks very much for that, Eoghan. Very reassuring that, at least, I’m not losing my mind. I will persist with the mark ’em fixed tactic.
            Cheers! Keith

            Reply
  6. Andrea
    25. January 2017

    Hi. Few days ago my website (a blog) started to receive so many “calls” from googlebots and when I asked to Google why this is happening they answered that this is normal and that I should down th crawl frecuency at my webmasters tool. The big question for me is: how down is down? Do you have any suggestion? Thanks!

    Reply
    • Eoghan Henn
      31. January 2017

      Hi Andrea,

      Are the requests from Google causing you any problems with your server? If not, I would not recommend you change anything.

      If your server is indeed having trouble with the number of requests from the Google bot, I would first consider the following options:

      – Check if your server is performant enough. Normally, there shouldn’t be a problem with normal crawling by Google and other bots.
      – Check if the requests are actually coming from Google, or from another bot that pretends to be Google (see the sketch below for one way to verify this). You can compare the numbers from your log files (or wherever else you found that you were receiving lots of hits from the Google bot) with the Crawl stats report in Google Search Console (click on Crawl > Crawl stats in the left navigation).
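
      One way to check whether a hit really came from Googlebot is the reverse/forward DNS test that Google recommends. Here is a rough Python sketch (the IP address is just a placeholder taken from a log file):

      ```python
      # Rough sketch: verify whether an IP address really belongs to Googlebot using
      # a reverse DNS lookup followed by a confirming forward lookup.
      import socket

      def is_googlebot(ip):
          try:
              hostname = socket.gethostbyaddr(ip)[0]  # reverse DNS lookup
          except socket.herror:
              return False
          if not hostname.endswith(('.googlebot.com', '.google.com')):
              return False
          try:
              forward_ips = socket.gethostbyname_ex(hostname)[2]
          except socket.gaierror:
              return False
          # the forward lookup must point back to the same IP address
          return ip in forward_ips

      print(is_googlebot('66.249.66.1'))  # placeholder IP from your log files
      ```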

      All in all, I would really not recommend limiting the crawling frequency for the Google bot.

      I hope this helps! Let me know if you have any other questions.

      Reply
  7. Tomáš Karafa
    19. January 2017

    Hi there,
    Few days ago , most of my website has disappeared from google search results . At the same time , google analytics has registered sharp decline in organich ( search engine ) visitors . Total daily visits dropped from 300 to 100 within about 3 days . Upon checking with webmaster tools , i get hundreds of “404-not found” errors . However what really bothers me is , that those URLs DO EXIST and they DO work ! I suspect that somehow the dynamic URL parameters are to blame . But so far , it has worked just fine … the website is written in several languages and ( being eshop ) is denominated in several currencies . Those languages and currencies are selected by $_GET parameters . To prevent people from trying to browse the pages without selected language or currency , the website automatically fills in those paramenters in case they are not present . Example :

    http://www.eurocatsuits.com/index.php …..redirects to : http://eurocatsuits.com/index.php/?la=en&currency=EUR

    in “fetch as google” , the index.php gets “redirected” status …. of course , it redirects to index.php/?la=en&currency=EUR …… but the “index.php/?la=en&currency=EUR” gets “not found” status …. however , in the browser the page works just fine ….

    any ideas ? … please help … thanks !

    Tomas

    Reply
    • Tomáš Karafa
      19. January 2017

      after sleepless night i found out , that .htaccess was to blame …i will make new one later , but for now i deleted it altogether and everything works just fine ….

      Reply
      • Eoghan Henn
        23. January 2017

        Hello Tomáš,

        I’m glad you managed to fix this.

        One general tip: You might want to get rid of those language and currency parameters in your URLs. They are not very search engine (or user) friendly.

        Please let me know if you have any additional questions.

        Best regards,

        Eoghan

        Reply
  8. Jacob Share
    17. January 2017

    I just received an email alert from GSC about a large increase in soft 404 errors. Turns out spammers from .cn domains are linking to searches on my WordPress blog for queries in another language (I assume Chinese), and the numbers have gone up almost linearly every day since Jan. 5th when it started. I suppose I could block search links from .cn domains but do you have a better idea?

    Reply
    • Eoghan Henn
      23. January 2017

      Hello Jacob,

      First of all, sorry about my late reply. I haven’t been able to keep up with all of the comments these last few days.

      Thanks a lot for sharing this interesting case, even if it sounds very annoying for you. Have you already set all of your search result pages to “noindex”? This is something every website should do anyhow, in order to avoid uncontrolled indexing of search result pages. You can use the Yoast SEO plugin to set this up.

      It might not stop the pages from showing up as soft 404 errors, but at least it will let Google know that you do not want these pages indexed. It should be enough to make sure that these pages don’t harm you.

      Another thing you might want to do is check the domains that are linking to you and see if they might be potentially harmful. It might be a good idea to use the disavow tool to exclude these links. Please note though that I am not an expert on link removal and cleanup and that you should do more research before deciding about this issue.

      Please let me know if you have any further questions.

      Best regards,

      Eoghan

      Reply
      • Jacob Share
        23. January 2017

        No worries, life is busy, just happy you replied at all 🙂

        Yes, my search pages are noindexed, via the AIOSEO WordPress plugin.

        I tried clicking through to one site, it’s a blog with a mass of links, each mostly pointing to other similarly-formatted garbage sites. The links to my site is gone and as far as I can tell, the site is being regenerated on the fly (or regularly) while replacing old links with new ones, spamming as many sites as possible.

        Reply
        • Eoghan Henn
          24. January 2017

          Looks like they might just be after visits from curious webmasters like you so they can generate some ad revenue off them. Similar to the ones that spam Google Analytics accounts with referral spam.

          Do any of these links show up in the “Search Traffic > Links to Your Site” report in Google Search Console?

          The links probably won’t harm you if they disappear again that quickly, but I guess you should keep an eye on it. As for the crawl errors… If you mark them as fixed they probably won’t show up again if the links disappear.

          I hope this helps and I hope that those spammers won’t bother you much longer.

          Reply
          • Kevin
            26. January 2017

            Hi Eoghan,

            I’m actually having the same issue that started right around the same date as Jacob.

            I received over 200 “soft 404 errors” from search URLs that are “linked from” a really strange search results page on my site that doesn’t exist.

            There are also a lot of very strange links from a few .cn websites.

            Hopefully this makes sense, I’m not familiar in dealing with crawl errors. Any help or guidance would be greatly appreciated.

            Thanks!

            Reply
            • Eoghan Henn
              31. January 2017

              Hi Kevin,

              First, I would recommend you mark the crawl errors as fixed in Google Search Console. You find this option in the crawl error report right above the detailed list of crawl errors.

              If the errors don’t show up again after that, you don’t have to worry about them any longer.

              If they do show up again, you’ll have to dig a bit deeper. Feel free to get back to me if you need any support with this.

              Best regards,

              Eoghan

  9. Ali
    1. January 2017

    hello sir this my website kindly help me my search console analytical is not working what is problem can you help i cant see any error for this http://www.subkuchsell.com website

    Reply
    • Eoghan Henn
      4. January 2017

      Hello Ali,

      I am not sure if I can help you with this question. If you do not see any data in Google Search Console, it might be because you only verified your property recently. It takes a few days until data is shown.

      If you do not see any errors, it might also be related to the fact that there simply aren’t any errors.

      Make sure you verify the right website version. The URL you enter for your Search Console property should be http://www.subkuchsell.com/.

      Let me know if there is anything else I can do for you.

      Eoghan

      Reply
  10. leanin
    29. December 2016

    Hey Eoghan,

    thanks for sharing. For an e-commerce website, my friend suggest a way to deal with 400 pages.
    1.download the search crawl error-404,
    2.past the 404 url in to txt file,
    3. put the 404.txt in the ftp,
    4.submit 404.txt to Add/Test Sitemap
    google webmaster–crawl–sitemap–Add/Test Sitemap button
    http://www.xxxxx.com/404.txt

    since we are going to dele around 4k url recently,how to deal with it very important

    Reply
    • leanin
      29. December 2016

      Fix 404 errors by redirecting false URLs or changing your internal links and sitemap entries.

      for this, steps as followings, right?

      1. 301 redirect all 404 error url to the homepage,
      2. update the sitemap
      3. sumit the sitemap,

      which one is correct?

      Reply
      • Eoghan Henn
        4. January 2017

        Yes, this is how I would suggest to do it. Just think about whether there are better targets for your 301 redirects than the home page. I would not recommend to just redirect every old URL to the home page without thinking about it. For most URLs, there will probably be better targets than the home page.

        Reply
    • Eoghan Henn
      4. January 2017

      Hi leanin,

      I am not sure why your friend recommends these steps, but this is not a solution I have ever heard of.

      Reply
  11. mirotic
    26. November 2016

    hi sir
    (i have bad english)

    can u help me fix this issue?

    my site has been block cause the yandex bot (i don really understand how this work)
    http://imgur.com/a/W1JKK

    i register my site at yandex , i couldnt find the crawl setting
    http://imgur.com/a/297Mu

    what should i do ?

    Reply
  12. mikc
    16. November 2016

    Hello Sir!

    I just built a website and google won’t crawl, won’t allow me to upload a site map either. Getting only 2 links showing when I enter site:acousticimagery.net, and one of these shows a 500 error. Also, when trying to crawl, Google doesn’t like my robots.txt file. I’ve tried several edits, removing it altogether, nothing helps. My Site host is worthless, been trying to get this fixed for 2 weeks. Any input you might have would be most appreciated!!

    Reply
    • Eoghan Henn
      16. November 2016

      Hello Mick! Thanks a lot for your comment.

      One important problem I have been able to identify is that your server always returns a 500 error instead of a 404 error when a page does not exist. Can you think of a way to fix this?

      If you want to get your pages indexed quickly, I recommend you go to “Crawl > Fetch as Google” in Google Search Console. Here you can fetch each page that is not in the index yet and then, after it has been fetched, click on “Submit to index”. This will speed up the indexing process.

      I could not find a robots.txt file or an XML sitemap on your server. The robots.txt should be located at http://acousticimagery.net/robots.txt. Right now, this URL returns a 500 error code, so I assume the file does not exist or is not in this location. You can decide how you want to name your XML sitemap, but I would recommend putting it here: http://acousticimagery.net/sitemap.xml.

      Mind you, you don’t really need a robots.txt file and an XML sitemap for a website with only 4 pages (but they won’t do any harm either). Just make sure you change that thing with the wrong error codes.

      Please let me know if you have any other questions.

      Best regards,

      Eoghan

      Reply
      • Mick
        16. November 2016

        Hello Eoghan,
        Thanks for the response! Google won’t let me crawl the site as I keep getting an error saying they can’t locate the robots.txt file. I removed the file contents and tried again, still no go. Also, everytime I try to upload an XML file it tells me the file is in an invalid format. I see the 500 errors but cannot fix them. Any other ideas? This all started when I updated the site using a website builder available from Fat Cow. Very sorry I ever tried to update as I’m getting no cooperation from them on this at all. I’m thinking of just pulling the site and cancelling my Fat Cow account. You mentioned submitting each page with fetch. How do you do this?

        Reply
        • Eoghan Henn
          19. November 2016

          Hi Mick,

          OK, thanks for the additional information. I now have a better understanding of what is going on. The Google bot tries to access your robots.txt file at http://acousticimagery.net/robots.txt and gets a 500 server error, so it decides not to crawl the page and come back later. You can fix this by fixing the error code problem I described earlier. If http://acousticimagery.net/robots.txt returns a 404 error, everything is fine and Google crawls your page.

          I do not know how this works with Fat Cow, but maybe this page will help you: http://www.fatcow.com/knowledgebase/beta/article.bml?ArticleID=620

          Here’s how to submit each page to the Google index in Google Search Console:

          1. In the left navigation, go to Crawl > Fetch as Google.

          2. Enter the path of the page you want to submit and hit Fetch.

          3. When fetching is complete, hit “Request indexing”.

          4. Complete the dialogue that pops up.

          5. Repeat for every page you want to submit to the index. Here are the paths of the pages you will want to submit:
          cd-transfers
          audio-recording
          contact-us

          I hope this helps! It will take a while until the pages show up in the search results. Let me know if there is anything else I can do for you.

          Eoghan

          Reply
  13. Sean
    15. November 2016

    I get a lot of page no found errors and when I checked the linked from info and click the links they clearly go to the actual page, which is not broken? It’s really annoying as the errors keep coming back.

    i.e.

    This error

    /places/white-horse-inn/

    is linked form here

    http://www.seanthecyclist.co.uk/places/white-horse-inn/

    Any idea what might be causing this?

    Thanks

    Reply
    • Eoghan Henn
      16. November 2016

      Hi Sean,

      I think I might need some more information to be able to help you with this. I will send you an e-mail now.

      Best regards,

      Eoghan

      Reply
  14. Donald
    5. November 2016

    I have been getting the same issue as Michael .

    how i do to fixed this error 500 http://imgur.com/a/qE4i3

    It made me lose every single keyword I was ranking for and the more I try to remove they keep coming up. As soon as I fetch the URL , search results pop back up to #2 positions for many keywords but just after a few hours looks like google crawls them again finding errors and sends the site back to the 10th page. Search rankings were gradually lost as soon as this 500 server error was discovered on webmaster.
    Now I have thought about blocking /wp-includes/ but I think you cant block it anymore due to css and js which might hurt rankings even more.

    Any help would be most appreciated.

    Reply
    • Eoghan Henn
      5. November 2016

      Hi Donald,

      You’re absolutely right, /wp-includes/ does contain some .js files that you might want Google to crawl. Your CSS is normally in /wp-content/ though.

      Also, Yoast does not block /wp-includes/ by default any more (Source: https://yoast.com/wordpress-robots-txt-example/)

      Nevertheless, it is probably a good idea to block all URLs that return a 500 error from the Google bot. So far, I’ve never had problems with blocking the entire /wp-includes/ directory (I still do it on this website), but it might be worth the while going through the directory and only blocking URLs that return a 500 server error.

      I hope this helps!

      Reply
  15. Michael
    29. October 2016

    how i do to fixed this error 500 http://imgur.com/a/qE4i3

    Reply
    • Eoghan Henn
      1. November 2016

      Hello Michael,

      You can block your /wp-includes/ directory from the Google bot by putting it in your robots.txt file. I recommend you install the Yoast SEO plugin for WordPress. As far as I know, it does it automatically.

      I hope this helps.

      Eoghan

      Reply
      • Eoghan Henn
        5. November 2016

        Please see my reply to Donald’s comment (above) for an update on this issue.

        Reply
  16. kevin
    30. September 2016

    Henn,
    We have crawl errors in webmasters.When we remove such pages from webmasters.So within how many days that page can be removed from google Webmasters.

    Reply
    • Eoghan Henn
      6. October 2016

      Hi Kevin,

      For me, there are two scenarios in which I would remove a crawl error from the report:

      1. If I know the error won’t occur again because I’ve either fixed it or I know it was a one-time thing.
      2. If I don’t know why the error occurred (i.e. why Google crawled that URL or why the URL returned an error code) and I want to see if it happens again.

      WHEN you do this really doesn’t matter much. I hope this helps! Let me know if you have any other questions.

      Reply
  17. kevin
    29. September 2016

    Hi Eoghan Henn,
    This is kevin can u tell me after removing the page from webmasters.How many days after the page can be removed from the webmasters.

    Reply
    • Eoghan Henn
      30. September 2016

      Hello Kevin,

      Thanks a lot for your comment. I am not sure if I understand your question correctly. I will send you an e-mail so we can discuss this.

      Best regards,

      Eoghan

      Reply
  18. Saud Khan
    21. September 2016

    Please help me to fix this error.

    Screenshot: http://i.imgur.com/ydZo4Wv.jpg

    I’ve deleted the sample page and redirected the second url.

    Reply
    • Eoghan Henn
      27. September 2016

      Hi Saud,

      Unfortunately the screenshot URL is not working (any more). I will get in touch with you via email and see if I can help you.

      Best regards,

      Eoghan

      Reply
  19. Ajay Murmu
    14. September 2016

    I am getting HTTP Error: 302 error in sitemaps section. All other sitemap urls are working fine but i am getting error on main sitemap.xml. How can i resolve it?

    Reply
    • Eoghan Henn
      16. September 2016

      Hello Ajay,

      thanks a lot for your comment. I am not sure I understand your question very well. I will send you an e-mail so you can send me a screenshot if you like.

      Best regards,

      Eoghan

      Reply
      • Ray
        18. December 2016

        Hello Eoghan, I would love to know if you resolved the ‘302’ problem
        I’ve had the issue of going through the wayback machine to a website but then when I click the link I need I am greeted with: ‘Got an HTTP 302 response at crawl time’ and redirected to the current website where my information is no longer.
        Would really appreciate some help if you could email me.
        internetuser52@gmail.com

        Reply
        • Eoghan Henn
          4. January 2017

          Hi Ray,

          I’ll send you an e-mail.

          Eoghan

          Reply
  20. Jennifer M
    23. August 2016

    There is a nasty website that injected a redirect on our site. We found the malware and removed it, but their site is still linking to tons of URLS on our site that don’t exist–and hence creating crawler errors.

    How would you suggest we fix this?

    THANKS!
    Jennifer

    Reply
    • Eoghan Henn
      29. August 2016

      Hi Jennifer,

      This does sound nasty :/

      It is not easy to analyse this situation with the little bit of information I have, but I guess you do not have to worry about the crawl errors too much. Look at it this way: Somebody (a spammer) is sending the Googlebot to URLs on your website that don’t exist and have never existed. Google is clever enough to figure out that this is not your fault.

      If you like, you can send me more information via email so that I can have a closer look at it.

      Reply
  21. Chris
    10. August 2016

    That’s great news. Thanks for sharing Eoghan. Keep me posted!

    -Chris

    Reply
    • Eoghan Henn
      12. August 2016

      Hi Chris,

      For now, I recommend you use Google’s Search Console API explorer. If you follow this link, the fields are already pre-filled for a list of your 404 errors with additional information about the sitemaps the false URLs are included in and the pages they are linked from:

      https://developers.google.com/apis-explorer/#p/webmasters/v3/webmasters.urlcrawlerrorssamples.list

      You just need to fill in your site URL (make sure you use the exact URL of your GSC property in the right format). You can then copy and paste the output and forward it to your IT team. I want to build a little tool that will make this easier and nicer to export, but that will take a while 🙂

      Hope this helps for now! Let me know if you have any questions.

      Reply
      • Chris Smith
        13. August 2016

        Eoghan,

        That works perfectly. Thanks a ton for the detailed response and customized URL. I hope I can return the favor someday. 🙂

        Thanks again,

        Chris

        Reply
  22. Chris Smith
    3. August 2016

    I like this strategy.

    Is there a way to download the “linked from” information in the 404 report? Would make it much easier to send the complete details to my IT team.

    Reply
  23. Dr Emixam
    21. July 2016

    Hi,

    Following a misconfiguration of another of my websites, google indexed a lot of non existant pages and now all of these pages appear in crawl errors.

    I tried to set them as 410 errors to tell they doesn’t existe anymore but google keeps them in crawl errors list.

    Do you know what is the best thing to do in this case ? And in a general way, for any page which is permanently deleted.

    Reply
    • Eoghan Henn
      2. August 2016

      Hello Dr Emixam,

      Thanks a lot for your comment and sorry about my late reply.

      From what you described, you did everything right. Just remember to mark the errors as fixed once you’ve made changes to your page. This way they should not show up again.

      Let me know if you have any further questions.

      Reply
      • Eoghan Henn
        20. October 2016

        Just to clarify this: Giving back a 410 code alone will not prevent the URLs from showing up in the 404 error reports – Google currently shows 410 errors as 404 errors. In order to stop the URLs from showing up in the reports, all links to the URLs need to be removed too. Otherwise, Google will keep on following the links, crawling the URLs and showing the errors in the reports. If there are external links to the URLs that cannot be removed, it might be better to use a 301 redirect to point to another URL that is relevant to the link.

        Reply
  24. Steven
    19. July 2016

    Hi Eoghan-

    Thanks for the great info in the article! I have an interesting (to me) issue with some of the crawl errors on our site. The total number of 404 errors is under 200 and some of them I can match to your info above. But, there are quite a few URLS that are not resolving properly due to “Chief%20Strategy%20Officer” having been appended on to each of the URLs. For example, the URL will end with “…personal-information-augmented-reality-systems/Chief%20Strategy%20Officer” and the Linked From URL is the link on our site.

    I’m going to go ahead and mark all as “fixed” and see what happens, but I was wondering if you had any idea how this may have happened?

    Thanks y ¡Saludos! from BCN…
    Steven

    Reply
    • Eoghan Henn
      29. July 2016

      Hi Steven,

      Thanks for your comment! I found your website, crawled it, and found some interesting stuff that might help you. I will send you an email about this.

      Best regards,

      Eoghan

      Reply
  25. Vicky
    17. July 2016

    Hi Eoghan Henn,

    I have over 1000, 404 not found errors on google search console for deleted products. What i will do to fix those errors. Can you please suggest me any way to fix them.

    Thanks
    Vicky

    Reply
    • Eoghan Henn
      28. July 2016

      Hello Vicky,

      When you have to delete a product page, you have a few options:

      • Is there a replacement product or a new version of the product? If so, you can 301 redirect the URL of the deleted product to this new URL. Make sure that the new URL you redirect the old URL to is very similar to the old URL though. Do not overuse 301 redirects!
      • If you want to delete a product and send a signal to Google saying that the page has been removed intentionally, you can give back a 410 status code instead of a 404 (see the sketch below this list).
      • If none of the above is possible, you can just mark the 404 errors in Search Console as fixed. Make sure you do not link to the old URLs internally any more. Google should stop crawling them then and the errors should not return. If a URL is linked from another website, you should definitely 301 redirect it to a relevant target (see first option).
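
      For the 410 option, on an Apache server this can be as simple as one line in your .htaccess file (a minimal sketch with a hypothetical URL):

      ```apache
      # Minimal sketch, assuming Apache with mod_alias and a hypothetical URL:
      # tell crawlers that this product page is gone for good (410).
      Redirect gone /old-product.html
      ```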

      I hope this helps!

      Reply
      • Vicky
        28. July 2016

        Hi Eoghan,

        Thanks for reply,

        I marked them fixed so google stop crawling them. Yes there is some deleted pages linked internally & externally. I will redirect those deleted product to similar product.

        Will inform you soon about update on it.

        Again thanks for reply!

        Reply
        • Eoghan Henn
          2. August 2016

          Hi Vicky,

          Just to clarify: Marking the errors as fixed will not make Google stop crawling them. This can only be achieved by removing all links to the URLs.

          I’m looking forward to hearing about how it went for you!

          Reply
  26. Chris
    26. June 2016

    Hi Eoghan,

    Just wanted to give you a thumbs up! Great post and super useful to me today, right now, when I discovered a bunch of 404s on a client site who just moved from an html site to a wordpress site for some of the pages they used to rank for, ie. somepage.html.

    I had used SEMRush to find as many pages as possible that they were previously ranking for and redirected them to any specific, relevant page and, when not possible, to category or broad topic pages.

    The remaining crawl errors (404s) in Search Console are some pages that didn’t show up in SEMRush and of course things like “http://myclientsite.com/swf/images.swf” Since we are sensibly no longer using Flash I guess I just don’t worry about those? Not really sure.

    Anyway, thanks for the great post!

    Reply
    • Eoghan Henn
      30. June 2016

      Hi Chris,

      Thanks for your kind words! I’m glad this article helped you.

      Yes, you can just ignore the swf file errors. If you mark them as fixed I guess they won’t show up again.

      Reply
  27. Daniel
    23. June 2016

    Hi Eoghan,

    Any thought on bigger classified ad sites handling search console?

    For instance, real estate with multiple ads that expire after a certain date, having around 30k “404” or so. What would you suggest to deal with such amount of expired content?

    Thanks in advance,

    Reply
    • Eoghan Henn
      29. June 2016

      Hi Daniel,

      Thanks a lot for your interesting question. I have no practical experience with a case like this, but let me share some thoughts:

      • One thing you can do to make it clear that the pages were removed intentionally, so there is no “error”, would be to serve a 410 status code instead of a 404 status code (see https://searchenginewatch.com/sew/how-to/2340728/matt-cutts-on-how-google-handles-404-410-status-codes)
      • Also, ask yourself: Do you really need all of these temporary pages crawled and indexed? Do you get any valuable organic traffic from them? Do they rank for queries that you could not cover with pages that are permanently available? Maybe you can build an SEO strategy for your website that takes into account the fact that you have a big number of pages that disappear after a while.

      I hope this helps!

      Reply
      • Jules
        11. September 2016

        Google treats a 410 as a 404. https://support.google.com/webmasters/answer/35120?hl=en under URL error types > Common URL errors > 404: “If permanently deleting content without intending to replace it with newer, related content, let the old URL return a 404 or 410. Currently Google treats 410s (Gone) the same as 404s (Not found).”

        Reply
        • Eoghan Henn
          13. September 2016

          Hi Jules,

          Thanks for linking to this source. Still, I guess that a 410 error code is the better choice when intentionally removing content. We do not have power over how Google interprets our signals, but we should do everything we can to make them as consistent as possible.

          Reply
  28. Josh
    24. April 2016

    Hey Eoghan – I see your website is made with WordPress, so I was hoping you’d be able to answer my question.

    I recently re-submitted my sitemap (since I thought it might be a good thing to do after disallowing /go/ in my robots.txt for affiliate links) and a few days after recrawling, I now see a new 500 error:

    /wp-content/themes/mytheme/

    Other notes:

    – This was not present before I resubmitted my sitemap, and it’s the only 500 error I’ve seen since I launched my website a month or two ago.
    – I also know that my webhost (Bluehost) tends to go down at times. Maybe this is because Google tried crawling when it was down?
    – I updated my theme a few days before the 500 error appeared.

    Do I need to take any action? Is there any other info I can provide?

    Thanks – appreciate it.

    Reply
    • Eoghan Henn
      25. April 2016

      Hi Josh! Thanks for your comment.

      First of all: This is not something you should worry about, but if you have time, you might as well try to fix it 😉

      Apparently, the type of URL you mentioned above always gives back a 500 server error. Check this out: I’m using a WP theme called “Hardy” and the exact same URL for my page and my theme also returns a 500 server error: https://www.rebelytics.com/wp-content/themes/hardy/. So it’s not Bluehost’s fault. (Fun fact: I will probably receive a 500 error for this now because I just placed that link here).

      Now the question is: Why did the Google bot crawl your theme URL in the first place? Are you linking to it in your new sitemap? If so, you should remove the link. Your sitemap should only contain links to URLs that you want indexed. You can check where the Googlebot found the link to the URL (as mentioned in the article above). Here’s a screenshot of that:

      (Screenshot: internal links for crawl errors in Google Search Console)

      If you find a link to that URL anywhere, just remove it. Otherwise, I guess you can just ignore this crawl error. It would be interesting to mark it as fixed and see if it shows up again. Let me know how it goes! And just give me a shout if you have any additional questions.

      Best regards,

      Eoghan

      Reply
      • Josh
        25. April 2016

        Awesome – thanks for the reply. It’s not linked in my sitemap and clicking on the link in GWT doesn’t show where it’s linked from, but I’ll remove it. Glad to hear it’s not really a problem.

        I also had 2 other quick questions:

        In general, do I only need to worry about crawl errors/warnings for relevant webpages (webpages that I want indexed and webpages that should be redirected since they’re being clicked on)? Some warnings are shown for:
        /m
        /mobile
        /coming-soon

        No idea how these appeared, and it shows they’re linked from my homepage, even though I have no idea how that is possible.

        Also, my Amazon affiliate links (cloaked with /go/) got indexed a few weeks ago, and roughly a week ago, I put rel=”nofollow” for each link and also added “Disallow: /go/” under “User-agent: *” in my robots.txt.

        It’s been a week, and my affiliate links are still indexed when I enter “site:mysite.com”. Do you think I’m missing anything, and how can I find out if I’m still being penalized for them?

        Thanks for the help – greatly appreciated.

        Reply
        • Eoghan Henn
          16. May 2016

          Hi Josh! Sorry it took me so long to reply to this one. You’re right, you should worry more about crawl errors for relevant pages that you want indexed. Nevertheless, it is always a good idea to also have a closer look at all the other crawl errors and try to avoid them in future. Sometimes, though, there’s nothing much you can do (like in the JavaScript example in the article).

          What kind of redirects are you using for your Amazon affiliate links? Make sure you use a 301 redirect so they don’t get indexed.

          I hope this helps!

          Reply
  29. Dermid
    25. February 2016

    Eoghan,
    Thanks for the very good input. A related question: I’m getting three different numbers for Google indexed pages. 1) when I type site:mysite.com I get 200,000 and 2) when I look in Google Search Console index status it reports 117,000 and 3) when I look at crawled site map it reports only 67 pages indexed. Can you help me understand these varying index numbers? Thank you very much.
    Dermid

    Reply
    • Eoghan Henn
      25. February 2016

      Hello again! You get different numbers here because you are looking at three different things:

      1) site:mysite.com shows you all the pages on your domain that are currently in the index. This includes all subdomains (www, non-www, mobile subdomain) and both protocols (http and https).
      2) shows you all indexed pages within the Search Console property you are looking at. A Search Console property can only include URLs with a combination of one protocol and one subdomain, so if the name of your Search Console is https://www.mysite.com/, only URLs that start with https://www.mysite.com/ (and that are indexed) will show here.
      3) shows you all URLs that are included in this exact sitemap and that are indexed.

      I found https, http, www, non-www, and m. (mobile subdomain) pages of your domain in the Google index. You should make sure all of your pages are only available via https and decide whether you want to use www or not (this is a matter of taste). You can easily set this up with two 301 redirect rules: one that redirects every http URL to its https equivalent and one that redirects all non-www URLs to their www equivalents (or vice versa). Last but not least, make sure you are using the right Search Console property (so https://www.mysite.com/ or https://mysite.com/, depending on how you decide on the www or non-www matter) and post a sitemap with all the URLs you want to be indexed.
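
      On an Apache server, a sketch of these two rules in .htaccess could look like this (assuming mod_rewrite is enabled, https plus www as the target version, and mysite.com as a placeholder domain):

      ```apache
      # Rough sketch, assuming Apache with mod_rewrite and a placeholder domain:
      # send every http and/or non-www request to https://www.mysite.com/... with a 301.
      RewriteEngine On
      RewriteCond %{HTTPS} off [OR]
      RewriteCond %{HTTP_HOST} !^www\. [NC]
      RewriteRule ^(.*)$ https://www.mysite.com/$1 [R=301,L]
      ```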

      Once you’ve followed this, you should work on closing the gap between 1), 2) and 3). If you have a healthy website and you’re in control of what Google is indexing, all three numbers should be on a similar level.

      Reply
  30. Dermid
    23. February 2016

    Eoghan,
    Thank you. Our development team is using your advice because we have very similar issues with crawl errors. On a related note, I’m trying to understand the relationship between crawl errors and indexed URLs. When our URLs are indexed we do very well with organic search traffic. We have millions of URLs in our submitted sitemap. Within Google Search Console, our indexed URL number jumped from zero to 100K on January 4th but has stayed at about that level since then. Should we expect that when we fix the crawl errors the indexed URLs will rise?
    Thank you,
    Dermid

    Reply
    • Eoghan Henn
      24. February 2016

      Hi Dermid,

      Thanks a lot for your comment and your interesting questions.

      Crawl errors and indexed URLs are not always directly related. 404 errors normally occur when the Googlebot encounters faulty URLs that are not supposed to be crawled at all, for example through broken links. Server errors, on the other hand, can occur with URLs that are supposed to be indexed, so fixing these might result in a higher number of indexed URLs.

      If you have millions of URLs in your sitemap, but only 100k of them are indexed, you should work on closing this gap. First of all, check whether you really want millions of URLs in your sitemap, or whether lots of those pages aren’t actually relevant entry pages for users searching for your products or services in Google. It is better to have a smaller number of high-quality pages up for indexing than a larger number of low-quality pages.

      Next, check why a big part of the URLs you submitted in your sitemaps hasn’t been indexed by Google. Note that submitting a URL in a sitemap alone will normally not lead to indexing. Google needs more signals to decide to index a page. If a large number of pages on your domain is not indexed, this is normally due to poor internal linking of the pages or poor content on the pages. Make sure that all pages you want in the Google index are linked properly internally and that they all have content that satisfies the needs of the users searching for the keywords you want to rank for.

      I hope this helps!

      Eoghan

      Reply
  31. Arun
    14. January 2016

    Hi Eoghan,

    My site was hacked, and since then I have been getting a lot of errors in Search Console. I didn’t know what to do, so following your tips I am going to mark the errors as fixed, because those URLs are not available on my website any more (I removed them). But one main landing page keeps getting a 521 error code, and I couldn’t find a good solution for it anywhere. The bigger issue is that only my home page gets crawled by Google – the other pages aren’t crawled at all, even though I have submitted sitemaps and used Fetch as Google. Please check my website error details below and recommend a solution, or mail me. Please help!

    URL – response code – date detected
    hammer-testing-training-in-chennai.php – 521 – 11/2/15
    blog/?p=37 – 500 – 12/27/15
    blog/?m=201504 – 500 – 12/7/15
    userfiles/zyn2593-reys-kar-1623-moskva-rodos-ros6764.xml – 521 – 11/2/15
    userfiles/cez3214-aviabileti-kompanii-aer-astana-myv9933.xml – 521 – 11/3/15
    userfiles/wyz5836-bileti-saratov-simferopol-tsena-gif9086.xml – 521 – 11/3/15

    Reply
    • Eoghan Henn
      15. February 2016

      Hi Arun,

      First of all I would like to say that I am very sorry about my late reply. I have been very busy lately and didn’t find time to reply to any comments on here.

      You did the right thing marking the errors as fixed and waiting to see if they occur again. Especially 5xx errors are normally temporary. Did any of these show up again?

      The other problem about important pages not being indexed is probably not related to the crawl errors problem. I am not able to determine the cause of this problem without further research, but I did find one very important problem with your website that you need to solve if you want your pages to be indexed properly:

      In your main navigation, some important pages are not linked to directly, but through URLs that have a 302 redirect to the target. Example:

      /hammer-testing-training-in-chennai.php is linked in the main navigation as /index.php?id=253.

      /index.php?id=253 redirects to /hammer-testing-training-in-chennai.php with a 302 status code. I am not surprised that Google will not index either of the two in this case. You should make sure that you always link directly to the target URL, and you should absolutely avoid redirects in internal links. And, in general, there are very few cases where a 302 redirect is needed. Normally, you will need a 301 redirect if you have to redirect a URL.
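
      If you cannot change the navigation links right away, you could at least turn that redirect into a 301. In an Apache .htaccess file, a rule for this one URL might look something like this (just a sketch – please test it before relying on it):

      RewriteEngine On

      # Permanently redirect /index.php?id=253 to the real page and drop the query string
      RewriteCond %{QUERY_STRING} ^id=253$
      RewriteRule ^index\.php$ /hammer-testing-training-in-chennai.php? [R=301,L]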

      I am not sure if this is going to solve all of your problems, but fixing your internal links is definitely an important item on your to-do list. Please let me know if you have any other questions.

      Reply
  32. Greg
    30. October 2015

    I have launched a new site and for some reason I am getting a 500 error for a number of URLs in Webmaster Tools, including the sitemap itself. When I check my logs for access to the sitemap, for example, it shows Google has accessed the sitemap and no errors were returned:

    66.249.64.210 – – [29/Oct/2015:02:19:31 +0000] “GET /sitemap.php HTTP/1.1” – 10322 “-” “Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)”

    Also, if I access any of these URLs, they appear perfectly fine.

    Thanks

    Reply
    • Eoghan Henn
      1. November 2015

      Hello Greg, this looks like a temporary problem to me. What I would suggest is to mark the corresponding errors as fixed in the crawl error report and see if they show up again. If they do not show up again, everything is fine.

      Reply
  33. Artin
    18. September 2015

    Hi
    Good post! I get a lot of 500 errors in GWT because I have disabled my feeds. What should I do about that?
    I disabled the feeds because other websites steal my content.
    Can you help me?
    Thanks

    Reply
    • Eoghan Henn
      22. September 2015

      Hello Artin, I am not quite sure if I understand your problem correctly. Which URLs return 500 errors? The URLs of your feeds? Are you still linking to them? If so, you should definitely remove the links. Also, you can check if it is possible to return a 404 instead of a 500 for your feed URLs. This would be a better signal for Google. It might even be a good idea to 301 redirect the URLs of your feeds to pages on your website, if you find a good match for every feed URL. If you explain your problem in more detail I will be happy to help.
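
      Just as an illustration: if you decided to 301 redirect WordPress-style feed URLs to their parent pages via .htaccess, a rough sketch could look like this (untested – a plugin might be the easier route):

      RewriteEngine On

      # Redirect any URL ending in /feed or /feed/ to its parent page
      RewriteRule ^(.+)/feed/?$ /$1 [R=301,L]

      # Redirect the main feed of the site to the homepage
      RewriteRule ^feed/?$ / [R=301,L]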

      Reply
      • Artin
        17. October 2015

        Hey Eoghan
        Thanks for your reply! I found a plugin named “Disable Feeds” that redirects all feeds to the homepage, so I got rid of those 500 errors! Plus, it doesn’t let those spammy sites steal my content.

        Reply
        • Eoghan Henn
          19. October 2015

          Hello Artin, thanks a lot for sharing the info about the plugin. Sounds useful!

          Reply
  34. Ossai Precious
    14. September 2015

    Good post! I have a similar problem and just don’t know how to tackle it. Google Search Console shows that 69 pages have errors, and I discovered that the 404 errors come up whenever a ‘/’ is added after the URL.

    Reply
    • Eoghan Henn
      15. September 2015

      Hello Ossai,

      Google only crawls URLs that are linked somewhere, so you should first of all try to find the source of this problem. In Search Console, you can find information on where the URLs with errors are linked from. It is very likely that somewhere on your page or in your sitemap you link to those URLs with a trailing slash that return 404s. You should fix those links.

      The next thing you can do is make sure that all URLs that end with a slash are 301 redirected to the same URL without the trailing slash. You should only do this if all of your URLs work without a trailing slash. It only requires one line in your htaccess file.

      If you have any other questions I will be happy to help.

      Reply
      • Chris
        8. December 2016

        I have exactly this issue right now. Can you explain the correct htaccess code?

        Reply
        • Eoghan Henn
          14. December 2016

          Hi Chris,

          I am really not an expert on creating rewrite rules in htaccess files, so don’t rely on this, but this one works for me:

          RewriteRule ^(.*)/$ /$1 [R=301,L]

          Make sure you only apply it to the URLs you want to redirect by setting a rewrite condition.
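
          For example, a condition that skips existing directories (which usually keep their trailing slash) could look like this – again, just a sketch that you should test on your own setup:

          RewriteEngine On

          # Only strip the trailing slash if the request is not an existing directory
          RewriteCond %{REQUEST_FILENAME} !-d
          # R=301 makes the redirect permanent (R on its own would issue a 302)
          RewriteRule ^(.*)/$ /$1 [R=301,L]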

          I hope this helps!

          Reply
