December 03, 2017

SEO #2 - Manage Your Blog / Website in Google Webmaster Tools


How are you getting along with Google Webmaster Tools (GWMT)? Last time we discussed using and managing the settings in the Configuration section. I hope it helped you understand those settings. At least you know how to get started and what's in the Configuration section, don't you? If you missed the previous post, read Manage Your Website in Google Webmaster Tools: Part 1. Now it's time to move on and continue the series. Let's take a look at the Health section and what it has for bloggers and webmasters.

Health


The Health section is dedicated to displaying the status of your site from Google's point of view, that is, according to the data Google has collected about your site. If Google identifies important issues with your site, such as important pages being blocked from Googlebot or a malware infection, you will be able to find the necessary information here.


Crawl Errors


By now I believe you are familiar with the term crawling. Here you can view the errors Googlebot experienced when crawling your blog or website. Simply put, these are the errors that occurred when Google tried to access your site to scan through your content.

Site Errors


Would you mind if your site has errors? If your site cannot be accessed, obviously your content cannot be crawled either. As a prerequisite, Google checks the status of your site's DNS, hosting server and robots.txt before stepping into the crawling process.


You can click on each button to access the data reported for the last 90 days.

DNS - The Domain Name System, aka DNS, is responsible for translating your domain name into an IP address. Imagine a translator :) Googlebot can't connect to your site using its human-readable address (www.cracko.net) like we do; it needs the IP address allocated for it. You know, computer technology is based on 1s and 0s.

If Googlebot can't communicate with the DNS server properly, Google won't be able to access your site. In that case, the DNS issues being experienced will be reported here.
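To get a feel for what that translation step looks like, here is a minimal Python sketch that asks the system resolver for the IP address behind a host name. The function name resolve_host is made up for this post, and it only mimics the lookup itself, not how Googlebot resolves names internally.

```python
import socket


def resolve_host(hostname):
    """Return the IPv4 address for hostname, or None if the lookup fails."""
    try:
        # Ask the system resolver (and so, DNS) for the address.
        return socket.gethostbyname(hostname)
    except OSError:
        # socket.gaierror, raised on lookup failures, is a subclass of OSError.
        return None
```

Calling resolve_host("www.cracko.net") gives you the IP address if the lookup works, and None if it doesn't, which is roughly the situation a DNS error describes.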


Server Connectivity - Hosting servers are responsible for holding your site data and files, and for offering the services that make your site accessible online. If the hosting server responds properly, Google will be able to access your site content for crawling. Otherwise, Google has to wait until it can access your site.

Common causes of connectivity errors are that your server is completely down or too busy to respond to Googlebot, maybe due to an exceeded bandwidth limit. In addition, there could be server configuration issues that conflict with Googlebot.
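If you'd like to check the most basic part of this yourself, the sketch below simply tries to open a TCP connection to a server within a time limit. The name can_connect and the default port and timeout are my own assumptions for illustration; a successful connection only tells you the server is reachable, not that every page loads fine.

```python
import socket


def can_connect(host, port=80, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False
```

For example, can_connect("www.cracko.net") would return False if the server is down or too slow to answer within five seconds.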


Robots.txt Fetch - Simply put, robots.txt is a plain text file with instructions for search engine bots. Can't search engine bots crawl a site without it? They CAN, but robots.txt is the way to tell them what is NOT to be crawled and who is NOT allowed to crawl the content of the site.

Googlebot looks for robots.txt before it starts crawling your site, to check whether it is allowed to crawl and which pages are disallowed from crawling. If Googlebot is blocked from crawling your site, or robots.txt exists but can't be fetched, Google won't crawl your site at all. A site with no robots.txt file at all is fine; Google simply treats everything as crawlable.

Want to see the robots.txt file of your site? View it by appending /robots.txt to your site address, e.g. www.cracko.net/robots.txt
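To make the "what's NOT to crawl and who is NOT allowed" idea concrete, here is a small sketch using Python's built-in robots.txt parser. The robots.txt content, the BadBot name and the page paths are all made up for illustration, and Python's parser follows simpler matching rules than Google does, so treat it as a rough model only.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /search
Disallow: /private/
"""


def load_rules(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(SAMPLE_ROBOTS_TXT)

# Everyone may crawl ordinary pages...
print(rules.can_fetch("Googlebot", "http://www.cracko.net/2012/12/some-post.html"))  # True
# ...but nobody may crawl what is listed under "User-agent: *"...
print(rules.can_fetch("Googlebot", "http://www.cracko.net/private/notes.html"))  # False
# ...and BadBot is not allowed to crawl anything at all.
print(rules.can_fetch("BadBot", "http://www.cracko.net/2012/12/some-post.html"))  # False
```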

URL Errors


Here you can view URL errors, reported when Google was unable to crawl individual pages of your blog or website. It also shows URL errors that occurred when crawling the mobile version of your site.


Server error - It's similar to what we discussed under Server Connectivity, but here the reporting is specific to individual URLs. GWMT reports the URLs that couldn't be crawled successfully due to server-side errors. If it reports server errors frequently, you will need to pay attention to the reliability of your hosting provider.

Not found - Not found errors occur when Google tries to crawl a page that doesn't exist on your site. Mostly it will be a page that has been removed from your site. Also, Googlebot may arrive at your pages from external sites via backlinks.

If someone has linked to a page that was removed from your site, or misspelled the URL, the link leads to a non-existent page, which returns the response code 404, aka not found.

Reviewing not found errors is a good opportunity to find out if someone links to a non-existent page on your site. Why not, the link may even be coming from your own site. It's another way to catch some broken links.

Just click on the Not found box and you will be able to see the URLs identified as non-existent.


Further, clicking on each URL allows you to explore more details about the error and how Googlebot found that URL.

Jump to the Linked from tab to find the backlinks pointing to that specific URL. Warning: You will find some broken links ;) After fixing the issue, you can click the Mark as fixed button to make the URL disappear from the list of Not found URLs.
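You don't have to wait for Google to report broken links coming from your own pages. Below is a rough sketch that collects the links in a page's HTML and keeps the ones answering with 404. All the names here (LinkCollector, http_status, find_broken_links) are my own for illustration, and some servers don't answer HEAD requests properly, so don't treat the result as the final word.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code  # e.g. 404 for a page that does not exist
    except OSError:
        return None  # DNS failure, refused connection, timeout, ...


def find_broken_links(html, base_url, get_status=http_status):
    """Return the absolute URLs linked from html that answer with 404."""
    collector = LinkCollector()
    collector.feed(html)
    broken = []
    for href in collector.links:
        url = urljoin(base_url, href)  # resolve relative links against the page URL
        if url.startswith(("http://", "https://")) and get_status(url) == 404:
            broken.append(url)
    return broken
```

The get_status parameter is there so you can plug in your own status lookup, for example a cached one, instead of hitting every link over the network.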



Other - Other errors are the errors Google experienced that are neither server errors nor not found errors, yet still prevented Google from crawling your pages. For example, protected content that requires user credentials to access.

If your site has pages that are not meant for the public and they appear under Other, you can ignore those URLs.
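As a rough mental model of the three boxes above, you can think of them in terms of HTTP status codes. The mapping below is my own simplified assumption for illustration, and the function name classify_crawl_result is made up; it is not Google's actual classification logic.

```python
def classify_crawl_result(status_code):
    """Map an HTTP status code to one of the buckets described in this post.

    This is a simplified, assumed mapping for illustration only.
    """
    if 200 <= status_code < 400:
        return "OK"  # page (or a redirect to it) was served
    if status_code == 404:
        return "Not found"
    if 500 <= status_code < 600:
        return "Server error"
    return "Other"  # e.g. 401 or 403 for pages that need a login
```

So a members-only page answering with 403 would land under Other, which matches the protected content example above.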

Crawl Stats


You can find crawling statistics for your blog or website here. I don't think you will need a detailed explanation, as the graphs say it all.

Most importantly, you can view the number of pages crawled per day over the last 90 days. It also shows download information related to the crawling process.

Blocked URLs


Robots.txt is used to tell search engine bots which pages on your site they are not allowed to crawl. Earlier I mentioned how you can view your site's robots.txt file manually.

Here you can see how many pages are being blocked from Google by the robots.txt file.

Further, you can test your robots.txt file against different URLs of your site, and it will show whether Googlebot is allowed to crawl a page or not. It helps to be familiar with how a robots.txt file works, so you can change the values and test them out against Googlebot.


The test is for demonstration purposes only and won't make changes to the actual robots.txt file associated with your blog or website. By default, Googlebot is selected, and you can select another Google crawler if you wish to test that too.


Once you test robots.txt against a URL, the results will be shown below.


When the result says a URL is blocked, Googlebot is not allowed to crawl it. Simply put, the content of that page won't appear in Google search results.
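If you want to play with a draft robots.txt before touching the real one, you can imitate this kind of test offline. The sketch below checks a list of URLs against draft rules for a crawler name of your choice. The draft rules, the paths and the function name check_urls are all hypothetical, and Python's parser is simpler than Google's, so the real tool may disagree on edge cases.

```python
from urllib.robotparser import RobotFileParser

# A made-up draft robots.txt; edit it freely, nothing is uploaded anywhere.
DRAFT_ROBOTS_TXT = """\
User-agent: Googlebot-Mobile
Disallow: /desktop-only/

User-agent: Googlebot
Disallow: /drafts/
"""


def check_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return {url: "Allowed" or "Blocked"} for the given crawler name."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        url: "Allowed" if parser.can_fetch(user_agent, url) else "Blocked"
        for url in urls
    }


urls = [
    "http://www.cracko.net/drafts/new-post.html",
    "http://www.cracko.net/desktop-only/page.html",
]
for agent in ("Googlebot", "Googlebot-Mobile"):
    print(agent, check_urls(DRAFT_ROBOTS_TXT, urls, user_agent=agent))
```

With these draft rules, Googlebot is blocked from the /drafts/ page only, while Googlebot-Mobile is blocked from the /desktop-only/ page only.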

Fetch as Google


This is a very helpful tool if you are experiencing issues with your site or pages in Google search results. Let's say you can't find a specific page listed in Google search results. You can use this tool to fetch it as Google and see if Googlebot can crawl it successfully.


Enter the rest of the URL in the given text field, or leave it empty and just click the Fetch button to crawl the homepage of your blog or website.

You can select the Web option to crawl as Googlebot, or one of the other options to crawl as Googlebot-Mobile. If all fetch statuses come back as Success, Google is capable of crawling your page.


You can also click on an individual fetch status to view a detailed report of how Googlebot fetched that particular page.
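Outside GWMT, you can get a rough imitation by requesting a page while identifying yourself with Googlebot's user-agent string. The sketch below only builds such a request; the function name build_fetch_request is made up for this post. Keep in mind the request still comes from your own IP address, so a server that verifies real Google crawlers may treat it differently than the actual tool.

```python
from urllib.parse import urljoin
from urllib.request import Request

# User-agent string of Google's desktop web crawler.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_fetch_request(site, path=""):
    """Build a GET request for site + path that identifies itself as Googlebot."""
    url = urljoin(site, path)  # an empty path means the homepage
    return Request(url, headers={"User-Agent": GOOGLEBOT_UA})
```

Passing the result to urllib.request.urlopen performs the actual fetch, and a 200 response is roughly what a Success status stands for.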



Index Status


How many pages are indexed by Google right now? You are probably curious to know.

Basically, you will see the number of indexed pages on the Basic tab. It doesn't include duplicates and pages that have been blocked from indexing. You can switch to the Advanced view to find more details.


I find the graphical representation very helpful. You can see whether Google keeps indexing your new content, while a drop may indicate a problem with indexing your pages.


Malware


Are you aware of the security of your blog or website? If Google identifies that your site has been infected by malware, you will be able to find information here.

No one can add malware to your blog or website without having access to change the settings or content of your site. The common reasons could be:

● A plugin / gadget or a code snippet you have added to your site is acting as malicious software

● Someone else has taken control of your site and is adding malicious content. Simply put, your site has been hacked.

You can find a detailed view of the security of your site via the Google Safe Browsing Diagnostic page. Replace www.cracko.net with your site address in the URL below and navigate to it to see the security status.
