Search engines appreciate websites with clean, well-organized HTML source. Following a few guidelines while building a page improves our chances of a better evaluation when Googlebot visits the site. One thing is certain: you can't trick search engines in the long term, because they will probably discover the trick sooner or later and ban or penalize your site. This article covers the rules for building clean web pages that crawlers appreciate. Do on-page SEO without trying to fool search crawlers.
Meta tags are short text snippets in the head section of the page that describe the content of a web page. They are not visible to visitors, but they help search engines determine what the page is about.
The meta description is a short text of about 160 characters that briefly describes what the page is about.
<meta name="description" content="This page is about .... ">
Google has announced that they no longer consider meta keywords, which is a list of the most important keywords related to the page.
<meta name="keywords" content="bad,html" />
With the robots meta tag we can ask search engines not to index the page (noindex) and not to follow its links (nofollow).
<meta name="robots" content="noindex, nofollow">
Declared in the head section of the HTML document, the title tag is probably the most important property of the page. It should be 50-60 characters long and contain the most important keywords of the page. This text shows up in browser tabs, and it is usually the blue title shown on the Google search results page.
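As an illustration, a title declaration could look like this (the title text below is a made-up example):

```html
<title>Clean HTML Rules for Better SEO | Bad HTML</title>
```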
robots.txt is a text file placed in the root folder of the website. When a crawler visits the domain, this is supposed to be the first address it checks to see whether it is allowed to crawl the site. Search engine crawlers like Googlebot also look for a sitemap location here.
The following example allows robots to crawl the site, except the denied folder, and defines an XML sitemap:
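A robots.txt along those lines might look like the sketch below; the /denied/ folder name and the sitemap path are placeholders, not taken from the original:

```text
User-agent: *
Disallow: /denied/

Sitemap: https://badhtml.com/sitemap.xml
```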
The following code in the robots.txt forbids all crawlers from crawling the pages of the site:
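The standard form of such a blanket deny is:

```text
User-agent: *
Disallow: /
```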
When a specific piece of content can be accessed from multiple URLs, we can tell search engines which page we'd like to be included in the search index. For example, the Bad HTML website can be accessed from two different URLs, https://badhtml.com/ and https://badhtml.com/#white, and we can specify that the first should be indexed.
The canonical URL definition of the home page of this website: <link rel="canonical" href="https://badhtml.com/" />
Make sure to properly redirect visitors and robots when you delete or change a URL on the website, even if you don't link to that specific address anywhere on the site.
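As a sketch, on an Apache server a permanent (301) redirect can be declared in the .htaccess file; the paths below are hypothetical:

```apache
# Send visitors and crawlers from the deleted URL to the new one with a 301 status
Redirect 301 /old-page.html https://badhtml.com/new-page.html
```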
Good SEO Checklist
- Accessibility
page easily accessible by crawlers, included in the sitemap, robots.txt doesn't block it, and the content is not loaded dynamically.
- Unique Content
long enough, not copied from another website.
- Good User Experience
easy navigation, aesthetic layout, fast page load, mobile support.
- Keywords
the targeted key phrases are included in the URL, page title, meta description, the content and the links.
- Social Signals
the visitors find the article interesting and recommend it to others on social media.
- Mobile Support
the pages must render correctly on desktop, tablet and mobile devices.
- Meta Data, Schema
include meta tags, define author/publisher, follow the schema.org recommendations.
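As a sketch of schema.org markup, author and publisher can be declared with a JSON-LD snippet in the head section; every name and value below is a placeholder:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Example article title",
  "author": { "@type": "Person", "name": "John Doe" },
  "publisher": { "@type": "Organization", "name": "Bad HTML" }
}
</script>
```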