Single Page Apps – Notes on Search Engine Optimization (SEO)

One of my readers mentioned that single page apps have issues with search engine optimization (SEO). Because content is loaded dynamically via JavaScript calls rather than as part of the initial page load, search engine crawlers won’t see all of it.

Let me explain.

You Really Want Crawlability?

“If I’m really concerned about SEO, the crawlability of my site, and the persistence of my links, should I use Sammy or other single page app routers?” The answer, without a doubt, is no.

Sammy and the ‘#’ are for applications. The hash provides a way to maintain state in a world where you can require JavaScript and even require the presence of certain browsers. And if your application requires login/signup, the content you show can depend on the particular user. So there are times you probably don’t even want crawlability, especially when you are using ‘#’ to maintain state for a specific user in a specific session.

Are you building an ‘application’ or a site? Does your content need to be searchable and reachable by the entire web? Do your links require true permanence and will changing them in the future “break” traffic to your site?

So you still want crawlability?

But, Yes I Do

This presents a problem: your application depends on JavaScript and ‘#’, and Google doesn’t know about those. What to do?

You can provide snapshots. A service like Aerobatic specializes in hosting for single page apps. As part of the service, Aerobatic provides a Snapshot module which allows even single page applications to be fully discoverable by search engines.

OR

You can tell the crawlers that you want your app to be crawlable. Google has provided a Crawling Specification that gives you the roadmap for what to do if you do not use a snapshot service.

#!

Your application must tell Google (and probably other search providers) that it uses the AJAX crawling scheme, so that the crawler requests the “ugly” URLs, in which the hash fragment is passed as an _escaped_fragment_ query parameter. An application can opt in with either or both of the following:

  • Use #! in your site’s hash fragments.
  • Add a trigger to the head of the HTML of a page without a hash fragment (for example, your home page):
    <meta name="fragment" content="!">

This instructs Google’s spider to use the Ajax crawling specification with your site. When it sees this tag, it fetches your page again, this time with the _escaped_fragment_ query parameter. Your server detects this parameter and serves up spider-safe content.

Once the scheme is implemented, AJAX URLs containing hash fragments with #! are eligible to be crawled and indexed by the search engine.
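To make the “pretty” to “ugly” mapping concrete, here is a minimal sketch of the rewrite the crawler performs on a #! URL. The function name toUglyUrl is made up for illustration, and the escaping is simplified: it only percent-encodes whitespace, #, %, & and +, which I am assuming is enough for typical routes.

```javascript
// Sketch of the crawler-side rewrite: "#!state" becomes "?_escaped_fragment_=state".
// toUglyUrl is a hypothetical helper, not part of any library.
function toUglyUrl(prettyUrl) {
  const marker = prettyUrl.indexOf('#!');
  if (marker === -1) return prettyUrl; // no #! fragment, nothing to rewrite

  const base = prettyUrl.slice(0, marker);
  const fragment = prettyUrl.slice(marker + 2);

  // Simplified escaping (assumption): encode only characters that would
  // otherwise change the meaning of the query string.
  const escaped = fragment.replace(/[\s#%&+]/g, (ch) => encodeURIComponent(ch));

  // Append to an existing query string if the URL already has one.
  const separator = base.includes('?') ? '&' : '?';
  return base + separator + '_escaped_fragment_=' + escaped;
}

console.log(toUglyUrl('http://example.com/#!/products'));
// → http://example.com/?_escaped_fragment_=/products
```

Your server never sees the part after the ‘#’ in a normal request, which is why the crawler has to move it into the query string before asking for the page.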

SammyJS

This means that when you define your SammyJS  routes you should use:

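// Assumes the app renders into an element with id "main" (hypothetical).
var app = Sammy('#main', function() {
  this.get('#!/', function() {
    // load some data
    // render your template
    // ...
  });
});
app.run('#!/'); // start the app at the '#!/' route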

And history

NOTE: While there are indications that Google’s spiders can index some URLs they find by exercising your JavaScript, they may not render and index complex JS web apps without a little help.

The Ajax crawling specification was originally intended for JS apps that use the hash fragment in the URL, which was a popular technique for creating permalinks when the spec was initially developed. However, you can still use the same spec, with a few tweaks, for modern JS apps using HTML5’s pushState to modify the browser’s URL and history.

If you are using SammyJS, it preserves your back button and your history as your user walks through your app.

And title

Make sure that you provide at least a title, meta description, header and text content on each page. Also make sure the meta description matches what you want to be displayed on the search results page.

There’s a Sammy plugin to help you with the title: Sammy.Title is a very simple plugin for setting the document’s title. It supplies a helper for setting the title within routes (title()) and an app-level method for setting the global title (setTitle()).

Roll Your Own Snapshots

OR

According to Google, if a lot of your content is created in JavaScript, you may want to consider using a technology such as a headless browser to create an HTML snapshot. For example, use HtmlUnit. HtmlUnit is a Java-only headless browser emulator. If your server-side technology is not Java, you may still want to use a headless browser technology to produce HTML snapshots, which makes Aerobatic’s service look pretty cool.
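However you produce the snapshots, your server still has to recognize a crawler request and hand back the matching HTML. Here is a minimal sketch of that lookup using only Node’s built-in URL class. The snapshots map, its contents, and the function names are all hypothetical; in practice the HTML would come from your headless browser or snapshot service.

```javascript
// Hypothetical snapshot store: app state (the text after "#!") mapped to pre-rendered HTML.
const snapshots = new Map([
  ['', '<html><body><h1>Home</h1></body></html>'],
  ['/products', '<html><body><h1>Products</h1></body></html>'],
]);

// Returns the fragment the crawler asked for, or null when the request
// carries no _escaped_fragment_ parameter (a normal visitor).
function escapedFragmentOf(requestUrl) {
  // The base only exists so relative request paths such as "/?a=1" can be parsed.
  const url = new URL(requestUrl, 'http://localhost');
  return url.searchParams.get('_escaped_fragment_');
}

// Returns snapshot HTML for crawler requests, or null to fall through to the JS app.
function snapshotFor(requestUrl) {
  const fragment = escapedFragmentOf(requestUrl);
  if (fragment === null) return null;
  return snapshots.has(fragment) ? snapshots.get(fragment) : null;
}

console.log(snapshotFor('/?_escaped_fragment_=/products') !== null); // true
console.log(snapshotFor('/'));                                       // null
```

An empty value (/?_escaped_fragment_=) is what the crawler sends for a page that opted in through the meta tag rather than a #! fragment, so the sketch maps the empty string to the home page snapshot.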

More About Aerobatic

Aerobatic is a cloud platform for front-end developers that makes it fun to build nimble HTML5 web apps in record time. So what is Aerobatic? In a nutshell, it’s a platform as a service (PaaS) for single page web apps. You could think of it as Heroku for front-end client apps.

Resources

#-ish

SEO in JS Web Apps

Making AJAX Applications Crawlable

How do I create an HTML snapshot?

AngularJS vs Knockout – SPA Routing/History (8 of 9)