User blog:TOR/Caching Explained

If you’ve been at Wikia for a while, chances are you’ve heard about caching. If you have written to Wikia Support with a bug report, you’ve probably been asked to “clear your cache”, or told that something is a “caching issue”. In this blog post I’ll explain what cache is and how it makes your wiki faster on various levels.

How cache works
Caching is, to put it simply, storing any kind of information close at hand, so that it can be retrieved faster. What does that mean, exactly? Here’s an example:

When you type the address of your wiki into your browser (or click a link), your browser downloads the contents of the page and all related JavaScript files, styling information and images. Without caching, when you go to a different page on the same wiki, your browser would have to download all that information again, even though the article content is the only part that has actually changed. The area surrounding the content (the Wikia logo, the wiki’s wordmark, the skin’s colors, most of the JavaScript, and many other elements) is identical to what you saw on your first page view.

So, to save time, your browser stores, aka caches, everything it can and tries to download as little as possible. The browser then combines freshly downloaded elements that are unique (such as the article content) and bits of previously saved -- or cached -- information and displays that to you. Because there’s less data to fetch from the server, the page loads faster.

This particular level of cache is called browser cache, because the information is stored in your browser.
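To make the example above concrete, here is a minimal sketch in Python. It is a toy model and not how any real browser is implemented: `BrowserCache`, `fake_server`, `load_page` and the file names are all made up for illustration.

```python
class BrowserCache:
    """Toy model of a browser cache: maps a URL to content downloaded earlier."""

    def __init__(self, fetch_from_server):
        self._fetch = fetch_from_server  # hypothetical download function
        self._store = {}
        self.downloads = 0               # how many times we hit the server

    def get(self, url):
        # Only download when we do not already have a saved copy.
        if url not in self._store:
            self.downloads += 1
            self._store[url] = self._fetch(url)
        return self._store[url]


def fake_server(url):
    # Stand-in for a real network request (assumption for this sketch).
    return f"<contents of {url}>"


def load_page(cache, article_url):
    # Every page needs the same shared files plus its own article content.
    shared = ["/skin.css", "/site.js", "/wordmark.png"]
    return [cache.get(url) for url in shared + [article_url]]


cache = BrowserCache(fake_server)

load_page(cache, "/wiki/Page_One")
first_visit = cache.downloads                  # everything is new

load_page(cache, "/wiki/Page_Two")
second_visit = cache.downloads - first_visit   # only the article is new
```

On the first page view all four items are downloaded; on the second, only the new article is, because the shared files are already in the cache.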

There are many more levels (or layers), each working in a slightly different context to achieve the same goal: get the page displayed to you as fast as possible.

An intermediate, shared layer
If you think about the example above, you’ll notice that the effects of browser cache are limited to your computer. The information is kept as close to you as possible (on your computer), and so is extremely fast to load (good!), but if your friends want to view the same page you just viewed, each of them will have to download everything from Wikia.

So what would happen if we put an intermediate server between you and Wikia, have it cache everything that goes through it and make it shared between users? It would mean that the benefits of caching would be shared among all users. This is exactly what we do, and it’s called the Varnish cache.



The Varnish cache is an intermediate layer, between you and Wikia, and is run and managed for Wikia by Fastly (a company started by former Wikians). Fastly has servers set up all around the world, and connects you to the nearest one for extra fast response times.

The Varnish cache couples with browser cache to form a chain: if your browser has an up-to-date copy of the requested information, great, it just uses that; if not, it asks Varnish; and if Varnish doesn’t have a copy either, Varnish asks the Wikia servers.
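The chain can be sketched roughly like this. This is a simplified illustration, not Varnish's or Fastly's actual behavior: the `CacheLayer` and `Origin` classes are hypothetical, and the sketch ignores how real caches decide whether a copy is still up to date.

```python
class Origin:
    """Stand-in for the Wikia servers: always able to produce the page."""
    name = "wikia servers"

    def get(self, url):
        return f"<page {url}>", self.name


class CacheLayer:
    """One link in the chain: answer from the local store, else ask upstream."""

    def __init__(self, name, upstream):
        self.name = name
        self.upstream = upstream  # the next layer, or the origin
        self.store = {}

    def get(self, url):
        # Returns (content, name of the layer that actually had the content).
        if url in self.store:
            return self.store[url], self.name
        content, served_by = self.upstream.get(url)
        self.store[url] = content  # keep a copy for next time
        return content, served_by


origin = Origin()
varnish = CacheLayer("varnish", origin)      # shared between all users
alice = CacheLayer("browser", varnish)       # each user has their own browser
bob = CacheLayer("browser", varnish)

_, first = alice.get("/wiki/Cats")   # nobody has it yet
_, second = alice.get("/wiki/Cats")  # Alice's browser has it now
_, third = bob.get("/wiki/Cats")     # Bob's browser does not, but Varnish does
```

Here `first` is served by the Wikia servers, `second` by Alice's own browser, and `third` by Varnish: Bob benefits from Alice's earlier visit even though his own browser cache was empty.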

Deep under the hood
But even when you get through both cache layers and end up connecting to Wikia servers, there are still more layers of cache on the servers, working to keep the site as fast as possible.

We know that the content of a page stays the same until somebody makes an edit, so we store the content in a ready-to-use form in the parser cache. This is cleared and regenerated when an edit is made to that page.
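As a rough sketch of that idea (not the actual MediaWiki parser cache code), the hypothetical `Wiki` class below throws away the saved copy on edit and regenerates it the next time the page is viewed. Regenerating on the next view rather than immediately is an assumption made to keep the example short.

```python
class Wiki:
    """Toy wiki that caches the rendered form of each page."""

    def __init__(self):
        self.wikitext = {}      # title -> raw text as written by editors
        self.parser_cache = {}  # title -> ready-to-use rendered output
        self.parse_count = 0    # how often the (slow) parser had to run

    def _parse(self, text):
        # Stand-in for the real, expensive wikitext parser.
        self.parse_count += 1
        return f"<p>{text}</p>"

    def edit(self, title, text):
        self.wikitext[title] = text
        # The saved copy is now out of date, so clear it.
        self.parser_cache.pop(title, None)

    def view(self, title):
        if title not in self.parser_cache:
            self.parser_cache[title] = self._parse(self.wikitext[title])
        return self.parser_cache[title]


wiki = Wiki()
wiki.edit("Cats", "Cats are great.")
wiki.view("Cats")
wiki.view("Cats")
parses_before_edit = wiki.parse_count   # parsed once, served twice

wiki.edit("Cats", "Cats are the best.")
html_after_edit = wiki.view("Cats")
parses_after_edit = wiki.parse_count    # the edit forced one more parse
```

However many times the page is viewed, the parser only runs again after an edit.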

We also know that the total number of pages on a wiki doesn’t change unless somebody creates or removes a page, so we keep that number handy in memory using memcache.

Many more tricks like that are used on every page view to minimize page load time. We are constantly working to improve site speed, and you can get a recent update from Piotr’s blog post from a few weeks ago.

Clearing cache and resolving issues
If you think you’re getting old content, you might have an out-of-date cached copy of the information you need. It’s always a good idea to clear your browser cache and see if that helps. That is our first recommendation for when you see a potential site issue. If it doesn’t clear the issue, that means we may have done something wrong and out-of-date content got stuck at a lower level of caching. When you see that happen you should let us know through our contact form.
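Here is a small sketch of why clearing your browser cache sometimes isn't enough. It is a deliberately simplified model: the `Layer` class is hypothetical, and the "bug" is simulated by changing the page without telling the shared layer.

```python
class Layer:
    """A cache layer that remembers whatever its upstream returned."""

    def __init__(self, upstream):
        self.upstream = upstream  # any callable: url -> content
        self.store = {}

    def __call__(self, url):
        if url not in self.store:
            self.store[url] = self.upstream(url)
        return self.store[url]

    def clear(self):
        self.store.clear()


page = {"text": "old content"}

def server(url):
    # Stand-in for the Wikia servers (assumption for this sketch).
    return page["text"]

shared = Layer(server)    # the intermediate, shared cache
browser = Layer(shared)   # your own browser cache

browser("/wiki/Cats")             # both layers now hold "old content"
page["text"] = "new content"      # the page changes, but no cache is told

before = browser("/wiki/Cats")                 # stale copy from the browser

browser.clear()                                # you clear your browser cache
after_browser_clear = browser("/wiki/Cats")    # still stale: shared layer has it

shared.clear()                                 # the lower layer gets fixed
browser.clear()
after_both_clear = browser("/wiki/Cats")       # fresh content at last
```

In this model, clearing the browser cache alone just refills it with the stale copy held by the shared layer, which is the situation where contacting us is the right next step.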

I hope this post gave you a good intro to caching. If you have any questions or comments on the above, let us know in the comments below.