Full Page Cache (FPC): Scaling Magento Part 1
Pradip Shah
The most important and often ignored factor in scaling is the quality of code. Well-written code will scale better. The next most important factor is perhaps caching. There are many types of caches that developers, managers and store owners need to understand. Full Page Cache (FPC) is seen by store owners as a magic solution to speed issues. Understanding the benefits and compromises of a caching mechanism is an important part of understanding scaling.
Magento Enterprise 1.x and Magento Community & Enterprise 2.x come with a built-in FPC module. There are many plugins available for Magento Community 1.x. Some hosting providers will help set up a Varnish-based FPC with appropriate hole punching. The discussion below applies to all these mechanisms, on both Magento 1.x and Magento 2.
What is FPC?
FPC will cache most of a full page, except variable content such as login status, items in the cart, stock status of a product or sometimes even the price of a product. When a hit is encountered, FPC saves CPU by returning the cached content instead of regenerating it. When a requested page is not cached, the page is generated in the normal way and the newly generated page is then cached. FPC is great when it works well, but one needs to understand the problems an FPC might create. FPC may store the cache in files, but for maximum benefit it is more likely to be stored in memory. FPC affects resource utilization, both memory and CPU. In a way it trades memory for CPU time.
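The hit and miss flow described above can be sketched in a few lines. This is a minimal in-memory illustration, not Magento's FPC module: the `FullPageCache` class and the `render` callback are hypothetical names standing in for the cache and for normal page generation, and hole punching of variable content is left out.

```python
from typing import Callable, Dict


class FullPageCache:
    """Toy page cache keyed by URL (hypothetical, for illustration only)."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render              # the expensive, normal page generation
        self._pages: Dict[str, str] = {}   # cached HTML held in memory
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        # Hit: return the stored page and spend no CPU on rendering.
        if url in self._pages:
            self.hits += 1
            return self._pages[url]
        # Miss: generate the page the normal way, then cache the result.
        self.misses += 1
        html = self._render(url)
        self._pages[url] = html
        return html
```

Every entry added to `_pages` costs memory, which is the memory for CPU trade mentioned above.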
Let us understand FPC better
- Memory needed to store the entire site in FPC
Let us say each page is 100KB and you have 10000 pages to cache. That would take about 1GB of RAM. The problem is that when the number of pages or the page size rises above this, the RAM requirement goes up with it. So, if you now had 20000 pages (as a result of each option in layered navigation, for example), you would need 2GB, and if each page was 120KB the 20000 pages would need about 2.4GB. Pages are not just products; they are category pages as well. If layered navigation is added the pages multiply fast, as each combination is unique and needs to be stored independently. If you start exceeding the RAM available, you need to decide what to do when you hit the memory limit.
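The sizing arithmetic is simple enough to script for your own catalogue. The sketch below assumes decimal units (1GB = 1,000,000KB) and ignores any per-entry overhead added by the cache backend; `fpc_memory_gb` is a hypothetical helper name.

```python
def fpc_memory_gb(pages: int, page_kb: float) -> float:
    """Approximate RAM needed to hold `pages` cached pages of `page_kb` KB each."""
    return pages * page_kb / 1_000_000  # decimal units: 1 GB = 1,000,000 KB


# The page counts and sizes used in the example above.
for pages, page_kb in [(10_000, 100), (20_000, 100), (20_000, 120)]:
    print(f"{pages} pages x {page_kb} KB = {fpc_memory_gb(pages, page_kb):.1f} GB")
```

Multiplying the page count by the number of layered navigation combinations you expect to be visited gives a rough upper bound to compare against available RAM.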
- Cache warming.
Cache warming is the process of automatically adding pages to the cache before a real visitor hit comes to the cached page. When a cache is cleared, you may need to warm the cache to make FPC effective early. Cache warming uses a crawler to artificially visit pages of a site. A typical crawler will recursively crawl the site starting from the home page. This sounds logical but here are some things to think through
- If possible find the most likely pages you need to be in the cache and warm the cache with only those pages. This will give the maximum benefit.
- If you cannot fit all the pages in memory, using a crawler to warm the cache becomes a problem: the crawler will push pages out of memory at random, not based on how popular the pages are with end users.
- When the cache is being warmed your resource requirement in terms of CPU will rise as both the crawler and real traffic are being served.
- If possible crawl the site in parallel: the earlier the pages get cached, the more likely it is that a page requested by a visitor is already in the cache (scoring a hit).
The above figure shows the poor response times immediately after an FPC that had grown to 1.5GB was cleared completely. The top image is the Redis usage graph from Munin and the one below is AWS CloudWatch latency averaged per minute. The latency was moderated as AWS Auto Scaling added more instances.
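The warming advice above (popular pages first, fetched in parallel) could be sketched as follows. This is one possible approach under stated assumptions: the list of requested paths is assumed to have been extracted from your access log already, and `fetch` is a hypothetical callable that requests one path from the site, for example a thin wrapper around an HTTP client.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def most_popular(paths: Iterable[str], limit: int) -> List[str]:
    """Return up to `limit` paths, most frequently requested first."""
    return [path for path, _ in Counter(paths).most_common(limit)]


def warm(paths: List[str], fetch: Callable[[str], object], workers: int = 4) -> int:
    """Call `fetch` for each path using a small thread pool; return how many were requested."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return len(list(pool.map(fetch, paths)))
```

Keeping `limit` below the number of pages that fit in memory avoids the warmer evicting pages it has just cached, and a small `workers` value limits how much CPU the warmer takes from real traffic.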
- Invalidating the cache
Figuring out when cached pages should be invalidated (purged / cleared) is a challenge. Clearing as little as possible will lead to stale pages with incorrect information. Clearing too eagerly will lead to slowdown. An FPC has to be cleared when either a code update or a data update happens.
- FPC and robotic crawlers (bots)
Even if you do not use a crawler for warming, robotic crawlers on the internet (such as Google's indexer, Googlebot) will start filling the FPC with the pages they happen to crawl. Our advice is that a site with FPC should have a robots.txt and a front end processor (nginx, WAF) restricting bots.
- CPU and time needed to re-generate a page
An FPC can be invalidated (cleared) fully due to a (p)html or css file changing, or partially due to a data change such as a product update. A miss from FPC results in the page being regenerated. The CPU requirement for a miss is much higher than for a hit. If a crawler is used to warm the cache or if traffic is high, the CPU requirement can be quite high as the FPC fills up. Yet, the visitor experience is not good during this period. Using autoscale, this performance degradation can be contained to some extent as additional instances are launched to handle the high CPU requirement.
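The difference between a full and a partial invalidation can be illustrated with a tagged cache. This is a simplified sketch, not the implementation of any particular FPC: the `TaggedPageCache` class and tag names such as `product_42` are hypothetical.

```python
from typing import Dict, Iterable, Set


class TaggedPageCache:
    """Toy page cache where each page carries tags for the data it shows."""

    def __init__(self) -> None:
        self._pages: Dict[str, str] = {}
        self._tags: Dict[str, Set[str]] = {}  # url -> tags of data used on that page

    def put(self, url: str, html: str, tags: Iterable[str]) -> None:
        self._pages[url] = html
        self._tags[url] = set(tags)

    def invalidate_tag(self, tag: str) -> int:
        """Partial invalidation: drop only pages that depend on `tag`."""
        stale = [url for url, tags in self._tags.items() if tag in tags]
        for url in stale:
            del self._pages[url]
            del self._tags[url]
        return len(stale)

    def flush(self) -> int:
        """Full invalidation: drop every page, as after a template or css change."""
        count = len(self._pages)
        self._pages.clear()
        self._tags.clear()
        return count

    def __len__(self) -> int:
        return len(self._pages)
```

After `invalidate_tag("product_42")` only the pages showing that product are regenerated on their next request, whereas after `flush()` every page is a miss until the cache fills up again.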
- Discipline when using FPC – know when invalidation happens
It is important to be disciplined about code updates, as they have the worst effect on user experience.
- Code updates should be done at low traffic times.
- Category changes should be carefully planned at low traffic times.
- Product additions should not have a major impact on FPC performance due to invalidations.
Our recommendation for FPC
- Do not use a random crawler to warm the FPC cache. Use a page popularity based crawler to warm the cache if necessary.
- Avoid using a crawler during high traffic – the crawler will compete for system resources with live traffic
- If possible update code during low traffic times as it causes FPC to invalidate
- If your site is horizontally scaled, pre-launch instances to your load balancer before invalidating FPC, either explicitly or indirectly, so the latency of starting an instance does not further worsen the user experience
Should FPC be a part of scaling strategy?
FPC is concerned with speed. Scale is concerned with the process that helps the site add resources when needed. FPC helps in scalability by reducing the use of resources per hit to the website under certain conditions. It changes the dynamics of when and how many resources will be needed.
FPC has to be considered to be part of scaling strategy, one of many parts that our scaling Magento series will delve into.