Magento Caches – Scaling Magento Part 2
Introduction
Caches are an integral part of any strategy for the performance and scaling of a Magento website, and managing them is a core function of infrastructure management. A cache is a mechanism that stores a computed value, such as a query result or rendered HTML, so that a subsequent request does not need to recalculate it.
While Part 1 looked at FPC (full page cache) specifically, this article reviews all the other caches needed for a Magento store. Some of these are commonly mentioned in best-practice guides, but a deeper understanding helps put them in perspective for Magento.
In general, the factors to consider for any cache are:
- Effectiveness – measured in terms of impact on page speed. Caches should be highly effective.
- Invalidation flexibility – is a built-in automatic mechanism available? Is the mechanism too aggressive? A high score means the mechanism is very flexible.
- Performance of miss – if a miss occurs, what is the resulting page speed? A low score means a miss performs badly.
- Cost of refill – how much would it cost to refill the cache completely and return it to its most effective state?
- Hit ratio achievable – if the hit ratio were measured periodically in a real-world environment, what can be expected? A high score means over 99% hits over a reasonable period such as a day.
- Memory required – how much memory is needed on the server side to cache the content?
Cache Type | Effectiveness | Invalidation flexibility | Performance of miss | Cost of refill | Hit ratio achievable | Memory required |
Browser | High | High | Low | Moderate | High | N/A |
CDN | High | Low | Moderate | Moderate | High | N/A |
FPC | High | Low | Low | High | Moderate | High |
HTML Block Cache | Moderate | High | Moderate | Moderate | High | Low |
MySQL cache | High | Low | Low | High | High | Moderate |
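The "hit ratio achievable" factor can be made concrete with a little arithmetic. The following is a minimal sketch, assuming hit and miss counts are collected from whatever statistics the cache exposes over a period such as a day; the function names are illustrative and not part of any Magento API.

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Return the fraction of requests that were served from the cache."""
    total = hits + misses
    if total == 0:
        return 0.0  # no traffic in the period, so nothing was served from cache
    return hits / total


def is_high_hit_ratio(hits: int, misses: int, threshold: float = 0.99) -> bool:
    """True when the measured ratio is over the 99% bar described above."""
    return hit_ratio(hits, misses) > threshold
```

For example, 995 hits and 5 misses over a day is a ratio of 0.995, which counts as high under this definition.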
Browser cache
- Easy to set up at the hosting level. Static resources like CSS, JS and images should have caching enabled.
- There is no need for explicit invalidation, as the browser checks for each cached resource whether a newer version is available.
- Because browsers can detect changed resources, cacheable resources should have a very long expiry, say over a year.
- The browser’s change-detection requests have a performance impact that grows with the number of resources.
- A key question is how merging CSS and JS files affects browser caching performance. If merging is enabled, each page type (home, category, product, CMS) is likely to have different merged CSS and JS files. Not merging allows the browser to cache individual files and reuse them across pages. However, under HTTP/1 browsers limit the number of connections to each domain. So the advice is: do not merge CSS and JS files if using HTTP/2; if using HTTP/1, split skin and js across separate domains.
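The long-expiry advice above can be sketched as a small header-selection function. This is an illustration only: the extension list, the one-year value and the `no-cache` default for non-static paths are assumptions, and in practice these headers are normally set in the web server or hosting configuration rather than in application code.

```python
from pathlib import PurePosixPath

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # 31,536,000 seconds

# Assumption: an illustrative set of extensions treated as static resources.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".svg"}


def cache_headers(url_path: str) -> dict:
    """Return the Cache-Control header to send for a request path."""
    suffix = PurePosixPath(url_path).suffix.lower()
    if suffix in STATIC_EXTENSIONS:
        # Static resources get a very long expiry so browsers reuse them.
        return {"Cache-Control": f"public, max-age={ONE_YEAR_SECONDS}"}
    # Assumption: everything else must be revalidated before reuse.
    return {"Cache-Control": "no-cache"}
```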
CDN
- CDNs cache at edge servers closer to users, reducing round-trip latency from browser to server.
- A CDN also offloads the serving of static cacheable resources from the server, improving the server’s network performance and freeing up server CPU to serve dynamic content.
- CDNs may also take on the load of SSL termination. Caution is needed here, however, as the traffic between the CDN and the server may be unencrypted, making the site vulnerable to some types of attack.
- CDNs are notorious for invalidations: some charge for API access, others take a few minutes before an invalidation is effective across all edge points. This is a key factor that is often overlooked when evaluating a CDN.
- Having many edge locations may not be a good thing, as each edge records its first access to a resource as a miss.
- While a single miss is easily served from the server or backup store, a full invalidation requires multiple GBs to be transferred before the cache is effective again.
- CDNs, when fully populated, give a great performance benefit on page load times.
- Modern CDNs like section.io can also do FPC (HTML) caching using a distributed Varnish cache architecture.
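The point about edge locations and full invalidation can be put into rough numbers. The sketch below is a back-of-the-envelope estimate, assuming every edge fetches every cacheable object from the origin exactly once after a full purge; real CDNs may place intermediate cache layers between the edges and the origin, so treat the result as an upper bound. The function name and inputs are hypothetical.

```python
def full_refill_transfer_gb(edge_count: int, cacheable_gb: float) -> float:
    """Estimate origin traffic (in GB) needed to warm every edge after a purge.

    Assumption: each edge misses once per object and fetches it from origin.
    """
    if edge_count < 0 or cacheable_gb < 0:
        raise ValueError("edge_count and cacheable_gb must be non-negative")
    return edge_count * cacheable_gb
```

With 30 edge locations and 2.5 GB of cacheable content, a full invalidation could mean up to 75 GB served from the origin before the CDN is fully effective again.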
FPC
We reviewed aspects of FPC in Part 1.
Recap
- FPC caches full HTML pages – except for variable content
- Excellent for caching dynamic content
- Requires a very large amount of RAM on the server side
- Depending on the quality of the code, an FPC invalidation or miss can have a significant impact on resource utilization
- Best implemented with autoscaling, so servers are added automatically when the cache is invalidated
HTML Block Cache
- Also caches HTML, but at the block level. Magento builds each page from HTML blocks.
- Since blocks may be shared across pages, invalidating them has a low impact: each block is regenerated once and then reused multiple times.
- Can dramatically improve performance if used consistently and correctly.
- Needs developer help, as many blocks are not cached by default. To cache a block, one needs a unique key that correctly identifies its variation. Check this technical blog info.
- Invalidation can be either via a key or by time (TTL). If using a key, the developer needs to write appropriate event callbacks to detect change.
- Examples of major speed-ups include home page blocks where the latest products are shown. Depending on how frequently the store is updated, a 10 minute to 1 hour TTL on the block will result in a dramatic improvement in home page speed.
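The ideas in this list (a unique key per block variation, TTL expiry, and explicit invalidation by key) can be sketched in a few lines. This is a language-neutral illustration in Python, not Magento's block cache API; the class, method names and the example block name are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class BlockCache:
    """Tiny in-memory cache for rendered HTML fragments with a per-entry TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def make_key(block_name: str, **variation: object) -> str:
        """Build a key from the block name plus everything that changes its output."""
        parts = [f"{name}={variation[name]}" for name in sorted(variation)]
        return "|".join([block_name, *parts])

    def get_or_render(self, key: str, ttl_seconds: float,
                      render: Callable[[], str]) -> str:
        """Return cached HTML if still fresh, otherwise render and store it."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: the expensive render is skipped
        html = render()  # miss or expired: regenerate once
        self._entries[key] = (now + ttl_seconds, html)
        return html

    def invalidate(self, key: str) -> None:
        """Key-based invalidation, e.g. called from a change event callback."""
        self._entries.pop(key, None)
```

A home page "latest products" block might use a key such as `BlockCache.make_key("home.latest_products", store="en", customer_group=0)` with a TTL of 600 seconds, so the product query runs at most once every ten minutes per variation.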
MySQL cache
MySQL can store query results in memory. The amount of memory is specified in my.cnf as a combination of query_cache_limit (maximum memory for a single query result) and query_cache_size (maximum memory for all cached queries).
- MySQL automatically invalidates a cached query if any table used in the query changes.
- To access the cache, MySQL uses a lock on a table, which reduces the effectiveness of the cache.
- Many MySQL articles recommend smaller cache values because of the lock problem, but it is best to test the cache size for your situation and to monitor “low mem prunes” and “table locks wait”. The first gives the number of times a query result was removed from the cache to make room for another, and the second gives the number of times a table lock was not immediately available. Both should be as low as possible.
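Because MySQL status counters are cumulative, monitoring them means comparing two snapshots rather than reading a single value. The sketch below assumes the two counters quoted above correspond to the status variables `Qcache_lowmem_prunes` and `Table_locks_waited`, and that each snapshot is a dictionary built from the output of `SHOW GLOBAL STATUS`; how the snapshots are collected is left out.

```python
from typing import Dict, Mapping

# Assumption: these status variable names match the counters named in the text.
WATCHED = ("Qcache_lowmem_prunes", "Table_locks_waited")


def counter_deltas(before: Mapping[str, int],
                   after: Mapping[str, int]) -> Dict[str, int]:
    """Return how much each watched counter grew between two snapshots."""
    return {name: after[name] - before[name] for name in WATCHED}
```

A steadily growing delta for either counter under a given query_cache_size is a hint to try a different size and measure again.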
In our experience, the MySQL cache is useful when the front caches are empty. It has a negative effect on add to cart and checkout.
The MySQL cache also has negative side effects when a large catalog is loaded. We therefore recommend keeping indexing on a schedule, disabling the MySQL cache before indexing starts and enabling it again when indexing ends.
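One way to make the "disable before indexing, enable after" routine hard to get wrong is to wrap it so the cache is restored even if indexing fails. The following is a minimal sketch, assuming a MySQL version that still includes the query cache and a hypothetical `execute` callable that runs a single SQL statement with sufficient privileges; setting query_cache_size to 0 is used here as the way of switching the cache off.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def query_cache_disabled(execute: Callable[[str], None],
                         restore_size_bytes: int) -> Iterator[None]:
    """Turn the query cache off for the duration of the with-block."""
    execute("SET GLOBAL query_cache_size = 0")
    try:
        yield  # the scheduled reindex runs here
    finally:
        # Restore the configured size even if indexing raised an error.
        execute(f"SET GLOBAL query_cache_size = {int(restore_size_bytes)}")
```

A scheduled job would then run the reindex inside `with query_cache_disabled(execute, 64 * 1024 * 1024):`, where the restore size should match the value normally set in my.cnf.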
There are other caches too
- Magento has caches other than the block cache. These should always be enabled on a production server.
- PHP opcache
PHP code is read and converted to “opcodes”, which are then interpreted. An opcode cache stores the opcodes, saving the conversion step on each request. The opcode cache should have sufficient RAM and keys (equal to the number of PHP files).
Using opcache.php, the status should be checked regularly.
- Operating system cache
Linux has an excellent file system cache: whenever additional RAM is available, Linux keeps opened files in RAM. This is important if, for example, you do not have a CDN, as files will most likely be retrieved from RAM rather than disk.
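Returning to the opcode cache sizing mentioned above: the number of keys needed can be estimated by counting the PHP files in the deployment. The sketch below is one hedged way to do that; the function name is hypothetical, and the default extension list is an assumption that may need widening if the store includes PHP code under other extensions such as templates.

```python
import os
from typing import Tuple


def count_php_files(root: str, extensions: Tuple[str, ...] = (".php",)) -> int:
    """Count files under root whose names end with one of the extensions."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += sum(1 for name in filenames if name.lower().endswith(extensions))
    return total
```

The result gives a lower bound for the number of keys the opcode cache should be configured to hold.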