php opcache explained

What is php opcache?

Php is an interpreted language. The interpreter has to read each php file, tokenize and parse it, and compile it into an internal format called opcodes (operation codes). It then executes the code by “running” the opcodes. With php 8, the JIT changes this a bit.

A php project like Magento has a lot of files. If php had to read each file every time it is referenced, it would spend a lot of time reading the file from disk, parsing it and converting it to opcodes. To speed up that process, php has an opcode cache (or opcache), where the compiled opcodes are kept.

This cache is stored in memory or on disk. Using and configuring memory gives the best performance.
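The idea can be illustrated outside php. The sketch below uses Python, whose built-in compile() turns source text into bytecode much like php turns a file into opcodes. The OpcodeCache class and its method names are hypothetical; they only illustrate "compile once, run many times" and say nothing about how opcache is implemented internally.

```python
class OpcodeCache:
    """Toy in-memory cache: file path -> compiled code object."""

    def __init__(self):
        self._compiled = {}
        self.compiles = 0  # how many times a file was actually parsed

    def load(self, path):
        """Return the compiled code for path, compiling only on a cache miss."""
        if path not in self._compiled:
            with open(path, "r", encoding="utf-8") as handle:
                source = handle.read()
            # compile() parses the source into bytecode, the rough
            # equivalent of php converting a file to opcodes.
            self._compiled[path] = compile(source, path, "exec")
            self.compiles += 1
        return self._compiled[path]

    def run(self, path):
        """Execute the cached bytecode and return the variables it defined."""
        namespace = {}
        exec(self.load(path), namespace)
        return namespace
```

Running the same file several times through one OpcodeCache instance reads and parses it only once; every later run reuses the stored bytecode.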

Opcache and JIT

JIT transforms php from a purely interpreted language into one that also compiles to machine code the underlying CPU can run directly. Unlike ahead-of-time compiled languages like “C”, php compiles just in time: it generates the machine language instructions at run time and stores them in memory. The opcache module is responsible for this.

Opcache and php-fpm

Typically php runs as a single threaded process, so each concurrent hit requires its own php process. Normally, at the end of the execution, the php interpreter terminates.

php-fpm is the process manager for php. It manages a pool of long-running php processes and keeps track of which php process is busy. It can start and stop processes as needed. Php-fpm communicates with a web server like nginx using the fastcgi protocol.

Php-fpm keeps a single opcache that all the processes in the pool can share.

How much memory does php opcache need?

Php opcache allocates different blocks of memory for different types of data. The amount of memory needed depends on the number of files in your project, the number and total size of string literals, and the size of the executable code in all your files.

JIT needs additional memory to store the machine opcodes.

Opcache configuration parameters are stored in /etc/php.d/opcache.ini (or the opcache.ini on your system).

memory (opcache.memory_consumption) : the memory where the opcodes are stored
string (opcache.interned_strings_buffer) : string literals are shared in a separate block of memory
keys (opcache.max_accelerated_files) : opcache has a hash table with the filename as key. This setting limits the number of keys the table can hold.
JIT (opcache.jit_buffer_size) : the memory where the JIT generated machine code is stored.

Each has a separate configuration value. The exact value you need depends on the number of files and the number of shared projects using this same php-fpm pool.
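As a rough starting point for the keys setting, one can count the files php is likely to compile. The sketch below is in Python and makes several assumptions: the extension list, the 50% headroom, and the function names are hypothetical choices, not values recommended by php or Magento.

```python
import os


def count_php_files(project_root, extensions=(".php", ".phtml")):
    """Count files under project_root that php would compile and cache."""
    total = 0
    for _folder, _subfolders, files in os.walk(project_root):
        total += sum(1 for name in files if name.endswith(extensions))
    return total


def suggested_max_files(file_count, headroom=1.5):
    """Add headroom (an assumed 50%) so newly deployed code still fits."""
    return int(file_count * headroom)
```

For example, suggested_max_files(count_php_files("/var/www/magento")) gives a first guess for opcache.max_accelerated_files; the path is only an illustration. If several projects share the same php-fpm pool, the counts for each project need to be added together.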

Is there an easy way to see how much memory is being used?

A small php program can help :

<?php
// Summarize opcache memory usage. Keep this script private, like phpinfo().
$status = opcache_get_status(false); // false: skip the per-script list
if ($status === false) {
  exit("opcache is not enabled\n");
}
$mem = $status['memory_usage'];
$str = $status['interned_strings_usage'];
$stats = $status['opcache_statistics'];
print "Memory (opcache.memory_consumption) : \n";
print "  used_memory=" . $mem['used_memory'] . "\n";
// total = used + free + wasted
print "  total_memory=" . ($mem['used_memory'] + $mem['free_memory'] + $mem['wasted_memory']) . "\n";
print "string (opcache.interned_strings_buffer) : \n";
print "  used_memory=" . $str['used_memory'] . "\n";
print "  total_memory=" . $str['buffer_size'] . "\n";
print "keys (opcache.max_accelerated_files) : \n";
print "  used_keys=" . $stats['num_cached_keys'] . "\n";
print "  max_keys=" . $stats['max_cached_keys'] . "\n";
// opcache already reports the hit rate as a percentage
print "hit_ratio=" . $stats['opcache_hit_rate'] . "\n";
if (!empty($status['jit']['buffer_size'])) {
  print "jit (opcache.jit_buffer_size) : \n";
  // buffer_free is the unused part of the JIT buffer
  print "  used_memory=" . ($status['jit']['buffer_size'] - $status['jit']['buffer_free']) . "\n";
  print "  total_memory=" . $status['jit']['buffer_size'] . "\n";
} else {
  print "jit is disabled\n";
}

Save this as a file named opcachestatus.php in a folder where php scripts can be executed from the browser. Run it through the web server rather than the command line, since the php command line has its own separate opcache. (Warning : do not ship this to production. Treat it like phpinfo.php).

If the used value in any section is close to its total, you will get better performance by increasing the corresponding value in the opcache.ini file.
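The check can be automated once the numbers are collected. The sketch below is in Python; the 90% threshold, the function name and the sample numbers are assumptions chosen for illustration only.

```python
def sections_near_limit(usage, threshold=0.9):
    """Return the names of sections whose used/total ratio reaches threshold.

    usage maps a section name to a (used, total) pair, for example the
    values reported for memory, strings, keys and jit.
    """
    flagged = []
    for name, (used, total) in usage.items():
        if total > 0 and used / total >= threshold:
            flagged.append(name)
    return flagged


# Made-up sample values, not measurements from a real server.
sample = {
    "memory": (120, 128),
    "strings": (4, 16),
    "keys": (9000, 16229),
}
print(sections_near_limit(sample))
```

With these sample values only the memory section is flagged, since 120 of 128 is above 90% while the other two sections have plenty of room.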

What happens when the cache memory runs out?

  • When memory is full, opcache will essentially do a restart.
  • We do not know how the jit memory behaves on being full.

Invalidating opcache

There are only two hard things in Computer Science: cache invalidation and naming things.
- Phil Karlton

Once an item is in the cache, it can serve stale content unless the cache detects that the php file has changed. What makes cache invalidation hard is the tradeoff between serving stale content and speed.

Opcache gives you options.

  • You can ask php-fpm to always check if a file has changed
    opcache.validate_timestamps=1
    opcache.revalidate_freq=0
  • You can ask php-fpm to check if the file has changed at most once every x seconds, replacing x with the number of seconds you want
    opcache.validate_timestamps=1
    opcache.revalidate_freq=x
  • You can ask php-fpm to never check until you clear the cache explicitly
    opcache.validate_timestamps=0
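The three options can be modelled with a few lines of logic. The Python sketch below is a simplification under stated assumptions: the TimestampValidator class is hypothetical, it tracks only the file modification time, and it is not how opcache is implemented internally.

```python
import os
import time


class TimestampValidator:
    """Decides whether a cached file has to be compiled again."""

    def __init__(self, validate_timestamps=True, revalidate_freq=2):
        self.validate_timestamps = validate_timestamps
        self.revalidate_freq = revalidate_freq
        self._seen = {}  # path -> (mtime when cached, time of last check)

    def needs_recompile(self, path, now=None):
        now = time.time() if now is None else now
        if path not in self._seen:
            self._seen[path] = (os.path.getmtime(path), now)
            return True  # first request: nothing cached yet
        if not self.validate_timestamps:
            return False  # never look at the file again
        cached_mtime, last_check = self._seen[path]
        if now - last_check < self.revalidate_freq:
            return False  # checked recently, trust the cache
        current_mtime = os.path.getmtime(path)
        self._seen[path] = (current_mtime, now)
        return current_mtime != cached_mtime
```

With revalidate_freq set to 0 the "checked recently" branch never applies, so the file is checked on every request. With validate_timestamps off, a changed file is ignored until the cache is cleared, which in this sketch means creating a new validator.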

Opcache for production

  1. We like these settings
    opcache.validate_timestamps=0
    opcache.max_wasted_percentage=50
    opcache.enable_file_override=1
    opcache.max_file_size=0
    opcache.consistency_checks=0
    opcache.preferred_memory_model=""
    opcache.file_update_protection=0
    opcache.huge_code_pages=1
    opcache.file_cache_only=0
    opcache.file_cache=""
  2. JIT : for a magento production server, where only one application is running, we like to use the following
    opcache.jit=1205
    opcache.jit_buffer_size=200M
  3. If you use horizontal scaling with a load balancer and query opcache status from a web application, you will get the result only for the server that handled the request.
    This is because each php-fpm server has its own opcache.

Opcache and luroConnect

luroConnect enables opcache across all our customer servers. On dev/staging, opcache is run with
opcache.validate_timestamps=1
opcache.revalidate_freq=0

Our production servers run with
opcache.validate_timestamps=0

We continuously monitor opcache memory usage. Since we use a horizontally scaling architecture, when any app server exceeds its memory limits we make sure all other servers are updated as well. We start with known memory requirements for magento and tune them higher if needed.

Since we turn off validate_timestamps, each code deploy results in a reload of php-fpm.

On php 8 servers, we use JIT with
opcache.jit=1205

This tells php to compile all functions to machine code when a script is loaded. We do this because all servers run a single application whose code never changes until a deployment is done.
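The value 1205 is read digit by digit (the php manual calls the four digits C, R, T and O). The Python sketch below decodes such a value; the digit meanings are paraphrased from the php documentation as we understand it, and the function and table names are hypothetical, so check the manual for your php version before relying on them.

```python
CPU_FLAGS = {
    0: "no CPU-specific optimization",
    1: "use AVX if the CPU supports it",
}
REGISTER_ALLOCATION = {
    0: "no register allocation",
    1: "block-local register allocation",
    2: "global register allocation",
}
TRIGGER = {
    0: "compile all functions on script load",
    1: "compile functions on first execution",
    2: "profile on first request, then compile the hottest functions",
    3: "profile on the fly and compile hot functions",
    4: "currently unused",
    5: "tracing JIT: profile on the fly and compile hot code traces",
}
OPTIMIZATION = {
    0: "no JIT",
    1: "minimal JIT (call standard VM handlers)",
    2: "inline VM handlers",
    3: "use type inference",
    4: "use call graph",
    5: "optimize whole script",
}


def decode_jit(value):
    """Split a numeric opcache.jit value such as 1205 into its four digits."""
    digits = "%04d" % value
    if len(digits) != 4:
        raise ValueError("expected a value of at most 4 digits")
    c, r, t, o = (int(digit) for digit in digits)
    return {
        "cpu": CPU_FLAGS.get(c, "unknown"),
        "register_allocation": REGISTER_ALLOCATION.get(r, "unknown"),
        "trigger": TRIGGER.get(t, "unknown"),
        "optimization": OPTIMIZATION.get(o, "unknown"),
    }
```

Calling decode_jit(1205) reports the trigger as "compile all functions on script load", which matches the behaviour described above. The tracing mode that php offers as an alternative corresponds to 1254.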

Would you like to switch to a modern hosting platform?

Schedule a call of a free evaluation!

Enterprise Class Hosting that does not hurt your bank!
With features like ~0 downtime code deploy and autoscale to reduce your hosting costs, luroConnect offers you unparalleled hosting environment for Magento.

Schedule a call and we will show you how we can

  • Improve your hosting, possibly with autoscale
  • Have a managed dev, staging and production environment
  • Server performance measured every minute with alerts for a slowdown
  • A multi point health check every day
  • Optimized hosting costs

Magento and WordPress site performance can improve with proxysql

proxysql is a small infrastructure change that can bring about a huge performance difference in high traffic Magento and WordPress websites.

What is proxysql?

Proxysql is a high performance mysql proxy. As a proxy it sits between the application and mysql. mysql requests from the application are routed to the proxy, which in turn routes the request to mysql.

Since it sits between the application and mysql, it can intercept queries and do many things - like routing select queries to a read-only replica, or load balancing between a set of write replicas (such as a Galera cluster). It also collects a lot of interesting statistics that can help optimize the application.
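A read/write split of this kind boils down to matching each query against an ordered list of rules. The Python sketch below is only an illustration: the two rules, the "reader" and "writer" labels and the route() function are hypothetical and are not proxysql configuration syntax.

```python
import re

# Rules are checked in order; the first match wins.
ROUTING_RULES = [
    # SELECT ... FOR UPDATE takes locks, so it has to go to a writer.
    (re.compile(r"^\s*SELECT\b.*\bFOR\s+UPDATE\b", re.I | re.S), "writer"),
    # Any other SELECT can be served by a read-only replica.
    (re.compile(r"^\s*SELECT\b", re.I), "reader"),
]


def route(query, default="writer"):
    """Return the backend group a query should be sent to."""
    for pattern, group in ROUTING_RULES:
        if pattern.search(query):
            return group
    return default  # inserts, updates and everything else
```

For example, route("SELECT * FROM catalog_product_entity") returns "reader", while an INSERT or a SELECT ... FOR UPDATE is sent to "writer".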

Unexpected performance gain from an issue with mysql threading model

But a huge gain comes from improving the performance of mysql itself in a high traffic environment. This is possible because of a drawback in the mysql threading model.

Mysql follows a "one thread per connection" threading model. Moreover, a single mysql thread corresponds to an operating system thread. A high traffic php application does not share mysql connections and as a result can cause a large number of threads.

Some of the drawbacks of this model :

  • Mysql consumes memory per connection used and once consumed, this memory is never given back. As a result, mysql performance can deteriorate over time.
  • A CPU's performance is very heavily dependent on hardware level caches. These are dependent on the "principle of locality". With too many threads executing simultaneously, the cache hit ratios go down, reducing CPU performance.
  • With too many threads executing in parallel, context switching overhead is high. A context switch is when the operating system decides to switch execution from one thread to another. A single CPU core can only execute one thread at a time.
  • Too many transactions executing in parallel increases resource contention in mysql. In InnoDB, this increases the time spent holding central mutexes. This results in queries running slow for no obvious reason.

Proxysql solves this problem by not opening a new connection for each client request. It reuses a connection it may already have open with mysql.
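The effect of connection reuse can be sketched with a small pool. This Python example is a toy model: FakeConnection stands in for a real mysql connection, and both class names are hypothetical. It is not how proxysql is implemented.

```python
import queue


class FakeConnection:
    """Stand-in for a real mysql connection; each one would cost mysql a thread."""

    def execute(self, query):
        return "ok: " + query


class ConnectionPool:
    """Keeps a fixed set of backend connections and lends them to clients."""

    def __init__(self, size, connect=FakeConnection):
        self.opened = 0
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())
            self.opened += 1

    def run(self, query):
        connection = self._idle.get()  # blocks until a connection is free
        try:
            return connection.execute(query)
        finally:
            self._idle.put(connection)  # reuse it for the next client
```

A pool created with ConnectionPool(3) can serve any number of client requests while mysql only ever sees 3 connections, and therefore 3 threads.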

Reference : MySQL Enterprise Documentation

Effect of context switching due to high number of mysql threads

The figure alongside shows the effect of context switching.

Up to 3rd May 2024, there is high CPU usage. The green bars show CPU used by the operating system in context switching between threads. The DB server was consuming almost 100% CPU and at the same time not responding to queries.

3rd May saw the introduction of proxysql. The CPU usage is now stable, with not much used by the system in context switching.

Towards the left, the green bars are CPU spent by the system in context switching between threads. Adding proxysql immediately dropped CPU usage.

The image alongside shows the number of mysql threads for the same period. The correlation between system CPU and the number of threads makes the cause and effect obvious.

Up to 3rd May, mysql has a surge in threads - caused by a spiral of slow insert or select queries adding more threads and slowing down the db. 3rd May saw the introduction of proxysql - and mysql now uses a small number of threads.

We will explore Mariadb Thread Pool shortly to see how it compares with the proxysql approach.

luroConnect Magento CI/CD adds meta information to builds

luroConnect CI/CD

luroConnect managed hosting platform includes CI/CD from any git source.

luroConnect CI/CD is an opinionated, platform specific CI/CD. It currently supports Magento (luma, Hyva, ScandiPWA themes), PWAStudio, Angular, nodejs applications. Nodejs builds include yarn and npm based installation.

Builds are done in docker with environment variables. (Note : Our Magento CI/CD also builds in docker without a database).

Builds, once made, are ready to be deployed.

Advantage of opinionated CI/CD

  • Built for the target application stack with options.
  • Improved over time in a platform specific way
  • Magento CI/CD builds support features like magepack bundling, css and js minification, luma, Hyva and ScandiPWA themes.
  • Magento CI/CD supports multi website and multi store builds
  • PWAStudio supports multi website and multi store builds with different default currencies.
  • Is built using resources on your AWS account and created to deploy on your AWS account.

Meta Data information

Builds are identified with the shortened git commit-id. Meta information includes the branch, the build number (generated from git as a serial count of commits in the branch up to this commit-id), and the git comment for this release.
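Such meta information can be read straight from git. The Python sketch below shows one possible way using standard git commands; it is an assumption about how this could be done, not a description of the luroConnect implementation, and the function names are hypothetical.

```python
import subprocess


def git(repo, *args):
    """Run a git command inside repo and return its trimmed output."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def build_metadata(repo, commit="HEAD"):
    """Collect commit-id, branch, serial build number and commit comment."""
    return {
        "commit": git(repo, "rev-parse", "--short", commit),
        # Branch currently checked out; assumes the build runs on that branch.
        "branch": git(repo, "rev-parse", "--abbrev-ref", "HEAD"),
        # Number of commits reachable from this commit: a serial count.
        "build_number": int(git(repo, "rev-list", "--count", commit)),
        "comment": git(repo, "log", "-1", "--format=%s", commit),
    }


def label(meta):
    """Format the meta information as one line for a dashboard."""
    return "%s #%d (%s): %s" % (
        meta["branch"], meta["build_number"], meta["commit"], meta["comment"],
    )
```

Because the build number is a count of commits, a later commit on the same branch always gets a larger number than an earlier one.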

luroConnect dashboard now displays meta data with the current deployed version.

When deciding which version to deploy, meta data information is displayed as well.

Build counts are useful as they are serial numbers - a larger number indicates a later commit serially. Commit comments are useful to know the PR or merge comment used so one is sure of the deployment.

luroConnect build and deploy dashboard

Magento CI/CD metadata

Using AWS Autoscale “warm pools” to reduce costs

AWS Autoscale added a new feature “Warm Pool”.  Let us explore this feature and see how luroConnect uses this to reduce hosting costs.

The autoscale latency problem

Usually, AWS Autoscale will launch a new server with the given AMI image based on the launch configuration or launch template configured. Launching a new server takes about 4 minutes or more. So let us say a scale-out event is configured to launch a server when the CPU across all autoscale instances exceeds 70% for 1 minute. Now, let us say a sale promotion on facebook causes a surge in traffic that triggers this event. It takes AWS 4+ minutes to respond and add a new server. If during this 4 minute period the load goes past 70% and reaches, say, 90-100%, it is likely that visitors will see a slowdown or even errors. The 4+ minute period is called the autoscale latency, and it plays a crucial role in designing the scale-out and scale-in parameters.

For a website that sees frequent surges in traffic in short spurts, one would be tempted to use a lower threshold for a scale-out event. A lower threshold will result in frequent triggering of scale-out events.

At the same time the scale-in threshold will also have to be reduced to ensure enough spread between scale-out and scale-in events. A lower spread will result in an unhealthy sequence of a scale-out event adding a resource only for it to be immediately removed.

Autoscale designers then tend to add a higher number of minimum instances, possibly of larger sizes. That reduces the effectiveness of autoscale and increases AWS costs.

Lowering the autoscale latency results in a better autoscale system. As the latency reduces, the need for larger number of minimum instances or larger size instances reduces. This results in savings in the AWS bill.

Introducing the warm pool

AWS now introduces the concept of a warm pool. The cost savings of a warm pool come from the AWS policy of not charging for instances in stopped state, except for their disks. A warm pool is a set of autoscale instances that are launched but kept in stopped state. When a scale-out event happens, the latency is now reduced to the boot time of an instance plus any initialization needed. We measured that adding 3 instances took about 35 seconds before they started serving traffic for Magento.

A scale-in policy simply stops the selected instance and adds it back to the warm pool.
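The difference between a 4 minute and a 35 second latency can be put into numbers. The Python sketch below assumes CPU usage grows linearly during a surge; the growth rate of 8 percentage points per minute and the function name are hypothetical and only serve the illustration.

```python
def cpu_when_capacity_arrives(trigger_cpu, growth_per_minute, latency_seconds):
    """CPU percentage reached by the time the new instance serves traffic.

    Assumes linear growth after the scale-out trigger and caps at 100%.
    """
    cpu = trigger_cpu + growth_per_minute * latency_seconds / 60
    return min(cpu, 100.0)


# Scale-out triggers at 70% CPU; the surge adds an assumed 8 points per minute.
cold_start = cpu_when_capacity_arrives(70, 8, 240)  # new server, about 4 minutes
warm_pool = cpu_when_capacity_arrives(70, 8, 35)    # warm pool, about 35 seconds
print(cold_start, round(warm_pool, 1))
```

Under these assumptions the cold start lets the existing servers saturate at 100% before help arrives, while with a warm pool they stay below 75%. This is why a lower latency allows a higher scale-out threshold and fewer minimum instances.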

Warm Pool For Autoscale

How to use a warm pool?

If you are using a launch template for your autoscale, creating a warm pool is easy and documented by AWS. If you use lifecycle events, newer events have been introduced for warm pools.

If using a launch configuration, we suggest upgrading to a launch template before using a warm pool. While upgrading to a launch template is easy, it is advisable to read about launch templates as they are a different and larger concept.

Changing your instance image when in a warm pool

AWS has support for “instance refresh” – a term used by AWS to indicate an update in AMI image for all running and warm pool instances in a single command. However, this update has a crucial flaw – it can keep your website inaccessible for a short time. This is due to AWS terminating an instance before adding one. If an image has to be updated – such as a new code deploy – a custom strategy has to be deployed to ensure the website does not go down.

luroConnect support for warm pool

luroConnect now supports warm pools across all its autoscale plans, with a scripted image update policy that ensures 0 downtime during image change as well as a code deploy strategy that ensures 0 downtime on code deploy.

Issues with AWS Reference architecture and tools for a Magento application

At luroConnect we implemented our autoscaling system after addressing flaws in many implementations we had seen.

AWS autoscale by default is integrated with the AWS load balancers, ELB or ALB. Using the AWS reference implementation puts the code in an autoscale instance together with nginx or apache and php. Traffic is routed through the ELB/ALB, which handles SSL and routes the traffic to each autoscale instance.

When code has to be updated, a new AMI is created and AWS instance refresh can be run to update the instances.

You could use AWS CodeDeploy, but you need to set it up so that Magento setup upgrade can be run when required.

Problems with autoscale implementations for Magento

  1. Issues configuring FPC (Full Page Cache) with this configuration : If varnish is configured on all autoscale instances (as we have seen many implementations do), each server will warm caches on its own. Clearing pages from cache will also be difficult. Using redis as an FPC increases per page latency for cached pages.
  2. Media and var folders need to be shared across all servers. NFS is typically used for this. However, each autoscale instance has to be configured so that it can discover and mount the folders from the NFS server.
  3. When a code change has to be deployed, it is not clear how it can be done without causing downtime of the website. Using AWS CodeDeploy requires a complex setup to ensure setup upgrade is run before one of the 0 downtime strategies can be used.
  4. When a new server is launched, conditions to check the health of the website are not easy to write. This results in a few error responses before the server is ready to serve traffic.
  5. It is difficult to use an AWS ALB to route traffic for specific purposes – for example, routing traffic to a wordpress server for /blog urls.

luroConnect Autoscale on AWS : Smooth setup and running.

luroConnect Autoscale solves these problems.

luroConnect lets AWS monitor instances and decide when to add or remove (scale out or scale in) instances. luroConnect autoscale for AWS adds handling of the cloudwatch events and lifecycle events generated by AWS Autoscale to ensure a very smooth Autoscaling operation. luroConnect uses nginx as a load balancer and does not require an ALB/ELB to operate. luroConnect Autoscale supports AWS Autoscale with warm instances and has a mechanism to update the AMI when needed without any downtime.

  1. Using nginx as a load balancer allows high flexibility in deciding which urls go to varnish for full page cache and which should be directly served by php. varnish as a full page cache gives the maximum impact of full page caching.
  2. An NFS server holds shareable content of magento - specific media and var folders for example. Using NIS, autofs and NFS, each new app server is able to discover the NFS share.
  3. When a code change has to be deployed, the php code is shared with each app server over NFS. A php reload and the opcache configuration ensure the new code is kept in the php opcache memory for all future operations. A php file from the NFS share is loaded only once.
  4. Before a server is added to the nginx load balancer, extensive checks are done to ensure the new autoscale instance is ready to take traffic, including warming the opcache.
  5. nginx as a load balancer brings in a lot of flexibility in routing traffic such as a /blog to a wordpress website, custom rewrites, etc.

Do you know what size server you want for your eCommerce site?

Leaving the toughest question unanswered

When signing up for your Magento hosting, the first question you see asked, before you place an order, is what size server you want. It has become so ubiquitous, that everyone just expects to answer it looking at the cost.

But this is much like Mathematics books leaving the tough problems as an exercise for the reader!

It should not be that way! The size and architecture of the server you need depends on many factors.

Factors to consider

  1. The traffic and pattern. We routinely ask for 2 google analytics graphs - one for a typical day and one for a high sale day. This drives the architecture and server size.
  2. Your hosting stack - are you vanilla magento? or do you use headless / PWA? or use some software for image optimisation on your server?
  3. If the live site is already hosted, current CPU and memory usage.
  4. The size of the magento database.
luroConnect always starts an engagement with a server sizing sheet that is filled on behalf of the merchant. This allows us to propose a hosting plan on the customer's cloud account and an appropriate luroConnect support plan.

Take the guesswork out of server sizing with horizontal scaling

A classic 3-tier architecture.

  • The web layer (WAF, apache / nginx /varnish, cron, rabbitmq),
  • the application layer (php, nodejs) and
  • the database layer (mysql, elasticsearch, redis).

Horizontal Scaling :

  • app servers can scale independently - indeed they can be autoscaled.
  • Low traffic websites can fold either the app or the db layers or both into the web layer
  • The db layer can be extended to a master-slave setup
  • A proxy layer can load balance read traffic between master and slave, giving scalability at the database level

Sansec reports new Magento 1 hack

Over the weekend of Sept 11, 2020, Sansec reported a web skimming attack on Magento 1 stores. It was the largest single day automated attack recorded by Sansec.

What are skimming attacks?

In a "skimming" attack, malicious code is added to your website so that when a site visitor enters any personal information, including credit card details, the content is "skimmed" and sent to the attacker.

The website looks completely normal.

In September 2018, British Airways revealed that information on 380,000 passengers had been skimmed from the website. The modus operandi for this attack was access to the code (possibly the version control of a 3rd party javascript module used on the website). The attack went undetected for months.

Some attacks are also called "Magecart" attacks.

In Magento we see 2 popular ways

  • Break the admin password and upload content to "Miscellaneous Header" or "Miscellaneous Footer" sections.
  • Upload a php code file which in turn loads the real malicious javascript code in the page - either directly into the page or by modifying a known javascript file.

The current attack is of the second variety.

Stores on luroConnect were not attacked!

The attack, as described by Sansec in the article, used Magento Connect to bypass Magento admin and upload malicious code into javascript files to facilitate skimming of credit card information.

luroConnect has many rules that helped prevent this attack from affecting any of our Magento 1 websites.

Rule 1 : "/downloader" URL  is not accessible on any live or staging website. We expect code to be deployed through git and expect the developer to use a manual process to install modules. We disallow magento connect based installation in any of our managed websites.

Rule 2 : Our web directory owner and hosting users are different. Hosting user is the user php code runs as. Moreover, /skin folder is not writable by the hosting user.

Rule 3 : We use a static minifier and deploy the code to a folder skin.min which is not in git. The /skin folder itself is never used.

Rule 4 : Staging and dev environments are protected using HTTP Basic Authentication. Automated attack vectors would need to add a password guesser before they can reach the staging URLs. This matters because a developer may have relaxed permissions in the dev environment.

Rule 5 : Our platform bars ssh access to the hosting user. This prevents any accidental change in permissions being permanent. Even in the rare case ssh access is given (for debugging purposes), upon relinquishing the access, we sanitize the environment with default permissions.

How to protect your store?

One of the best ways is to sign up for Sansec's security scanner eComscan.

luroConnect is a very secure platform for Magento hosting. We call it layered security - from a secure file system and strict folder permissions, to an inbuilt WAF with configurable rules to partnering with Sansec for security scans.

We host you on your cloud or physical hardware using our stack. Learn more about our plans here.

Supercharge your Magento with a Varnish cluster

What is a varnish cluster?

A varnish cluster is a set of varnish nodes, each in a different geographical location, in front of the same Magento backend.

As shown, a Magento backend hosted in the US East region can serve varnish nodes across the world. Access to the website from each region is directed to the nearest varnish node, benefiting from lower latency and faster page loads.

If you serve customers in different regions - internationally or across the USA, your store can benefit from a varnish cluster.

Varnish and Magento performance

Magento 2 was architected to work with Varnish for improved performance. A typical webpage – category listing or product detail page - when returned from a cached varnish page (called a HIT in varnish) typically has a TTFB of a few milliseconds. An uncached page (also called a MISS in varnish) results in TTFB of a few seconds. Optimizing a MISS is very crucial, but we will not cover that in this article. We believe varnish is one portion of optimizing your website.

What is Network Latency

Latency is the time it takes for a network packet to go from your computer to the server with a request and come back with a response - assuming the response from the server was available immediately. When a page scores a HIT in varnish, the response is almost immediate from the server. Any Time To First Byte (TTFB) recorded on the browser can then be attributed to network latency.

If you market your website to many regions around the world, and you host varnish in a single location, your visitors may be faced with higher latency. When the latency of access is a few hundred milliseconds, it becomes the bottleneck and needs optimization.

The below information from https://wondernetwork.com/pings/ gives an idea of expected latencies. To read the table use the row header as source and column header as destination. So, a ping latency from Los Angeles to Mumbai would be 267 msec.

What is the cause of latency?

Core latency is a function of distance. Even light in a vacuum takes ~43 msec to travel 13,000 km, roughly the distance from New York to Mumbai. A network packet travels through optical fiber at about two-thirds of that speed, and network cables are much longer than a straight line. Moreover, a response packet has to travel all the way back.
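The arithmetic behind this lower bound is easy to reproduce. The Python sketch below uses the speed of light in vacuum; the two-thirds factor for optical fiber is a commonly quoted approximation and is an assumption here, as are the function names.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # in vacuum
FIBER_FACTOR = 2 / 3               # approximate slowdown inside optical fiber


def one_way_ms(distance_km, factor=1.0):
    """Time in milliseconds for a signal to cover distance_km once."""
    return distance_km / (SPEED_OF_LIGHT_KM_S * factor) * 1000


def round_trip_ms(distance_km, factor=1.0):
    """Request plus response: the signal has to travel the distance twice."""
    return 2 * one_way_ms(distance_km, factor)


distance = 13_000  # km, the distance used in the text
print(round(one_way_ms(distance), 1))                   # vacuum, one way
print(round(round_trip_ms(distance, FIBER_FACTOR), 1))  # fiber, round trip
```

For 13,000 km this gives about 43 msec one way in a vacuum and about 130 msec for a round trip through fiber, before any delay added by network equipment or by cables that do not follow a straight line.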

Latency is also caused as traffic goes through network equipment. The number of switches / servers a packet has to go through depends on the network and service providers – yours as well as at the server you are connecting to.

A varnish cluster architecture

Using a GeoIP based DNS service such as AWS Route53, a user's request for a domain is resolved to one of several IPs. The Magento backend is in the “default” region, where maximum traffic is expected. A varnish node is placed in each desired region (US East and Europe are shown below).

Geo-IP based DNS

A geoip based DNS service such as AWS Route53 can direct traffic to the nearest varnish node by guessing the country from the IP address requesting name resolution. So a user on a browser in, say, Australia would be served from the varnish node in Australia, and one from a western state such as California would be directed to the varnish node there.

Since mapping an IP to a region or country can never be fully accurate, it should be possible for all varnish nodes to serve customers from any region. Specifically, a language or currency switcher should be available on the website.

Magento 2 supports multiple varnish nodes out of the box

An important feature in using a cache in production is its need to automatically and quickly clear contents on demand from the user and application. For example, when a product goes out of stock, the corresponding page should display the out of stock label. Magento uses a tag-based system to flush appropriate content from the cache. Magento allows setting up multiple varnish hosts and a tag-based cache clear is sent to all the hosts.

bin/magento setup:config:set --http-cache-hosts=<varnish internal ip>:6081,<varnish internal ip 2>:6081

Refer : https://devdocs.magento.com/guides/v2.4/config-guide/varnish/use-multiple-varnish-cache.html
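Conceptually, a tag-based clear is one purge request per configured varnish host. The Python sketch below only builds the requests and does not send them. The PURGE method, the X-Magento-Tags-Pattern header and the pattern format reflect our understanding of Magento's varnish integration and should be treated as assumptions to verify against your Magento version; the function name and the sample IPs are hypothetical.

```python
def build_purge_requests(varnish_hosts, tags):
    """Return one PURGE request description per varnish host for the given tags."""
    # One alternative per tag; each matches the tag as a whole list item.
    pattern = "|".join("((^|,)%s(,|$))" % tag for tag in tags)
    return [
        {
            "method": "PURGE",
            "url": "http://%s/" % host,
            "headers": {"X-Magento-Tags-Pattern": pattern},
        }
        for host in varnish_hosts
    ]


# Example: product 42 went out of stock, clear it on both nodes.
for request in build_purge_requests(["10.0.0.1:6081", "10.0.1.1:6081"], ["cat_p_42"]):
    print(request["method"], request["url"])
```

The important point is the fan-out: every host in the list receives the same purge, so no node keeps serving the stale page.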



Challenges in a varnish cluster

A varnish cluster is more complex to manage

Ensure the varnish vcl files and front end security configuration (WAF, rate limiting, etc) are managed and kept in sync on all edge nodes.

Managing includes monitoring to ensure none of the servers go down. Your site can now be down in one specific region only, for example. Typical monitoring tools such as Pingdom would not be enough. A purpose-built monitoring solution is needed.

A varnish cluster costs for additional servers

Since these would be frontend servers, the amount of RAM and their network speed requirement would depend on what traffic they get.

Number of varnish nodes

Increasing the number of nodes in a varnish cluster does not always improve site speed. Each varnish node keeps its own cache and so has its own hit ratio. A lower hit ratio means more users pay the combined latency and performance penalty of a varnish MISS. Traffic pattern and latency have to be taken into account when deciding how many nodes to use in a cluster.
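The trade-off can be put into a simple expected-value model. This is only a sketch: the formula is a simplification and every number used with it is hypothetical.

```python
def expected_response_ms(hit_ratio, edge_latency_ms, origin_latency_ms, render_ms):
    """Average response time seen by a user of one varnish node."""
    hit_cost = edge_latency_ms
    # A MISS pays the trip to the edge, the trip from edge to origin, and the render.
    miss_cost = edge_latency_ms + origin_latency_ms + render_ms
    return hit_ratio * hit_cost + (1.0 - hit_ratio) * miss_cost


# One central node: far from the user, but a warm cache (hypothetical numbers).
central = expected_response_ms(0.9, 150, 0, 800)
# A regional node: close to the user, but a colder cache and a far-away origin.
regional = expected_response_ms(0.6, 20, 150, 800)
```

With these made-up inputs the central node averages about 230 ms and the regional node about 400 ms, so the nearer node is slower on average until its hit ratio improves.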

Difficult to warm the cache

Given the distributed nature of the cache, warming each cache independently takes more resources on the server side as well as some changes to the way a cache warmer works.



Magento Cloud and Fastly

It is important to discuss the Magento 2 cloud decision to use Fastly as a frontend.

 

Magento 2 cloud pro version architecture is given below. (Reference : https://devdocs.magento.com/cloud/architecture/pro-architecture.html)

The architecture uses Fastly as a Full Page Cache. Varnish is not installed on the Instances in AWS.

Fastly is a CDN that uses varnish. With the Magento 2 Cloud integration, a custom plugin is used on Magento along with a custom vcl file that runs on Fastly.

Fastly has many “POP” locations. As per Fastly documentation (https://www.fastly.com/network-map), there are 20 POP locations in North America, mostly in the USA. Each has its own varnish cache. When a page is not in a POP's cache, the POP fetches the content from the origin. A single page may have to be rendered up to 20 times, once for each POP location in North America.

Drawbacks of this architecture from a varnish cluster perspective

  • More POPs do not lead to better experience as a higher percentage of MISS on varnish results in a worse experience for more users.
  • As POPs increase the load on Magento infrastructure increases.
  • It is not possible to use a cache warmer to warm the Fastly cache.

What next?

An ideal situation would be a layered varnish configuration - each "satellite" varnish node serving a local user, caching a subset of the "main" varnish node, reducing the penalty of a varnish MISS.
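The layered idea can be modelled with a toy cache, as a sketch only. `VarnishNode`, `magento_render` and the node names are hypothetical stand-ins; real varnish would express this through backend configuration, not Python.

```python
class VarnishNode:
    """Toy model of a cache node that falls back to a parent on a MISS."""

    def __init__(self, name, fetch_from_parent):
        self.name = name
        self.store = {}
        self.fetch_from_parent = fetch_from_parent

    def get(self, url):
        if url in self.store:
            return self.store[url]          # HIT on this node
        body = self.fetch_from_parent(url)  # MISS: ask the next layer
        self.store[url] = body
        return body


render_count = {"n": 0}


def magento_render(url):
    """Stand-in for an expensive Magento page render at the origin."""
    render_count["n"] += 1
    return "page for " + url


# Three satellites all miss, but only the first miss reaches Magento.
main = VarnishNode("main", magento_render)
satellites = [VarnishNode("sat-%d" % i, main.get) for i in range(3)]
for node in satellites:
    node.get("/product-1")
```

In this model the page is rendered once even though three satellites missed, because the second and third satellites are served from the main node's cache.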

Share your thoughts here or on social media.


Choosing Hosting platform for Magento : Digital Ocean

Once in a while you may want to check on the technology and options available for Magento hosting – technologies change, hosting costs reduce, life becomes easier for you and the merchant. In this multi-part series we will explore cloud platforms and their suitability to hosting a production Magento website. In this article we will explore how seriously you should consider Digital Ocean as a platform for hosting. Digital Ocean (DO) offers VMs it calls “droplets”. With an excellent blog and a friendly attitude towards developers, DO is a serious hosting contender. Many developers naturally recommend using DO to merchants for hosting Magento. How real is it? Let me explore a few pros and cons.

DO Choice of vCPU and memory

DO offers “shared CPU” and “dedicated CPU”. For eCommerce hosting we always prefer dedicated CPU. Shared CPUs work differently on all cloud providers and quite often when you need it the most, you do not get enough. Each vCPU is an Intel hyper-thread. As of this writing (July 2020), we see use of Intel Xeon Gold 6140 @ 2.30 GHz with DDR4 memory at 2.6 GHz. Our simple memory speed test showed about 1.8GHz effective throughput to memory.

DO disk : Love / hate relationship!

I love the disk speed: you get good SSD performance with no throttling, even for attached storage. Unlike other cloud platforms, you don’t have to juggle with figuring out how many IOPS you need, or work out how IOPS translates to speed on that platform. To test the disk speed I use a simple but effective method: I create a 1GB file from /dev/zero using dd, and dd reports the write speed. Here is the command I use :

dd if=/dev/zero of=$tempFile bs=1M count=1024 conv=sync oflag=direct

tempFile has the path to a temp file on the mounted disk I am testing. For both direct disk and mounted block devices we see 200-350 MB/s disk speeds across all droplets our customers use. This is the best we have seen on cloud platforms. Physical hardware can give up to 450 MB/s.
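For readers who prefer a script, here is a rough Python approximation of the same test. It is a sketch under stated assumptions: it uses an fsync at the end instead of dd's direct I/O, so the numbers will not match dd exactly, and the `write_speed_mb_per_s` helper is hypothetical.

```python
import os
import tempfile
import time


def write_speed_mb_per_s(directory, size_mb=1024, block_mb=1):
    """Write zeros into a temp file in directory and return MB/s (1 MB = 2**20 bytes)."""
    block = b"\0" * (block_mb * 1024 * 1024)
    blocks = size_mb // block_mb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb", buffering=0) as handle:
            for _ in range(blocks):
                handle.write(block)
            # Force the data to disk before stopping the clock.
            os.fsync(handle.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return (blocks * block_mb) / elapsed
```

Call it with a directory on the disk under test, for example `write_speed_mb_per_s("/mnt/volume")`, where the mount path is whatever your block device uses.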

So, why the hate?

We have seen disk crashes – thankfully on staging servers. So, when it comes to production servers, we always recommend customers to have a DR plan to minimize loss of data when this happens. Our suspicion is that the storage is not in a managed RAID, hence a disk crash is a droplet specific event.

Network

File transfer speed test : A large file transfer using scp on the internal network runs at about 130 MB/sec, which is about 1 Gb/sec.

NFS

NFS of block storage performs poorly. We use nfscache so most reads are served from cache. However, if there is a need for a large number of reads or writes, the performance can drop dramatically.

Examples of NFS hurting performance

  • Magento 1 : As images are created in the frontend, there is a check to see if an image file exists before an image url is included in a page. This invariably results in a slowdown.
  • Magento 1 and 2 : Large log files. Magento stores log files in a shared folder. Multiple lines of logs per hit will result in a slowdown.

Experience : We were in the process of taking a Magento 2 website live that had about 2500 errors written to a log file. The issue was related to the M1 migration, which left some attributes undefined in M2. App servers saw 30% CPU in I/O wait and the site slowed to a crawl. Php access logs showed under 10% CPU utilization.

Non availability of autoscale

DO does not come with an autoscale option (you could get autoscale with their Kubernetes offering, which we are not considering here). This means you may have to keep capacity provisioned for peaks, for example a holiday sale.

Managing server costs

Server costs even when you shut them down

DO charges for servers that are shut down. If you do not want to be charged fully for a server, storing an image is an option.

Recommendation

We host customers on Digital Ocean if they do not have a need for scaling and are willing to have a DR plan. The DR plan generally increases the cost of the overall solution. Stores that typically do not need scaling include :
  • PWA - Vue-storefront based stores which use nodejs
  • A high hit ratio varnish FPC magento store (depends on many factors, but essentially requirement for scaling multiple app servers is less likely)
 

How Magento can get near 0 downtime deployment

Factor III of the 12 Factor App says "Store config in the environment".

12 Factor App is what devops lives by - a set of 12 principles written by Adam Wiggins for predictable web app deployments.

Storing configuration in environment, separate from code has the advantages of reliable deployment along with reduced time to deploy. It allows separation of the build stage from the deploy stage, with some deploys being just a change in a softlink to the web root folder.

Historical preview : Magento 1

Magento 1 did not have much of a build process: js and css were not versioned, minification was done "online" on first access, as were database upgrades, and configuration was stored in the database.

The most reliable way to go from a dev configuration to a live configuration required either a set of known manual steps or direct changes to the database.

luroConnect developed its own build and deploy process. In our build step we

  • get source code from git
  • minify css and js files in the skin and js folders using a grunt based process
  • set appropriate file ownership and permissions

During the deploy phase, we

  • Copy app/etc/local.xml from a secure deployment configuration area (our environment)
  • modify the core config data to add a version string in the skin and js URLs
  • access the website once through the index.php to cause the update scripts to run

Deploy process is of course run with the site in maintenance - we prefer to do this at the nginx level. Mostly it is a small blip.


Historical preview - pre Magento 2.2

Early Magento 2 builds were similar, except there was some help from the bin/magento command. Our deploy process did not need to version the static assets anymore. Plugin enable / disable was given via config.php. Our deployment environment contained env.php.

However, developers had to manually configure and experiment with some options.

Site bringup required devops to access the admin panel or update the database with custom sql - enabling varnish, setting up CDN with a static URL, etc.

Magento 2.2 and beyond

Magento adopted the direction of the 12 factor app and, at Magento Live UK 2017, presented a new set of features that help split application configuration from environment configuration. Application configuration is defined in app/etc/config.php, which is advised to be kept in git. Hosting environment and secure details are kept in env.php, which should not be kept in git.

It is a slightly weak conformance - as commented by 12factor app "This is a huge improvement over using constants which are checked into the code repo, but still has weaknesses: it’s easy to mistakenly check in a config file to the repo; there is a tendency for config files to be scattered about in different places and different formats, making it hard to see and manage all the config in one place. Further, these formats tend to be language- or framework-specific."

Magento has fixed this in 2 ways

  1. The language specific aspect is addressed to some extent in Magento by allowing the bin/magento cli to edit env.php for sensitive data. The config:sensitive:set command writes directly to env.php. These commands do not require the database and hence can be run in a pre-deploy step.
  2. Use of scoped environment variable names. These would be set in Nginx configuration or an include file such as fastcgi_params.
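To make the second point concrete, here is a minimal sketch of how a config path might map to a scoped variable name and to an nginx `fastcgi_param` line. It assumes the `CONFIG__<SCOPE>__<PATH>` naming convention as we understand it from Magento's documentation; the helper names and the example value are hypothetical.

```python
def config_env_name(path, scope="default", scope_code=None):
    """Map a Magento config path to a scoped environment variable name."""
    parts = ["CONFIG", scope.upper()]
    if scope != "default":
        if not scope_code:
            raise ValueError("websites/stores scope needs a scope code")
        parts.append(scope_code.upper())
    # Slashes in the config path become double underscores.
    parts.append(path.replace("/", "__").upper())
    return "__".join(parts)


def fastcgi_param_line(path, value, scope="default", scope_code=None):
    """Format the nginx directive that passes the variable to php-fpm."""
    return 'fastcgi_param %s "%s";' % (config_env_name(path, scope, scope_code), value)


line = fastcgi_param_line("web/secure/base_url", "https://shop.example.com/")
```

Here `line` would go into the nginx configuration or an include file such as fastcgi_params, keeping the value out of the code repo.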

However, there is no documented way to set database details - except to manually edit the env.php file.


The app:config:dump command

The app:config:dump command is a great help in maintaining a known configuration of the application (which 12factor app suggests be committed to git). It ensures the configuration is communicated from developers to operations.

The app:config:dump command writes to config.php and env.php. While config.php is suggested to be committed to git, env.php should not be committed to git.

If a value is in config.php, the Magento admin panel does not allow the parameter to be edited. This locking helps with giving stability to the application configuration. It ensures the application is developed and tested with a known configuration.

The figure alongside shows the suggested flow.

Suggested flow for using app:config:dump

Why do Magento deployments still keep the site in maintenance?

However, we find that even 2 1/2 years after the announcement, the acceptance and understanding of these features is weak, leaving websites in maintenance mode while code is deployed.

Either developers fail to maintain the discipline of owning the configuration, or devops fail to understand the application's build and deploy process.

There are some practical problems as well. An eCommerce manager would like to have control on the live website on say, when backorders would be allowed storewide. Since this is locked in config.php, this request has to go through developers or devops.

luroConnect near 0 downtime deploy

luroConnect's Magento 2 build is in a pipeline - such as a bitbucket pipeline. A commit triggers the pipeline that does the following

  • composer install (with the composer cache to speed this process)
  • bin/magento setup:di:compile
  • bin/magento setup:static-content:deploy

The contents are then tarred and sent to the staging and production servers.
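The build stage can be sketched as a small script. This is an illustration only, not luroConnect's pipeline: the `run_build` and `package_release` helpers are hypothetical, and the commands are the three listed above with no extra flags.

```python
import subprocess
import tarfile

BUILD_STEPS = [
    ["composer", "install"],
    ["bin/magento", "setup:di:compile"],
    ["bin/magento", "setup:static-content:deploy"],
]


def run_build(project_dir, dry_run=True):
    """Run the build steps in order, stopping at the first failure."""
    executed = []
    for step in BUILD_STEPS:
        executed.append(" ".join(step))
        if not dry_run:
            # check=True raises CalledProcessError if a step fails.
            subprocess.run(step, cwd=project_dir, check=True)
    return executed


def package_release(project_dir, archive_path):
    """Tar the built project; archive_path must be outside project_dir."""
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(project_dir, arcname=".")
    return archive_path
```

Keeping the build in its own step means the production server only ever receives a finished archive and never runs composer or the compiler itself.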

Upon deploy the contents are untarred, deployment related files like env.php are copied, and media and var are softlinked. The web root softlink is then changed to point to this new release. The process is slightly more complicated when multiple autoscale instances are running, as running instances are replaced with ones with new code.
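The softlink change is what keeps the switch close to instantaneous. A minimal sketch of one way to do it on a POSIX system follows; the `switch_release` helper and the paths are hypothetical, and it assumes the web root is already a softlink rather than a real directory.

```python
import os


def switch_release(release_dir, web_root_link):
    """Point the web root softlink at release_dir with an atomic rename."""
    tmp_link = web_root_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(release_dir, tmp_link)
    # Renaming over an existing softlink is atomic on POSIX, so a request
    # sees either the old release or the new one, never a missing web root.
    os.replace(tmp_link, web_root_link)
```

A call such as `switch_release("/var/www/releases/42", "/var/www/current")` would be the last step of a deploy, with both paths standing in for your own layout.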

If required the bin/magento setup:upgrade command is run and only then is it required to keep the site in maintenance.


12 factor app and Magento

Adam Wiggins’ 12 factor app (https://12factor.net) is a highly respected standard for web apps. While written with SaaS applications in mind, let us explore and see how Magento and the ecosystem stands up to these factors.

1. Codebase. One codebase tracked in revision control, many deploys.
Magento is in git and hence a typical Magento project should not have a problem with this.
However, if you use vue-storefront, a popular PWA frontend to Magento, this is broken. Vue-storefront has 2 repos of its own in addition to the Magento repo, all becoming one app.
Another violation happens when a plugin vendor gets ssh access to your live server to fix a plugin issue. Plugin vendors have a serious problem integrating their code into multiple source bases without Magento supporting a versioned plugin architecture out-of-the-box.

2. Dependencies. Explicitly declare and isolate dependencies.
With composer Magento solves this problem.
Plugins are a case in point of violation: many are not installed as composer dependencies. Instead they make it into the merchant repo.

Magento uses php and typical websites are deployed using php-fpm. One may argue that the php-fpm plugins that Magento depends on are not explicitly declared, which can lead to the application not behaving identically in 2 environments. Another case in point is the dependency on the php version.

3. Config. Store config in the environment.
12 factor app requires environment variables to be used. Magento has split application and environment configuration between config.php and env.php.
Here is what 12 factor says.
“Another approach to config is the use of config files which are not checked into revision control, such as config/database.yml in Rails. This is a huge improvement over using constants which are checked into the code repo, but still has weaknesses: it’s easy to mistakenly check in a config file to the repo; there is a tendency for config files to be scattered about in different places and different formats, making it hard to see and manage all the config in one place. Further, these formats tend to be language- or framework-specific.”

However, Magento has worked towards this. Specifically, the bin/magento config:set and bin/magento config:sensitive:set commands are a useful way for hosting providers to be 12 factor compliant.

luroConnect has always stored hosting configuration settings separately from the release. Upon deployment of code, the contents of deployment folder are copied. Sometimes they have settings for the application. These include hosting specific as well as sensitive settings. We are moving to using config:set and config:sensitive:set for versions of Magento that support it. We will also move towards storing sensitive variables in secure key stores.

4. Backing services. Treat backing services as attached resources.
“Resources can be attached to and detached from deploys at will.”

While Magento is very good at storing key connections outside the application and database, violations exist in third-party plugins. To “ease” deployment, most store credentials and connectivity details in the database. Another issue is with SMTP plugins: instead of depending on Magento’s default use of localhost and letting the postfix configuration manage the actual email sending, developers prefer the convenience of storing this information in the database.

Check out this post on SMTP and postfix configurations.

5. Build, release, run. Separate build and run stages.
Magento has been improving the code deployment process. The setup upgrade is the only command that, if needed, requires the site to be under maintenance.

6. Processes. Execute the app as one or more stateless processes.
Twelve-factor processes are stateless and share-nothing. Any data that needs to persist must be stored in a stateful backing service, typically a database.

Magento is very good on this count if used with nginx and php-fpm.

7. Port binding. Export services via port binding.
“PHP apps might run as a module inside Apache HTTPD” is flagged as a violation if the apache is also used as a webserver.
nginx + php-fpm gives the best isolation and performance of any stack. Php processes can be independently controlled in a server running php-fpm while nginx can be used for routing and handling web requests, terminating SSL, etc.

8. Concurrency. Scale out via a process model.
Magento is very good at this. Aided by php-fpm process model that complies with the 12 factor app, it is possible to build a cluster to handle only checkout urls for example, with routing handled by an application load balancer such as nginx.

9. Disposability. Maximize robustness with fast startup and graceful shutdown.
While Magento and php are good at this, some notes are in order.
A reload of php-fpm by default will kill all php processes even though they may be executing a request. Ensuring no new traffic is coming to php-fpm, and waiting for it to drain by checking the status for the number of active processes (with a timeout of course), will ensure a graceful shutdown.
In order to ensure robustness against sudden death of the php-fpm process, it is best to keep the queue length (listen.backlog) to a small number. It turns out that managing the queue to scale out helps application performance as well.
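The drain-and-wait step can be sketched as a polling loop. This is a minimal sketch: `fetch_status` is a hypothetical callable you supply that returns the php-fpm status page parsed from JSON, and the "active processes" key is assumed to be present in that output.

```python
import time


def wait_for_drain(fetch_status, timeout_s=30.0, poll_s=0.5, busy_floor=0):
    """Poll php-fpm status until active processes drop to busy_floor or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        active = fetch_status()["active processes"]
        if active <= busy_floor:
            return True   # drained: safe to reload or stop php-fpm
        if time.monotonic() >= deadline:
            return False  # still busy: caller decides whether to force it
        time.sleep(poll_s)
```

If the status page is served by the same pool, the status request itself may count as an active process, in which case a `busy_floor` of 1 is the value to wait for.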

10. Dev/prod parity. Keep development, staging, and production as similar as possible.
The 12 factor app describes 3 gaps – time, personnel and tools. Based on our experience, the personnel gap is eliminated by automation. A commit trigger based automated CI/CD pipeline with an automated deploy to staging and production ensures there is no personnel gap.

A development environment with write access to git can be created with a similar infrastructure to help developers debug issues.

11. Logs. Treat logs as event streams.
Magento allows creation of multiple log files. Modern logging such as monolog allows more control of what is and what isn’t logged. Logs are also generated by nginx, php-fpm and other services used.
Streaming logs for querying and analysis is typically done by your hosting provider.

luroConnect uses fluentd to capture logs. Logs are sent to our Insight service, which analyzes data per minute, hour or day.

12. Admin processes. Run admin/management tasks as one-off processes.
Magento supports cron and rabbitmq based processes. In addition, setup upgrade is also used to change the state of the database during deployment.
However, the suggested developer access to “run arbitrary code or inspect the app’s models against the live database” is not recommended by luroConnect, due to security concerns and the risk to application stability when state is altered arbitrarily.