Author - Pradip Shah

PWA - Progressive Web Apps

Progressive Web Apps in Magento 2 – First Opinion

Magento released 2.3 and that included PWA Studio to help build progressive web apps.

Progressive Web Apps is a technology that allows a website to act like an app on a mobile device – in that it does not need continuous connectivity to the internet. This gives the benefit of easy and quick navigation.

When it comes to Magento software technology, there is no better person than Hashid Hameed, CEO of Codilar to get some first hand information. Long before the official release, their team has been playing around with PWA technology so they are ready when the official announcement actually came. Here is excerpts from the interview :

luroConnect : What type of Magento stores is PWA most suitable for?

Hashid : PWA is suitable for all types of stores regardless of the industry and target audience. However it will be a must-have in the near future for stores who have significant mobile audience. With the growing experiential demands of the modern shopper and limited reach & high cost of native apps, PWA will become the saviour for such stores.

luroConnect : In order to make the most of this technology stores have to upgrade to Magento 2.3 – how much of a effort is that? (Hey we just moved from 1.x not long ago!)

Hashid: Haha, you don’t have to worry. It’s just like another incremental build to the existing platform. Magento 2.3 is one of the most promising upgrades so far to the platform. It adds in a a lot of new features. However possibilities of conflicts with custom code and non-supported extensions cannot be ruled out.

luroConnect : OK, I am on Magento 2.3, how long before I can get a PWA app up and running?

Hashid : Magento has released a PWA Studio for developers to speed up the PWA development. Roughly, a PWA can go live in 2-3 months for a usual Magento store.

luroConnect : Will my PWA app share my website theme?

Hashid : Not really. PWA will have its own theme. Magento has a default theme called Venia for PWA.

luroConnect : Do I need to tell my website visitor to install an app?

Hashid : Certainly not. Not having to install an app is one of the biggest advantage of PWA. You get an app-like experience without installing an app, how cool is that? Plus users can easily add the PWA to their homescreen and use it like a regular app. Check out the demo below and see for yourself.

luroConnect : How is PWA different from AMP?

Hashid : AMP is great for blogs and news websites where the main content is usually static. Amp comes with a lot of feature restrictions which is simply not suitable for a highly dynamic needs of an online store. For example, to be accepted as an AMP page and enable Google AMP network to deliver the cached AMP pages, developers must use Google’s library of approved HTML, Javascript, CSS and analytics tags. You will not be able to use A/B Testing, personalization, recommendations etc.

On the other hand, PWA is completely in the hands of the developer. There is no restriction at all.

luroConnect : Does my desktop experience enhance with PWA or is it just mobile?

Hashid : Though PWA talks more about mobile experience, most benefits can be reaped for desktop as well. Faster load time (no page loads), easier  product discovery, intuitive navigation are some of these. However, majority of the merchants are mostly likely to roll-out the PWA only for mobile audience initially.

luroConnect : If I have updated the category or price how quickly can my PWA store know?

Hashid : Ideally, there should not be any delay as the PWA directly talks to Magento’s new GraphQL API. If the API responses are cached for even faster response by the developers, there could be a delay. But this is totally in the hands of the developer. It depends on the store requirement.

luroConnect : How about stock updates? I want that information to be out in the forefront.

Hashid : The above answer applies here as well.

luroConnect : Can a checkout happen without internet connectivity?

Hashid : Yes, in an ideal PWA, the complete checkout can happen offline. The order will be synced in Magento when the internet connectivity is back. Just like, how some of the cloud POS systems work offline. However, having an offline checkout could lead to other complexities in business such as ensuring stock availability.

If you want to try out the new feature, Codilar has arranged a demo site https://www.codilar.com/blog/magento-2-pwa-demo-venia/ If you want to get in touch with Hashid email to hello@codilar.com

We can analyze your site for free

Schedule a call

Not happy with your website performance and want an expert to look at it?

  • We will analyze your site using public information.
  • We will ask you to give us a 1 day web server log file.
  • We will try to identify what steps if any you should take to improve your sites performance goals.

Analysis of recent website security breaches

Introduction

Recent high profile website security breaches are nice to study to ensure you have defense mechanisms in place.
We strongly believe in “defense in depth” or “multi layered security”. This includes

  • Defense in Multiple Places. Meaning that an organization need to deploy security mechanisms in different places.
  • Layered Defenses. This means to have many layers of defense.

Let us analyze these hacks and lessons we can learn.

Analysis of 4 hacks to known websites

britishairways.com does not even store credit cards!

British Airways eCommerce site britishairways.com reported leak of 380,000 credit cards.

Let us first look at their privacy policy

Under “How we secure your payment information when you book online” it mentions

  • All of your personal information is encrypted as it travels over the Internet, to and from www.ba.com. When information is encrypted, it is scrambled between your computer and our server. The information is only unscrambled when it safely reaches us. It’s fast and safe, and it ensures that your personal information cannot be read by anyone else.
  • When you send your personal details to us, none of the information is stored on the website, it is passed straight back to our secure servers at our Heathrow headquarters, where it only exists as part of the record of your transaction.

Factually, what does “none of the information is stored on the website” mean?

Also based on the clarification given by BA we can assume they do not store credit card information along with the customer data.

The breach :

  • Magecart created a secure fake site baways.com which was used to store the siphoned credit card information
  • As is typical of magecart they broke into the britishairways.com servers, possibly through an admin interface to change a javascript file (modernizr.js), an open source library used on the website. Magecart inserted a short javascript that would monitor keystrokes and report back to baways.com that they owned.

That clearly indicates there was a breach. However, if one assumes the main servers of britishairways.com are well protected, there can be 2 possible options of breakin

  • It is possible in the website to use a administrative interface to change or upload a javascript file that gets used in the build
  • Or, they were able to gain access to the build process of britishairways website and ensure a modified version of the required javascript file was inserted.

What lessons can we learn?

  1. Admin interface to either update HTML or javascript code directly is a bad idea. If your environment has that convenience, it is important to protect it. Protection can be IP based so only certain IPs can access or be hidden in general and unlocked if needed.
    It is also important to periodically check if HTML or JavaScript stored in the database is clean. For example do not allow any JavaScript tag in the product description.

Live site should be hosted to ensure directories from where any code is executed is not allowed to be written to.

The breach :

Due to a vulnerability in “Apache Struts” that was exploited, access to execute any shell script was given and the hacker was able to download the entire credit history data stored.

Exact vulnerability in Apache struts – “incorrect exception handling and error-message generation during file-upload attempts, which allows remote attackers to execute arbitrary commands via a crafted Content-Type, Content-Disposition, or Content-Length HTTP header”

Patch was available but was not implemented on the server

What lessons can we learn?

  • While the obvious lesson is to update when patches are available, we think that it is not always possible or there will atleast be a lag between the patch being released and the website being patched.
  • A web user – the user that the web application is running as – should have a need to know access permissions.
  • Web root should be clean – do not have temporary folders or files that are not required – or limit permissions of the web user.
  • Monitor network activity for any unusual pattern – download of 143m records should not go unnoticed.
  • Use reverse proxy as we web server (like nginx) and do not expose the actual application servers to the internet. Infact, ensure the web server is a separate container or instance that only serves static content and passes upstream requests to the application server.

The breach :

As per facebook …

The vulnerability was the result of a complex interaction of three distinct software bugs and it impacted “View As,” a feature that lets people see what their own profile looks like to someone else. It allowed attackers to steal Facebook access tokens, which they could then use to take over people’s accounts. Access tokens are the equivalent of digital keys that keep people logged in to Facebook so they don’t need to re-enter their password every time they use the app.

First, the attackers already controlled a set of accounts, which were connected to Facebook friends. They used an automated technique to move from account to account so they could steal the access tokens of those friends, and for friends of those friends, and so on, totaling about 400,000 people. In the process, however, this technique automatically loaded those accounts’ Facebook profiles, mirroring what these 400,000 people would have seen when looking at their own profiles. That includes posts on their timelines, their lists of friends, Groups they are members of, and the names of recent Messenger conversations. Message content was not available to the attackers, with one exception…

The attackers used a portion of these 400,000 people’s lists of friends to steal access tokens for about 30 million people

What lessons can we learn?

  • Do not generic and hidden tokens that give cross access to APIs. Instead, each token should have a ACL (Access Control List) and select the minimum ACL required.
  • While this is particularly true for web services, in eCommerce websites payment gateway tokens should not be sent to frontend until required, even in hidden fields.

The breach :

  • On the evening of Saturday, June 23rd (2018), we received notice from our customers at Ticketmaster that the personal data of their users had been compromised. Upon further investigation by both parties, it was confirmed that the source of the data breach was a single piece of JavaScript code, customized by Inbenta to serve Ticketmaster’s particular requests. The attacker(s) located, modified, and used this customized script to extract the payment information of Ticketmaster customers processed between February and June 2018.
  • After a careful analysis of all clues and snapshots from our systems, the technical team at Inbenta discovered that the script had been implemented on the payment page. We were unaware of this, and would have advised against doing so had we known, as it presents a point of vulnerability.
  • Inbenta has conducted a detailed analysis of all the file systems used for development and production systems, thoroughly analysing any difference between the original source code and the version in the production environment. We can confirm that just 3 files were altered that affected 3 specific websites for Ticketmaster.
  • One of the advantages of hosting scripts at Inbenta’s servers that are embedded in our customer’s website is the flexibility that Inbenta can offer to our customers to have new functionalities or updates up and running quickly. The downside is that we cannot monitor which web pages our customers are embedding those scripts on and therefore we cannot prevent customers putting them in pages that collect sensitive information.

What lessons can we learn?

  • Avoid including 3rd party libraries from 3rd party servers. Implementing this is a matter of size sometimes – if the vendor is bigger and changes the content often, this cannot be done. However, there are many instances that can be avoided.
  • Ensure version in original source code and that in the production environment are the same, else there has been a breach.
  • Production site should be hosted to ensure directories from where any code is executed is not allowed to be written to.
  • Secure the CI/CD process so files in version control are not modified on the way. Some need to be compiled / compressed, etc but the possibility of inserting code has to be prevented. Securing CI/CD process is important.

Conclusion : Do not take Security for granted

Just having a SSL or a WAF is not enough security. Even having PCI certification is not enough security.

Instead, building security in mulitple layers, following first principles of security such as “need to access” and monitoring are needed. Since we cannot assume there is full proof security, we have to actively both prevent and know early if there is a breach.

To secure your Magento website, refer our article “Enterprise Grade Security For Your Magento Website

We can analyze your site for free

Schedule a call

Not happy with your website performance and want an expert to look at it?

  • We will analyze your site using public information.
  • We will ask you to give us a 1 day web server log file.
  • We will try to identify what steps if any you should take to improve your sites performance goals.

The DIY Guide to Magento Hosting

Introduction

Quite often we depend on the hosting provider to give us a fast website. But with the availability of affordable hosting options in terms of VPC that requires some DIY skills to setup.

In this DIY guide we help you setup a high performance Magento hosting environment. This guide includes configurations and tips. As with any DIY, you need to have some knowledge and tools. We expect that you have a basic working knowledge of linux and its commands, access to root, ability to install standard packages  as well use of a text editor in linux – such as vim. An idea of users, uids, groups, gids and process will be useful for trouble shooting if you need.

Much like getting a car with a good engine is only the starting point if you love fast cars, so is a good server if you want great page load times. A good transmission would be the next thing you need to look for. Transmission in hosting is equivalent to the components of hosting and how they communicate.

Choosing a hosting provider

All hosting providers claim best hardware but sell you on price. How do you compare one from the other? Knowing how they work and some tests may help!

How clouds and virtualization work : Todays virtualization technology allows hosting providers to share a larger computing resource they own amongst many customers. The technology also allows overcommit – much like an airline that would overbook seats in the hope someone cancels, a hosting provider can overcommit resources such as CPU in the hope that not all customers would use all CPUs given at the same time. However, unlike an airline, virtualization technology can deteriorate performance without failing.

Testing the resources : We have a quick way to test the quality of your hardware. We measure the speed of RAM and disk in these simple tests. In each case we write data serially to a chunk of memory or disk and measure the performance. Warning : Free memory and disk are required for the test.

The memory test

(tempDir=`mktemp -d -t linux-benchmark-XXX`; mount -t tmpfs -o size=128m $tempDir $tempDir; (dd if=/dev/zero of=$tempDir/test_ bs=1M count=128 conv=fdatasync &&rm -f $tempDir/test_)

The disk test
This test gives you a momentary result. Run the test multiple times to get a good average. Anything below 150Mbps is slow. A typically SSD disk should give about 450Mbps, good ones above 700Mbps.

tempFile=<drive>/disktest; dd if=/dev/zero of=$tempFile bs=1M count=1024 conv=sync oflag=direct; rm -f $tempFile

The architecture

We will assume a simple architecture – all services on the same server. This is for simplicity and we have seen it to work well with sites on physical hardware with rather consistent traffic in a 24 hour period.

There can be alternate architectures

  • separate db server
  • multiple app servers – service all traffic in front of a load balancer
  • separate app server – say for admin URLs with a path based load balancer

Each of these bring additional complexities in configurations. Please use the comments section below if you have a more complex architecture.

The nginx web server

Note : We use nginx – read our blog article on why we prefer nginx over apache.
The basic nginx configuration for Magento is available
Click here.

## define both backends
## upstream  socketbackend{
  server unix:/var/run/php-fcgi-www.sock;
}
upstream tcpbackend{
  server 127.0.0.1:9000;
}
server {
  listen 443 ssl http2;
  server_name example.com www.example.com;
  root /home/example/www;
  ssl on;
  ssl_certificate       /etc/pki/tls/certs/example_certs/www_example.com.crt;
  ssl_certificate_key   /etc/pki/tls/certs/example_certs/www_example_com.key;
  ssl_protocols         TLSv1 TLSv1.1 TLSv1.2;
  ssl_prefer_server_ciphers on;

  ssl_ciphers "EECDH+AESGCM:EDH+AESGCM:ECDHE-RSA-AES128-GCM-SHA256:AES256+EECDH:DHE-RSA-AES128-GCM-SHA256:AES256+EDH:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA:ECDHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES128-SHA256:DHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES256-GCM-SHA384:AES128-GCM-SHA256:AES256-SHA256:AES128-SHA256:AES256-SHA:AES128-SHA:DES-CBC3-SHA:HIGH:!aNULL:!eNULL:!EXPORT:!DES:!MD5:!PSK:!RC4";  
  ssl_dhparam /etc/nginx/dhparams.pem;

  access_log /var/log/nginx/example.access.log;
  error_log /var/log/nginx/example.error.log;
  #/* deny should come first */
  location ^~ /app/ { deny all; }
  location ^~ /bin/ { deny all; }
  location ^~ /lib/ { deny all; }
  location ^~ /media/downloadable/ { deny all; }
  location ^~ /dev/ { deny all; }
  location ^~ /vendor/ { deny all; }
  location ^~ /update/ { deny all; }
  location ^~ /var/ { deny all; }
  location ^~ /downloader/ { deny all; }
  location ^~ /admin/ { deny all; }
  location ^~ /phpserver/ { deny all; }
  location ^~ /setup/ { deny all; }
  location ^~ /run/ { deny all; }
  #/* system . files */
  location /. {
    return 404;
  }
  #/* main php handler */
  location / {
    index index.php index.html;
    try_files $uri $uri/ @handler;
  }
  #/* set long expiry for static content. Update types as needed */
    location ~* \.(jpg|jpeg|gif|png|css|js|ico|swf|woff2|svg|TTF)$ {
    access_log off;
    log_not_found off;
    expires 360d;
  }
  #/* directories where you do not want php to be executed from */
  location ~* /(images|cache|media|skin|js|uploads|logs|tmp)/.*\.(php|pl|py|jsp|asp|sh|cgi)$ {
    return 403;
  }

  #/* we you need setup remove from deny list */
  location /setup {
    try_files $uri $uri/ @setuphandler;
  }
  # Rewrite Setup's Internal Requests
    location @setuphandler {
    rewrite /setup /setup/index.php;
  }
  # Rewrite Internal Requests
  location @handler {
    rewrite / /index.php;
  }
  location /pub/static {
    try_files $uri $uri/ @static;
  }
  location @static {
    rewrite ^/pub/static/(.*)$ /pub/static.php?resource=$1? last;
  }
  location ~ \.php$ { ## Execute PHP scripts
    try_files $uri =404;
    expires off;
    fastcgi_pass socketbackend;
    #fastcgi_pass tcpbackend;
    fastcgi_read_timeout 1s;
    fastcgi_param HTTPS $fastcgi_https;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    include fastcgi_params;
  }

Php-fpm – the php interpreter

An important factor is to not overcrowd the CPU and memory you may have, as the default configuration does. The main configuration file is www.conf and specifically the section on servers.
Click here is a production php-fpm.d/www.conf file

; Start a new pool named 'www'.
[www]

; Unix user/group of processes
; luroConnect : 2-user system requires the php-fpm and nginx processes run as the same user
; give the nginx group read only access to the hosting directory typically in /home//www
; if a directory needs to be written into by php (such as upload, consider media, var) those directories
; need write permission for groups
user = nginx
group = nginx

; The address on which to accept FastCGI requests.
; Valid syntaxes are:
;   'ip.add.re.ss:port'    - to listen on a TCP socket to a specific IPv4 address on
;                            a specific port;
;   '[ip:6:addr:ess]:port' - to listen on a TCP socket to a specific IPv6 address on
;                            a specific port;
;   'port'                 - to listen on a TCP socket to all addresses
;                            (IPv6 and IPv4-mapped) on a specific port;
;   '/path/to/unix/socket' - to listen on a unix socket.
; Note: This value is mandatory.
; luroConnect : Match listen to the value in nginx. Single server system can use socket
; listen = 9000
listen /var/run/php-fcgi-www.sock

; Set listen(2) backlog.
; Default Value: 511 (-1 on FreeBSD and OpenBSD)
;listen.backlog = 511

; Set permissions for unix socket, if one is used. In Linux, read/write
; permissions must be set in order to allow connections from a web server. Many
; BSD-derived systems allow connections regardless of permissions.
; Default Values: user and group are set as the running user
;                 mode is set to 0660
;listen.owner = nobody
;listen.group = nobody
;listen.mode = 0660
; When POSIX Access Control Lists are supported you can set them using
; these options, value is a comma separated list of user/group names.
; When set, listen.owner and listen.group are ignored
;listen.acl_users =
;listen.acl_groups =

; List of addresses (IPv4/IPv6) of FastCGI clients which are allowed to connect.
; Equivalent to the FCGI_WEB_SERVER_ADDRS environment variable in the original
; PHP FCGI (5.2.2+). Makes sense only with a tcp listening socket. Each address
; must be separated by a comma. If this value is left blank, connections will be
; accepted from any ip address.
; Default Value: any
; 
;listen.allowed_clients = 127.0.0.1

; Specify the nice(2) priority to apply to the pool processes (only if set)
; The value can vary from -19 (highest priority) to 20 (lower priority)
; Note: - It will only work if the FPM master process is launched as root
;       - The pool processes will inherit the master process priority
;         unless it specified otherwise
; Default Value: no set
; process.priority = -19

; Choose how the process manager will control the number of child processes.
; Possible Values:
;   static  - a fixed number (pm.max_children) of child processes;
;   dynamic - the number of child processes are set dynamically based on the
;             following directives. With this process management, there will be
;             always at least 1 children.
;             pm.max_children      - the maximum number of children that can
;                                    be alive at the same time.
;             pm.start_servers     - the number of children created on startup.
;             pm.min_spare_servers - the minimum number of children in 'idle'
;                                    state (waiting to process). If the number
;                                    of 'idle' processes is less than this
;                                    number then some children will be created.
;             pm.max_spare_servers - the maximum number of children in 'idle'
;                                    state (waiting to process). If the number
;                                    of 'idle' processes is greater than this
;                                    number then some children will be killed.
;  ondemand - no children are created at startup. Children will be forked when
;             new requests will connect. The following parameter are used:
;             pm.max_children           - the maximum number of children that
;                                         can be alive at the same time.
;             pm.process_idle_timeout   - The number of seconds after which
;                                         an idle process will be killed.
; Note: This value is mandatory.

; luroConenct : We set to static so we can dedicate CPU resources to php
; Create separate pools for processing URLs with different levels of CPU utilization
; For example if you have URLs that do a CURL,  make a curl pool. These will consume less CPU.
pm = static

; The number of child processes to be created when pm is set to 'static' and the
; maximum number of child processes when pm is set to 'dynamic' or 'ondemand'.
; This value sets the limit on the number of simultaneous requests that will be
; served. Equivalent to the ApacheMaxClients directive with mpm_prefork.
; Equivalent to the PHP_FCGI_CHILDREN environment variable in the original PHP
; CGI.
; Note: Used when pm is set to 'static', 'dynamic' or 'ondemand'
; Note: This value is mandatory.
; luroConnect : we set it to slightly more than the cores available for php.
; 5 is ideal for a 4 cores reserved for php.
pm.max_children = 5

; The number of child processes created on startup.
; Note: Used only when pm is set to 'dynamic'
; Default Value: min_spare_servers + (max_spare_servers - min_spare_servers) / 2
pm.start_servers = 5

; The desired minimum number of idle server processes.
; Note: Used only when pm is set to 'dynamic'
; Note: Mandatory when pm is set to 'dynamic'
pm.min_spare_servers = 5

; The desired maximum number of idle server processes.
; Note: Used only when pm is set to 'dynamic'
; Note: Mandatory when pm is set to 'dynamic'
pm.max_spare_servers = 35

; The number of seconds after which an idle process will be killed.
; Note: Used only when pm is set to 'ondemand'
; Default Value: 10s
;pm.process_idle_timeout = 10s;

; The number of requests each child process should execute before respawning.
; This can be useful to work around memory leaks in 3rd party libraries. For
; endless request processing specify '0'. Equivalent to PHP_FCGI_MAX_REQUESTS.
; Default Value: 0
; luroConnect : Magento and wordpress need a limit as inadvertently there are memory leaks
; set this number based on hits and the avg process size that is available through the status page
pm.max_requests = 500

; The URI to view the FPM status page. If this value is not set, no URI will be
; recognized as a status page. It shows the following informations:
;   pool                 - the name of the pool;
;   process manager      - static, dynamic or ondemand;
;   start time           - the date and time FPM has started;
;   start since          - number of seconds since FPM has started;
;   accepted conn        - the number of request accepted by the pool;
;   listen queue         - the number of request in the queue of pending
;                          connections (see backlog in listen(2));
;   max listen queue     - the maximum number of requests in the queue
;                          of pending connections since FPM has started;
;   listen queue len     - the size of the socket queue of pending connections;
;   idle processes       - the number of idle processes;
;   active processes     - the number of active processes;
;   total processes      - the number of idle + active processes;
;   max active processes - the maximum number of active processes since FPM
;                          has started;
;   max children reached - number of times, the process limit has been reached,
;                          when pm tries to start more children (works only for
;                          pm 'dynamic' and 'ondemand');
; Value are updated in real time.
; Example output:
;   pool:                 www
;   process manager:      static
;   start time:           01/Jul/2011:17:53:49 +0200
;   start since:          62636
;   accepted conn:        190460
;   listen queue:         0
;   max listen queue:     1
;   listen queue len:     42
;   idle processes:       4
;   active processes:     11
;   total processes:      15
;   max active processes: 12
;   max children reached: 0
;
; By default the status page output is formatted as text/plain. Passing either
; 'html', 'xml' or 'json' in the query string will return the corresponding
; output syntax. Example:
;   http://www.foo.bar/status
;   http://www.foo.bar/status?json
;   http://www.foo.bar/status?html
;   http://www.foo.bar/status?xml
;
; By default the status page only outputs short status. Passing 'full' in the
; query string will also return status for each pool process.
; Example:
;   http://www.foo.bar/status?full
;   http://www.foo.bar/status?json&full
;   http://www.foo.bar/status?html&full
;   http://www.foo.bar/status?xml&full
; The Full status returns for each process:
;   pid                  - the PID of the process;
;   state                - the state of the process (Idle, Running, ...);
;   start time           - the date and time the process has started;
;   start since          - the number of seconds since the process has started;
;   requests             - the number of requests the process has served;
;   request duration     - the duration in µs of the requests;
;   request method       - the request method (GET, POST, ...);
;   request URI          - the request URI with the query string;
;   content length       - the content length of the request (only with POST);
;   user                 - the user (PHP_AUTH_USER) (or '-' if not set);
;   script               - the main script called (or '-' if not set);
;   last request cpu     - the %cpu the last request consumed
;                          it's always 0 if the process is not in Idle state
;                          because CPU calculation is done when the request
;                          processing has terminated;
;   last request memory  - the max amount of memory the last request consumed
;                          it's always 0 if the process is not in Idle state
;                          because memory calculation is done when the request
;                          processing has terminated;
; If the process is in Idle state, then informations are related to the
; last request the process has served. Otherwise informations are related to
; the current request being served.
; Example output:
;   ************************
;   pid:                  31330
;   state:                Running
;   start time:           01/Jul/2011:17:53:49 +0200
;   start since:          63087
;   requests:             12808
;   request duration:     1250261
;   request method:       GET
;   request URI:          /test_mem.php?N=10000
;   content length:       0
;   user:                 -
;   script:               /home/fat/web/docs/php/test_mem.php
;   last request cpu:     0.00
;   last request memory:  0
;
; Note: There is a real-time FPM status monitoring sample web page available
;       It's available in: @EXPANDED_DATADIR@/fpm/status.html
;
; Note: The value must start with a leading slash (/). The value can be
;       anything, but it may not be a good idea to use the .php extension or it
;       may conflict with a real PHP file.
; Default Value: not set
pm.status_path = /status

; The ping URI to call the monitoring page of FPM. If this value is not set, no
; URI will be recognized as a ping page. This could be used to test from outside
; that FPM is alive and responding, or to
; - create a graph of FPM availability (rrd or such);
; - remove a server from a group if it is not responding (load balancing);
; - trigger alerts for the operating team (24/7).
; Note: The value must start with a leading slash (/). The value can be
;       anything, but it may not be a good idea to use the .php extension or it
;       may conflict with a real PHP file.
; Default Value: not set
ping.path = /ping

; This directive may be used to customize the response of a ping request. The
; response is formatted as text/plain with a 200 response code.
; Default Value: pong
;ping.response = pong

; The access log file
; Default: not set
; luroConenct : access log is useful for measuring CPU utilization percent of a request
; lower percent indicates either high mysql or curl calls
; if no curl calls, check the mysql cache and tune
access.log = log/$pool.access.log

;access.format = %R - %u %t %m %r%Q%q %s %f %{mili}d %{kilo}M %{total}C%%
access.format = %R - %u %t %m %{REQUEST_URI}e %r%Q%q %s %f %{mili}d %{kilo}M %{total}C%%

; The access log format.
; The following syntax is allowed
;  %%: the '%' character
;  %C: %CPU used by the request
;      it can accept the following format:
;      - %{user}C for user CPU only
;      - %{system}C for system CPU only
;      - %{total}C  for user + system CPU (default)
;  %d: time taken to serve the request
;      it can accept the following format:
;      - %{seconds}d (default)
;      - %{miliseconds}d
;      - %{mili}d
;      - %{microseconds}d
;      - %{micro}d
;  %e: an environment variable (same as $_ENV or $_SERVER)
;      it must be associated with embraces to specify the name of the env
;      variable. Some exemples:
;      - server specifics like: %{REQUEST_METHOD}e or %{SERVER_PROTOCOL}e
;      - HTTP headers like: %{HTTP_HOST}e or %{HTTP_USER_AGENT}e
;  %f: script filename
;  %l: content-length of the request (for POST request only)
;  %m: request method
;  %M: peak of memory allocated by PHP
;      it can accept the following format:
;      - %{bytes}M (default)
;      - %{kilobytes}M
;      - %{kilo}M
;      - %{megabytes}M
;      - %{mega}M
;  %n: pool name
;  %o: output header
;      it must be associated with embraces to specify the name of the header:
;      - %{Content-Type}o
;      - %{X-Powered-By}o
;      - %{Transfert-Encoding}o
;      - ....
;  %p: PID of the child that serviced the request
;  %P: PID of the parent of the child that serviced the request
;  %q: the query string
;  %Q: the '?' character if query string exists
;  %r: the request URI (without the query string, see %q and %Q)
;  %R: remote IP address
;  %s: status (response code)
;  %t: server time the request was received
;      it can accept a strftime(3) format:
;      %d/%b/%Y:%H:%M:%S %z (default)
;      The strftime(3) format must be encapsuled in a %{}t tag
;      e.g. for a ISO8601 formatted timestring, use: %{%Y-%m-%dT%H:%M:%S%z}t
;  %T: time the log has been written (the request has finished)
;      it can accept a strftime(3) format:
;      %d/%b/%Y:%H:%M:%S %z (default)
;      The strftime(3) format must be encapsuled in a %{}t tag
;      e.g. for a ISO8601 formatted timestring, use: %{%Y-%m-%dT%H:%M:%S%z}t
;  %u: remote user
;
; Default: "%R - %u %t \"%m %r\" %s"
;access.format = "%R - %u %t \"%m %r%Q%q\" %s %f %{mili}d %{kilo}M %C%%"

; The log file for slow requests
; Default Value: not set
; Note: slowlog is mandatory if request_slowlog_timeout is set
slowlog = log/$pool-slow.log

; The timeout for serving a single request after which a PHP backtrace will be
; dumped to the 'slowlog' file. A value of '0s' means 'off'.
; Available units: s(econds)(default), m(inutes), h(ours), or d(ays)
; Default Value: 0
; luroConnect : use for debugging slow access
; request_slowlog_timeout = 1

; The timeout for serving a single request after which the worker process will
; be killed. This option should be used when the 'max_execution_time' ini option
; does not stop script execution for some reason. A value of '0' means 'off'.
; Available units: s(econds)(default), m(inutes), h(ours), or d(ays)
; Default Value: 0
;request_terminate_timeout = 0

; Set open file descriptor rlimit.
; Default Value: system defined value
;rlimit_files = 1024

; Set max core size rlimit.
; Possible Values: 'unlimited' or an integer greater or equal to 0
; Default Value: system defined value
;rlimit_core = 0

; Chroot to this directory at the start. This value must be defined as an
; absolute path. When this value is not set, chroot is not used.
; Note: chrooting is a great security feature and should be used whenever
;       possible. However, all PHP paths will be relative to the chroot
;       (error_log, sessions.save_path, ...).
; Default Value: not set
;chroot =

; Chdir to this directory at the start.
; Note: relative path can be used.
; Default Value: current directory or / when chroot
;chdir = /var/www

; Redirect worker stdout and stderr into main error log. If not set, stdout and
; stderr will be redirected to /dev/null according to FastCGI specs.
; Note: on highloaded environement, this can cause some delay in the page
; process time (several ms).
; Default Value: no
;catch_workers_output = yes

; Clear environment in FPM workers
; Prevents arbitrary environment variables from reaching FPM worker processes
; by clearing the environment in workers before env vars specified in this
; pool configuration are added.
; Setting to "no" will make all environment variables available to PHP code
; via getenv(), $_ENV and $_SERVER.
; Default Value: yes
;clear_env = no

; Limits the extensions of the main script FPM will allow to parse. This can
; prevent configuration mistakes on the web server side. You should only limit
; FPM to .php extensions to prevent malicious users to use other extensions to
; exectute php code.
; Note: set an empty value to allow all extensions.
; Default Value: .php
;security.limit_extensions = .php .php3 .php4 .php5 .php7

; Pass environment variables like LD_LIBRARY_PATH. All $VARIABLEs are taken from
; the current environment.
; Default Value: clean env
;env[HOSTNAME] = $HOSTNAME
;env[PATH] = /usr/local/bin:/usr/bin:/bin
;env[TMP] = /tmp
;env[TMPDIR] = /tmp
;env[TEMP] = /tmp

; Additional php.ini defines, specific to this pool of workers. These settings
; overwrite the values previously defined in the php.ini. The directives are the
; same as the PHP SAPI:
;   php_value/php_flag             - you can set classic ini defines which can
;                                    be overwritten from PHP call 'ini_set'.
;   php_admin_value/php_admin_flag - these directives won't be overwritten by
;                                     PHP call 'ini_set'
; For php_*flag, valid values are on, off, 1, 0, true, false, yes or no.

; Defining 'extension' will load the corresponding shared extension from
; extension_dir. Defining 'disable_functions' or 'disable_classes' will not
; overwrite previously defined php.ini values, but will append the new value
; instead.

; Default Value: nothing is defined by default except the values in php.ini and
;                specified at startup with the -d argument
;php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i -f www@my.domain.com
;php_flag[display_errors] = off
php_admin_value[error_log] = /var/log/php-fpm/www-error.log
php_admin_flag[log_errors] = on
;php_admin_value[memory_limit] = 128M

; Set session path to a directory owned by process user
php_value[session.save_handler] = files

; Note : Important to make /var/lib/php/session writable by nginx user
php_value[session.save_path]    = /var/lib/php/session
php_value[soap.wsdl_cache_dir]  = /var/lib/php/wsdlcache

SSL Certificate

Getting a https certificate is now easy with letsEncrypt.

The nginx configuration enables http2, which is a newer version of the http protocol. Refer our blog article on advantage of http2 for Magento.

Mysql

There are many flavors available based on the original open source. We use percona mysql. Configurations may vary depend on the version but equivalent should be available.

Database performance depends on cache configuration based on available memory, a fast disk and your Magento indexing settings. Magento sites have a high percentage of reads to writes to the database – i.e. each product is browsed many times compared to the times you change them. As a result, mysql can perform really well as long as the cache hit rates are high. The following parameters determine the cache configuration.

query_cache_type = 1

; set query cache size to total RAM you can assign to query cache. Magento can use a lot of query cache
query_cache_size = 4G

; max size of an individual query that is cached. In Magento the category menu is often the largest query – ensure that is cached
query_cache_limit = 128M

; generally equal to the size of your database – ensures the database is stored in RAM
innodb_buffer_pool_size = 3G

The values depend on your website – such as number of categories displayed in the main menu or the length of the description of a product. The following queries help in determining the cache performance.

show status like 'qcache_hits'
show status like 'Qcache_inserts'
show status like 'Qcache_not_cached'
show status like 'Qcache_lowmem_prunes

Make special note of the parameter Qcache_lowmem_prunes that tells that a query was not cached as there was insufficient memory. This typically means an increase in query_cache_limit or query_cache_size may be required.

Redis Configuration for cache and session

Click here for sample configuration file

# By default Redis does not run as a daemon. Use 'yes' if you need it.
# Note that Redis will write a pid file in /var/run/redis.pid when daemonized.
daemonize yes

# When running daemonized, Redis writes a pid file in /var/run/redis.pid by
# default. You can specify a custom pid file location here.
pidfile /var/run/redis/redis.pid

# Accept connections on the specified port, default is 6379.
# If port 0 is specified Redis will not listen on a TCP socket.
# luroConnect - 6379 for Magento cache, 6380 for session
port 6379

# If you want you can bind a single interface, if the bind option is not
# specified all the interfaces will listen for incoming connections.
#
bind 0.0.0.0

# Specify the path for the unix socket that will be used to listen for
# incoming connections. There is no default, so Redis will not listen
# on a unix socket when not specified.
#
# unixsocket /tmp/redis.sock
# unixsocketperm 755

# Close the connection after a client is idle for N seconds (0 to disable)
timeout 0

# Set server verbosity to 'debug'
# it can be one of:
# debug (a lot of information, useful for development/testing)
# verbose (many rarely useful info, but not a mess like the debug level)
# notice (moderately verbose, what you want in production probably)
# warning (only very important / critical messages are logged)
loglevel notice

# Specify the log file name. Also 'stdout' can be used to force
# Redis to log on the standard output. Note that if you use standard
# output for logging but daemonize, logs will be sent to /dev/null
logfile /var/log/redis/redis.log

# To enable logging to the system logger, just set 'syslog-enabled' to yes,
# and optionally update the other syslog parameters to suit your needs.
# syslog-enabled no

# Specify the syslog identity.
# syslog-ident redis

# Specify the syslog facility.  Must be USER or between LOCAL0-LOCAL7.
# syslog-facility local0

# Set the number of databases. The default database is DB 0, you can select
# a different one on a per-connection basis using SELECT  where
# dbid is a number between 0 and 'databases'-1
# luroConnect : set databases to 1 - we use a separate redis process  for each type
databases 1

################################ SNAPSHOTTING  #################################
#
# Save the DB on disk:
#
#   save  
#
#   Will save the DB if both the given number of seconds and the given
#   number of write operations against the DB occurred.
#
#   In the example below the behaviour will be to save:
#   after 900 sec (15 min) if at least 1 key changed
#   after 300 sec (5 min) if at least 10 keys changed
#   after 60 sec if at least 10000 keys changed
#
#   Note: you can disable saving at all commenting all the "save" lines.

# save 900 1
# save 300 10
# save 60 10000
# luroConnect - for Magento disable save except for session
# without this a restart will cause recently added items to guest cart to disappear
# or a user to get logged out
# session save every minute 
# save 60
# FPC and Magento cache - do not ssave. Cost of saving in a busy site can be high
# redis restarts only on service start which is not often

# Compress string objects using LZF when dump .rdb databases?
# For default that's set to 'yes' as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
rdbcompression yes

# The filename where to dump the DB
dbfilename dump.rdb

# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
# 
# Also the Append Only File will be created inside this directory.
# 
# Note that you must specify a directory here, not a file name.
dir /var/lib/redis/

################################# REPLICATION #################################

# Master-Slave replication. Use slaveof to make a Redis instance a copy of
# another Redis server. Note that the configuration is local to the slave
# so for example it is possible to configure the slave to save the DB with a
# different interval, or to listen to another port, and so on.
#
# slaveof  

# If the master is password protected (using the "requirepass" configuration
# directive below) it is possible to tell the slave to authenticate before
# starting the replication synchronization process, otherwise the master will
# refuse the slave request.
#
# masterauth 

# When a slave lost the connection with the master, or when the replication
# is still in progress, the slave can act in two different ways:
#
# 1) if slave-serve-stale-data is set to 'yes' (the default) the slave will
#    still reply to client requests, possibly with out of data data, or the
#    data set may just be empty if this is the first synchronization.
#
# 2) if slave-serve-stale data is set to 'no' the slave will reply with
#    an error "SYNC with master in progress" to all the kind of commands
#    but to INFO and SLAVEOF.
#
slave-serve-stale-data yes

# Slaves send PINGs to server in a predefined interval. It's possible to change
# this interval with the repl_ping_slave_period option. The default value is 10
# seconds.
#
# repl-ping-slave-period 10

# The following option sets a timeout for both Bulk transfer I/O timeout and
# master data or ping response timeout. The default value is 60 seconds.
#
# It is important to make sure that this value is greater than the value
# specified for repl-ping-slave-period otherwise a timeout will be detected
# every time there is low traffic between the master and the slave.
#
# repl-timeout 60

################################## SECURITY ###################################

# Require clients to issue AUTH  before processing any other
# commands.  This might be useful in environments in which you do not trust
# others with access to the host running redis-server.
#
# This should stay commented out for backward compatibility and because most
# people do not need auth (e.g. they run their own servers).
# 
# Warning: since Redis is pretty fast an outside user can try up to
# 150k passwords per second against a good box. This means that you should
# use a very strong password otherwise it will be very easy to break.
#
# requirepass foobared

# Command renaming.
#
# It is possilbe to change the name of dangerous commands in a shared
# environment. For instance the CONFIG command may be renamed into something
# of hard to guess so that it will be still available for internal-use
# tools but not available for general clients.
#
# Example:
#
# rename-command CONFIG b840fc02d524045429941cc15f59e41cb7be6c52
#
# It is also possilbe to completely kill a command renaming it into
# an empty string:
#
# rename-command CONFIG ""

################################### LIMITS ####################################

# Set the max number of connected clients at the same time. By default there
# is no limit, and it's up to the number of file descriptors the Redis process
# is able to open. The special value '0' means no limits.
# Once the limit is reached Redis will close all the new connections sending
# an error 'max number of clients reached'.
#
# maxclients 128

# Don't use more memory than the specified amount of bytes.
# When the memory limit is reached Redis will try to remove keys
# accordingly to the eviction policy selected (see maxmemmory-policy).
#
# If Redis can't remove keys according to the policy, or if the policy is
# set to 'noeviction', Redis will start to reply with errors to commands
# that would use more memory, like SET, LPUSH, and so on, and will continue
# to reply to read-only commands like GET.
#
# This option is usually useful when using Redis as an LRU cache, or to set
# an hard memory limit for an instance (using the 'noeviction' policy).
#
# WARNING: If you have slaves attached to an instance with maxmemory on,
# the size of the output buffers needed to feed the slaves are subtracted
# from the used memory count, so that network problems / resyncs will
# not trigger a loop where keys are evicted, and in turn the output
# buffer of slaves is full with DELs of keys evicted triggering the deletion
# of more keys, and so forth until the database is completely emptied.
#
# In short... if you have slaves attached it is suggested that you set a lower
# limit for maxmemory so that there is some free RAM on the system for slave
# output buffers (but this is not needed if the policy is 'noeviction').
#
# maxmemory 

# MAXMEMORY POLICY: how Redis will select what to remove when maxmemory
# is reached? You can select among five behavior:
# 
# volatile-lru -> remove the key with an expire set using an LRU algorithm
# allkeys-lru -> remove any key accordingly to the LRU algorithm
# volatile-random -> remove a random key with an expire set
# allkeys->random -> remove a random key, any key
# volatile-ttl -> remove the key with the nearest expire time (minor TTL)
# noeviction -> don't expire at all, just return an error on write operations
# 
# Note: with all the kind of policies, Redis will return an error on write
#       operations, when there are not suitable keys for eviction.
#
#       At the date of writing this commands are: set setnx setex append
#       incr decr rpush lpush rpushx lpushx linsert lset rpoplpush sadd
#       sinter sinterstore sunion sunionstore sdiff sdiffstore zadd zincrby
#       zunionstore zinterstore hset hsetnx hmset hincrby incrby decrby
#       getset mset msetnx exec sort
#
# The default is:
#
# maxmemory-policy volatile-lru

# LRU and minimal TTL algorithms are not precise algorithms but approximated
# algorithms (in order to save memory), so you can select as well the sample
# size to check. For instance for default Redis will check three keys and
# pick the one that was used less recently, you can change the sample size
# using the following configuration directive.
#
# maxmemory-samples 3
# luroConnect - set maxmemory depending on use and requirement
# typically for Magento cache and session do not set but monitor
# for FPC set to the max available memory as it tends to grow
# if maxmemory is set, use allkeys-lru pollicy
maxmemory 2G
maxmemory-policy allkeys-lru

############################## APPEND ONLY MODE ###############################

# By default Redis asynchronously dumps the dataset on disk. If you can live
# with the idea that the latest records will be lost if something like a crash
# happens this is the preferred way to run Redis. If instead you care a lot
# about your data and don't want to that a single record can get lost you should
# enable the append only mode: when this mode is enabled Redis will append
# every write operation received in the file appendonly.aof. This file will
# be read on startup in order to rebuild the full dataset in memory.
#
# Note that you can have both the async dumps and the append only file if you
# like (you have to comment the "save" statements above to disable the dumps).
# Still if append only mode is enabled Redis will load the data from the
# log file at startup ignoring the dump.rdb file.
#
# IMPORTANT: Check the BGREWRITEAOF to check how to rewrite the append
# log file in background when it gets too big.

appendonly no

# The name of the append only file (default: "appendonly.aof")
# appendfilename appendonly.aof

# The fsync() call tells the Operating System to actually write data on disk
# instead to wait for more data in the output buffer. Some OS will really flush 
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log . Slow, Safest.
# everysec: fsync only if one second passed since the last fsync. Compromise.
#
# The default is "everysec" that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# If unsure, use "everysec".

# appendfsync always
appendfsync everysec
# appendfsync no

# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving the durability of Redis is
# the same as "appendfsync none", that in pratical terms means that it is
# possible to lost up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
# 
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.
no-appendfsync-on-rewrite no

# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size will growth by the specified percentage.
# 
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (or if no rewrite happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a precentage of zero in order to disable the automatic AOF
# rewrite feature.

auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

################################## SLOW LOG ###################################

# The Redis Slow Log is a system to log queries that exceeded a specified
# execution time. The execution time does not include the I/O operations
# like talking with the client, sending the reply and so forth,
# but just the time needed to actually execute the command (this is the only
# stage of command execution where the thread is blocked and can not serve
# other requests in the meantime).
# 
# You can configure the slow log with two parameters: one tells Redis
# what is the execution time, in microseconds, to exceed in order for the
# command to get logged, and the other parameter is the length of the
# slow log. When a new command is logged the oldest one is removed from the
# queue of logged commands.

# The following time is expressed in microseconds, so 1000000 is equivalent
# to one second. Note that a negative number disables the slow log, while
# a value of zero forces the logging of every command.
slowlog-log-slower-than 10000

# There is no limit to this length. Just be aware that it will consume memory.
# You can reclaim memory used by the slow log with SLOWLOG RESET.
slowlog-max-len 1024

################################ VIRTUAL MEMORY ###############################

### WARNING! Virtual Memory is deprecated in Redis 2.4
### The use of Virtual Memory is strongly discouraged.

# Virtual Memory allows Redis to work with datasets bigger than the actual
# amount of RAM needed to hold the whole dataset in memory.
# In order to do so very used keys are taken in memory while the other keys
# are swapped into a swap file, similarly to what operating systems do
# with memory pages.
#
# To enable VM just set 'vm-enabled' to yes, and set the following three
# VM parameters accordingly to your needs.

vm-enabled no
# vm-enabled yes

# This is the path of the Redis swap file. As you can guess, swap files
# can't be shared by different Redis instances, so make sure to use a swap
# file for every redis process you are running. Redis will complain if the
# swap file is already in use.
#
# The best kind of storage for the Redis swap file (that's accessed at random) 
# is a Solid State Disk (SSD).
#
# *** WARNING *** if you are using a shared hosting the default of putting
# the swap file under /tmp is not secure. Create a dir with access granted
# only to Redis user and configure Redis to create the swap file there.
vm-swap-file /tmp/redis.swap

# vm-max-memory configures the VM to use at max the specified amount of
# RAM. Everything that deos not fit will be swapped on disk *if* possible, that
# is, if there is still enough contiguous space in the swap file.
#
# With vm-max-memory 0 the system will swap everything it can. Not a good
# default, just specify the max amount of RAM you can in bytes, but it's
# better to leave some margin. For instance specify an amount of RAM
# that's more or less between 60 and 80% of your free RAM.
vm-max-memory 0

# Redis swap files is split into pages. An object can be saved using multiple
# contiguous pages, but pages can't be shared between different objects.
# So if your page is too big, small objects swapped out on disk will waste
# a lot of space. If you page is too small, there is less space in the swap
# file (assuming you configured the same number of total swap file pages).
#
# If you use a lot of small objects, use a page size of 64 or 32 bytes.
# If you use a lot of big objects, use a bigger page size.
# If unsure, use the default :)
vm-page-size 32

# Number of total memory pages in the swap file.
# Given that the page table (a bitmap of free/used pages) is taken in memory,
# every 8 pages on disk will consume 1 byte of RAM.
#
# The total swap size is vm-page-size * vm-pages
#
# With the default of 32-bytes memory pages and 134217728 pages Redis will
# use a 4 GB swap file, that will use 16 MB of RAM for the page table.
#
# It's better to use the smallest acceptable value for your application,
# but the default is large in order to work in most conditions.
vm-pages 134217728

# Max number of VM I/O threads running at the same time.
# This threads are used to read/write data from/to swap file, since they
# also encode and decode objects from disk to memory or the reverse, a bigger
# number of threads can help with big objects even if they can't help with
# I/O itself as the physical device may not be able to couple with many
# reads/writes operations at the same time.
#
# The special value of 0 turn off threaded I/O and enables the blocking
# Virtual Memory implementation.
vm-max-threads 4

############################### ADVANCED CONFIG ###############################

# Hashes are encoded in a special way (much more memory efficient) when they
# have at max a given numer of elements, and the biggest element does not
# exceed a given threshold. You can configure this limits with the following
# configuration directives.
hash-max-zipmap-entries 512
hash-max-zipmap-value 64

# Similarly to hashes, small lists are also encoded in a special way in order
# to save a lot of space. The special representation is only used when
# you are under the following limits:
list-max-ziplist-entries 512
list-max-ziplist-value 64

# Sets have a special encoding in just one case: when a set is composed
# of just strings that happens to be integers in radix 10 in the range
# of 64 bit signed integers.
# The following configuration setting sets the limit in the size of the
# set in order to use this special memory saving encoding.
set-max-intset-entries 512

# Similarly to hashes and lists, sorted sets are also specially encoded in
# order to save a lot of space. This encoding is only used when the length and
# elements of a sorted set are below the following limits:
zset-max-ziplist-entries 128
zset-max-ziplist-value 64

# Active rehashing uses 1 millisecond every 100 milliseconds of CPU time in
# order to help rehashing the main Redis hash table (the one mapping top-level
# keys to values). The hash table implementation redis uses (see dict.c)
# performs a lazy rehashing: the more operation you run into an hash table
# that is rhashing, the more rehashing "steps" are performed, so if the
# server is idle the rehashing is never complete and some more memory is used
# by the hash table.
# 
# The default is to use this millisecond 10 times every second in order to
# active rehashing the main dictionaries, freeing memory when possible.
#
# If unsure:
# use "activerehashing no" if you have hard latency requirements and it is
# not a good thing in your environment that Redis can reply form time to time
# to queries with 2 milliseconds delay.
#
# use "activerehashing yes" if you don't have such hard requirements but
# want to free memory asap when possible.
activerehashing yes

Varnish vs Redis FPC

Varnish sits at the edge and caches pages as they are served. Redis FPC sits inside the Magento application and stores the pages as they are served.
A Varnish application gets the variable components either via VCL or ajax. FPC being inside the application can use the block directly inside the page.

Magento 2 supports both varnish and FPC out-of-the-box. Varnish outperforms redis FPC but takes a bit more chutzpah to get working with https. Magento has a good explanation on the setup at https://devdocs.magento.com/guides/v2.2/config-guide/varnish/config-varnish.html

Setting up the hosting user

Magento suggests using a 2-user system for hosting. What this means is that the php files are not writable by the web server or the php process. Create a user that we will call as the hosting user and add nginx to the hosting user’s group

useradd -G nginx <hostinguser>

Setup directory www with Magento.

Ensure media and var folders have permissions for group write but other folders do not

cd /home/;/www
find . -type d -exec chmod 750 {} \;
find . -type f -exec chmod 640 {} \;
find ./media -type d -exec chmod 770 {} \;
find ./media -type f -exec chmod 660 {} \;
find ./var -type d -exec chmod 770 {} \;
find ./var -type f -exec chmod 660 {} \;

Conclusion

We have presented with detailed configuration files in a DIY guide to Magento hosting. If you need to setup multiple servers, the basic idea is similar some configurations may have to change.

Feel free to post your comments and we will improve this guide.

We can analyze your site for free

Schedule a call

Not happy with your website performance and want an expert to look at it?

  • We will analyze your site using public information.
  • We will ask you to give us a 1 day web server log file.
  • We will try to identify what steps if any you should take to improve your sites performance goals.

Watch our webinar on performance and scaling in Magento

Its free!

Using analogy to vehicular traffic we explain performance and scaling in Magento.
Key takeaways

  • Know how to compare hosting options
  • Importance of good code
  • How to scale
  • Tuning Magento

Enterprise grade security for your Magento website

At a time when data theft and leaks are more real than ever, don’t be the next one to explain the reason for a security breach. With threats growing in number, type and modality of attack every day, multiple layers of security are needed, so if one layer were to be breached, another would yet protect.

Introduction

Today, security of a Magento web server is a major requirement for the success of the business that runs on it. With threats growing in number, type and modality of attack every day, multiple layers of security are needed, so if one layer were to be breached, another would yet protect.
With attackers using automated scanners or bots, being small or infamous is no reason to feel secure. Attackers will probe for vulnerabilities and break in before deciding if you had anything of worth.

In this article we discuss various aspects of security and how luroConnect helps in getting your website an enterprise grade security.

Securing the Magento Application

Magento patch updates

Magento is a well maintained software. Magento routinely releases patches even for the community edition. Keeping patches uptodate is important. When magento releases a patch, while it helps you keep your site secure, it also announces vulnerabilities of an unpatched system to the world.
A major hurdle in patch application is when the Magento core files have been modified by developers. Developers do this in answer to high time and cost pressures or for not knowing better.

luroConnect service always patches customers at the earliest — if you have not opted out of the service.

Purchasing plugins from dependable vendors

While Magento is good at giving updates using patches, plugin vendors are not so motivated. Plugins are paid for once and quite often updates require one to be on a maintenance plan. So, while selecting a plugin vendor, keeping the vendors track record would be good idea in the future.

There are many areas where vendors fail to think about security and the following are but a few examples.

Magmi is popular extension for product upload. However, it is known to have security issues. The developer acknowledges this and has advised not using the web interface. In our research and experience, most Magmi users use their UI, leaving security holes open. luroConnect makes magmi installation more secure by providing htaccess based second login, securing the magmi installation with permissions that disallow updates to this from the UI as well as disallowing php execution from Magento’s media folders.

Some extensions require ability to write to folders and files that are unreasonable. With default access this seems to work but opens security holes. In our opinion rather than store plugin configurations in the database (core_config_data), plugins use files, sometimes generating php code. An example of a plugin (Aitoc_AdvancedPermissions) requires ability to change the etc/modules/Aitoc_AltPermissions.xml file.

Some extensions use encrypted code — typically requiring ion decrypter. Apart from being legally on the fence, the requirement of installing a decrypter increases the risk of encrypted malware, making it difficult to detect the existence of malware, even with disk scans.

PhpMyAdmin is another popular extension that makes a site unsecure. PhpMyAdmin uses the database username and password. If accessed from a unsecured url, the credentials of the database are transmitted over the internet in plain text form. Even with a secure URL, the database is left to be probed via username / password detection BOT that can try combinations.

Magento’s open architecture has made a marketplace for developers. Custom development is crucial to the Magento’s success. Custom development from vendors who are aware of security is also crucial. Some aspects that we recommend asking developers include awareness in use of form keys, sql injection and avoidance techniques — specifically validating all input parameters before processing and use of Magento’s model architecture vs writing sql.

Securing the Magento admin panel

Using host file entries instead of DNS update for non public sites

DNS entries are the starting point for all BOTs. Windows, Macs and linux allow an entry in the host file that helps the browser resolve a domain name (such as staging.domain.com or phpMyAdmin.domain.com) to a IP address. While not easily doable on a mobile for testing, we encourage use of host file entries to hide the actual domain name used.

2-password Secure admin access

Using htaccess (even in nginx) to protect admin URLs with a double authentication system, with both passwords being different. This thwarts most BOTs as they do not expect a basic authentication.

Changing the admin url, possibly once a year.

Magento allows obfuscation of the admin url. Selecting a non obvious url for admin and changing it as often as practical helps in preventing leakage of the url. In addition using a different admin url for development and staging compared to production will help keeping the URL on a need to know basis.

Managing roles in admin

Magento admin supports role based access. Giving roles on a need to know basis will protect against a threat popularly referred as “malicious insiders”.

Use of captcha in admin and end user login

Magento supports enabling captcha in admin. That ensures BOTs do not access the site and it thwarts possible username / password combination hacks.

Using strong admin passwords and change them often.

This is general advice for any password. Changing them often is also a general advice.

Hiding Administrators role.

Recently admin logins have been used to insert malicious code in header or footer “miscellaneous” html section. That has prompted us to advice to hide the administrator role – i.e. have no user with the permission to change that. When needed the role can be enhanced temporarily to access that setting.

Database containment and Filesystem Security

Limited privileges to write to code directories

System level file, directory (folder) and process permissions are crucial. A different hosting user and web server user with limited permissions to the web server user is crucial to protect code directories. The hosting user can update the code but the web user is not able — preventing any attempt to upload code.

Plugin and code updates only allowed via deployment process

The deployment process may be as simple as git pull (we prefer git stash before git pull) or a CI based mechanism such as Jenkins.

Readonly git access to live site

If git pull is used to update a site, it is important to ensure that the git user used has readonly access. Even using a CI ensure the git user has readonly access.

Configuration file protection against unauthorized changes

Ensuring site specific configuration files such as env.php or local.xml are in their own git and versioned ensures a protected master copy. Monitoring changes to these files and alerting would be a good way to ensure their security.

Only authorized access to code

The webserver should not have permission to write to the code directory. That requires plugin updates via download manager are done on a staging or development server and code moved via git. Ensuring automation of code deployment will ensure only controlled access to the code.

Watch our webinar on performance and scaling Magento

Its free!

Using analogy to vehicular traffic we explain performance and scaling in Magento.
Key takeaways

  • Know how to compare hosting options
  • Importance of good code
  • How to scale
  • Tuning Magento

Security at the edge — before traffic hits your application

Nginx has many features that allow a very decent edge security — including protection against basic DOS attacks. However, one has to understand the terse configuration options and deflecting a DOS attack may require some manual steps during the attack. There are more automated solutions that one can consider for edge security.

Rate limiting to block bots from accessing the site — with whitelist

Nginx supports rate limiting. Rates can be set for a subset of URLs. Whitelisting of IPs is also allowed so for example your own IP need not have the limitation. Rates are configurable and can be specified in hits per seconds or minute and are best tested for a site for a appropriate number, and then monitored. luroConnect / Insight dashboard shows all IPs that were blocked helping in deciding to either block the IP or add it to whitelist.

Automatic blocking of spambots by whitelisting bots

Nginx also supports an ability to block BOTs using regular expressions for the User Agent identify text each BOT gives. Note that User Agents are easily faked, so this is not a very reliable way of detecting BOTs.

Blacklist IPs from site access

Nginx allows blacklisting IPs from accessing a site. A more advanced geoip resitrction can also be availed. luroConnect / Insight dashboard gives top 10 IPs accessing the site on any particular day. This information can be used to decide which IPs need to be blocked. Note that attackers quite often do not use static IPs, so the banned list needs periodic review or release from blacklist.

Preventing php scripts from running in media folder

Media folders will be open to update and write by the web process. Restricting execution of php on these folders will prevent a malicious upload to execute.

Encryption of user data

SSL certificate

Being fully secure (using https over TLS) is becoming a requirement with Google’s ranking and expected indication. However, Magento has secure and unsecure URLs and by default all user data uses a secure URL. With LetsEncrypt project issuing auto renewable SSL certificates for free, there is no reason for a site not be fully secure. luroConnect service includes free LetsEncrypt certificates and auto redirect rules for http to https.

Securing backup

Security is as strong as the weakest link — securing the backup

Backup file transmission encrypted

Securing backup is an essential component of securing data. Using sftp or rsync over ssh one can ensure backup files are transmitted securely.

Backup server shut after backup — ensure the backup is secure

Backup data can be secured at rest. luroConnect goes a step further — backup server is different for each customer and run only when needed. The server is started before a backup starts and shut down at the end of the backup, ensuring its security.

Disk scans

Scanning disks for known vectors regularly is important.

OS Security

Patch updates

Securing the server is ensured by patching the server with latest security updates and ensuring only required ports are open ensures security.

Enable logs and monitor them

Logs for outgoing emails possibly with subject line, mysql, web server, Magento logs should be enabled and monitored either automatically and continuously or manual and periodically. For example a spam email hack into the site may have left a malicious script that will send out emails. These emails and subject lines will be in the mail log. A high count of emails for example can be a reason to suspect a compromised system.

Conclusion – Securing Magento requires many steps

With new and sophisticated attacks on the rise, having multiple layers is important — a layer may occasionally break but the harm may not come as another layer will kick in. Nothing less than an entire set of enterprise grade security will be needed to secure a Magento site so it is always up for business. luroConnect managed hosting gives you all this and more.

For a study of recent Web security breaches, read our article “Analysis of Recent Website Security Breaches

luroConnect is a managed hosting service for Magento, wherever you are hosted. Multi layered security is one of the aspects.

6 tips to speedup Magento websites

Introduction

Website speed is very crucial to conversions. There are many studies that indicate how website slowdown of even one second can cause drop in conversions. And websites that have taken this learning have seen conversions grow. Take Pier 1 for example in this article.

Our tips are oriented towards what you need to do on the server side to improve performance especially during high traffic times, when server response times tend to be inconsistent. These tips will speedup Magento websites.

Tip # 1 : Restrict Bots

When times are a busy, you need to be mindful of who you allow into your store. Restricting BOTs is easy to implement by checking the ‘user agent” field. With lowering of server load this will speedup Magento or help reduce resources.

Tip # 2 : Rate limit hits from the same IP

This is one more reduce tip that may work in your situation. Hits from the same IP – until your target audience is a group that is behind a firewall – will restrict simple robotic attempts to crowd your site – especially if you are running a time bound flash sale.

Bonus Tip : Use CDN to serve static content

During peak traffic, CDNs help in offloading not (just) servers but actually network bottlenecks of your servers. This causes a speedup in Magento by reducing the load on the entire system.

Tip # 3 : Use php7 instead of php5.x

php7 performance is almost 50% higher than php5. We have seen this in processing CPU intensive Magento hits. But, Magento 1.x does not officially support php7 and 3rd party patches. We like the one from inchoo. Magento 2 should be on php7.

Tip # 4 : Consider adding separate servers or pools for your checkout flow (or admin access)

While site performance is important, checkout flow performance is even more important as it is on the business end of your sales funnel. Consider adding a separate checkout servers or pools. They are akin to High Occupancy (HOV) or diamond lanes in traffic. This helps to speedup Magento’s checkout process.

The same concept can be applied to restrict number of resources you reserve for admin access.

Tip # 5 : Manage Magento Indexing

Leaving Magento Indexes to “Update on Save” can get your newly uploaded products available for shopping faster, but can slow down the site! If business permits, move indexing to slightly low volume times of the day (or night). This setting leads directly to speedup Magento front end.

Tip # 6 : Monitor the performance of your caches

A Magento site has many caches and each one helps in reducing the server load and improve response times. But you need to monitor for hit to miss ratios and out of memory conditions. This achieves a speedup in Magento as more cache hits mean faster response. The caches one should look at are php opcache, Magento block cache, FPC (Full Page Cache) and mysql cache.

Conclusion

By reducing, redirecting and monitoring server, the overall performance of a website can be improved.

We can analyze your site for free

Schedule a call

Not happy with your website performance and want an expert to look at it?

  • We will analyze your site using public information.
  • We will ask you to give us a 1 day web server log file.
  • We will try to identify what steps if any you should take to improve your sites performance goals.

Watch our webinar on performance and scaling in Magento

Using an analogy to vehicular traffic we explain performance and scaling in Magento.
Key takeaways

  • Know how to compare hosting options
  • Importance of good code
  • How to scale
  • Tuning Magento

Case Study : How bad code can hurt performance and even break

Many websites, notably eCommerce and WordPress, live with bad code. They end up spending more effort in improving performance by adding more resources. It is like adding more horsepower to an inefficient car engine – will consume more gas and in the long run fixing the engine would be a better thing.

Not all such attempts will succeed, however. Bad code takes many forms. There are times when your application is dependent on an external service. The problem comes when this service is executed inline when your visitor is waiting for your server to respond. The two popular dependencies are sending email (smtp) and curl calls for external http access. (curl is a popular library to access remote servers programmatically).

In this article we will analyse a bad code example we found in production. A customer in India (we manage the Magento site hosting) raised a ticket that their international shipping costs stopped working. Since we take care of their release process, we investigated to see if this was an inadvertent code change that made it to production. We could find no change. Infact, there was no release made from the day the problem started. So, with permission from the customer, we investigated further.

The bad code

They had overridden the default table shipping with custom logic. The primary reason was that they wanted to give shipping costs in USD for US customers. Magento has concepts of base currency and display currency. All calculations are done in base currency (in this case INR) while the charge was in display currency (USD).

 ...
    $to_Currency = urlencode($to_Currency);
    $url = "http://www.google.com/finance/converter?a=$amount&from=$from_Currency&to=$to_Currency";
    $ch = curl_init();
    $timeout = 0;
    curl_setopt ($ch, CURLOPT_URL, $url);
    curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
 ...

I wondered why it took over 4 years since the bad code was written to break. The reason it finally broke was that google took the URL out of service. The code below would return 0 as it did not find the appropriate tag in the output. So, all shipping was free.

Analyziing the bad code

  • Instead of using Magento’s functions for currency conversion, the developer thought of
    • Accessing the google finance page
    • Scraping the HTML output
  • Might be in violation of Google’s terms of service (programmatic access)
  • The customer has a one step checkout plugin – this code gets called each step, when anything changes. This is slowing down the checkout page.
  • There is no retry or reporting of failure
  • The conversion rate used for shipping may be different for the products, as the site conversion rates are updated daily.

Conclusion

Using curl in code that is accessed as the visitor waits is a bad idea. Apart from slowing down user interaction, the probability of an error increases and then the error has to be handled appropriately.

A non technical guide to scaling Magento (or any other website)

Performance & Scaling of a Magento web site are often confused. As a store owner who may not be technical a close analogy with real life will help in talking to your hosting providers and other experts.

It is no coincidence that hits to a website is called as traffic. We take this analogy further, to explain what factors matter to performance and scaling a website.

Website performance is like a car – higher performance cars drive faster and can cover a distance in a shorter time. Similarly, a higher performant website will serve a page fast. This is often measured as page load speed. A critical component of page load that the server is responsible for, is server response time. Like measuring performance of a car, measuring the page load speed is done in test mode with little or no traffic. Sometimes the performance is measured at a random time without looking at other traffic to the site. That is like test driving a car through traffic.

Scaling is like building highways and roads for the cars to move on. Highways are resources – CPU, memory, network that the hits to a website will utilize. The task of a Magento scaling expert is to architect a system – servers and sizes, services to run on each server, connectivity of the servers and access from internet, etc.

Hits to a website is like traffic of random cars on the highway. Each vehicle seems to have a mind of its own, joining and leaving the highway. Each visitor to the website will take their own journey visiting different pages.

Some Observations

Observation 1 : Like a car cannot drive at its highest speed possible at all times due to traffic, a website too cannot perform at its best best all the time. Understanding the factors that make the website perform at its optimal level all the time would be the task of both the developer and the server architect.

Observation  2: Like in traffic we have vehicles of different performance, in a website all URLs do not perform equally. A category page may not perform the same way as product detail page for example.

Observation 3: Better throughput will be achieved with the same resources, if the vehicle performance is improved – some bottlenecks can be avoided if the vehicles moved faster. Similarly, a better performing website is likely to scale better.

Observation  4: Like in traffic in order to scale one has to find the bottleneck in the highway that is causing the current slowdown, fix it and then look for the next bottleneck. This is a change in the hosting infrastructure and architecture, different from the website performance.

Observation  5: A traffic designers job is to ensure maximum number of vehicles can pass the highway at the best speed for each vehicle. A hosting designers job is to ensure maximum traffic is handled in a way that each hit is best served.

What lessons can we learn from traffic management

Lesson 1: To better manage traffic highway system has to be designed that is scalable. Mostly by bottleneck analysis we can derive what needs to be done. For example, is database a bottleneck, is file system access a bottleneck, etc.

Lesson 2: When traffic increases, possibly beyond the capacity of the highway, traffic management has to account for one more variable – starvation. The amount of time a vehicle has to wait at a metered light to enter the highway. The longer the wait, more frustration from the drivers who will find a better route to their destination.

Lesson : On a highway lanes are drawn. A better hosting will make lanes. The way most hosting providers take traffic is analogous to not having lanes with the hope that the maximun throughput will be achieved by letting hits contend for resources. The operating system stands to decide what process gets to use resources.

What are the recommended steps to achieve scaling?

As a first step to server side scaling, we move the database layer out to another instance or server. The main reason is that it is better to allocate resources in a single server when the workload is similar.

In our multi part series we take you through achieving scaling. The series is aimed at a store owner who need not be technical but is ultimately responsible to take a decision on the store. Until now you had to depend on an expert. However, there are no clear answers and the expert is making judgement calls based on most likely their prior experience. As a matter of fact no 2 webstores so results of efforts vary. This series will make you better informed.

We start by looking at a popular form of scaling – using FPC or Full Page Cache and other types of caches.

In order to help with scale, another important aspect is code quality specifically related to scaling. Scaling is difficult to achieve reliably if there is any externally dependent blocking service executed as part of the hit. Examples include sending email directly to a recipient or a external service, sending information from the server to an external service. All such processing should be done with some form of a queue handled either by an different process such as a cron job. Until Magento 1.9.2.4, the default email sending was inline for example, slowing the order success page being shown.

Autoscaling adds and removes servers (and hence resources) – something traffic managers cannot do with highways. This gives website scaling an advantage to be more elastic.

Magento Caches – Scaling Magento Part 2

Introduction

Caches are an integral part of strategy of performance and scaling a Magento website. A Cache is the term used to describe a mechanism to store calculated values such as a query or HTML so a subsequent request does not need to recalculate.

While Part 1 looked at FPC specifically, in this article we review all other caches needed for a Magento store. Some of these are commonly found to be referred in best practices, but a deeper understanding will help put them in perspective of Magento.

In general, for any cache the factors to consider are

  • Effectiveness – measured in terms of impact on page speed. Caches should be highly effective.
  • Invalidation flexibility – is an inbuilt automatic mechanism available? Is the mechanism too aggressive? A High score means the mechanism is very flexible.
  • Performance of miss – if a miss were to occur what would be the performance in terms of page speed. A low score represents bad performance.
  • Cost of refill –how much would it cost to refill the cache completely to be back in its most effective state?
  • Hit ratio achievable – In a real world environment if hit ratio were to be measured periodically what can be expected. A high would be over 99% hits over a reasonable period such as a day.
  • Memory required : How much memory is needed on the server side to cache the content.
Cache Type Effectiveness Invaliation Flexibility Peformance of miss cost to refill Hit ratio achievable Memory required
Browser High High Low Moderate High N/A
CDN High Low Moderate Moderate High N/A
FPC High Low Low High Moderate High
HTML Block Cache Moderate High Moderate Moderate High Low
Mysql cache High Low Low High High Moderate

Browser cache

  • Easy to setup – at the hosting level. Static resources like css, js and images should have caching enabled.
  • There is no need for invalidation as the browser checks for each cached resource if a newer version is available.
  • The ability of browsers to find changed resources means cacheable resources should have a very long expiry – say over a year.
  • Browser’s request for detecting change has a performance impact based on number of resources.
  • A key question to be answered is how merging css and js files impact browser caching performance. If merging is enabled, each page type (home, category, product, CMS) are likely to have different css and js files. Not merging allows the browser to cache individual files and reuse them across pages. However, in HTTP/1 browsers limit the number of connections to each domain. So, the advice is to not merge css and js files if using HTTP/2 or splitting domains for skin and js if using HTTP/1.

CDN

  • CDN networks cache at edge servers closer to users – reducing round trip latency from browser to server.
  • CDN also offloads the server from serving static cacheable resources, improving network performance of the server as well as freeing up server CPU to serve dynamic content.
  • CDNs may also take the load of SSL validation however, caution is needed here as the traffic between CDN and server may be unsecure making the site vulnerable to some type of attacks.
  • CDNs are notorious for invalidations – some charge for APIs, others take a few minutes before the invalidations are effective across all edge points. When evaluating a CDN, this is a key factor that is not evaluated.
  • Having many edge locations may not be a good thing – as each edge records the first access to a resource as a miss
  • While a single miss is easily retrieved from the server or backup store, a full invalidation requires multiple GBs to be transferred to make the cache effective again
  • CDNs when full give great performance benefit on page load times
  • Modern CDNs like section.io can also do FPC (html) caching using a distributed varnish cache architecture.

FPC

We have reviewed aspects of FPC in part 1 here.

Recap

  • FPC caches full HTML pages – except for variable content
  • Excellent for caching dynamic content
  • Requires very high amount of RAM on the server side
  • Depending on the quality of code, FPC invalidation or miss can have impact on resource utilization
  • Best implemented with autoscale – so servers are added automatically when cache is invalidated

HTML Block Cache

  • Also caches HTML but cached at the block level. Magento uses HTML blocks for building a page.
  • Since blocks may be shared across pages, these blocks do not have a high impact on invalidation as they will be regenerated once and used multiple times
  • Can dramatically improve performance if used consistently and correctly.
  • Needs developer help as many blocks are not cached by default. To cache a block, one needs a unique key that correctly identifies its variation. Check this technical blog info.
  • Invalidation can be either via a key or time (TTL). If using a key, developer needs to write appropriate event callbacks to detect change.
  • Examples of major speed up include home page blocks where latest products are shown. Depending on frequency of store updated, a 10 minute to 1 hour TTL on the block will result in dramatic improvement of home page speed.

We can analyze your site for free

Schedule a call

Do you know how your website performs and want an expert to look at it?

  • We will analyze your site using public information.
  • We will run a synthetic test from the internet.
  • We will ask you to give us a 1 day web server log file.
  • We will also try to identify what steps if any you should take to improve your sites performance goals.

mysql cache

Mysql can store result queries in memory. The amount of memory is specified in my.cnf as a combination of query_cache_limit (max memory for a single query result) and query_cache_size (max memory all cached queries).

  • mysql automatically invalidates a cached query if any table that was used in the query changes.
  • to access the cache mysql uses a lock on a table thereby reducing the effectiveness of the cache
  • Many mysql articles recommend smaller cache values due to the lock problem but it is best to test the size of the cache for your situation. It is best to monitor “low mem prunes” and “table locks wait”. The first gives the number of times a query was cached by removing another and the latter gives the number of times table lock was not immediately available. Both should be as low as possible.

There are other caches too

  • Php opcache
    php code is read and converted to “opcodes” which are then interpreted. An opcode cache stores the opcodes reducing one step each time. Opecode cache should have sufficient RAM and keys (equal to number of php files).
    Using opcache.php the status should be checked regularly.
    It is best to leave invalidation automatic so when the underlying php file changes the opcode cache is invalidated.
  • Operating system cache
    linux has an excellent file system cache – whenever additional RAM is available, linux will store opened files in RAM. This is important if for example you do not have a CDN. Files will actually be retrieved most likely from RAM rather than disk.
  • Magento has caches other than block. These should always be enabled on a production server.

Conclusion

Setting up caches is crucial for achieving performance in Magento. But with so many caches and so much advice there is overload and scatter of information. In the first 2 parts of our Magento scaling series, we have tried to pull out the important points.

Scaling Magento Series

Performance & Scaling of a Magento web site are often confused. As a store owner who may not be technical a close analogy with real life will help in talking to your hosting providers and other experts. It is no coincidence that hits to a website is called as traffic!

Performance of a website is like a car – higher performance cars drive faster and can cover a distance in a shorter time. Similarly, a higher performant website will serve a page fast. This is often measured as page load speed. A critical component of page load that the server is responsible for is server response time. Like measuring performance of a car, measuring the page load speed is done in test mode with little or no traffic. Sometimes the performance is measured at a random time without looking at other traffic to the site. That is like test driving a car through traffic.

Core Infrastructure is like an engine – Better CPU with L2/L3 cache, faster memory, better disk performance will improve the engine you use. There are simple commands in linux and ways to find this information. Refer here for CPU performance, memory performance and disk performance.

Good code is like good fuel – Just as bad fuel will hurt the performance of a car, bad code will hurt the performance of a website. Refer here for identifying bad code in Magento.

Page Load Time is like transmission – the mechanism that delivers speed to the page. This is achieved by optimizing in code the above-the-fold content, use of CDN,  use of browser cache, using gzip on all text content, optimizing images, delivering appropriately sized images, minifying css and js, optimizing the use of marketing pixels.

Scaling is like building highways and roads for the cars to move on. Highways are resources – CPU, memory, network that the hits to a website will utilize. The task of a Magento scaling expert is to architect a system – caches, servers and sizes, services to run on each server, connectivity of the servers and access from internet, etc.

Hits to a website is like traffic of random cars on the highway. Each vehicle seems to have a mind of its own, joining and leaving the highway. Each visitor to the website will take their own journey visiting different pages.

Observation : Like a car cannot drive at its highest speed possible at all times due to traffic, a website too cannot perform at its best best all the time. Understanding the factors that make the website perform at its optimal level all the time would be the task of both the developer and the server architect.

Observation  : Like in traffic we have vehicles of different performance, in a website all URLs do not perform equally. A category page may not perform the same way as product detail page for example.

Observation : Better throughput will be achieved with the same resources, if the vehicle performance is improved – some bottlenecks can be avoided if the vehicles moved faster. Similarly, a better performing website is likely to scale better.

Observation  : Like in traffic in order to scale one has to find the bottleneck in the highway that is causing the current slowdown, fix it and then look for the next bottleneck. This is a change in the hosting infrastructure and architecture, different from the website performance.

Observation  : A traffic designers job is to ensure maximum number of vehicles can pass the highway at the best speed for each vehicle. A hosting designers job is to ensure maximum traffic is handled in a way that each hit is best served.

What lessons can we learn from traffic management

Lesson : To better manage traffic highway system has to be designed that is scalable. Mostly by bottleneck analysis we can derive what needs to be done. For example, is database a bottleneck, is file system access a bottleneck, etc.

Lesson : When traffic increases, possibly beyond the capacity of the highway, traffic management has to account for one more variable – starvation. The amount of time a vehicle has to wait at a metered light to enter the highway. The longer the wait, more frustration from the drivers who will find a better route to their destination.

Lesson : On a highway lanes are drawn. A better hosting will make lanes – a thought we think is unique to our style of hosting Magento. The way most hosting providers take traffic is analogous to not having lanes with the hope that the maximum throughput will be achieved by letting hits contend for resources. The operating system stands to decide what process gets to use resources.

What are the recommended steps to achieve scaling?

As a first step to server side scaling, we move the database layer out to another instance or server. The main reason is that it is better to allocate resources in a single server when the workload is similar.

In general these techniques can be used for scaling

Caching. Simply stated do not recompute results that can be reused. The results could be HTML (either a complete page or a part of a page or a page with holes to be filled), or json (returning data to a API call) or sql query results, etc. Cache entries require either an expiry time or an invalidation event. Caches work better when stale content is not a major problem.

Queing. Another powerful technique is putting things to work in a queue vs doing them in real time. A queue would then have a poll for results to update when results are ready or a trigger to update. Magento makes it easy to write trigger events and many are used for 3rd party integrations. Unfortunately events are fired inline – when the visitor to the website waits for the response. It is better to use queing system Another popular event trigger is to email.

Tuning. Monitoring and tuning to improve scale as a continuous process. If you are not monitoring you cannot improve. Monitoring does not mean CPU and RAM. Measuring actual response times, cache hit / miss ratios, queue lengths and alerting and analysing these parameters.

Sharding. If database is the bottleneck, sharding – either vertically or horizontally – can help in reducing the load. This works equally well on mysql as well as nosql databases, but requires code to be reworked. A properly sharded database can cause parts of applications to be split allowing for greater app level scalability.

Laning services. Another powerful technique achievable by stateless APIs, SoA, microservice architecture and other design patterns. This allows easy scaling as busier lanes can be scaled out independently. Along with database sharding, lanes can be made very deep and can then be scaled out physically and geographically.

In our multi part series we take you through achieving scaling. The series is aimed at a store owner who need not be technical but is ultimately responsible to take a decision on the store. Until now you had to depend on an expert. However, there are no clear answers and the expert is making judgement calls based on most likely their prior experience. As a matter of fact no 2 webstores so results of efforts vary. This series will make you better informed.

We start by looking at a popular form of scaling – using FPC or Full Page Cache.

In order to help with scale, another important aspect is code quality specifically related to scaling. Scaling is difficult to achieve reliably if there is any externally dependent blocking service executed as part of the hit. Examples include sending email directly to a recipient or a external service, sending information from the server to an external service. All such processing should be done with some form of a queue handled either by an different process such as a cron job. Until Magento 1.9.2.4, the default email sending was inline for example, slowing the order success page being shown.

Autoscaling adds and removes servers (and hence resources) – something traffic managers cannot do with highways. This gives website scaling an advantage to be more elastic.

Part 1 : Full Page Cache (FPC)

Part 2 : Other Magento Caches

Part 3 : Code quality

Part 4 : Optimizing Checkout Flow

Part 5 : Hardware

Part 6 : Hosting

Part 7 : Auto scaling

Full Page Cache (FPC): Scaling Magento Part 1

Introduction

The most important and often ignored factor to scaling is the quality of code. Well written code will scale better. The next most important factor is perhaps caching. There are many types of caches that developers, managers and store owners need to understand. Full Page Cache (FPC) is seen by store owners as a magic solution to speed issues. Understanding the benefits and compromises of a caching mechanism is important to understand scaling.

FPC Options

Magento Enterprise 1.x and Magento Community & Enterprise 2.x comes with a FPC module inbuilt.

There are many plugins available for Magento Community 1.x. Some hosting providers will help setup a Varnish based FPC with appropriate hole punching.

Magento 2 has two mechanisms for FPC – php based (called FPC) and varnish. This option generates a .vcl file that will be required by your hosting provider to setup varnish.

The discussion below applies to all these mechanisms as well as to Magento 1.x and Magento 2.

What is Full Page Cache (FPC)?

FPC is a cache of a full HTML page – except variable content such as a login status or items in cart or stock status of a product or sometimes even price of a product. When a hit is encountered – i.e. the page required is in the cache, FPC will return very fast compared to when there is a miss i.e. the page is not in the cache, which will require re-generating the content. FPC may store the cache in files, but more likely for maximum benefit, it will be stored in memory.

FPC affects resource utilization – memory and CPU. In a way it trades memory for CPU time.

Traditional FPC stores the page in Magento cache and is a part of Magento. Varnish stores the page HTML after it has been generated and is not a part of Magento. It is a separate process.

What FPC is not!

Let us understand FPC better

  • Memory needed to store the entire site in FPC
    Let us say each page is 100KB and you have 10000 pages to cache. That would take about 1GB of RAM. The problem is when the number of pages or page size starts rising above this, the RAM requirement goes up. So, if you now had 20000 pages (result of each option in layered navigation for example), you would need 2GB or if each page was 120KB the 20000 pages would need  4GB. Pages are not just products – they are category pages as well. If layered navigation is added the pages multiply fast as each combination is unique and needs to be stored independently. If you start exceeding the RAM available, you need to decide what to do when you hit the memory limit.
  • Cache warming.
    Cache warming is the process of automatically adding pages to the cache before a real visitor hit comes to the cached page. When a cache is cleared, you may need to warm the cache to make FPC effective early. Cache warming uses a crawler to artificially visit pages of a site. A typical crawler will recursively crawl the site starting from the home page. This sounds logical but here are some things to think through

    • If possible find the most likely pages you need to be in the cache and warm the cache with only those pages. This will give the maximum benefit.
    • If you cannot fit all the pages in memory, the use of crawling to warm the caches becomes a problem – they will recycle pages out of memory at random, not based on the end user popularity of the pages.
    • When the cache is being warmed your resource requirement in terms of CPU will rise as both the crawler and real traffic are being served.
    • If possible crawl the site in parallel – the earlier the pages get cached the more likely a visitor to a page will already be in the cache (scoring a hit).

Performance degradation on FPC full invalidation

The above figure shows the bad response immediately when a FPC that had built to 1.5GB was cleared completely. The top image is from redis usage graph from munin and the one below is AWS cloudwatch latency (time to serve a page) averaged per minute. The latency came down as AWS Autoscale added more instances, costing money.

  • Invalidating the cache :
    Magento automatically invalidates FPC (internal or varnish) by tagging or hashing the content with keys that refer to the type of content. For example, it may generate a tag / hash CATEGORY_123 if the page depends on category 123. Now, when category 123 changes, Magento sends out a invalidate message that says “all pages that have tag / hash CATEGORY_123 should be invalid”. Magento has a elaborate tag convention.
  • FPC and robotic crawlers (BOTS)
    Even if you do not use a crawler for warming, robotic crawlers on the internet (such as google’s indexer Googlebot) will start filling the FPC cache with pages they happen to crawl. It is our advice that a site with FPC should have robots.txt and a front end processor (nginx, WAF) restricting BOTs.
  • CPU and time needed to re-generate a page
    A FPC can fully invalidate (clear) due to a (p)html or css file changing or partially due to a data change such as a product update. A miss from FPC results in the page being regenerated. The CPU requirement for a miss is much higher than a hit. If a crawler is used to warm the cache or if traffic is high, CPU requirement can be quite high as the FPC fills up. Yet, the visitor experience is not good during this period. Using autoscale, this performance degradation can be contained to some extent as additional instances are launched to handle the high CPU requirement.
  • Discipline when using FPC – know when invalidation happens
    It is important to add discipline for code update as it has the worst effect on user experience.

    • Code update should be done at low traffic times.
    • Category changes should be carefully planned at low traffic times.
    • Magento indexing should be set to manual (M1) or on schedule (M2) with a cron running the indexer.

Our recommendation for FPC

  • Do not use a random crawler to warm the FPC cache. Use a page popularity based crawler to warm the cache if necessary.
  • Avoid using a crawler during high traffic – the crawler will compete for system resources with live traffic
  • If possible update code during low traffic times as it causes FPC to invalidate
  • If your site is horizontally scaled, pre-launch instances to your load balancer before invalidating FPC, either explicitly or indirectly, so the latency of starting an instance does not further worsen the user experience

Magento 1.x FPC Plugins

  1. Free Lesti FPC : https://github.com/GordonLesti/Lesti_Fpc. Use this guide to install
  2. Magento connect search results for FPC

Should FPC be a part of scaling strategy?

FPC is concerned with speed. Scale is concerned with the process that helps the site add resources when needed. FPC helps in scalability by reducing the use of resources per hit to the website under certain conditions. It changes the dynamics of when and how many resources will be needed.

FPC has to be considered to be part of scaling strategy – but as one of many parts.

Read part 2 where we discuss other Magento caches.

Read the overview of our Magento scaling series here.

We can analyze your site for free

Schedule a call

Not happy with your website performance and want an expert to look at it?

  • We will analyze your site using public information.
  • We will ask you to give us a 1 day web server log file.
  • We will try to identify what steps if any you should take to improve your sites performance goals.