In this post, I will try to clear up some misconceptions about redundancy that I regularly come across in my everyday job. Being able to explain to a customer what redundancy is for, and what is not among its key benefits, should help them avoid misleading expectations.
What does redundancy mean in IT infrastructure?
The concept of redundancy in IT is simple: create multiple copies of a resource so that, if one fails, another copy can take over the task. Redundancy is the keystone for eliminating Single Points of Failure and, thus, for achieving highly available infrastructures.
But redundancy is not only a “failover” mechanism: in some components, like web servers, it is also used to distribute the load evenly and increase the number of connections an application can withstand (as explained in my previous post about concurrency).
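As a rough illustration of that load-distribution role, here is a minimal round-robin sketch in Python. The backend hostnames are placeholders, and in a real setup this job is done by a load balancer such as HAProxy or nginx rather than by application code:

```python
from itertools import cycle

# Hypothetical pool of redundant web servers (hostnames are placeholders).
backends = cycle(["web1.example.internal",
                  "web2.example.internal",
                  "web3.example.internal"])

def pick_backend():
    """Return the next backend in round-robin order, spreading requests evenly."""
    return next(backends)

# Each incoming request goes to a different copy of the web server, so the
# pool can absorb roughly N times the connections of a single node.
for request_id in range(6):
    print(f"request {request_id} -> {pick_backend()}")
```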
What are the common misconceptions about redundancy?
Redundancy doesn’t mean better performance. I often find myself explaining to a customer why having a Percona cluster (3 MySQL servers) will not triple the overall performance of the database. In fact, for some components the performance will be equal to, or even slightly worse than, a single-instance configuration:
MySQL cluster: In a MySQL cluster configuration, every write operation must be replicated to all 3 nodes. This increases the time it takes for the operation to become consistent and raises the probability of table locking. On the other hand, if the application is prepared for it (or if ProxySQL is used), requests to the cluster can be routed so that write operations go to one node and read operations to the other two, which can slightly boost read performance.
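To make the read/write split more concrete, here is a minimal application-side sketch in Python. The node hostnames and the routing logic are placeholder assumptions; in practice you would let ProxySQL (or a similar proxy) do this with query rules instead of hand-rolling it:

```python
import random

# Hypothetical Percona/Galera nodes; hostnames are placeholders.
WRITER = "db1.example.internal"                               # all writes go here
READERS = ["db2.example.internal", "db3.example.internal"]    # reads are spread out

def route(query: str) -> str:
    """Send write statements to the writer node and everything else to a reader."""
    verb = query.lstrip().split(" ", 1)[0].upper()
    is_write = verb in {"INSERT", "UPDATE", "DELETE", "REPLACE"}
    return WRITER if is_write else random.choice(READERS)

print(route("INSERT INTO orders VALUES (1)"))  # -> db1.example.internal
print(route("SELECT * FROM orders"))           # -> db2 or db3
```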
PHP processes: PHP is single-threaded by nature and will use only one CPU core per request. A user’s request will be processed with the same performance whether the server has 4 or 24 CPUs (as long as the server can handle the load without bottlenecks). PHP performance is more related to clock speed (GHz) and CPU instructions than to core count.
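A toy model (with made-up numbers, not a benchmark) of why extra CPUs help throughput but not the latency of a single PHP request:

```python
# One PHP-FPM worker handles one request at a time, so more cores/workers
# raise how many requests the server can absorb, not how fast each one runs.
REQUEST_TIME_S = 0.2   # hypothetical time a single request takes on one core

for workers in (1, 4, 24):
    max_throughput = workers / REQUEST_TIME_S   # requests per second, roughly
    print(f"{workers:>2} workers: per-request latency = {REQUEST_TIME_S}s, "
          f"max throughput ~ {max_throughput:.0f} req/s")
```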
Varnish: For cache services, performance with redundancy should be the same as without it. The only (but important) drawback is that each cache instance has to be warmed up separately (though you can use Varnish Enterprise and its cluster support to avoid this problem).
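A minimal warm-up sketch, assuming two independent Varnish instances that each need to be primed on their own; the hostnames and URL list are placeholders:

```python
import urllib.request

# Each cache node has to be hit separately, since plain Varnish instances do
# not share their cached objects with each other.
VARNISH_NODES = ["http://varnish1.example.internal", "http://varnish2.example.internal"]
PATHS = ["/", "/products", "/contact"]

for node in VARNISH_NODES:
    for path in PATHS:
        with urllib.request.urlopen(node + path, timeout=5) as resp:
            print(node + path, resp.status)
```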
Redundancy is not a waste. If you come from an industrial/production background like me, you will surely have heard the LEAN term “muda”. Sometimes I find myself debating with a customer with this kind of mindset about the convenience of redundancy for a given service and how this ‘waste’ will affect the cost of the infrastructure. The convenience of redundancy is a trade-off between ‘how much will the solution cost’ and ‘how much downtime can I afford’. If your app is mission critical (like an e-commerce site that makes a high percentage of its sales through this channel), you should take a highly available approach when designing the infrastructure.
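A back-of-the-envelope way to frame that trade-off, with made-up availability and revenue figures, is to compare the expected yearly cost of downtime against the extra cost of the redundant setup:

```python
HOURS_PER_YEAR = 24 * 365

REVENUE_PER_HOUR = 500   # hypothetical revenue lost per hour of downtime
availability = {"single node": 0.99, "redundant setup": 0.9999}

for setup, a in availability.items():
    downtime_hours = (1 - a) * HOURS_PER_YEAR
    print(f"{setup}: ~{downtime_hours:.1f} h/year of expected downtime, "
          f"~{downtime_hours * REVENUE_PER_HOUR:,.0f} per year at risk")
```

If the yearly cost of the extra nodes comes out lower than the revenue you expect to lose to downtime, the redundancy is not ‘muda’ at all.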