So you’ve just launched a pilot version of your product, and users are going nuts about how cool it is. You get a lot of feature requests from this pilot, so you decide to pick some and release them.
It’s the peak hour of the day, when you get the most traffic, so you decide now is not a good time to release the features and schedule a late-night maintenance window instead. You follow quite a lengthy process to update the software and wait for your users’ feedback.
The next day, you see blood everywhere: error messages in your inbox and in your log monitoring applications. Bam! Your server has crashed, i.e. it’s facing downtime.
How do you avoid downtime in deployments?
The first and most expensive approach to avoiding downtime is to have longer maintenance windows, do a lot of load testing and feature testing, and then rest assured. But this is hard work; we should be doing smart work.
The second, more thoughtful approach is to implement zero-downtime deployment techniques. There are several minimal or zero-downtime deployment techniques, such as Rolling Updates, Canary, and Blue-Green Deployment.
This article does a deep dive into understanding the Blue-Green Deployment technique.
Overview of Blue-Green Deployment
It is a continuous deployment technique in which two identical environments are kept ready for production release. In a blue-green deployment, the new changes are deployed to one of the environments and traffic is then switched to it. The older environment stays in standby mode; if any issues appear in the new environment, traffic is switched back to the older one, which is the rollback process.
These environments are labeled blue and green; for the sake of this article, we will call the currently running environment the blue environment and the new environment the green environment. This technique is sometimes also referred to as red-black deployment or A/B deployment.
This methodology of continuous deployment ensures minimal or zero downtime and hence increases application availability.
How does Blue-Green Deployment work?
The process of blue-green deployment can be divided into four steps:
1. Load Balancer setup for routing users
Since we have two environments, blue and green, we will be moving traffic from one to the other with every release. This switch could be done by simply changing the DNS records. But DNS changes take time to propagate, because resolvers and clients cache the old records until their TTL expires, so altering DNS records to switch routes is not recommended.
The better but more expensive way to handle this is to use a load balancer. With a load balancer, there is no need to change the DNS records, because they keep pointing to the same load balancer. The only change is that the load balancer seamlessly switches traffic from one environment to the other.
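To make the idea concrete, here is a minimal sketch of the switch in Python. It is a toy in-memory model, not a real load balancer: the `Environment` and `LoadBalancer` classes and the version labels `v1` and `v2` are assumptions made up for illustration. The point it shows is that clients always call the same entry point, and only the balancer's internal target changes.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    name: str      # "blue" or "green"
    version: str   # application version deployed there (hypothetical labels)


@dataclass
class LoadBalancer:
    """Toy load balancer: clients always call route(); only `active` changes."""
    environments: dict[str, Environment]
    active: str = "blue"

    def route(self, request: str) -> str:
        # Forward the request to whichever environment is currently active.
        env = self.environments[self.active]
        return f"{request} -> {env.name} ({env.version})"

    def switch_to(self, name: str) -> None:
        # The cutover is a single change inside the balancer; DNS is untouched.
        if name not in self.environments:
            raise ValueError(f"unknown environment: {name}")
        self.active = name


lb = LoadBalancer({
    "blue": Environment("blue", "v1"),
    "green": Environment("green", "v2"),
})
before = lb.route("GET /")   # "GET / -> blue (v1)"
lb.switch_to("green")
after = lb.route("GET /")    # "GET / -> green (v2)"
print(before)
print(after)
```

In a real setup the same role is played by the load balancer's own configuration (for example, which backend pool it forwards to), but the shape of the operation is the same: one change on the balancer, nothing for clients to update.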
2. Execute the updates in the green environment
Once the changes are ready, we deploy them to the green environment, which runs in parallel with the blue environment. With the help of the load balancer, traffic is gradually routed to the green environment. This switch is so smooth that end users don’t even notice the change.
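One possible way to route traffic gradually is to assign each user to a stable bucket and raise the share of buckets that go to green over time. The sketch below assumes a hypothetical `pick_environment` helper and made-up user IDs; real load balancers usually expose this as weighted routing in their configuration rather than as application code.

```python
import hashlib


def pick_environment(user_id: str, green_percent: int) -> str:
    """Send roughly `green_percent`% of users to green, the rest to blue."""
    if not 0 <= green_percent <= 100:
        raise ValueError("green_percent must be between 0 and 100")
    # Hash the user ID so the same user always lands in the same bucket.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket 0..99
    return "green" if bucket < green_percent else "blue"


# Hypothetical user population, used only to show the effect of each step.
users = [f"user-{i}" for i in range(1000)]
for percent in (0, 10, 50, 100):
    on_green = sum(pick_environment(u, percent) == "green" for u in users)
    print(f"{percent:>3}% target -> {on_green} of {len(users)} users on green")
```

Because the bucket depends only on the user ID, a user who has already been moved to green stays on green as the percentage increases, which avoids bouncing people between versions during the rollout.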
3. Monitor behavior of the green environment
Once the deployment is done and traffic is switched to the new environment, user behavior in the green environment is monitored carefully via logs in order to catch any discrepancies.
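As a rough illustration of what "monitoring via logs" can boil down to, the sketch below computes the share of server errors in a sample of HTTP status codes and compares it to a threshold. The function names, the sample data, and the 1% threshold are all assumptions for this example; the right signals and limits depend on your application.

```python
def error_rate(status_codes: list[int]) -> float:
    """Fraction of responses that are server errors (HTTP 5xx)."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return errors / len(status_codes)


def is_healthy(status_codes: list[int], max_error_rate: float = 0.01) -> bool:
    """True when the observed error rate stays within the allowed limit."""
    return error_rate(status_codes) <= max_error_rate


# Hypothetical sample pulled from the green environment's access logs:
# 97 successful responses and 3 server errors.
green_sample = [200] * 97 + [500, 502, 503]
print(f"error rate: {error_rate(green_sample):.2%}")
print("healthy" if is_healthy(green_sample) else "unhealthy")
```

In practice this check would run continuously against fresh log data, and would usually look at latency and business metrics as well as error codes.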
4. Final Step: Full deployment or Rollback
If any issues are caught while monitoring in step 3, traffic is switched back to the blue environment as a rollback. If no issues are caught during smoke testing, the green environment becomes the blue environment for the next release cycle.
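The final decision can be sketched as a small function. The `Release` class and `finalize` function below are hypothetical names used only for illustration; they model which environment is live and which one is on standby after the decision is made.

```python
from dataclasses import dataclass


@dataclass
class Release:
    live: str      # environment currently receiving traffic
    standby: str   # environment kept ready as the fallback


def finalize(release: Release, checks_passed: bool) -> Release:
    """Keep the new environment if checks passed, otherwise roll back."""
    if checks_passed:
        # Full deployment: the new environment stays live and the old one
        # remains on standby until the next release reuses it.
        return release
    # Rollback: send traffic back to the previous environment.
    return Release(live=release.standby, standby=release.live)


after_switch = Release(live="green", standby="blue")
print(finalize(after_switch, checks_passed=True))
print(finalize(after_switch, checks_passed=False))
```

Either way, one environment ends up live and the other idle, so the idle one is where the next release gets deployed.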
To sum it up…
Downtime can cost you more than you think. As a DevOps engineer, you should ensure deployments happen with minimal to zero downtime. Blue-green deployment is just one of the techniques for keeping downtime in the release process to a minimum.