
Productboard: “Why we decided to migrate to AWS” (Webinar recap)

January 14, 2020

Serverless solutions are a growing trend. More and more companies are moving their IT from earlier setups, whether an on-premises data center or a platform as a service (PaaS), to the cloud.

Why are they doing that? What are the benefits of running your infrastructure in the cloud? What does the migration process look like from a technical point of view? What should you look out for?

We discussed these topics with Tomáš Fedor and Tomáš Růžička from Productboard, a Czech developer of product management software. This fast-growing technology business recently decided to move its workloads from Heroku (a platform as a service) to Amazon Web Services (AWS). They were kind enough to share their migration story in a webinar.

This recap is for those of you who want to learn more from Productboard’s first-hand experience but prefer reading to listening.

The challenge

Productboard’s infrastructure was a typical monolith that ran on the Heroku platform (PaaS). This solution was good enough at the start, but as the business grew, they needed to start thinking long-term about both organizational and infrastructure scaling. The company’s growth and the acquisition of corporate customers also brought much higher demands on the security of their product.

From the infrastructure point of view, it became clear that their monolithic approach was not sustainable anymore. Updating, scaling or developing new features independently from one another was a real challenge. All these combined noticeably restricted the efficiency of their development.

“When you have multiple people touching very close parts of the codebase, it restricts your development efficiency,” says Tomáš Růžička.

Giving developers more control

The company put great emphasis on improving the developer experience. One of the main aims of the new architecture was to give developers more control over, and more insight into, what is happening in each separate part of the system. They decided to follow the Infrastructure as Code principle, which allows them to share knowledge easily. They started with IAM roles and users, which gave them more control over the level of access different people have to particular services.

The technologies implemented

The first step before migrating was to find AWS equivalents of the technologies they were already using.

Besides basic services such as EC2, S3, DynamoDB, and Route 53, they also implemented RDS, which allows them to scale relational databases in the cloud with just a few clicks. They installed ACM (AWS Certificate Manager) to manage certificates, and ElastiCache, a fully managed in-memory data store that offers fast, secure performance.

First of all, they had to decide how to organise their infrastructure in AWS. They decided to split it into 3 main accounts: production, staging, and ops (operations).

“Production and staging are pretty simple, ops are used mostly for infrastructure purposes,” says Tomáš Fedor.

Next, they set up AWS Identity and Access Management (IAM) properly, which enables them to manage access to AWS services and resources securely.
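As a rough illustration of what scoped access can look like, the sketch below builds an IAM policy document in Python. The webinar did not describe Productboard's actual policies, so the bucket name `example-staging-assets` and the helper `read_only_bucket_policy` are hypothetical; only the policy document structure and the two S3 action names come from AWS itself.

```python
import json


def read_only_bucket_policy(bucket: str) -> dict:
    """Build an IAM policy document granting read-only access to one S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reading applies to the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name, used for illustration only.
    print(json.dumps(read_only_bucket_policy("example-staging-assets"), indent=2))
```

Generating policies from code like this fits an Infrastructure as Code workflow, because the access rules can be reviewed and versioned alongside the rest of the configuration.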

They run a Kubernetes cluster in each account to have better control over what is running where. Additionally, they run all their main infrastructure services, such as Grafana, Prometheus, aws-iam-authenticator, and Datadog, in those Kubernetes clusters. Orchestrating their containers in Kubernetes allows them to build, test, and deploy new features much faster now.

Another big question was what kind of CI they were going to use. Heroku offers a very convenient solution for pull requests (PRs). Productboard’s dev team is currently using CircleCI, which does not offer them an equivalent for PRs, so they need to find a different solution. They are considering GitLab, as it offers much more than CI and has better support for Kubernetes. Security is one of the key factors for Productboard, and they are very impressed with the number of improvements that GitLab has made in this field.

Running the proof of concept first for a zero-downtime switch

On the application layer, they had to go through quite a few steps.

Before they even launched Kubernetes, they had a proof of concept service in production. This allowed them to explore the optimal settings for running their infrastructure without disrupting the service. Once they decided on the final design, they started working on the monolith itself. The aim was to remove any Heroku dependencies. They went through the list and found alternatives to Heroku add-ons on AWS. They had a few Heroku-specific pieces of code that they feature-flagged in order to allow for running the app simultaneously on AWS and Heroku during the transition time.
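A minimal sketch of such a feature flag, assuming it is driven by an environment variable, might look like the following. The flag name `RUNNING_ON_HEROKU` and both return values are hypothetical placeholders; the webinar did not say which pieces of code were flagged or how the flags were stored.

```python
import os
from typing import Mapping, Optional

TRUTHY = {"1", "true", "yes", "on"}


def flag_enabled(name: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the environment variable `name` holds a truthy value."""
    env = os.environ if env is None else env
    return env.get(name, "").strip().lower() in TRUTHY


def log_destination(env: Optional[Mapping[str, str]] = None) -> str:
    """Choose a platform-specific code path behind a single flag."""
    if flag_enabled("RUNNING_ON_HEROKU", env):  # hypothetical flag name
        return "heroku-log-drain"  # placeholder for the Heroku-specific branch
    return "cluster-stdout"  # placeholder for the AWS/Kubernetes branch
```

With this shape, the same build can run on both platforms during the transition, and the Heroku branch can be deleted once the flag is no longer set anywhere.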

Since they were not running any Docker containers in production yet, they had to dockerize the application. During this process, it was key to configure all the environment settings, database connections, and so on. Moving closer to the infrastructure part of the migration, they decided to use Helm charts so they could easily configure the components based on the metrics they had seen in Heroku. This is where settings such as limits and thresholds were adjusted.
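One common way to keep a container image portable is to read connection settings from the environment instead of baking them into the image. The sketch below assumes the application receives a `DATABASE_URL` variable (the name Heroku Postgres uses); the `DatabaseSettings` class and `load_database_settings` function are illustrative names, not Productboard's code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional
from urllib.parse import urlsplit


@dataclass(frozen=True)
class DatabaseSettings:
    host: str
    port: int
    name: str
    user: str


def load_database_settings(env: Optional[Mapping[str, str]] = None) -> DatabaseSettings:
    """Parse connection settings from DATABASE_URL; raises KeyError if it is unset."""
    env = os.environ if env is None else env
    url = urlsplit(env["DATABASE_URL"])
    return DatabaseSettings(
        host=url.hostname or "localhost",
        port=url.port or 5432,  # fall back to the default Postgres port
        name=url.path.lstrip("/"),
        user=url.username or "",
    )
```

Because the function only depends on the environment it is given, the same image can point at a Heroku database or an RDS instance just by changing one variable.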

Then came the deployment process itself. In Heroku, a lot of it comes built in. For example, the Heroku Ruby buildpack provided them with automatic migrations, asset precompilation, and so on. It was important to find a counterpart in the new setup. They decided to go with a combination of CI and an init container. Based on that, they figured out the optimal setup for the migration and how to upgrade all the pods they would be running with zero downtime.
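To make the init container idea concrete, here is a sketch that builds a Kubernetes Deployment manifest as a Python dictionary. The Kubernetes field names are real, but everything else is an assumption for illustration: the `monolith` name, port 3000, the `/health` path, and the Rails-style migration command. Productboard's actual manifests and Helm templates were not shown in the webinar.

```python
import json


def deployment_manifest(image: str, replicas: int = 3) -> dict:
    """Build a Deployment that runs migrations in an init container before the app starts."""
    labels = {"app": "monolith"}  # hypothetical label
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "monolith", "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            # Keep the full replica count available during a rollout
            # and add one new pod at a time.
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxUnavailable": 0, "maxSurge": 1},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "initContainers": [
                        {
                            "name": "migrate",
                            "image": image,
                            # Assumed Rails-style migration command.
                            "command": ["bundle", "exec", "rake", "db:migrate"],
                        }
                    ],
                    "containers": [
                        {
                            "name": "web",
                            "image": image,
                            "ports": [{"containerPort": 3000}],
                            # Traffic is routed only once this probe succeeds.
                            "readinessProbe": {
                                "httpGet": {"path": "/health", "port": 3000}
                            },
                        }
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical image reference; CI would normally supply the real tag.
    print(json.dumps(deployment_manifest("registry.example.com/app:abc123"), indent=2))
```

Since every new pod runs its init container, migrations in a setup like this need to be safe to run repeatedly and concurrently. The printed JSON can be applied with kubectl, which accepts JSON as well as YAML.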

Before the final switch, they ran an image on the AWS cluster and monitored what was happening, looking out for any issues that might arise. Once they were confident that their design was performing well on staging, they ran stress tests and set performance benchmarks to see where the thresholds were and how it all compared to the Heroku version.
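A comparison like this can be reduced to a simple check on latency percentiles. The sketch below is one possible way to do it, not the method used by Productboard: it assumes you have collected response times in milliseconds from both platforms, and the 95th percentile and 10% tolerance are arbitrary illustrative defaults.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def within_budget(
    baseline_ms: Sequence[float],
    candidate_ms: Sequence[float],
    pct: float = 95.0,
    tolerance: float = 1.10,
) -> bool:
    """True when the candidate's percentile latency is at most `tolerance` times the baseline's."""
    return percentile(candidate_ms, pct) <= tolerance * percentile(baseline_ms, pct)
```

Here `baseline_ms` would hold measurements from the existing Heroku deployment and `candidate_ms` those from the new cluster, so a stress-test run can end with a clear pass or fail rather than a chart to eyeball.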

As for the front-end part of their application, it was nothing more than a simple switch of a CDN.

Next steps

Now that they have all this figured out, they’re thinking about the different ways to split the monolith. This is where the fun part starts. The challenge ahead of them is figuring out how to tackle this. What is the right bounded context for each of the different features?

At the moment, the whole project is in staging and they’re planning to go to production at the end of July.

Was there anything during the process that surprised them?

“Yes. Since nobody on board has ever migrated a large-scale application running on production, we were not aware of all the little details that you might run into,” says Tomáš Růžička.

He further explains that Heroku fights to keep its customers, and that it is not easy to leave when so many of your systems are completely dependent on it.

“In our case, Postgres, for example, couldn’t be easily replicated with zero downtime. We actually needed to publicly expose our Redis instance inside AWS temporarily so that we could use our current application staging in Heroku but have the data already migrated to AWS. Exposed databases aren’t something you want to have in production.”

The benefits

“For us, one of the most important benefits is that we have a scalable, secure solution that will help us deliver features to production fast,” says Tomáš Fedor.

He also says that, in terms of total cost, the AWS-based solution comes out considerably more efficient than Heroku.

“With the company expanding and our infrastructure growing, it will be cheaper for us to run it in AWS. We will also be able to meet the demands of our corporate customers such as private clouds, private databases, security certifications and so on,” he adds.

FAQs

Q1: Why did Productboard decide to migrate its infrastructure from Heroku to AWS?

Their original monolithic infrastructure on the Heroku platform was no longer sustainable for their business growth. As they acquired more corporate customers, they faced higher security demands. The monolith made it challenging to update, scale, or develop new features independently, which restricted the efficiency of their development teams.

Q2: What were the main goals of the migration to AWS?

The key objectives were to establish a scalable and secure long-term solution, give developers more control and insight into the system, and accelerate the delivery of new features to production. A major business driver was to create a more cost-efficient infrastructure that could meet the advanced security and compliance demands of their corporate customers.

Q3: How did Productboard structure its new environment in AWS?

They split their infrastructure into three main accounts: production, staging, and ops (operations). They implemented AWS IAM to manage access securely and run a separate Kubernetes cluster in each account for better control over their services.

Q4: What key technologies were implemented as part of the new AWS architecture?

Besides core services like EC2, S3, and Route 53, they also implemented RDS for scaling relational databases, AWS Certificate Manager (ACM) for handling certificates, and ElastiCache for a high-performance, managed data store.

Q5: Why did Productboard choose to use Kubernetes for its new infrastructure?

By orchestrating their containers in Kubernetes, they can build, test, and deploy new features much faster. It also provides them with better control over what is running in each of their accounts, where they also run infrastructure services like Grafana and Prometheus.

Q6: What was a surprising challenge the team encountered during the migration?

Since no one on the team had ever migrated a large-scale application, they were unprepared for all the small details involved. Specifically, it was difficult to migrate away from some Heroku-dependent systems with zero downtime. For instance, their Postgres database couldn’t be easily replicated, and they had to temporarily expose their Redis instance in AWS to the public internet so that the application still running on Heroku could reach it. This is not a practice recommended for production environments.

Q7: What are the primary benefits Productboard gained from this migration?

The most important benefits are having a scalable and secure solution that allows them to deliver features to production much faster. The AWS-based solution is also significantly more cost-efficient compared to Heroku. Additionally, they are now able to meet specific demands from corporate customers, such as providing private clouds, private databases, and obtaining security certifications.

Q8: What is the next major step for Productboard after establishing the new infrastructure?

Now that the migration is nearly complete, the next challenge is to begin splitting their monolith. The team is now focused on figuring out the correct bounded context for each of their features so that they can be broken apart into microservices.