Professional Services, Cloud Platform Services

How to: automated long-term cloud SQL backups step by step guide

February 4, 2020

A solution for automating Cloud SQL backups through GCP technologies leaning as close to fully managed as possible. There are a few ways to implement this solution. In this step-by-step guide, I will show you how to implement such solutions with good, maintainable practices in mind.

The solution uses the following GCP products, aside from Cloud SQL itself:

Cloud Scheduler
Cloud Pub/Sub
Cloud Functions
Cloud Storage
Cloud IAM

Before starting watch out for this

A Cloud SQL instance cannot have multiple "exports" running at the same time. Make sure that the scheduler is set so that the export has time to finish.
The Cloud Function's IAM account is NOT the one that's used to store the SQL data into GCS. It's the Cloud SQL instance's hidden IAM account.

Options

As I said before there are more ways to implement this solution.

1. If you have a multi-project environment, then you may want to create a "backups" project, where pretty much all of the 5 resources mentioned above will reside, and add the Cloud Function's IAM role into the other individual projects, so that the Function can initiate backups.

2. If you have one or more separate projects, you can put all of the resources mentioned above into each one of them.

There are specific steps to achieve either solution.

Steps

Enable the Cloud SQL Admin API (sqladmin.googleapis.com) in the project or projects that the Cloud Function will be running in.
Create a GCS bucket or buckets in the project or projects where you want the backups to be stored in.
- Storage type: Consider Nearline, Coldline, or Archive storage types, to decrease the price of storage, but keep in mind that the price of file retrieval increases as the price of storage decreases
- Lifecycle rules: Consider adding a lifecycle rule to the bucket, so that files are not kept longer than they need to be, e.g. 1 year.
Get the target Cloud SQL instance's hidden IAM account, and assign its email address to the bucket or buckets wherein you want the Cloud SQL instance's data to be transferable to with the "Storage Legacy Bucket Writer" role.
```
$ gcloud sql instances describe CLOUD_SQL_INSTANCE_ID --project GCP_PROJECT_ID | grep serviceAccountEmailAddress
```
Create a new IAM service account in the project where the Cloud Function will be running in.
Add the Cloud Function's Service account to the project or projects where the Cloud SQL instances are with the "Cloud SQL Client" role.
Create a new Pub/Sub topic. We recommend giving it a general name, such as "cloud-sql-backups".

Create a new Cloud Function

Name: We recommend giving it a general name, such as "cloud-sql-backupper".
Region: Doesn't really matter which region is chosen.
Memory: Choose low allocated memory, as the Cloud Function doesn't do any heavy lifting.
Trigger: Select "Pub/Sub" as the "Trigger" of the Cloud Function, and choose the previously created Pub/Sub topic.
Runtime: Node.js 10
Source code: Choose whichever option suits you, but for the sake of simplicity of this guide, choose an inline editor. The code is below.
Function entry point: "initiateBackup"
Service account: Choose the service account you've created previously

Create a Cloud Scheduler job

Name: Give the job a specific name, such as "app-sql-monthly-backup"
Frequency: Specify the frequency in the cron format.
- Monthly: 0 0 1 * *
- Daily: 0 1 * * *
- Hourly: 0 * * * *
Target: Pub/Sub
Topic: Choose the previously created topic

Payload: JSON, as in the following example. Make sure that the bucket path includes the protocol lead (gs://) and that the bucket path does NOT end with a slash (/)

{
    "project": "gcp-project-id",
    "instance": "appdata",
    "database": "sql_database_name",
    "bucket": "gs://bucket-name/directory-without-trailing-slash"
}

The code

const { google } = require('googleapis')
const { auth } = require('google-auth-library')
const sqladmin = google.sqladmin('v1beta4')
/**
 * Triggered from a Pub/Sub topic.
 * 
 * The input must be as follows:
 * {
 *   "project": "PROJECT_ID",
 *   "instance": "CLOUD_SQL_INSTANCE_NAME",
 *   "database": "SQL_DATABASE_NAME",
 *   "bucket": "BUCKET_NAME_WITH_OPTIONAL_PATH_WITHOUT_TRAILING_SLASH"
 * }
 *
 * @param {!Object} event Event payload
 * @param {!Object} context Metadata for the event
 */
exports.initiateBackup = async (event, context) => {
        const pubsubMessage = JSON.parse(Buffer.from(event.data, 'base64').toString())
        const authRes = await auth.getApplicationDefault()
        const request = {
                auth: authRes.credential,
                project: pubsubMessage['project'],
                instance: pubsubMessage['instance'],
                resource: {
                        exportContext: {
                                kind: 'sql#exportContext',
                                databases: [pubsubMessage['database']],
                                fileType: 'SQL',
                                uri: pubsubMessage['bucket'] + '/backup-' + Date.now() + '.gz'
                        }
                }
        }
        sqladmin.instances.export(request, (err, res) => {
                if (err) console.error(err)
                if (res) console.info(res)
        })
}

package.json
{
	"name": "cloudsql-backups",
	"version": "1.0.0",
	"dependencies": {
		"googleapis": "^45.0.0",
		"google-auth-library": "3.1.2"
	}
}

Rudolf Krňoul

Rudolf is a high-level cloud architecture design specialist, DevOps expert for cloud environments and a Linux enthusiast. He has a wide experience in provisioning, containers, Kubernetes, CI/CD, automation, bare *nix environments, and migrations to the cloud. At Revolgy he works as a Cloud Engineer.