Anyone who has deployed a Google Cloud Function written in Go knows that there are a number of restrictions involved. For example, the highest Go version supported is 1.13. Another example is the lack of built-in support for private dependencies when using Go modules. This post covers an approach to handling the latter using Terraform.
I’m not going into details about Go modules, or how to configure your environment to use private modules. Basically, I am assuming you have a cloud function with private dependencies, and a working local environment where you are able to fetch those dependencies.
The principles in this post can be applied to a CI/CD pipeline as well. The only thing that would be different is how you authenticate with the upstream Git repository, but the details for that could be the topic for another post.
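For reference, a typical local setup for private modules hosted on GitHub looks roughly like the following minimal sketch; the GOPRIVATE pattern and the SSH rewrite are assumptions about your particular environment:
# Tell the Go tool to skip the public module proxy and checksum database for these paths
export GOPRIVATE=github.com/hedlund/*
# Let Git authenticate over SSH instead of HTTPS when fetching the private modules
git config --global url."git@github.com:".insteadOf "https://github.com/"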
The function module
So let’s start with a quick overview of what we’re really talking about here. A hypothetical scenario could be a Git repository where you have defined a Go module, with a go.mod that looks a little something like this:
module github.com/hedlund/my-cloud-function

go 1.13

require (
	cloud.google.com/go v0.81.0
	github.com/hedlund/private-module v0.1.0
)
Basically, you have a module that may, or may not, have a number of public dependencies (the cloud.google.com/go module in this example). Those are not an issue: if you deploy the function and simply include the go.mod and go.sum files, the deployment pipeline will automatically download and use those dependencies for you.
The issue arises when it tries to download any private modules, such as github.com/hedlund/private-module. Google Cloud Build (which is used under the hood), and its service accounts, do not have access to download your private repositories.
If you have worked with Cloud Build before, you know that there are ways to authenticate the builds to actually get access (this is how you get a CI/CD pipeline working). Unfortunately, Google does not expose, or give us any control over, the actual build steps involved when deploying a Cloud Function, so that route is a dead end.
The solution is to move the parts that require authentication outside of Google’s automated build process - to before we actually upload any code to Cloud Functions. As mentioned earlier, this can either be on your local development machine, or part of a CI/CD pipeline, even one running on Google Cloud Build. It does not matter where you run it, as long as the environment can authenticate with whatever upstream Git repository you are using.
A bit of Terraform
Let’s start with the basic Terraform setup to actually deploy a function. The way I typically do it is to zip the whole folder containing the function code (you may need to exclude some files though), upload it to a storage bucket (not included in this example), and then deploy that zip file:
data "archive_file" "function_archive" {
type = "zip"
source_dir = "${path.module}/.."
output_path = "${path.module}/../my-function.zip"
excludes = ["terraform", "my-function.zip"]
}
In this hypothetical example the Terraform code is in a sub-folder, terraform, so it zips everything from its parent folder, excluding the Terraform code and the zip-file itself. Then we upload that file to a storage bucket:
resource "google_storage_bucket_object" "function_archive" {
bucket = var.bucket
name = "my-function-${data.archive_file.function_archive.output_md5}.zip"
source = data.archive_file.function_archive.output_path
}
The bucket is not defined in this code, but just passed as a variable. Finally, we create the Cloud Function itself using the uploaded archive:
resource "google_cloudfunctions_function" "my_function" {
name = "my-function"
entry_point = "MyFunction"
runtime = "go113"
trigger_http = true
source_archive_bucket = google_storage_bucket_object.function_archive.bucket
source_archive_object = google_storage_bucket_object.function_archive.name
}
I’ve simplified the code a bit, and only included the most relevant parts, but you can always check the documentation for more thorough examples.
If we didn’t have private dependencies in our example, that would be everything we need to do in order to deploy the function. Unfortunately, in our case we do, and the deployment will fail when it tries to download hedlund/private-module.
Vendoring
The simplest solution is really simple, and all we need to do is to wrap all our Terraform commands in a script that vendors our dependencies before creating the archive:
#!/bin/bash -e
go mod vendor
(cd terraform && terraform apply)
rm -rf vendor
Basically, what we do is tell Go to vendor all dependencies, meaning it copies them from your local module cache into a vendor folder created within your project. Then we run terraform apply (remember, our Terraform code is in a sub-folder), which will automatically include the new folder in the zip-file. Finally we remove the vendor folder again, as it’s in the way of our normal development.
This is actually quite fragile: if something goes wrong when running the Terraform command, or you terminate the script with Ctrl + C, it will most likely not remove the vendor folder. You can trap any termination signal in the script and do the cleanup that way, but we will improve the script in another way, so don’t worry about it.
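For completeness, a minimal sketch of that trap-based cleanup could look like this, assuming the same folder layout as before:
#!/bin/bash -e

# Remove the vendor folder when the script exits, whether it finishes, fails, or is interrupted
trap 'rm -rf vendor' EXIT

go mod vendor
(cd terraform && terraform apply)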
Another, bigger, issue is that you need to do this for every Terraform command you run. Granted, plan and apply are likely the most important ones, but it still adds a bit of overhead.
Since we have vendored our dependencies, we must also make sure not to include the go.mod and go.sum files in the archive, otherwise the deployment process will still try to download the dependencies:
data "archive_file" "function_archive" {
type = "zip"
source_dir = "${path.module}/.."
output_path = "${path.module}/../my-function.zip"
excludes = [
"terraform", "my-function.zip",
"go.mod", "go.sum", # This line is the only change
]
}
At this stage, you are pretty much good to go and should be able to deploy your Cloud Function using the script. That is, as long as all your local function code is in the root directory! If there are any other packages, the deployment will still fail.
Local packages
If any of your code is organised into packages (i.e. sub-folders), the Cloud Function runtime will lose the context of how to import that code the moment we remove the go.mod file, as we are technically no longer deploying a Go module.
What we need to do is to extend our script to copy the local packages into the vendor folder, following the same structure the package would have on the GOPATH. Let’s say that we have two packages, sub1 and sub2, that we need to copy; our script would then look something like this:
#!/bin/bash -e
go mod vendor
# New: copy local packages
mkdir -p vendor/github.com/hedlund/my-cloud-function
cp -r sub1 vendor/github.com/hedlund/my-cloud-function
cp -r sub2 vendor/github.com/hedlund/my-cloud-function
(cd terraform && terraform apply)
rm -rf vendor
First we create a folder matching the fully qualified name of our module, then we simply copy the local packages into that folder. The Terraform script will handle the rest, and cleanup works the same as before.
This is still quite verbose, and we have a lot of hard-coded (and duplicated) strings in here. If we change the module name, we’d have to remember to also change it in our scripts, and if we add another package we’d have to add code for that as well. Let’s improve the script, making it a bit more generic:
#!/bin/bash -e
go mod vendor
# New: extract module name...
module=$(< go.mod grep "^module .*")
module=${module#"module "}
# ...and copy any local package folders (skipping vendor and terraform)
mkdir -p "vendor/$module"
for f in * ; do [ -d "$f" ] && [ "$f" != "vendor" ] && [ "$f" != "terraform" ] && cp -r "$f" "vendor/$module" ; done
(cd terraform && terraform apply)
rm -rf vendor
There are two changes made to the script. First, we extract the name of the module from the go.mod file itself, so we only have to define it in a single location. I’m using grep and parameter expansion to accomplish the task, but there are probably better ways of doing it - I’m no bash ninja.
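One such alternative, assuming the go tool is available wherever the script runs, is to let Go print the module path itself:
# Alternative: ask the Go tool for the module path instead of grepping go.mod
module=$(go list -m)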
Secondly, we loop over everything in the root directory and automatically copy the entries that are folders, skipping the vendor and terraform folders themselves. That way we don’t have to remember to update the script if we add a new package; it will always be copied automatically.
This solution actually works quite well, but there are still some problems with it. First of all, it’s an additional layer on top of Terraform that we always have to run and maintain. More importantly, due to the vendoring and copying of files, the archive will always be marked as changed, and thus in need of redeployment, even when no code has actually changed. Let’s fix that with a bit of Terraform hackery!
“Plain” Terraform
Let’s remove the need for an external script by making it internal instead. I tried a bunch of different approaches and resource types before arriving at this solution. While it’s not perfect, it does work - at least on my machine!
The first thing we’re going to do is to change the archive code back to the original:
data "archive_file" "function_archive" {
type = "zip"
source_dir = "${path.module}/.."
output_path = "${path.module}/../my-function.zip"
excludes = ["terraform", "my-function.zip"]
}
In order for this to work, we need to include the go.mod and go.sum files in the zip.
Then we are going to use a null_resource and a local-exec provisioner to run our script from before, but as part of the normal Terraform process:
locals {
  function_vendor_path = replace(data.archive_file.function_archive.output_path, ".zip", "-vendor.zip")
}

resource "null_resource" "function_vendor" {
  triggers = {
    archive_md5 = data.archive_file.function_archive.output_md5
  }

  provisioner "local-exec" {
    interpreter = ["/bin/bash", "-c"]
    command     = <<-EOT
      cd "$(mktemp -d)"
      unzip ${abspath(data.archive_file.function_archive.output_path)}
      go mod vendor
      module=$(< go.mod grep "^module .*")
      module=$${module#"module "}
      mkdir -p "vendor/$module"
      for f in * ; do [ -d "$f" ] && [ "$f" != "vendor" ] && mv "$f" "vendor/$module" ; done
      rm go.mod && rm go.sum
      rm -f ${abspath(local.function_vendor_path)}
      zip -r ${abspath(local.function_vendor_path)} .
    EOT
  }
}
There’s a lot going on in these few lines. The whole purpose of the code is to take the zip-file Terraform creates of our function, and automatically create a new one that also contains our vendored dependencies. So we first declare a local variable that builds the path to our new zip-file based on the original one, but with a -vendor suffix.
Then we create a null_resource with a trigger that depends on the MD5 hash of the original zip-file. That means that any time the zip-file is updated, the null_resource will also be updated. Thus, if we make any changes to the code, or the dependencies, the null_resource will be updated, but otherwise nothing will happen - exactly what we are after.
On the resource we run a local script, which is quite similar to the one we used earlier. One change is that instead of running in the local folder, we extract the zip-file to a temporary folder and vendor the dependencies there instead. This also means that instead of copying our local packages, we can simply move them into place (and save some space). As this is a temporary copy of the original code, we can also just delete the go.mod and go.sum files before we zip everything back together into our new zip-file, which is placed alongside the original one.
The final thing we need to do, is to upload the new vendored file instead of the original zip-file:
resource "google_storage_bucket_object" "function_archive" {
bucket = var.bucket
name = "my-function-${data.archive_file.function_archive.output_md5}.zip"
# These are the changes:
source = local.function_vendor_path
depends_on = {
null_resource.function_vendor,
}
}
As you can see, we only change the source declaration, meaning it will upload the vendored zip-file while keeping the original name, based on the MD5 of the original zip-file. This way we are actually lying a bit, but there is a good reason for doing so. Because our vendored zip is created outside of the Terraform state, there is no way (that I found, at least) to trigger the upload based on changes made to the “external” vendored file. So what I ended up doing is piggy-backing on the same mechanism we use to trigger creating the file in the first place: changes to the original zip-file, using its MD5 hash in the name as the trigger signal.
Since we also declare that the storage bucket object depends_on the null_resource, we make sure that our custom script runs before Terraform uploads the new zip-file to the storage bucket.
Final words
Is this a perfect solution? Nope, not at all. Is this even a good solution? Borderline, but it works (again, on my machine at least) and it can be nicely bundled into a Terraform module to hide its ugliness.
The best solution would be for Google to up their game and actually support the features of their own language when used together with their cloud solutions… But that’s a topic for another day.