Host DataPM Registry in Google Cloud Run
Google Cloud Run (GCR) can be used to host a public or private DataPM registry.
Before hosting your own registry, be sure to do the following:
- Read the DataPM License
Advantages
- GCR is very cost-efficient.
- GCR scales extremely well as load increases.
- GCR is a very secure platform.
- GCR is serverless, and therefore operational maintenance is relatively low.
- datapm.io is hosted on GCR, and therefore hosting on GCR is highly tested.
- Deployment to GCR can be automated using DataPM's ready-to-go Terraform scripts.
- The Terraform scripts include a Postgres database
- The Terraform scripts do not include an SMTP mail server
Challenges
- GCR is a serverless platform that is not as widely known as other cloud platforms.
- You will need to be familiar with GCR
- Operating GCR is very different than other cloud platforms
- GCR limits a single HTTP connection lifetime to 60 minutes, and therefore may not be suitable for storing some continuously streaming or very large data sets.
WARNING
This guide and the resources it provides are intended as a helpful starter. These resources are not guaranteed to produce a 100% secure deployment of DataPM. Be cautious: review and apply GCP's security best practices documentation before, during, and continuously after using this guide.
By using this guide, you agree to the DataPM License and the liability limitations therein.
Install Terraform Command Line Client
- Install Terraform Command Line Client
- MacOS Homebrew:
brew install terraform
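Before continuing, it can help to confirm that the command line tools used in this guide are on your PATH. The following is a minimal preflight sketch; the function name `require_tools` is hypothetical and is not part of DataPM, Terraform, or gcloud.

```shell
#!/usr/bin/env bash
# Preflight sketch: report any required command that is missing from PATH.
# `require_tools` is a hypothetical helper name used only for this example.
require_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# This guide uses both the Terraform and gcloud command line clients.
if require_tools terraform gcloud; then
  terraform version
fi
```

If either tool is reported as missing, install it before moving on to the next section.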
Prepare Google Cloud Resources
- Open the GCP Console with your Google Account
- Create a GCP billing account
- You may already have an active billing account
- (Optional) Create a new project folder in your GCP organization
- All datapm projects will be placed in this folder
- You can then share access to GCP resources at the folder level
- Create a GCP project.
- (optional) Place this project in the "datapm" folder you created above
- You will need to enable billing for the project
- It is highly recommended to create a new separate project for each instance of DataPM
- Create a GCP Storage Bucket that will be used to hold the terraform state
- Example name "<company-name>-datapm--state"
- Enable public access restriction policy
- Use multiple regions to ensure failover
- Use standard storage class
- Enable object versioning, with about 5 versions retained for at least 7 days
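The bucket settings above can also be applied from the command line instead of the console. The sketch below is one possible way to do it with `gcloud storage`; the bucket name and the `run` dry-run helper are hypothetical, and the lifecycle rule is only one reading of "about 5 versions retained for at least 7 days". By default the script prints the gcloud commands rather than executing them.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical values; replace with your own bucket name and location.
BUCKET="${BUCKET:-example-co-datapm-state}"
LOCATION="${LOCATION:-US}"   # "US" is a multi-region location

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Assumed interpretation of the retention guidance: delete a noncurrent
# version only when 5 newer versions exist AND it has been noncurrent
# for at least 7 days.
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {
      "action": { "type": "Delete" },
      "condition": { "numNewerVersions": 5, "daysSinceNoncurrentTime": 7 }
    }
  ]
}
EOF

# Multi-region, standard storage class, public access prevention enforced.
run gcloud storage buckets create "gs://${BUCKET}" \
  --location="${LOCATION}" \
  --default-storage-class=STANDARD \
  --public-access-prevention

# Turn on object versioning and attach the lifecycle rule written above.
run gcloud storage buckets update "gs://${BUCKET}" \
  --versioning \
  --lifecycle-file=lifecycle.json
```

Review the printed commands first, then re-run with `DRY_RUN=0` once the name and location look right.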
Local Google Cloud Authentication
Use one of the options below to set gcloud authentication for use by the Terraform command.
Authentication Option 1: Use GCloud Application Default Credentials
- Use the following command to set your gcloud application default credentials, which are used by the Terraform commands below.
- Note: this may affect any other Google Cloud related applications you are using from the command line.
gcloud auth application-default login
- Be sure to unset/delete the GOOGLE_APPLICATION_CREDENTIALS environment variable, because it would override the above credentials.
- Linux and MacOS:
unset GOOGLE_APPLICATION_CREDENTIALS
- Windows (PowerShell):
$env:GOOGLE_APPLICATION_CREDENTIALS=''
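Because a leftover GOOGLE_APPLICATION_CREDENTIALS value silently takes precedence, a quick check before running Terraform can save some confusion. The following is a small sketch of such a check; `check_adc_env` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: fail if a key-file variable would override the application
# default credentials. `check_adc_env` is a hypothetical helper name.
check_adc_env() {
  if [ -n "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
    echo "GOOGLE_APPLICATION_CREDENTIALS is set to '${GOOGLE_APPLICATION_CREDENTIALS}'; unset it to use application default credentials" >&2
    return 1
  fi
  echo "GOOGLE_APPLICATION_CREDENTIALS is not set"
}

# Report the current state without aborting the shell session.
check_adc_env || true
```

Once the check passes, running `gcloud auth application-default print-access-token` should print a token if the login step succeeded.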
Authentication Option 2: Use GCP Service Account Credentials
- Create a GCP service account for deployments
- This service account will be used to deploy the GCP resources
- Give it the "Owner" role
- This is highly over provisioned, and should be trimmed down
- Create a service account key and download a JSON copy of it
- This key file will be used during the terraform deployment process
- You should protect this file, and never check it into source control.
- Set your local GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the key file
- Linux and MacOS
export GOOGLE_APPLICATION_CREDENTIALS=<path>
- Windows
$env:GOOGLE_APPLICATION_CREDENTIALS="<path>"
- Add the new service account as a Billing Account Principal
- Give it the "Billing Account Viewer" role
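The service account steps above can be scripted with gcloud rather than clicked through in the console. The sketch below assumes hypothetical project, billing account, and service account names, and uses a hypothetical `run` helper that only prints each command unless `DRY_RUN=0` is set. Depending on your gcloud version, the billing command may only be available under `gcloud beta`.

```shell
#!/usr/bin/env bash
set -e

# Hypothetical values; replace with your own identifiers.
PROJECT_ID="${PROJECT_ID:-example-datapm-project}"
BILLING_ACCOUNT_ID="${BILLING_ACCOUNT_ID:-000000-000000-000000}"
SA_NAME="${SA_NAME:-datapm-deployer}"
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"
# Keep the key outside of any source-controlled directory.
KEY_FILE="${KEY_FILE:-$HOME/datapm-deployer-key.json}"

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create the deployment service account.
run gcloud iam service-accounts create "${SA_NAME}" \
  --project="${PROJECT_ID}" \
  --display-name="DataPM deployer"

# Grant the "Owner" role (over-provisioned, as noted above).
run gcloud projects add-iam-policy-binding "${PROJECT_ID}" \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/owner"

# Create and download a JSON key for the service account.
run gcloud iam service-accounts keys create "${KEY_FILE}" \
  --iam-account="${SA_EMAIL}"

# Grant "Billing Account Viewer" on the billing account.
run gcloud billing accounts add-iam-policy-binding "${BILLING_ACCOUNT_ID}" \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/billing.viewer"
```

After a real run, point GOOGLE_APPLICATION_CREDENTIALS at the path stored in `KEY_FILE`.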
Download and Prepare Terraform Files
- Download the DataPM GCP Terraform Scripts.
- You will need to periodically download new versions of the scripts as they are updated.
- Rename the environment-example.tfvars file to environment.tfvars and modify it
- Rename the backend-example.config file to backend.config and modify it
- The bucket is the name of the Google Cloud Storage bucket you created above.
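For orientation, a backend configuration file for Terraform's Google Cloud Storage backend is a short list of key/value pairs. The sketch below writes a minimal one; the bucket name is hypothetical, and only the `bucket` key is shown because it is the one this guide describes. If the downloaded example file contains other keys, treat that file as authoritative and edit it instead of generating a new one.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical bucket name; use the state bucket you created earlier.
STATE_BUCKET="${STATE_BUCKET:-example-co-datapm-state}"

# Write a minimal backend.config containing only the bucket setting.
# Note: this overwrites any existing backend.config in the current directory.
cat > backend.config <<EOF
bucket = "${STATE_BUCKET}"
EOF

# Show the result for review.
cat backend.config
```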
Run Terraform Commands
- Open a terminal and "cd" into the directory with the scripts
- Run the init command:
terraform init --backend-config="backend.config"
- Run the import command. This will import the existing GCP project into the Terraform state:
terraform import -var-file="environment.tfvars" google_project.project <google-project-id>
- Run the plan command. Be sure to review the output for changes and errors:
terraform plan -var-file="environment.tfvars"
- Run the apply command:
terraform apply -var-file="environment.tfvars"
- The SQL server can take up to 30 minutes to deploy
- The domain certificate will be provisioned immediately, but will take up to 30 minutes to become active
- The GCP global load balancers will return a "Server Error" until the certificate is active
- You can modify the terraform files, and re-run the terraform plan and apply commands above.
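The command sequence above can be collected into a small wrapper script for repeat runs. The sketch below is one way to do that, not the DataPM project's own tooling: the project id, the `FIRST_RUN` switch, and the `run` dry-run helper are hypothetical. It also departs slightly from the steps above by saving the plan to a file (`-out=tfplan`) and applying that file, so that what is applied is exactly what was reviewed.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical values; replace with your own.
PROJECT_ID="${PROJECT_ID:-example-datapm-project}"
VAR_FILE="${VAR_FILE:-environment.tfvars}"

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run terraform init --backend-config="backend.config"

# The import is only needed once; set FIRST_RUN=0 on later runs, because
# importing a resource that is already in the state fails.
if [ "${FIRST_RUN:-1}" = "1" ]; then
  run terraform import -var-file="${VAR_FILE}" google_project.project "${PROJECT_ID}"
fi

# Save the plan so that apply executes exactly what was reviewed.
run terraform plan -var-file="${VAR_FILE}" -out=tfplan
run terraform apply tfplan
```

When running for real, it is worth pausing between the plan and apply steps to read the plan output, as recommended above.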
Ongoing Maintenance
- Periodically return to this page to download the latest DataPM GCP Terraform Scripts.
- Set up Google Cloud Security Command Center to monitor the GCP resources you created.