Mon Mar 17
Secure Secrets Management on GCP
How to build a robust solution with Google Cloud Secret Manager, Cloud KMS, and SOPS.
Introduction
You know that feeling when you accidentally push a database password to GitHub at 4:58 PM on a Friday? Or when you find API keys hardcoded in a three-year-old repository with 200+ stars? I’ve been there. We’ve all been there. That stomach-dropping moment when you realize your “secret” isn’t so secret anymore.
“It’s just a dev environment,” I used to tell myself, right before frantically rotating credentials at midnight while my partner asked why I was sweating at my laptop instead of watching Netflix :(
The truth is, in today’s cloud-first world, managing secrets securely isn’t just a best practice; it’s the difference between a good night’s sleep and an emergency incident response. As our team scaled its GCP infrastructure, the collection of API keys, database credentials, and service account tokens grew faster than my coffee consumption during crunch time (and that’s saying something).
In this article, I’ll share how to implement a comprehensive secret management solution using Google Cloud Secret Manager, Cloud KMS, and SOPS. No more Slack messages asking “Hey, what’s the password for the staging database?” or finding secrets in ancient config files. This approach has transformed how we handle sensitive information, allowing us to scale securely while keeping both our security team and developers happy — a rare achievement indeed.
The Challenge
As our infrastructure grew, we faced several challenges with secret management:
- Security: Storing secrets securely with proper encryption
- Accessibility: Making secrets available to services that need them
- Auditability: Tracking who accessed which secrets and when
- Versioning: Managing changes to secrets over time
- DevOps Integration: Incorporating secrets into our CI/CD pipelines
Our Solution
We built a comprehensive solution using:
- Google Cloud Secret Manager: For centralized secret storage
- Google Cloud KMS: For customer-managed encryption keys (CMEK)
- SOPS: For encrypting secrets in our version control
- Terraform: For infrastructure as code and automation
Architecture Overview
Our architecture follows these principles:
- Project-Specific Deployment: Each GCP project has its own Secret Manager instance
- Centralized Configuration: All configuration is managed in Terraform
- Prefix-Based Organization: Secrets are organized with prefixes (e.g., CM for Collection Manager)
- CMEK Encryption: All secrets are encrypted with customer-managed keys
- SOPS Integration: Secrets are encrypted at rest in our repository
Key Components
1. KMS Configuration
We set up a KMS key ring and crypto key for encrypting our secrets:
resource "google_kms_key_ring" "secrets_key_ring" {
  # google_kms_key_ring has no labels argument; labels go on the crypto key
  name     = var.key_ring_name
  location = var.location
}

resource "google_kms_crypto_key" "secrets_crypto_key" {
  name     = var.crypto_key_name
  key_ring = google_kms_key_ring.secrets_key_ring.id

  # Duration in seconds with an "s" suffix, e.g. "7776000s" for 90 days
  rotation_period = var.rotation_period

  version_template {
    algorithm        = "GOOGLE_SYMMETRIC_ENCRYPTION"
    protection_level = "SOFTWARE"
  }

  labels = {
    terraform = "true"
  }
}
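The `rotation_period` value has to be a duration in seconds with an `s` suffix, and Cloud KMS requires it to be at least one day. As a small illustration, here is a hypothetical Python helper (not part of our original setup) that turns a number of days into a valid value for `var.rotation_period`:

```python
from datetime import timedelta

# Cloud KMS requires rotation periods of at least one day.
MIN_ROTATION = timedelta(days=1)


def kms_rotation_period(days: int) -> str:
    """Return a rotation_period string in the "<seconds>s" format."""
    period = timedelta(days=days)
    if period < MIN_ROTATION:
        raise ValueError("rotation period must be at least 1 day")
    return f"{int(period.total_seconds())}s"


if __name__ == "__main__":
    # 90 days * 86400 seconds per day
    print(kms_rotation_period(90))  # 7776000s
```

The output can be pasted into a tfvars file, which avoids hand-computing second counts.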
2. Secret Manager Service Identity
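We create the Secret Manager service identity (service agent) required for CMEK. Before Secret Manager can use the key, this identity also needs the Cloud KMS CryptoKey Encrypter/Decrypter role (roles/cloudkms.cryptoKeyEncrypterDecrypter) on it:
# Assumes a data "google_project" "current" block is declared elsewhere
resource "null_resource" "create_secretmanager_identity" {
  # Re-run only when the project changes
  triggers = {
    project_id = data.google_project.current.project_id
  }

  # Requires the gcloud CLI on the machine that runs Terraform
  provisioner "local-exec" {
    command = "gcloud beta services identity create --service=secretmanager.googleapis.com --project=${data.google_project.current.project_id}"
  }
}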
3. Secret Module
We created a reusable module for creating secrets with consistent configuration:
module "cm_database_url" {
  source = "./modules/secret"

  prefix    = "CM"
  secret_id = "database-url"

  # Decrypted by the SOPS Terraform provider's sops_file data source
  secret_value = data.sops_file.secrets.data["cm_database_url"]

  kms_key_id = google_kms_crypto_key.secrets_crypto_key.id
  location   = var.location

  # Principals allowed to read the secret
  secret_accessor_members = [
    "serviceAccount:${var.terraform_service_account}"
  ]
}
4. SOPS Integration
We use SOPS to encrypt our secrets in the repository:
# Encrypted secrets.yaml
cm_database_url: ENC[AES256_GCM,data:D8OoAhTP6GrRNSX1XoWnjfEk3OMD0WEC3FosXNAONA6ALX2o5Oat0TtwPIKYlyJMUmG9FnL4zDo9KAz1HlF1Qk0sj2S94v2gbwAq+v3F3qejhqiUCvWJYzWqoAAJavzz9iYPccOrvKnny9AiwiGQIYUsv4jb+0GOcLo9oxN2T0X3JDVyhTTlen0h4GP5HpF5+RTNkwH0z4k9ctIpnEk=,iv:VbeQikUVlqsXOCn5aKZk2VygkF1B1mzsmMMi/1tDypk=,tag:3dsRsy/R2ZFA3PXIsATunA==,type:str]
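One check worth automating is that nobody commits a plaintext value next to the encrypted ones. The following is a minimal sketch, assuming a flat file of `key: value` lines like the example above. It uses only the standard library, so it does not parse general YAML, and the function name `find_plaintext_keys` is hypothetical:

```python
import re
from typing import List

# SOPS wraps each encrypted scalar as ENC[AES256_GCM,data:...,iv:...,tag:...,type:...]
ENCRYPTED_VALUE = re.compile(r"^ENC\[AES256_GCM,data:.+,iv:.+,tag:.+,type:\w+\]$")


def find_plaintext_keys(text: str) -> List[str]:
    """Return top-level keys whose values are not SOPS-encrypted.

    Assumes flat `key: value` lines. The `sops:` metadata block that
    SOPS appends, and everything indented under it, is skipped.
    """
    plaintext = []
    in_sops_metadata = False
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # blank lines and comments
        if line.startswith("sops:"):
            in_sops_metadata = True
            continue
        if in_sops_metadata and line[0] in " \t-":
            continue  # still inside the sops metadata block
        in_sops_metadata = False
        key, _, value = line.partition(":")
        if not ENCRYPTED_VALUE.match(value.strip()):
            plaintext.append(key.strip())
    return plaintext


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], encoding="utf-8") as handle:
        leaked = find_plaintext_keys(handle.read())
    if leaked:
        sys.exit(f"unencrypted keys: {', '.join(leaked)}")
```

Run as a pre-commit hook or CI step, a non-zero exit blocks the commit before a plaintext secret reaches the repository.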
Secret Rotation Strategy
Implementing a robust secret rotation strategy is critical for maintaining long-term security. Here’s our approach:
1. Automated Rotation Framework
We built a secret rotation framework using Cloud Functions and Cloud Scheduler:
resource "google_cloud_scheduler_job" "secret_rotation_trigger" {
  name        = "secret-rotation-trigger"
  description = "Triggers the secret rotation function on schedule"
  schedule    = "0 0 1 * *" # 00:00 on the 1st of every month
  time_zone   = "UTC"

  http_target {
    uri         = google_cloudfunctions_function.rotate_secrets.https_trigger_url
    http_method = "POST"

    # Call the function as the rotation service account
    oidc_token {
      service_account_email = google_service_account.rotation_service_account.email
    }
  }
}

resource "google_cloudfunctions_function" "rotate_secrets" {
  name        = "rotate-secrets"
  runtime     = "python39"
  entry_point = "rotate_secrets"

  # Needed for an HTTP-triggered function; populates https_trigger_url
  trigger_http = true

  # Function code that handles rotation logic
  source_archive_bucket = google_storage_bucket.function_bucket.name
  source_archive_object = google_storage_bucket_object.rotate_secrets_code.name
  service_account_email = google_service_account.rotation_service_account.email

  environment_variables = {
    SECRET_PREFIXES    = "CM,API,DB"
    NOTIFICATION_TOPIC = google_pubsub_topic.rotation_notifications.name
  }
}
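The function body itself is outside the scope of this article, but here is a rough sketch of how a `rotate_secrets` entry point might use `SECRET_PREFIXES` to decide which secrets to touch. The helper names and the `<PREFIX>_<name>` naming scheme are assumptions made for illustration, and a plain list stands in for the secrets that would be listed from Secret Manager:

```python
import os
from typing import Iterable, List, Mapping


def parse_prefixes(env: Mapping[str, str]) -> List[str]:
    """Read the comma-separated SECRET_PREFIXES variable, e.g. "CM,API,DB"."""
    raw = env.get("SECRET_PREFIXES", "")
    return [part.strip() for part in raw.split(",") if part.strip()]


def select_secrets(secret_ids: Iterable[str], prefixes: List[str], sep: str = "_") -> List[str]:
    """Keep only the secret IDs that belong to one of the configured prefixes.

    The "<PREFIX><sep><name>" naming scheme is an assumption of this sketch.
    """
    wanted = tuple(f"{prefix}{sep}" for prefix in prefixes)
    if not wanted:
        return []
    return [secret_id for secret_id in secret_ids if secret_id.startswith(wanted)]


if __name__ == "__main__":
    prefixes = parse_prefixes(os.environ)
    print(select_secrets(["CM_database-url", "WEB_session-key"], prefixes))
```

Matching on the prefix plus separator, rather than the bare prefix, keeps a `CM` rotation from also picking up a secret that merely starts with the same letters.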
2. Zero-downtime Rotation Process
Our rotation process follows these steps to ensure zero-downtime:
- Create New Version: Generate a new secret value and add it as a new version
- Dual Availability Period: Keep both old and new versions accessible for a configurable period
- Staged Rollout: Update services gradually to use the new secret version
- Monitoring: Verify all systems are using the new version
- Disable Old Version: Once all services are migrated, disable the old version
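The steps above can be sketched as a small state machine. This is an in-memory model rather than our production code: the `Secret` class stands in for a Secret Manager secret, and the way services report which version they use is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class State(Enum):
    ENABLED = "ENABLED"
    DISABLED = "DISABLED"


@dataclass
class Version:
    number: int
    value: str
    state: State = State.ENABLED


@dataclass
class Secret:
    """In-memory stand-in for one secret and its versions."""
    versions: List[Version] = field(default_factory=list)

    def add_version(self, value: str) -> Version:
        version = Version(number=len(self.versions) + 1, value=value)
        self.versions.append(version)
        return version

    def enabled(self) -> List[int]:
        return [v.number for v in self.versions if v.state is State.ENABLED]


def start_rotation(secret: Secret, new_value: str) -> Version:
    """Add the new version; older versions stay enabled (dual availability)."""
    return secret.add_version(new_value)


def finish_rotation(secret: Secret, new: Version, services: Dict[str, int]) -> bool:
    """Disable old versions only once every service uses the new one.

    `services` maps a service name to the version number it currently reads.
    Returns True if the old versions were disabled.
    """
    lagging = [name for name, used in services.items() if used != new.number]
    if lagging:
        return False  # still inside the dual-availability period
    for version in secret.versions:
        if version.number < new.number:
            version.state = State.DISABLED
    return True
```

Disabling rather than destroying the old version keeps a rollback path open if a straggler turns up after the cutover.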
3. Risk-Based Rotation Schedules
We implement different rotation schedules based on secret sensitivity:
Secret Type          | Rotation Frequency | Example
API Keys             | Monthly            | External API credentials
Database Credentials | Quarterly          | Production DB passwords
Encryption Keys      | Bi-annually        | Data encryption keys
Service Accounts     | Annually           | Service identity credentials
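To make the schedule enforceable, the rotation job needs a way to decide whether a given secret is due. The sketch below makes a few assumptions: the type keys are made up, the intervals are day-count approximations of the frequencies above, and "bi-annually" is read as every six months because it sits between quarterly and annually.

```python
from datetime import date, timedelta
from typing import Optional

# Approximate day counts for the schedule; "bi-annually" is read as six months.
ROTATION_INTERVALS = {
    "api_key": timedelta(days=30),
    "database_credential": timedelta(days=90),
    "encryption_key": timedelta(days=182),
    "service_account": timedelta(days=365),
}


def is_rotation_due(secret_type: str, last_rotated: date, today: Optional[date] = None) -> bool:
    """Return True once the interval for this secret type has elapsed."""
    today = today or date.today()
    return today - last_rotated >= ROTATION_INTERVALS[secret_type]


if __name__ == "__main__":
    print(is_rotation_due("api_key", date(2025, 1, 1), today=date(2025, 1, 31)))  # True
```

An unknown secret type raises `KeyError` on purpose, so a new category cannot silently go unrotated.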
4. Integration with Incident Response
We’ve integrated secret rotation with our incident response processes:
resource "google_pubsub_topic" "rotation_notifications" {
  name = "secret-rotation-notifications"
}

resource "google_pubsub_subscription" "slack_notifications" {
  name  = "secret-rotation-slack-alerts"
  topic = google_pubsub_topic.rotation_notifications.name

  push_config {
    push_endpoint = "https://our-slack-webhook-endpoint.com"

    attributes = {
      x-goog-version = "v1"
    }
  }
}

resource "google_pubsub_subscription" "pagerduty_alerts" {
  name  = "secret-rotation-pagerduty-alerts"
  topic = google_pubsub_topic.rotation_notifications.name

  push_config {
    push_endpoint = "https://events.pagerduty.com/integration/abcdef123456/enqueue"

    attributes = {
      x-goog-version = "v1"
    }
  }
}
5. Verification and Compliance
Each rotation automatically generates compliance reports:
- Rotation timestamp and executor identity
- Services affected by the rotation
- Verification status of deployment
- Duration of the dual-availability period
- Hash of the old secret (for verification purposes)
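A report with the fields listed above might be assembled like this. Treat it as a sketch: the `RotationReport` class, its field names, and the choice of SHA-256 are illustrative assumptions, not a description of our actual report format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass
class RotationReport:
    secret_id: str
    rotated_at: str                 # ISO 8601 timestamp in UTC
    executor: str                   # identity that performed the rotation
    services_affected: List[str]
    deployment_verified: bool
    dual_availability_seconds: int
    old_secret_sha256: str          # hash only; the old value is never stored


def build_report(
    secret_id: str,
    old_value: bytes,
    executor: str,
    services_affected: List[str],
    deployment_verified: bool,
    dual_availability: timedelta,
    rotated_at: datetime,
) -> RotationReport:
    """Collect the compliance fields for one rotation."""
    return RotationReport(
        secret_id=secret_id,
        rotated_at=rotated_at.astimezone(timezone.utc).isoformat(),
        executor=executor,
        services_affected=list(services_affected),
        deployment_verified=deployment_verified,
        dual_availability_seconds=int(dual_availability.total_seconds()),
        old_secret_sha256=hashlib.sha256(old_value).hexdigest(),
    )


def to_json(report: RotationReport) -> str:
    """Serialize the report for storage or for a notification payload."""
    return json.dumps(asdict(report), sort_keys=True)
```

One caveat on the hash: a plain digest of a short or guessable secret can be brute-forced offline, so a keyed HMAC is worth considering if the reports are widely readable.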
This rotation strategy limits how long a leaked credential stays useful, while providing operational reliability and compliance documentation.
Benefits of Our Approach
1. Enhanced Security
- Multiple Layers of Encryption: Secrets are encrypted by SOPS in our repository and by KMS in Secret Manager
- CMEK Control: We maintain control over the encryption keys
- IAM Integration: Fine-grained access control through IAM policies
- Automated Rotation: Regular rotation reduces the impact of potential compromises
2. Developer Experience
- Transparent Workflow: Developers can edit encrypted secrets using SOPS
- Prefix Organization: Logical grouping of secrets by service or purpose
- Validation Tools: Scripts to validate secret integrity
- Self-Service Access: Clearly defined processes for requesting access
3. Operational Excellence
- Infrastructure as Code: Everything is defined in Terraform
- Automated Testing: Scripts to verify secret accessibility and configuration
- Audit Trail: GCP provides comprehensive audit logs for secret access
- Zero-Downtime Rotations: Services continue operating during secret changes
Implementation Steps
If you’re looking to implement a similar solution, here’s a step-by-step guide:
- Set Up SOPS: Install SOPS and configure it with age encryption
- Create Terraform Configuration: Define KMS and Secret Manager resources
- Encrypt Secrets: Use SOPS to encrypt sensitive values
- Deploy Infrastructure: Apply Terraform configuration
- Validate Setup: Run validation scripts to ensure everything works
- Implement Rotation: Set up automation for regular secret rotation
- Document Processes: Create clear documentation for team members
Lessons Learned
Throughout our implementation journey, we learned several valuable lessons:
- Centralize Configuration: Keep all configuration in one place for easier management
- Use Prefixes: Organize secrets with prefixes for better categorization
- Automate Validation: Create scripts to validate secret integrity
- Document Everything: Maintain comprehensive documentation for the team
- Implement Risk-Based Rotation: Apply different rotation frequencies based on secret sensitivity
Conclusion
Building a secure and scalable secret management system is an investment that pays dividends in both security posture and operational efficiency. It’s also an investment in your team’s mental health — no more 2 AM panic attacks when someone remembers they left an AWS key in a Jupyter notebook.
Our implementation of GCP Secret Manager with KMS, SOPS, and automated rotation has significantly improved our security stance while enhancing developer productivity. More importantly, it’s reduced our collective anxiety by about 73% (completely scientific measurement, trust me).
The architecture we’ve shared is battle-tested in production environments and can scale to handle thousands of secrets across multiple projects. I’ve watched this system successfully manage secrets during traffic spikes, midnight deployments, and even that time when half the team was out with the flu and the other half was at a conference.
By implementing a similar approach, you can address the core challenges of secret management while giving your security team something to smile about for once. Because let’s be honest — making your security team happy is like finding a unicorn that also brings you coffee.
Remember that secret management is not a one-time project but an ongoing relationship. Like any good relationship, it needs attention, care, and occasional therapy (in the form of security assessments). Regular reviews, rotation schedules, and security audits should be part of your operational rhythm to ensure your secrets remain secure over time.
And the next time someone asks for a production password over Slack, you can smugly point them to your beautifully designed secret management system instead.
What secret management challenges have you faced in your organization? Have you implemented a similar solution or taken a different approach? Share your experiences in the comments below!
Originally published on Medium.