This tutorial is part of the Bytebase Terraform Provider series.

What You’ll Learn

  • Define semantic types with various masking algorithms
  • Configure data classification levels and categories
  • Create global masking policies that apply workspace-wide
  • Set up database-specific column masking
  • Grant masking exceptions for specific users

Prerequisites

Before starting this tutorial, ensure you have:

Setup

From the previous tutorials, you should have:
  • Database instances and projects configured
  • Users and access controls set up
  • Production database hr_prod with employee data

Understanding Data Masking in Bytebase

Bytebase employs two concepts for data masking:
  • Semantic Types: Define the masking algorithms (e.g., full-mask, range-mask)
  • Classifications: Group data by sensitivity levels (e.g., Level 1, Level 2)
Classifications cannot mask data by themselves. You must map a classification level to a semantic type for masking to occur (e.g., Level 2 → full-mask).
You can apply masking in two ways:
  1. Global masking rules: Define workspace-wide rules that map to semantic types
    • Match by column names or patterns → semantic type
    • Match by classification levels → semantic type
  2. Column-level masking: Apply directly to specific columns
    • Assign semantic types directly to columns
    • Assign classifications to columns (which then use semantic types via global rules)

Configure Data Masking

Step 1: Define Semantic Types

Terraform resource: bytebase_setting
Sample file: 8-1-semantic-types.tf
Create 8-1-semantic-types.tf to define masking algorithms:
8-1-semantic-types.tf
resource "bytebase_setting" "semantic_types" {
  name = "settings/SEMANTIC_TYPES"

  semantic_types {
    id    = "full-mask"
    title = "Full mask"
    algorithm {
      full_mask {
        substitution = "***"
      }
    }
  }

  semantic_types {
    id    = "date-year-mask"
    title = "Date year mask"
    algorithm {
      range_mask {
        slices {
          start        = 0
          end          = 4
          substitution = "****"
        }
      }
    }
  }

  semantic_types {
    id    = "name-first-letter-only"
    title = "Name first letter only"
    algorithm {
      inner_outer_mask {
        prefix_len   = 1
        suffix_len   = 0
        substitution = "*"
        type         = "INNER"
      }
    }
  }
}
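A range_mask can presumably carry more than one slice. The fragment below is a minimal sketch of that idea and is not part of the sample files: the id date-year-day-mask and its title are hypothetical, and it assumes the provider accepts repeated slices blocks and treats each slice as the range from start up to, but not including, end, as the date-year-mask definition above suggests. It would be added inside the existing bytebase_setting.semantic_types resource:

```hcl
  # Hypothetical extra entry inside resource "bytebase_setting" "semantic_types".
  # Assumes multiple slices blocks are allowed in one range_mask.
  semantic_types {
    id    = "date-year-day-mask"
    title = "Date year and day mask"
    algorithm {
      range_mask {
        # Characters 0-3: the year in a YYYY-MM-DD value.
        slices {
          start        = 0
          end          = 4
          substitution = "****"
        }
        # Characters 8-9: the day in a YYYY-MM-DD value.
        slices {
          start        = 8
          end          = 10
          substitution = "**"
        }
      }
    }
  }
```

Under those assumptions, a value such as 1990-05-17 would keep only its month visible.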

Step 2: Define Data Classification

Terraform resource: bytebase_setting
Sample file: 8-2-classification.tf
Create 8-2-classification.tf to organize data by sensitivity levels:
8-2-classification.tf
resource "bytebase_setting" "classification" {
  name = "settings/DATA_CLASSIFICATION"

  classification {
    id    = "classification-example"
    title = "Classification Example"

    levels {
      id    = "1"
      title = "Level 1"
    }
    levels {
      id    = "2"
      title = "Level 2"
    }

    classifications {
      id    = "1"
      title = "Basic"
    }

    classifications {
      id    = "1-1"
      title = "User basic"
      level = "1"
    }

    classifications {
      id    = "1-2"
      title = "User contact info"
      level = "2"
    }

    classifications {
      id    = "2"
      title = "Employment"
    }

    classifications {
      id    = "2-1"
      title = "Employment info"
      level = "2"
    }
  }
}

Step 3: Apply Basic Configuration

Apply the semantic types and classification configuration:
terraform plan
terraform apply
Verify in Bytebase:
  1. Click Data Access > Semantic Types in the left sidebar. You should see three masking types configured.
  2. Click Data Access > Data Classification in the left sidebar. You should see the classification hierarchy with two levels. Note that Level 2 is marked as more sensitive.

Step 4: Apply Global Masking Policy

Now that you’ve defined your masking methods, apply them workspace-wide using a global policy.
Classification levels must be mapped to semantic types to perform actual masking. Classification defines the sensitivity level, while semantic types define the masking algorithm.
Terraform resource: bytebase_policy
Sample file: 8-3-global-data-masking.tf
Create 8-3-global-data-masking.tf to apply workspace-wide masking rules:
8-3-global-data-masking.tf
resource "bytebase_policy" "global_masking_policy" {
  depends_on = [
    bytebase_instance.prod,
    bytebase_setting.environments
  ]

  parent              = "workspaces/-"
  type                = "MASKING_RULE"
  enforce             = true
  inherit_from_parent = false

  global_masking_policy {

    rules {
      condition     = "column_name == \"birth_date\""
      id            = "birth-date-mask"
      semantic_type = "date-year-mask"
      title         = "Mask Birth Date Year"
    }

    rules {
      condition     = "column_name == \"last_name\""
      id            = "last-name-first-letter-only"
      semantic_type = "name-first-letter-only"
      title         = "Last Name Only Show First Letter"
    }

    rules {
      condition     = "classification_level in [\"2\"]"
      id            = "classification-level-2"
      semantic_type = "full-mask" # Maps Level 2 classification to full-mask semantic type
      title         = "Full Mask for Classification Level 2"
    }
  }
}
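Because conditions are written as expressions, a rule can in principle be narrowed further. The fragment below is a sketch under assumptions, not part of the sample files: the rule id, the title, and the first_name column are hypothetical, and table_name is assumed to be available as a condition variable alongside column_name, combinable with &&. Check the provider documentation before relying on it. It would sit inside the global_masking_policy block next to the rules above:

```hcl
    # Hypothetical extra rule inside global_masking_policy { ... }.
    # Assumes table_name is a supported condition variable and that
    # conditions can be combined with &&.
    rules {
      condition     = "table_name == \"employee\" && column_name == \"first_name\""
      id            = "employee-first-name-first-letter-only"
      semantic_type = "name-first-letter-only"
      title         = "Employee First Name Only Show First Letter"
    }
```

Scoping a rule to one table like this would avoid masking same-named columns in unrelated tables, at the cost of maintaining more rules.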
Apply the global policy:
terraform plan
terraform apply
Verify in Bytebase:
  1. Click Data Access > Global Masking. You should see the global policy with three conditions and their corresponding semantic types.
  2. Log in as Developer 1 (dev1@example.com), then go to SQL Editor to access hr_prod. Double-click the employee table on the left. birth_date is masked by the Mask Birth Date Year rule, and last_name by the Last Name Only Show First Letter rule. Hovering over the eye icon shows the masking reason.

Step 5: Apply Column-Specific Masking

Terraform resource: bytebase_database
Sample file: 8-4-database-masking.tf
Create 8-4-database-masking.tf to apply masking to specific columns:
  • Column from_date is assigned the semantic type date-year-mask
  • Column amount is assigned the classification 2-1 (Employment info)
8-4-database-masking.tf
resource "bytebase_database" "database" {
  depends_on = [
    bytebase_instance.prod,
    bytebase_project.project-two,
    bytebase_setting.environments
  ]

  name        = "instances/prod-sample-instance/databases/hr_prod"
  project     = bytebase_project.project-two.name
  environment = bytebase_setting.environments.environment_setting[0].environment[1].name

  catalog {
    schemas {
      name = "public"
      tables {
        name = "salary"
        columns {
          name          = "from_date"
          semantic_type = "date-year-mask"
        }
        columns {
          name           = "amount"
          classification = "2-1"
        }
      }
    }
  }
}
Apply the column-specific masking:
terraform plan
terraform apply
Verify in Bytebase:
  1. Go into Project Two, then click Database > Databases and click hr_prod.
  2. Scroll down to the salary table and click it. You should see:
    • amount is assigned the Employment info (Level 2) classification
    • from_date is assigned the date-year-mask semantic type
  3. Log in as Developer 1 (dev1@example.com), then go to SQL Editor to access hr_prod. Double-click the salary table on the left. from_date is masked by the Date year mask semantic type, and amount is fully masked because its Level 2 classification maps to the Full mask semantic type.

Step 6: Grant Masking Exceptions

Terraform resource: bytebase_policy
Sample file: 8-5-masking-exception.tf
Create 8-5-masking-exception.tf to grant bypass permissions:
  • Workspace Admin (admin@example.com) has Masking Exemptions for birth_date in table employee for Query
  • Workspace Admin (admin@example.com) has Masking Exemptions for last_name in table employee for Export
8-5-masking-exception.tf
resource "bytebase_policy" "masking_exception_policy" {
  depends_on = [
    bytebase_project.project-two,
    bytebase_instance.prod
  ]

  parent              = bytebase_project.project-two.name
  type                = "MASKING_EXCEPTION"
  enforce             = true
  inherit_from_parent = false

  masking_exception_policy {
    exceptions {
      reason           = "Business requirement"
      database         = "instances/prod-sample-instance/databases/hr_prod"
      table            = "employee"
      column           = "birth_date"
      member           = "user:admin@example.com"
      action           = "QUERY"
      expire_timestamp = "2027-07-30T16:11:49Z"
    }
    exceptions {
      reason           = "Export data for analysis"
      database         = "instances/prod-sample-instance/databases/hr_prod"
      table            = "employee"
      column           = "last_name"
      member           = "user:admin@example.com"
      action           = "EXPORT"
      expire_timestamp = "2027-07-30T16:11:49Z"
    }
  }
}
2027-07-30T16:11:49Z is an ISO 8601 timestamp in UTC. Bytebase stores its metadata in PostgreSQL, where this value is saved as a timestamptz.
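Both exceptions repeat the same database, member, and expiry. One possible way to keep them in sync is to move the shared values into locals and generate the exceptions blocks with a standard Terraform dynamic block. This is a sketch of an alternative to the file above, not an addition to it: the local names are hypothetical, and it assumes exceptions is a nested block in the provider schema (as the block syntax above suggests), which is what dynamic requires.

```hcl
locals {
  # Shared values (hypothetical local names) so all exceptions stay in sync.
  hr_prod_database = "instances/prod-sample-instance/databases/hr_prod"
  exception_expiry = "2027-07-30T16:11:49Z"

  # One entry per exception, keyed by column name in the employee table.
  admin_exceptions = {
    birth_date = { action = "QUERY", reason = "Business requirement" }
    last_name  = { action = "EXPORT", reason = "Export data for analysis" }
  }
}

resource "bytebase_policy" "masking_exception_policy" {
  depends_on = [
    bytebase_project.project-two,
    bytebase_instance.prod
  ]

  parent              = bytebase_project.project-two.name
  type                = "MASKING_EXCEPTION"
  enforce             = true
  inherit_from_parent = false

  masking_exception_policy {
    # Emits one exceptions block per entry in local.admin_exceptions.
    dynamic "exceptions" {
      for_each = local.admin_exceptions
      content {
        reason           = exceptions.value.reason
        database         = local.hr_prod_database
        table            = "employee"
        column           = exceptions.key
        member           = "user:admin@example.com"
        action           = exceptions.value.action
        expire_timestamp = local.exception_expiry
      }
    }
  }
}
```

With this layout, extending the expiry or adding another column would mean changing a single local value rather than editing each block by hand.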

Step 7: Apply Final Configuration and Test

Apply the masking exceptions and test everything:
terraform plan
terraform apply
Verify the masking exceptions are working:
  1. Log in as Workspace Admin (admin@example.com), then go to SQL Editor to access hr_prod and double-click the employee table on the left. birth_date should no longer be masked.
  2. Click Export, then open the file. birth_date should still be masked, while last_name is no longer masked.
  3. Go to Manage > Masking Exemptions to view the current exemptions. They expire automatically at the expiration time.

Summary

This tutorial demonstrated how to implement data masking in Bytebase using Terraform. Here are the key concepts:
Define Phase:
  • Semantic Types: Define reusable masking algorithms
  • Classification: Organize data by sensitivity levels (requires mapping to semantic types)
Apply Phase:
  • Global Policies: Apply masking rules workspace-wide based on conditions
  • Column-Level Masking: Apply semantic types or classifications to specific columns
Additional Control:
  • Masking Exemption: Grant bypass permissions for specific users to query/export unmasked data

Next Steps

Congratulations! You’ve completed the Bytebase Terraform tutorial series. You now have a fully configured Bytebase workspace with:
  • Database instances and environments
  • Organized projects
  • Risk policies and approval workflows
  • SQL review rules for schema standards
  • Database access control
  • Data masking for sensitive information