Skip to main content
Optimize migration performance and effectively manage schema drift in your GitOps workflow.

Performance Optimization

Batch Large Data Migrations

For DML on large tables, process in batches:
-- 050__archive_old_logs_dml.sql
-- Instead of single large DELETE:
-- ❌ DELETE FROM logs WHERE created_at < '2024-01-01';

-- Use batched approach:
DO $$
DECLARE
    batch_size INT := 10000;
    deleted_count INT;
BEGIN
    LOOP
        DELETE FROM logs
        WHERE id IN (
            SELECT id FROM logs
            WHERE created_at < '2024-01-01'
            LIMIT batch_size
        );

        GET DIAGNOSTICS deleted_count = ROW_COUNT;
        EXIT WHEN deleted_count < batch_size;

        COMMIT; -- If supported
        PERFORM pg_sleep(0.1); -- Throttle
    END LOOP;
END $$;
Benefits:
  • Reduces lock contention
  • Prevents transaction log bloat
  • Allows monitoring progress
  • Can be paused/resumed

Use Online Schema Changes for Large Tables

For MySQL tables > 1M rows:
-- Instead of:
-- ❌ 010__add_index.sql
CREATE INDEX idx_users_email ON users(email);

-- Use gh-ost:
-- ✅ 010__add_index_ghost.sql
ALTER TABLE users ADD INDEX idx_users_email (email);

Online Schema Migration

Configure gh-ost for zero-downtime MySQL migrations

Optimize Index Creation

Create indexes concurrently when possible:
-- PostgreSQL: Non-blocking index creation
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);

-- MySQL: Use online DDL
ALTER TABLE users ADD INDEX idx_email (email), ALGORITHM=INPLACE, LOCK=NONE;
Note: CONCURRENTLY requires:
  • Cannot be in transaction block
  • Takes longer than regular index creation
  • May fail if concurrent writes conflict

Timing Considerations

Best times for migrations:
  • ✅ Low-traffic hours (nights, weekends)
  • ✅ During maintenance windows
  • ✅ After application deployments
Avoid:
  • ❌ Peak business hours
  • ❌ During critical operations (month-end, sales events)
  • ❌ Without communication to stakeholders

Monitor Migration Performance

Track execution time during development:
-- Add timing to migration
\timing on
-- Migration SQL here
\timing off
Set expectations for production:
-- 015__large_migration.sql
-- Expected duration: 5-10 minutes
-- Locks: users table (write lock)
-- Tested on: 2M row table

Handling Schema Drift

Schema drift occurs when manual changes bypass GitOps.

Detection

Bytebase drift detection identifies untracked changes:

Drift Detection

Detect and manage schema drift
How drift detection works:
  1. Bytebase periodically scans database schema
  2. Compares with GitOps tracked schema
  3. Identifies differences (new columns, indexes, etc.)
  4. Alerts team of drift

Prevention

1. Restrict Direct Database Access Use readonly users for most access:
-- Default user: read-only
GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_user;

-- Special user for emergencies only
GRANT ALL ON DATABASE app_db TO emergency_user;
2. Require All Changes Through GitOps Enforce policy through:
  • Database access controls
  • Team documentation
  • Code review requirements
  • Regular audits
3. Monitor Drift Alerts Set up notifications:
{
  "drift_detection": {
    "enabled": true,
    "schedule": "0 */6 * * *",  // Every 6 hours
    "notification": {
      "channels": ["slack", "email"]
    }
  }
}
4. Regular Drift Checks Audit schema consistency:
  • Weekly automated scans
  • Monthly manual reviews
  • After emergency changes

Resolution

When drift is detected: Option 1: Reverse the drift
-- Revert manual change to match GitOps state
ALTER TABLE users DROP COLUMN unauthorized_column;
Option 2: Incorporate into GitOps
-- Create migration reflecting the change
-- 055__add_emergency_column.sql
ALTER TABLE users ADD COLUMN emergency_column TEXT;
Decision matrix:
ScenarioAction
Unauthorized changeReverse immediately
Emergency hotfixDocument and create migration
Testing/developmentReverse in non-prod
Production emergencyIncorporate, review process

Drift Incident Response

When drift occurs:
  1. Detect - Alert received
  2. Assess - Identify change and impact
  3. Communicate - Notify team
  4. Resolve - Reverse or incorporate
  5. Document - Post-incident review
  6. Prevent - Update processes
Incident template:
# Drift Incident: [Date]

## Summary
- **Detected:** 2025-01-20 14:30 UTC
- **Database:** production-db
- **Change:** Column `temp_fix` added to `users` table

## Impact
- No customer impact
- Not tracked in GitOps

## Root Cause
- Emergency hotfix bypassed GitOps
- Developer had elevated permissions

## Resolution
- Created migration 056__add_temp_fix.sql
- Updated GitOps repository

## Prevention
- Removed elevated permissions
- Added approval requirement for prod access

Performance Monitoring

Track Migration Metrics

Monitor these metrics:
MetricTargetAlert Threshold
Migration duration< 5 min> 10 min
Lock wait time< 1 sec> 5 sec
Rows affectedDocumented> Expected
Rollback success rate100%< 100%

Database Health Checks

Before and after migrations:
-- Check table sizes
SELECT
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;

-- Check index usage
SELECT
    schemaname,
    tablename,
    indexname,
    idx_scan,
    pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE idx_scan < 100
ORDER BY pg_relation_size(indexrelid) DESC;

Next Steps