Bootstrap Process
The bootstrap process loads existing data from your source database into Kasho while ensuring zero data loss during the transition.
Understanding Bootstrap States
pg-change-stream operates in three states during its lifecycle:
WAITING
No replication slot exists. Service is ready to begin bootstrap.
ACCUMULATING
Replication slot created. Capturing changes during initial data load.
STREAMING
Normal operation. Streaming all changes to pg-translicator.
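A script that drives bootstrap usually needs to know which of these states the service is in. The sketch below is one way to do that in bash. It assumes only that the status output contains one of the three state names somewhere in its text, since the exact shape of the GetStatus response is not shown in this guide. The helper names `extract_state` and `wait_for_state` are hypothetical.

```shell
#!/usr/bin/env bash
# Print the first bootstrap state name found on stdin; fail if none is present.
# Assumption: the status text contains one of the three state names verbatim.
extract_state() {
  grep -oE 'WAITING|ACCUMULATING|STREAMING' | head -n 1 | grep .
}

# Poll until the status command reports the wanted state.
#   $1 = wanted state, $2 = maximum attempts, $3 = seconds between attempts,
#   remaining arguments = the command that prints the service status.
wait_for_state() {
  local wanted=$1 attempts=$2 delay=$3 state i
  shift 3
  for (( i = 1; i <= attempts; i++ )); do
    state=$("$@" 2>/dev/null | extract_state)
    if [ "$state" = "$wanted" ]; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Usage with the status call from this guide (needs grpcurl and a running service):
# wait_for_state WAITING 12 5 \
#   grpcurl -plaintext pg-change-stream:50051 kasho.ChangeStreamService/GetStatus
```

Passing the status command as arguments keeps the polling logic independent of how the status is fetched, so the same helper can wait for WAITING before bootstrap and for STREAMING after it.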
How Bootstrap Works
The bootstrap process ensures data consistency by:
- Creating a consistent snapshot of the source database at a specific point in time
- Starting change accumulation from that exact point
- Loading the snapshot data into the target system
- Transitioning to streaming with all accumulated changes
This approach guarantees that no changes are lost during the initial data migration.
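The guarantee rests on PostgreSQL ordering changes by log sequence number (LSN). As a simplified illustration, and not Kasho's actual implementation, the sketch below converts an LSN from its text form into an integer and classifies a change as either already covered by the snapshot or needing to come from the accumulated stream. The helper names and the example LSN values are made up for illustration.

```shell
#!/usr/bin/env bash
# Convert a PostgreSQL LSN such as "16/B374D848" into an integer.
# The text form is two hexadecimal numbers: the high 32 bits, a slash, the low 32 bits.
# Note: bash arithmetic is signed 64-bit, so LSNs from 80000000/0 upward would wrap.
lsn_to_int() {
  local hi=${1%%/*} lo=${1##*/}
  echo $(( (16#$hi << 32) + 16#$lo ))
}

# Succeed when the change LSN ($1) is strictly past the snapshot LSN ($2).
lsn_after() {
  [ "$(lsn_to_int "$1")" -gt "$(lsn_to_int "$2")" ]
}

# Example values, invented for this sketch.
snapshot_lsn="0/16B3748"
for change_lsn in "0/16B3000" "0/16B3800" "1/0"; do
  if lsn_after "$change_lsn" "$snapshot_lsn"; then
    echo "$change_lsn: after the snapshot point, must be accumulated"
  else
    echo "$change_lsn: already covered by the snapshot"
  fi
done
```

Because the dump and the accumulated changes are split at the same LSN, every change lands on exactly one side of that boundary.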
Running Bootstrap
Prerequisites
Before starting bootstrap:
- Ensure pg-change-stream is running and in WAITING state
- Have sufficient disk space for the database dump
- Ensure network connectivity between all components
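The disk space prerequisite can be checked before starting rather than discovered mid-dump. The following is a minimal sketch using the POSIX output format of `df`. The helper name `has_free_kb` and the 50 GiB threshold are assumptions for illustration, so size the threshold from your own database.

```shell
#!/usr/bin/env bash
# Succeed when the filesystem holding directory $1 has at least $2 KiB free.
has_free_kb() {
  local dir=$1 required_kb=$2 available_kb
  # df -P prints a stable one-line-per-filesystem layout; with -k, column 4
  # of the second line is the available space in 1024-byte blocks.
  available_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
  [ "$available_kb" -ge "$required_kb" ]
}

# Example: require 50 GiB free in the current directory (an arbitrary figure).
required_kb=$((50 * 1024 * 1024))
if has_free_kb . "$required_kb"; then
  echo "enough space for the dump"
else
  echo "not enough free space in $(pwd)" >&2
fi
```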
Option 1: Automated Bootstrap Script (Recommended)
The easiest way to bootstrap is to use the provided script. First, confirm that pg-change-stream is in the WAITING state:
# Check `pg-change-stream` status
grpcurl -plaintext pg-change-stream:50051 kasho.ChangeStreamService/GetStatus
# Should show state: WAITING
Run the bootstrap script inside the pg-change-stream container:
# Interactive mode - prompts before transitioning to streaming
docker exec -it <pg-change-stream-container> ./bootstrap-kasho.sh
# Automatic mode - transitions without prompting
docker exec -it <pg-change-stream-container> \
env WAIT_FOR_BOOTSTRAP=true ./bootstrap-kasho.sh
Finding Your Container Name
Use docker ps to find your pg-change-stream container name. It is typically something like kasho-pg-change-stream-1, depending on your deployment method.
The script will:
- Create a temporary replication slot for a consistent snapshot
- Signal pg-change-stream to start accumulating changes
- Take a database dump using the snapshot
- Convert the dump to change events
- Clean up temporary resources
- Transition to streaming mode
Option 2: Manual Bootstrap Process
For more control, you can run the bootstrap steps manually:
Step 1: Create Temporary Snapshot
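-- Connect to the source database as the kasho user over a replication
-- connection (for example, psql "dbname=<db> replication=database").
-- pg_create_logical_replication_slot() returns only slot_name and lsn, so use
-- the replication command below to export a snapshot as well.
CREATE_REPLICATION_SLOT kasho_temp_slot TEMPORARY LOGICAL pgoutput EXPORT_SNAPSHOT;
-- Note the consistent_point (the LSN) and snapshot_name values.
-- Keep this session open and idle until the dump in Step 3 finishes;
-- the exported snapshot is invalidated by the next command or by disconnecting.
Step 2: Start Accumulation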
# Tell `pg-change-stream` to create permanent slot and start accumulating
grpcurl -plaintext \
-d '{"start_lsn": "<lsn-from-step-1>", "snapshot_name": "<snapshot-from-step-1>"}' \
pg-change-stream:50051 \
kasho.ChangeStreamService/StartBootstrap
Step 3: Take Database Dump
# Use the snapshot for consistency
pg_dump \
--snapshot=<snapshot-from-step-1> \
--no-owner \
--no-privileges \
--data-only \
"$PRIMARY_DATABASE_URL" > dump.sql
Step 4: Process Dump
# Convert dump to change events
docker run --rm \
-v "$(pwd)":/data \
--network your-network \
kashoio/kasho:latest \
pg-bootstrap-sync \
--dump-file=/data/dump.sql \
--redis-url=redis://redis:6379
Step 5: Clean Up and Transition
-- Drop the temporary slot
SELECT pg_drop_replication_slot('kasho_temp_slot');
# Transition to streaming mode
grpcurl -plaintext \
pg-change-stream:50051 \
kasho.ChangeStreamService/CompleteBootstrap
Monitoring Progress
During bootstrap, monitor the progress:
# Check current state
grpcurl -plaintext pg-change-stream:50051 kasho.ChangeStreamService/GetStatus
# Watch logs
docker logs -f <pg-change-stream-container>
# Monitor accumulated changes
# The AccumulatedChanges count shows buffered changes during bootstrap
Large Database Considerations
For databases larger than 100GB:
- Use compression: pg_dump ... | gzip > dump.sql.gz
- Consider a parallel dump: pg_dump --format=directory --jobs=4 ... (pg_dump supports --jobs only with the directory output format)
- Monitor disk space during the dump process
- Run during low-traffic periods to minimize accumulated changes
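When compressing, a plain pipe into gzip can hide a failed dump, because the pipeline reports only gzip's exit status. The sketch below is one way to guard against that: it enables `pipefail` and tests the archive after writing it. The helper name `compress_and_verify` is hypothetical. Whether pg-bootstrap-sync can read a compressed file is not stated in this guide, so assume the dump must be decompressed before Step 4.

```shell
#!/usr/bin/env bash
# Make a pipeline fail if any stage fails, so a broken dump is not masked by gzip succeeding.
set -o pipefail

# Compress stdin into file $1, then test the archive's integrity.
compress_and_verify() {
  local out=$1
  gzip -c > "$out" && gzip -t "$out"
}

# Usage with pg_dump (placeholders as in the manual steps):
# pg_dump --snapshot=<snapshot-from-step-1> --data-only "$PRIMARY_DATABASE_URL" \
#   | compress_and_verify dump.sql.gz
```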
Troubleshooting
“replication slot already exists”
This usually means a previous bootstrap wasn’t cleaned up properly:
-- Check existing slots
SELECT * FROM pg_replication_slots;
-- If kasho_slot exists and you want to start over
SELECT pg_drop_replication_slot('kasho_slot');
Then restart pg-change-stream to return to WAITING state.
Bootstrap seems stuck
Check for:
- Network connectivity issues
- Disk space for dump file
- Long-running transactions blocking the snapshot
High memory usage during bootstrap
pg-bootstrap-sync processes the dump in batches. For very large tables, you may need to increase container memory limits.
After Bootstrap
Once bootstrap completes and pg-change-stream is in STREAMING state:
- Verify data integrity - Compare row counts between source and target
- Monitor replication lag - Check that changes are flowing normally
- Remove dump files - Clean up temporary dump files to free disk space
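The row count comparison can be scripted once the counts are collected into two text files. The following is a minimal sketch, assuming each file holds one `<table> <row_count>` pair per line. The helper name `compare_counts`, the file names, the `users` table, and the `TARGET_DATABASE_URL` variable are all hypothetical.

```shell
#!/usr/bin/env bash
# Compare two files of "<table> <row_count>" lines ($1 = source, $2 = target).
# Prints every table whose count differs or that is missing from the target,
# in no particular order, and returns non-zero if anything was printed.
# Assumes the source file is not empty.
compare_counts() {
  awk '
    NR == FNR { src[$1] = $2; next }   # first file: remember source counts
    { tgt[$1] = $2 }                   # second file: remember target counts
    END {
      bad = 0
      for (t in src) {
        if (!(t in tgt)) {
          print t ": missing in target"; bad = 1
        } else if (src[t] != tgt[t]) {
          print t ": source=" src[t] " target=" tgt[t]; bad = 1
        }
      }
      exit bad
    }
  ' "$1" "$2"
}

# One possible way to produce the input files (table name is an example):
# psql "$PRIMARY_DATABASE_URL" -At -F ' ' -c "SELECT 'users', count(*) FROM users" > source_counts.txt
# psql "$TARGET_DATABASE_URL"  -At -F ' ' -c "SELECT 'users', count(*) FROM users" > target_counts.txt
# compare_counts source_counts.txt target_counts.txt
```

Counts taken while changes are still flowing can differ briefly, so a mismatch is a prompt to recheck rather than proof of data loss.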
Next Steps
- Learn about Transform Configuration
- Return to the Quick Start guide
- Review Configuration Options