Basecut can create snapshots in two execution modes: local execution and self-hosted agents. Both produce identical snapshots; the difference is where the processing happens. For production workloads, self-hosted agents are recommended. They provide better security (your credentials stay in your environment), better performance (the agent can run in the same region as your database), and predictable costs.

Choose Your Execution Mode

Use this decision tree to select the right mode for your use case:

Quick Decision Guide

1. Do you have a self-hosted agent with network access to your source database?
  • No → Use Local Execution (default)
  • Yes → Continue
2. Are you working in production or with production-sized data?
  • Yes → Use Self-Hosted Agents (recommended for production)
  • No → Continue
3. Do you have self-hosted agents deployed?
  • No → Use Local Execution (default, no setup required)
  • Yes → Use Self-Hosted Agents (better performance and security)
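As a rough illustration, the three questions above can be written as a small shell function, for example to choose a mode inside a script. The function name `choose_mode` and its yes/no arguments are hypothetical and not part of the Basecut CLI; the logic follows the questions literally.

```shell
# Hypothetical helper, not part of the Basecut CLI.
# Each argument is "yes" or "no":
#   $1  a self-hosted agent has network access to the source database
#   $2  production or production-sized data
#   $3  self-hosted agents are deployed
# Prints "agent" or "local".
choose_mode() {
  # Question 1: without an agent that can reach the database, run locally.
  if [ "$1" != "yes" ]; then
    echo "local"
    return 0
  fi
  # Question 2: production workloads go to self-hosted agents.
  if [ "$2" = "yes" ]; then
    echo "agent"
    return 0
  fi
  # Question 3: otherwise use agents only if they are already deployed.
  if [ "$3" = "yes" ]; then
    echo "agent"
  else
    echo "local"
  fi
}
```

For example, `choose_mode yes no yes` prints `agent`, which a script could translate into passing `--async`.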

Quick Reference

| Scenario | Recommended Mode | Reason |
| --- | --- | --- |
| Private DB + agent in VPC | Self-Hosted Agent | Agent can reach private network safely |
| Private DB + no agent path | Local | No remote worker can reach the database |
| Quick testing / development | Local | Faster setup, no infrastructure needed |
| Production workloads | Self-Hosted Agent | Better security, performance, and reliability |
| Large datasets (> 1M rows) | Self-Hosted Agent | Doesn’t consume local resources |
| CI/CD pipelines | Self-Hosted Agent | Consistent execution environment |
| Small datasets (< 100k rows) | Local | Quick and simple |
| Team collaboration | Self-Hosted Agent | Centralized execution and job history |

Default behavior: Local execution (no --async flag)

Self-Hosted Agents

Deploy agents in your own infrastructure to process extractions. The CLI submits jobs to the Basecut API, self-hosted agents pick up the work, and results are stored in your configured S3 or GCS bucket.

When to Use

  • Production databases - Agent runs in cloud environment with stable network and resources
  • Large extractions - Process millions of rows without consuming local CPU/memory
  • Team workflows - Centralized execution with job history and shared snapshots
  • CI/CD pipelines - No local database access needed—agent connects directly to your DB

How It Works

basecut snapshot create \
  --config basecut.yml \
  --name "prod-snapshot" \
  --async

  1. CLI submits job to Basecut API with your configuration
  2. Self-hosted agent picks up the job from the queue
  3. Agent connects to your database (requires network access)
  4. Extraction runs on agent infrastructure
  5. Snapshot uploaded to configured cloud storage (S3/GCS)
  6. CLI polls for job status and reports completion

Storage: Always uses cloud storage (S3 or GCS)

Requirements:
  • Agent needs network access to your database
  • Database credentials passed securely via config
  • Cloud storage bucket configured in basecut.yml

Configuration

# basecut.yml
output:
  provider: s3
  bucket: my-basecut-snapshots
  region: us-east-1
  # prefix: snapshots/  # optional path prefix
# Agent execution is enabled with `--async`

Note: --async is ignored when output.provider is set to local, because agents can’t write to your local filesystem. In that case, Basecut runs locally and prints a warning.
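A script that relies on agent execution may want to detect this fallback before submitting a job. The sketch below is one possible pre-flight check; `output_provider` and `async_is_effective` are hypothetical helper names, and the `sed` pattern assumes the flat `provider:` layout shown in the configuration examples rather than parsing YAML in general.

```shell
# Hypothetical pre-flight helpers, not part of the Basecut CLI.

# output_provider FILE: print the first `provider:` value found in FILE.
# Assumes a simple "provider: value" line; this is not a full YAML parser.
output_provider() {
  sed -n 's/^[[:space:]]*provider:[[:space:]]*\([A-Za-z0-9_-]*\).*/\1/p' "$1" | head -n 1
}

# async_is_effective FILE: fail (status 1) only when the provider is "local",
# the case in which --async is ignored and Basecut runs locally instead.
async_is_effective() {
  [ "$(output_provider "$1")" != "local" ]
}

# Example use in a script:
#   async_is_effective basecut.yml || echo "--async will be ignored" >&2
```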

Local Execution

The CLI runs extraction directly on your machine. Useful for development and databases without public access.

When to Use

  • No agent deployment yet - Start immediately with no infrastructure
  • DB reachable from your machine - Local dev DB, tunnel, or VPN from laptop
  • Development/testing - Quick iteration without job queue latency
  • Small datasets - Extraction completes in seconds on your laptop
  • Air-gapped environments - No internet access required (with local storage)

How It Works

basecut snapshot create \
  --config basecut.yml \
  --name "dev-snapshot" \
  --source "postgresql://localhost:5432/myapp"

  1. CLI reads configuration
  2. Connects directly to your database
  3. Runs extraction algorithm locally
  4. Stores snapshot in local filesystem or uploads to cloud storage
  5. Reports completion immediately

Storage: Can use local filesystem or cloud storage (your choice)

Requirements:
  • Database accessible from your machine
  • Sufficient local CPU/memory for extraction
  • Disk space for snapshot files (if using local storage)
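The disk-space requirement can be checked before starting a local extraction. This is a minimal sketch using the POSIX `df` utility; `has_free_space` is a hypothetical helper, and the threshold is an arbitrary value you would choose based on your expected snapshot size.

```shell
# Hypothetical helper, not part of the Basecut CLI.
# has_free_space DIR MIN_KB: succeed when the filesystem holding DIR has at
# least MIN_KB kilobytes available.
# `df -Pk` prints one header line and one data line; column 4 is available KB.
has_free_space() {
  avail_kb=$(df -Pk "$1" | awk 'NR==2 {print $4}')
  [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$2" ]
}

# Example use in a script (1048576 KB = 1 GiB):
#   has_free_space /absolute/path/to/snapshots 1048576 || echo "low disk space" >&2
```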

Configuration

For local filesystem storage:
# basecut.yml
output:
  provider: local
  path: /absolute/path/to/snapshots
  format: directory

For cloud storage (CLI uploads after extraction):

# basecut.yml
output:
  provider: s3 # or gcs
  bucket: my-basecut-snapshots
  region: us-east-1

Snapshots stored on the local filesystem are tied to the machine that created them. Restoring on a different machine requires access to the same file path.

Comparison

| Feature | Self-Hosted Agent | Local Execution |
| --- | --- | --- |
| Database Access | Agent connects from your infrastructure | CLI connects directly |
| Processing Power | Your infrastructure (EC2, GKE, etc.) | Your local machine |
| Storage | S3 or GCS (required) | Local filesystem or cloud |
| Authentication | basecut login or BASECUT_API_KEY required | basecut login or BASECUT_API_KEY required |
| Job Queue | Yes (track history, retry failures) | No (immediate execution) |
| Best For | Production, large datasets, teams | Development, quick testing, small extractions |
| Cost | Your infrastructure costs (EC2, storage) | Free (your local resources) |
| Security | Credentials stay in your environment | Direct database access |

Switching Between Modes

The same configuration file works for both modes. The --async flag selects the mode:

# Agent execution
basecut snapshot create --config basecut.yml --async

# Local execution
basecut snapshot create --config basecut.yml

--async queues agent execution; omitting it runs locally.
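Because the only difference between the two invocations is the flag, a script can build the command from a mode value. The wrapper below is a sketch under that assumption: `snapshot_cmd` is a hypothetical name, it only prints the command so it can be reviewed or logged, and it assumes a snapshot name without spaces.

```shell
# Hypothetical wrapper, not part of the Basecut CLI.
# snapshot_cmd MODE NAME: print the basecut command for the given mode.
#   MODE  "agent" adds --async; any other value means local execution
#   NAME  snapshot name (assumed to contain no spaces)
snapshot_cmd() {
  cmd="basecut snapshot create --config basecut.yml --name $2"
  if [ "$1" = "agent" ]; then
    cmd="$cmd --async"
  fi
  printf '%s\n' "$cmd"
}

# Example use in a script:
#   snapshot_cmd agent prod-snapshot
```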
