Choose Your Execution Mode
Use this decision tree to select the right mode for your use case:
Quick Decision Guide
1. Do you have a self-hosted agent with network access to your source database?
- No → Use Local Execution (default)
- Yes → Continue
- Yes → Use Self-Hosted Agents (recommended for production)
- No → Continue
- No → Use Local Execution (default, no setup required)
- Yes → Use Self-Hosted Agents (better performance and security)
Quick Reference
| Scenario | Recommended Mode | Reason |
|---|---|---|
| Private DB + agent in VPC | Self-Hosted Agent | Agent can reach private network safely |
| Private DB + no agent path | Local | No remote worker can reach the database |
| Quick testing / development | Local | Faster setup, no infrastructure needed |
| Production workloads | Self-Hosted Agent | Better security, performance, and reliability |
| Large datasets (> 1M rows) | Self-Hosted Agent | Doesn’t consume local resources |
| CI/CD pipelines | Self-Hosted Agent | Consistent execution environment |
| Small datasets (< 100k rows) | Local | Quick and simple |
| Team collaboration | Self-Hosted Agent | Centralized execution and job history |
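The quick-reference table can be read as a small rule set. The sketch below is one possible encoding, not part of Basecut: the `Workload` fields and the `recommend_mode` function are hypothetical names, and the order in which the rules are checked is an assumption, since the table does not rank its rows.

```python
from dataclasses import dataclass

LARGE_DATASET_ROWS = 1_000_000  # "Large datasets (> 1M rows)" row in the table


@dataclass
class Workload:
    """Hypothetical description of an extraction, for illustration only."""
    agent_can_reach_db: bool       # a self-hosted agent has network access to the source DB
    production: bool = False
    ci_pipeline: bool = False
    shared_with_team: bool = False
    approx_rows: int = 0


def recommend_mode(w: Workload) -> str:
    """Return "agent" or "local" following the quick-reference table."""
    # Without an agent that can reach the database, local execution is the only option.
    if not w.agent_can_reach_db:
        return "local"
    # Production, CI/CD, and team scenarios all point to self-hosted agents.
    if w.production or w.ci_pipeline or w.shared_with_team:
        return "agent"
    # Large extractions should not consume local resources.
    if w.approx_rows > LARGE_DATASET_ROWS:
        return "agent"
    # The table gives no rule between 100k and 1M rows; this sketch defaults to local.
    return "local"
```

For example, `recommend_mode(Workload(agent_can_reach_db=True, approx_rows=5_000_000))` yields `"agent"`, while the same workload without a reachable agent falls back to `"local"`.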
Agent Execution (Recommended for Production)
Deploy agents in your infrastructure to process extractions. Your CLI submits jobs to the Basecut API, self-hosted agents pick up the work, and results are stored in your configured S3/GCS bucket.
When to Use
- Production databases - Agent runs in cloud environment with stable network and resources
- Large extractions - Process millions of rows without consuming local CPU/memory
- Team workflows - Centralized execution with job history and shared snapshots
- CI/CD pipelines - No local database access needed—agent connects directly to your DB
How It Works
- CLI submits job to Basecut API with your configuration
- Self-hosted agent picks up the job from the queue
- Agent connects to your database (requires network access)
- Extraction runs on agent infrastructure
- Snapshot uploaded to configured cloud storage (S3/GCS)
- CLI polls for job status and reports completion
Requirements
- Agent needs network access to your database
- Database credentials passed securely via config
- Cloud storage bucket configured in basecut.yml
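The submit, pick-up, and poll flow under How It Works can be pictured with a tiny in-memory simulation. Everything here is a stand-in: `FakeJobQueue`, its methods, and the status names `queued`, `running`, and `succeeded` are hypothetical and do not reflect the real Basecut API.

```python
import itertools
from collections import deque


class FakeJobQueue:
    """In-memory stand-in for a hosted job queue (hypothetical, for illustration)."""

    def __init__(self):
        self._pending = deque()
        self._ids = itertools.count(1)
        self.status = {}

    def submit(self, config: dict) -> int:
        """CLI side: enqueue a job with its configuration."""
        job_id = next(self._ids)
        self.status[job_id] = "queued"
        self._pending.append((job_id, config))
        return job_id

    def claim(self):
        """Agent side: take the oldest queued job, or None if the queue is empty."""
        if not self._pending:
            return None
        job_id, config = self._pending.popleft()
        self.status[job_id] = "running"
        return job_id, config

    def complete(self, job_id: int) -> None:
        """Agent side: mark the job done after extraction and upload."""
        self.status[job_id] = "succeeded"


def run_demo() -> list:
    """Walk one job through the lifecycle and record what a polling CLI would see."""
    queue = FakeJobQueue()
    job_id = queue.submit({"name": "example"})   # CLI submits the job
    seen = [queue.status[job_id]]                # first poll: still queued
    claimed_id, _config = queue.claim()          # agent picks up the job
    seen.append(queue.status[job_id])            # second poll: running
    queue.complete(claimed_id)                   # agent extracts and uploads the snapshot
    seen.append(queue.status[job_id])            # final poll: done
    return seen
```

The point of the sketch is the division of labor: the CLI only submits and polls, while the agent is the only component that touches the database and the storage bucket.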
Configuration
--async is ignored when output.provider: local (agents can’t write to your local filesystem). In that case, Basecut runs locally and prints a warning.
Local Execution
The CLI runs extraction directly on your machine. Useful for development and databases without public access.
When to Use
- No agent deployment yet - Start immediately with no infrastructure
- DB reachable from your machine - Local dev DB, tunnel, or VPN from laptop
- Development/testing - Quick iteration without job queue latency
- Small datasets - Extraction completes in seconds on your laptop
- Air-gapped environments - No internet access required (with local storage)
How It Works
- CLI reads configuration
- Connects directly to your database
- Runs extraction algorithm locally
- Stores snapshot in local filesystem or uploads to cloud storage
- Reports completion immediately
Requirements
- Database accessible from your machine
- Sufficient local CPU/memory for extraction
- Disk space for snapshot files (if using local storage)
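The disk-space requirement is easy to check before starting a local run. This is a minimal sketch using only the Python standard library; the function name `has_room_for_snapshot` and the idea of estimating the snapshot size up front are assumptions, not a Basecut feature.

```python
import shutil


def has_room_for_snapshot(directory: str, estimated_bytes: int) -> bool:
    """Return True if the filesystem holding `directory` has enough free space."""
    free_bytes = shutil.disk_usage(directory).free
    return free_bytes >= estimated_bytes
```

A call such as `has_room_for_snapshot(".", 2 * 1024**3)` checks for roughly 2 GiB of free space in the current directory.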
Configuration
For local filesystem storage, set output.provider: local in basecut.yml.
Comparison
| Feature | Self-Hosted Agent | Local Execution |
|---|---|---|
| Database Access | Agent connects from your infrastructure | CLI connects directly |
| Processing Power | Your infrastructure (EC2, GKE, etc.) | Your local machine |
| Storage | S3 or GCS (required) | Local filesystem or cloud |
| Authentication | basecut login or BASECUT_API_KEY required | basecut login or BASECUT_API_KEY required |
| Job Queue | Yes (track history, retry failures) | No (immediate execution) |
| Best For | Production, large datasets, teams | Development, quick testing, small extractions |
| Cost | Your infrastructure costs (EC2, storage) | Free (your local resources) |
| Security | Credentials stay in your environment | Direct database access |
Switching Between Modes
The same configuration file works for both modes. Control the mode via CLI flags: --async queues agent execution; omitting it runs locally.
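Combined with the rule that --async is ignored when output.provider is local, mode selection reduces to two checks. The sketch below models that logic only; `resolve_execution_mode` is a hypothetical name, the warning text is invented, and the provider string `"s3"` used in the usage note is an assumed value for cloud storage.

```python
import warnings


def resolve_execution_mode(async_flag: bool, output_provider: str) -> str:
    """Return "agent" or "local" for a run, mirroring the documented rules."""
    # Without --async, extraction always runs locally.
    if not async_flag:
        return "local"
    # Agents cannot write to the local filesystem, so --async is ignored here.
    if output_provider == "local":
        warnings.warn(
            "--async ignored because output.provider is local; running locally",
            stacklevel=2,
        )
        return "local"
    return "agent"
```

With these rules, `resolve_execution_mode(True, "s3")` selects agent execution, while `resolve_execution_mode(True, "local")` warns and falls back to a local run.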
Next Steps
Storage Providers
Configure S3, GCS, or local filesystem
Self-Hosted Agents
Deploy agents in your infrastructure (recommended for production)
CI/CD Integration
Use self-hosted agents in GitHub Actions
Local Development
Local execution workflow examples