Self-hosted agents are the recommended approach for production use. Your agents run in your infrastructure with your database credentials—Basecut never touches your data or credentials. Agents continuously poll for snapshot jobs, execute extractions, and upload results to your configured storage (S3/GCS). All processing happens in your environment.

Why Self-Hosted?

Security & Compliance:
  • Your database credentials never leave your environment
  • Data processing happens entirely in your VPC/cloud
  • Meet strict compliance requirements (SOC2, HIPAA, GDPR)
  • Zero-trust architecture
Performance & Control:
  • Choose your own compute resources (CPU, memory)
  • Deploy in the same region as your database
  • Scale agents based on your workload
  • No network latency to external services
Cost Efficiency:
  • Pay only for your infrastructure (EC2, GCS, etc.)
  • No per-job fees or processing charges
  • Predictable costs at scale

When to Use Self-Hosted Agents

  • Production workloads - Recommended default for all production snapshots
  • Private databases - Behind VPN/VPC without public access
  • Compliance requirements - Data cannot leave your infrastructure
  • High volume - Processing dozens or hundreds of snapshots

Railway (Fastest)

Deploy a Basecut agent to Railway in one click using the Deploy on Railway button. After deployment, set at least:
  • BASECUT_API_KEY
  • BASECUT_DATABASE_URL
Use this option when you want managed agent hosting without setting up Docker, Kubernetes, or ECS manually.

Docker Deployment

Basic Docker Run

Run a single agent container:
docker run -d \
  --name basecut-agent \
  --restart unless-stopped \
  -e BASECUT_API_KEY=your_org_api_key \
  -e BASECUT_DATABASE_URL=postgres://readonly@prod-db:5432/myapp \
  -v ~/.aws:/root/.aws:ro \
  ghcr.io/basecuthq/basecut-agent:latest
Environment variables:
Variable | Required | Description
--- | --- | ---
BASECUT_API_KEY | Yes | Organization-scoped API key (bc_live_* or bc_test_*)
BASECUT_DATABASE_URL | Yes | Database connection string for the agent
AWS_ACCESS_KEY_ID | No | AWS credentials (or mount ~/.aws)
AWS_SECRET_ACCESS_KEY | No | AWS credentials
AWS_REGION | No | AWS region fallback when output.region is not set in basecut.yml
GOOGLE_APPLICATION_CREDENTIALS | No | Path to GCS service account key
Agent settings like poll interval and agent ID are configured via CLI flags: --poll-interval, --heartbeat-interval, --agent-id, --run-once.
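For example, a one-shot run with an explicit poll interval and agent ID might look like the following sketch. The flag value formats (e.g. 30s) and the agent ID are assumptions; verify against the agent's --help output:
```shell
# One-shot agent: poll once, execute any pending job, then exit
# (flag value formats are assumptions; confirm with the agent's --help)
docker run --rm \
  -e BASECUT_API_KEY=your_org_api_key \
  -e BASECUT_DATABASE_URL=postgres://readonly@prod-db:5432/myapp \
  ghcr.io/basecuthq/basecut-agent:latest \
  --poll-interval 30s \
  --agent-id snapshot-agent-1 \
  --run-once
```
--run-once is useful for smoke-testing credentials and network access before running a long-lived agent.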

Docker Compose

For production deployments with multiple agents, resource limits, and log rotation:
# docker-compose.yml
version: '3.8'

services:
  agent:
    image: ghcr.io/basecuthq/basecut-agent:latest
    restart: unless-stopped
    environment:
      BASECUT_API_KEY: ${BASECUT_API_KEY}
      BASECUT_DATABASE_URL: ${BASECUT_DATABASE_URL}
      GOOGLE_APPLICATION_CREDENTIALS: /etc/basecut/gcs-key.json
    volumes:
      # AWS credentials
      - ~/.aws:/root/.aws:ro
      # GCS credentials (if using GCS)
      - ./gcs-key.json:/etc/basecut/gcs-key.json:ro
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 4G
        reservations:
          cpus: '1.0'
          memory: 2G
    logging:
      driver: json-file
      options:
        max-size: '10m'
        max-file: '3'

  # Optional: Run multiple agents for parallelism
  agent-worker-2:
    extends:
      service: agent
    container_name: basecut-agent-2

  agent-worker-3:
    extends:
      service: agent
    container_name: basecut-agent-3
Start the agent pool:
# Create .env file
cat > .env <<EOF
BASECUT_API_KEY=bc_live_abc123_yoursecret
BASECUT_DATABASE_URL=postgres://readonly@prod-db.internal:5432/myapp
EOF

# Start agents
docker compose up -d

# Scale to 5 agents
docker compose up -d --scale agent=5

Kubernetes Deployment

Deploy agents in a Kubernetes cluster with autoscaling:
# basecut-agent.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: basecut-agent
  namespace: basecut
spec:
  replicas: 3
  selector:
    matchLabels:
      app: basecut-agent
  template:
    metadata:
      labels:
        app: basecut-agent
    spec:
      serviceAccountName: basecut-agent # For Workload Identity or IRSA
      containers:
        - name: agent
          image: ghcr.io/basecuthq/basecut-agent:latest
          env:
            - name: BASECUT_API_KEY
              valueFrom:
                secretKeyRef:
                  name: basecut-credentials
                  key: api-key
            - name: BASECUT_DATABASE_URL
              value: 'postgres://readonly@prod-db.internal:5432/myapp'
            # GKE Workload Identity (GCS access)
            - name: GOOGLE_APPLICATION_CREDENTIALS
              value: /var/secrets/google/key.json
          resources:
            requests:
              cpu: 1000m
              memory: 2Gi
            limits:
              cpu: 2000m
              memory: 4Gi
          volumeMounts:
            - name: gcp-credentials
              mountPath: /var/secrets/google
              readOnly: true
      volumes:
        - name: gcp-credentials
          secret:
            secretName: basecut-gcp-key

---
apiVersion: v1
kind: Secret
metadata:
  name: basecut-credentials
  namespace: basecut
type: Opaque
stringData:
  api-key: bc_live_abc123_yoursecret

---
# Optional: Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: basecut-agent-hpa
  namespace: basecut
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: basecut-agent
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
Deploy:
kubectl create namespace basecut
kubectl apply -f basecut-agent.yaml
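After applying, confirm the replicas are up and polling:
```shell
# Check that all agent pods are Running
kubectl get pods -n basecut -l app=basecut-agent

# Tail recent logs from the Deployment
kubectl logs -n basecut deploy/basecut-agent --tail=50
```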

AWS ECS Deployment

Run agents on ECS Fargate with IAM roles for S3 access:
{
  "family": "basecut-agent",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "1024",
  "memory": "2048",
  "executionRoleArn": "arn:aws:iam::ACCOUNT:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::ACCOUNT:role/BasecutAgentRole",
  "containerDefinitions": [
    {
      "name": "basecut-agent",
      "image": "ghcr.io/basecuthq/basecut-agent:latest",
      "essential": true,
      "environment": [
        {
          "name": "BASECUT_DATABASE_URL",
          "value": "postgres://readonly@prod-db.internal:5432/myapp"
        }
      ],
      "secrets": [
        {
          "name": "BASECUT_API_KEY",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:ACCOUNT:secret:basecut/api-key"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/basecut-agent",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "agent"
        }
      }
    }
  ]
}
IAM policy for TaskRole (S3 access):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::my-basecut-snapshots/*",
        "arn:aws:s3:::my-basecut-snapshots"
      ]
    }
  ]
}
Deploy via CLI:
# Register task definition
aws ecs register-task-definition --cli-input-json file://basecut-agent-task.json

# Create service
aws ecs create-service \
  --cluster production \
  --service-name basecut-agent \
  --task-definition basecut-agent \
  --desired-count 3 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-abc123],securityGroups=[sg-xyz789],assignPublicIp=ENABLED}"

Agent Configuration

Network Access

Agents need outbound access to:
  1. Basecut API (api.basecut.dev:443)
    • Poll for jobs
    • Report job status
    • Upload job logs
  2. Your database (e.g., prod-db.internal:5432)
    • Execute extraction queries
    • Read schema metadata
  3. Cloud storage (S3/GCS)
    • Upload snapshot artifacts
Firewall rules:
  • Outbound HTTPS (443) to api.basecut.dev
  • Outbound PostgreSQL (5432) to your database
  • Outbound HTTPS (443) to s3.amazonaws.com or storage.googleapis.com
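A quick way to verify these rules from the agent host, using the same health endpoint the Troubleshooting section relies on and nc for raw TCP reachability:
```shell
# Basecut API reachability
curl -sf https://api.basecut.dev/health

# Database port (replace host/port with your own)
nc -zv prod-db.internal 5432

# Cloud storage API (S3 shown; use storage.googleapis.com:443 for GCS)
nc -zv s3.amazonaws.com 443
```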

Database Credentials

Best practice: use a dedicated read-only database user.
-- PostgreSQL: Create read-only user for agent
CREATE USER basecut_agent WITH PASSWORD 'secure_password';
GRANT CONNECT ON DATABASE myapp TO basecut_agent;
GRANT USAGE ON SCHEMA public TO basecut_agent;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO basecut_agent;

-- Ensure future tables are readable
ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT SELECT ON TABLES TO basecut_agent;
Pass credentials via:
  • Environment variable: BASECUT_DATABASE_URL=postgres://basecut_agent:password@host:5432/db
  • AWS Secrets Manager / Kubernetes Secrets
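With AWS Secrets Manager, for example, you can assemble the connection string at startup rather than hard-coding the password. This is a sketch; the secret name basecut/db-password is a placeholder:
```shell
# Fetch the password from Secrets Manager (secret name is a placeholder)
DB_PASSWORD=$(aws secretsmanager get-secret-value \
  --secret-id basecut/db-password \
  --query SecretString --output text)

# Build the agent's connection string without writing the password to disk
export BASECUT_DATABASE_URL="postgres://basecut_agent:${DB_PASSWORD}@prod-db.internal:5432/myapp"
```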

Monitoring and Logging

Monitor agents via container logs and your platform health checks.
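For example, depending on where the agent runs:
```shell
# Docker: follow a single agent's logs
docker logs -f basecut-agent

# Kubernetes: aggregate logs across all agent pods
kubectl logs -n basecut -l app=basecut-agent --tail=100

# ECS: tail the CloudWatch Logs group from the task definition above
aws logs tail /ecs/basecut-agent --follow
```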

Troubleshooting

Agent Not Picking Up Jobs

Check agent logs:
docker logs basecut-agent
Common issues:
  • Invalid BASECUT_API_KEY (should start with bc_live_ or bc_test_)
  • Network access blocked to api.basecut.dev
  • Agent registered to wrong organization
Verify connectivity:
docker exec basecut-agent curl https://api.basecut.dev/health

Database Connection Failures

Test database access from agent:
docker exec basecut-agent psql "$BASECUT_DATABASE_URL" -c "SELECT 1"
Common issues:
  • Firewall blocking database port
  • Database requires SSL (?sslmode=require)
  • Read-only user lacks schema permissions

S3/GCS Upload Failures

Verify cloud credentials:
# AWS
docker exec basecut-agent aws s3 ls s3://my-bucket

# GCS
docker exec basecut-agent gsutil ls gs://my-bucket
Common issues:
  • IAM role/service account lacks PutObject permission
  • Bucket doesn’t exist or is in wrong region
  • Missing or invalid S3 region on the agent (output.region or AWS_REGION)
  • Network egress blocked to cloud storage API
Tip for S3-compatible endpoints (Cloudflare R2, MinIO):
  • Use output.provider: s3 with output.endpoint
  • Set output.region explicitly (Cloudflare R2: region: auto)
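Putting those keys together, a minimal output block for Cloudflare R2 could look like the sketch below. The key names come from the points above; the bucket name and ACCOUNT_ID endpoint are placeholders:
```shell
# Write a minimal output config for an S3-compatible endpoint
# (bucket name and endpoint host are placeholders)
cat > basecut.yml <<'EOF'
output:
  provider: s3
  endpoint: https://ACCOUNT_ID.r2.cloudflarestorage.com
  region: auto
  bucket: my-basecut-snapshots
EOF
```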

Cost Optimization

Right-Size Agent Resources

Snapshot Size | CPU | Memory | Recommended
--- | --- | --- | ---
< 10k rows | 0.5 | 1GB | Development/small teams
10k-100k rows | 1.0 | 2GB | Most production workloads
100k-1M rows | 2.0 | 4GB | Large databases
> 1M rows | 4.0 | 8GB | Enterprise-scale

Autoscaling

Scale agents based on job queue depth:
  • Kubernetes HPA: Scale on CPU/memory utilization
  • ECS Service Autoscaling: Scale on CloudWatch metric JobQueueDepth (custom metric)
  • Target: 1 agent per 5-10 jobs in queue
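On ECS, the queue-depth approach can be wired up with Application Auto Scaling. A sketch, assuming you publish JobQueueDepth yourself; the policy JSON referencing that metric is elided:
```shell
# Allow the ECS service to scale between 2 and 10 agents
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/production/basecut-agent \
  --min-capacity 2 \
  --max-capacity 10

# Attach a target-tracking policy; queue-policy.json would reference
# your custom JobQueueDepth CloudWatch metric
aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/production/basecut-agent \
  --policy-name basecut-queue-depth \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration file://queue-policy.json
```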

Spot Instances

Agents are fault-tolerant—jobs resume if an agent dies:
  • ECS: Use Fargate Spot for 70% cost savings
  • Kubernetes: Use spot node pools
  • EC2: Use Spot Instances with interruption handling
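On Fargate, Spot is selected via a capacity provider strategy instead of --launch-type. This variant of the create-service call from the ECS section assumes your cluster has the FARGATE_SPOT capacity provider enabled:
```shell
# Same service as above, but placed on Fargate Spot
# (--capacity-provider-strategy replaces --launch-type)
aws ecs create-service \
  --cluster production \
  --service-name basecut-agent \
  --task-definition basecut-agent \
  --desired-count 3 \
  --capacity-provider-strategy capacityProvider=FARGATE_SPOT,weight=1 \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-abc123],securityGroups=[sg-xyz789],assignPublicIp=ENABLED}"
```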

Next Steps