The `basecut.yml` file defines what Basecut extracts, how traversal is bounded, how PII is anonymized, and where output is written.
Complete Example
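A minimal sketch of a full config, assembled only from the fields documented on this page. The table names (`users`, `audit_logs`, `countries`), the `where` filter, the snapshot name, and the output path are placeholder assumptions. The shape of `params` (shown as a map of name to value) is also assumed rather than confirmed.

```yaml
# Illustrative sketch; table names, filter, and paths are placeholders.
version: '1'
name: sample-snapshot          # fallback name when --name is not passed to the CLI

from:
  - table: users               # unqualified, so it refers to public.users
    where: "id = :id"          # named param; predicate syntax is an assumption
    params:
      id: 1                    # assumed map of param name to value

exclude:
  - audit_logs                 # hypothetical table excluded entirely

include_all:
  - countries                  # hypothetical lookup table extracted in full

traverse:
  parents: 10                  # documented default
  children: 2                  # documented default

limits:
  rows:
    per_table: 1000            # documented default
    total: 100000              # documented default

anonymize: auto                # string shorthand

output: ./snapshots            # local path shorthand (placeholder path)
```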
Top-Level Fields
| Field | Required | Type | Description |
|---|---|---|---|
| `version` | Yes | string | Config version. Must be `'1'`. |
| `name` | No | string | Snapshot name fallback used by the CLI if `--name` is not provided. |
| `from` | Yes | array | One or more root selectors to start extraction. |
| `exclude` | No | array | Tables to exclude entirely from extraction. |
| `include_all` | No | array | Lookup/reference tables extracted in full before traversal. |
| `traverse` | No | object | Traversal controls (`parents`, `children`, `stop_at`). |
| `limits` | No | object | Row-count budgets and overrides. |
| `sampling` | No | object | Row selection mode and random seed. |
| `anonymize` | No | string/object | `off`, `auto`, or an object with `mode` + `rules`. |
| `output` | Yes | string/object | Output destination (`./path`, `s3://...`, `gs://...`, or a provider object). |
from
`from` defines the starting rows.
- Use `table: users` for `public.users`.
- Use `table: schema.table` for non-public schemas.
- `where` supports named params (`:id`); values go in `params`.
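The rules above can be sketched as a `from` block with two root selectors. The schema and table `billing.invoices`, the filter expressions, and the parameter values are hypothetical. This sketch also assumes that `where` and `params` are set per selector and that `params` is a map keyed by parameter name.

```yaml
from:
  # Unqualified name: refers to public.users.
  - table: users
    where: "id = :id"
    params:
      id: 42                 # value bound to :id (assumed map form)

  # Schema-qualified name for a non-public schema (hypothetical table).
  - table: billing.invoices
    where: "customer_id = :customer_id"
    params:
      customer_id: 42
```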
traverse
| Field | Type | Default | Description |
|---|---|---|---|
| `parents` | int | 10 | Max parent FK depth (`-1` = unlimited). |
| `children` | int | 2 | Max child FK depth (`0` = disable downstream traversal). |
| `stop_at` | array | - | Junction/link tables where FK expansion should stop. |
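As a sketch of how these three fields combine, the fragment below follows parent foreign keys without a depth limit, follows child foreign keys one level down, and stops expansion at a junction table. The table name `user_roles` is a hypothetical placeholder.

```yaml
traverse:
  parents: -1        # -1 = unlimited parent FK depth
  children: 1        # one level of child FKs; 0 would disable downstream traversal
  stop_at:
    - user_roles     # hypothetical junction table where FK expansion stops
```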
limits.rows
| Field | Type | Default | Description |
|---|---|---|---|
| `per_table` | int | 1000 | Default row-count cap per table. |
| `total` | int | 100000 | Total row-count cap across the snapshot. |
| `tables` | map[string]int | - | Per-table row-count overrides (`0` means unlimited for that table; use `exclude` to skip extraction). |
With `per_table: 1000`, the cycle budget is 3000 total rows. Adjust `per_table` or add individual `tables` entries to control cycle sizes.
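A short sketch of `limits.rows` with per-table overrides, using the documented defaults for `per_table` and `total`. The table names `events` and `countries` are hypothetical.

```yaml
limits:
  rows:
    per_table: 1000      # default cap applied to every table
    total: 100000        # cap across the whole snapshot
    tables:
      events: 200        # hypothetical table capped below the default
      countries: 0       # 0 = unlimited for this table (not the same as excluding it)
```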
anonymize
Shorthand:
- `mode: auto` enables built-in PII auto-detection plus explicit `rules`.
- `mode: manual` applies only explicit `rules`.
- `mode: off` disables anonymization.
- Table-grouped rule keys can be unqualified (`users`) for `public.users`.
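The two forms of `anonymize` can be sketched as follows. The string shorthand comes directly from the field table (`off` or `auto`). The object form assumes that `rules` is a map keyed by table name. The column-level rule syntax is not described in this section, so it is left out rather than guessed.

```yaml
# String shorthand.
anonymize: auto

# Object form (an alternative to the line above, not an addition to it).
# Assumes rules are grouped under table-name keys.
# anonymize:
#   mode: manual
#   rules:
#     users:          # unqualified key, refers to public.users
#       # column-level rules go here; their schema is not covered in this section
```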
output
String shorthand:
- For `provider: s3`, `bucket` is required.
- For Cloudflare R2, use `provider: s3` + `endpoint` and set `region: auto`.
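The variants above can be sketched as three alternative `output` values, of which a config would use only one. The local path, the bucket name `my-snapshots`, the region `us-east-1`, and the `<account-id>` portion of the R2 endpoint are placeholders.

```yaml
# String shorthand: a local path (placeholder).
output: ./snapshots

# Object form for S3 (alternative). bucket is required.
# output:
#   provider: s3
#   bucket: my-snapshots
#   region: us-east-1

# Object form for Cloudflare R2 (alternative). Uses the s3 provider with a custom endpoint.
# output:
#   provider: s3
#   bucket: my-snapshots
#   endpoint: https://<account-id>.r2.cloudflarestorage.com
#   region: auto
```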