# NetBird GitOps - Pain Points Status

## Summary

| # | Pain Point | Status |
|---|------------|--------|
| 1 | Peer naming after enrollment | **SOLVED** - Watcher service |
| 2 | Per-user vs per-role setup keys | **SOLVED** - One-off keys per user |
| 3 | Secure key distribution | Documented workflow |

---

## Pain Point 1: Peer Naming After Enrollment - SOLVED

### Problem

When a peer enrolls using a setup key, it appears with its hostname (e.g., `DESKTOP-ABC123`), not a meaningful name.

### Solution

**Watcher service** automatically renames peers:

1. Setup key name = desired peer name (e.g., `pilot-ivanov`)
2. Operator enrolls -> peer appears as `DESKTOP-ABC123`
3. Watcher detects consumed key via API polling (every 30s)
4. Watcher finds peer created around key usage time
5. Watcher renames peer to match key name -> `pilot-ivanov`

**Implementation:** `watcher/netbird_watcher.py`

**Deployment:**

```bash
cd ansible/netbird-watcher
ansible-playbook -i poc-inventory.yml playbook.yml -e vault_netbird_token=
```

**How correlation works:**

- Watcher polls `GET /api/setup-keys` for keys with `used_times > 0`
- Gets `last_used` timestamp from the key
- Polls `GET /api/peers` for peers created within 60 seconds of that timestamp
- Renames matching peer via `PUT /api/peers/{id}`
- Marks key as processed to avoid re-processing

---

## Pain Point 2: Per-User vs Per-Role Setup Keys - SOLVED

### Problem

Reusable per-role keys (e.g., `pilot-onboarding`) don't provide:

- Audit trail (who enrolled which device?)
- Individual revocation
- Usage attribution

### Solution

**One-off keys per user/device:**

```hcl
resource "netbird_setup_key" "pilot_ivanov" {
  name        = "pilot-ivanov"
  type        = "one-off" # Single use
  auto_groups = [netbird_group.pilots.id]
  usage_limit = 1
  ephemeral   = false
}
```

**Benefits:**

- Key name = audit trail (linked to ticket/user)
- Key is consumed after a single use
- Individual keys can be revoked before use
- Watcher uses key name as peer name automatically

---

## Pain Point 3: Secure Key Distribution

### Current Workflow

1. CI/CD creates setup key
2. Engineer retrieves key locally: `terraform output -raw pilot_ivanov_key`
3. Engineer sends key to operator via secure channel (Signal, encrypted email)
4. Operator uses key within expiry window

### Considerations

- Keys are sensitive - anyone with the key can enroll a device
- One-off keys mitigate risk - single use, so a leaked key can't be reused once consumed
- Short expiry (7 days) limits exposure window

### Future Improvements (If Needed)

| Option | Description |
|--------|-------------|
| Ticket integration | CI posts key directly to ticket system |
| Secrets manager | Store in Vault/1Password, notify engineer |
| Self-service portal | Operator requests key, gets it directly |

For ~100 operators with a ticket-based workflow, manual retrieval is acceptable.

---

## Final Workflow

```
1. Ticket: "Onboard pilot Ivanov with BlastPilot"
2. Engineer adds to terraform/setup_keys.tf:
   - netbird_setup_key.pilot_ivanov (one-off, 7 days)
3. Engineer creates PR -> CI shows plan
4. PR merged -> CI applies -> key created
5. Engineer retrieves: terraform output -raw pilot_ivanov_key
6. Engineer sends key to operator via Signal/email
7. Operator installs NetBird, enrolls with key
8. Watcher auto-renames peer to "pilot-ivanov"
9. Ticket closed
```

**Engineer time:** ~2 minutes (Terraform edit + key retrieval + send)

**Automation:** Full - groups, policies, keys, peer naming all automated
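
For reference, the key-to-peer correlation described under Pain Point 1 can be sketched as a pure function over the two API responses. This is a minimal illustration, not the contents of `watcher/netbird_watcher.py`: the function name `plan_renames`, the `processed_key_ids` set, and the peer field `created_at` are assumptions made for the example, and timestamps are assumed to be ISO 8601 strings that `datetime.fromisoformat` can parse.

```python
from datetime import datetime, timedelta

# Correlation window from the "How correlation works" list (60 seconds).
WINDOW = timedelta(seconds=60)


def parse_ts(value):
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def plan_renames(setup_keys, peers, processed_key_ids):
    """Return (peer_id, new_name, key_id) tuples for used, unprocessed keys.

    setup_keys: dicts with 'id', 'name', 'used_times', 'last_used'.
    peers: dicts with 'id', 'name', 'created_at' (assumed field name).
    processed_key_ids: ids of keys handled in earlier polling rounds.
    """
    plan = []
    claimed = set()  # peers already matched to a key in this round
    for key in setup_keys:
        if key["used_times"] <= 0 or key["id"] in processed_key_ids:
            continue
        used_at = parse_ts(key["last_used"])
        candidates = [
            p for p in peers
            if p["id"] not in claimed
            and abs(parse_ts(p["created_at"]) - used_at) <= WINDOW
        ]
        if not candidates:
            continue  # peer not visible yet; try again on the next poll
        # Prefer the peer created closest to the key's last use.
        peer = min(
            candidates,
            key=lambda p: abs(parse_ts(p["created_at"]) - used_at),
        )
        claimed.add(peer["id"])
        plan.append((peer["id"], key["name"], key["id"]))
    return plan


# Hypothetical sample data shaped like the fields named above.
keys = [
    {"id": "k1", "name": "pilot-ivanov", "used_times": 1,
     "last_used": "2024-01-01T10:00:05Z"},
    {"id": "k2", "name": "pilot-petrov", "used_times": 0,
     "last_used": "2024-01-01T00:00:00Z"},
]
peers = [
    {"id": "p1", "name": "DESKTOP-ABC123",
     "created_at": "2024-01-01T10:00:00Z"},
    {"id": "p2", "name": "old-host",
     "created_at": "2023-12-01T08:00:00Z"},
]

print(plan_renames(keys, peers, set()))
# -> [('p1', 'pilot-ivanov', 'k1')]
```

In a polling loop, each returned tuple would map to one `PUT /api/peers/{id}` call, after which the key id is added to the processed set. Note that two enrollments landing within the same 60-second window are ambiguous under timestamp matching; picking the closest peer reduces mis-assignment but does not eliminate it.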