# NetBird GitOps - Pain Points Status

## Summary

| # | Pain Point | Status |
|---|------------|--------|
| 1 | Peer naming after enrollment | **SOLVED** - Watcher service |
| 2 | Per-user vs per-role setup keys | **SOLVED** - One-off keys per user |
| 3 | Secure key distribution | Documented workflow |

---

## Pain Point 1: Peer Naming After Enrollment - SOLVED

### Problem

When a peer enrolls using a setup key, it appears with its hostname (e.g., `DESKTOP-ABC123`), not a meaningful name.

### Solution

**Watcher service** automatically renames peers:

1. Setup key name = desired peer name (e.g., `pilot-ivanov`)
2. Operator enrolls -> peer appears as `DESKTOP-ABC123`
3. Watcher detects the consumed key via API polling (every 30s)
4. Watcher finds the peer created around the key's usage time
5. Watcher renames the peer to match the key name -> `pilot-ivanov`

**Implementation:** `watcher/netbird_watcher.py`

**Deployment:**
```bash
cd ansible/netbird-watcher
# Replace <TOKEN> with the NetBird API token. The quotes stop the shell from
# treating the angle brackets as redirections if the placeholder is left in.
ansible-playbook -i poc-inventory.yml playbook.yml -e "vault_netbird_token=<TOKEN>"
```

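
The numbered steps above can be pictured as a single polling cycle. The following is a minimal sketch under stated assumptions, not the contents of `watcher/netbird_watcher.py`: the names `poll_once` and `run_forever` are hypothetical, and the four callables (`list_keys`, `list_peers`, `rename_peer`, `match`) are stand-ins that the caller would wire to the NetBird API. Only the fields `used_times`, `id`, and `name` are read from each record.

```python
import time
from typing import Callable, Iterable, Optional


def poll_once(
    list_keys: Callable[[], Iterable[dict]],
    list_peers: Callable[[], list],
    rename_peer: Callable[[str, str], None],
    match: Callable[[dict, list], Optional[dict]],
    processed: set,
) -> list:
    """Run one watcher cycle and return the (peer_id, new_name) renames performed."""
    renamed = []
    peers = None
    for key in list_keys():
        # Skip keys that were never used or were handled in an earlier cycle.
        if key.get("used_times", 0) == 0 or key["id"] in processed:
            continue
        if peers is None:
            peers = list_peers()  # fetched lazily, at most once per cycle
        peer = match(key, peers)
        if peer is None:
            continue  # peer not visible yet; try again on the next cycle
        if peer["name"] != key["name"]:
            rename_peer(peer["id"], key["name"])
            renamed.append((peer["id"], key["name"]))
        processed.add(key["id"])
    return renamed


def run_forever(interval_seconds: float = 30.0, **deps) -> None:
    """Repeat poll_once at the 30-second polling interval mentioned above."""
    processed = set()  # in-memory only; a real service would likely persist this
    while True:
        poll_once(processed=processed, **deps)
        time.sleep(interval_seconds)
```
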
**How correlation works:**
- Watcher polls `GET /api/setup-keys` for keys with `used_times > 0`
- Gets `last_used` timestamp from the key
- Polls `GET /api/peers` for peers created within 60 seconds of that timestamp
- Renames matching peer via `PUT /api/peers/{id}`
- Marks key as processed to avoid re-processing

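
The timestamp matching in the bullets above might look roughly like the sketch below. It is illustrative only: the function names are hypothetical, the peer field `created_at` is an assumption about the API response (only `used_times` and `last_used` are named in this document), and timestamps are assumed to be RFC 3339 strings. When several peers fall inside the 60-second window, this sketch picks the one closest to `last_used`; the real watcher may resolve that case differently.

```python
from datetime import datetime, timedelta
from typing import Optional

MATCH_WINDOW = timedelta(seconds=60)  # window stated in the bullets above


def parse_ts(value: str) -> datetime:
    """Parse an RFC 3339 timestamp (format assumed, not confirmed by the source)."""
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def find_peer_for_key(
    key: dict, peers: list, window: timedelta = MATCH_WINDOW
) -> Optional[dict]:
    """Return the peer created closest to the key's last use, or None."""
    if key.get("used_times", 0) <= 0:
        return None  # key has not been consumed yet
    used_at = parse_ts(key["last_used"])
    best = None
    best_delta = None
    for peer in peers:
        # "created_at" is an assumed field name for the peer creation time.
        delta = abs(parse_ts(peer["created_at"]) - used_at)
        if delta <= window and (best_delta is None or delta < best_delta):
            best, best_delta = peer, delta
    return best
```
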
---

## Pain Point 2: Per-User vs Per-Role Setup Keys - SOLVED

### Problem

Reusable per-role keys (e.g., `pilot-onboarding`) don't provide:
- Audit trail (who enrolled which device?)
- Individual revocation
- Usage attribution

### Solution

**One-off keys per user/device:**

```hcl
resource "netbird_setup_key" "pilot_ivanov" {
  name        = "pilot-ivanov"
  type        = "one-off" # Single use
  auto_groups = [netbird_group.pilots.id]
  usage_limit = 1
  ephemeral   = false
}
```

**Benefits:**
- Key name = audit trail (linked to ticket/user)
- Key is consumed after single use
- Individual keys can be revoked before use
- Watcher uses key name as peer name automatically

---

## Pain Point 3: Secure Key Distribution

### Current Workflow

1. CI/CD creates setup key
2. Engineer retrieves key locally: `terraform output -raw pilot_ivanov_key`
3. Engineer sends key to operator via secure channel (Signal, encrypted email)
4. Operator uses key within expiry window

### Considerations

- Keys are sensitive - anyone holding an unused key can enroll a device
- One-off keys mitigate the risk - each key works once, so a leaked copy is useless after the intended enrollment
- Short expiry (7 days) limits the exposure window

### Future Improvements (If Needed)

| Option | Description |
|--------|-------------|
| Ticket integration | CI posts key directly to ticket system |
| Secrets manager | Store in Vault/1Password, notify engineer |
| Self-service portal | Operator requests key, gets it directly |

For ~100 operators with ticket-based workflow, manual retrieval is acceptable.

---

## Final Workflow

```
1. Ticket: "Onboard pilot Ivanov with BlastPilot"

2. Engineer adds to terraform/setup_keys.tf:
   - netbird_setup_key.pilot_ivanov (one-off, 7 days)

3. Engineer creates PR -> CI shows plan

4. PR merged -> CI applies -> key created

5. Engineer retrieves: terraform output -raw pilot_ivanov_key

6. Engineer sends key to operator via Signal/email

7. Operator installs NetBird, enrolls with key

8. Watcher auto-renames peer to "pilot-ivanov"

9. Ticket closed
```

**Engineer time:** ~2 minutes (Terraform edit + key retrieval + send)

**Automation:** Full for groups, policies, key creation, and peer naming; handing the key to the operator stays manual (see Pain Point 3)