added netbird-watcher script
All checks were successful
Terraform / terraform (push) Successful in 7s

Prox
2026-02-15 19:11:39 +02:00
parent ec0d96f6a0
commit ca546ff6d8
10 changed files with 803 additions and 275 deletions

244
README.md

@@ -5,7 +5,7 @@ Proof-of-concept for managing NetBird VPN configuration via Infrastructure as Co
## Project Status: POC Complete
**Start date:** 2026-02-15
**Status:** Core functionality working, remaining pain points documented
**Status:** Full automation implemented including peer auto-naming
### What Works
@@ -14,13 +14,7 @@ Proof-of-concept for managing NetBird VPN configuration via Infrastructure as Co
- [x] Gitea Actions runner for CI/CD
- [x] Terraform implementation - creates groups, policies, setup keys
- [x] CI/CD pipeline - PR shows plan, merge-to-main applies changes
### Remaining Pain Points
See [PAIN_POINTS.md](./PAIN_POINTS.md) for detailed analysis of:
- Peer naming automation (no link between setup keys and enrolled peers)
- Per-user vs per-role setup keys
- Secure key distribution to operators
- [x] **Watcher service** - automatically renames peers based on setup key names
---
@@ -29,185 +23,183 @@ See [PAIN_POINTS.md](./PAIN_POINTS.md) for detailed analysis of:
```
+-------------------+ PR/Merge +-------------------+
| Engineer | ----------------> | Gitea |
| (edits .tf) | | (gitea-poc.*) |
+-------------------+ +-------------------+
|
| CI/CD
| (creates setup | | (CI/CD) |
| key: pilot-X) | +-------------------+
+-------------------+ |
| terraform apply
v
+-------------------+
| Terraform |
| (in Actions) |
+-------------------+
|
| API calls
v
+-------------------+ Enroll +-------------------+
| Operators | ----------------> | NetBird |
| (use setup keys) | | (netbird-poc.*) |
+-------------------+ +-------------------+
| Watcher Service | <---- polls ----> | NetBird API |
| (auto-rename) | +-------------------+
+-------------------+ ^
| enrolls
+-------------------+ |
| Operator | -------------------------+
| (uses setup key) | peer appears as "DESKTOP-XYZ"
+-------------------+ watcher renames to "pilot-X"
```
## Complete Workflow
1. **Ticket arrives:** "Onboard pilot Ivanov"
2. **Engineer adds to Terraform:**
```hcl
resource "netbird_setup_key" "pilot_ivanov" {
name = "pilot-ivanov" # <-- This becomes the peer name
type = "one-off"
auto_groups = [netbird_group.pilots.id]
usage_limit = 1
}
```
3. **Engineer creates PR** -> CI runs `terraform plan`
4. **PR merged** -> CI runs `terraform apply` -> setup key created
5. **Engineer retrieves key:** `terraform output -raw pilot_ivanov_key`
6. **Engineer sends key to operator** (via secure channel)
7. **Operator enrolls** -> peer appears as `DESKTOP-ABC123`
8. **Watcher detects** consumed key, renames peer to `pilot-ivanov`
9. **Done** - peer is correctly named, no manual intervention
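Step 8 is the only part of this workflow that needs custom logic. The following is a minimal sketch of one way a watcher could pair consumed setup keys with freshly enrolled peers; it is not the actual `netbird_watcher.py`. It assumes the API responses have already been fetched and parsed into dictionaries. The field names (`used_times`, `last_used`, `created_at`), the 60-second matching window, and the `plan_renames` helper are all illustrative assumptions.

```python
from datetime import datetime, timedelta

# Hypothetical matching window: how close a peer's creation time must be
# to the key's last use for the two to be paired.
MATCH_WINDOW = timedelta(seconds=60)


def parse_ts(value):
    """Parse an ISO 8601 timestamp such as '2026-02-15T17:11:39Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def plan_renames(setup_keys, peers, processed_ids):
    """Return (peer_id, new_name, key_id) tuples for peers that should be renamed.

    setup_keys, peers: lists of dicts (assumed field names, see the text above).
    processed_ids: set of setup key IDs that were already handled.
    """
    renames = []
    claimed = set()  # peer IDs already paired in this pass
    for key in setup_keys:
        if key["id"] in processed_ids or key.get("used_times", 0) < 1:
            continue  # already handled, or not consumed yet
        used_at = parse_ts(key["last_used"])
        # Candidates: unclaimed peers created near the key's last use
        # that do not already carry the key's name.
        candidates = [
            p for p in peers
            if p["id"] not in claimed
            and p["name"] != key["name"]
            and abs(parse_ts(p["created_at"]) - used_at) <= MATCH_WINDOW
        ]
        if not candidates:
            continue
        # Pick the peer whose creation time is closest to the key usage.
        best = min(candidates, key=lambda p: abs(parse_ts(p["created_at"]) - used_at))
        claimed.add(best["id"])
        renames.append((best["id"], key["name"], key["id"]))
    return renames
```

Time-based matching is a heuristic: two operators enrolling within the same window could be paired with each other's keys, so a production watcher would likely want a stronger signal if the API exposes one.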
---
## Directory Structure
```
netbird-gitops-poc/
├── ansible/ # Deployment playbooks
│ ├── caddy/ # Shared reverse proxy
│ ├── gitea/ # Standalone Gitea (no OAuth)
│ ├── gitea/ # Standalone Gitea
│ ├── gitea-runner/ # Gitea Actions runner
│ └── netbird/ # NetBird with embedded IdP
├── terraform/ # Terraform configuration (Gitea repo content)
│ ├── netbird/ # NetBird server
│ └── netbird-watcher/ # Peer renamer service
├── terraform/ # Terraform configuration
│ ├── .gitea/workflows/ # CI/CD workflow
│ │ └── terraform.yml
│ ├── main.tf # Provider config
│ ├── variables.tf # Input variables
│ ├── groups.tf # Group resources
│ ├── policies.tf # Policy resources
│ ├── setup_keys.tf # Setup key resources
│ ├── outputs.tf # Output values
│ ├── terraform.tfstate # State (committed for POC)
│ ├── terraform.tfvars # Secrets (gitignored)
│ └── terraform.tfvars.example
│ ├── main.tf
│ ├── groups.tf
│ ├── policies.tf
│ ├── setup_keys.tf
│ └── outputs.tf
├── watcher/ # Watcher service source
│ ├── netbird_watcher.py
│ ├── netbird-watcher.service
│ └── README.md
├── README.md
└── PAIN_POINTS.md
```
## Quick Start
---
## Deployment
### Prerequisites
- VPS with Docker
- DNS records pointing to VPS
- Ansible installed locally
- Terraform installed locally (for initial setup)
- Terraform installed locally
### 1. Deploy Infrastructure
### 1. Deploy Core Infrastructure
```bash
# 1. NetBird (generates secrets, needs vault password)
# NetBird
cd ansible/netbird
./generate-vault.sh
ansible-vault encrypt group_vars/vault.yml
ansible-playbook -i poc-inventory.yml playbook-ssl.yml --ask-vault-pass
# 2. Gitea
# Gitea
cd ../gitea
ansible-playbook -i poc-inventory.yml playbook.yml
# 3. Caddy (reverse proxy for both)
# Caddy (reverse proxy)
cd ../caddy
ansible-playbook -i poc-inventory.yml playbook.yml
# 4. Gitea Runner (get token from Gitea Admin -> Actions -> Runners)
# Gitea Runner
cd ../gitea-runner
ansible-playbook -i poc-inventory.yml playbook.yml -e vault_gitea_runner_token=<TOKEN>
```
### 2. Initial Terraform Setup (Local)
### 2. Deploy Watcher Service
```bash
cd ansible/netbird-watcher
ansible-playbook -i poc-inventory.yml playbook.yml -e vault_netbird_token=<TOKEN>
```
### 3. Initialize Terraform
```bash
cd terraform
# Create tfvars with your NetBird PAT
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars with actual token
# Initialize and apply
# Edit terraform.tfvars with NetBird PAT
terraform init
terraform apply
```
### 3. Push to Gitea
### 4. Configure Gitea
Push the `terraform/` directory to the Gitea repo, then configure the repository secret `NETBIRD_TOKEN`.
---
## Adding a New Operator
1. Add setup key to `terraform/setup_keys.tf`:
```hcl
resource "netbird_setup_key" "pilot_ivanov" {
name = "pilot-ivanov"
type = "one-off"
auto_groups = [netbird_group.pilots.id]
usage_limit = 1
ephemeral = false
}
output "pilot_ivanov_key" {
value = netbird_setup_key.pilot_ivanov.key
sensitive = true
}
```
2. Commit, push, merge PR
3. Retrieve key:
```bash
terraform output -raw pilot_ivanov_key
```
4. Send key to operator
5. Operator enrolls -> watcher auto-renames peer
---
## Monitoring
### Watcher Service
```bash
cd terraform
git init
git add .
git commit -m "Initial Terraform config"
git remote add origin git@gitea-poc.networkmonitor.cc:admin/netbird-iac.git
git push -u origin main
# Status
systemctl status netbird-watcher
# Logs
journalctl -u netbird-watcher -f
# Processed keys
cat /var/lib/netbird-watcher/state.json
```
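The state file lets the watcher skip keys it has already handled across restarts. A minimal sketch of that bookkeeping follows. The JSON layout (an object with a `processed_keys` list) and the helper names are assumptions for illustration, not necessarily the format `netbird_watcher.py` writes.

```python
import json
import os
import tempfile


def load_processed(path):
    """Return the set of processed setup key IDs, or an empty set if no state exists."""
    try:
        with open(path, encoding="utf-8") as fh:
            return set(json.load(fh).get("processed_keys", []))
    except FileNotFoundError:
        return set()


def save_processed(path, processed):
    """Write the state via a temp file so a crash mid-write cannot corrupt it."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"processed_keys": sorted(processed)}, fh, indent=2)
    os.replace(tmp_path, path)  # atomic rename on the same filesystem
```

Writing to a temporary file and renaming it keeps the previous state intact if the service is stopped in the middle of a save.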
### 4. Configure Gitea Secrets
In Gitea repository Settings -> Actions -> Secrets:
- `NETBIRD_TOKEN`: Your NetBird PAT
### 5. Make Changes via GitOps
Edit Terraform files locally, push to create PR:
```hcl
# groups.tf - add a new group
resource "netbird_group" "new_team" {
name = "new-team"
}
```
```bash
git checkout -b add-new-team
git add groups.tf
git commit -m "Add new-team group"
git push -u origin add-new-team
# Create PR in Gitea -> CI runs terraform plan
# Merge PR -> CI runs terraform apply
```
---
## CI/CD Workflow
The `.gitea/workflows/terraform.yml` workflow:
| Event | Action |
|-------|--------|
| Pull Request | `terraform plan` (preview changes) |
| Push to main | `terraform apply` (apply changes) |
| After apply | Commit updated state file |
**State Management:** State is committed to git (acceptable for single-operator POC). For production, use a remote backend.
---
## Key Discoveries
### NetBird API Behavior
1. **Peer IDs are not predictable** - Generated server-side at enrollment time
2. **No setup key -> peer link** - NetBird doesn't record which setup key enrolled a peer
3. **Peers self-enroll** - Cannot create peers via API (WireGuard keypair generated locally)
4. **Terraform URL format** - Use `https://domain.com` NOT `https://domain.com/api`
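
Discovery 4 is easy to trip over when the same URL is reused across tools. Below is a small sketch of a guard that strips a trailing `/api` before the value is handed to the Terraform provider. `normalize_management_url` is a hypothetical helper, not part of NetBird or the provider.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_management_url(url):
    """Return the bare base URL, dropping a trailing '/api' and any trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    if path.endswith("/api"):
        path = path[: -len("/api")]
    # Query string and fragment are discarded; only scheme, host, and path remain.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))
```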
---
## Credentials Reference (POC Only)
| Service | Credential | Location |
|---------|------------|----------|
| NetBird PAT | `nbp_T3yD...` | Dashboard -> Team -> Service Users |
| Gitea | admin user | Created during setup |
| VPS | root | `observability-poc.networkmonitor.cc` |
**Warning:** Rotate all credentials before any production use.
---
## Cleanup
```bash
# Destroy Terraform resources
cd terraform
terraform destroy
cd terraform && terraform destroy
# Stop VPS services
# Stop services on VPS
ssh root@observability-poc.networkmonitor.cc
systemctl stop netbird-watcher
cd /opt/caddy && docker compose down
cd /opt/gitea && docker compose down
cd /opt/netbird && docker compose down
```
---
## Next Steps
See [PAIN_POINTS.md](./PAIN_POINTS.md) for remaining challenges to address before production use.