Fabric Orchestrator
Declarative Network Infrastructure Management for Arista EVPN-VXLAN Fabrics
A workflow-based orchestration system that uses NetBox as Source of Truth, Prefect for orchestration, and gNMI/YANG for atomic configuration management of Arista data center fabrics.
🎯 Project Vision
Transform network infrastructure management from imperative scripting to true declarative infrastructure-as-code, where:
- Intent is defined in NetBox (Custom Fields, Native Models, BGP Plugin)
- Orchestration is handled by Prefect (Python-native @flow and @task decorators)
- State is continuously monitored via gNMI Subscribe
- Changes are computed as diffs and applied atomically via gNMI Set
- Drift is detected and optionally auto-remediated
Think terraform plan and terraform apply, but for your network fabric — powered by Prefect flows.
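The plan step can be pictured as a comparison between two flat mappings of YANG path to value. The following is a minimal sketch rather than the project's diff engine: the function name `plan_changes`, the shape of the change records, and the sample interface descriptions are illustrative assumptions.

```python
from typing import Any


def plan_changes(want: dict[str, Any], have: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the operations needed to move `have` to `want`.

    Both arguments are assumed to be flat {yang_path: value} mappings.
    """
    changes: list[dict[str, Any]] = []
    # Paths that are missing or hold a different value need an update.
    for path in sorted(want):
        if path not in have or have[path] != want[path]:
            changes.append({"op": "update", "path": path, "value": want[path]})
    # Paths present on the device but absent from intent are deleted.
    for path in sorted(set(have) - set(want)):
        changes.append({"op": "delete", "path": path})
    return changes


# Hypothetical intent (want) and device state (have).
want = {
    "/interfaces/interface[name=Ethernet1]/config/description": "to-spine1",
    "/interfaces/interface[name=Ethernet2]/config/description": "to-spine2",
}
have = {
    "/interfaces/interface[name=Ethernet1]/config/description": "uplink",
    "/interfaces/interface[name=Ethernet3]/config/description": "stale",
}

for change in plan_changes(want, have):
    print(change)
```

An empty result means the device already matches intent. A real engine would likely restrict deletes to paths it manages, so that unrelated device configuration is left alone.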
🏗️ Architecture
┌───────────────────────────────────────────────────────────────────────────────────────┐
│ INTENT LAYER │
│ ┌─────────────────────┐ ┌──────────────────────────┐ ┌──────────────────────┐ │
│ │ NetBox │ │ Custom Fields / │ │ netbox-bgp │ │
│ │ (SoT) │◄───│ Native Models │◄───│ Plugin │ │
│ └──────────┬──────────┘ └──────────────────────────┘ └──────────────────────┘ │
└─────────────┼─────────────────────────────────────────────────────────────────────────┘
│ Webhook / Polling
▼
┌──────────────────────────────────────────────────────────────────────────────┐
│ ORCHESTRATION LAYER (PREFECT) │
│ │
│ ┌────────────────────────────────────────────────────────────────────────┐ │
│ │ Prefect Flows (Python) │ │
│ │ ┌───────────────────┐ ┌───────────────────┐ ┌─────────────────────┐ │ │
│ │ │ fabric_reconcile │ │ handle_drift │ │ drift_remediation │ │ │
│ │ │ (plan/apply) │ │ (subscribe) │ │ (auto-fix) │ │ │
│ │ └───────────────────┘ └───────────────────┘ └─────────────────────┘ │ │
│ └────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────────────────────┐ │
│ │ Prefect Tasks (Python) │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌───────────────────────┐ │ │
│ │ │ Intent Parser │ │ Diff Engine │ │ gNMI Client │ │ │
│ │ │ (NetBox→YANG) │ │ (Want vs Have) │ │ (pygnmi wrapper) │ │ │
│ │ └─────────────────┘ └─────────────────┘ └───────────────────────┘ │ │
│ └────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────────────────────┐ │
│ │ FastAPI Webhook Receiver │ Prefect .serve() │ Prefect Server (UI) │ │
│ └────────────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────────────┘
│ gNMI Get/Set/Subscribe
▼
┌──────────────────────────────────────────────────────────────────────────────┐
│ DEVICE LAYER │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ spine1 │ │ spine2 │ │ leaf1 │ │ leaf2 │ ... │
│ │ gNMI:6030 │ │ gNMI:6030 │ │ gNMI:6030 │ │ gNMI:6030 │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘ │
└──────────────────────────────────────────────────────────────────────────────┘
🎛 Why Prefect?
We chose Prefect as the orchestration engine for several reasons:
| Feature | Benefit |
|---|---|
| Python-native workflows | Use @flow and @task decorators — no YAML, just Python |
| Free secrets management | Native Secret blocks for credentials (free in OSS) |
| Built-in UI | Dashboard, logs, metrics, execution history via prefect server start |
| No containerization required | Run flows directly with .serve() — no Docker needed |
| Event-driven triggers | Schedule, webhooks (via FastAPI), flow triggers out of the box |
| Task dependencies | Automatic dependency ordering via task result passing or wait_for |
| Retry & error handling | Built-in retry policies with @task(retries=3) |
| Human-in-the-loop | Native pause_flow_run() for approval workflows |
🎯 Target Fabric
This project is designed for the Arista EVPN-VXLAN ContainerLab topology:
- 2 Spines (BGP Route Reflectors, AS 65000)
- 8 Leafs (4 MLAG VTEP pairs, AS 65001-65004)
- cEOS 4.35.0F with gNMI enabled
- EVPN Type-2 (L2 VXLAN) and Type-5 (L3 VXLAN) support
Reference: arista-evpn-vxlan-clab
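For illustration, the topology above can be written down as data. This sketch assumes devices are named spine1-spine2 and leaf1-leaf8 and that consecutive leafs form an MLAG pair sharing one AS number (leaf1 and leaf2 in AS 65001, and so on). The README states only the device counts and AS ranges, so the name-to-AS mapping and the `build_inventory` helper are assumptions.

```python
SPINE_ASN = 65000
FIRST_LEAF_ASN = 65001


def build_inventory(spines: int = 2, leafs: int = 8) -> dict[str, dict]:
    """Build a {hostname: attributes} inventory for the lab fabric."""
    inventory: dict[str, dict] = {}
    for i in range(1, spines + 1):
        inventory[f"spine{i}"] = {"role": "spine", "asn": SPINE_ASN}
    for i in range(1, leafs + 1):
        pair = (i - 1) // 2  # leaf1/leaf2 -> 0, leaf3/leaf4 -> 1, ...
        inventory[f"leaf{i}"] = {
            "role": "leaf",
            "asn": FIRST_LEAF_ASN + pair,
            "mlag_pair": pair + 1,
        }
    return inventory


inventory = build_inventory()
for name, attrs in inventory.items():
    print(name, attrs)
```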
📋 Project Phases
Progress is tracked via issues. See all issues or filter by phase:
| Phase | Description | Issues |
|---|---|---|
| Phase 1 | YANG Path Discovery - Map EOS 4.35.0F YANG models, validate gNMI | phase-1-yang-discovery |
| Phase 2 | Core Components - NetBox client, diff engine, gNMI operations | phase-2-minimal-reconciler |
| Phase 3 | Full Fabric - BGP, MLAG, VRFs, YANG mappers | phase-3-full-fabric |
| Phase 4 | Prefect Integration - Flows, webhooks, drift detection | phase-4-event-driven |
📌 Project Board: View Kanban
📁 Project Structure
fabric-orchestrator/
├── README.md
├── pyproject.toml
│
├── src/ # Python package
│ ├── __init__.py
│ ├── cli.py # CLI for YANG discovery (discover commands)
│ │
│ ├── flows/ # Prefect flows
│ │ ├── __init__.py
│ │ ├── reconcile.py # @flow fabric_reconcile (plan/apply)
│ │ ├── drift.py # @flow handle_drift
│ │ └── remediation.py # @flow drift_remediation
│ │
│ ├── api/ # FastAPI webhook receiver
│ │ ├── __init__.py
│ │ └── webhooks.py # NetBox webhook endpoint
│ │
│ ├── services/ # Long-running services
│ │ ├── __init__.py
│ │ └── drift_monitor.py # gNMI Subscribe drift detection
│ │
│ ├── gnmi/
│ │ ├── __init__.py
│ │ ├── client.py # gNMI client wrapper (pygnmi)
│ │ └── README.md
│ │
│ ├── netbox/
│ │ ├── __init__.py
│ │ ├── client.py # NetBox API client (pynetbox)
│ │ └── models.py # Pydantic models for intent validation
│ │
│ └── yang/
│ ├── __init__.py
│ ├── mapper.py # NetBox intent → YANG paths
│ ├── paths.py # YANG path definitions
│ └── dependencies.py # Dependency ordering graph
│
├── tests/
│
└── docs/
├── cli-user-guide.md # CLI documentation
├── yang-paths.md # Documented YANG paths
└── netbox-data-model.md # NetBox schema documentation
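The dependency ordering that yang/dependencies.py is responsible for can be sketched with the standard library's `graphlib`. The object names and edges below (for example, a VNI mapping depending on its VLAN and VRF) are illustrative assumptions, not the project's actual graph.

```python
from graphlib import TopologicalSorter

# Each key depends on the objects in its value set (hypothetical edges).
DEPENDS_ON: dict[str, set[str]] = {
    "vlan": set(),
    "vrf": set(),
    "vxlan_vni": {"vlan", "vrf"},
    "bgp_neighbor": {"vrf"},
    "evpn_instance": {"vxlan_vni", "bgp_neighbor"},
}


def apply_order(graph: dict[str, set[str]]) -> list[str]:
    """Return an order in which every object follows its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = apply_order(DEPENDS_ON)
print(order)
```

Creation would follow this order, while deletion would typically walk it in reverse. `TopologicalSorter` raises `graphlib.CycleError` if the graph contains a cycle.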
🛠️ Technology Stack
| Component | Technology | Purpose |
|---|---|---|
| Source of Truth | NetBox + BGP Plugin | Intent definition via native models |
| Orchestrator | Prefect | Python-native workflow orchestration |
| Webhooks | FastAPI | Receive NetBox webhooks |
| Transport | gNMI | Configuration and telemetry |
| Data Models | YANG (OpenConfig + Arista) | Structured configuration |
| Python Library | pygnmi + pynetbox | gNMI/NetBox interactions |
| CLI | Click + Rich | YANG discovery tools |
| Validation | Pydantic v2 | Intent data validation |
| Lab | ContainerLab + cEOS | Development environment |
🔗 Related Projects
- arista-evpn-vxlan-clab - Target fabric topology
- projet-vxlan-automation - Previous NetBox RenderConfig work
- Arista YANG Models - EOS 4.35.0F YANG definitions
- Prefect Documentation - Orchestration platform docs
🚀 Getting Started
Prerequisites
- Python 3.12+
- uv package manager
- Access to ContainerLab with cEOS images
- NetBox instance with BGP plugin
Quick Start
# Clone the repository
git clone https://gitea.arnodo.fr/Damien/fabric-orchestrator.git
cd fabric-orchestrator
# Install Python dependencies
uv sync
# Configure Prefect secrets
python -c "
from prefect.blocks.system import Secret
from prefect.variables import Variable
Secret(value='your-netbox-token').save('netbox-token', overwrite=True)
Secret(value='your-gnmi-password').save('gnmi-password', overwrite=True)
Variable.set('netbox_url', 'https://netbox.example.com')
Variable.set('gnmi_username', 'admin')
"
# Start Prefect server (optional, for UI)
prefect server start
# Verify gNMI connectivity to your fabric
uv run fabric-orch discover capabilities --target leaf1:6030
# Explore YANG paths
uv run fabric-orch discover get --target leaf1:6030 \
--path "/interfaces/interface[name=Ethernet1]/state"
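The path passed to `--path` above follows the gNMI string convention: slash-separated elements, each optionally followed by `[key=value]` selectors. As a rough sketch of that structure, the hypothetical helper below splits such a string into elements and keys. Client libraries normally perform this conversion themselves, and this version does not handle escaped `]` characters inside key values.

```python
import re

# One path element: "/name" followed by zero or more "[key=value]" selectors.
_ELEM = re.compile(r"/([^/\[\]]+)((?:\[[^\]=]+=[^\]]*\])*)")
_KEY = re.compile(r"\[([^\]=]+)=([^\]]*)\]")


def parse_gnmi_path(path: str) -> list[tuple[str, dict[str, str]]]:
    """Split a gNMI path string into (element_name, keys) tuples."""
    if path in ("", "/"):
        return []  # root path has no elements
    elems: list[tuple[str, dict[str, str]]] = []
    pos = 0
    while pos < len(path):
        match = _ELEM.match(path, pos)
        if match is None:
            raise ValueError(f"invalid path at offset {pos}: {path!r}")
        name, raw_keys = match.groups()
        elems.append((name, dict(_KEY.findall(raw_keys))))
        pos = match.end()
    return elems


for elem in parse_gnmi_path("/interfaces/interface[name=Ethernet1]/state"):
    print(elem)
```

Because key values are matched up to the closing bracket, an interface name that contains a slash, such as `Ethernet1/1`, stays inside a single element.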
Running Flows
from src.flows.reconcile import fabric_reconcile
# Plan only (dry-run)
result = fabric_reconcile(dry_run=True)
# Plan for a specific device
result = fabric_reconcile(device="leaf1", dry_run=True)
# Apply changes automatically
result = fabric_reconcile(auto_apply=True, dry_run=False)
Deploying with Scheduling
# Start the flow with scheduling (runs every 6 hours)
python -m src.flows.reconcile
# Or trigger an ad-hoc run of the served deployment via the Prefect CLI
prefect deployment run fabric-reconcile/fabric-reconcile-scheduled
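The six-hour schedule corresponds to the cron expression `0 */6 * * *`, which fires at 00:00, 06:00, 12:00 and 18:00. As a small sketch of how that expression is evaluated, the hypothetical helper below computes the next firing time after a given moment. Prefect does its own cron handling, so this is only for reasoning about when runs occur.

```python
from datetime import datetime, timedelta


def next_run(now: datetime, every_hours: int = 6) -> datetime:
    """Next firing time of '0 */N * * *' strictly after `now`."""
    # Start from the next top of the hour, then advance until the hour matches.
    candidate = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    while candidate.hour % every_hours != 0:
        candidate += timedelta(hours=1)
    return candidate


print(next_run(datetime(2025, 1, 1, 5, 30)))
print(next_run(datetime(2025, 1, 1, 23, 10)))
```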
Starting the Webhook Receiver
# Start FastAPI webhook server
uvicorn src.api.webhooks:app --host 0.0.0.0 --port 8000
Prefect Flow Example
from prefect import flow, task
from prefect.blocks.system import Secret
from prefect.variables import Variable


@task(retries=2, retry_delay_seconds=10)
def get_fabric_intent(device: str | None = None) -> dict:
    """Retrieve fabric intent from NetBox."""
    from src.netbox import FabricNetBoxClient

    netbox_url = Variable.get("netbox_url")
    netbox_token = Secret.load("netbox-token").get()
    client = FabricNetBoxClient(url=netbox_url, token=netbox_token)
    return client.get_fabric_intent() if not device else client.get_device_intent(device)


@task(retries=2, retry_delay_seconds=10)
def get_current_state(device: str | None = None) -> dict:
    """Retrieve current state from the fabric via gNMI Get."""
    # Placeholder so the example is complete: collect state via gNMI Get...
    return {}


@task
def compute_diff(intent: dict, current: dict) -> list[dict]:
    """Compute diff between desired and current state."""
    from src.reconciler.diff import compute_diff as diff_engine

    return diff_engine(want=intent, have=current)


@task(retries=1)
def apply_changes(changes: list[dict], dry_run: bool = True) -> dict:
    """Apply changes via gNMI Set."""
    if dry_run:
        return {"applied": False, "changes": changes}
    # Apply via gNMI...
    return {"applied": True, "changes": changes}


@flow(log_prints=True, name="fabric-reconcile")
def fabric_reconcile(
    device: str | None = None,
    auto_apply: bool = False,
    dry_run: bool = True,
) -> dict:
    """Reconcile fabric state with NetBox intent."""
    print("🔄 Starting fabric reconciliation")
    intent = get_fabric_intent(device)
    current = get_current_state(device)
    changes = compute_diff(intent, current)

    if not changes:
        print("✅ No changes detected - fabric is in sync")
        return {"changes": [], "in_sync": True}

    # Apply only when explicitly requested and not in dry-run mode
    should_apply = auto_apply and not dry_run
    result = apply_changes(changes, dry_run=not should_apply)
    return {"changes": changes, "applied": result["applied"]}


if __name__ == "__main__":
    # Serve the flow as a deployment that runs every 6 hours
    fabric_reconcile.serve(
        name="fabric-reconcile-scheduled",
        cron="0 */6 * * *",
        tags=["network", "fabric"],
    )
Status: 🚧 Active Development - Phase 2 (Core Components) & Phase 4 (Prefect Integration)