# Fabric Orchestrator

**Declarative Network Infrastructure Management for Arista EVPN-VXLAN Fabrics**

A workflow-based orchestration system that uses NetBox as Source of Truth, [Prefect](https://prefect.io) for orchestration, and gNMI/YANG for atomic configuration management of Arista data center fabrics.

## 🎯 Project Vision

Transform network infrastructure management from imperative scripting to true declarative infrastructure-as-code, where:

- **Intent** is defined in NetBox (Custom Fields, Native Models, BGP Plugin)
- **Orchestration** is handled by Prefect (Python-native `@flow` and `@task` decorators)
- **State** is continuously monitored via gNMI Subscribe
- **Changes** are computed as diffs and applied atomically via gNMI Set
- **Drift** is detected and optionally auto-remediated

Think `terraform plan` and `terraform apply`, but for your network fabric, powered by Prefect flows.

## 🏗️ Architecture

```
┌──────────────────────────────────────────────────────────────────────────────┐
│                                 INTENT LAYER                                 │
│ ┌────────────────────┐    ┌──────────────────────┐    ┌────────────────────┐ │
│ │       NetBox       │    │   Custom Fields /    │    │     netbox-bgp     │ │
│ │       (SoT)        │◄───│    Native Models     │◄───│       Plugin       │ │
│ └─────────┬──────────┘    └──────────────────────┘    └────────────────────┘ │
└───────────┼──────────────────────────────────────────────────────────────────┘
            │ Webhook / Polling
            ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                        ORCHESTRATION LAYER (PREFECT)                         │
│                                                                              │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │                         Prefect Flows (Python)                         │  │
│  │ ┌────────────────────┐  ┌────────────────────┐  ┌────────────────────┐ │  │
│  │ │  fabric_reconcile  │  │    handle_drift    │  │ drift_remediation  │ │  │
│  │ │    (plan/apply)    │  │    (subscribe)     │  │     (auto-fix)     │ │  │
│  │ └────────────────────┘  └────────────────────┘  └────────────────────┘ │  │
│  └────────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │                         Prefect Tasks (Python)                         │  │
│  │ ┌────────────────────┐  ┌────────────────────┐  ┌────────────────────┐ │  │
│  │ │   Intent Parser    │  │    Diff Engine     │  │    gNMI Client     │ │  │
│  │ │   (NetBox→YANG)    │  │   (Want vs Have)   │  │  (pygnmi wrapper)  │ │  │
│  │ └────────────────────┘  └────────────────────┘  └────────────────────┘ │  │
│  └────────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │   FastAPI Webhook Receiver │ Prefect .serve() │ Prefect Server (UI)    │  │
│  └────────────────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────────────┘
                                       │ gNMI Get/Set/Subscribe
                                       ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                                 DEVICE LAYER                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
│  │    spine1    │  │    spine2    │  │    leaf1     │  │    leaf2     │ ...  │
│  │  gNMI:6030   │  │  gNMI:6030   │  │  gNMI:6030   │  │  gNMI:6030   │      │
│  └──────────────┘  └──────────────┘  └──────────────┘  └──────────────┘      │
└──────────────────────────────────────────────────────────────────────────────┘
```

## 🎛 Why Prefect?

We chose [Prefect](https://prefect.io) as the orchestration engine for several reasons:

| Feature | Benefit |
|---------|---------|
| **Python-native workflows** | Use `@flow` and `@task` decorators: no YAML, just Python |
| **Free secrets management** | Native `Secret` blocks for credentials (free in OSS) |
| **Built-in UI** | Dashboard, logs, metrics, execution history via `prefect server start` |
| **No containerization required** | Run flows directly with `.serve()`; no Docker needed |
| **Event-driven triggers** | Schedule, webhooks (via FastAPI), flow triggers out of the box |
| **Task dependencies** | Automatic dependency ordering via task result passing or `wait_for` |
| **Retry & error handling** | Built-in retry policies with `@task(retries=3)` |
| **Human-in-the-loop** | Native `pause_flow_run()` for approval workflows |

## 🎯 Target Fabric

This project is designed for the Arista EVPN-VXLAN ContainerLab topology:

- **2 Spines** (BGP Route Reflectors, AS 65000)
- **8 Leafs** (4 MLAG VTEP pairs, AS 65001-65004)
- **cEOS 4.35.0F** with gNMI enabled
- **EVPN Type-2** (L2 VXLAN) and **Type-5** (L3 VXLAN) support

Reference: [arista-evpn-vxlan-clab](https://gitea.arnodo.fr/Damien/arista-evpn-vxlan-clab)

## 📋 Project Phases

Progress is tracked
via issues. See [all issues](https://gitea.arnodo.fr/Damien/fabric-orchestrator/issues) or filter by phase:

| Phase | Description | Issues |
|-------|-------------|--------|
| **Phase 1** | YANG Path Discovery - Map EOS 4.35.0F YANG models, validate gNMI | [phase-1-yang-discovery](https://gitea.arnodo.fr/Damien/fabric-orchestrator/issues?type=all&state=all&labels=1) |
| **Phase 2** | Core Components - NetBox client, diff engine, gNMI operations | [phase-2-minimal-reconciler](https://gitea.arnodo.fr/Damien/fabric-orchestrator/issues?type=all&state=all&labels=2) |
| **Phase 3** | Full Fabric - BGP, MLAG, VRFs, YANG mappers | [phase-3-full-fabric](https://gitea.arnodo.fr/Damien/fabric-orchestrator/issues?type=all&state=all&labels=3) |
| **Phase 4** | Prefect Integration - Flows, webhooks, drift detection | [phase-4-event-driven](https://gitea.arnodo.fr/Damien/fabric-orchestrator/issues?type=all&state=all&labels=4) |

📌 **Project Board**: [View Kanban](https://gitea.arnodo.fr/Damien/fabric-orchestrator/projects)

## 📁 Project Structure

```
fabric-orchestrator/
├── README.md
├── pyproject.toml
│
├── src/                          # Python package
│   ├── __init__.py
│   ├── cli.py                    # CLI for YANG discovery (discover commands)
│   │
│   ├── flows/                    # Prefect flows
│   │   ├── __init__.py
│   │   ├── reconcile.py          # @flow fabric_reconcile (plan/apply)
│   │   ├── drift.py              # @flow handle_drift
│   │   └── remediation.py        # @flow drift_remediation
│   │
│   ├── api/                      # FastAPI webhook receiver
│   │   ├── __init__.py
│   │   └── webhooks.py           # NetBox webhook endpoint
│   │
│   ├── services/                 # Long-running services
│   │   ├── __init__.py
│   │   └── drift_monitor.py      # gNMI Subscribe drift detection
│   │
│   ├── gnmi/
│   │   ├── __init__.py
│   │   ├── client.py             # gNMI client wrapper (pygnmi)
│   │   └── README.md
│   │
│   ├── netbox/
│   │   ├── __init__.py
│   │   ├── client.py             # NetBox API client (pynetbox)
│   │   └── models.py             # Pydantic models for intent validation
│   │
│   └── yang/
│       ├── __init__.py
│       ├── mapper.py             # NetBox intent → YANG paths
│       ├── paths.py              # YANG path definitions
│       └── dependencies.py       # Dependency ordering graph
│
├── tests/
│
└── docs/
    ├── cli-user-guide.md         # CLI documentation
    ├── yang-paths.md             # Documented YANG paths
    └── netbox-data-model.md      # NetBox schema documentation
```

## 🛠️ Technology Stack

| Component | Technology | Purpose |
|-----------|------------|---------|
| Source of Truth | NetBox + BGP Plugin | Intent definition via native models |
| Orchestrator | **Prefect** | Python-native workflow orchestration |
| Webhooks | FastAPI | Receive NetBox webhooks |
| Transport | gNMI | Configuration and telemetry |
| Data Models | YANG (OpenConfig + Arista) | Structured configuration |
| Python Library | pygnmi + pynetbox | gNMI/NetBox interactions |
| CLI | Click + Rich | YANG discovery tools |
| Validation | Pydantic v2 | Intent data validation |
| Lab | ContainerLab + cEOS | Development environment |

## 🔗 Related Projects

- [arista-evpn-vxlan-clab](https://gitea.arnodo.fr/Damien/arista-evpn-vxlan-clab) - Target fabric topology
- [projet-vxlan-automation](https://gitea.arnodo.fr/Damien/projet-vxlan-automation) - Previous NetBox RenderConfig work
- [Arista YANG Models](https://github.com/aristanetworks/yang/tree/master/EOS-4.35.0F) - EOS 4.35.0F YANG definitions
- [Prefect Documentation](https://docs.prefect.io) - Orchestration platform docs

## 📚 References

### Prefect

- [Prefect Documentation](https://docs.prefect.io)
- [Prefect Flows](https://docs.prefect.io/latest/develop/write-flows/)
- [Prefect Tasks](https://docs.prefect.io/latest/develop/write-tasks/)
- [Prefect Deployments](https://docs.prefect.io/latest/deploy/run-flows-in-local-processes/)
- [Prefect Secrets](https://docs.prefect.io/latest/develop/blocks/#secret)

### YANG / gNMI
- [Arista gNMI Documentation](https://aristanetworks.github.io/openmgmt/configuration/gnmi/)
- [OpenConfig Models](https://github.com/openconfig/public)
- [pygnmi Library](https://github.com/akarneliuk/pygnmi)

### EVPN-VXLAN

- [Arista BGP EVPN Configuration Example](https://overlaid.net/2019/01/27/arista-bgp-evpn-configuration-example/)
- [Arista EVPN Deployment Guide](https://www.arista.com/en/solutions/evpn-vxlan)

## 🚀 Getting Started

### Prerequisites

- Python 3.12+
- `uv` package manager
- Access to ContainerLab with cEOS images
- NetBox instance with BGP plugin

### Quick Start

```bash
# Clone the repository
git clone https://gitea.arnodo.fr/Damien/fabric-orchestrator.git
cd fabric-orchestrator

# Install Python dependencies
uv sync

# Configure Prefect secrets
python -c "
from prefect.blocks.system import Secret
from prefect.variables import Variable

Secret(value='your-netbox-token').save('netbox-token', overwrite=True)
Secret(value='your-gnmi-password').save('gnmi-password', overwrite=True)
Variable.set('netbox_url', 'https://netbox.example.com')
Variable.set('gnmi_username', 'admin')
"

# Start Prefect server (optional, for UI)
prefect server start

# Verify gNMI connectivity to your fabric
uv run fabric-orch discover capabilities --target leaf1:6030

# Explore YANG paths
uv run fabric-orch discover get --target leaf1:6030 \
  --path "/interfaces/interface[name=Ethernet1]/state"
```

### Running Flows

```python
from src.flows.reconcile import fabric_reconcile

# Plan only (dry-run)
result = fabric_reconcile(dry_run=True)

# Plan for a specific device
result = fabric_reconcile(device="leaf1", dry_run=True)

# Apply changes automatically
result = fabric_reconcile(auto_apply=True, dry_run=False)
```

### Deploying with Scheduling

```bash
# Serve the flow with scheduling (runs every 6 hours)
python -m src.flows.reconcile

# Or trigger a run of the served deployment via the Prefect CLI
prefect deployment run fabric-reconcile/fabric-reconcile-scheduled
```

### Starting the Webhook Receiver

```bash
# Start FastAPI webhook server
uvicorn src.api.webhooks:app --host 0.0.0.0 --port 8000
```

## Prefect Flow Example

```python
from prefect import flow, task
from prefect.blocks.system import Secret
from prefect.variables import Variable


@task(retries=2, retry_delay_seconds=10)
def get_fabric_intent(device: str | None = None) -> dict:
    """Retrieve fabric intent from NetBox."""
    from src.netbox import FabricNetBoxClient

    netbox_url = Variable.get("netbox_url")
    netbox_token = Secret.load("netbox-token").get()
    client = FabricNetBoxClient(url=netbox_url, token=netbox_token)
    return client.get_fabric_intent() if not device else client.get_device_intent(device)


@task(retries=2, retry_delay_seconds=10)
def get_current_state(device: str | None = None) -> dict:
    """Collect current device state via gNMI Get (placeholder in this example)."""
    # Collect via gNMI...
    return {}


@task
def compute_diff(intent: dict, current: dict) -> list[dict]:
    """Compute diff between desired and current state."""
    from src.reconciler.diff import compute_diff as diff_engine

    return diff_engine(want=intent, have=current)


@task(retries=1)
def apply_changes(changes: list[dict], dry_run: bool = True) -> dict:
    """Apply changes via gNMI Set."""
    if dry_run:
        return {"applied": False, "changes": changes}
    # Apply via gNMI...
    return {"applied": True, "changes": changes}


@flow(log_prints=True, name="fabric-reconcile")
def fabric_reconcile(
    device: str | None = None,
    auto_apply: bool = False,
    dry_run: bool = True
) -> dict:
    """Reconcile fabric state with NetBox intent."""
    print("🔄 Starting fabric reconciliation")

    intent = get_fabric_intent(device)
    current = get_current_state(device)
    changes = compute_diff(intent, current)

    if not changes:
        print("✅ No changes detected - fabric is in sync")
        return {"changes": [], "in_sync": True}

    # Changes are only pushed when auto_apply is set and dry_run is disabled
    should_apply = auto_apply and not dry_run
    result = apply_changes(changes, dry_run=not should_apply)
    return {"changes": changes, "applied": should_apply}


if __name__ == "__main__":
    fabric_reconcile.serve(
        name="fabric-reconcile-scheduled",
        cron="0 */6 * * *",
        tags=["network", "fabric"]
    )
```

---

**Status**: 🚧 Active Development - Phase 2 (Core Components) & Phase 4 (Prefect Integration)
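
## Diff Engine Sketch

The "Want vs Have" comparison that feeds the plan/apply step can be illustrated with a small standalone function. This is a minimal sketch, not the project's actual diff engine: it assumes both states are flat mappings from a YANG path to a value, and the `op`/`path`/`value` change-record shape and the example paths are hypothetical choices made for illustration.

```python
from typing import Any


def compute_diff(want: dict[str, Any], have: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the changes needed to turn `have` into `want`.

    Assumption: both inputs are flat dicts keyed by YANG path.
    """
    changes: list[dict[str, Any]] = []
    # Walk the union of paths in a stable order so plans are reproducible
    for path in sorted(want.keys() | have.keys()):
        if path not in have:
            changes.append({"op": "create", "path": path, "value": want[path]})
        elif path not in want:
            changes.append({"op": "delete", "path": path})
        elif want[path] != have[path]:
            changes.append({"op": "update", "path": path, "value": want[path]})
    return changes


# Illustrative data only; paths and values are not taken from the lab fabric
want = {
    "/interfaces/interface[name=Ethernet1]/config/description": "to-spine1",
    "/interfaces/interface[name=Ethernet1]/config/mtu": 9214,
}
have = {
    "/interfaces/interface[name=Ethernet1]/config/description": "old-description",
    "/interfaces/interface[name=Ethernet2]/config/description": "stale",
}

for change in compute_diff(want, have):
    print(change["op"], change["path"])
```

An empty result means the two states match, which corresponds to the "fabric is in sync" branch of the flow example. Sorting the paths keeps the plan output stable between runs, which makes dry-run output easier to compare.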