# Fabric Orchestrator

**Declarative Network Infrastructure Management for Arista EVPN-VXLAN Fabrics**

A workflow-based orchestration system that uses [InfraHub](https://github.com/opsmill/infrahub) as Source of Truth, [Prefect](https://prefect.io) for orchestration, and gNMI/YANG for atomic configuration management of Arista data center fabrics.

## 🎯 Project Vision

Transform network infrastructure management from imperative scripting to true declarative infrastructure-as-code, where:

- **Intent** is defined in InfraHub (custom schema, Git-versioned)
- **Orchestration** is handled by Prefect (Python-native `@flow` and `@task` decorators)
- **State** is continuously monitored via gNMI Subscribe
- **Changes** are computed as diffs and applied atomically via gNMI Set
- **Drift** is detected and optionally auto-remediated

Think `terraform plan` and `terraform apply`, but for your network fabric, powered by Prefect flows.

## 🏗️ Architecture

![architecture](docs/assets/architecture/fabric-orchestration-archi.excalidraw.svg)

## 🎯 Why InfraHub?

We chose [InfraHub](https://github.com/opsmill/infrahub) over NetBox as Source of Truth for several reasons:

| Feature             | NetBox                | InfraHub                             |
| ------------------- | --------------------- | ------------------------------------ |
| **Schema**          | Fixed DCIM/IPAM model | Fully customizable YAML schema       |
| **Git Integration** | External sync needed  | Native - branches = data branches    |
| **Versioning**      | Changelog only        | True Git-like versioning with merges |
| **Transforms**      | Limited               | Built-in Jinja2 + Python transforms  |
| **GraphQL**         | Yes                   | Yes (auto-generated from schema)     |

**Key benefits for this project:**

1. **Custom Schema** - Model exactly what we need (VTEPs, MLAG pairs, fabric topology)
2. **Git-native** - Schema + data versioned together, easy test environment setup
3. **Transforms** - Generate device configs directly from InfraHub
4. **Branches** - Test fabric changes in isolated branches before merge

## 🎛 Why Prefect?

| Feature                          | Benefit                                                                |
| -------------------------------- | ---------------------------------------------------------------------- |
| **Python-native workflows**      | Use `@flow` and `@task` decorators: no YAML, just Python               |
| **Free secrets management**      | Native `Secret` blocks for credentials (free in OSS)                   |
| **Built-in UI**                  | Dashboard, logs, metrics, execution history via `prefect server start` |
| **No containerization required** | Run flows directly with `.serve()`, no Docker needed                   |
| **Event-driven triggers**        | Schedule, webhooks (via FastAPI), flow triggers out of the box         |
| **Task dependencies**            | Automatic dependency ordering via task result passing or `wait_for`    |
| **Retry & error handling**       | Built-in retry policies with `@task(retries=3)`                        |
| **Human-in-the-loop**            | Native `pause_flow_run()` for approval workflows                       |

## 🎯 Target Fabric

This project is designed for Arista EVPN-VXLAN fabrics:

- **2 Spines** (BGP Route Reflectors)
- **8 Leafs** (4 MLAG VTEP pairs)
- **cEOS 4.35.0F** with gNMI enabled
- **EVPN Type-2** (L2 VXLAN) and **Type-5** (L3 VXLAN) support

Reference lab topology: [arista-evpn-vxlan-clab](https://gitea.arnodo.fr/Damien/arista-evpn-vxlan-clab)

## 📋 Project Phases

Progress is tracked via issues.
See [all issues](https://gitea.arnodo.fr/Damien/fabric-orchestrator/issues) or filter by phase:

| Phase       | Description                                                               | Status         |
| ----------- | ------------------------------------------------------------------------- | -------------- |
| **Phase 1** | YANG Path Discovery - Map EOS 4.35.0F YANG models, validate gNMI          | ✅ Complete    |
| **Phase 2** | InfraHub Client & Core Reconciler - SDK client, diff engine, YANG mappers | 🔄 In Progress |
| **Phase 3** | Full Fabric Coverage - BGP, MLAG, VRFs mappers                            | 📋 Planned     |
| **Phase 4** | Prefect Integration - Flows, webhooks, drift detection                    | 📋 Planned     |

## 📁 Project Structure

```
fabric-orchestrator/
├── README.md
├── pyproject.toml
│
├── src/                          # Python package
│   ├── __init__.py
│   ├── cli.py                    # CLI for YANG discovery
│   │
│   ├── gnmi/
│   │   ├── __init__.py
│   │   ├── client.py             # gNMI client wrapper (pygnmi)
│   │   └── README.md
│   │
│   ├── infrahub/                 # InfraHub integration
│   │   ├── __init__.py
│   │   ├── client.py             # InfraHub SDK wrapper
│   │   ├── models.py             # Pydantic intent models
│   │   └── exceptions.py         # Client exceptions
│   │
│   └── yang/
│       ├── __init__.py
│       ├── mapper.py             # InfraHub intent → YANG paths
│       ├── paths.py              # YANG path definitions
│       └── mappers/              # Resource-specific mappers
│           ├── vlan.py
│           ├── interface.py
│           ├── bgp.py
│           └── vxlan.py
│
├── tests/
│
└── docs/
    ├── cli-user-guide.md
    └── yang-paths.md
```

## 🛠️ Technology Stack

| Component       | Technology                 | Purpose                              |
| --------------- | -------------------------- | ------------------------------------ |
| Source of Truth | **InfraHub**               | Intent definition via custom schema  |
| Orchestrator    | **Prefect**                | Python-native workflow orchestration |
| Transport       | gNMI                       | Configuration and telemetry          |
| Data Models     | YANG (OpenConfig + Arista) | Structured configuration             |
| Python Library  | pygnmi + infrahub-sdk      | gNMI/InfraHub interactions           |
| CLI             | Click + Rich               | YANG discovery tools                 |
| Validation      | Pydantic v2                | Intent data validation               |
| Lab             | ContainerLab + cEOS        | Development environment              |

## 🔗 Related Projects

- [arista-evpn-vxlan-clab](https://gitea.arnodo.fr/Damien/arista-evpn-vxlan-clab) - Lab topology, InfraHub schemas & data
- [InfraHub](https://github.com/opsmill/infrahub) - Source of Truth platform
- [InfraHub Schema Library](https://github.com/opsmill/schema-library) - Reference schemas
- [Arista YANG Models](https://github.com/aristanetworks/yang/tree/master/EOS-4.35.0F) - EOS 4.35.0F YANG definitions

## 📚 References

### InfraHub

- [InfraHub Documentation](https://docs.infrahub.app)
- [InfraHub Schema Guide](https://docs.infrahub.app/guides/create-schema)
- [InfraHub Python SDK](https://github.com/opsmill/infrahub-sdk-python)

### Prefect

- [Prefect Documentation](https://docs.prefect.io)
- [Prefect Flows](https://docs.prefect.io/latest/develop/write-flows/)
- [Prefect Tasks](https://docs.prefect.io/latest/develop/write-tasks/)

### YANG / gNMI

- [Arista gNMI Documentation](https://aristanetworks.github.io/openmgmt/configuration/gnmi/)
- [OpenConfig Models](https://github.com/openconfig/public)
- [pygnmi Library](https://github.com/akarneliuk/pygnmi)

### EVPN-VXLAN

- [Arista BGP EVPN Configuration Example](https://overlaid.net/2019/01/27/arista-bgp-evpn-configuration-example/)

## 🚀 Getting Started

### Prerequisites

- Python 3.12+
- `uv` package manager
- Access to an InfraHub instance with the EVPN-VXLAN fabric schema loaded
- Access to ContainerLab with cEOS images (for lab testing)

### Quick Start

```bash
# Clone the repository
git clone https://gitea.arnodo.fr/Damien/fabric-orchestrator.git
cd fabric-orchestrator

# Install Python dependencies
uv sync

# Set InfraHub connection (point to your InfraHub instance)
export INFRAHUB_ADDRESS="http://localhost:8000"
export INFRAHUB_API_TOKEN="your-token"

# Verify gNMI connectivity
uv run fabric-orch discover capabilities --target leaf1:6030

# Run reconciliation
uv run fabric-orch plan
uv run fabric-orch apply
```

## Prefect Flow Example

```python
from prefect import flow, task
from prefect.variables import Variable


@task(retries=2, retry_delay_seconds=10)
async def get_fabric_intent(device: str | None = None) -> dict:
    """Retrieve fabric intent from InfraHub."""
    from src.infrahub.client import FabricInfrahubClient

    # Variable.get returns an awaitable when called from async code.
    async with FabricInfrahubClient(
        url=await Variable.get("infrahub_url"),
        api_token=await Variable.get("infrahub_token"),
    ) as client:
        return await client.get_device(device)


@task
def compute_diff(intent: dict, current: dict) -> list[dict]:
    """Compute diff between desired and current state."""
    from src.reconciler.diff import compute_diff as diff_engine

    return diff_engine(want=intent, have=current)


@flow(log_prints=True, name="fabric-reconcile")
async def fabric_reconcile(device: str | None = None, dry_run: bool = True) -> dict:
    """Reconcile fabric state with InfraHub intent."""
    # get_current_state and apply_changes are tasks defined elsewhere (not shown here).
    intent = await get_fabric_intent(device)  # async task, so it must be awaited
    current = get_current_state(device)
    changes = compute_diff(intent, current)

    if not changes:
        print("✅ Fabric is in sync")
        return {"in_sync": True}

    if not dry_run:
        apply_changes(changes)

    return {"changes": changes, "applied": not dry_run}
```

---

**Status**: 🚧 Active Development - Phase 2 (InfraHub Client & Core Reconciler)
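
The flow example delegates to a diff engine called with `want` and `have`, but the engine itself is not shown in this README. As a rough illustration of the "changes are computed as diffs" idea, here is a minimal sketch. It assumes both intent and current state have already been flattened into dicts keyed by a path string; the key format (`vlan/100/name`) and the `create`/`update`/`delete` operation names are hypothetical and are not the project's actual data model.

```python
from typing import Any


def compute_diff(want: dict[str, Any], have: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the changes needed to move `have` (current) to `want` (intent).

    Assumption: both inputs are flat dicts keyed by a path string.
    The operation names below are illustrative only.
    """
    changes: list[dict[str, Any]] = []
    # Walk the union of paths in sorted order so the plan is deterministic.
    for path in sorted(want.keys() | have.keys()):
        if path not in have:
            # Present in intent only: must be created on the device.
            changes.append({"op": "create", "path": path, "value": want[path]})
        elif path not in want:
            # Present on the device only: drift that should be removed.
            changes.append({"op": "delete", "path": path})
        elif want[path] != have[path]:
            # Present on both sides with different values.
            changes.append({"op": "update", "path": path, "value": want[path]})
    return changes


if __name__ == "__main__":
    # Hypothetical keys and values, for illustration only.
    intent = {"vlan/100/name": "prod", "vlan/200/name": "dev"}
    current = {"vlan/100/name": "old", "vlan/300/name": "stale"}
    for change in compute_diff(intent, current):
        print(change)
```

An empty result means intent and state agree, which corresponds to the "Fabric is in sync" branch of the flow. A real engine would likely need to handle nested structures and ordering constraints between changes, which this sketch ignores.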