Fabric Orchestrator

Declarative Network Infrastructure Management for Arista EVPN-VXLAN Fabrics

A workflow-based orchestration system that uses InfraHub as Source of Truth, Prefect for orchestration, and gNMI/YANG for atomic configuration management of Arista data center fabrics.

🎯 Project Vision

Transform network infrastructure management from imperative scripting to true declarative infrastructure-as-code, where:

  • Intent is defined in InfraHub (custom schema, Git-versioned)
  • Orchestration is handled by Prefect (Python-native @flow and @task decorators)
  • State is continuously monitored via gNMI Subscribe
  • Changes are computed as diffs and applied atomically via gNMI Set
  • Drift is detected and optionally auto-remediated

Think terraform plan and terraform apply, but for your network fabric — powered by Prefect flows.
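
The "compute a diff, then apply it" idea above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the flat `path -> value` dictionaries, the `vlan/100` style keys, and the `op` names are hypothetical and are not taken from the project's actual diff engine.

```python
from typing import Any


def compute_diff(want: dict[str, Any], have: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the operations that would turn `have` into `want`.

    Both arguments map a (hypothetical) path string to the config at that path.
    """
    changes: list[dict[str, Any]] = []
    # Walk the union of paths in a stable order so the plan is deterministic.
    for path in sorted(want.keys() | have.keys()):
        if path not in have:
            changes.append({"op": "create", "path": path, "value": want[path]})
        elif path not in want:
            changes.append({"op": "delete", "path": path})
        elif want[path] != have[path]:
            changes.append({"op": "update", "path": path, "value": want[path]})
    return changes


# Illustrative data only: desired state vs. state read back from a device.
want = {"vlan/100": {"name": "production"}, "vlan/200": {"name": "dev"}}
have = {"vlan/200": {"name": "development"}, "vlan/300": {"name": "legacy"}}

for change in compute_diff(want, have):
    print(change["op"], change["path"])
# create vlan/100
# update vlan/200
# delete vlan/300
```

A single gNMI Set request can carry delete, replace, and update operations together, so a change list of this shape could in principle be sent as one request rather than one call per change.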

🏗️ Architecture

[Figure: architecture diagram]

🎯 Why InfraHub?

We chose InfraHub over NetBox as Source of Truth for several reasons:

| Feature         | NetBox                | InfraHub                             |
| --------------- | --------------------- | ------------------------------------ |
| Schema          | Fixed DCIM/IPAM model | Fully customizable YAML schema       |
| Git Integration | External sync needed  | Native - branches = data branches    |
| Versioning      | Changelog only        | True Git-like versioning with merges |
| Test/Redeploy   | Dump/restore          | git clone = complete environment     |
| Transforms      | Limited               | Built-in Jinja2 + Python transforms  |
| GraphQL         | Yes                   | Yes (auto-generated from schema)     |

Key benefits for this project:

  1. Custom Schema - Model exactly what we need (VTEPs, MLAG pairs, fabric topology)
  2. Git-native - Schema + data versioned together, easy test environment setup
  3. Transforms - Generate device configs directly from InfraHub
  4. Branches - Test fabric changes in isolated branches before merge

📦 Repository as InfraHub Backend

This repository serves as the single source of truth for both code and infrastructure data:

fabric-orchestrator/
├── .infrahub.yml                 # InfraHub repository config
│
├── schemas/                      # InfraHub schema definitions
│   └── fabric.yml                # Custom EVPN-VXLAN fabric schema
│
├── data/                         # Infrastructure objects (YAML)
│   ├── topology/
│   │   ├── sites.yml
│   │   └── devices.yml           # Spines, Leafs, VTEP pairs
│   ├── network/
│   │   ├── vlans.yml             # VLANs + L2VNI mappings
│   │   ├── vrfs.yml              # VRFs + L3VNI mappings
│   │   └── interfaces.yml        # Interface configs
│   └── routing/
│       ├── bgp_sessions.yml      # Underlay + EVPN overlay
│       └── evpn.yml              # Route targets, RDs
│
├── transforms/                   # Jinja2 templates for config generation
│   └── arista/
│       ├── base.j2
│       ├── interfaces.j2
│       ├── bgp.j2
│       └── evpn.j2
│
└── src/                          # Python orchestration code

Workflow

# 1. Edit data files (e.g., add a VLAN)
vim data/network/vlans.yml

# 2. Commit & push
git commit -am "Add VLAN 100 for production"
git push

# 3. InfraHub syncs automatically from Git
#    → Data available via GraphQL

# 4. Prefect flow detects change → reconciles fabric

Benefits

  • Reproducibility: git clone + docker compose up → complete environment
  • Code Review: Infrastructure changes go through PR review
  • History: Full audit trail via Git
  • Testing: Create a branch, test changes, merge when validated

🎛 Why Prefect?

| Feature                      | Benefit                                                              |
| ---------------------------- | -------------------------------------------------------------------- |
| Python-native workflows      | Use @flow and @task decorators: no YAML, just Python                 |
| Free secrets management      | Native Secret blocks for credentials (free in OSS)                   |
| Built-in UI                  | Dashboard, logs, metrics, execution history via prefect server start |
| No containerization required | Run flows directly with .serve(), no Docker needed                   |
| Event-driven triggers        | Schedule, webhooks (via FastAPI), flow triggers out of the box       |
| Task dependencies            | Automatic dependency ordering via task result passing or wait_for    |
| Retry & error handling       | Built-in retry policies with @task(retries=3)                        |
| Human-in-the-loop            | Native pause_flow_run() for approval workflows                       |

🎯 Target Fabric

This project is designed for the Arista EVPN-VXLAN ContainerLab topology:

  • 2 Spines (BGP Route Reflectors, AS 65000)
  • 8 Leafs (4 MLAG VTEP pairs, AS 65001-65004)
  • cEOS 4.35.0F with gNMI enabled
  • EVPN Type-2 (L2 VXLAN) and Type-5 (L3 VXLAN) support

Reference: arista-evpn-vxlan-clab
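
For orientation, the target fabric described above can be enumerated in a few lines. This is a sketch under assumptions: the hostnames (`spine1`, `leaf1`, ...), the pairing of consecutive leafs into MLAG pairs, and the one-AS-per-pair numbering are guesses that fit the list above, not values read from the lab topology.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str                     # hypothetical hostname
    role: str                     # "spine" or "leaf"
    asn: int                      # BGP autonomous system number
    mlag_pair: int | None = None  # 1-4 for leafs, None for spines


def build_fabric() -> list[Device]:
    """Build the 2-spine / 8-leaf inventory described in this section."""
    spines = [Device(f"spine{i}", "spine", 65000) for i in (1, 2)]
    leafs = []
    for i in range(1, 9):
        pair = (i + 1) // 2  # leaf1/leaf2 -> pair 1, ..., leaf7/leaf8 -> pair 4
        # Assumption: both members of an MLAG pair share one AS (65001-65004).
        leafs.append(Device(f"leaf{i}", "leaf", 65000 + pair, mlag_pair=pair))
    return spines + leafs


for device in build_fabric():
    print(device.name, device.role, device.asn, device.mlag_pair)
```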

📋 Project Phases

Progress is tracked via issues. See all issues or filter by phase:

| Phase   | Description                                                          | Status         |
| ------- | -------------------------------------------------------------------- | -------------- |
| Phase 1 | YANG Path Discovery - Map EOS 4.35.0F YANG models, validate gNMI     | Complete       |
| Phase 2 | InfraHub Setup & Core Reconciler - Schema, diff engine, YANG mappers | 🔄 In Progress |
| Phase 3 | Full Fabric Coverage - BGP, MLAG, VRFs mappers                       | 📋 Planned     |
| Phase 4 | Prefect Integration - Flows, webhooks, drift detection               | 📋 Planned     |

📁 Project Structure

fabric-orchestrator/
├── README.md
├── pyproject.toml
├── .infrahub.yml                     # InfraHub config (points to schemas/)
│
├── schemas/                          # InfraHub schema definitions
│   └── fabric.yml                    # Custom EVPN-VXLAN fabric schema
│
├── data/                             # Infrastructure data (YAML)
│   ├── topology/
│   │   ├── sites.yml
│   │   └── devices.yml
│   ├── network/
│   │   ├── vlans.yml
│   │   ├── vrfs.yml
│   │   └── interfaces.yml
│   └── routing/
│       ├── bgp_sessions.yml
│       └── evpn.yml
│
├── transforms/                       # Jinja2 config templates
│   └── arista/
│       └── *.j2
│
├── src/                              # Python package
│   ├── __init__.py
│   ├── cli.py                        # CLI for YANG discovery
│   │
│   ├── flows/                        # Prefect flows
│   │   ├── __init__.py
│   │   ├── reconcile.py              # @flow fabric_reconcile
│   │   ├── drift.py                  # @flow handle_drift
│   │   └── remediation.py            # @flow drift_remediation
│   │
│   ├── gnmi/
│   │   ├── __init__.py
│   │   ├── client.py                 # gNMI client wrapper (pygnmi)
│   │   └── README.md
│   │
│   ├── infrahub/                     # InfraHub integration
│   │   ├── __init__.py
│   │   ├── client.py                 # InfraHub SDK wrapper
│   │   └── queries.py                # GraphQL queries
│   │
│   └── yang/
│       ├── __init__.py
│       ├── mapper.py                 # InfraHub intent → YANG paths
│       ├── paths.py                  # YANG path definitions
│       └── mappers/                  # Resource-specific mappers
│           ├── vlan.py
│           ├── interface.py
│           ├── bgp.py
│           └── vxlan.py
│
├── tests/
│
└── docs/
    ├── cli-user-guide.md
    └── yang-paths.md
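
To make the role of the resource-specific mappers more concrete, here is a minimal sketch of what a VLAN mapper might do: turn one intent record into a `(path, value)` pair of the kind pygnmi's `set(update=[...])` accepts. Everything specific here is an assumption: the intent record shape (`vlan_id`, `name`), the function name, and the use of the OpenConfig network-instance VLAN path. The paths actually validated against EOS are the ones to trust.

```python
from typing import Any

# Assumption: the OpenConfig network-instance VLAN path, default instance.
VLAN_PATH = (
    "/network-instances/network-instance[name=default]"
    "/vlans/vlan[vlan-id={vlan_id}]"
)


def vlan_to_update(vlan: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Map a hypothetical VLAN intent record to a gNMI (path, value) pair."""
    path = VLAN_PATH.format(vlan_id=vlan["vlan_id"])
    value = {"config": {"vlan-id": vlan["vlan_id"], "name": vlan["name"]}}
    return path, value


path, value = vlan_to_update({"vlan_id": 100, "name": "production"})
print(path)
print(value)
```

Keeping the mapper a pure function (no device I/O) means it can be unit-tested without a lab, and the same output can feed both a plan and an apply step.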

🛠️ Technology Stack

| Component       | Technology                 | Purpose                              |
| --------------- | -------------------------- | ------------------------------------ |
| Source of Truth | InfraHub                   | Intent definition via custom schema  |
| Data Storage    | This Git repo              | Schema + data versioned together     |
| Orchestrator    | Prefect                    | Python-native workflow orchestration |
| Transport       | gNMI                       | Configuration and telemetry          |
| Data Models     | YANG (OpenConfig + Arista) | Structured configuration             |
| Python Library  | pygnmi + infrahub-sdk      | gNMI/InfraHub interactions           |
| CLI             | Click + Rich               | YANG discovery tools                 |
| Validation      | Pydantic v2                | Intent data validation               |
| Lab             | ContainerLab + cEOS        | Development environment              |
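
As an illustration of the "intent data validation" row, the sketch below checks a VLAN record before it reaches the fabric. It deliberately uses a standard-library dataclass as a stand-in so it runs without extra dependencies. The project lists Pydantic v2 for this job, and the field names (`vlan_id`, `name`, `l2vni`) are hypothetical. The numeric bounds are the usual ones: VLAN IDs 1-4094, and a VNI that fits the 24-bit VXLAN field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VlanIntent:
    """Hypothetical VLAN intent record with basic range checks."""

    vlan_id: int
    name: str
    l2vni: int

    def __post_init__(self) -> None:
        # 0 and 4095 are reserved VLAN IDs.
        if not 1 <= self.vlan_id <= 4094:
            raise ValueError(f"vlan_id {self.vlan_id} is outside 1-4094")
        # The VNI is a 24-bit field; 0 is excluded here as an assumption.
        if not 1 <= self.l2vni <= 2**24 - 1:
            raise ValueError(f"l2vni {self.l2vni} is outside 1-16777215")
        if not self.name:
            raise ValueError("name must not be empty")


vlan = VlanIntent(vlan_id=100, name="production", l2vni=10100)
print(vlan)

try:
    VlanIntent(vlan_id=4095, name="reserved", l2vni=10100)
except ValueError as exc:
    print("rejected:", exc)
```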

🚀 Getting Started

Prerequisites

  • Python 3.12+
  • uv package manager
  • Docker (for InfraHub)
  • Access to ContainerLab with cEOS images

Quick Start

# Clone the repository (includes schema + data)
git clone https://gitea.arnodo.fr/Damien/fabric-orchestrator.git
cd fabric-orchestrator

# Install Python dependencies
uv sync

# Start InfraHub (loads schema & data from this repo)
docker compose up -d

# Configure Prefect secrets
uv run python -c "
from prefect.blocks.system import Secret
from prefect.variables import Variable

Secret(value='your-gnmi-password').save('gnmi-password', overwrite=True)
Variable.set('infrahub_url', 'http://localhost:8000')
Variable.set('gnmi_username', 'admin')
"

# Verify gNMI connectivity
uv run fabric-orch discover capabilities --target leaf1:6030

# Run reconciliation
uv run fabric-orch plan
uv run fabric-orch apply

Prefect Flow Example

from prefect import flow, task
from prefect.variables import Variable


@task(retries=2, retry_delay_seconds=10)
def get_fabric_intent(device: str | None = None) -> dict:
    """Retrieve fabric intent from InfraHub."""
    from infrahub_sdk import InfrahubClientSync

    # This task is synchronous, so use the SDK's synchronous client
    client = InfrahubClientSync(address=Variable.get("infrahub_url"))
    # Query fabric intent via GraphQL
    return client.execute_graphql(query=...)


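@task
def compute_diff(intent: dict, current: dict) -> list[dict]:
    """Compute diff between desired and current state."""
    from src.reconciler.diff import compute_diff as diff_engine
    return diff_engine(want=intent, have=current)


@task
def get_current_state(device: str | None = None) -> dict:
    """Placeholder: read the current fabric state via gNMI Get."""
    raise NotImplementedError("connect this to the gNMI client wrapper")


@task
def apply_changes(changes: list[dict]) -> None:
    """Placeholder: push the computed changes via gNMI Set."""
    raise NotImplementedError("connect this to the gNMI client wrapper")


@flow(log_prints=True, name="fabric-reconcile")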
def fabric_reconcile(device: str | None = None, dry_run: bool = True) -> dict:
    """Reconcile fabric state with InfraHub intent."""
    intent = get_fabric_intent(device)
    current = get_current_state(device)
    changes = compute_diff(intent, current)
    
    if not changes:
        print("✅ Fabric is in sync")
        return {"in_sync": True}
    
    if not dry_run:
        apply_changes(changes)
    
    return {"changes": changes, "applied": not dry_run}
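
In a dry run, the returned `changes` list is what an operator would want to read, much like `terraform plan` output. The sketch below renders such a list as text. It assumes each change is a dict with `op` (`create`, `update`, or `delete`) and `path` keys, which is a hypothetical shape and not one the example above defines.

```python
# Hypothetical mapping from operation name to a plan marker.
SYMBOLS = {"create": "+", "update": "~", "delete": "-"}


def render_plan(changes: list[dict]) -> str:
    """Render a change list as a short, terraform-style plan summary."""
    if not changes:
        return "No changes. Fabric is in sync."
    lines = [f"  {SYMBOLS[c['op']]} {c['path']}" for c in changes]
    counts = {op: sum(1 for c in changes if c["op"] == op) for op in SYMBOLS}
    summary = (
        f"Plan: {counts['create']} to add, "
        f"{counts['update']} to change, "
        f"{counts['delete']} to delete."
    )
    return "\n".join(lines + ["", summary])


print(render_plan([
    {"op": "create", "path": "vlan/100"},
    {"op": "delete", "path": "vlan/300"},
]))
```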

Status: 🚧 Active Development - Phase 2 (InfraHub Setup & Core Reconciler)
