Damien Arnodo 77ca22bd0a docs: Update README for InfraHub migration
2026-02-05 08:42:51 +00:00

Fabric Orchestrator

Declarative Network Infrastructure Management for Arista EVPN-VXLAN Fabrics

A workflow-based orchestration system that uses InfraHub as Source of Truth, Prefect for orchestration, and gNMI/YANG for atomic configuration management of Arista data center fabrics.

🎯 Project Vision

Transform network infrastructure management from imperative scripting to true declarative infrastructure-as-code, where:

  • Intent is defined in InfraHub (custom schema, Git-versioned)
  • Orchestration is handled by Prefect (Python-native @flow and @task decorators)
  • State is continuously monitored via gNMI Subscribe
  • Changes are computed as diffs and applied atomically via gNMI Set
  • Drift is detected and optionally auto-remediated

Think terraform plan and terraform apply, but for your network fabric — powered by Prefect flows.
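
To make the plan step concrete, here is a minimal sketch of a want-vs-have diff. It is not the project's diff engine: the nested-dict shape of the state, the `op`/`path`/`value` change format, and the helper names `flatten` and `plan` are all assumptions made for illustration.

```python
from typing import Any


def flatten(tree: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into {"/a/b": leaf_value} path-style keys."""
    flat: dict[str, Any] = {}
    for key, value in tree.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def plan(want: dict[str, Any], have: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the changes that would turn `have` into `want` (hypothetical format)."""
    want_flat, have_flat = flatten(want), flatten(have)
    changes: list[dict[str, Any]] = []
    for path in sorted(want_flat.keys() | have_flat.keys()):
        if path not in have_flat:
            changes.append({"op": "add", "path": path, "value": want_flat[path]})
        elif path not in want_flat:
            changes.append({"op": "delete", "path": path})
        elif want_flat[path] != have_flat[path]:
            changes.append({"op": "update", "path": path, "value": want_flat[path]})
    return changes


if __name__ == "__main__":
    want = {"vlans": {"10": {"name": "blue"}, "20": {"name": "red"}}}
    have = {"vlans": {"10": {"name": "green"}, "30": {"name": "old"}}}
    for change in plan(want, have):
        print(change)
```

An empty result plays the role of "no changes" in `terraform plan`: intent and device state already agree.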

🏗️ Architecture

┌──────────────────────────────────────────────────────────────────────────────┐
│                          INTENT LAYER                                         │
│  ┌─────────────────────────┐    ┌──────────────────────────────────────────┐ │
│  │       InfraHub          │    │           Git Repository                  │ │
│  │   (Source of Truth)     │◄──►│  - Schema definitions (YAML)             │ │
│  │                         │    │  - Transforms (Jinja2/Python)            │ │
│  │  • Custom fabric schema │    │  - Version-controlled intent             │ │
│  │  • GraphQL API          │    └──────────────────────────────────────────┘ │
│  │  • Branch-based changes │                                                 │
│  └────────────┬────────────┘                                                 │
└───────────────┼──────────────────────────────────────────────────────────────┘
                │ GraphQL / SDK
                ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                       ORCHESTRATION LAYER (PREFECT)                           │
│                                                                               │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │                      Prefect Flows (Python)                            │  │
│  │  ┌───────────────────┐  ┌───────────────────┐  ┌───────────────────┐  │  │
│  │  │ fabric_reconcile  │  │ handle_drift      │  │ drift_remediation │  │  │
│  │  │ (plan/apply)      │  │ (subscribe)       │  │ (auto-fix)        │  │  │
│  │  └───────────────────┘  └───────────────────┘  └───────────────────┘  │  │
│  └────────────────────────────────────────────────────────────────────────┘  │
│                                                                               │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │                      Prefect Tasks (Python)                            │  │
│  │  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────────┐│  │
│  │  │ Intent Parser   │  │ Diff Engine     │  │ gNMI Client             ││  │
│  │  │ (InfraHub→YANG) │  │ (Want vs Have)  │  │ (pygnmi wrapper)        ││  │
│  │  └─────────────────┘  └─────────────────┘  └─────────────────────────┘│  │
│  └────────────────────────────────────────────────────────────────────────┘  │
│                                                                               │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │  Prefect Server (UI)  │  Prefect .serve()  │  Webhook Receiver        │  │
│  └────────────────────────────────────────────────────────────────────────┘  │
└──────────────────────────┬───────────────────────────────────────────────────┘
                           │ gNMI Get/Set/Subscribe
                           ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                            DEVICE LAYER                                       │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐         │
│  │   spine1     │ │   spine2     │ │   leaf1      │ │   leaf2      │  ...    │
│  │  gNMI:6030   │ │  gNMI:6030   │ │  gNMI:6030   │ │  gNMI:6030   │         │
│  └──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘         │
└──────────────────────────────────────────────────────────────────────────────┘

🎯 Why InfraHub?

We chose InfraHub over NetBox as Source of Truth for several reasons:

| Feature | NetBox | InfraHub |
|---|---|---|
| Schema | Fixed DCIM/IPAM model | Fully customizable YAML schema |
| Git Integration | External sync needed | Native - branches = data branches |
| Versioning | Changelog only | True Git-like versioning with merges |
| Test/Redeploy | Dump/restore | git clone = complete environment |
| Transforms | Limited | Built-in Jinja2 + Python transforms |
| GraphQL | Yes | Yes (auto-generated from schema) |

Key benefits for this project:

  1. Custom Schema - Model exactly what we need (VTEPs, MLAG pairs, fabric topology)
  2. Git-native - Schema + data versioned together, easy test environment setup
  3. Transforms - Generate device configs directly from InfraHub
  4. Branches - Test fabric changes in isolated branches before merge

🎛 Why Prefect?

We chose Prefect as the orchestration engine for several reasons:

| Feature | Benefit |
|---|---|
| Python-native workflows | Use @flow and @task decorators: no YAML, just Python |
| Free secrets management | Native Secret blocks for credentials (free in OSS) |
| Built-in UI | Dashboard, logs, metrics, execution history via prefect server start |
| No containerization required | Run flows directly with .serve(), no Docker needed |
| Event-driven triggers | Schedule, webhooks (via FastAPI), flow triggers out of the box |
| Task dependencies | Automatic dependency ordering via task result passing or wait_for |
| Retry & error handling | Built-in retry policies with @task(retries=3) |
| Human-in-the-loop | Native pause_flow_run() for approval workflows |

🎯 Target Fabric

This project is designed for the Arista EVPN-VXLAN ContainerLab topology:

  • 2 Spines (BGP Route Reflectors, AS 65000)
  • 8 Leafs (4 MLAG VTEP pairs, AS 65001-65004)
  • cEOS 4.35.0F with gNMI enabled
  • EVPN Type-2 (L2 VXLAN) and Type-5 (L3 VXLAN) support

Reference: arista-evpn-vxlan-clab

📋 Project Phases

Progress is tracked via issues. See all issues or filter by phase:

| Phase | Description | Status |
|---|---|---|
| Phase 1 | YANG Path Discovery - Map EOS 4.35.0F YANG models, validate gNMI | Complete |
| Phase 2 | InfraHub Setup & Core Reconciler - Schema, diff engine, YANG mappers | 🔄 In Progress |
| Phase 3 | Full Fabric Coverage - BGP, MLAG, VRFs mappers | 📋 Planned |
| Phase 4 | Prefect Integration - Flows, webhooks, drift detection | 📋 Planned |

📁 Project Structure

fabric-orchestrator/
├── README.md
├── pyproject.toml
│
├── src/                              # Python package
│   ├── __init__.py
│   ├── cli.py                        # CLI for YANG discovery (discover commands)
│   │
│   ├── flows/                        # Prefect flows
│   │   ├── __init__.py
│   │   ├── reconcile.py              # @flow fabric_reconcile (plan/apply)
│   │   ├── drift.py                  # @flow handle_drift
│   │   └── remediation.py            # @flow drift_remediation
│   │
│   ├── api/                          # FastAPI webhook receiver
│   │   ├── __init__.py
│   │   └── webhooks.py               # InfraHub webhook endpoint
│   │
│   ├── services/                     # Long-running services
│   │   ├── __init__.py
│   │   └── drift_monitor.py          # gNMI Subscribe drift detection
│   │
│   ├── gnmi/
│   │   ├── __init__.py
│   │   ├── client.py                 # gNMI client wrapper (pygnmi)
│   │   └── README.md
│   │
│   ├── infrahub/                     # InfraHub integration (TODO)
│   │   ├── __init__.py
│   │   ├── client.py                 # InfraHub SDK client
│   │   └── models.py                 # Pydantic models for intent validation
│   │
│   └── yang/
│       ├── __init__.py
│       ├── mapper.py                 # InfraHub intent → YANG paths
│       ├── paths.py                  # YANG path definitions
│       └── dependencies.py           # Dependency ordering graph
│
├── schemas/                          # InfraHub schema definitions (TODO)
│   └── fabric.yml                    # Custom fabric schema
│
├── tests/
│
└── docs/
    ├── cli-user-guide.md             # CLI documentation
    └── yang-paths.md                 # Documented YANG paths
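
As an illustration of what a dependency ordering graph such as `yang/dependencies.py` might do, the sketch below sorts changes so that prerequisites are configured first. The feature names, the edges between them, and the `order_changes` helper are hypothetical and not verified against EOS behavior; only `graphlib.TopologicalSorter` is a real standard-library API.

```python
from graphlib import TopologicalSorter

# Hypothetical feature-level dependencies: each key should only be configured
# after everything in its set has been configured.
DEPENDENCIES: dict[str, set[str]] = {
    "vlan": set(),
    "vrf": set(),
    "vxlan_interface": {"vlan", "vrf"},
    "bgp_evpn": {"vxlan_interface"},
}


def order_changes(changes: list[dict], reverse: bool = False) -> list[dict]:
    """Sort changes by dependency rank; use reverse=True when tearing config down."""
    rank = {
        feature: index
        for index, feature in enumerate(TopologicalSorter(DEPENDENCIES).static_order())
    }
    return sorted(changes, key=lambda change: rank[change["feature"]], reverse=reverse)


if __name__ == "__main__":
    pending = [{"feature": "bgp_evpn"}, {"feature": "vlan"}, {"feature": "vxlan_interface"}]
    print([change["feature"] for change in order_changes(pending)])
```

Deletions generally need the opposite order (remove dependents before their prerequisites), which is why the sketch exposes a `reverse` flag.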

🛠️ Technology Stack

| Component | Technology | Purpose |
|---|---|---|
| Source of Truth | InfraHub | Intent definition via custom schema |
| Orchestrator | Prefect | Python-native workflow orchestration |
| Webhooks | FastAPI | Receive InfraHub webhooks |
| Transport | gNMI | Configuration and telemetry |
| Data Models | YANG (OpenConfig + Arista) | Structured configuration |
| Python Library | pygnmi + infrahub-sdk | gNMI/InfraHub interactions |
| CLI | Click + Rich | YANG discovery tools |
| Validation | Pydantic v2 | Intent data validation |
| Lab | ContainerLab + cEOS | Development environment |

🚀 Getting Started

Prerequisites

  • Python 3.12+
  • uv package manager
  • Access to ContainerLab with cEOS images
  • Docker (for InfraHub)

Quick Start

# Clone the repository
git clone https://gitea.arnodo.fr/Damien/fabric-orchestrator.git
cd fabric-orchestrator

# Install Python dependencies
uv sync

# Start InfraHub (see InfraHub docs for full setup)
# docker compose -f infrahub-docker-compose.yml up -d

# Configure Prefect secrets
python -c "
from prefect.blocks.system import Secret
from prefect.variables import Variable

Secret(value='your-gnmi-password').save('gnmi-password', overwrite=True)
Secret(value='your-infrahub-token').save('infrahub-token', overwrite=True)

Variable.set('infrahub_url', 'http://localhost:8000')
Variable.set('gnmi_username', 'admin')
"

# Start Prefect server (optional, for UI)
prefect server start

# Verify gNMI connectivity to your fabric
uv run fabric-orch discover capabilities --target leaf1:6030

# Explore YANG paths
uv run fabric-orch discover get --target leaf1:6030 \
  --path "/interfaces/interface[name=Ethernet1]/state"
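
The `--path` argument above uses the usual gNMI string form, where list keys appear in square brackets. The sketch below shows one way such a string could be split into path elements (name plus key map, mirroring gNMI's PathElem). The `parse_path` helper is hypothetical and ignores origins and escaped characters; it is not how the CLI is known to parse paths.

```python
import re

# One path element: a name followed by zero or more [key=value] selectors.
_ELEM = re.compile(r"(?P<name>[^\[/]+)(?P<keys>(?:\[[^\]]+\])*)")
_KEY = re.compile(r"\[([^=\]]+)=([^\]]+)\]")


def parse_path(path: str) -> list[dict]:
    """Split a gNMI-style path string into [{"name": ..., "key": {...}}, ...]."""
    elems = []
    for match in _ELEM.finditer(path.strip("/")):
        elems.append(
            {"name": match["name"], "key": dict(_KEY.findall(match["keys"]))}
        )
    return elems


if __name__ == "__main__":
    for elem in parse_path("/interfaces/interface[name=Ethernet1]/state"):
        print(elem)
```

Because key values are matched up to the closing bracket, interface names that contain a slash (for example `Ethernet1/1`) stay inside a single element.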

Prefect Flow Example

from prefect import flow, task
from prefect.blocks.system import Secret
from prefect.variables import Variable


@task(retries=2, retry_delay_seconds=10)
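def get_fabric_intent(device: str | None = None) -> dict:
    """Retrieve fabric intent from InfraHub."""
    from infrahub_sdk import Config, InfrahubClientSync

    infrahub_url = Variable.get("infrahub_url")
    infrahub_token = Secret.load("infrahub-token").get()

    # This task is synchronous, so use the synchronous SDK client;
    # the address and token are passed through the SDK's Config object.
    client = InfrahubClientSync(
        config=Config(address=infrahub_url, api_token=infrahub_token)
    )
    # Query fabric intent via GraphQL
    # ...
    intent: dict = {}  # placeholder until the query above is implemented
    return intent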


@task
def compute_diff(intent: dict, current: dict) -> list[dict]:
    """Compute diff between desired and current state."""
    from src.reconciler.diff import compute_diff as diff_engine
    return diff_engine(want=intent, have=current)


@task(retries=1)
def apply_changes(changes: list[dict], dry_run: bool = True) -> dict:
    """Apply changes via gNMI Set."""
    if dry_run:
        return {"applied": False, "changes": changes}
    # Apply via gNMI...
    return {"applied": True, "changes": changes}


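@task(retries=2, retry_delay_seconds=10)
def get_current_state(device: str | None = None) -> dict:
    """Retrieve current device state via gNMI Get."""
    # Placeholder: query the device(s) through the gNMI client wrapper
    # ...
    current: dict = {}
    return current


@flow(log_prints=True, name="fabric-reconcile")
def fabric_reconcile(
    device: str | None = None,
    auto_apply: bool = False,
    dry_run: bool = True,
) -> dict:
    """Reconcile fabric state with InfraHub intent."""
    print("🔄 Starting fabric reconciliation")

    intent = get_fabric_intent(device)
    current = get_current_state(device)
    changes = compute_diff(intent, current)

    if not changes:
        print("✅ No changes detected - fabric is in sync")
        return {"changes": [], "in_sync": True}

    # Apply only when explicitly requested and not in dry-run mode
    should_apply = auto_apply and not dry_run
    result = apply_changes(changes, dry_run=not should_apply)

    return {"changes": changes, "applied": result["applied"]}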


if __name__ == "__main__":
    fabric_reconcile.serve(
        name="fabric-reconcile-scheduled",
        cron="0 */6 * * *",
        tags=["network", "fabric"]
    )
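
The `apply_changes` task above leaves the gNMI call out. One way to keep the apply step atomic is to group all changes for a device into a single Set request, since a gNMI Set is processed as one transaction. The sketch below only builds the request; the `op`/`path`/`value` change format and the `to_set_request` helper are assumptions for illustration.

```python
from typing import Any


def to_set_request(changes: list[dict[str, Any]]) -> dict[str, list]:
    """Group changes into delete/replace/update lists for one gNMI Set call."""
    request: dict[str, list] = {"delete": [], "replace": [], "update": []}
    for change in changes:
        op, path = change["op"], change["path"]
        if op == "delete":
            request["delete"].append(path)
        elif op in ("replace", "update"):
            # Replace and update entries carry a (path, value) pair
            request[op].append((path, change["value"]))
        else:
            raise ValueError(f"unsupported op: {op!r}")
    return request


if __name__ == "__main__":
    changes = [
        {"op": "update", "path": "/interfaces/interface[name=Ethernet1]/config",
         "value": {"description": "to spine1"}},
        {"op": "delete", "path": "/interfaces/interface[name=Ethernet9]/config/description"},
    ]
    print(to_set_request(changes))
```

pygnmi's `gNMIclient.set()` takes `delete`, `replace`, and `update` lists, so the returned dict could probably be passed as keyword arguments (`gc.set(**request)`); check this against the installed pygnmi version before relying on it.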

Status: 🚧 Active Development - Migrating to InfraHub as Source of Truth
