Fabric Orchestrator
Declarative Network Infrastructure Management for Arista EVPN-VXLAN Fabrics
A workflow-based orchestration system that uses NetBox as Source of Truth, Kestra for orchestration, and gNMI/YANG for atomic configuration management of Arista data center fabrics.
🎯 Project Vision
Transform network infrastructure management from imperative scripting to true declarative infrastructure-as-code, where:
- Intent is defined in NetBox (Custom Fields, Native Models, BGP Plugin)
- Orchestration is handled by Kestra (declarative YAML workflows)
- State is continuously monitored via gNMI Subscribe
- Changes are computed as diffs and applied atomically via gNMI Set
- Drift is detected and optionally auto-remediated
Think terraform plan and terraform apply, but for your network fabric — powered by Kestra workflows.
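The plan step described above can be sketched as a comparison of two flat mappings from gNMI path to value. This is a minimal illustration, not the project's diff engine: the function name `plan`, the flat path-to-value representation, and the sample paths and values are all assumptions, and `have` is assumed to contain only paths the orchestrator manages (otherwise the delete list would sweep up unmanaged configuration).

```python
from typing import Any


def plan(want: dict[str, Any], have: dict[str, Any]) -> dict[str, list]:
    """Compare desired (want) and observed (have) state, both keyed by gNMI path."""
    # Paths whose desired value is missing on the device or differs from it.
    updates = [
        (path, value)
        for path, value in sorted(want.items())
        if path not in have or have[path] != value
    ]
    # Managed paths present on the device but absent from the intent.
    deletes = sorted(path for path in have if path not in want)
    return {"update": updates, "delete": deletes}


# Hypothetical sample data for one interface.
base = "/interfaces/interface[name=Ethernet1]/config"
want = {f"{base}/description": "to-spine1", f"{base}/mtu": 9214}
have = {
    f"{base}/description": "old-label",
    f"{base}/mtu": 9214,
    "/interfaces/interface[name=Ethernet9]/config/description": "stale",
}
changes = plan(want, have)
```

Keeping the result as separate `update` and `delete` lists mirrors the shape of a gNMI Set request, so an apply step could pass both lists in a single call.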
🏗️ Architecture
┌─────────────────────────────────────────────────────────────────────────────┐
│ INTENT LAYER │
│ ┌─────────────────┐ ┌──────────────────────┐ ┌────────────────────┐ │
│ │ NetBox │ │ Custom Fields / │ │ netbox-bgp │ │
│ │ (SoT) │◄───│ Native Models │◄───│ Plugin │ │
│ └────────┬────────┘ └──────────────────────┘ └────────────────────┘ │
└───────────┼─────────────────────────────────────────────────────────────────┘
│ Webhook / Polling
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ ORCHESTRATION LAYER (KESTRA) │
│ │
│ ┌────────────────────────────────────────────────────────────────────────┐ │
│ │ Kestra Workflows (YAML) │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────────────┐ │ │
│ │ │ fabric-reconcile│ │ drift-detection │ │ netbox-webhook-handler │ │ │
│ │ │ (plan/apply) │ │ (subscribe) │ │ (event trigger) │ │ │
│ │ └─────────────────┘ └─────────────────┘ └─────────────────────────┘ │ │
│ └────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────────────────────┐ │
│ │ Python Tasks (containerized) │ │
│ │ ┌───────────────┐ ┌───────────────┐ ┌───────────────────────────┐ │ │
│ │ │ Intent Parser │ │ Diff Engine │ │ gNMI Client (Get/Set) │ │ │
│ │ │ (NetBox→YANG) │ │ (Want vs Have)│ │ (pygnmi wrapper) │ │ │
│ │ └───────────────┘ └───────────────┘ └───────────────────────────┘ │ │
│ └────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────────┐│
│ │ Triggers: Webhook (NetBox) │ Schedule (cron) │ Flow (event-driven) ││
│ └─────────────────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────────────────┘
│ gNMI Get/Set/Subscribe
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ DEVICE LAYER │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ spine1 │ │ spine2 │ │ leaf1 │ │ leaf2 │ ... │
│ │ gNMI:6030 │ │ gNMI:6030 │ │ gNMI:6030 │ │ gNMI:6030 │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
🎛 Why Kestra?
We chose Kestra as the orchestration engine for several reasons:
| Feature | Benefit |
|---|---|
| Declarative YAML workflows | Infrastructure-as-Code for orchestration logic |
| Built-in UI | Dashboard, logs, metrics, execution history — no custom development |
| Native webhooks | Direct NetBox integration without custom FastAPI server |
| Event-driven triggers | Schedule, webhook, flow triggers out of the box |
| Python task support | Run containerized Python scripts with dependencies |
| DAG support | Automatic dependency ordering with io.kestra.plugin.core.flow.Dag |
| Retry & error handling | Built-in retry policies and error notifications |
| Secrets management | Native secrets storage for credentials |
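To make "automatic dependency ordering" concrete, the sketch below uses Python's standard `graphlib` to order a small task graph. It only illustrates the concept and says nothing about how Kestra implements its Dag task; the `get_state` task is hypothetical, and the other task names are borrowed from the fabric-reconcile flow.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on.
# "get_state" is a hypothetical task added for illustration.
dependencies = {
    "get_intent": set(),
    "get_state": set(),
    "compute_diff": {"get_intent", "get_state"},
    "apply_changes": {"compute_diff"},
}

# static_order() yields every task after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
```

Declaring only the dependencies, rather than a fixed sequence, lets independent tasks such as `get_intent` and `get_state` run in any order or in parallel.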
🎯 Target Fabric
This project is designed for the Arista EVPN-VXLAN ContainerLab topology:
- 2 Spines (BGP Route Reflectors, AS 65000)
- 8 Leafs (4 MLAG VTEP pairs, AS 65001-65004)
- cEOS 4.35.0F with gNMI enabled
- EVPN Type-2 (L2 VXLAN) and Type-5 (L3 VXLAN) support
Reference: arista-evpn-vxlan-clab
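The topology above can be expressed as a small inventory, which is handy for looping over gNMI targets. This is a sketch under stated assumptions: the hostnames `leaf3` to `leaf8`, the pairing of adjacent leafs into MLAG pairs, and one AS number per pair are guesses based on the description, not facts taken from the lab repository.

```python
from dataclasses import dataclass

GNMI_PORT = 6030  # gNMI port shown in the architecture diagram


@dataclass(frozen=True)
class Device:
    name: str
    role: str
    asn: int
    target: str  # host:port used for gNMI


def build_inventory() -> list[Device]:
    """Build the 2-spine / 8-leaf inventory (naming and pairing are assumptions)."""
    spines = [
        Device(f"spine{i}", "spine", 65000, f"spine{i}:{GNMI_PORT}") for i in (1, 2)
    ]
    leafs = []
    for i in range(1, 9):
        pair = (i + 1) // 2  # leaf1-2 -> pair 1, ..., leaf7-8 -> pair 4
        leafs.append(Device(f"leaf{i}", "leaf", 65000 + pair, f"leaf{i}:{GNMI_PORT}"))
    return spines + leafs


inventory = build_inventory()
```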
📋 Project Phases
Progress is tracked via issues. See all issues or filter by phase:
| Phase | Description | Issues |
|---|---|---|
| Phase 1 | YANG Path Discovery - Map EOS 4.35.0F YANG models, validate gNMI | phase-1-yang-discovery |
| Phase 2 | Core Components - NetBox client, diff engine, gNMI operations | phase-2-minimal-reconciler |
| Phase 3 | Full Fabric - BGP, MLAG, VRFs, YANG mappers | phase-3-full-fabric |
| Phase 4 | Kestra Integration - Workflows, webhooks, drift detection | phase-4-kestra |
📌 Project Board: View Kanban
📁 Project Structure
fabric-orchestrator/
├── README.md
├── pyproject.toml
├── docker-compose.yml  # Kestra + PostgreSQL
│
├── kestra/  # Kestra workflows
│   └── flows/
│       ├── fabric-reconcile.yml  # Main plan/apply workflow
│       ├── netbox-webhook.yml  # NetBox webhook handler
│       ├── drift-detection.yml  # Drift monitoring workflow
│       └── device-config.yml  # Per-device configuration
│
├── src/  # Python package (reusable code)
│   ├── __init__.py
│   ├── cli.py  # CLI for YANG discovery (discover commands)
│   ├── gnmi/
│   │   ├── __init__.py
│   │   ├── client.py  # gNMI client wrapper (pygnmi)
│   │   └── README.md
│   ├── netbox/
│   │   ├── __init__.py
│   │   ├── client.py  # NetBox API client (pynetbox)
│   │   └── models.py  # Pydantic models for intent validation
│   └── yang/
│       ├── __init__.py
│       ├── mapper.py  # NetBox intent → YANG paths
│       └── paths.py  # YANG path definitions
│
├── scripts/  # Scripts called by Kestra workflows
│   ├── get_fabric_intent.py
│   ├── diff_engine.py
│   └── apply_changes.py
│
├── tests/
│
└── docs/
    ├── cli-user-guide.md  # CLI documentation
    ├── yang-paths.md  # Documented YANG paths
    └── netbox-data-model.md  # NetBox schema documentation
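As an illustration of the "NetBox intent → YANG paths" role assigned to `mapper.py`, the sketch below turns one interface intent into an update tuple of the `(path, value)` form that pygnmi's `set(update=...)` accepts. The class `InterfaceIntent` and the function `to_gnmi_updates` are hypothetical names, a plain dataclass stands in for the project's Pydantic models to keep the example dependency-free, and the path follows the standard OpenConfig interfaces model; whether the payload keys need module prefixes on a given EOS release is left unverified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceIntent:
    """Hypothetical stand-in for a validated intent record."""
    name: str
    description: str
    enabled: bool = True


def to_gnmi_updates(intent: InterfaceIntent) -> list[tuple[str, dict]]:
    """Map one interface intent to (path, value) tuples for a gNMI Set update."""
    path = f"/interfaces/interface[name={intent.name}]/config"
    value = {"description": intent.description, "enabled": intent.enabled}
    return [(path, value)]


updates = to_gnmi_updates(InterfaceIntent(name="Ethernet1", description="to-spine1"))
```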
🛠️ Technology Stack
| Component | Technology | Purpose |
|---|---|---|
| Source of Truth | NetBox + BGP Plugin | Intent definition via native models |
| Orchestrator | Kestra | Declarative workflow orchestration |
| Transport | gNMI | Configuration and telemetry |
| Data Models | YANG (OpenConfig + Arista) | Structured configuration |
| Python Library | pygnmi + pynetbox | gNMI/NetBox interactions |
| CLI | Click + Rich | YANG discovery tools |
| Validation | Pydantic v2 | Intent data validation |
| Lab | ContainerLab + cEOS | Development environment |
🔗 Related Projects
- arista-evpn-vxlan-clab - Target fabric topology
- projet-vxlan-automation - Previous NetBox RenderConfig work
- Arista YANG Models - EOS 4.35.0F YANG definitions
- Kestra Documentation - Orchestration platform docs
🚀 Getting Started
Prerequisites
- Docker and Docker Compose
- Python 3.12+
- uv package manager
- Access to ContainerLab with cEOS images
Quick Start
# Clone the repository
git clone https://gitea.arnodo.fr/Damien/fabric-orchestrator.git
cd fabric-orchestrator
# Start Kestra
docker compose up -d
# Access Kestra UI
open http://localhost:8080
# Install Python dependencies (for CLI tools)
uv sync
# Verify gNMI connectivity to your fabric
uv run fabric-orch discover capabilities --target leaf1:6030
# Explore YANG paths
uv run fabric-orch discover get --target leaf1:6030 \
--path "/interfaces/interface[name=Ethernet1]/state"
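For readers who prefer to script the same check, the last command can be approximated directly with pygnmi. This is a rough sketch, not the CLI's actual implementation: the helper names `parse_target` and `get_path`, the credentials, and the use of `insecure=True` (no TLS, lab use only) are assumptions, and IPv6 literals are not handled.

```python
def parse_target(target: str, default_port: int = 6030) -> tuple[str, int]:
    """Split 'host:port' into (host, port); fall back to the default gNMI port."""
    host, _, port = target.rpartition(":")
    if not host:  # no colon present, so the whole string is the hostname
        return target, default_port
    return host, int(port)


def get_path(target: str, path: str, username: str, password: str) -> dict:
    """Fetch one path over gNMI Get (requires pygnmi and a reachable device)."""
    # Imported lazily so parse_target stays usable without pygnmi installed.
    from pygnmi.client import gNMIclient

    with gNMIclient(
        target=parse_target(target),
        username=username,
        password=password,
        insecure=True,  # assumption: lab devices without TLS
    ) as gc:
        return gc.get(path=[path], encoding="json_ietf")
```

A call such as `get_path("leaf1:6030", "/interfaces/interface[name=Ethernet1]/state", "admin", "admin")` would mirror the CLI example, with the credentials replaced by whatever the lab uses.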
Kestra Workflow Example
id: fabric-reconcile
namespace: network.fabric
description: Reconcile fabric state with NetBox intent

inputs:
  - id: device
    type: STRING
    required: false
  - id: auto_apply
    type: BOOLEAN
    defaults: false

tasks:
  - id: get_intent
    type: io.kestra.plugin.scripts.python.Script
    containerImage: ghcr.io/damien/fabric-orchestrator:latest
    script: |
      from kestra import Kestra
      from src.netbox import FabricNetBoxClient

      client = FabricNetBoxClient()
      intent = client.get_fabric_intent()
      Kestra.outputs({"intent": intent.model_dump()})

  - id: compute_diff
    type: io.kestra.plugin.scripts.python.Script
    containerImage: ghcr.io/damien/fabric-orchestrator:latest
    script: |
      from kestra import Kestra

      # Compute diff between intent and current state
      changes = []  # placeholder until the diff engine is wired in
      Kestra.outputs({"changes": changes, "has_changes": len(changes) > 0})

  - id: apply_changes
    type: io.kestra.plugin.scripts.python.Script
    runIf: "{{ outputs.compute_diff.vars.has_changes and inputs.auto_apply }}"
    containerImage: ghcr.io/damien/fabric-orchestrator:latest
    script: |
      from src.gnmi import GNMIClient

      # Apply changes via gNMI Set

triggers:
  - id: netbox_webhook
    type: io.kestra.plugin.core.trigger.Webhook
    key: "{{ secret('NETBOX_WEBHOOK_KEY') }}"
  - id: schedule
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 */6 * * *"

errors:
  - id: notify_failure
    type: io.kestra.plugin.notifications.slack.SlackExecution
    url: "{{ secret('SLACK_WEBHOOK') }}"
Status: 🚧 Active Development - Migrating to Kestra orchestration (Phase 4)