docs: Update README for InfraHub migration
- Replace NetBox with InfraHub as Source of Truth
- Update architecture diagram
- Explain InfraHub benefits (Git-native, custom schema)
- Update project structure (remove netbox references)
- Update technology stack
- Revise project phases for new approach
223
README.md
@@ -2,13 +2,13 @@

 **Declarative Network Infrastructure Management for Arista EVPN-VXLAN Fabrics**

-A workflow-based orchestration system that uses NetBox as Source of Truth, [Prefect](https://prefect.io) for orchestration, and gNMI/YANG for atomic configuration management of Arista data center fabrics.
+A workflow-based orchestration system that uses [InfraHub](https://github.com/opsmill/infrahub) as Source of Truth, [Prefect](https://prefect.io) for orchestration, and gNMI/YANG for atomic configuration management of Arista data center fabrics.

 ## 🎯 Project Vision

 Transform network infrastructure management from imperative scripting to true declarative infrastructure-as-code, where:

-- **Intent** is defined in NetBox (Custom Fields, Native Models, BGP Plugin)
+- **Intent** is defined in InfraHub (custom schema, Git-versioned)
 - **Orchestration** is handled by Prefect (Python-native `@flow` and `@task` decorators)
 - **State** is continuously monitored via gNMI Subscribe
 - **Changes** are computed as diffs and applied atomically via gNMI Set
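The last bullet in the list above (compute a diff, then apply it with gNMI Set) can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's diff engine: `ChangeSet` and `compute_diff` are hypothetical names, state is assumed to be flattened into a path-to-value mapping, and the example paths are OpenConfig interface leaves chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any

_MISSING = object()  # sentinel so that a stored value of None is still compared


@dataclass
class ChangeSet:
    """Hypothetical container for the changes a single Set request would carry."""

    update: list[tuple[str, Any]] = field(default_factory=list)
    delete: list[str] = field(default_factory=list)

    def is_empty(self) -> bool:
        return not self.update and not self.delete


def compute_diff(want: dict[str, Any], have: dict[str, Any]) -> ChangeSet:
    """Compare intended state (want) with observed state (have), keyed by path."""
    changes = ChangeSet()
    # Paths that are missing on the device or carry a different value.
    for path in sorted(want):
        if have.get(path, _MISSING) != want[path]:
            changes.update.append((path, want[path]))
    # Paths present on the device but absent from the intent.
    for path in sorted(have):
        if path not in want:
            changes.delete.append(path)
    return changes


# Illustrative data only; real intent would come from the Source of Truth.
want = {
    "/interfaces/interface[name=Ethernet1]/config/description": "to-spine1",
    "/interfaces/interface[name=Ethernet1]/config/mtu": 9214,
}
have = {
    "/interfaces/interface[name=Ethernet1]/config/description": "old-label",
    "/interfaces/interface[name=Ethernet1]/config/mtu": 9214,
    "/interfaces/interface[name=Ethernet2]/config/description": "stale",
}
plan = compute_diff(want, have)
```

With this data, `plan.update` holds only the changed description and `plan.delete` holds only the Ethernet2 path. An empty `ChangeSet` means there is nothing to apply, which is the "plan shows no changes" case. Whether deletions should be applied automatically is a policy decision this sketch leaves open.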
@@ -19,42 +19,46 @@ Think `terraform plan` and `terraform apply`, but for your network fabric — po
 ## 🏗️ Architecture

 ```
-┌───────────────────────────────────────────────────────────────────────────────────────┐
-│                                     INTENT LAYER                                      │
-│  ┌─────────────────────┐    ┌──────────────────────────┐    ┌──────────────────────┐  │
-│  │       NetBox        │    │     Custom Fields /      │    │      netbox-bgp      │  │
-│  │        (SoT)        │◄───│      Native Models       │◄───│        Plugin        │  │
-│  └──────────┬──────────┘    └──────────────────────────┘    └──────────────────────┘  │
-└─────────────┼─────────────────────────────────────────────────────────────────────────┘
-              │ Webhook / Polling
-              ▼
 ┌──────────────────────────────────────────────────────────────────────────────┐
-│                        ORCHESTRATION LAYER (PREFECT)                         │
-│                                                                              │
-│  ┌────────────────────────────────────────────────────────────────────────┐  │
-│  │                         Prefect Flows (Python)                         │  │
-│  │  ┌───────────────────┐  ┌───────────────────┐  ┌─────────────────────┐ │  │
-│  │  │ fabric_reconcile  │  │   handle_drift    │  │  drift_remediation  │ │  │
-│  │  │   (plan/apply)    │  │    (subscribe)    │  │     (auto-fix)      │ │  │
-│  │  └───────────────────┘  └───────────────────┘  └─────────────────────┘ │  │
-│  └────────────────────────────────────────────────────────────────────────┘  │
-│                                                                              │
-│  ┌────────────────────────────────────────────────────────────────────────┐  │
-│  │                         Prefect Tasks (Python)                         │  │
-│  │  ┌─────────────────┐  ┌─────────────────┐  ┌───────────────────────┐   │  │
-│  │  │  Intent Parser  │  │   Diff Engine   │  │      gNMI Client      │   │  │
-│  │  │  (NetBox→YANG)  │  │ (Want vs Have)  │  │   (pygnmi wrapper)    │   │  │
-│  │  └─────────────────┘  └─────────────────┘  └───────────────────────┘   │  │
-│  └────────────────────────────────────────────────────────────────────────┘  │
-│                                                                              │
-│  ┌────────────────────────────────────────────────────────────────────────┐  │
-│  │   FastAPI Webhook Receiver │ Prefect .serve() │ Prefect Server (UI)    │  │
-│  └────────────────────────────────────────────────────────────────────────┘  │
-└──────────────────────────────────────────────────────────────────────────────┘
-                           │ gNMI Get/Set/Subscribe
-                           ▼
+│                                 INTENT LAYER                                 │
+│  ┌─────────────────────────┐    ┌──────────────────────────────────────────┐ │
+│  │        InfraHub         │    │              Git Repository              │ │
+│  │    (Source of Truth)    │◄──►│  - Schema definitions (YAML)             │ │
+│  │                         │    │  - Transforms (Jinja2/Python)            │ │
+│  │ • Custom fabric schema  │    │  - Version-controlled intent             │ │
+│  │ • GraphQL API           │    └──────────────────────────────────────────┘ │
+│  │ • Branch-based changes  │                                                 │
+│  └────────────┬────────────┘                                                 │
+└───────────────┼──────────────────────────────────────────────────────────────┘
+                │ GraphQL / SDK
+                ▼
 ┌──────────────────────────────────────────────────────────────────────────────┐
-│                                 DEVICE LAYER                                 │
+│                        ORCHESTRATION LAYER (PREFECT)                         │
+│                                                                              │
+│  ┌────────────────────────────────────────────────────────────────────────┐  │
+│  │                         Prefect Flows (Python)                         │  │
+│  │  ┌───────────────────┐  ┌───────────────────┐  ┌───────────────────┐   │  │
+│  │  │ fabric_reconcile  │  │   handle_drift    │  │ drift_remediation │   │  │
+│  │  │   (plan/apply)    │  │    (subscribe)    │  │    (auto-fix)     │   │  │
+│  │  └───────────────────┘  └───────────────────┘  └───────────────────┘   │  │
+│  └────────────────────────────────────────────────────────────────────────┘  │
+│                                                                              │
+│  ┌────────────────────────────────────────────────────────────────────────┐  │
+│  │                         Prefect Tasks (Python)                         │  │
+│  │ ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────────────┐│  │
+│  │ │  Intent Parser  │   │   Diff Engine   │   │       gNMI Client       ││  │
+│  │ │ (InfraHub→YANG) │   │ (Want vs Have)  │   │    (pygnmi wrapper)     ││  │
+│  │ └─────────────────┘   └─────────────────┘   └─────────────────────────┘│  │
+│  └────────────────────────────────────────────────────────────────────────┘  │
+│                                                                              │
+│  ┌────────────────────────────────────────────────────────────────────────┐  │
+│  │       Prefect Server (UI) │ Prefect .serve() │ Webhook Receiver        │  │
+│  └────────────────────────────────────────────────────────────────────────┘  │
+└──────────────────────────┬───────────────────────────────────────────────────┘
+                           │ gNMI Get/Set/Subscribe
+                           ▼
+┌──────────────────────────────────────────────────────────────────────────────┐
+│                                 DEVICE LAYER                                 │
 │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
 │  │    spine1    │  │    spine2    │  │    leaf1     │  │    leaf2     │ ...  │
 │  │  gNMI:6030   │  │  gNMI:6030   │  │  gNMI:6030   │  │  gNMI:6030   │      │
@@ -62,6 +66,26 @@ Think `terraform plan` and `terraform apply`, but for your network fabric — po
 └──────────────────────────────────────────────────────────────────────────────┘
 ```

+## 🎯 Why InfraHub?
+
+We chose [InfraHub](https://github.com/opsmill/infrahub) over NetBox as Source of Truth for several reasons:
+
+| Feature | NetBox | InfraHub |
+|---------|--------|----------|
+| **Schema** | Fixed DCIM/IPAM model | Fully customizable YAML schema |
+| **Git Integration** | External sync needed | Native - branches = data branches |
+| **Versioning** | Changelog only | True Git-like versioning with merges |
+| **Test/Redeploy** | Dump/restore | `git clone` = complete environment |
+| **Transforms** | Limited | Built-in Jinja2 + Python transforms |
+| **GraphQL** | Yes | Yes (auto-generated from schema) |
+
+**Key benefits for this project:**
+
+1. **Custom Schema** - Model exactly what we need (VTEPs, MLAG pairs, fabric topology)
+2. **Git-native** - Schema + data versioned together, easy test environment setup
+3. **Transforms** - Generate device configs directly from InfraHub
+4. **Branches** - Test fabric changes in isolated branches before merge
+
 ## 🎛 Why Prefect?

 We chose [Prefect](https://prefect.io) as the orchestration engine for several reasons:
@@ -92,14 +116,12 @@ Reference: [arista-evpn-vxlan-clab](https://gitea.arnodo.fr/Damien/arista-evpn-v

 Progress is tracked via issues. See [all issues](https://gitea.arnodo.fr/Damien/fabric-orchestrator/issues) or filter by phase:

-| Phase | Description | Issues |
+| Phase | Description | Status |
 |-------|-------------|--------|
-| **Phase 1** | YANG Path Discovery - Map EOS 4.35.0F YANG models, validate gNMI | [phase-1-yang-discovery](https://gitea.arnodo.fr/Damien/fabric-orchestrator/issues?type=all&state=all&labels=1) |
-| **Phase 2** | Core Components - NetBox client, diff engine, gNMI operations | [phase-2-minimal-reconciler](https://gitea.arnodo.fr/Damien/fabric-orchestrator/issues?type=all&state=all&labels=2) |
-| **Phase 3** | Full Fabric - BGP, MLAG, VRFs, YANG mappers | [phase-3-full-fabric](https://gitea.arnodo.fr/Damien/fabric-orchestrator/issues?type=all&state=all&labels=3) |
-| **Phase 4** | Prefect Integration - Flows, webhooks, drift detection | [phase-4-event-driven](https://gitea.arnodo.fr/Damien/fabric-orchestrator/issues?type=all&state=all&labels=4) |
-
-📌 **Project Board**: [View Kanban](https://gitea.arnodo.fr/Damien/fabric-orchestrator/projects)
+| **Phase 1** | YANG Path Discovery - Map EOS 4.35.0F YANG models, validate gNMI | ✅ Complete |
+| **Phase 2** | InfraHub Setup & Core Reconciler - Schema, diff engine, YANG mappers | 🔄 In Progress |
+| **Phase 3** | Full Fabric Coverage - BGP, MLAG, VRFs mappers | 📋 Planned |
+| **Phase 4** | Prefect Integration - Flows, webhooks, drift detection | 📋 Planned |

 ## 📁 Project Structure

@@ -108,58 +130,60 @@ fabric-orchestrator/
 ├── README.md
 ├── pyproject.toml
 │
-├── src/                        # Python package
+├── src/                            # Python package
 │   ├── __init__.py
-│   ├── cli.py                  # CLI for YANG discovery (discover commands)
+│   ├── cli.py                      # CLI for YANG discovery (discover commands)
 │   │
-│   ├── flows/                  # Prefect flows
+│   ├── flows/                      # Prefect flows
 │   │   ├── __init__.py
-│   │   ├── reconcile.py        # @flow fabric_reconcile (plan/apply)
-│   │   ├── drift.py            # @flow handle_drift
-│   │   └── remediation.py      # @flow drift_remediation
+│   │   ├── reconcile.py            # @flow fabric_reconcile (plan/apply)
+│   │   ├── drift.py                # @flow handle_drift
+│   │   └── remediation.py          # @flow drift_remediation
 │   │
-│   ├── api/                    # FastAPI webhook receiver
+│   ├── api/                        # FastAPI webhook receiver
 │   │   ├── __init__.py
-│   │   └── webhooks.py         # NetBox webhook endpoint
+│   │   └── webhooks.py             # InfraHub webhook endpoint
 │   │
-│   ├── services/               # Long-running services
+│   ├── services/                   # Long-running services
 │   │   ├── __init__.py
-│   │   └── drift_monitor.py    # gNMI Subscribe drift detection
+│   │   └── drift_monitor.py        # gNMI Subscribe drift detection
 │   │
 │   ├── gnmi/
 │   │   ├── __init__.py
-│   │   ├── client.py           # gNMI client wrapper (pygnmi)
+│   │   ├── client.py               # gNMI client wrapper (pygnmi)
 │   │   └── README.md
 │   │
-│   ├── netbox/
+│   ├── infrahub/                   # InfraHub integration (TODO)
 │   │   ├── __init__.py
-│   │   ├── client.py           # NetBox API client (pynetbox)
-│   │   └── models.py           # Pydantic models for intent validation
+│   │   ├── client.py               # InfraHub SDK client
+│   │   └── models.py               # Pydantic models for intent validation
 │   │
 │   └── yang/
 │       ├── __init__.py
-│       ├── mapper.py           # NetBox intent → YANG paths
-│       ├── paths.py            # YANG path definitions
-│       └── dependencies.py     # Dependency ordering graph
+│       ├── mapper.py               # InfraHub intent → YANG paths
+│       ├── paths.py                # YANG path definitions
+│       └── dependencies.py         # Dependency ordering graph
 │
+├── schemas/                        # InfraHub schema definitions (TODO)
+│   └── fabric.yml                  # Custom fabric schema
+│
 ├── tests/
 │
 └── docs/
-    ├── cli-user-guide.md       # CLI documentation
-    ├── yang-paths.md           # Documented YANG paths
-    └── netbox-data-model.md    # NetBox schema documentation
+    ├── cli-user-guide.md           # CLI documentation
+    └── yang-paths.md               # Documented YANG paths
 ```

 ## 🛠️ Technology Stack

 | Component | Technology | Purpose |
 |-----------|------------|---------|
-| Source of Truth | NetBox + BGP Plugin | Intent definition via native models |
+| Source of Truth | **InfraHub** | Intent definition via custom schema |
 | Orchestrator | **Prefect** | Python-native workflow orchestration |
-| Webhooks | FastAPI | Receive NetBox webhooks |
+| Webhooks | FastAPI | Receive InfraHub webhooks |
 | Transport | gNMI | Configuration and telemetry |
 | Data Models | YANG (OpenConfig + Arista) | Structured configuration |
-| Python Library | pygnmi + pynetbox | gNMI/NetBox interactions |
+| Python Library | pygnmi + infrahub-sdk | gNMI/InfraHub interactions |
 | CLI | Click + Rich | YANG discovery tools |
 | Validation | Pydantic v2 | Intent data validation |
 | Lab | ContainerLab + cEOS | Development environment |
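The project tree above lists `dependencies.py` as a "dependency ordering graph". One way such ordering could work is a topological sort, sketched here with the standard library's `graphlib`. The object kinds and their prerequisites are assumptions made for illustration (the commit does not show the real graph), and `apply_order` and `delete_order` are hypothetical helper names.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key may only be configured after
# everything in its set exists on the device. Illustrative, not the
# project's actual graph.
DEPENDENCIES: dict[str, set[str]] = {
    "vrf": set(),
    "vlan": set(),
    "loopback": set(),
    "svi": {"vlan", "vrf"},
    "vxlan_interface": {"loopback", "vlan"},
    "bgp": {"loopback", "vrf"},
    "evpn": {"bgp", "vxlan_interface"},
}


def apply_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return a creation order in which every item follows its prerequisites.

    Raises graphlib.CycleError if the map contains a dependency cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


def delete_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return a removal order: dependents are removed before prerequisites."""
    return list(reversed(apply_order(dependencies)))
```

In this sketch `apply_order(DEPENDENCIES)` always places `vlan` and `vrf` before `svi`, and `evpn` after both `bgp` and `vxlan_interface`. The relative order of unrelated items is not something callers should rely on.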
@@ -167,12 +191,18 @@ fabric-orchestrator/
 ## 🔗 Related Projects

 - [arista-evpn-vxlan-clab](https://gitea.arnodo.fr/Damien/arista-evpn-vxlan-clab) - Target fabric topology
 - [projet-vxlan-automation](https://gitea.arnodo.fr/Damien/projet-vxlan-automation) - Previous NetBox RenderConfig work
+- [InfraHub](https://github.com/opsmill/infrahub) - Source of Truth platform
+- [InfraHub Schema Library](https://github.com/opsmill/schema-library) - Reference schemas
 - [Arista YANG Models](https://github.com/aristanetworks/yang/tree/master/EOS-4.35.0F) - EOS 4.35.0F YANG definitions
 - [Prefect Documentation](https://docs.prefect.io) - Orchestration platform docs

 ## 📚 References

+### InfraHub
+- [InfraHub Documentation](https://docs.infrahub.app)
+- [InfraHub Schema Guide](https://docs.infrahub.app/guides/create-schema)
+- [InfraHub Python SDK](https://github.com/opsmill/infrahub-sdk-python)
+
 ### Prefect
 - [Prefect Documentation](https://docs.prefect.io)
 - [Prefect Flows](https://docs.prefect.io/latest/develop/write-flows/)
@@ -196,7 +226,7 @@ fabric-orchestrator/
 - Python 3.12+
 - `uv` package manager
 - Access to ContainerLab with cEOS images
-- NetBox instance with BGP plugin
+- Docker (for InfraHub)

 ### Quick Start

@@ -208,15 +238,18 @@ cd fabric-orchestrator
 # Install Python dependencies
 uv sync

+# Start InfraHub (see InfraHub docs for full setup)
+# docker compose -f infrahub-docker-compose.yml up -d
+
 # Configure Prefect secrets
 python -c "
 from prefect.blocks.system import Secret
 from prefect.variables import Variable

-Secret(value='your-netbox-token').save('netbox-token', overwrite=True)
 Secret(value='your-gnmi-password').save('gnmi-password', overwrite=True)
+Secret(value='your-infrahub-token').save('infrahub-token', overwrite=True)

-Variable.set('netbox_url', 'https://netbox.example.com')
+Variable.set('infrahub_url', 'http://localhost:8000')
 Variable.set('gnmi_username', 'admin')
 "

@@ -231,38 +264,6 @@ uv run fabric-orch discover get --target leaf1:6030 \
   --path "/interfaces/interface[name=Ethernet1]/state"
 ```

-### Running Flows
-
-```python
-from src.flows.reconcile import fabric_reconcile
-
-# Plan only (dry-run)
-result = fabric_reconcile(dry_run=True)
-
-# Plan for a specific device
-result = fabric_reconcile(device="leaf1", dry_run=True)
-
-# Apply changes automatically
-result = fabric_reconcile(auto_apply=True, dry_run=False)
-```
-
-### Deploying with Scheduling
-
-```bash
-# Start the flow with scheduling (runs every 6 hours)
-python -m src.flows.reconcile
-
-# Or deploy via Prefect CLI
-prefect deployment run fabric-reconcile/fabric-reconcile-scheduled
-```
-
-### Starting the Webhook Receiver
-
-```bash
-# Start FastAPI webhook server
-uvicorn src.api.webhooks:app --host 0.0.0.0 --port 8000
-```
-
 ## Prefect Flow Example

 ```python
@@ -273,14 +274,16 @@ from prefect.variables import Variable

 @task(retries=2, retry_delay_seconds=10)
 def get_fabric_intent(device: str | None = None) -> dict:
-    """Retrieve fabric intent from NetBox."""
-    from src.netbox import FabricNetBoxClient
+    """Retrieve fabric intent from InfraHub."""
+    from infrahub_sdk import InfrahubClient

-    netbox_url = Variable.get("netbox_url")
-    netbox_token = Secret.load("netbox-token").get()
+    infrahub_url = Variable.get("infrahub_url")
+    infrahub_token = Secret.load("infrahub-token").get()

-    client = FabricNetBoxClient(url=netbox_url, token=netbox_token)
-    return client.get_fabric_intent() if not device else client.get_device_intent(device)
+    client = InfrahubClient(address=infrahub_url, api_token=infrahub_token)
+    # Query fabric intent via GraphQL
+    # ...
+    return intent


 @task
@@ -305,7 +308,7 @@ def fabric_reconcile(
     auto_apply: bool = False,
     dry_run: bool = True
 ) -> dict:
-    """Reconcile fabric state with NetBox intent."""
+    """Reconcile fabric state with InfraHub intent."""
     print(f"🔄 Starting fabric reconciliation")

     intent = get_fabric_intent(device)
@@ -332,4 +335,4 @@ if __name__ == "__main__":

 ---

-**Status**: 🚧 Active Development - Phase 2 (Core Components) & Phase 4 (Prefect Integration)
+**Status**: 🚧 Active Development - Migrating to InfraHub as Source of Truth