# Fabric Standardization: Small EVPN-VXLAN Data Centers

> **Status**: Draft, Phase 0 (#31)
> **Scope**: POC for small data centers (2 spines, 3-8 leafs)
> **Parent**: Epic #30

---

## 1. Topology constraints

### 1.1 Spine layer
- Always **2 spines** per fabric (redundancy, no single point of failure)
- Spines are **pure L3 routers**: no VTEPs, no VLANs, no MLAG
- Each spine connects to **every leaf** via a dedicated P2P link

### 1.2 Leaf layer
- Leafs always come in **MLAG pairs** (2 leafs = 1 VTEP)
- Minimum **3 pairs** (6 leafs), maximum **4 pairs** (8 leafs) per fabric
- Each leaf connects to **both spines** via dedicated uplinks
- Each leaf pair shares a **VTEP loopback IP** (Loopback1)

### 1.3 Host connectivity
- Hosts connect **directly to leaf pairs**, with no access switches
- Every host is **dual-homed** via MLAG (LACP active)
- Each host-facing port-channel gets a unique MLAG ID

### 1.4 Containerlab model
- All devices use the `ceos` image (Arista cEOS)
- The lab simulates a fixed 12-port model per device

---

## 2. Port assignment: Spine

Spines use a simple sequential mapping: one Ethernet port per leaf.

| Port | Role | Connected to |
|------|------|-------------|
| Ethernet1 | Underlay downlink | Leaf 01 |
| Ethernet2 | Underlay downlink | Leaf 02 |
| Ethernet3 | Underlay downlink | Leaf 03 |
| ... | ... | ... |
| Ethernet{N} | Underlay downlink | Leaf {N} |

All spine downlinks are:
- **Routed** (`no switchport`)
- **/31 P2P addressing**
- **MTU 9214** (jumbo frames for VXLAN encapsulation)

---

## 3. Port assignment: Leaf

Leaf port allocation follows a fixed layout. Ports are divided into 3 zones:

| Port range | Role | Details |
|------------|------|---------|
| **Ethernet1-Ethernet9** | Host-facing | MLAG port-channels (trunk, LACP active) |
| **Ethernet10** | MLAG peer-link | Port-Channel999 (trunk, trunk-group `mlag-peer`) |
| **Ethernet11** | Spine1 uplink | Routed P2P /31, MTU 9214 |
| **Ethernet12** | Spine2 uplink | Routed P2P /31, MTU 9214 |

### 3.1 Host-facing ports (Ethernet1-9)
- Each physical port is a member of a Port-Channel
- Port-Channel number = MLAG ID = host index (e.g., host 01 → Po1, MLAG 1)
- Mode: `switchport mode trunk`
- VLANs: only the VLANs needed by the host
- LACP fallback enabled (timeout 5, individual)

### 3.2 MLAG peer-link (Ethernet10)
- Always **Port-Channel999**
- Trunk mode with trunk-group `mlag-peer`
- Spanning-tree link-type point-to-point
- Carries MLAG control traffic + VLANs 4090, 4091

### 3.3 Spine uplinks (Ethernet11-12)
- Ethernet11 → spine 01, Ethernet12 → spine 02
- Fixed mapping, never changes regardless of fabric size

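
The fixed port layout from sections 2 and 3 can be expressed as a small cabling generator. The sketch below is illustrative only: the function name `underlay_links` and the short `spineNN` / `leafNN` labels are assumptions made for the example, not names used by the project.

```python
# Leaf uplink port per spine, from section 3.3 (fixed mapping).
LEAF_UPLINK = {1: "Ethernet11", 2: "Ethernet12"}
SPINE_PORTS = 12  # fixed 12-port lab model (section 1.4)


def underlay_links(num_leafs: int) -> list[tuple[str, str, str, str]]:
    """Return (spine, spine_port, leaf, leaf_port) for every spine-leaf link."""
    if not 1 <= num_leafs <= SPINE_PORTS:
        raise ValueError(f"num_leafs must be between 1 and {SPINE_PORTS}")
    links = []
    for spine, leaf_port in LEAF_UPLINK.items():
        for leaf in range(1, num_leafs + 1):
            # Spine port number equals the leaf number (section 2).
            links.append((f"spine{spine:02d}", f"Ethernet{leaf}",
                          f"leaf{leaf:02d}", leaf_port))
    return links


for link in underlay_links(2):
    print(link)
```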
---

## 4. Naming conventions

### 4.1 Device hostname format

All device hostnames follow the pattern:

```
<SITE>-<ZONE>-<ROLE>-<ID>
```

| Field | Length | Description | Values |
|-------|--------|-------------|--------|
| `SITE` | 2 chars | City / location code (uppercase) | PA (Paris), LY (Lyon), MA (Marseille), LO (London)... |
| `ZONE` | 2-3 chars | Network area (uppercase) | DC (Datacenter), DR (Disaster Recovery), LAB (Lab), CO (Colocation) |
| `ROLE` | 4-5 chars | Device function (uppercase) | SPINE, LEAF, HOST |
| `ID` | 2 digits | Sequential number (zero-padded) | 01, 02, ..., 99 |

**Examples for Paris datacenter:**

| Device | Hostname |
|--------|----------|
| Spine 1 | `PA-DC-SPINE-01` |
| Spine 2 | `PA-DC-SPINE-02` |
| Leaf 1 (pair 1, primary) | `PA-DC-LEAF-01` |
| Leaf 2 (pair 1, secondary) | `PA-DC-LEAF-02` |
| Leaf 3 (pair 2, primary) | `PA-DC-LEAF-03` |
| Leaf 4 (pair 2, secondary) | `PA-DC-LEAF-04` |
| Host 1 | `PA-DC-HOST-01` |

**Examples for Lyon disaster recovery:**

| Device | Hostname |
|--------|----------|
| Spine 1 | `LY-DR-SPINE-01` |
| Leaf 1 | `LY-DR-LEAF-01` |

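
A hostname check along these lines could guard against typos before objects are created. This is a minimal sketch: `parse_hostname` is a hypothetical helper, and treating the zone and role values from the table as a closed set is an assumption of the example.

```python
import re

# Zone and role values come from the field table; treating them as a
# closed set is an assumption of this sketch.
HOSTNAME_RE = re.compile(
    r"(?P<site>[A-Z]{2})-(?P<zone>DC|DR|LAB|CO)-"
    r"(?P<role>SPINE|LEAF|HOST)-(?P<id>(?!00)[0-9]{2})"
)


def parse_hostname(hostname: str) -> dict:
    """Split a hostname into its fields, or raise ValueError if malformed."""
    match = HOSTNAME_RE.fullmatch(hostname)
    if match is None:
        raise ValueError(f"invalid hostname: {hostname!r}")
    fields = match.groupdict()
    fields["id"] = int(fields["id"])  # "01" -> 1
    return fields


print(parse_hostname("PA-DC-LEAF-01"))
```

Lowercase input such as `pa-dc-leaf-01` is rejected here, since hostnames are uppercase and only Infrahub object names and IPAM identifiers use the lowercase form.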
### 4.2 Leaf pairing rule

Leafs are numbered sequentially. **Odd = primary, even = secondary** within a pair:

| Pair | Primary | Secondary | VTEP |
|------|---------|-----------|------|
| 1 | `{SITE}-{ZONE}-LEAF-01` | `{SITE}-{ZONE}-LEAF-02` | VTEP 1 |
| 2 | `{SITE}-{ZONE}-LEAF-03` | `{SITE}-{ZONE}-LEAF-04` | VTEP 2 |
| 3 | `{SITE}-{ZONE}-LEAF-05` | `{SITE}-{ZONE}-LEAF-06` | VTEP 3 |
| 4 | `{SITE}-{ZONE}-LEAF-07` | `{SITE}-{ZONE}-LEAF-08` | VTEP 4 |

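
The pairing rule reduces to integer arithmetic on the leaf ID. The helpers below are a sketch with hypothetical names (`leaf_pair`, `mlag_peer`); they only restate the table above.

```python
def leaf_pair(leaf_id: int) -> tuple[int, str]:
    """Return (pair number, role) for a leaf ID: odd = primary, even = secondary."""
    if leaf_id < 1:
        raise ValueError("leaf IDs start at 1")
    pair = (leaf_id + 1) // 2
    role = "primary" if leaf_id % 2 == 1 else "secondary"
    return pair, role


def mlag_peer(leaf_id: int) -> int:
    """Return the ID of the other leaf in the same MLAG pair."""
    if leaf_id < 1:
        raise ValueError("leaf IDs start at 1")
    return leaf_id + 1 if leaf_id % 2 == 1 else leaf_id - 1


print(leaf_pair(3), mlag_peer(3))
```

For example, leaf 03 is the primary of pair 2 and peers with leaf 04.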
### 4.3 Fabric name

The fabric is identified by `{SITE}-{ZONE}` (lowercase in Infrahub objects):

| Infrahub object | Name | Example |
|----------------|------|---------|
| `InfraFabric` | `{site}-{zone}` | `pa-dc` |
| `LocationSite` | `{site}-{zone}` | `pa-dc` |

### 4.4 Interface descriptions

Interface descriptions reference the **full hostname** of the remote device:

| Interface | Description format | Example (on PA-DC-LEAF-01) |
|-----------|-------------------|----------------------------|
| Spine uplink Eth11 | `to {REMOTE_HOSTNAME}` | `to PA-DC-SPINE-01` |
| Spine uplink Eth12 | `to {REMOTE_HOSTNAME}` | `to PA-DC-SPINE-02` |
| MLAG peer-link Eth10 | `mlag peer link` | `mlag peer link` |
| Host-facing Eth1 | `to {HOST_HOSTNAME}` | `to PA-DC-HOST-01` |
| Loopback0 | `Router-ID` | `Router-ID` |
| Loopback1 | `VTEP` | `VTEP` |

On spines:

| Interface | Description format | Example (on PA-DC-SPINE-01) |
|-----------|-------------------|----------------------------|
| Downlink Eth1 | `to {REMOTE_HOSTNAME}` | `to PA-DC-LEAF-01` |
| Loopback0 | `Router-ID` | `Router-ID` |

### 4.5 MLAG domain

| Parameter | Value | Notes |
|-----------|-------|-------|
| Domain ID | `{site}-{zone}-pair{NN}` | e.g., `pa-dc-pair01`; unique per MLAG pair |
| Peer-link | Port-Channel999 | Fixed |
| Peer-link VLAN | 4090 | Fixed |
| iBGP peering VLAN | 4091 | Fixed |

### 4.6 BGP descriptions

| Session type | Description format | Example |
|-------------|-------------------|---------|
| eBGP underlay | `underlay to {REMOTE_HOSTNAME}` | `underlay to PA-DC-SPINE-01` |
| iBGP MLAG peer | `ibgp to {REMOTE_HOSTNAME}` | `ibgp to PA-DC-LEAF-02` |
| EVPN overlay | `evpn to {REMOTE_HOSTNAME}` | `evpn to PA-DC-SPINE-01` |

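
As an illustration, the BGP session descriptions for one leaf can be derived from its site, zone, and ID alone. The function name `leaf_bgp_descriptions` is hypothetical; the sketch assumes two spines numbered 01 and 02 and the odd/even pairing rule from section 4.2.

```python
def leaf_bgp_descriptions(site: str, zone: str, leaf_id: int) -> list[str]:
    """Build the neighbor descriptions configured on one leaf."""
    prefix = f"{site}-{zone}".upper()
    # Odd leaf peers with the next ID, even leaf with the previous one.
    peer_id = leaf_id + 1 if leaf_id % 2 == 1 else leaf_id - 1
    spines = [f"{prefix}-SPINE-{n:02d}" for n in (1, 2)]

    descriptions = [f"underlay to {spine}" for spine in spines]
    descriptions.append(f"ibgp to {prefix}-LEAF-{peer_id:02d}")
    descriptions += [f"evpn to {spine}" for spine in spines]
    return descriptions


for line in leaf_bgp_descriptions("PA", "DC", 1):
    print(line)
```

For `PA-DC-LEAF-01` this yields two underlay sessions, one iBGP session to `PA-DC-LEAF-02`, and two EVPN sessions.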
### 4.7 IPAM identifiers (for resource pool idempotence)

All identifiers use **lowercase**, with the fabric name `{site}-{zone}`:

| Object | Identifier pattern | Example |
|--------|-------------------|---------|
| Site infra prefix | `site-{site}-{zone}-infra` | `site-pa-dc-infra` |
| Site services prefix | `site-{site}-{zone}-services` | `site-pa-dc-services` |
| Device loopback0 IP | `lo0-{site}-{zone}-{role}-{id}` | `lo0-pa-dc-leaf-01` |
| Device loopback1 IP | `lo1-{site}-{zone}-vtep{NN}` | `lo1-pa-dc-vtep01` |
| Underlay P2P /31 | `p2p-{site}-{zone}-spine{NN}-leaf{NN}` | `p2p-pa-dc-spine01-leaf01` |
| MLAG peer /31 | `mlag-peer-{site}-{zone}-pair{NN}` | `mlag-peer-pa-dc-pair01` |
| MLAG iBGP /31 | `mlag-ibgp-{site}-{zone}-pair{NN}` | `mlag-ibgp-pa-dc-pair01` |
| Leaf ASN | `asn-{site}-{zone}-pair{NN}` | `asn-pa-dc-pair01` |

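
The identifier patterns above can be generated per leaf, which makes it easy to see which resources are shared within an MLAG pair. This is a sketch: `leaf_identifiers` is a hypothetical helper, and it assumes the VTEP number `NN` equals the pair number, as the pairing table in section 4.2 suggests.

```python
def leaf_identifiers(site: str, zone: str, leaf_id: int) -> dict[str, str]:
    """Return the IPAM identifiers that apply to one leaf."""
    fabric = f"{site}-{zone}".lower()
    pair = (leaf_id + 1) // 2  # leafs 01/02 -> pair 1, 03/04 -> pair 2, ...
    ids = {
        "loopback0": f"lo0-{fabric}-leaf-{leaf_id:02d}",
        # Assumption: VTEP number equals the pair number.
        "loopback1": f"lo1-{fabric}-vtep{pair:02d}",
        "mlag_peer": f"mlag-peer-{fabric}-pair{pair:02d}",
        "mlag_ibgp": f"mlag-ibgp-{fabric}-pair{pair:02d}",
        "asn": f"asn-{fabric}-pair{pair:02d}",
    }
    for spine in (1, 2):
        ids[f"p2p_spine{spine:02d}"] = (
            f"p2p-{fabric}-spine{spine:02d}-leaf{leaf_id:02d}"
        )
    return ids


for key, value in leaf_identifiers("PA", "DC", 1).items():
    print(f"{key:12} {value}")
```

Both leafs of a pair resolve to the same `loopback1`, `mlag_peer`, `mlag_ibgp`, and `asn` identifiers, while `loopback0` and the P2P identifiers stay per-device.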
### 4.8 Site prefix registry

To avoid conflicts, site prefixes must be registered:

| Prefix | City | Country |
|--------|------|---------|
| PA | Paris | FR |
| LY | Lyon | FR |
| MA | Marseille | FR |
| LO | London | UK |
| FR | Frankfurt | DE |
| AM | Amsterdam | NL |

> This registry is maintained as a reference. The `SITE` code is stored on the `LocationSite` object in Infrahub.

---

## 5. IPAM: IP addressing plan

### 5.1 Supernets (global)

| Role | Supernet | Description |
|------|----------|-------------|
| Infrastructure | `10.0.0.0/8` | Loopbacks, underlay, MLAG |
| Services | `172.16.0.0/12` | L2/L3 VXLAN user subnets |

### 5.2 Site allocation (from supernets)

Each site receives:
- **1x /16** from `10.0.0.0/8` for infrastructure
- **1x /16** from `172.16.0.0/12` for services

### 5.3 Fabric pools (from site infra /16)

| Pool | Prefix size | Allocation unit | Pool type |
|------|-------------|-----------------|-----------|
| Loopback0 (router-id) | /24 | /32 per device | `CoreIPAddressPool` |
| Loopback1 (VTEP) | /24 | /32 per MLAG pair | `CoreIPAddressPool` |
| Underlay P2P | /24 | /31 per spine-leaf link | `CoreIPPrefixPool` |
| MLAG peer-link SVI | /24 | /31 per MLAG pair | `CoreIPPrefixPool` |
| MLAG iBGP peering | /24 | /31 per MLAG pair | `CoreIPPrefixPool` |

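
To make the hierarchy concrete, the sketch below carves five /24 pools out of a site infrastructure /16 using Python's `ipaddress` module. The site prefix `10.1.0.0/16` and the order in which the /24s are taken are assumptions for illustration; this document does not fix either, and in practice Infrahub performs the allocation.

```python
import ipaddress

INFRA_SUPERNET = ipaddress.IPv4Network("10.0.0.0/8")
POOL_NAMES = ["loopback0", "loopback1", "underlay_p2p", "mlag_peer", "mlag_ibgp"]


def fabric_pools(site_infra: str) -> dict[str, ipaddress.IPv4Network]:
    """Assign the first five /24s of a site infra /16 to the fabric pools."""
    site = ipaddress.IPv4Network(site_infra)
    if site.prefixlen != 16 or not site.subnet_of(INFRA_SUPERNET):
        raise ValueError("site infra prefix must be a /16 inside 10.0.0.0/8")
    # Assumption: pools are taken in table order, starting at the first /24.
    return dict(zip(POOL_NAMES, site.subnets(new_prefix=24)))


pools = fabric_pools("10.1.0.0/16")  # hypothetical site allocation
for name, prefix in pools.items():
    print(f"{name:13} {prefix}")

# Each underlay link then takes one /31 from the P2P pool.
first_p2p = next(pools["underlay_p2p"].subnets(new_prefix=31))
print("first P2P link:", first_p2p)
```

A /24 P2P pool holds 128 /31 links, well above the 16 spine-leaf links of an 8-leaf fabric.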
### 5.4 Service pools (from site services /16)

| Pool | Allocation unit | Pool type |
|------|-----------------|-----------|
| L2 VXLAN subnets | /24 per VLAN (customizable) | `CoreIPPrefixPool` |
| L3 VXLAN subnets (VRF SVIs) | /24 per VRF SVI (customizable) | `CoreIPPrefixPool` |

### 5.5 Special VLANs (reserved, not from pools)

| VLAN | Name | Purpose | Trunk group |
|------|------|---------|-------------|
| 4090 | mlag-peer | MLAG peer-link SVI | mlag-peer |
| 4091 | mlag-ibgp | MLAG iBGP peering | mlag-peer |

---

## 6. BGP: Autonomous System assignment

### 6.1 Spine ASN
- **Single ASN** shared by all spines in a fabric
- Defined as an attribute on `InfraFabric`
- Default for POC: **65000**

### 6.2 Leaf ASN
- **One ASN per MLAG pair** (iBGP within pair, eBGP to spines)
- Allocated from a `CoreNumberPool` (range: 65001–65099)
- Deterministic via identifier: `asn-{site}-{zone}-pair{NN}`

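
The point of allocating by identifier is idempotence: asking again for the same identifier returns the same number. The class below is a stand-in written for illustration, not the Infrahub `CoreNumberPool` API; the name `IdempotentNumberPool` and the lowest-free-number policy are assumptions.

```python
class IdempotentNumberPool:
    """Toy number pool that remembers allocations by identifier."""

    def __init__(self, start: int, end: int) -> None:
        self.start = start
        self.end = end
        self.allocated: dict[str, int] = {}

    def allocate(self, identifier: str) -> int:
        # Re-requesting a known identifier returns the existing number.
        if identifier in self.allocated:
            return self.allocated[identifier]
        used = set(self.allocated.values())
        # Assumption: hand out the lowest free number in the range.
        for number in range(self.start, self.end + 1):
            if number not in used:
                self.allocated[identifier] = number
                return number
        raise RuntimeError("number pool exhausted")


leaf_asns = IdempotentNumberPool(65001, 65099)
print(leaf_asns.allocate("asn-pa-dc-pair01"))
print(leaf_asns.allocate("asn-pa-dc-pair02"))
print(leaf_asns.allocate("asn-pa-dc-pair01"))  # same ASN as the first call
```

Both leafs of a pair request the same identifier, so they end up with the same ASN regardless of the order in which they are generated.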
### 6.3 BGP configuration standards

| Parameter | Value | Notes |
|-----------|-------|-------|
| `no bgp default ipv4-unicast` | Always | Explicit activation per AFI |
| `distance bgp` | `20 200 200` | eBGP preferred over iBGP |
| `maximum-paths` | `4 ecmp 64` | Multi-path for spine redundancy |
| `maximum-routes` | `12000 warning-only` | Per neighbor |
| `ebgp-multihop` | `3` | EVPN overlay (loopback peering) |
| `send-community extended` | Always | Required for EVPN route-targets |
| `next-hop-unchanged` | Spine EVPN peer-group | Preserve leaf next-hop in overlay |
| `next-hop-self` | Leaf iBGP peer-group | Lets the MLAG peer resolve routes learned from the spines |

### 6.4 Peer groups (per device)

**Leaf peer groups:**

| Peer group | Type | Remote AS | Neighbors |
|------------|------|-----------|-----------|
| `underlay` | eBGP | spine ASN | Spine P2P IPs |
| `underlay_ibgp` | iBGP | own ASN | MLAG peer via VLAN 4091 |
| `evpn` | eBGP | spine ASN | Spine loopback0 IPs |

**Spine peer groups:**

| Peer group | Type | Neighbors |
|------------|------|-----------|
| `evpn` | eBGP | All leaf loopback0 IPs (each with its own remote-as) |

Spine underlay neighbors are configured individually (no peer-group) since each leaf pair has its own ASN.

### 6.5 Address families

| AFI | Activated on | Networks advertised |
|-----|-------------|---------------------|
| IPv4 unicast | underlay, underlay_ibgp | Loopback0/32, Loopback1/32 |
| EVPN | evpn | (routes from VLAN/VRF config) |

---

## 7. MLAG standards

| Parameter | Value |
|-----------|-------|
| Domain ID | `{site}-{zone}-pair{NN}` (e.g., `pa-dc-pair01`) |
| Peer-link interface | Port-Channel999 |
| Peer-link VLAN | 4090 (IP: /31 from MLAG peer pool) |
| iBGP VLAN | 4091 (IP: /31 from MLAG iBGP pool, MTU 9214) |
| Peer VLAN autostate | Disabled (`no autostate`) |
| Dual-primary detection | Enabled (delay 10, errdisable all-interfaces) |
| Heartbeat | Via Management0 (VRF mgmt) |
| Virtual MAC | `c001.cafe.babe` (fabric-wide anycast gateway) |

### 7.1 Primary/secondary assignment
- **Odd-numbered leaf** (LEAF-01, LEAF-03, LEAF-05, LEAF-07): lower IP on MLAG VLANs (e.g., x.x.x.0/31)
- **Even-numbered leaf** (LEAF-02, LEAF-04, LEAF-06, LEAF-08): higher IP (e.g., x.x.x.1/31)

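
The odd/even rule maps directly onto the two addresses of a /31. A minimal sketch, assuming a hypothetical helper `mlag_vlan_ip` and an example prefix `10.1.3.0/31` that is not taken from this document:

```python
import ipaddress


def mlag_vlan_ip(prefix: str, leaf_id: int) -> str:
    """Pick the leaf's address on an MLAG VLAN from the pair's /31."""
    network = ipaddress.IPv4Network(prefix)
    if network.prefixlen != 31:
        raise ValueError("MLAG VLAN prefixes must be /31")
    # Odd leaf (primary) takes the lower address, even leaf the higher one.
    address = network[0] if leaf_id % 2 == 1 else network[1]
    return f"{address}/31"


print(mlag_vlan_ip("10.1.3.0/31", 1))  # primary
print(mlag_vlan_ip("10.1.3.0/31", 2))  # secondary
```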
---

## 8. VXLAN standards

### 8.1 VTEP interface
- Interface: `Vxlan1` on every leaf
- Source interface: `Loopback1` (shared IP within MLAG pair)
- UDP port: `4789`
- Learning: `vxlan learn-restrict any` (EVPN-controlled)

### 8.2 VNI allocation

| Type | NumberPool range | Usage |
|------|-----------------|-------|
| L2 VNI | 100001–199999 | One VNI per extended VLAN (EVPN Type-2) |
| L3 VNI | 200001–299999 | One VNI per VRF (EVPN Type-5) |

VNIs are allocated from `CoreNumberPool` with deterministic identifiers.

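
Because the two ranges do not overlap, the type of a VNI can be recovered from its value alone, which is handy in validation checks. The helper name `vni_type` is hypothetical.

```python
# Pool ranges from the table above (range() excludes its upper bound).
L2_VNI_RANGE = range(100001, 200000)
L3_VNI_RANGE = range(200001, 300000)


def vni_type(vni: int) -> str:
    """Classify a VNI as 'L2' or 'L3' based on the pool it falls in."""
    if vni in L2_VNI_RANGE:
        return "L2"
    if vni in L3_VNI_RANGE:
        return "L3"
    raise ValueError(f"VNI {vni} is outside both pools")


print(vni_type(100010), vni_type(200001))
```

Values between the pools, such as 200000, are rejected rather than silently assigned to either type.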
### 8.3 Route distinguisher and route target

| Service type | RD format | RT format |
|-------------|-----------|-----------|
| L2 VXLAN (per VLAN) | `{ASN}:{VNI}` | `{VLAN_ID}:{VNI}` (import/export) |
| L3 VXLAN (per VRF) | `{Loopback0_IP}:{VRF_index}` | `{VRF_index}:{VNI}` (import/export evpn) |

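
The formats above can be rendered as follows. This is a sketch with hypothetical function names; the sample values (VLAN 10, VNI 100010, loopback `10.1.0.11`) are made up for the example. The 16-bit limit on the VRF index reflects the IP-address form of a route distinguisher, which leaves two bytes for the assigned number.

```python
import ipaddress


def l2_rd_rt(asn: int, vlan_id: int, vni: int) -> tuple[str, str]:
    """Return (RD, RT) for an L2 VXLAN service: ASN:VNI and VLAN_ID:VNI."""
    return f"{asn}:{vni}", f"{vlan_id}:{vni}"


def l3_rd_rt(loopback0_ip: str, vrf_index: int, vni: int) -> tuple[str, str]:
    """Return (RD, RT) for a VRF: Loopback0_IP:VRF_index and VRF_index:VNI."""
    ipaddress.IPv4Address(loopback0_ip)  # raises ValueError if malformed
    if not 0 <= vrf_index <= 65535:
        raise ValueError("VRF index must fit in 16 bits")
    return f"{loopback0_ip}:{vrf_index}", f"{vrf_index}:{vni}"


print(l2_rd_rt(65001, 10, 100010))
print(l3_rd_rt("10.1.0.11", 1, 200001))
```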
---

## 9. Global parameters

| Parameter | Value | Notes |
|-----------|-------|-------|
| Underlay MTU | 9214 | All P2P and iBGP links |
| Anycast gateway MAC | `c001.cafe.babe` | `ip virtual-router mac-address` |
| Routing model | `multi-agent` | `service routing protocols model multi-agent` |
| Spanning-tree | Disabled on VLAN 4090, 4091 | MLAG VLANs only |
| LLDP | Management0 | `lldp management-address Management0` |
| gNMI | Enabled | `management api gnmi` with `provider eos-native` |

---

## 10. Out of scope (for now)

- **Access switches**: hosts connect directly to leafs
- **Multi-fabric / DCI**: single fabric per site
- **IPv6 underlay**: IPv4 only
- **BFD**: not configured in initial POC
- **Route-maps / prefix-lists**: no filtering in the underlay
- **More than 2 spines**: fixed at 2 for the POC
- **Non-Arista platforms**: EOS only