Fabric Standardization — Small EVPN-VXLAN Data Centers
Status: Draft, Phase 0 (#31)
Scope: POC for small data centers (2-4 spines, up to 24 leaf pairs)
Parent: Epic #30
1. Device model
1.1 Reference platform — Arista 7050SX3-48YC12
All fabric devices (spines and leafs) use the same hardware model:
| Attribute | Value |
|---|---|
| Model | 7050SX3-48YC12 |
| 25G SFP28 ports | 48 (Ethernet1–48) |
| 100G QSFP100 ports | 12 (Ethernet49–60) |
| Total ports | 60 |
Port banks are physically separated, enabling clean role assignment with no overlap between host-facing and fabric traffic.
1.2 Port role assignment — Production
Spine (production):
| Port bank | Role | Details |
|---|---|---|
| Ethernet1–48 (25G) | Leaf downlinks | 1 port per leaf, routed P2P /31, MTU 9214 |
| Ethernet49–60 (100G) | Reserved | Future: inter-spine links, DCI, monitoring |
Leaf (production):
| Port bank | Role | Details |
|---|---|---|
| Ethernet1–48 (25G) | Host-facing | MLAG port-channels (trunk, LACP active) |
| Ethernet49 (100G) | MLAG peer-link | Port-Channel999 (trunk, trunk-group mlag-peer) |
| Ethernet50–{49+N_spines} (100G) | Spine uplinks | 1 per spine, routed P2P /31, MTU 9214 |
| Remaining 100G | Reserved | Future use |
1.3 Port role assignment — Containerlab (clab)
Containerlab runs the cEOS image with a fixed 12-port model (Ethernet1–12, uniform speed). The production layout is compressed into these 12 ports using a deterministic formula based on the spine count N_spines:
| Parameter | Formula |
|---|---|
| Spine uplinks | Last N ports: Ethernet{13−N_spines} through Ethernet12 |
| MLAG peer-link | Port just before uplinks: Ethernet{12−N_spines} |
| Host-facing | All remaining: Ethernet1 through Ethernet{11−N_spines} |
| Host port count | 12 − N_spines − 1 |
Concrete layouts per spine count:
| Spines | Host ports | MLAG port | Spine uplinks | Host count |
|---|---|---|---|---|
| 2 | Eth1–9 | Eth10 | Eth11 (S1), Eth12 (S2) | 9 |
| 3 | Eth1–8 | Eth9 | Eth10 (S1), Eth11 (S2), Eth12 (S3) | 8 |
| 4 | Eth1–7 | Eth8 | Eth9 (S1), Eth10 (S2), Eth11 (S3), Eth12 (S4) | 7 |
Spine in clab: all 12 ports are leaf downlinks (Ethernet1 → Leaf-01, Ethernet2 → Leaf-02, ...).
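The clab formulas above can be sketched in a few lines of Python. This is a minimal illustration only: the function and class names (`clab_leaf_layout`, `ClabLeafLayout`) are hypothetical and not part of the fabric generator; the port arithmetic follows the formula table in this section.

```python
from dataclasses import dataclass

CLAB_PORTS = 12  # fixed cEOS port count used in this document


@dataclass(frozen=True)
class ClabLeafLayout:
    """Hypothetical container for a leaf's clab port roles."""
    host_ports: list[str]
    mlag_port: str
    uplinks: dict[str, str]  # port name -> spine label (S1, S2, ...)


def clab_leaf_layout(n_spines: int) -> ClabLeafLayout:
    """Derive the 12-port clab leaf layout from the spine count."""
    if n_spines not in (2, 3, 4):
        raise ValueError("n_spines must be 2, 3 or 4")
    first_uplink = CLAB_PORTS + 1 - n_spines  # Ethernet{13 - N_spines}
    mlag = CLAB_PORTS - n_spines              # Ethernet{12 - N_spines}
    # Host ports are everything before the MLAG port: Ethernet1..{11 - N_spines}
    hosts = [f"Ethernet{i}" for i in range(1, mlag)]
    uplinks = {
        f"Ethernet{first_uplink + i}": f"S{i + 1}" for i in range(n_spines)
    }
    return ClabLeafLayout(hosts, f"Ethernet{mlag}", uplinks)
```

Calling `clab_leaf_layout(3)` should reproduce the 3-spine row of the table: host ports Ethernet1 to Ethernet8, MLAG on Ethernet9, uplinks on Ethernet10 to Ethernet12.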
2. Topology constraints
2.1 Derived limits
All limits derive from the device model and the MLAG pairing rule (leafs always in pairs):
| Constraint | Production (7050SX3-48YC12) | Clab (12-port) | Source |
|---|---|---|---|
| Min spines | 2 | 2 | Redundancy requirement |
| Max spines | 4 | 4 | Small DC scope (hardware allows more) |
| Max leafs per fabric | 48 | 12 | Spine downlink port count |
| Max leaf pairs per fabric | 24 | 6 | Max leafs ÷ 2 |
| Min leaf pairs per fabric | 1 | 1 | Minimum viable fabric |
| Max host ports per leaf | 48 | 12 − N_spines − 1 | Port bank size minus fabric ports |
| Spine uplinks per leaf | N_spines | N_spines | 1 uplink per spine |
| MLAG peer-link ports | 1 (100G) | 1 | Fixed: Po999 |
2.2 Validation rules
The fabric generator must enforce:
- N_spines ∈ {2, 3, 4} — minimum 2 for redundancy, max 4 for small DC scope
- N_leafs is even — leafs always come in MLAG pairs
- N_leafs ≥ 2 — at least 1 MLAG pair
- N_leafs ≤ spine downlink port count — cannot exceed physical ports
- Every leaf connects to every spine — full-mesh underlay
- Every spine connects to every leaf — symmetric fabric
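As a rough sketch of how these rules might be enforced, the following Python collects violations for a proposed fabric and enumerates the full-mesh links. The function names and the `spine_downlink_ports` parameter are assumptions for illustration; the actual generator may structure this differently.

```python
def validate_fabric(n_spines: int, n_leafs: int, spine_downlink_ports: int) -> list[str]:
    """Return a list of rule violations; an empty list means the fabric is valid."""
    errors = []
    if n_spines not in (2, 3, 4):
        errors.append("N_spines must be 2, 3 or 4")
    if n_leafs < 2:
        errors.append("N_leafs must be at least 2 (one MLAG pair)")
    if n_leafs % 2 != 0:
        errors.append("N_leafs must be even (leafs come in MLAG pairs)")
    if n_leafs > spine_downlink_ports:
        errors.append("N_leafs exceeds the spine downlink port count")
    return errors


def full_mesh_links(n_spines: int, n_leafs: int) -> list[tuple[int, int]]:
    """Enumerate (spine_index, leaf_index) for every spine-leaf link, 1-based."""
    return [
        (spine, leaf)
        for spine in range(1, n_spines + 1)
        for leaf in range(1, n_leafs + 1)
    ]
```

For the POC defaults in clab, `validate_fabric(2, 6, 12)` returns no errors and `full_mesh_links(2, 6)` yields 12 links, one per spine-leaf combination.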
2.3 Spine layer
- Spines are pure L3 routers — no VTEPs, no VLANs, no MLAG
- Each spine connects to every leaf via a dedicated P2P link
- All spines share the same ASN within a fabric
2.4 Leaf layer
- Leafs always come in MLAG pairs (2 leafs = 1 VTEP)
- Each leaf connects to all spines via dedicated uplinks
- Each leaf pair shares a VTEP loopback IP (Loopback1)
2.5 Host connectivity
- Hosts connect directly to leaf pairs — no access switches
- Every host is dual-homed via MLAG (LACP active)
- Each host-facing port-channel gets a unique MLAG ID
2.6 POC defaults
For the initial POC, the following defaults apply:
| Parameter | Default | Notes |
|---|---|---|
| Spines | 2 | Minimum for redundancy |
| Leaf pairs | 3 (6 leafs) | Enough to validate multi-VTEP behavior |
| Platform | clab (ceos, 12 ports) | Production model as reference only |
3. Port assignment — Spine
Spines use sequential port mapping: one port per leaf, starting from Ethernet1.
| Port | Role | Connected to |
|---|---|---|
| Ethernet1 | Underlay downlink | Leaf-01 |
| Ethernet2 | Underlay downlink | Leaf-02 |
| ... | ... | ... |
| Ethernet{N_leafs} | Underlay downlink | Leaf-{N_leafs} |
Production: Ethernet1–48 (25G), remaining ports reserved. Clab: Ethernet1–12, all available for downlinks.
All spine downlinks are:
- Routed (no switchport)
- /31 P2P addressing
- MTU 9214 (jumbo frames for VXLAN encapsulation)
4. Port assignment — Leaf
4.1 Host-facing ports
- Each physical port maps to a Port-Channel
- Port-Channel number = MLAG ID = host index (e.g., host 01 → Po1, MLAG 1)
- Mode: switchport mode trunk
- VLANs: only the VLANs needed by the host
- LACP fallback enabled (timeout 5, individual)
4.2 MLAG peer-link
- Always Port-Channel999
- Trunk mode with trunk-group mlag-peer
- Spanning-tree link-type point-to-point
- Carries MLAG control traffic + VLANs 4090, 4091
Production: Ethernet49 (100G)
Clab: Ethernet{12−N_spines} (derived from formula in §1.3)
4.3 Spine uplinks
- One uplink per spine, routed P2P /31, MTU 9214
- Fixed mapping: uplink port index matches spine index
Production: Ethernet50 → Spine-01, Ethernet51 → Spine-02, ..., Ethernet{49+N_spines} → Spine-{N_spines}
Clab: Ethernet{13−N_spines} → Spine-01, ..., Ethernet12 → Spine-{N_spines}
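The production uplink mapping can be expressed as a small lookup. The sketch below assumes the layout stated in this section (Ethernet49 as peer-link, uplinks from Ethernet50); the function name `production_leaf_uplinks` is hypothetical.

```python
def production_leaf_uplinks(n_spines: int) -> dict[str, str]:
    """Map production leaf uplink ports to spine labels.

    Ethernet49 is the MLAG peer-link, so spine i uses Ethernet{49 + i}.
    """
    if n_spines not in (2, 3, 4):
        raise ValueError("n_spines must be 2, 3 or 4")
    return {
        f"Ethernet{49 + i}": f"Spine-{i:02d}" for i in range(1, n_spines + 1)
    }
```

With four spines this uses Ethernet50 through Ethernet53, which leaves Ethernet54 through Ethernet60 in the reserved bank.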
5. Naming conventions
5.1 Device hostname format
All device hostnames follow the pattern:
<SITE>-<ZONE>-<ROLE>-<ID>
| Field | Length | Description | Values |
|---|---|---|---|
| SITE | 2 chars | City / location code (uppercase) | PA (Paris), LY (Lyon), MA (Marseille), LO (London)... |
| ZONE | 2-3 chars | Network area (uppercase) | DC (Datacenter), DR (Disaster Recovery), LAB (Lab), CO (Colocation) |
| ROLE | 4-5 chars | Device function (uppercase) | SPINE, LEAF, HOST |
| ID | 2 digits | Sequential number (zero-padded) | 01, 02, ..., 99 |
Examples for Paris datacenter:
| Device | Hostname |
|---|---|
| Spine 1 | PA-DC-SPINE-01 |
| Spine 2 | PA-DC-SPINE-02 |
| Leaf 1 (pair 1, primary) | PA-DC-LEAF-01 |
| Leaf 2 (pair 1, secondary) | PA-DC-LEAF-02 |
| Leaf 3 (pair 2, primary) | PA-DC-LEAF-03 |
| Leaf 4 (pair 2, secondary) | PA-DC-LEAF-04 |
| Host 1 | PA-DC-HOST-01 |
Examples for Lyon disaster recovery:
| Device | Hostname |
|---|---|
| Spine 1 | LY-DR-SPINE-01 |
| Leaf 1 | LY-DR-LEAF-01 |
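The hostname pattern lends itself to a regular expression for both building and validating names. The following is an illustrative sketch; `HOSTNAME_RE`, `build_hostname` and `parse_hostname` are hypothetical names, and the field lengths are taken from the table above.

```python
import re

# SITE: 2 uppercase letters, ZONE: 2-3 uppercase letters,
# ROLE: SPINE | LEAF | HOST, ID: 2 digits (zero-padded).
HOSTNAME_RE = re.compile(
    r"^(?P<site>[A-Z]{2})-(?P<zone>[A-Z]{2,3})-"
    r"(?P<role>SPINE|LEAF|HOST)-(?P<id>\d{2})$"
)


def build_hostname(site: str, zone: str, role: str, device_id: int) -> str:
    """Assemble <SITE>-<ZONE>-<ROLE>-<ID> and reject anything off-pattern."""
    if not 1 <= device_id <= 99:
        raise ValueError("device_id must be between 1 and 99")
    name = f"{site.upper()}-{zone.upper()}-{role.upper()}-{device_id:02d}"
    if HOSTNAME_RE.match(name) is None:
        raise ValueError(f"hostname does not match the convention: {name}")
    return name


def parse_hostname(hostname: str) -> dict[str, str]:
    """Split a hostname into its site, zone, role and id fields."""
    match = HOSTNAME_RE.match(hostname)
    if match is None:
        raise ValueError(f"hostname does not match the convention: {hostname}")
    return match.groupdict()
```

For example, `build_hostname("pa", "dc", "leaf", 1)` gives `PA-DC-LEAF-01`, and `parse_hostname("LY-DR-SPINE-01")` returns the four fields as strings.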
5.2 Leaf pairing rule
Leafs are numbered sequentially. Odd = primary, even = secondary within a pair:
| Pair | Primary | Secondary | VTEP |
|---|---|---|---|
| 1 | {SITE}-{ZONE}-LEAF-01 | {SITE}-{ZONE}-LEAF-02 | VTEP 1 |
| 2 | {SITE}-{ZONE}-LEAF-03 | {SITE}-{ZONE}-LEAF-04 | VTEP 2 |
| 3 | {SITE}-{ZONE}-LEAF-05 | {SITE}-{ZONE}-LEAF-06 | VTEP 3 |
| 4 | {SITE}-{ZONE}-LEAF-07 | {SITE}-{ZONE}-LEAF-08 | VTEP 4 |
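The odd/even pairing rule reduces to integer arithmetic on the leaf number. A minimal sketch, with hypothetical function names:

```python
def leaf_pair(leaf_id: int) -> tuple[int, str]:
    """Return (pair number, role) for a 1-based leaf number."""
    if leaf_id < 1:
        raise ValueError("leaf_id must be >= 1")
    pair = (leaf_id + 1) // 2  # leafs 1-2 -> pair 1, leafs 3-4 -> pair 2, ...
    role = "primary" if leaf_id % 2 == 1 else "secondary"
    return pair, role


def mlag_peer_id(leaf_id: int) -> int:
    """Return the leaf number of the MLAG peer (odd pairs with the next even)."""
    if leaf_id < 1:
        raise ValueError("leaf_id must be >= 1")
    return leaf_id + 1 if leaf_id % 2 == 1 else leaf_id - 1
```

Here `leaf_pair(7)` evaluates to `(4, "primary")` and `mlag_peer_id(7)` to `8`, matching the last row of the table.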
5.3 Fabric name
The fabric is identified by {SITE}-{ZONE} (lowercase in Infrahub objects):
| Infrahub object | Name | Example |
|---|---|---|
| InfraFabric | {site}-{zone} | pa-dc |
| LocationSite | {site}-{zone} | pa-dc |
5.4 Interface descriptions
Interface descriptions reference the full hostname of the remote device:
| Interface | Description format | Example (on PA-DC-LEAF-01) |
|---|---|---|
| Spine uplink | to {REMOTE_HOSTNAME} | to PA-DC-SPINE-01 |
| MLAG peer-link | mlag peer link | mlag peer link |
| Host-facing | to {HOST_HOSTNAME} | to PA-DC-HOST-01 |
| Loopback0 | Router-ID | Router-ID |
| Loopback1 | VTEP | VTEP |
On spines:
| Interface | Description format | Example (on PA-DC-SPINE-01) |
|---|---|---|
| Downlink | to {REMOTE_HOSTNAME} | to PA-DC-LEAF-01 |
| Loopback0 | Router-ID | Router-ID |
5.5 MLAG domain
| Parameter | Value | Notes |
|---|---|---|
| Domain ID | {site}-{zone}-pair{NN} | e.g., pa-dc-pair01; unique per MLAG pair |
| Peer-link | Port-Channel999 | Fixed |
| Peer-link VLAN | 4090 | Fixed |
| iBGP peering VLAN | 4091 | Fixed |
5.6 BGP descriptions
| Session type | Description format | Example |
|---|---|---|
| eBGP underlay | underlay to {REMOTE_HOSTNAME} | underlay to PA-DC-SPINE-01 |
| iBGP MLAG peer | ibgp to {REMOTE_HOSTNAME} | ibgp to PA-DC-LEAF-02 |
| EVPN overlay | evpn to {REMOTE_HOSTNAME} | evpn to PA-DC-SPINE-01 |
5.7 IPAM identifiers (for resource pool idempotence)
All identifiers use lowercase, with the fabric name {site}-{zone}:
| Object | Identifier pattern | Example |
|---|---|---|
| Site infra prefix | site-{site}-{zone}-infra | site-pa-dc-infra |
| Site services prefix | site-{site}-{zone}-services | site-pa-dc-services |
| Device loopback0 IP | lo0-{site}-{zone}-{role}-{id} | lo0-pa-dc-leaf-01 |
| Device loopback1 IP | lo1-{site}-{zone}-vtep{NN} | lo1-pa-dc-vtep01 |
| Underlay P2P /31 | p2p-{site}-{zone}-spine{NN}-leaf{NN} | p2p-pa-dc-spine01-leaf01 |
| MLAG peer /31 | mlag-peer-{site}-{zone}-pair{NN} | mlag-peer-pa-dc-pair01 |
| MLAG iBGP /31 | mlag-ibgp-{site}-{zone}-pair{NN} | mlag-ibgp-pa-dc-pair01 |
| Leaf ASN | asn-{site}-{zone}-pair{NN} | asn-pa-dc-pair01 |
| L2 VNI | l2vni-{site}-{zone}-vlan{NNNN} | l2vni-pa-dc-vlan0040 |
| L3 VNI | l3vni-{site}-{zone}-{vrf_name} | l3vni-pa-dc-gold |
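Because idempotent allocation depends on these identifiers being built the same way every time, it may help to centralize them in one helper. The sketch below is an assumption about how that could look; the `IpamIds` class and its method names are hypothetical, and only the string patterns come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IpamIds:
    """Hypothetical helper that builds resource-pool identifiers for one fabric."""
    site: str
    zone: str

    @property
    def fabric(self) -> str:
        # Fabric name is {site}-{zone}, always lowercase in identifiers
        return f"{self.site}-{self.zone}".lower()

    def loopback0(self, role: str, device_id: int) -> str:
        return f"lo0-{self.fabric}-{role.lower()}-{device_id:02d}"

    def loopback1(self, pair: int) -> str:
        return f"lo1-{self.fabric}-vtep{pair:02d}"

    def p2p(self, spine: int, leaf: int) -> str:
        return f"p2p-{self.fabric}-spine{spine:02d}-leaf{leaf:02d}"

    def mlag_peer(self, pair: int) -> str:
        return f"mlag-peer-{self.fabric}-pair{pair:02d}"

    def mlag_ibgp(self, pair: int) -> str:
        return f"mlag-ibgp-{self.fabric}-pair{pair:02d}"

    def leaf_asn(self, pair: int) -> str:
        return f"asn-{self.fabric}-pair{pair:02d}"

    def l2vni(self, vlan_id: int) -> str:
        return f"l2vni-{self.fabric}-vlan{vlan_id:04d}"

    def l3vni(self, vrf_name: str) -> str:
        return f"l3vni-{self.fabric}-{vrf_name.lower()}"
```

For instance, `IpamIds("PA", "DC").p2p(1, 1)` produces `p2p-pa-dc-spine01-leaf01`, the same string as the example column.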
5.8 Site prefix registry
To avoid conflicts, site prefixes must be registered:
| Prefix | City | Country |
|---|---|---|
| PA | Paris | FR |
| LY | Lyon | FR |
| MA | Marseille | FR |
| LO | London | UK |
| FR | Frankfurt | DE |
| AM | Amsterdam | NL |
This registry is maintained as a reference. The SITE code is stored on the LocationSite object in Infrahub.
6. IPAM — IP addressing plan
6.1 Design principle
Only the two supernets are fixed. All intermediate allocations (site prefixes, fabric pools, individual subnets) are delegated to Infrahub's resource manager, which picks the smallest available prefix that satisfies the request. Prefix sizes mentioned in this document are illustrative defaults — the generator requests a number of allocations from a pool and Infrahub handles sizing and placement.
6.2 Supernets (global, fixed)
| Role | Supernet | Description |
|---|---|---|
| Infrastructure | 10.0.0.0/8 | Loopbacks, underlay, MLAG |
| Services | 172.16.0.0/12 | L2/L3 VXLAN user subnets |
These are the only hardcoded prefixes. Everything below is allocated dynamically.
6.3 Site allocation (from supernets)
Each site receives one prefix from each supernet, allocated by Infrahub:
- 1 prefix from 10.0.0.0/8 for infrastructure (e.g., /16)
- 1 prefix from 172.16.0.0/12 for services (e.g., /16)
6.4 Fabric pools (from site infra prefix)
The fabric generator creates pools within the site's infrastructure prefix. Each pool serves a specific role and allocates individual subnets on demand:
| Pool | Allocation unit | Pool type | Example size |
|---|---|---|---|
| Loopback0 (router-id) | /32 per device | CoreIPAddressPool | /24 |
| Loopback1 (VTEP) | /32 per MLAG pair | CoreIPAddressPool | /24 |
| Underlay P2P | /31 per spine-leaf link | CoreIPPrefixPool | /24 or /23 |
| MLAG peer-link SVI | /31 per MLAG pair | CoreIPPrefixPool | /24 |
| MLAG iBGP peering | /31 per MLAG pair | CoreIPPrefixPool | /24 |
Example sizes are not prescriptive. Infrahub allocates the parent prefix for each pool based on the number of resources requested. A 2-spine / 6-leaf fabric needs far fewer /31s than a 4-spine / 48-leaf fabric — the resource manager adapts accordingly.
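To make the sizing difference concrete, the arithmetic for the underlay P2P pool can be worked through in Python. This only illustrates the minimum parent prefix that would hold the required /31s; it does not model how Infrahub's resource manager actually chooses sizes, and the function names are hypothetical.

```python
def p2p_link_count(n_spines: int, n_leafs: int) -> int:
    """One /31 per spine-leaf link in a full-mesh underlay."""
    return n_spines * n_leafs


def smallest_parent_prefixlen(n_p31: int) -> int:
    """Smallest IPv4 prefix length whose block holds n_p31 distinct /31 subnets."""
    if n_p31 < 1:
        raise ValueError("need at least one /31")
    addresses = 2 * n_p31                      # each /31 uses two addresses
    host_bits = (addresses - 1).bit_length()   # round up to a power of two
    return 32 - host_bits


poc = p2p_link_count(2, 6)        # 12 links
largest = p2p_link_count(4, 48)   # 192 links
print(poc, smallest_parent_prefixlen(poc))          # 12 27
print(largest, smallest_parent_prefixlen(largest))  # 192 23
```

So the POC fabric fits its underlay /31s in a /27, while the largest fabric in scope needs a /23, which matches the upper example size in the table.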
6.5 Service pools (from site services prefix)
| Pool | Allocation unit | Pool type |
|---|---|---|
| L2 VXLAN subnets | Per VLAN (e.g., /24) | CoreIPPrefixPool |
| L3 VXLAN subnets (VRF SVIs) | Per VRF SVI (e.g., /24) | CoreIPPrefixPool |
6.6 Special VLANs (reserved, not from pools)
| VLAN | Name | Purpose | Trunk group |
|---|---|---|---|
| 4090 | mlag-peer | MLAG peer-link SVI | mlag-peer |
| 4091 | mlag-ibgp | MLAG iBGP peering | mlag-peer |
7. BGP — Autonomous System assignment
7.1 Spine ASN
- Single ASN shared by all spines in a fabric
- Defined as an attribute on InfraFabric
- Default for POC: 65000
7.2 Leaf ASN
- One ASN per MLAG pair (iBGP within pair, eBGP to spines)
- Allocated from a CoreNumberPool (range: 65001–65099)
- Deterministic via identifier: asn-{site}-{zone}-pair{NN}
7.3 BGP configuration standards
| Parameter | Value | Notes |
|---|---|---|
| no bgp default ipv4-unicast | Always | Explicit activation per AFI |
| bgp log-neighbor-changes | Always | Operational visibility for BGP state transitions |
| distance bgp | 20 200 200 | eBGP preferred over iBGP |
| maximum-paths | {N_spines × 2} ecmp 64 | Multi-path scaled to spine count (e.g., 2 spines → 4, 4 spines → 8) |
| maximum-routes | 12000 warning-only | Per neighbor |
| ebgp-multihop | 3 | EVPN overlay (loopback peering) |
| send-community extended | Always | Required for EVPN route-targets |
| next-hop-unchanged | Spine EVPN peer-group | Preserve leaf next-hop in overlay |
| next-hop-self | Leaf iBGP peer-group | Required for iBGP convergence |
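As an illustration of how the process-level settings scale with spine count, the sketch below renders the global lines of a BGP stanza as plain strings. The function name `render_bgp_globals` and the three-space indentation are assumptions; per-neighbor and peer-group settings from the table are deliberately left out.

```python
def render_bgp_globals(asn: int, n_spines: int, ecmp: int = 64) -> list[str]:
    """Render the process-level BGP lines from the standards table."""
    if n_spines not in (2, 3, 4):
        raise ValueError("n_spines must be 2, 3 or 4")
    max_paths = n_spines * 2  # maximum-paths = N_spines x 2
    return [
        f"router bgp {asn}",
        "   no bgp default ipv4-unicast",
        "   bgp log-neighbor-changes",
        "   distance bgp 20 200 200",
        f"   maximum-paths {max_paths} ecmp {ecmp}",
    ]


print("\n".join(render_bgp_globals(65001, 2)))
```

With two spines the last rendered line is `maximum-paths 4 ecmp 64`; with four spines it becomes `maximum-paths 8 ecmp 64`.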
7.4 Peer groups (per device)
Leaf peer groups:
| Peer group | Type | Remote AS | Neighbors |
|---|---|---|---|
| underlay | eBGP | spine ASN | Spine P2P IPs |
| underlay_ibgp | iBGP | own ASN | MLAG peer via VLAN 4091 |
| evpn | eBGP | spine ASN | Spine loopback0 IPs |
Spine peer groups:
| Peer group | Type | Neighbors |
|---|---|---|
| evpn | eBGP | All leaf loopback0 IPs (each with its own remote-as) |
Spine underlay neighbors are configured individually (no peer-group) since each leaf has a different ASN.
7.5 Address families
| AFI | Activated on | Networks advertised |
|---|---|---|
| IPv4 unicast | underlay, underlay_ibgp | Loopback0/32, Loopback1/32 |
| EVPN | evpn | (routes from VLAN/VRF config) |
8. MLAG standards
| Parameter | Value |
|---|---|
| Domain ID | {site}-{zone}-pair{NN} (e.g., pa-dc-pair01) |
| Peer-link interface | Port-Channel999 |
| Peer-link VLAN | 4090 (IP: /31 from MLAG peer pool) |
| iBGP VLAN | 4091 (IP: /31 from MLAG iBGP pool, MTU 9214) |
| Peer VLAN autostate | Disabled (no autostate) |
| Dual-primary detection | Enabled (delay 10, errdisable all-interfaces) |
| Heartbeat | Via Management0 (VRF mgmt) |
| Virtual MAC | c001.cafe.babe (fabric-wide anycast gateway) |
8.1 Primary/secondary assignment
- Odd-numbered leaf (LEAF-01, LEAF-03, LEAF-05, LEAF-07): lower IP on MLAG VLANs (e.g., x.x.x.0/31)
- Even-numbered leaf (LEAF-02, LEAF-04, LEAF-06, LEAF-08): higher IP (e.g., x.x.x.1/31)
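The lower/higher address rule can be sketched with the standard `ipaddress` module. The function name and the sample prefix `10.0.3.0/31` are hypothetical; in practice the /31 would come from the MLAG peer or MLAG iBGP pool.

```python
import ipaddress


def mlag_svi_addresses(prefix: str) -> dict[str, str]:
    """Split a /31 into the primary (lower) and secondary (higher) interface IPs."""
    network = ipaddress.ip_network(prefix)
    if network.version != 4 or network.prefixlen != 31:
        raise ValueError("expected an IPv4 /31 prefix")
    lower, higher = network[0], network[1]
    return {
        "primary": f"{lower}/31",     # odd-numbered leaf
        "secondary": f"{higher}/31",  # even-numbered leaf
    }


# Hypothetical allocation used only for illustration
print(mlag_svi_addresses("10.0.3.0/31"))
```

This returns `10.0.3.0/31` for the primary and `10.0.3.1/31` for the secondary leaf of the pair.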
9. VXLAN standards
9.1 VTEP interface
- Interface: Vxlan1 on every leaf
- Source interface: Loopback1 (shared IP within MLAG pair)
- UDP port: 4789
- Learning: vxlan learn-restrict any (EVPN-controlled)
9.2 VNI allocation
| Type | NumberPool name | Range | Usage | Identifier pattern |
|---|---|---|---|---|
| L2 VNI | l2-vni-pool | 100001–199999 | One VNI per extended VLAN (EVPN Type-2) | l2vni-{site}-{zone}-vlan{NNNN} |
| L3 VNI | l3-vni-pool | 200001–299999 | One VNI per VRF (EVPN Type-5) | l3vni-{site}-{zone}-{vrf_name} |
VNIs are allocated from CoreNumberPool with deterministic identifiers for idempotent sync.
9.3 Route distinguisher and route target
| Service type | RD format | RT format |
|---|---|---|
| L2 VXLAN (per VLAN) | {ASN}:{VNI} | {VLAN_ID}:{VNI} (import/export) |
| L3 VXLAN (per VRF) | {Loopback0_IP}:{L3_VNI} | {L3_VNI}:{L3_VNI} (import/export evpn) |
10. Global parameters
| Parameter | Value | Notes |
|---|---|---|
| Underlay MTU | 9214 | All P2P and iBGP links |
| Anycast gateway MAC | c001.cafe.babe | ip virtual-router mac-address |
| Routing model | multi-agent | service routing protocols model multi-agent |
| Spanning-tree | Disabled on VLAN 4090, 4091 | MLAG VLANs only |
| LLDP | Management0 | lldp management-address Management0 |
| gNMI | Enabled | management api gnmi with provider eos-native |
11. Out of scope (for now)
- Access switches — hosts connect directly to leafs
- Multi-fabric / DCI — single fabric per site
- IPv6 underlay — IPv4 only
- BFD — not configured in initial POC
- Route-maps / prefix-lists — no filtering in the underlay
- More than 4 spines — capped for small DC scope
- Non-Arista platforms — EOS only