From e966d637114e9127d9c72af9a1412a35eaa49f4e Mon Sep 17 00:00:00 2001
From: Damien Arnodo
Date: Sun, 30 Nov 2025 18:34:18 +0000
Subject: [PATCH] chore: remove FIXES_APPLIED.md - historical tracking document

---
 FIXES_APPLIED.md | 157 -----------------------------------------------
 1 file changed, 157 deletions(-)
 delete mode 100644 FIXES_APPLIED.md

diff --git a/FIXES_APPLIED.md b/FIXES_APPLIED.md
deleted file mode 100644
index deabcc3..0000000
--- a/FIXES_APPLIED.md
+++ /dev/null
@@ -1,157 +0,0 @@
-# Fixes Applied in Main Branch
-
-This document tracks critical fixes that have been discovered and applied during lab deployment to ensure the EVPN-VXLAN fabric functions correctly.
-
-## ✅ Fixes Applied to Main Branch
-
-### 1. **Spine Switches - Enable IP Routing** ✅ FIXED
-**Problem**: BGP was disabled on spine switches with error "BGP is disabled for VRF default" and "IP routing not enabled"
-
-**Fix**: Added `ip routing` command to both spine configurations
-- `configs/spine1.cfg` - Added line: `ip routing` (before `service routing protocols model multi-agent`)
-- `configs/spine2.cfg` - Added line: `ip routing` (before `service routing protocols model multi-agent`)
-
-**Impact**: This enables BGP to function properly on spines, allowing:
-- Underlay BGP IPv4 Unicast sessions to establish
-- EVPN BGP sessions to establish
-- Route exchange between spines and leafs
-
-**Status**: ✅ **APPLIED** (commits applied to main branch)
-
----
-
-### 2. **Leaf Switches - MLAG Port-Channel Mode** ✅ FIXED
-**Problem**: LACP bonding (`mode active`) doesn't work properly in Alpine Linux containers due to lack of kernel module support
-
-**Fix**: Changed from LACP to static LAG
-- Changed `channel-group 1 mode active` to `channel-group 1 mode on` in all leaf configs
-- This creates a static LAG that works in containerized environments
-
-**Status**: ✅ **ALREADY APPLIED** (pushed by user in previous commits)
-
----
-
-## ⏳ Remaining Issues (Pending Application)
-
-### 3. **Leaf Switches - Port-Channel1 Switchport Mode** ⏳ PENDING
-**Problem**: Port-Channel configured as `trunk`, but Alpine containers send untagged traffic
-
-**Fix Needed**: Change Port-Channel1 from trunk to access mode on all leafs:
-```
-interface Port-Channel1
-   switchport mode access
-   switchport access vlan 40  # or appropriate VLAN for each VTEP
-```
-
-**Status**: ⏳ **NOT YET APPLIED** - Needs manual configuration or config file updates
-
-**Affected Files**:
-- `configs/leaf1.cfg`
-- `configs/leaf2.cfg`
-- `configs/leaf3.cfg`
-- `configs/leaf4.cfg`
-- `configs/leaf5.cfg`
-- `configs/leaf6.cfg`
-- `configs/leaf7.cfg`
-- `configs/leaf8.cfg`
-
----
-
-### 4. **Host Configuration - Simplified Bonding** ⏳ PENDING
-**Problem**: Alpine Linux containers cannot properly configure 802.3ad LACP bonding
-
-**Fix Needed**: Remove bonding complexity, use single interface:
-```yaml
-host1:
-  exec:
-    - ip addr add 10.40.40.101/24 dev eth1
-    - ip link set eth1 up
-```
-
-**Status**: ⏳ **NOT YET APPLIED** - Topology file needs updating
-
----
-
-## 📋 Summary of Issues Found
-
-### Issue #1: Missing `ip routing` on Spines
-- **Symptoms**:
-  - `show ip bgp summary` returned "BGP is disabled for VRF default"
-  - Attempting to configure BGP showed "! IP routing not enabled"
-- **Root Cause**: Arista EOS requires explicit `ip routing` command to enable L3 functionality
-- **Status**: ✅ **FIXED**
-
-### Issue #2: LACP Bonding in Containers
-- **Symptoms**:
-  - Port-Channel showing "waiting for LACP response"
-  - Host bond interface in DOWN state
-- **Root Cause**: Alpine containers don't have bonding kernel modules
-- **Status**: ✅ **FIXED** (by changing to static LAG)
-
-### Issue #3: Trunk vs Access Mode
-- **Symptoms**:
-  - No MAC learning on switch
-  - Port-Channel counters showed traffic but no unicast packets
-- **Root Cause**: Hosts send untagged traffic, switch expects tagged (trunk mode)
-- **Status**: ⏳ **NEEDS FIXING**
-
----
-
-## 🚀 Deployment Instructions
-
-### Quick Start (Recommended)
-1. Deploy with fixed spine configs:
-```bash
-cd ~/arista-evpn-vxlan-clab
-sudo containerlab deploy -t evpn-lab.clab.yml
-```
-
-2. Verify BGP is working:
-```bash
-ssh admin@clab-arista-evpn-fabric-spine1 "show bgp evpn summary"
-```
-
-3. Apply remaining fixes manually or wait for config updates
-
-### Complete Fix (When Ready)
-- Once Port-Channel and host configs are updated, redeploy topology for zero-downtime testing
-
----
-
-## 📊 Testing Results
-
-After applying spine `ip routing` fix:
-- ✅ BGP underlay sessions establish (eBGP between spine-leaf, iBGP between MLAG pairs)
-- ✅ BGP EVPN overlay sessions establish
-- ✅ MLAG pairs form correctly (active-full, up/up)
-- ✅ MAC addresses learned locally on leaf switches
-- ⏳ EVPN Type-2 routes advertised (pending overlay establishment)
-- ⏳ End-to-end connectivity (pending all fixes applied)
-
----
-
-## 💡 Key Learnings
-
-- The `ip routing` fix is **critical** and must be in the startup-config for clean deployments
-- Static LAG (`mode on`) is more reliable than LACP in containerized environments
-- Access mode port-channels work better with simple Linux containers
-- For production environments with proper bonding support, LACP can be re-enabled
-
----
-
-## 🔗 Related Issues
-
-- Spine BGP not starting: Missing `ip routing` command
-- MLAG port-channels not forming: LACP incompatibility
-- No MAC learning: Trunk vs Access mode mismatch
-- No VXLAN tunnel endpoints: Pending overlay establishment
-
----
-
-## ✅ Final Status
-
-**Spine Fixes**: COMPLETE ✅
-**MLAG Fixes**: COMPLETE ✅
-**Port-Channel Access Mode**: PENDING ⏳
-**Host Networking**: PENDING ⏳
-**EVPN Overlay**: TESTING ⏳