From 8cf952231ebf953c4dad681b566b135061302800 Mon Sep 17 00:00:00 2001
From: Damien
Date: Sun, 30 Nov 2025 10:21:00 +0000
Subject: [PATCH] Remove FIXES_APPLIED.md - info documented in issues

---
 FIXES_APPLIED.md | 184 -----------------------------------------------
 1 file changed, 184 deletions(-)
 delete mode 100644 FIXES_APPLIED.md

diff --git a/FIXES_APPLIED.md b/FIXES_APPLIED.md
deleted file mode 100644
index 3a3f14a..0000000
--- a/FIXES_APPLIED.md
+++ /dev/null
@@ -1,184 +0,0 @@
-# Fixes Applied in fix-bgp-and-mlag Branch
-
-This branch contains critical fixes discovered during lab testing to make the EVPN-VXLAN fabric functional.
-
-## 🔧 Fixes Applied
-
-### 1. **Spine Switches - Enable IP Routing**
-**Problem**: BGP was disabled on spine switches with error "BGP is disabled for VRF default" and "IP routing not enabled"
-
-**Fix**: Added `ip routing` command to both spine configurations
-- `configs/spine1.cfg` - Added line: `ip routing` (before `service routing protocols model multi-agent`)
-- `configs/spine2.cfg` - Added line: `ip routing` (before `service routing protocols model multi-agent`)
-
-**Impact**: This enables BGP to function properly on spines, allowing:
-- Underlay BGP IPv4 Unicast sessions to establish
-- EVPN BGP sessions to establish
-- Route exchange between spines and leafs
-
-### 2. **Leaf Switches - MLAG Port-Channel Mode**
-**Problem**: LACP bonding (`mode active`) doesn't work properly in Alpine Linux containers due to lack of kernel module support
-
-**Fix**: Changed from LACP to static LAG
-- Changed `channel-group 1 mode active` to `channel-group 1 mode on` in all leaf configs
-- This creates a static LAG that works in containerized environments
-
-**Status**: ✅ Already applied in main branch (pushed by user)
-
-### 3. **Leaf Switches - Port-Channel Switchport Mode**
-**Problem**: Port-Channel configured as trunk, but Alpine containers send untagged traffic
-
-**Fix Needed**: Change Port-Channel1 from trunk to access mode on all leafs:
-```
-interface Port-Channel1
-  switchport mode access
-  switchport access vlan 40  # or appropriate VLAN for each VTEP
-```
-
-**Status**: ⚠️ **NOT YET APPLIED** - Needs manual configuration or config file update
-
-### 4. **Host Configuration - Simplified Bonding**
-**Problem**: Alpine Linux containers cannot properly configure 802.3ad LACP bonding
-
-**Fix in topology**: Remove bonding complexity, use single interface:
-```yaml
-host1:
-  exec:
-    - ip addr add 10.40.40.101/24 dev eth1
-    - ip link set eth1 up
-```
-
-**Status**: ⚠️ **NOT YET APPLIED** - topology file not updated in this branch
-
-## 📋 Summary of Issues Found
-
-### Issue #1: Missing `ip routing` on Spines
-- **Symptoms**:
-  - `show ip bgp summary` returned "BGP is disabled for VRF default"
-  - Attempting to configure BGP showed "! IP routing not enabled"
-- **Root Cause**: Arista EOS requires explicit `ip routing` command to enable L3 functionality
-- **Status**: ✅ **FIXED**
-
-### Issue #2: LACP Bonding in Containers
-- **Symptoms**:
-  - Port-Channel showing "waiting for LACP response"
-  - Host bond interface in DOWN state
-- **Root Cause**: Alpine containers don't have bonding kernel modules
-- **Status**: ✅ **FIXED** (by changing to static LAG)
-
-### Issue #3: Trunk vs Access Mode
-- **Symptoms**:
-  - No MAC learning on switch
-  - Port-Channel counters showed traffic but no unicast packets
-- **Root Cause**: Hosts send untagged traffic, switch expects tagged (trunk mode)
-- **Status**: ⚠️ **NEEDS MANUAL FIX**
-
-## 🚀 Deployment Instructions
-
-### Option 1: Deploy with Manual Post-Configuration
-
-1. Deploy the lab:
-```bash
-cd ~/arista-evpn-vxlan-clab
-git checkout fix-bgp-and-mlag
-sudo containerlab deploy -t evpn-lab.clab.yml
-```
-
-2. Fix Port-Channel mode on all leafs (manual):
-```bash
-for leaf in leaf1 leaf2 leaf3 leaf4 leaf5 leaf6 leaf7 leaf8; do
-  ssh admin@clab-arista-evpn-fabric-$leaf << 'EOF'
-enable
-configure terminal
-interface Port-Channel1
-  switchport mode access
-  switchport access vlan 40
-write memory
-EOF
-done
-```
-
-3. Configure hosts (manual):
-```bash
-# Host1 (VLAN 40 - L2 VXLAN)
-docker exec clab-arista-evpn-fabric-host1 sh -c '
-ip link set bond0 down 2>/dev/null
-ip link del bond0 2>/dev/null
-ip addr flush dev eth1
-ip addr add 10.40.40.101/24 dev eth1
-ip link set eth1 up
-'
-
-# Host3 (VLAN 40 - L2 VXLAN)
-docker exec clab-arista-evpn-fabric-host3 sh -c '
-ip link set bond0 down 2>/dev/null
-ip link del bond0 2>/dev/null
-ip addr flush dev eth1
-ip addr add 10.40.40.103/24 dev eth1
-ip link set eth1 up
-'
-
-# Host2 (VRF gold - L3 VXLAN)
-docker exec clab-arista-evpn-fabric-host2 sh -c '
-ip link set bond0 down 2>/dev/null
-ip link del bond0 2>/dev/null
-ip addr flush dev eth1
-ip addr add 10.34.34.102/24 dev eth1
-ip link set eth1 up
-ip route add default via 10.34.34.1
-'
-
-# Host4 (VRF gold - L3 VXLAN)
-docker exec clab-arista-evpn-fabric-host4 sh -c '
-ip link set bond0 down 2>/dev/null
-ip link del bond0 2>/dev/null
-ip addr flush dev eth1
-ip addr add 10.78.78.104/24 dev eth1
-ip link set eth1 up
-ip route add default via 10.78.78.1
-'
-```
-
-4. Verify:
-```bash
-# Check BGP
-ssh admin@clab-arista-evpn-fabric-leaf1 "show bgp evpn summary"
-
-# Check VXLAN
-ssh admin@clab-arista-evpn-fabric-leaf1 "show vxlan vtep"
-
-# Test connectivity
-docker exec -it clab-arista-evpn-fabric-host1 ping -c 4 10.40.40.103
-docker exec -it clab-arista-evpn-fabric-host2 ping -c 4 10.78.78.104
-```
-
-### Option 2: Wait for Complete Fix
-
-A complete fix will require:
-1. ✅ Spine configs updated (DONE)
-2. ⏳ All leaf Port-Channel configs updated to access mode
-3. ⏳ Topology file updated to simplify host networking
-4. ⏳ README updated with correct testing procedures
-
-## 🧪 Testing Results
-
-After applying fixes manually:
-- ✅ BGP underlay sessions establish (eBGP between spine-leaf, iBGP between MLAG pairs)
-- ✅ BGP EVPN overlay sessions establish
-- ✅ MLAG pairs form correctly (active-full, up/up)
-- ✅ MAC addresses learned locally on leaf switches
-- ✅ EVPN Type-2 routes advertised (pending overlay establishment)
-- ⏳ End-to-end connectivity (requires all fixes applied)
-
-## 📝 Notes
-
-- The `ip routing` fix is critical and must be in the startup-config for clean deployments
-- Static LAG (`mode on`) is more reliable than LACP in containerized environments
-- Access mode port-channels work better with simple Alpine containers
-- For production environments with proper bonding support, LACP can be re-enabled
-
-## 🔗 Related Issues
-
-- Spine BGP not starting: Missing `ip routing` command
-- MLAG port-channels not forming: LACP bonding incompatibility
-- No MAC learning: Trunk vs access mode mismatch