Document all critical fixes discovered during lab testing: - Spine routing: ip routing command added - MLAG: static LAG mode enabled - Pending: port-channel access mode, host networking Track status of each fix for deployment readiness.
5.0 KiB
Fixes Applied in Main Branch
This document tracks critical fixes that have been discovered and applied during lab deployment to ensure the EVPN-VXLAN fabric functions correctly.
✅ Fixes Applied to Main Branch
1. Spine Switches - Enable IP Routing ✅ FIXED
Problem: BGP was disabled on spine switches with error "BGP is disabled for VRF default" and "IP routing not enabled"
Fix: Added ip routing command to both spine configurations
configs/spine1.cfg- Added line:ip routing(beforeservice routing protocols model multi-agent)configs/spine2.cfg- Added line:ip routing(beforeservice routing protocols model multi-agent)
Impact: This enables BGP to function properly on spines, allowing:
- Underlay BGP IPv4 Unicast sessions to establish
- EVPN BGP sessions to establish
- Route exchange between spines and leafs
Status: ✅ APPLIED (commits applied to main branch)
2. Leaf Switches - MLAG Port-Channel Mode ✅ FIXED
Problem: LACP bonding (mode active) doesn't work properly in Alpine Linux containers due to lack of kernel module support
Fix: Changed from LACP to static LAG
- Changed
channel-group 1 mode activetochannel-group 1 mode onin all leaf configs - This creates a static LAG that works in containerized environments
Status: ✅ ALREADY APPLIED (pushed by user in previous commits)
⏳ Remaining Issues (Pending Application)
3. Leaf Switches - Port-Channel1 Switchport Mode ⏳ PENDING
Problem: Port-Channel configured as trunk, but Alpine containers send untagged traffic
Fix Needed: Change Port-Channel1 from trunk to access mode on all leafs:
interface Port-Channel1
switchport mode access
switchport access vlan 40 # or appropriate VLAN for each VTEP
Status: ⏳ NOT YET APPLIED - Needs manual configuration or config file updates
Affected Files:
configs/leaf1.cfgconfigs/leaf2.cfgconfigs/leaf3.cfgconfigs/leaf4.cfgconfigs/leaf5.cfgconfigs/leaf6.cfgconfigs/leaf7.cfgconfigs/leaf8.cfg
4. Host Configuration - Simplified Bonding ⏳ PENDING
Problem: Alpine Linux containers cannot properly configure 802.3ad LACP bonding
Fix Needed: Remove bonding complexity, use single interface:
host1:
exec:
- ip addr add 10.40.40.101/24 dev eth1
- ip link set eth1 up
Status: ⏳ NOT YET APPLIED - Topology file needs updating
📋 Summary of Issues Found
Issue #1: Missing ip routing on Spines
- Symptoms:
show ip bgp summaryreturned "BGP is disabled for VRF default"- Attempting to configure BGP showed "! IP routing not enabled"
- Root Cause: Arista EOS requires explicit
ip routingcommand to enable L3 functionality - Status: ✅ FIXED
Issue #2: LACP Bonding in Containers
- Symptoms:
- Port-Channel showing "waiting for LACP response"
- Host bond interface in DOWN state
- Root Cause: Alpine containers don't have bonding kernel modules
- Status: ✅ FIXED (by changing to static LAG)
Issue #3: Trunk vs Access Mode
- Symptoms:
- No MAC learning on switch
- Port-Channel counters showed traffic but no unicast packets
- Root Cause: Hosts send untagged traffic, switch expects tagged (trunk mode)
- Status: ⏳ NEEDS FIXING
🚀 Deployment Instructions
Quick Start (Recommended)
- Deploy with fixed spine configs:
cd ~/arista-evpn-vxlan-clab
sudo containerlab deploy -t evpn-lab.clab.yml
- Verify BGP is working:
ssh admin@clab-arista-evpn-fabric-spine1 "show bgp evpn summary"
- Apply remaining fixes manually or wait for config updates
Complete Fix (When Ready)
- Once Port-Channel and host configs are updated, redeploy topology for zero-downtime testing
📊 Testing Results
After applying spine ip routing fix:
- ✅ BGP underlay sessions establish (eBGP between spine-leaf, iBGP between MLAG pairs)
- ✅ BGP EVPN overlay sessions establish
- ✅ MLAG pairs form correctly (active-full, up/up)
- ✅ MAC addresses learned locally on leaf switches
- ⏳ EVPN Type-2 routes advertised (pending overlay establishment)
- ⏳ End-to-end connectivity (pending all fixes applied)
💡 Key Learnings
- The
ip routingfix is critical and must be in the startup-config for clean deployments - Static LAG (
mode on) is more reliable than LACP in containerized environments - Access mode port-channels work better with simple Linux containers
- For production environments with proper bonding support, LACP can be re-enabled
🔗 Related Issues
- Spine BGP not starting: Missing
ip routingcommand - MLAG port-channels not forming: LACP incompatibility
- No MAC learning: Trunk vs Access mode mismatch
- No VXLAN tunnel endpoints: Pending overlay establishment
✅ Final Status
Spine Fixes: COMPLETE ✅
MLAG Fixes: COMPLETE ✅
Port-Channel Access Mode: PENDING ⏳
Host Networking: PENDING ⏳
EVPN Overlay: TESTING ⏳