arista-evpn-vxlan-clab/FIXES_APPLIED.md
Damien Arnodo 783e12cea6 Add FIXES_APPLIED.md to main branch
Document all critical fixes discovered during lab testing:
- Spine routing: ip routing command added
- MLAG: static LAG mode enabled
- Pending: port-channel access mode, host networking

Track status of each fix for deployment readiness.
2025-11-28 09:27:56 +00:00


Fixes Applied in Main Branch

This document tracks critical fixes that have been discovered and applied during lab deployment to ensure the EVPN-VXLAN fabric functions correctly.

Fixes Applied to Main Branch

1. Spine Switches - Enable IP Routing (FIXED)

Problem: BGP did not start on the spine switches. show ip bgp summary returned "BGP is disabled for VRF default", and attempting to configure BGP printed "! IP routing not enabled".

Fix: Added ip routing command to both spine configurations

  • configs/spine1.cfg - Added line: ip routing (before service routing protocols model multi-agent)
  • configs/spine2.cfg - Added line: ip routing (before service routing protocols model multi-agent)

Impact: This enables BGP to function properly on spines, allowing:

  • Underlay BGP IPv4 Unicast sessions to establish
  • EVPN BGP sessions to establish
  • Route exchange between spines and leafs

Status: APPLIED (committed to main branch)
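
The ordering described above can be checked mechanically before deploying. The following is a minimal Python sketch, not part of the repository: the function name `ip_routing_precedes_multi_agent` is hypothetical, and it assumes both commands appear as unindented, top-level lines in the config file.

```python
def ip_routing_precedes_multi_agent(config_text: str) -> bool:
    """Return True if a top-level 'ip routing' line appears before
    'service routing protocols model multi-agent' in the config text."""
    # Strip trailing whitespace only, so indented (stanza-level) lines
    # and variants such as 'no ip routing' never match.
    lines = [line.rstrip() for line in config_text.splitlines()]
    try:
        routing_idx = lines.index("ip routing")
        agent_idx = lines.index("service routing protocols model multi-agent")
    except ValueError:
        # At least one of the two commands is missing entirely.
        return False
    return routing_idx < agent_idx
```

One way to use it would be to read configs/spine1.cfg and configs/spine2.cfg as text and pass each to the function, treating a False result as a reason not to deploy.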


2. Leaf Switches - MLAG Port-Channel Mode (FIXED)

Problem: LACP bonding (mode active) doesn't work properly in Alpine Linux containers due to lack of kernel module support

Fix: Changed from LACP to static LAG

  • Changed channel-group 1 mode active to channel-group 1 mode on in all leaf configs
  • This creates a static LAG that works in containerized environments

Status: ALREADY APPLIED (pushed by user in previous commits)
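
For reference, the same substitution can be expressed as a small text transformation. This is an illustrative Python sketch under stated assumptions: the helper name `lacp_to_static` is hypothetical, and it only rewrites lines of the exact form `channel-group N mode active`, leaving everything else in the config untouched.

```python
import re

# Matches e.g. "   channel-group 1 mode active"; group 1 keeps the indent,
# the group number and the "mode " keyword so only "active" is replaced.
_LACP_ACTIVE = re.compile(
    r"^([ \t]*channel-group[ \t]+\d+[ \t]+mode[ \t]+)active[ \t]*$",
    re.MULTILINE,
)


def lacp_to_static(config_text: str) -> str:
    """Rewrite every 'channel-group N mode active' line to 'mode on'."""
    return _LACP_ACTIVE.sub(r"\1on", config_text)
```

Running the function on a config that already uses `mode on` returns the text unchanged, so it is safe to apply more than once.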


Remaining Issues (Pending Application)

3. Leaf Switches - Port-Channel1 Switchport Mode (PENDING)

Problem: Port-Channel1 is configured as a trunk, but the Alpine containers send untagged traffic

Fix Needed: Change Port-Channel1 from trunk to access mode on all leafs:

interface Port-Channel1
   switchport mode access
   ! use the appropriate VLAN for each VTEP
   switchport access vlan 40

Status: NOT YET APPLIED - Needs manual configuration or config file updates

Affected Files:

  • configs/leaf1.cfg
  • configs/leaf2.cfg
  • configs/leaf3.cfg
  • configs/leaf4.cfg
  • configs/leaf5.cfg
  • configs/leaf6.cfg
  • configs/leaf7.cfg
  • configs/leaf8.cfg
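
Until the leaf configs are updated, a quick scan can show which of the files listed above still carry the trunk setting. The sketch below is a hypothetical Python helper (`port_channel_mode` is not part of the repository) and assumes the usual EOS layout in which an interface stanza starts with an unindented `interface ...` line followed by indented commands.

```python
def port_channel_mode(config_text: str, interface: str = "Port-Channel1"):
    """Return the 'switchport mode' value configured under the given
    interface stanza, or None if the stanza or the command is absent."""
    in_stanza = False
    for raw in config_text.splitlines():
        if not raw.startswith((" ", "\t")):
            # An unindented line starts a new top-level stanza.
            in_stanza = raw.strip() == f"interface {interface}"
            continue
        if in_stanza:
            words = raw.split()
            if words[:2] == ["switchport", "mode"] and len(words) >= 3:
                return words[2]
    return None
```

A return value of "trunk" marks a leaf that still needs the fix. None only means that no explicit mode line was found under the stanza, so it should be checked by hand rather than treated as a pass.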

4. Host Configuration - Simplified Bonding (PENDING)

Problem: Alpine Linux containers cannot properly configure 802.3ad LACP bonding

Fix Needed: Remove bonding complexity, use single interface:

host1:
  exec:
    - ip addr add 10.40.40.101/24 dev eth1
    - ip link set eth1 up

Status: NOT YET APPLIED - Topology file needs updating
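
When the topology file is updated, the two exec lines could be generated per host instead of typed by hand. This is a small Python sketch under assumptions: the function name `host_exec_commands` is hypothetical, and only the host1 address shown above comes from this document; addresses for other hosts would have to be supplied by the lab owner.

```python
import ipaddress


def host_exec_commands(address: str, dev: str = "eth1") -> list:
    """Build the two 'exec' commands for a single-interface host.

    Raises ValueError if 'address' is not a valid IP/prefix string.
    """
    # ip_interface() validates strings such as "10.40.40.101/24";
    # an address given without a prefix is treated as a /32.
    iface = ipaddress.ip_interface(address)
    return [
        f"ip addr add {iface.with_prefixlen} dev {dev}",
        f"ip link set {dev} up",
    ]
```

For the host1 address above, the function returns the same two commands as the snippet, in the same order.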


📋 Summary of Issues Found

Issue #1: Missing ip routing on Spines

  • Symptoms:
    • show ip bgp summary returned "BGP is disabled for VRF default"
    • Attempting to configure BGP showed "! IP routing not enabled"
  • Root Cause: Arista EOS requires an explicit ip routing command to enable L3 functionality
  • Status: FIXED

Issue #2: LACP Bonding in Containers

  • Symptoms:
    • Port-Channel showing "waiting for LACP response"
    • Host bond interface in DOWN state
  • Root Cause: Alpine containers don't have bonding kernel modules
  • Status: FIXED (by changing to static LAG)

Issue #3: Trunk vs Access Mode

  • Symptoms:
    • No MAC learning on switch
    • Port-Channel counters showed traffic but no unicast packets
  • Root Cause: Hosts send untagged traffic, but the switch port expects tagged frames (trunk mode)
  • Status: NEEDS FIXING

🚀 Deployment Instructions

  1. Deploy with the fixed spine configs:
     cd ~/arista-evpn-vxlan-clab
     sudo containerlab deploy -t evpn-lab.clab.yml
  2. Verify BGP is working:
     ssh admin@clab-arista-evpn-fabric-spine1 "show bgp evpn summary"
  3. Apply remaining fixes manually or wait for config updates
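
The verification step can be repeated for each spine with a short script. The Python sketch below is illustrative only: `verify_command` and `bgp_enabled` are hypothetical helpers, the node name prefix is taken from the ssh command above, and it assumes a spine without ip routing answers show bgp evpn summary with the same "BGP is disabled for VRF default" message reported for show ip bgp summary.

```python
ERROR_MARKER = "BGP is disabled for VRF default"


def verify_command(node: str, lab_prefix: str = "clab-arista-evpn-fabric") -> list:
    """Build the ssh argv for running 'show bgp evpn summary' on one node."""
    return ["ssh", f"admin@{lab_prefix}-{node}", "show bgp evpn summary"]


def bgp_enabled(cli_output: str) -> bool:
    """Return False if the CLI output carries the 'BGP is disabled' error."""
    return ERROR_MARKER not in cli_output
```

The argv list can be handed to `subprocess.run(..., capture_output=True, text=True)` and the captured stdout passed to `bgp_enabled`. The check only detects the known error string; it does not confirm that sessions are established.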

Complete Fix (When Ready)

  • Once the Port-Channel and host configs are updated, redeploy the topology and test the fabric from a clean start

📊 Testing Results

After applying spine ip routing fix:

  • BGP underlay sessions establish (eBGP between spine-leaf, iBGP between MLAG pairs)
  • BGP EVPN overlay sessions establish
  • MLAG pairs form correctly (active-full, up/up)
  • MAC addresses learned locally on leaf switches
  • EVPN Type-2 routes advertised (pending overlay establishment)
  • End-to-end connectivity (pending all fixes applied)

💡 Key Learnings

  • The ip routing fix is critical and must be in the startup-config for clean deployments
  • Static LAG (mode on) is more reliable than LACP in containerized environments
  • Access mode port-channels work better with simple Linux containers
  • For production environments with proper bonding support, LACP can be re-enabled

Symptoms and their root causes:

  • Spine BGP not starting: Missing ip routing command
  • MLAG port-channels not forming: LACP incompatibility
  • No MAC learning: Trunk vs Access mode mismatch
  • No VXLAN tunnel endpoints: Pending overlay establishment

Final Status

  • Spine Fixes: COMPLETE
  • MLAG Fixes: COMPLETE
  • Port-Channel Access Mode: PENDING
  • Host Networking: PENDING
  • EVPN Overlay: TESTING