arista-evpn-vxlan-clab/END_TO_END_TESTING.md
Damien 1080bf07bb Complete Lab Fixes - L2 and L3 VXLAN Fully Operational (#14)
## Summary

This PR merges all fixes and improvements from the troubleshooting journey to make the Arista EVPN-VXLAN lab fully operational with both L2 and L3 VXLAN connectivity.

## What's Changed

### 🎯 Major Achievements
- **L2 VXLAN fully operational** - host1 ↔ host3 connectivity verified
- **L3 VXLAN fully operational** - host2 ↔ host4 connectivity verified (VRF gold)
- **LACP bonding working** - dual-homed hosts with proper Port-Channel negotiation
- **All BGP/EVPN sessions established** - complete underlay and overlay working

### 🔧 Infrastructure Fixes

#### BGP & Routing
- Added `ip routing` command to all spine and leaf switches
- Fixed duplicate BGP network statements on leaf3, leaf4, leaf7, leaf8
- Activated EVPN neighbors on spine switches
- Added loopback network advertisements to BGP

#### MLAG Configuration
- Configured MLAG peer-link in trunk mode (not access) for VLAN 4090/4091
- Added dual-active detection via management interface
- Configured virtual router MAC for MLAG pairs

#### Switch Port Configuration
- Port-Channel1 configured in **trunk mode** on all leaf switches
- Added `switchport trunk allowed vlan` for host VLANs (34, 40, 78)
- Removed `no shutdown` from Port-Channel interfaces

### 🖥️ Host Networking - Complete Redesign

#### Image Change
- **Old:** `alpine:latest` (had bonding syntax issues)
- **New:** `ghcr.io/hellt/network-multitool` (networking tools pre-installed)

#### LACP Bonding Configuration
Proper LACP setup following network-multitool best practices:
```yaml
- ip link add bond0 type bond mode 802.3ad
- ip link set dev bond0 type bond xmit_hash_policy layer3+4
- ip link set dev eth1 down
- ip link set dev eth2 down
- ip link set eth1 master bond0
- ip link set eth2 master bond0
- ip link set dev eth1 up
- ip link set dev eth2 up
- ip link set dev bond0 type bond lacp_rate fast
- ip link set dev bond0 up
```
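
To confirm that the bond really came up in LACP mode, the kernel's bonding status file can be inspected on a host. The following is a minimal sketch, not part of the lab: `bond_is_lacp` is a hypothetical helper, and it assumes the usual `/proc/net/bonding/<bond>` layout, where the first `MII Status` line describes the bond itself.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the lab): succeed only if the bonding
# status file reports 802.3ad mode and the bond's own MII status is up.
# Assumes the standard Linux /proc/net/bonding layout.
bond_is_lacp() {
  local status_file="${1:-/proc/net/bonding/bond0}"
  grep -q '^Bonding Mode: IEEE 802.3ad' "$status_file" || return 1
  # The first "MII Status" line belongs to the bond; later ones to its members.
  [ "$(grep -m1 '^MII Status:' "$status_file")" = "MII Status: up" ]
}
```

One possible use is to copy the status out of a host first, for example `docker exec clab-arista-evpn-fabric-host1 cat /proc/net/bonding/bond0 > /tmp/bond0.txt`, and then run `bond_is_lacp /tmp/bond0.txt`.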

#### VLAN Configuration
- **L2 VXLAN hosts (host1, host3):** VLAN 40 tagged on bond0
- **L3 VXLAN hosts (host2, host4):** VLANs 34 and 78 tagged on bond0

#### Routing Strategy
- Kept management default route (172.16.0.254 via eth0)
- Added **specific routes** for L3 VXLAN networks instead of default routes:
  - host2: `ip route add 10.78.78.0/24 via 10.34.34.1`
  - host4: `ip route add 10.34.34.0/24 via 10.78.78.1`
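
As an illustration of how the VLAN tagging and the specific route fit together on an L3 VXLAN host, the sketch below prints (without running) a plausible command sequence. `l3_host_cmds` is a hypothetical helper, and the exact `exec` lines in `evpn-lab.clab.yml` may differ; the addresses and VLAN IDs are the ones listed in this description.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the iproute2 commands for one L3 VXLAN host.
# Nothing is executed; the output is meant for review or for pasting into
# the host's exec list. The command sequence itself is an assumption.
l3_host_cmds() {
  local vlan="$1" host_cidr="$2" remote_prefix="$3" gateway="$4"
  cat <<EOF
ip link add link bond0 name bond0.${vlan} type vlan id ${vlan}
ip addr add ${host_cidr} dev bond0.${vlan}
ip link set dev bond0.${vlan} up
ip route add ${remote_prefix} via ${gateway}
EOF
}

# host2: VLAN 34, reaches host4's subnet through the VLAN 34 gateway
l3_host_cmds 34 10.34.34.102/24 10.78.78.0/24 10.34.34.1
```

Because only the remote subnet is routed via the fabric gateway, the management default route on eth0 is left untouched.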

### 📁 Files Changed

#### Switch Configurations (Updated)
- `configs/spine1.cfg` - Added ip routing, EVPN activation
- `configs/spine2.cfg` - Added ip routing, EVPN activation
- `configs/leaf1.cfg` - Port-Channel trunk mode, VLAN config
- `configs/leaf2.cfg` - Port-Channel trunk mode, VLAN config
- `configs/leaf3.cfg` - Added ip routing, loopback ads, Port-Channel config
- `configs/leaf4.cfg` - Added ip routing, loopback ads, Port-Channel config
- `configs/leaf5.cfg` - Port-Channel trunk mode, VLAN config
- `configs/leaf6.cfg` - Port-Channel trunk mode, VLAN config
- `configs/leaf7.cfg` - Added ip routing, loopback ads, Port-Channel config
- `configs/leaf8.cfg` - Added ip routing, loopback ads, Port-Channel config

#### Topology (Updated)
- `evpn-lab.clab.yml` - Updated all host configurations with network-multitool image and proper LACP/VLAN setup

#### Documentation (New)
- `hosts/README.md` - Host interface configuration guide
- `hosts/host1_interfaces` - Interface file for host1 (not currently used, kept for reference)
- `hosts/host2_interfaces` - Interface file for host2 (not currently used, kept for reference)
- `hosts/host3_interfaces` - Interface file for host3 (not currently used, kept for reference)
- `hosts/host4_interfaces` - Interface file for host4 (not currently used, kept for reference)

## Testing & Verification

### L2 VXLAN (VLAN 40)
```
host1 (10.40.40.101) → host3 (10.40.40.103)
- Connectivity: VERIFIED ✓
- VXLAN tunnel: VTEP1 ↔ VTEP3
- MAC learning: Working via EVPN Type-2
```

### L3 VXLAN (VRF gold)
```
host2 (10.34.34.102) → host4 (10.78.78.104)
- Connectivity: VERIFIED ✓
- Ping results: 0% packet loss, TTL=62
- Routing: Via EVPN Type-5 through fabric
```
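
The TTL=62 in the ping results is consistent with two routed hops between the hosts, assuming host4 sends its replies with an initial TTL of 64 (the Linux default): 64 - 2 = 62. A small sketch of that arithmetic, using a hypothetical `routed_hops` helper that reads ping output on stdin:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print how many routed hops the first ping reply
# crossed. Assumes the replying host started from an initial TTL of 64.
routed_hops() {
  local ttl
  ttl=$(grep -o -i -m1 'ttl=[0-9]*' | cut -d= -f2)
  [ -n "$ttl" ] || return 1
  echo $((64 - ttl))
}
```

For example, `docker exec clab-arista-evpn-fabric-host2 ping -c 1 10.78.78.104 | routed_hops` would print the hop count for the L3 VXLAN path.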

### Infrastructure Status
- BGP Underlay: All sessions ESTAB
- EVPN Overlay: All neighbors ESTAB
- MLAG: All 4 pairs operational
- Port-Channels: LACP negotiated on all hosts

## Related Issues

Fixes #1 - Lab deployment and configuration fixes
Fixes #2 - BGP EVPN neighbors stuck in Connect state
Fixes #3 - Ready for deployment with EVPN activation
Fixes #4 - Lab convergence in progress
Fixes #5 - BGP EVPN neighbors stuck in Active state
Fixes #11 - Host LACP bonding configuration
Fixes #13 - L3 VXLAN default route issue

## Key Technical Learnings

1. **Arista EOS requires explicit `ip routing`** before BGP can function
2. **MLAG peer-link must be trunk mode** to allow VLAN 4090/4091 traversal
3. **VLAN tagging location matters** - hosts tag, switches use trunk mode
4. **network-multitool image** superior to Alpine for LACP bonding
5. **Specific routes better than default routes** when management network present
6. **LACP rate fast** ensures quick negotiation with Arista switches

## Deployment

After merging, deploy with:
```bash
cd ~/arista-evpn-vxlan-clab
sudo containerlab destroy -t evpn-lab.clab.yml --cleanup
sudo containerlab deploy -t evpn-lab.clab.yml
```

No manual post-deployment configuration needed - everything works from initial deployment!

## Breaking Changes

⚠️ **Host image changed** from `alpine:latest` to `ghcr.io/hellt/network-multitool`
⚠️ **Host configuration completely redesigned** - old exec commands replaced

## Reviewers

@Damien - Please review and merge when ready

---

**This PR represents the complete troubleshooting journey and brings the lab to production-ready status with full L2 and L3 VXLAN functionality.** 🚀

Reviewed-on: #14
Co-authored-by: Damien <damien@arnodo.fr>
Co-committed-by: Damien <damien@arnodo.fr>
2025-11-30 10:24:29 +00:00


End-to-End Connectivity Testing Guide

Overview

This document provides a step-by-step guide to test the EVPN VXLAN fabric after deploying the updated topology with proper VLAN tagging on hosts.

Recent Changes

Fixed Issues

  1. Host VLAN Tagging

    • Hosts now create VLAN subinterfaces on top of bonded interfaces
    • Host1 & Host3: VLAN 40 tagged (L2 VXLAN test)
    • Host2: VLAN 34 tagged (L3 VXLAN test)
    • Host4: VLAN 78 tagged (L3 VXLAN test)
  2. Leaf Port-Channel Configuration

    • All leaf Port-Channel1 interfaces are in trunk mode, so the tagged frames sent by the hosts are accepted
    • Allowed VLANs are limited to the host VLANs (34, 40, 78)
    • MLAG enabled for dual-active forwarding

Pre-Test Verification

1. Check MLAG Status on All Leaf Pairs

# Leaf Pair 1 (leaf1 & leaf2)
ssh admin@clab-arista-evpn-fabric-leaf1 "show mlag detail"
ssh admin@clab-arista-evpn-fabric-leaf2 "show mlag detail"

# Leaf Pair 2 (leaf3 & leaf4)
ssh admin@clab-arista-evpn-fabric-leaf3 "show mlag detail"
ssh admin@clab-arista-evpn-fabric-leaf4 "show mlag detail"

# Leaf Pair 3 (leaf5 & leaf6)
ssh admin@clab-arista-evpn-fabric-leaf5 "show mlag detail"
ssh admin@clab-arista-evpn-fabric-leaf6 "show mlag detail"

# Leaf Pair 4 (leaf7 & leaf8)
ssh admin@clab-arista-evpn-fabric-leaf7 "show mlag detail"
ssh admin@clab-arista-evpn-fabric-leaf8 "show mlag detail"
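
The eight commands above can also be expressed as a loop. This is a minimal sketch; `run_on_leaves` and the `SSH_CMD` override are hypothetical names introduced here, not part of the lab.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run one EOS show command on leaf1..leaf8.
# SSH_CMD can be overridden (for example SSH_CMD=echo) for a dry run.
run_on_leaves() {
  local cmd="$1" i
  for i in 1 2 3 4 5 6 7 8; do
    echo "--- leaf${i} ---"
    ${SSH_CMD:-ssh} "admin@clab-arista-evpn-fabric-leaf${i}" "$cmd"
  done
}
```

Calling `run_on_leaves "show mlag detail"` would then print the MLAG state of every leaf under a separator line per switch.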

2. Check BGP Underlay Status

# On Spines
ssh admin@clab-arista-evpn-fabric-spine1 "show bgp ipv4 unicast summary"
ssh admin@clab-arista-evpn-fabric-spine2 "show bgp ipv4 unicast summary"

# Expected: All leaf neighbors should be in the Estab (established) state

3. Check BGP EVPN Status

# On any leaf
ssh admin@clab-arista-evpn-fabric-leaf1 "show bgp evpn summary"

# Expected: Both spine neighbors should be in the Estab (established) state
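
Rather than reading the summaries by eye, the session state can be checked mechanically. The sketch below uses a hypothetical `bgp_all_estab` helper and assumes that each neighbor row of the summary starts with an IPv4 address and that established sessions contain the text `Estab`; the exact column layout can vary between EOS releases.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read "show bgp ... summary" output on stdin, print
# every neighbor whose row lacks "Estab", and fail if any such neighbor
# exists or if no neighbor rows were found at all.
# Assumes neighbor rows begin with an IPv4 address.
bgp_all_estab() {
  awk '
    $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
      seen++
      if ($0 !~ /Estab/) { print "not established: " $1; bad = 1 }
    }
    END { if (bad || seen == 0) exit 1 }
  '
}
```

A possible invocation is `ssh admin@clab-arista-evpn-fabric-spine1 "show bgp ipv4 unicast summary" | bgp_all_estab`, which exits non-zero when a session is down.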

L2 VXLAN Testing (VLAN 40)

Hosts Involved

  • Host1 (10.40.40.101) - Connected to Leaf1/Leaf2 (VTEP1)
  • Host3 (10.40.40.103) - Connected to Leaf5/Leaf6 (VTEP3)

Test Sequence

Step 1: Verify Host Network Interfaces

# Check host1 VLAN interface
docker exec clab-arista-evpn-fabric-host1 ip -d link show bond0.40
docker exec clab-arista-evpn-fabric-host1 ip addr show bond0.40

# Check host3 VLAN interface
docker exec clab-arista-evpn-fabric-host3 ip -d link show bond0.40
docker exec clab-arista-evpn-fabric-host3 ip addr show bond0.40

Step 2: Verify Leaf Port-Channel Configuration

# Leaf1 Port-Channel1
ssh admin@clab-arista-evpn-fabric-leaf1 "show interface Port-Channel1 switchport"

# Expected output (Port-Channel1 is a trunk, since the hosts send tagged frames):
# - Operational mode is trunk
# - VLAN 40 is among the trunking VLANs
# Spanning Tree Portfast: enabled

Step 3: Test L2 Connectivity (Ping Test)

echo "=== L2 VXLAN Ping Test (Host1 → Host3) ==="
timeout 10 docker exec clab-arista-evpn-fabric-host1 ping -c 4 10.40.40.103

Step 4: Verify MAC Learning

# On Leaf1 - check local MAC learning
ssh admin@clab-arista-evpn-fabric-leaf1 "show mac address-table vlan 40"

# Expected: MAC from host1 should appear on Port-Channel1

# On Leaf5 - check MAC learning
ssh admin@clab-arista-evpn-fabric-leaf5 "show mac address-table vlan 40"

# Expected: MAC from host3 should appear on Port-Channel1
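
Linux prints MAC addresses as `aa:bb:cc:dd:ee:ff`, while EOS prints them as `aabb.ccdd.eeff`, so a host MAC has to be converted before searching for it in the leaf output. A minimal sketch, with `mac_to_eos` as a hypothetical helper name:

```bash
#!/usr/bin/env bash
# Hypothetical helper: convert a Linux-style MAC (aa:bb:cc:dd:ee:ff) into
# the dotted form that EOS prints (aabb.ccdd.eeff).
mac_to_eos() {
  local hex
  hex=$(printf '%s' "$1" | tr -d ':' | tr 'A-F' 'a-f')
  [ "${#hex}" -eq 12 ] || return 1
  echo "${hex:0:4}.${hex:4:4}.${hex:8:4}"
}
```

The host MAC can be read with `docker exec clab-arista-evpn-fabric-host1 cat /sys/class/net/bond0.40/address`, converted with `mac_to_eos`, and then searched for in the `show mac address-table vlan 40` output with `grep -F`.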

Step 5: Verify VXLAN Learning

# Check remote VXLAN endpoints
ssh admin@clab-arista-evpn-fabric-leaf1 "show vxlan vtep"

# Expected: Should show VTEP3 (10.0.255.13)

# Check VXLAN address table
ssh admin@clab-arista-evpn-fabric-leaf1 "show vxlan address-table"

# Expected: Should show MACs learned via Vxlan1 interface

Step 6: Verify EVPN Type-2 Routes

# Check BGP EVPN routes on Leaf1
ssh admin@clab-arista-evpn-fabric-leaf1 "show bgp evpn route-type mac-ip"

# Expected:
# - Local MAC (host1) with RD 65001:110040
# - Remote MAC (host3) with RD 65003:110040 pointing to VTEP 10.0.255.13

L3 VXLAN Testing (VRF gold)

Hosts Involved

  • Host2 (10.34.34.102) - Connected to Leaf3/Leaf4 (VTEP2) in VRF gold VLAN 34
  • Host4 (10.78.78.104) - Connected to Leaf7/Leaf8 (VTEP4) in VRF gold VLAN 78

Test Sequence

Step 1: Verify Host Network Interfaces

# Check host2 VLAN interface
docker exec clab-arista-evpn-fabric-host2 ip -d link show bond0.34
docker exec clab-arista-evpn-fabric-host2 ip addr show bond0.34

# Check host4 VLAN interface
docker exec clab-arista-evpn-fabric-host4 ip -d link show bond0.78
docker exec clab-arista-evpn-fabric-host4 ip addr show bond0.78

Step 2: Verify Leaf VRF VLAN Configuration

# On Leaf3
ssh admin@clab-arista-evpn-fabric-leaf3 "show vlan 34"
ssh admin@clab-arista-evpn-fabric-leaf3 "show interface Vlan34"

# Expected:
# - VLAN 34 exists
# - Vlan34 interface is in VRF gold with IP 10.34.34.2/24
# - Virtual router address 10.34.34.1 is configured

Step 3: Test L3 Connectivity (Ping Test)

echo "=== L3 VXLAN Ping Test (Host2 → Host4) ==="
timeout 10 docker exec clab-arista-evpn-fabric-host2 ping -c 4 10.78.78.104

Step 4: Verify VRF Routing Tables

# On Leaf3 - check routes in VRF gold
ssh admin@clab-arista-evpn-fabric-leaf3 "show ip route vrf gold"

# Expected: Should include routes to 10.34.34.0/24 and 10.78.78.0/24

# On Leaf4
ssh admin@clab-arista-evpn-fabric-leaf4 "show ip route vrf gold"
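
To check the expected prefixes without scanning the table manually, the route output can be piped through a small filter. `has_prefixes` below is a hypothetical helper that does a plain text match on each prefix, so it does not distinguish connected routes from EVPN-learned ones.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read command output on stdin and fail if any of the
# prefixes given as arguments does not appear in it (fixed-string match).
has_prefixes() {
  local output prefix
  output=$(cat)
  for prefix in "$@"; do
    printf '%s\n' "$output" | grep -qF -- "$prefix" || {
      echo "missing: $prefix"
      return 1
    }
  done
}
```

For instance: `ssh admin@clab-arista-evpn-fabric-leaf3 "show ip route vrf gold" | has_prefixes 10.34.34.0/24 10.78.78.0/24`.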

Step 5: Verify EVPN Type-5 Routes

# Check BGP EVPN routes on Leaf3
ssh admin@clab-arista-evpn-fabric-leaf3 "show bgp evpn route-type ip-prefix ipv4"

# Expected:
# - Local subnets (10.34.34.0/24 from Leaf3/Leaf4)
# - Remote subnets (10.78.78.0/24 from Leaf7/Leaf8)

Complete End-to-End Test Script

#!/bin/bash

echo "======================================"
echo "EVPN VXLAN Fabric Testing"
echo "======================================"

# 1. Underlay connectivity
echo ""
echo "=== Testing Underlay BGP ==="
ssh admin@clab-arista-evpn-fabric-spine1 "show bgp ipv4 unicast summary" | tail -20

# 2. EVPN overlay connectivity
echo ""
echo "=== Testing EVPN Overlay ==="
ssh admin@clab-arista-evpn-fabric-leaf1 "show bgp evpn summary" | tail -5

# 3. L2 VXLAN connectivity
echo ""
echo "=== Testing L2 VXLAN (Host1 → Host3) ==="
timeout 10 docker exec clab-arista-evpn-fabric-host1 ping -c 4 10.40.40.103
echo "Status: $?"

# 4. L3 VXLAN connectivity
echo ""
echo "=== Testing L3 VXLAN (Host2 → Host4) ==="
timeout 10 docker exec clab-arista-evpn-fabric-host2 ping -c 4 10.78.78.104
echo "Status: $?"

# 5. MAC learning verification
echo ""
echo "=== Verifying MAC Learning ==="
echo "Leaf1 VLAN 40:"
ssh admin@clab-arista-evpn-fabric-leaf1 "show mac address-table vlan 40"
echo ""
echo "Leaf5 VLAN 40:"
ssh admin@clab-arista-evpn-fabric-leaf5 "show mac address-table vlan 40"

# 6. VRF routing verification
echo ""
echo "=== Verifying VRF Routing ==="
echo "Leaf3 VRF gold routes:"
ssh admin@clab-arista-evpn-fabric-leaf3 "show ip route vrf gold"
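
The script above only prints each exit status. If a single pass/fail result is wanted (for example in CI), one option is to wrap the ping tests in a small tally function. This is a sketch of an optional extension; `run_check` and `failures` are hypothetical names, not part of the original script.

```bash
#!/usr/bin/env bash
# Hypothetical extension: run a named check, print PASS or FAIL, and count
# failures so the calling script can exit non-zero if anything failed.
failures=0
run_check() {
  local name="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $name"
  else
    echo "FAIL: $name"
    failures=$((failures + 1))
  fi
}
```

A call such as `run_check "L2 VXLAN ping" timeout 10 docker exec clab-arista-evpn-fabric-host1 ping -c 4 10.40.40.103`, followed by `exit "$failures"` at the end of the script, would make the overall result visible to the caller.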

Troubleshooting

Ping fails - Hosts can't reach each other

  1. Check host connectivity to leaf:

    docker exec clab-arista-evpn-fabric-host1 ip route
    # Should show the connected VLAN 40 subnet on bond0.40; the default route stays on the management interface (eth0)
    
    docker exec clab-arista-evpn-fabric-host1 ping -c 2 10.40.40.1
    # Should reach the virtual router gateway
    
  2. Check leaf port-channel status:

    ssh admin@clab-arista-evpn-fabric-leaf1 "show interface Port-Channel1"
    # Should show "up, up"
    
  3. Check VXLAN interface status:

    ssh admin@clab-arista-evpn-fabric-leaf1 "show interface Vxlan1"
    # Should show "up, up"
    
  4. Check MLAG status:

    ssh admin@clab-arista-evpn-fabric-leaf1 "show mlag detail"
    # Should show state "Active" with negotiation status "Connected"
    

Empty MAC table on leaves

  1. Verify host is sending traffic:

    docker exec clab-arista-evpn-fabric-host1 ping -c 4 10.40.40.1
    # Generate some ARP/ICMP traffic
    
  2. Check for spanning-tree blocking:

    ssh admin@clab-arista-evpn-fabric-leaf1 "show spanning-tree detail vlan 40"
    

No EVPN routes exchanged

  1. Check BGP EVPN session state:

    ssh admin@clab-arista-evpn-fabric-leaf1 "show bgp evpn summary"
    # Must show Estab (established), not Connect or Active
    
  2. Check EVPN configuration:

    ssh admin@clab-arista-evpn-fabric-leaf1 "show bgp evpn"
    # Look for rd and route-target configuration
    

Expected Results

| Test | Expected Outcome | Status |
|------|------------------|--------|
| Spine BGP | All leaves established | ✓ Expected |
| Leaf BGP | All spines established | ✓ Expected |
| EVPN neighbors | Established with spines | ✓ Expected |
| L2 ping (Host1→Host3) | 4/4 packets successful | ✓ Expected |
| L3 ping (Host2→Host4) | 4/4 packets successful | ✓ Expected |
| MAC learning | MACs learned on Vxlan1 | ✓ Expected |
| EVPN Type-2 | Routes learned for MACs | ✓ Expected |
| EVPN Type-5 | Routes learned for subnets | ✓ Expected |

Lab Deployment Steps

To deploy the lab with the fixes:

cd ~/arista-evpn-vxlan-clab
git checkout fix-bgp-and-mlag
sudo containerlab destroy -t evpn-lab.clab.yml
sudo containerlab deploy -t evpn-lab.clab.yml

The lab should now have:

  • Proper VLAN tagging on all hosts
  • Correct VXLAN VTEP configuration
  • Working BGP EVPN overlay
  • End-to-end connectivity between remote VTEPs