Complete Lab Fixes - L2 and L3 VXLAN Fully Operational (#14)

## Summary

This PR merges all fixes and improvements from the troubleshooting journey to make the Arista EVPN-VXLAN lab fully operational with both L2 and L3 VXLAN connectivity.

## What's Changed

### 🎯 Major Achievements
-  **L2 VXLAN fully operational** - host1 ↔ host3 connectivity verified
-  **L3 VXLAN fully operational** - host2 ↔ host4 connectivity verified (VRF gold)
-  **LACP bonding working** - dual-homed hosts with proper Port-Channel negotiation
-  **All BGP/EVPN sessions established** - complete underlay and overlay working

### 🔧 Infrastructure Fixes

#### BGP & Routing
- Added `ip routing` command to all spine and leaf switches
- Fixed duplicate BGP network statements on leaf3, leaf4, leaf7, leaf8
- Activated EVPN neighbors on spine switches
- Added loopback network advertisements to BGP

#### MLAG Configuration
- Configured MLAG peer-link in trunk mode (not access) for VLAN 4090/4091
- Added dual-active detection via management interface
- Configured virtual router MAC for MLAG pairs

#### Switch Port Configuration
- Port-Channel1 configured in **trunk mode** on all leaf switches
- Added `switchport trunk allowed vlan` for host VLANs (34, 40, 78)
- Removed `no shutdown` from Port-Channel interfaces

### 🖥️ Host Networking - Complete Redesign

#### Image Change
- **Old:** `alpine:latest` (had bonding syntax issues)
- **New:** `ghcr.io/hellt/network-multitool` (networking tools pre-installed)

#### LACP Bonding Configuration
Proper LACP setup following network-multitool best practices:
```yaml
- ip link add bond0 type bond mode 802.3ad
- ip link set dev bond0 type bond xmit_hash_policy layer3+4
- ip link set dev eth1 down
- ip link set dev eth2 down
- ip link set eth1 master bond0
- ip link set eth2 master bond0
- ip link set dev eth1 up
- ip link set dev eth2 up
- ip link set dev bond0 type bond lacp_rate fast
- ip link set dev bond0 up
```

#### VLAN Configuration
- **L2 VXLAN hosts (host1, host3):** VLAN 40 tagged on bond0
- **L3 VXLAN hosts (host2, host4):** VLANs 34 and 78 tagged on bond0

#### Routing Strategy
- Kept management default route (172.16.0.254 via eth0)
- Added **specific routes** for L3 VXLAN networks instead of default routes:
  - host2: `ip route add 10.78.78.0/24 via 10.34.34.1`
  - host4: `ip route add 10.34.34.0/24 via 10.78.78.1`

### 📁 Files Changed

#### Switch Configurations (Updated)
- `configs/spine1.cfg` - Added ip routing, EVPN activation
- `configs/spine2.cfg` - Added ip routing, EVPN activation
- `configs/leaf1.cfg` - Port-Channel trunk mode, VLAN config
- `configs/leaf2.cfg` - Port-Channel trunk mode, VLAN config
- `configs/leaf3.cfg` - Added ip routing, loopback ads, Port-Channel config
- `configs/leaf4.cfg` - Added ip routing, loopback ads, Port-Channel config
- `configs/leaf5.cfg` - Port-Channel trunk mode, VLAN config
- `configs/leaf6.cfg` - Port-Channel trunk mode, VLAN config
- `configs/leaf7.cfg` - Added ip routing, loopback ads, Port-Channel config
- `configs/leaf8.cfg` - Added ip routing, loopback ads, Port-Channel config

#### Topology (Updated)
- `evpn-lab.clab.yml` - Updated all host configurations with network-multitool image and proper LACP/VLAN setup

#### Documentation (New)
- `hosts/README.md` - Host interface configuration guide
- `hosts/host1_interfaces` - Interface file for host1 (not currently used, kept for reference)
- `hosts/host2_interfaces` - Interface file for host2 (not currently used, kept for reference)
- `hosts/host3_interfaces` - Interface file for host3 (not currently used, kept for reference)
- `hosts/host4_interfaces` - Interface file for host4 (not currently used, kept for reference)

## Testing & Verification

###  L2 VXLAN (VLAN 40)
```
host1 (10.40.40.101) → host3 (10.40.40.103)
- Connectivity: VERIFIED ✓
- VXLAN tunnel: VTEP1 ↔ VTEP3
- MAC learning: Working via EVPN Type-2
```

###  L3 VXLAN (VRF gold)
```
host2 (10.34.34.102) → host4 (10.78.78.104)
- Connectivity: VERIFIED ✓
- Ping results: 0% packet loss, TTL=62
- Routing: Via EVPN Type-5 through fabric
```

###  Infrastructure Status
- BGP Underlay: All sessions ESTAB
- EVPN Overlay: All neighbors ESTAB
- MLAG: All 4 pairs operational
- Port-Channels: LACP negotiated on all hosts

## Related Issues

Fixes #1 - Lab deployment and configuration fixes
Fixes #2 - BGP EVPN neighbors stuck in Connect state
Fixes #3 - Ready for deployment with EVPN activation
Fixes #4 - Lab convergence in progress
Fixes #5 - BGP EVPN neighbors stuck in Active state
Fixes #11 - Host LACP bonding configuration
Fixes #13 - L3 VXLAN default route issue

## Key Technical Learnings

1. **Arista EOS requires explicit `ip routing`** before BGP can function
2. **MLAG peer-link must be trunk mode** to allow VLAN 4090/4091 traversal
3. **VLAN tagging location matters** - hosts tag, switches use trunk mode
4. **network-multitool image** superior to Alpine for LACP bonding
5. **Specific routes better than default routes** when management network present
6. **LACP rate fast** ensures quick negotiation with Arista switches

## Deployment

After merging, deploy with:
```bash
cd ~/arista-evpn-vxlan-clab
sudo containerlab destroy -t evpn-lab.clab.yml --cleanup
sudo containerlab deploy -t evpn-lab.clab.yml
```

No manual post-deployment configuration needed - everything works from initial deployment!

## Breaking Changes

⚠️ **Host image changed** from `alpine:latest` to `ghcr.io/hellt/network-multitool`
⚠️ **Host configuration completely redesigned** - old exec commands replaced

## Reviewers

@Damien - Please review and merge when ready

---

**This PR represents the complete troubleshooting journey and brings the lab to production-ready status with full L2 and L3 VXLAN functionality.** 🚀

Reviewed-on: #14
Co-authored-by: Damien <damien@arnodo.fr>
Co-committed-by: Damien <damien@arnodo.fr>
This commit was merged in pull request #14.
This commit is contained in:
2025-11-30 10:24:29 +00:00
committed by Damien Arnodo
parent 9502302b76
commit 1080bf07bb
23 changed files with 2632 additions and 74 deletions

18
hosts/host2_interfaces Normal file
View File

@@ -0,0 +1,18 @@
auto lo
iface lo inet loopback
auto bond0
iface bond0 inet manual
use bond
bond-slaves eth1 eth2
bond-mode 802.3ad
bond-miimon 100
bond-lacp-rate fast
up ip link set $IFACE up
auto bond0.34
iface bond0.34 inet static
address 10.34.34.102
netmask 255.255.255.0
vlan-raw-device bond0
up ip link set $IFACE up