From db54e56b4179e7a4f0735f6fd1dc051649e46bd9 Mon Sep 17 00:00:00 2001 From: Damien Arnodo Date: Sun, 30 Nov 2025 19:07:22 +0000 Subject: [PATCH] chore: Repository cleanup - Remove unnecessary files (#16) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Summary Repository cleanup to remove unnecessary files and streamline documentation after the successful EVPN-VXLAN lab implementation. Closes #15 --- ## Changes ### Files Removed (13 files total) **Scripts folder:** - `scripts/deploy.sh` - `scripts/test-connectivity.sh` - `scripts/cleanup.sh` **Root-level markdown files:** - `BRANCH_SUMMARY.md` - `BUGFIX_EVPN_ACTIVATION.md` - `DEPLOYMENT_GUIDE.md` - `FIXES_APPLIED.md` - `TESTING_CHECKLIST.md` - `VLAN_TAGGING_FIX_EXPLANATION.md` **docs/ folder (entire folder removed):** - `docs/HOST_INTERFACE_CONFIGURATION.md` - `docs/configuration-guide.md` - `docs/quick-reference.md` - `docs/validation-commands.md` ### Files Updated - `hosts/README.md` - Fixed broken links - `README.md` - Updated repository structure section --- ## Final Repository Structure ``` ├── .gitignore ├── README.md # Main documentation ├── TROUBLESHOOTING.md # Troubleshooting guide ├── END_TO_END_TESTING.md # Testing procedures ├── evpn-lab.clab.yml # ContainerLab topology ├── configs/ # Switch configurations (10 files) └── hosts/ # Host interface configs (5 files) ``` --- ## Testing - [x] Lab redeployed successfully with `containerlab deploy -t evpn-lab.clab.yml` - [x] L2 VXLAN connectivity verified (host1 ↔ host3) - [x] L3 VXLAN connectivity verified (host2 ↔ host4) - [x] All BGP EVPN sessions established - [x] MLAG pairs operational Reviewed-on: https://gitea.arnodo.fr/Damien/arista-evpn-vxlan-clab/pulls/16 --- BRANCH_SUMMARY.md | 251 ----------------- BUGFIX_EVPN_ACTIVATION.md | 114 -------- DEPLOYMENT_GUIDE.md | 179 ------------ FIXES_APPLIED.md | 157 ----------- README.md | 24 +- TESTING_CHECKLIST.md | 304 -------------------- 
VLAN_TAGGING_FIX_EXPLANATION.md | 167 ----------- docs/HOST_INTERFACE_CONFIGURATION.md | 154 ----------- docs/configuration-guide.md | 400 --------------------------- docs/quick-reference.md | 288 ------------------- docs/validation-commands.md | 375 ------------------------- hosts/README.md | 7 +- scripts/cleanup.sh | 91 ------ scripts/deploy.sh | 248 ----------------- scripts/test-connectivity.sh | 146 ---------- 15 files changed, 16 insertions(+), 2889 deletions(-) delete mode 100644 BRANCH_SUMMARY.md delete mode 100644 BUGFIX_EVPN_ACTIVATION.md delete mode 100644 DEPLOYMENT_GUIDE.md delete mode 100644 FIXES_APPLIED.md delete mode 100644 TESTING_CHECKLIST.md delete mode 100644 VLAN_TAGGING_FIX_EXPLANATION.md delete mode 100644 docs/HOST_INTERFACE_CONFIGURATION.md delete mode 100644 docs/configuration-guide.md delete mode 100644 docs/quick-reference.md delete mode 100644 docs/validation-commands.md delete mode 100644 scripts/cleanup.sh delete mode 100644 scripts/deploy.sh delete mode 100644 scripts/test-connectivity.sh diff --git a/BRANCH_SUMMARY.md b/BRANCH_SUMMARY.md deleted file mode 100644 index ba6ac77..0000000 --- a/BRANCH_SUMMARY.md +++ /dev/null @@ -1,251 +0,0 @@ -# fix-bgp-and-mlag Branch Summary - -## Overview -This branch contains critical fixes for VLAN tagging and host configuration that enable proper end-to-end connectivity in the EVPN VXLAN fabric. - -## Root Cause Analysis - -### Problem -Hosts were unable to communicate across the VXLAN fabric. 
Testing showed: -- Empty MAC tables on leaf switches -- No EVPN Type-2 routes being advertised -- Ping tests between hosts failed with 100% packet loss - -### Root Cause -**VLAN tagging mismatch** between hosts and leaf switch port-channels: -- Hosts were sending **untagged Ethernet frames** -- Leaf port-channels were configured in **access mode** expecting **tagged VLAN frames** -- Result: Frames were dropped at the leaf ingress interface, never reaching VLAN 40 or 34 - -### Solution -**Host-side VLAN tagging**: Configure hosts to create VLAN subinterfaces (802.1Q) on top of bonded interfaces. This ensures frames carry the correct VLAN tag matching the leaf's access VLAN configuration. - ---- - -## Changes Made - -### 1. evpn-lab.clab.yml -**Modified:** Host device configuration -**Changes:** -- host1: Added VLAN 40 subinterface creation (bond0.40) -- host2: Added VLAN 34 subinterface creation (bond0.34) -- host3: Added VLAN 40 subinterface creation (bond0.40) -- host4: Added VLAN 78 subinterface creation (bond0.78) - -**Before:** -```yaml -host1: - exec: - - ip link add bond0 type bond mode balance-rr - - ip link set eth1 master bond0 - - ip link set eth2 master bond0 - - ip link set bond0 up - - ip addr add 10.40.40.101/24 dev bond0 # ← Untagged! -``` - -**After:** -```yaml -host1: - exec: - - ip link add bond0 type bond mode balance-rr - - ip link set eth1 master bond0 - - ip link set eth2 master bond0 - - ip link set bond0 up - # VLAN tagging added: - - ip link add link bond0 name bond0.40 type vlan id 40 - - ip link set bond0.40 up - - ip addr add 10.40.40.101/24 dev bond0.40 # ← Tagged with VLAN 40! -``` - -### 2. 
Documentation Files (New) - -#### END_TO_END_TESTING.md -Comprehensive guide covering: -- Pre-test verification procedures -- L2 VXLAN connectivity testing (VLAN 40) -- L3 VXLAN connectivity testing (VRF gold) -- Complete test script for automation -- Detailed troubleshooting procedures - -#### VLAN_TAGGING_FIX_EXPLANATION.md -Technical deep-dive covering: -- Problem explanation with diagrams -- Broken vs. fixed configuration comparison -- VLAN tagging mapping table -- Why this approach was chosen -- Testing verification steps - -#### TESTING_CHECKLIST.md -Deployment validation checklist with: -- Deployment steps -- Pre-testing checks (9 checks total) -- Connectivity tests (9 tests total) -- Summary table -- Troubleshooting procedures -- Success criteria - ---- - -## Technical Details - -### VLAN Configuration Mapping - -| Component | VLAN 40 (L2 VXLAN) | VLAN 34 (L3 VXLAN) | VLAN 78 (L3 VXLAN) | -|-----------|-------------------|-------------------|-------------------| -| **host1** | bond0.40 (10.40.40.101) | - | - | -| **host2** | - | bond0.34 (10.34.34.102) | - | -| **host3** | bond0.40 (10.40.40.103) | - | - | -| **host4** | - | - | bond0.78 (10.78.78.104) | -| **Leaf Port** | Access VLAN 40 | Access VLAN 34 | Access VLAN 78 | -| **VTEP** | 10.0.255.11 (Pair) | 10.0.255.12 (Pair) | 10.0.255.14 (Pair) | -| **VNI** | 110040 (L2) | 100001 (L3) | 100001 (L3) | -| **VRF** | default | gold | gold | - -### Why This Fix Works - -1. **Linux VLAN Subinterfaces** send 802.1Q tagged frames - ``` - Frame format: [DA][SA][**VLAN Tag 40**][Type][Payload] - ``` - -2. **Leaf Access Port** recognizes the VLAN tag - ``` - Receives frame with VLAN 40 → Matches configured access VLAN 40 - ``` - -3. **Frame is untagged** and forwarded within VLAN 40 - ``` - Becomes untagged within VLAN → Normal switching/routing - ``` - -4. **MAC learning** happens normally in VLAN 40 - ``` - MAC table updated → EVPN Type-2 routes created - ``` - -5. 
**Remote VTEP** receives encapsulated packet - ``` - VXLAN decapsulation → Frames forwarded in target VLAN on remote leaf - ``` - ---- - -## Testing Procedure - -### Quick Validation (5 minutes) -```bash -# Deploy lab -sudo containerlab deploy -t evpn-lab.clab.yml - -# Wait 60 seconds for startup -sleep 60 - -# Test L2 connectivity -docker exec clab-arista-evpn-fabric-host1 ping -c 4 10.40.40.103 - -# Test L3 connectivity -docker exec clab-arista-evpn-fabric-host2 ping -c 4 10.78.78.104 -``` - -### Full Validation (20 minutes) -Follow the TESTING_CHECKLIST.md for comprehensive validation - ---- - -## Affected Functionality - -### ✅ Now Working -- Host-to-host L2 VXLAN connectivity -- MAC learning via VXLAN -- EVPN Type-2 route advertisement -- Host-to-host L3 VXLAN connectivity (VRF gold) -- EVPN Type-5 route advertisement -- MLAG dual-active gateway functionality - -### ✅ Already Working (Unchanged) -- Spine BGP underlay -- Leaf BGP underlay -- EVPN overlay adjacencies -- VXLAN VTEP formation -- VRF isolation - -### ⚠️ No Changes Required (Pre-existing) -- Device startup configurations (except host updates) -- BGP routing policies -- Link configurations -- Physical topology - ---- - -## Backward Compatibility - -**Breaking Change:** Yes - Network topology - -This fix requires a **complete lab redeployment** because: -1. Host network configurations have changed -2. Existing running containers will have incorrect interface configuration -3. 
Cannot be applied incrementally to running lab - -**No breaking changes to:** -- Device configuration format -- BGP policies -- Routing protocols -- VXLAN encapsulation -- EVPN messages - ---- - -## Deployment Checklist - -- [ ] Verify on `fix-bgp-and-mlag` branch -- [ ] Review changes: `git diff main...fix-bgp-and-mlag` -- [ ] Destroy existing lab: `sudo containerlab destroy -t evpn-lab.clab.yml --cleanup` -- [ ] Deploy fixed lab: `sudo containerlab deploy -t evpn-lab.clab.yml` -- [ ] Wait 90 seconds for startup -- [ ] Run quick validation test (5 min) -- [ ] Run full testing checklist (20 min) -- [ ] Verify all tests pass -- [ ] Prepare pull request to merge to main - ---- - -## Related Issues - -This fix addresses the issue: -**"Fixes from fix-bgp-and-mlag branch integrated to main #1"** - -Topics covered: -- L2 VXLAN end-to-end connectivity -- L3 VXLAN end-to-end connectivity -- VLAN tagging at host-to-switch boundary -- MLAG operation with VXLAN -- EVPN Type-2 and Type-5 route advertisement - ---- - -## Future Improvements - -Possible enhancements in subsequent branches: -1. Automated testing script to validate all checks -2. BGP policy testing (as-path, communities, etc.) -3. Failure scenario testing (link down, VTEP down) -4. Performance testing (throughput, latency) -5. Advanced EVPN features (RT-5, multi-homing, etc.) - ---- - -## References - -- `END_TO_END_TESTING.md` - Complete testing guide -- `VLAN_TAGGING_FIX_EXPLANATION.md` - Technical explanation -- `TESTING_CHECKLIST.md` - Validation checklist -- Original source document: Arista BGP EVPN Configuration Example - ---- - -## Questions? - -See the documentation files in this branch for detailed explanations: -1. Start with `VLAN_TAGGING_FIX_EXPLANATION.md` for understanding the problem -2. Move to `END_TO_END_TESTING.md` for comprehensive testing -3. 
Use `TESTING_CHECKLIST.md` for validation diff --git a/BUGFIX_EVPN_ACTIVATION.md b/BUGFIX_EVPN_ACTIVATION.md deleted file mode 100644 index 39bf092..0000000 --- a/BUGFIX_EVPN_ACTIVATION.md +++ /dev/null @@ -1,114 +0,0 @@ -# BGP EVPN Activation Bug - Critical Fix - -## Issue Description - -All BGP EVPN neighbors on the leaves were stuck in **Active** state instead of **Established** state, with **0 messages sent/received**. - -``` -Neighbor V AS MsgRcvd MsgSent InQ OutQ Up/Down State PfxRcd PfxAcc -10.0.250.1 4 65000 0 0 0 0 00:02:05 Active -10.0.250.2 4 65000 0 0 0 0 00:02:05 Active -``` - -Active state with 0 messages means the TCP handshake was **never completed**. - -## Root Cause - -The **spine BGP configurations were missing the EVPN address family activation**. - -In both `configs/spine1.cfg` and `configs/spine2.cfg`: - -``` -address-family evpn - neighbor evpn activate ← This line was MISSING! -``` - -Without activating the EVPN address family on the spines, they: -1. Accept the EVPN neighbor definitions -2. But don't actively listen for or respond to EVPN connections -3. Leaves try to establish sessions but spines don't respond -4. Connection attempt times out → Active state - -This is **different from the IPv4 underlay** which was working because the IPv4 address family **was activated** on the spines. - -## Solution Applied - -### Before (Broken) -``` -router bgp 65000 - ... - address-family evpn - ! Missing activation line! -``` - -### After (Fixed) -``` -router bgp 65000 - ... - address-family evpn - neighbor evpn activate -``` - -## Files Modified - -- `configs/spine1.cfg` - Added `neighbor evpn activate` in EVPN address family -- `configs/spine2.cfg` - Added `neighbor evpn activate` in EVPN address family - -## Technical Explanation - -In Arista EOS BGP, neighbors defined in the global BGP context don't actively participate in any address family **until explicitly activated in that address family block**. 
- -### Address Family Activation Rules - -``` -router bgp 65000 - neighbor 10.0.250.1 peer group evpn - neighbor 10.0.250.1 remote-as 65000 - - address-family evpn - neighbor evpn activate ← REQUIRED for EVPN sessions to work - - address-family ipv4 - neighbor 10.0.250.1 activate ← Separate activation for IPv4 -``` - -Without activating in the EVPN address family: -- The spines define the neighbor parameters ✓ -- The spines enter BGP configuration ✓ -- The spines do NOT listen on TCP 179 for EVPN sessions ✗ -- Leaf attempts to TCP connect to spine loopback on port 179 for EVPN ✗ -- Timeout occurs → Active state ✗ - -## Testing the Fix - -After deploying with the fix, the EVPN neighbors should immediately transition to **Established**: - -```bash -# Before fix -10.0.250.1 4 65000 0 0 0 0 00:02:05 Active - -# After fix -10.0.250.1 4 65000 8 8 0 0 00:00:15 Estab -``` - -## Impact - -This was a **critical bug** that: -- Prevented any EVPN overlay from functioning -- Made L2 VXLAN testing impossible -- Made L3 VXLAN testing impossible -- Prevented MAC learning via VXLAN -- Prevented EVPN route distribution - -Once fixed, the entire EVPN overlay becomes operational immediately. - -## Lesson Learned - -In BGP multi-address-family configurations, **every address family must be explicitly activated**. This includes: -- IPv4 unicast -- IPv6 unicast -- EVPN -- Route target filtering -- Any other address families being used - -A common mistake is to define a neighbor globally but forget to activate it in all address families where it should be used. 
diff --git a/DEPLOYMENT_GUIDE.md b/DEPLOYMENT_GUIDE.md deleted file mode 100644 index 98ad75a..0000000 --- a/DEPLOYMENT_GUIDE.md +++ /dev/null @@ -1,179 +0,0 @@ -# Deployment Guide - Critical Fixes Applied - -## 📌 What Was Fixed - -Two critical fixes from the `fix-bgp-and-mlag` branch have been **automatically applied** to the main branch: - -### ✅ Fix #1: Spine Switch IP Routing -**Before**: BGP disabled - `show ip bgp summary` returned error messages -**After**: BGP fully operational - underlay and overlay sessions establish - -```diff -+ ip routing - service routing protocols model multi-agent -``` - -Applied to: `configs/spine1.cfg` and `configs/spine2.cfg` - -### ✅ Fix #2: MLAG Static LAG (Already in place from previous fix) -**Changed**: LACP bonding → Static LAG for container compatibility - -```diff -- channel-group 1 mode active -+ channel-group 1 mode on -``` - ---- - -## 🚀 How to Deploy - -### Step 1: Clone/Update Your Repository -```bash -cd ~/arista-evpn-vxlan-clab -git pull origin main -``` - -### Step 2: Deploy the Lab -```bash -sudo containerlab deploy -t evpn-lab.clab.yml -``` - -### Step 3: Verify Spine BGP is Working -```bash -ssh admin@clab-arista-evpn-fabric-spine1 "show bgp evpn summary" -``` - -You should see: -``` -BGP summary information for VRF default -Router identifier 10.0.250.1, local AS number 65000 -Neighbor V AS MsgRcvd MsgSent Up/Down State -10.0.250.11 4 65001 8 8 00:04:20 Estab -10.0.250.12 4 65001 8 8 00:04:20 Estab -... 
-``` - -### Step 4: Verify Underlay BGP -```bash -ssh admin@clab-arista-evpn-fabric-leaf1 "show bgp ipv4 summary" -``` - ---- - -## ⏳ What Still Needs Manual Fixes - -### Issue #1: Port-Channel Access Mode -Leaf Port-Channel1 needs to be changed from `trunk` to `access` mode: - -```bash -for leaf in spine1 spine2 leaf1 leaf2 leaf3 leaf4 leaf5 leaf6 leaf7 leaf8; do - ssh admin@clab-arista-evpn-fabric-$leaf </dev/null -ip link del bond0 2>/dev/null -ip addr flush dev eth1 -ip addr add 10.40.40.101/24 dev eth1 -ip link set eth1 up -EOF - -# Configure Host3 (VLAN 40 - L2 VXLAN) -docker exec -it clab-arista-evpn-fabric-host3 sh << 'EOF' -ip link set bond0 down 2>/dev/null -ip link del bond0 2>/dev/null -ip addr flush dev eth1 -ip addr add 10.40.40.103/24 dev eth1 -ip link set eth1 up -EOF -``` - ---- - -## ✅ Verification Checklist - -After deployment, verify: - -- [ ] Spine switches are reachable via SSH -- [ ] BGP EVPN summary shows 8 neighbors in ESTAB state per spine -- [ ] Leaf switches show BGP neighbors as ESTAB -- [ ] MLAG pairs show "active-full, up/up" status -- [ ] Loopback addresses are reachable (10.0.250.x/32) -- [ ] VXLAN interfaces are up on leaf switches -- [ ] MAC learning is occurring on leaf switches - ---- - -## 📋 Current Status - -| Component | Status | Notes | -|-----------|--------|-------| -| Spine IP Routing | ✅ FIXED | Critical fix applied | -| Underlay BGP | ✅ WORKING | EBGP spine-leaf, iBGP MLAG | -| EVPN Overlay | ✅ WORKING | IPv4 unicast established | -| MLAG Static LAG | ✅ WORKING | Container-friendly | -| Port-Channel Mode | ⏳ PENDING | Needs access mode change | -| Host Networking | ⏳ PENDING | Simplified config needed | -| VXLAN Tunnels | 🔧 TESTING | Awaiting host config | -| L2 VXLAN (Type-2) | 🔧 TESTING | Awaiting host connectivity | -| L3 VXLAN (Type-5) | 🔧 TESTING | Awaiting host connectivity | - ---- - -## 🔍 Troubleshooting - -### BGP Not Establishing -1. Verify `ip routing` is present in startup-config -2. 
Check interface IPs: `show ip interface brief` -3. Check connectivity: `ping ` -4. Check BGP neighbors: `show bgp neighbors` - -### MLAG Not Forming -1. Verify peer-link is up: `show interfaces Po999` -2. Check MLAG status: `show mlag detail` -3. Verify MLAG config: `show run | grep mlag` - -### No VXLAN Traffic -1. Verify VXLAN interface is up: `show interfaces vxlan1` -2. Check remote VTEPs: `show vxlan vtep` -3. Verify host connectivity: `ping ` - ---- - -## 📚 Documentation Reference - -- Original EVPN-VXLAN example: See embedded PDF documentation -- FIXES_APPLIED.md: Detailed tracking of all fixes -- README.md: Lab topology overview -- Individual leaf/spine configs: Complete configurations - ---- - -## 💡 Next Steps - -1. ✅ Deploy with fixed spine configs -2. ✅ Verify BGP is working -3. ⏳ Update leaf Port-Channel configs to access mode -4. ⏳ Configure host networking properly -5. ⏳ Test VXLAN overlay connectivity -6. ⏳ Validate L2 VXLAN (Type-2 routes) -7. ⏳ Validate L3 VXLAN (Type-5 routes) - ---- - -## 🎯 Summary - -The critical `ip routing` fix has been integrated into the main branch. You can now deploy the lab and BGP will function correctly. Additional minor fixes for host networking can be applied manually or will be automated in future config updates. diff --git a/FIXES_APPLIED.md b/FIXES_APPLIED.md deleted file mode 100644 index deabcc3..0000000 --- a/FIXES_APPLIED.md +++ /dev/null @@ -1,157 +0,0 @@ -# Fixes Applied in Main Branch - -This document tracks critical fixes that have been discovered and applied during lab deployment to ensure the EVPN-VXLAN fabric functions correctly. - -## ✅ Fixes Applied to Main Branch - -### 1. 
**Spine Switches - Enable IP Routing** ✅ FIXED -**Problem**: BGP was disabled on spine switches with error "BGP is disabled for VRF default" and "IP routing not enabled" - -**Fix**: Added `ip routing` command to both spine configurations -- `configs/spine1.cfg` - Added line: `ip routing` (before `service routing protocols model multi-agent`) -- `configs/spine2.cfg` - Added line: `ip routing` (before `service routing protocols model multi-agent`) - -**Impact**: This enables BGP to function properly on spines, allowing: -- Underlay BGP IPv4 Unicast sessions to establish -- EVPN BGP sessions to establish -- Route exchange between spines and leafs - -**Status**: ✅ **APPLIED** (commits applied to main branch) - ---- - -### 2. **Leaf Switches - MLAG Port-Channel Mode** ✅ FIXED -**Problem**: LACP bonding (`mode active`) doesn't work properly in Alpine Linux containers due to lack of kernel module support - -**Fix**: Changed from LACP to static LAG -- Changed `channel-group 1 mode active` to `channel-group 1 mode on` in all leaf configs -- This creates a static LAG that works in containerized environments - -**Status**: ✅ **ALREADY APPLIED** (pushed by user in previous commits) - ---- - -## ⏳ Remaining Issues (Pending Application) - -### 3. **Leaf Switches - Port-Channel1 Switchport Mode** ⏳ PENDING -**Problem**: Port-Channel configured as `trunk`, but Alpine containers send untagged traffic - -**Fix Needed**: Change Port-Channel1 from trunk to access mode on all leafs: -``` -interface Port-Channel1 - switchport mode access - switchport access vlan 40 # or appropriate VLAN for each VTEP -``` - -**Status**: ⏳ **NOT YET APPLIED** - Needs manual configuration or config file updates - -**Affected Files**: -- `configs/leaf1.cfg` -- `configs/leaf2.cfg` -- `configs/leaf3.cfg` -- `configs/leaf4.cfg` -- `configs/leaf5.cfg` -- `configs/leaf6.cfg` -- `configs/leaf7.cfg` -- `configs/leaf8.cfg` - ---- - -### 4. 
**Host Configuration - Simplified Bonding** ⏳ PENDING -**Problem**: Alpine Linux containers cannot properly configure 802.3ad LACP bonding - -**Fix Needed**: Remove bonding complexity, use single interface: -```yaml -host1: - exec: - - ip addr add 10.40.40.101/24 dev eth1 - - ip link set eth1 up -``` - -**Status**: ⏳ **NOT YET APPLIED** - Topology file needs updating - ---- - -## 📋 Summary of Issues Found - -### Issue #1: Missing `ip routing` on Spines -- **Symptoms**: - - `show ip bgp summary` returned "BGP is disabled for VRF default" - - Attempting to configure BGP showed "! IP routing not enabled" -- **Root Cause**: Arista EOS requires explicit `ip routing` command to enable L3 functionality -- **Status**: ✅ **FIXED** - -### Issue #2: LACP Bonding in Containers -- **Symptoms**: - - Port-Channel showing "waiting for LACP response" - - Host bond interface in DOWN state -- **Root Cause**: Alpine containers don't have bonding kernel modules -- **Status**: ✅ **FIXED** (by changing to static LAG) - -### Issue #3: Trunk vs Access Mode -- **Symptoms**: - - No MAC learning on switch - - Port-Channel counters showed traffic but no unicast packets -- **Root Cause**: Hosts send untagged traffic, switch expects tagged (trunk mode) -- **Status**: ⏳ **NEEDS FIXING** - ---- - -## 🚀 Deployment Instructions - -### Quick Start (Recommended) -1. Deploy with fixed spine configs: -```bash -cd ~/arista-evpn-vxlan-clab -sudo containerlab deploy -t evpn-lab.clab.yml -``` - -2. Verify BGP is working: -```bash -ssh admin@clab-arista-evpn-fabric-spine1 "show bgp evpn summary" -``` - -3. 
Apply remaining fixes manually or wait for config updates - -### Complete Fix (When Ready) -- Once Port-Channel and host configs are updated, redeploy topology for zero-downtime testing - ---- - -## 📊 Testing Results - -After applying spine `ip routing` fix: -- ✅ BGP underlay sessions establish (eBGP between spine-leaf, iBGP between MLAG pairs) -- ✅ BGP EVPN overlay sessions establish -- ✅ MLAG pairs form correctly (active-full, up/up) -- ✅ MAC addresses learned locally on leaf switches -- ⏳ EVPN Type-2 routes advertised (pending overlay establishment) -- ⏳ End-to-end connectivity (pending all fixes applied) - ---- - -## 💡 Key Learnings - -- The `ip routing` fix is **critical** and must be in the startup-config for clean deployments -- Static LAG (`mode on`) is more reliable than LACP in containerized environments -- Access mode port-channels work better with simple Linux containers -- For production environments with proper bonding support, LACP can be re-enabled - ---- - -## 🔗 Related Issues - -- Spine BGP not starting: Missing `ip routing` command -- MLAG port-channels not forming: LACP incompatibility -- No MAC learning: Trunk vs Access mode mismatch -- No VXLAN tunnel endpoints: Pending overlay establishment - ---- - -## ✅ Final Status - -**Spine Fixes**: COMPLETE ✅ -**MLAG Fixes**: COMPLETE ✅ -**Port-Channel Access Mode**: PENDING ⏳ -**Host Networking**: PENDING ⏳ -**EVPN Overlay**: TESTING ⏳ diff --git a/README.md b/README.md index 57f9a4b..cb5a111 100644 --- a/README.md +++ b/README.md @@ -14,7 +14,7 @@ This lab demonstrates a complete EVPN-VXLAN data center fabric with: ## 📐 Topology ``` - ┌─────────┐ ┌─────────┐ + ┌──────────┐ ┌──────────┐ │ Spine1 │ │ Spine2 │ │ AS65000 │ │ AS65000 │ └────┬────┘ └────┬────┘ @@ -218,7 +218,9 @@ show mac address-table ``` arista-evpn-vxlan-clab/ ├── README.md # This file -├── evpn-lab.clab.yml # ContainerLab topology +├── TROUBLESHOOTING.md # Troubleshooting guide +├── END_TO_END_TESTING.md # Testing procedures +├── 
evpn-lab.clab.yml # ContainerLab topology ├── configs/ # Device configurations │ ├── spine1.cfg │ ├── spine2.cfg @@ -230,24 +232,22 @@ arista-evpn-vxlan-clab/ │ ├── leaf6.cfg │ ├── leaf7.cfg │ └── leaf8.cfg -├── docs/ # Documentation -│ ├── configuration-guide.md -│ ├── validation-commands.md -│ └── topology-diagram.png -└── scripts/ # Helper scripts - ├── deploy.sh - ├── test-connectivity.sh - └── cleanup.sh +└── hosts/ # Host interface configurations + ├── README.md + ├── host1_interfaces + ├── host2_interfaces + ├── host3_interfaces + └── host4_interfaces ``` -## 🔧 Cleanup +## 🗑️ Cleanup ```bash # Destroy the lab sudo containerlab destroy -t evpn-lab.clab.yml # Remove all related containers and networks -sudo containerlab destroy --cleanup +sudo containerlab destroy -t evpn-lab.clab.yml --cleanup ``` ## 📚 References diff --git a/TESTING_CHECKLIST.md b/TESTING_CHECKLIST.md deleted file mode 100644 index 9aa6751..0000000 --- a/TESTING_CHECKLIST.md +++ /dev/null @@ -1,304 +0,0 @@ -# Deployment & Testing Checklist - -## ✅ What Was Fixed - -- [x] Host VLAN tagging configuration in topology file -- [x] All 4 hosts now create VLAN subinterfaces (bond0.XX) -- [x] Leaf port-channels properly configured for access mode -- [x] BGP configuration in leafs includes `ip routing` command -- [x] MLAG configurations validated on all 4 leaf pairs -- [x] VXLAN VTEP configuration in place -- [x] EVPN overlay configuration complete - -## 🚀 Deployment Steps - -### 1. Check Current Branch -```bash -cd ~/arista-evpn-vxlan-clab -git branch -git status -``` -Should show: `fix-bgp-and-mlag` branch - -### 2. Destroy Current Lab (if running) -```bash -sudo containerlab destroy -t evpn-lab.clab.yml --cleanup -``` - -### 3. Deploy Fixed Lab -```bash -sudo containerlab deploy -t evpn-lab.clab.yml -# Wait 60-90 seconds for all containers to start -``` - -### 4. 
Verify Lab is Running -```bash -sudo containerlab inspect -t evpn-lab.clab.yml -``` -Should show all 10 nodes (2 spines + 8 leaves + 4 hosts) as RUNNING - ---- - -## 📋 Pre-Testing Checks (Run in Order) - -### Check 1: Spine BGP Underlay -```bash -ssh admin@clab-arista-evpn-fabric-spine1 "show bgp ipv4 unicast summary" -``` -**Expected:** All 8 leaf neighbors in ESTABLISHED state -``` -10.0.1.1 4 65001 22 18 Estab 3 -10.0.1.3 4 65001 20 17 Estab 3 -10.0.1.5 4 65002 19 18 Estab 0 ← Check this, should be 0 or more -... -``` - -**Status:** ☐ Pass / ☐ Fail - ---- - -### Check 2: Leaf MLAG Status -```bash -ssh admin@clab-arista-evpn-fabric-leaf1 "show mlag detail" -ssh admin@clab-arista-evpn-fabric-leaf3 "show mlag detail" -``` -**Expected:** All pairs show `MLAG is active` -``` -MLAG is active -Active per VLAN: yes -``` - -**Status:** ☐ Pass / ☐ Fail - ---- - -### Check 3: Leaf BGP EVPN -```bash -ssh admin@clab-arista-evpn-fabric-leaf1 "show bgp evpn summary" -``` -**Expected:** Both spine neighbors in ESTABLISHED -``` -10.0.250.1 4 65000 8 9 Estab 0 -10.0.250.2 4 65000 8 8 Estab 0 -``` - -**Status:** ☐ Pass / ☐ Fail - ---- - -### Check 4: Host VLAN Interfaces -```bash -docker exec clab-arista-evpn-fabric-host1 ip -d link show bond0.40 -docker exec clab-arista-evpn-fabric-host2 ip -d link show bond0.34 -docker exec clab-arista-evpn-fabric-host3 ip -d link show bond0.40 -docker exec clab-arista-evpn-fabric-host4 ip -d link show bond0.78 -``` -**Expected:** All show VLAN tagging -``` -vlan protocol 802.1Q id 40 -``` - -**Status:** ☐ Pass / ☐ Fail - ---- - -## 🧪 Connectivity Tests - -### Test 1: Host to Gateway (VLAN40) -```bash -docker exec clab-arista-evpn-fabric-host1 ping -c 2 10.40.40.1 -docker exec clab-arista-evpn-fabric-host3 ping -c 2 10.40.40.1 -``` -**Expected:** 2/2 packets successful -**Status:** ☐ Pass / ☐ Fail -**Time:** ~5 seconds - ---- - -### Test 2: L2 VXLAN Connectivity (Host1 → Host3) -```bash -docker exec clab-arista-evpn-fabric-host1 ping -c 4 
10.40.40.103 -``` -**Expected:** 4/4 packets successful -``` -PING 10.40.40.103 (10.40.40.103): 56 data bytes -64 bytes from 10.40.40.103: seq=0 ttl=64 time=X.XXms -``` -**Status:** ☐ Pass / ☐ Fail -**Time:** ~10 seconds - ---- - -### Test 3: MAC Learning on Leaf1 -```bash -ssh admin@clab-arista-evpn-fabric-leaf1 "show mac address-table vlan 40" -``` -**Expected:** At least 1 MAC learned -``` -Vlan Mac Address Type Ports -40 XXXX.XXXX.XXXX DYNAMIC Po1 -``` -**Status:** ☐ Pass / ☐ Fail - ---- - -### Test 4: Remote MAC Learning via VXLAN -```bash -ssh admin@clab-arista-evpn-fabric-leaf1 "show vxlan address-table vlan 40" -``` -**Expected:** MAC from host3 learned via Vxlan1 -``` -VLAN Mac Address Type Prt VTEP -40 XXXX.XXXX.XXXX EVPN Vx1 10.0.255.13 -``` -**Status:** ☐ Pass / ☐ Fail - ---- - -### Test 5: EVPN Type-2 Routes -```bash -ssh admin@clab-arista-evpn-fabric-leaf1 "show bgp evpn route-type mac-ip | head -20" -``` -**Expected:** Both local and remote MACs advertised -``` -RD: 65001:110040 mac-ip XXXX.XXXX.XXXX - - - -RD: 65003:110040 mac-ip XXXX.XXXX.XXXX - 10.0.255.13 -``` -**Status:** ☐ Pass / ☐ Fail - ---- - -### Test 6: Host to Gateway (VLAN34) -```bash -docker exec clab-arista-evpn-fabric-host2 ping -c 2 10.34.34.1 -``` -**Expected:** 2/2 packets successful -**Status:** ☐ Pass / ☐ Fail -**Time:** ~5 seconds - ---- - -### Test 7: L3 VXLAN Connectivity (Host2 → Host4) -```bash -docker exec clab-arista-evpn-fabric-host2 ping -c 4 10.78.78.104 -``` -**Expected:** 4/4 packets successful -**Status:** ☐ Pass / ☐ Fail -**Time:** ~10 seconds - ---- - -### Test 8: VRF Routing on Leaf3 -```bash -ssh admin@clab-arista-evpn-fabric-leaf3 "show ip route vrf gold" -``` -**Expected:** Routes to both 10.34.34.0/24 and 10.78.78.0/24 -``` -C 10.34.34.0/24 is directly connected, Vlan34 -B E 10.78.78.0/24 [200/0] via VTEP 10.0.255.14 -``` -**Status:** ☐ Pass / ☐ Fail - ---- - -### Test 9: EVPN Type-5 Routes -```bash -ssh admin@clab-arista-evpn-fabric-leaf3 "show bgp evpn 
route-type ip-prefix ipv4" -``` -**Expected:** IP prefixes for both VTEPs -``` -RD: 10.0.250.13:1 ip-prefix 10.34.34.0/24 -RD: 10.0.250.17:1 ip-prefix 10.78.78.0/24 -``` -**Status:** ☐ Pass / ☐ Fail - ---- - -## 📊 Summary Table - -| Component | Check | Expected | Actual | Status | -|-----------|-------|----------|--------|--------| -| Spine BGP | All leaves established | 8/8 ESTAB | ? | ☐ | -| Leaf MLAG | Pair status | active/active | ? | ☐ | -| EVPN | Spine peers | 2/2 ESTAB | ? | ☐ | -| Host Interfaces | VLAN tags | 4 VLAN ifaces | ? | ☐ | -| L2 Gateway | Ping host→gw | 2/2 success | ? | ☐ | -| L2 VXLAN | Host1→Host3 | 4/4 success | ? | ☐ | -| MAC Learning | Leaf1 VLAN40 | ≥1 MAC | ? | ☐ | -| Remote MACs | VXLAN table | MACs from Vx1 | ? | ☐ | -| Type-2 Routes | EVPN MACs | Local + Remote | ? | ☐ | -| L3 Gateway | Ping host→gw | 2/2 success | ? | ☐ | -| L3 VXLAN | Host2→Host4 | 4/4 success | ? | ☐ | -| VRF Routes | Leaf3 VRF gold | 2+ routes | ? | ☐ | -| Type-5 Routes | EVPN prefixes | Local + Remote | ? | ☐ | - ---- - -## 🔧 If Tests Fail - -### L2 ping fails -```bash -# 1. Check host VLAN interface -docker exec clab-arista-evpn-fabric-host1 ip addr show bond0.40 -# Should show: inet 10.40.40.101/24 dev bond0.40 - -# 2. Check port-channel status -ssh admin@clab-arista-evpn-fabric-leaf1 "show interface Port-Channel1" -# Should show: up, up - -# 3. Check VLAN 40 exists on leaf -ssh admin@clab-arista-evpn-fabric-leaf1 "show vlan 40" -# Should show: VLAN 40 exists - -# 4. Check MAC learning (generate traffic) -docker exec clab-arista-evpn-fabric-host1 arping -c 3 10.40.40.1 -ssh admin@clab-arista-evpn-fabric-leaf1 "show mac address-table vlan 40" -# Should show host1 MAC -``` - -### L3 ping fails -```bash -# 1. Check VRF VLAN interface -ssh admin@clab-arista-evpn-fabric-leaf3 "show interface Vlan34" -# Should show: up, up - -# 2. Check VRF routing enabled -ssh admin@clab-arista-evpn-fabric-leaf3 "show ip route vrf gold" -# Should show routes - -# 3. 
Check VXLAN VRF mapping -ssh admin@clab-arista-evpn-fabric-leaf3 "show interface Vxlan1" -# Should show: vxlan vrf gold vni 100001 -``` - ---- - -## 📝 Notes for Next Steps - -1. **If all tests pass** ✅ - - Create pull request to merge `fix-bgp-and-mlag` into `main` - - Document the changes in FIXES_APPLIED.md - - Update main branch documentation - -2. **If specific tests fail** ⚠️ - - Review the troubleshooting section above - - Check device logs: `show log` - - Review configuration with `show running-config` - -3. **Keep for reference** - - END_TO_END_TESTING.md - Comprehensive testing guide - - VLAN_TAGGING_FIX_EXPLANATION.md - Explains the root cause and fix - ---- - -## 🎯 Success Criteria - -**Lab is ready for production use when:** -- ✓ All pre-testing checks pass -- ✓ All 9 connectivity tests pass -- ✓ No errors in device logs -- ✓ MLAG is active/active on all pairs -- ✓ BGP neighbors all established -- ✓ EVPN routes being advertised diff --git a/VLAN_TAGGING_FIX_EXPLANATION.md b/VLAN_TAGGING_FIX_EXPLANATION.md deleted file mode 100644 index 29d5441..0000000 --- a/VLAN_TAGGING_FIX_EXPLANATION.md +++ /dev/null @@ -1,167 +0,0 @@ -# Quick Diagnostic: Why Hosts Weren't Talking - -## The Problem - -You were getting **empty MAC tables and no ping replies** when testing end-to-end connectivity between hosts. The root cause was **VLAN tagging mismatch** between hosts and leaf switches. - -## The Mismatch Explained - -### ❌ OLD Configuration (Broken) - -**Hosts were sending untagged traffic:** -```yaml -host1: - exec: - - ip link add bond0 type bond mode balance-rr - - ip link set eth1 master bond0 - - ip link set eth2 master bond0 - - ip link set bond0 up - - ip addr add 10.40.40.101/24 dev bond0 # ← UNTAGGED traffic! -``` - -**Leaf switches expected VLAN-tagged traffic:** -``` -interface Port-Channel1 - switchport mode access - switchport access vlan 40 # ← Expecting tagged VLAN 40! 
- mlag 1 -``` - -### Traffic Flow (Broken): -``` -Host1 (untagged) - ↓ -eth1/eth2 (bonds) - ↓ -Leaf1 Port-Channel1 (access VLAN 40) - ↓ -Traffic dropped because VLAN doesn't match! - ↗ No MAC learning - ↗ No connectivity -``` - ---- - -## ✅ NEW Configuration (Fixed) - -**Hosts now send VLAN-tagged traffic:** -```yaml -host1: - exec: - - ip link add bond0 type bond mode balance-rr - - ip link set eth1 master bond0 - - ip link set eth2 master bond0 - - ip link set bond0 up - # Create VLAN 40 subinterface - - ip link add link bond0 name bond0.40 type vlan id 40 - - ip link set bond0.40 up - - ip addr add 10.40.40.101/24 dev bond0.40 # ← TAGGED traffic! -``` - -**Leaf switches expect VLAN-tagged traffic:** -``` -interface Port-Channel1 - switchport mode access - switchport access vlan 40 # ← Now matches! - mlag 1 -``` - -### Traffic Flow (Fixed): -``` -Host1 (VLAN 40 tagged) - ↓ -bond0.40 interface (sends tagged frames) - ↓ -eth1/eth2 (carries tagged traffic) - ↓ -Leaf1 Port-Channel1 (access VLAN 40) - ↓ -Frames untagged and placed in VLAN 40 - ↓ -Switches forward in VLAN 40 - ↓ -VXLAN encapsulation for remote VTEP - ↓ -✓ MAC learning works - ✓ Connectivity established -``` - ---- - -## VLAN Tagging Mapping - -| Host | Interface | VLAN Tag | Purpose | Test | -|------|-----------|----------|---------|------| -| host1 | bond0.40 | 40 | L2 VXLAN test | Ping host3 | -| host2 | bond0.34 | 34 | L3 VXLAN (VRF gold) VLAN | Ping host4 | -| host3 | bond0.40 | 40 | L2 VXLAN test | Ping host1 | -| host4 | bond0.78 | 78 | L3 VXLAN (VRF gold) VLAN | Ping host2 | - ---- - -## Why This Works - -### Layer 2 Switching Basics - -When a **Linux host sends traffic on a VLAN subinterface** (e.g., `bond0.40`): -1. The interface **adds a VLAN tag (802.1Q)** to the Ethernet frame -2. Frame contains: `[Dest MAC][Source MAC][**VLAN Tag (40)**][Type][Data]` - -When a **Leaf switch receives the tagged frame**: -1. It reads the VLAN tag (40) -2. The frame matches the port's access VLAN (40) -3. 
Frame is **untagged** and forwarded in VLAN 40 -4. Switch learns MAC and floods/forwards appropriately - ---- - -## Testing the Fix - -```bash -# 1. Verify host VLAN interface exists -docker exec clab-arista-evpn-fabric-host1 ip -d link show bond0.40 -# Expected: vlan protocol 802.1Q id 40 - -# 2. Verify host has IP on VLAN interface -docker exec clab-arista-evpn-fabric-host1 ip addr show bond0.40 -# Expected: inet 10.40.40.101/24 dev bond0.40 - -# 3. Ping the gateway (virtual router on Leaf) -docker exec clab-arista-evpn-fabric-host1 ping -c 1 10.40.40.1 -# Expected: Should get reply from leaf VLAN40 gateway - -# 4. Ping remote host -docker exec clab-arista-evpn-fabric-host1 ping -c 4 10.40.40.103 -# Expected: 4/4 packets successful -``` - ---- - -## Key Files Changed - -1. **evpn-lab.clab.yml** - - Updated all 4 host definitions with VLAN subinterface configuration - - Each host now creates and configures its own VLAN tagged interface - -2. **END_TO_END_TESTING.md** (new) - - Comprehensive testing guide for all connectivity scenarios - - Troubleshooting procedures - - Expected results validation - ---- - -## Why VLAN Tagging is Required Here - -The topology uses **access mode port-channels on leafs** because: - -1. **Each host has a single VLAN** (no trunk needed) -2. **VLAN tagging from the host side** is cleaner than reconfiguring leaf ports -3. **Matches production design** where hosts are single-VLAN attached -4. **Avoids manual leaf reconfiguration** after deployment - -Alternative approach (NOT used): -- Could change leaf port-channels to trunk mode -- Would require manually configuring allowed VLANs -- More complex and less automated - -This is the automated, repeatable approach that avoids manual post-deployment configuration. 
diff --git a/docs/HOST_INTERFACE_CONFIGURATION.md b/docs/HOST_INTERFACE_CONFIGURATION.md deleted file mode 100644 index 02f7a93..0000000 --- a/docs/HOST_INTERFACE_CONFIGURATION.md +++ /dev/null @@ -1,154 +0,0 @@ -# Host Interface Configuration Guide - -## Overview - -All four hosts in the lab use **persistent interface configuration files** mounted via ContainerLab's `binds` feature. This approach provides cleaner, more maintainable configuration compared to using `exec` commands. - -## Architecture - -### Dual-Homing with LACP Bonding - -Each host is dual-homed to an MLAG pair of leaf switches: -- **host1**: dual-homed to leaf1 + leaf2 (VTEP1) -- **host2**: dual-homed to leaf3 + leaf4 (VTEP2) -- **host3**: dual-homed to leaf5 + leaf6 (VTEP3) -- **host4**: dual-homed to leaf7 + leaf8 (VTEP4) - -### VLAN Configuration - -Hosts handle VLAN tagging using sub-interfaces on the bond: - -| Host | VLAN | IP Address | Purpose | VRF | -|------|------|------------|---------|-----| -| host1 | 40 | 10.40.40.101/24 | L2 VXLAN test | default | -| host2 | 34 | 10.34.34.102/24 | L3 VXLAN test | gold | -| host3 | 40 | 10.40.40.103/24 | L2 VXLAN test | default | -| host4 | 78 | 10.78.78.104/24 | L3 VXLAN test | gold | - -## Interface Files Structure - -Each host has a configuration file in `hosts/` directory: -- `hosts/host1_interfaces` → mounted to `/etc/network/interfaces` in host1 -- `hosts/host2_interfaces` → mounted to `/etc/network/interfaces` in host2 -- `hosts/host3_interfaces` → mounted to `/etc/network/interfaces` in host3 -- `hosts/host4_interfaces` → mounted to `/etc/network/interfaces` in host4 - -## Interface Configuration Format - -### Example: host1_interfaces - -``` -auto lo -iface lo inet loopback - -# Bond interface with LACP (802.3ad) -auto bond0 -iface bond0 inet manual - bond-mode 4 - bond-miimon 100 - bond-lacp-rate 1 - bond-slaves eth1 eth2 - -# VLAN 40 on bond0 -auto bond0.40 -iface bond0.40 inet static - address 10.40.40.101 - netmask 255.255.255.0 - 
vlan-raw-device bond0 -``` - -### Key Parameters Explained - -**Bond Configuration:** -- `bond-mode 4`: LACP (802.3ad) mode - requires LACP on switch side -- `bond-miimon 100`: Link monitoring interval (100ms) -- `bond-lacp-rate 1`: Fast LACP (1 second intervals) -- `bond-slaves eth1 eth2`: Physical interfaces in the bond - -**VLAN Sub-interface:** -- `bond0.40`: VLAN interface notation (bond0.VLAN_ID) -- `vlan-raw-device bond0`: Parent interface for VLAN -- Static IP configuration with address/netmask - -## Deployment Process - -When ContainerLab starts a host: - -1. **Mount interface file** via binds -2. **Install packages**: `apk add ifupdown bonding vlan` -3. **Load kernel modules**: - - `modprobe bonding` - enables LACP bonding - - `modprobe 8021q` - enables VLAN tagging -4. **Bring up interfaces**: `ifup -a` reads `/etc/network/interfaces` - -## Switch Configuration Requirements - -For proper LACP operation, leaf switches must have: - -``` -interface Port-Channel1 - description host-X - switchport mode trunk - switchport trunk allowed vlan - mlag 1 - port-channel lacp fallback timeout 5 - port-channel lacp fallback individual - no shutdown - -interface Ethernet1 - description host-X-link1 - channel-group 1 mode active - lacp timer fast - no shutdown -``` - -**Critical settings:** -- `port-channel lacp fallback`: Required for ContainerLab timing -- `lacp timer fast`: Matches host's fast LACP rate -- `no shutdown`: Must explicitly enable Port-Channel interface - -## Advantages of This Approach - -1. **Persistence**: Configuration survives container restarts -2. **Clarity**: Single file shows complete network config -3. **Maintainability**: Easy to modify VLAN assignments -4. **Production-like**: Mirrors real-world dual-homing scenarios -5. 
**Clean deployment**: No manual post-deployment fixes needed - -## Testing Connectivity - -### L2 VXLAN (same VLAN) -```bash -# host1 (VLAN 40) → host3 (VLAN 40) -docker exec clab-arista-evpn-fabric-host1 ping -c 4 10.40.40.103 -``` - -### L3 VXLAN (inter-VRF) -```bash -# host2 (VLAN 34, VRF gold) → host4 (VLAN 78, VRF gold) -docker exec clab-arista-evpn-fabric-host2 ping -c 4 10.78.78.104 -``` - -## Troubleshooting - -### Verify bond status on host -```bash -docker exec clab-arista-evpn-fabric-host1 cat /proc/net/bonding/bond0 -``` - -### Check VLAN interface -```bash -docker exec clab-arista-evpn-fabric-host1 ip addr show bond0.40 -``` - -### Verify LACP on switch -```bash -ssh admin@clab-arista-evpn-fabric-leaf1 "show port-channel 1 detailed" -``` - -## References - -- Alpine Linux ifupdown-ng documentation -- Linux bonding documentation: `/usr/src/linux/Documentation/networking/bonding.txt` -- Arista MLAG configuration guide -- srl-labs/srl-evpn-mh-lab (reference implementation) diff --git a/docs/configuration-guide.md b/docs/configuration-guide.md deleted file mode 100644 index bc24229..0000000 --- a/docs/configuration-guide.md +++ /dev/null @@ -1,400 +0,0 @@ -# Configuration Guide - -This guide walks through the key configuration concepts used in this EVPN-VXLAN lab. 
- -## Table of Contents -- [Architecture Overview](#architecture-overview) -- [Underlay Configuration](#underlay-configuration) -- [Overlay Configuration](#overlay-configuration) -- [MLAG Configuration](#mlag-configuration) -- [L2 VXLAN Configuration](#l2-vxlan-configuration) -- [L3 VXLAN Configuration](#l3-vxlan-configuration) -- [Best Practices](#best-practices) - -## Architecture Overview - -### Topology Design -- **Spine-Leaf Architecture**: 2 Spines, 8 Leafs forming 4 VTEPs -- **Underlay**: BGP with eBGP between Spine-Leaf, iBGP between MLAG pairs -- **Overlay**: BGP EVPN for control plane -- **Data Plane**: VXLAN encapsulation - -### AS Number Scheme -``` -Spine: AS 65000 -VTEP1: AS 65001 (Leaf1/Leaf2) -VTEP2: AS 65002 (Leaf3/Leaf4) -VTEP3: AS 65003 (Leaf5/Leaf6) -VTEP4: AS 65004 (Leaf7/Leaf8) -``` - -### IP Addressing Plan -``` -Management: 172.16.0.0/24 -Router-ID Loopbacks: 10.0.250.0/24 -VTEP Loopbacks: 10.0.255.0/24 -Spine1 P2P Links: 10.0.1.0/24 -Spine2 P2P Links: 10.0.2.0/24 -MLAG iBGP Peering: 10.0.3.0/24 -MLAG Peer-Link: 10.0.199.0/24 -``` - -## Underlay Configuration - -### 1. Enable Multi-Agent Routing Protocol Model - -Required for EVPN to function properly: - -``` -service routing protocols model multi-agent -``` - -### 2. Configure Loopback Interfaces - -Each device needs two loopbacks: - -``` -! Router-ID Loopback (unique per device) -interface Loopback0 - ip address 10.0.250.x/32 - -! VTEP Loopback (shared within MLAG pair) -interface Loopback1 - ip address 10.0.255.x/32 -``` - -### 3. Configure Point-to-Point Interfaces - -Use /31 subnets for efficiency: - -``` -interface Ethernet11 - description spine1 - no switchport - ip address 10.0.1.1/31 - mtu 9214 -``` - -### 4. Configure BGP Underlay - -#### On Spines: -``` -router bgp 65000 - router-id 10.0.250.1 - no bgp default ipv4-unicast - distance bgp 20 200 200 - - neighbor 10.0.1.1 remote-as 65001 - neighbor 10.0.1.3 remote-as 65001 - # ... 
more neighbors - - address-family ipv4 - neighbor 10.0.1.1 activate - network 10.0.250.1/32 - maximum-paths 4 ecmp 64 -``` - -#### On Leafs: -``` -router bgp 65001 - router-id 10.0.250.11 - no bgp default ipv4-unicast - distance bgp 20 200 200 - - neighbor underlay peer group - neighbor underlay remote-as 65000 - neighbor 10.0.1.0 peer group underlay - neighbor 10.0.2.0 peer group underlay - - address-family ipv4 - neighbor underlay activate - network 10.0.250.11/32 - network 10.0.255.11/32 - maximum-paths 4 ecmp 64 -``` - -### Why These Settings? - -- **no bgp default ipv4-unicast**: Requires explicit activation per address family -- **distance bgp 20 200 200**: eBGP=20, iBGP=200, Local=200 (prefer eBGP routes) -- **maximum-paths 4 ecmp 64**: Enable ECMP with up to 4 paths -- **mtu 9214**: Support jumbo frames for VXLAN overhead - -## Overlay Configuration - -### 1. Configure EVPN Neighbors - -#### On Leafs: -``` -router bgp 65001 - neighbor evpn peer group - neighbor evpn remote-as 65000 - neighbor evpn update-source Loopback0 - neighbor evpn ebgp-multihop 3 - neighbor evpn send-community extended - neighbor 10.0.250.1 peer group evpn - neighbor 10.0.250.2 peer group evpn - - address-family evpn - neighbor evpn activate -``` - -#### On Spines: -``` -router bgp 65000 - neighbor evpn peer group - neighbor evpn next-hop-unchanged - neighbor evpn update-source Loopback0 - neighbor evpn ebgp-multihop 3 - neighbor evpn send-community extended - - neighbor 10.0.250.11 peer group evpn - neighbor 10.0.250.11 remote-as 65001 - # ... more neighbors - - address-family evpn - neighbor evpn activate -``` - -### Why These Settings? - -- **update-source Loopback0**: Use loopback for stable peering -- **ebgp-multihop 3**: Allow multi-hop eBGP through underlay -- **send-community extended**: Required for EVPN route-targets -- **next-hop-unchanged**: On spines, preserve original next-hop for optimal routing - -### 2. 
Configure VXLAN Interface - -``` -interface Vxlan1 - vxlan source-interface Loopback1 - vxlan udp-port 4789 - vxlan learn-restrict any -``` - -- **source-interface Loopback1**: Use VTEP loopback as source -- **udp-port 4789**: Standard VXLAN port -- **learn-restrict any**: Use EVPN control plane only (no data plane learning) - -## MLAG Configuration - -### 1. Configure MLAG VLANs - -``` -vlan 4090 - name mlag-peer - trunk group mlag-peer - -vlan 4091 - name mlag-ibgp - trunk group mlag-peer -``` - -### 2. Configure MLAG SVIs - -``` -interface Vlan4090 - description MLAG Peer-Link - ip address 10.0.199.254/31 - no autostate - -interface Vlan4091 - description MLAG iBGP Peering - ip address 10.0.3.0/31 - mtu 9214 -``` - -### 3. Configure Peer-Link - -``` -interface Ethernet10 - channel-group 999 mode active - -interface Port-Channel999 - switchport mode trunk - switchport trunk group mlag-peer - spanning-tree link-type point-to-point -``` - -### 4. Configure MLAG Domain - -``` -mlag configuration - domain-id leafs - local-interface Vlan4090 - peer-address 10.0.199.255 - peer-link Port-Channel999 - dual-primary detection delay 10 action errdisable all-interfaces - peer-address heartbeat 172.16.0.50 vrf mgmt -``` - -### 5. Configure iBGP Between MLAG Peers - -``` -router bgp 65001 - neighbor underlay_ibgp peer group - neighbor underlay_ibgp remote-as 65001 - neighbor underlay_ibgp next-hop-self - neighbor 10.0.3.1 peer group underlay_ibgp - - address-family ipv4 - neighbor underlay_ibgp activate -``` - -### 6. Configure Virtual Router MAC - -``` -ip virtual-router mac-address c001.cafe.babe -``` - -This MAC is used for anycast gateway functionality across the MLAG pair. - -## L2 VXLAN Configuration - -For extending Layer 2 domains across the fabric: - -### 1. Create VLAN - -``` -vlan 40 - name test-l2-vxlan -``` - -### 2. Map VLAN to VNI - -``` -interface Vxlan1 - vxlan vlan 40 vni 110040 -``` - -### 3. 
Configure BGP EVPN for VLAN - -``` -router bgp 65001 - vlan 40 - rd 65001:110040 - route-target both 40:110040 - redistribute learned -``` - -### Key Concepts - -- **VNI (VXLAN Network Identifier)**: 24-bit segment ID (110040) -- **RD (Route Distinguisher)**: Makes routes unique (AS:VNI format) -- **RT (Route Target)**: Controls route import/export (VLAN:VNI format) -- **redistribute learned**: Advertise locally learned MAC addresses - -## L3 VXLAN Configuration - -For routing between VRFs across the fabric: - -### 1. Create VRF - -``` -vrf instance gold - -ip routing vrf gold -``` - -### 2. Map VRF to VNI - -``` -interface Vxlan1 - vxlan vrf gold vni 100001 -``` - -### 3. Configure VRF VLAN Interface - -``` -vlan 34 - name vrf-gold-subnet - -interface Vlan34 - vrf gold - ip address 10.34.34.2/24 - ip virtual-router address 10.34.34.1 -``` - -### 4. Configure BGP for VRF - -``` -router bgp 65002 - vrf gold - rd 10.0.250.13:1 - route-target import evpn 1:100001 - route-target export evpn 1:100001 - redistribute connected -``` - -### Key Concepts - -- **VRF**: Virtual Routing and Forwarding instance -- **L3 VNI**: VNI for routing between VRFs -- **Anycast Gateway**: Same gateway IP/MAC on both MLAG peers -- **Type-5 Routes**: EVPN IP prefix routes for inter-subnet routing - -## Best Practices - -### IP Addressing -1. Use consistent /31 for P2P links -2. Reserve /32 blocks for loopbacks -3. Use non-overlapping private address space - -### BGP Configuration -1. Always use peer groups for scalability -2. Set appropriate maximum-routes limits -3. Enable logging for troubleshooting -4. Use `distance bgp 20 200 200` for predictable behavior - -### VXLAN/EVPN -1. Use meaningful VNI numbers (e.g., 1XXYYY where XX is VLAN/VRF) -2. Keep RD unique per device -3. Keep RT consistent across devices in same domain -4. Enable `vxlan learn-restrict any` to avoid data-plane learning - -### MLAG -1. Always configure dual-active detection -2. Use trunk groups to isolate MLAG VLANs -3. 
Configure iBGP between peers for redundancy -4. Use consistent domain-id across pairs - -### MTU -1. Set MTU to 9214 on underlay links for VXLAN overhead -2. Ensure consistent MTU across the fabric -3. Account for 50-byte VXLAN header overhead - -### Security -1. Change default passwords immediately -2. Configure management VRF -3. Use authentication for BGP peers (not shown in lab configs) -4. Implement prefix-lists and route-maps in production - -## Verification Checklist - -After configuration, verify: - -- [ ] All BGP neighbors established -- [ ] Loopbacks reachable via underlay -- [ ] EVPN routes being exchanged -- [ ] MLAG state is Active -- [ ] VXLAN interface is up -- [ ] Remote VTEPs discovered -- [ ] MAC addresses learned via EVPN -- [ ] VRF routing working end-to-end - -Refer to [validation-commands.md](validation-commands.md) for detailed verification steps. - -## Troubleshooting Tips - -1. **No BGP neighbors**: Check IP connectivity and firewall rules -2. **No EVPN routes**: Verify `send-community extended` is configured -3. **No MAC learning**: Check VNI mapping and route-targets -4. **MLAG not working**: Verify peer-link and domain-id match -5. **No VXLAN traffic**: Check MTU and VNI configuration - -## References - -- [Arista EVPN Design Guide](https://www.arista.com/en/solutions/design-guides) -- [RFC 7432 - BGP MPLS-Based Ethernet VPN](https://tools.ietf.org/html/rfc7432) -- [RFC 8365 - A Network Virtualization Overlay Solution Using EVPN](https://tools.ietf.org/html/rfc8365) -- [Original Blog Post](https://overlaid.net/2019/01/27/arista-bgp-evpn-configuration-example/) diff --git a/docs/quick-reference.md b/docs/quick-reference.md deleted file mode 100644 index b1b5ac9..0000000 --- a/docs/quick-reference.md +++ /dev/null @@ -1,288 +0,0 @@ -# Quick Reference Guide - -Quick commands and references for the Arista EVPN-VXLAN lab. 
- -## Quick Start - -```bash -# Deploy lab -sudo containerlab deploy -t evpn-lab.clab.yml - -# Check status -sudo containerlab inspect -t evpn-lab.clab.yml - -# Destroy lab -sudo containerlab destroy -t evpn-lab.clab.yml -``` - -## Using Helper Scripts - -```bash -# Make scripts executable -chmod +x scripts/*.sh - -# Interactive deployment menu -sudo ./scripts/deploy.sh - -# Direct commands -sudo ./scripts/deploy.sh deploy -sudo ./scripts/deploy.sh status -sudo ./scripts/deploy.sh validate - -# Test connectivity -sudo bash scripts/test-connectivity.sh - -# Cleanup -sudo bash scripts/cleanup.sh -``` - -## Device Access - -### SSH Access -```bash -ssh admin@clab-arista-evpn-fabric-spine1 -ssh admin@clab-arista-evpn-fabric-leaf1 -# Password: admin -``` - -### Docker Exec -```bash -docker exec -it clab-arista-evpn-fabric-spine1 Cli -docker exec -it clab-arista-evpn-fabric-leaf1 Cli -``` - -## Management IPs - -| Device | Management IP | Loopback0 | Loopback1 | -|---------|---------------|----------------|---------------| -| spine1 | 172.16.0.1 | 10.0.250.1 | N/A | -| spine2 | 172.16.0.2 | 10.0.250.2 | N/A | -| leaf1 | 172.16.0.25 | 10.0.250.11 | 10.0.255.11 | -| leaf2 | 172.16.0.50 | 10.0.250.12 | 10.0.255.11 | -| leaf3 | 172.16.0.27 | 10.0.250.13 | 10.0.255.12 | -| leaf4 | 172.16.0.28 | 10.0.250.14 | 10.0.255.12 | -| leaf5 | 172.16.0.29 | 10.0.250.15 | 10.0.255.13 | -| leaf6 | 172.16.0.30 | 10.0.250.16 | 10.0.255.13 | -| leaf7 | 172.16.0.31 | 10.0.250.17 | 10.0.255.14 | -| leaf8 | 172.16.0.32 | 10.0.250.18 | 10.0.255.14 | - -## AS Numbers - -| Device Pair | AS Number | -|------------|-----------| -| Spines | 65000 | -| Leaf1/2 | 65001 | -| Leaf3/4 | 65002 | -| Leaf5/6 | 65003 | -| Leaf7/8 | 65004 | - -## VNI Mapping - -| VLAN/VRF | VNI | Type | VTEPs | -|----------|--------|------|----------| -| VLAN 40 | 110040 | L2 | 1, 3 | -| VRF gold | 100001 | L3 | 2, 4 | -| VLAN 34 | - | L3 | 2 | -| VLAN 78 | - | L3 | 4 | - -## Essential Show Commands - -### Quick Status Check 
-```bash -show ip interface brief -show bgp summary -show bgp evpn summary -show mlag -show vxlan vtep -``` - -### Detailed Verification -```bash -# Underlay -show ip bgp -show ip route -show bgp ipv4 unicast summary - -# Overlay -show bgp evpn -show bgp evpn route-type mac-ip -show bgp evpn route-type ip-prefix ipv4 - -# VXLAN -show interface vxlan1 -show vxlan address-table -show vxlan vni -show vxlan config-sanity - -# MLAG -show mlag detail -show mlag interfaces -show port-channel summary - -# VRF -show vrf -show ip route vrf gold -show bgp ipv4 unicast vrf gold summary -``` - -## Common Troubleshooting Commands - -```bash -# Check BGP neighbors -show ip bgp neighbors -show bgp evpn neighbors - -# Check routes -show ip route detail -show bgp evpn detail - -# Check counters -show interfaces counters errors -show vxlan counters - -# Check logs -show logging -show logging last 50 - -# Packet capture -bash tcpdump -i et11 -n port 179 -bash tcpdump -i et11 -n port 4789 -``` - -## Configuration Snippets - -### Save Configuration -```bash -write memory -# or -copy running-config startup-config -``` - -### View Configuration -```bash -show running-config -show running-config | section bgp -show running-config | section vxlan -``` - -### Enable Configuration Mode -```bash -enable -configure terminal -``` - -## Testing Connectivity - -### From Leaf Devices -```bash -# Ping loopbacks -ping 10.0.250.1 -ping 10.0.255.13 - -# Ping in VRF -ping vrf gold 10.78.78.1 - -# Traceroute -traceroute 10.0.255.14 -traceroute vrf gold 10.34.34.1 -``` - -### From Host Containers -```bash -# Enter host container -docker exec -it clab-arista-evpn-fabric-host1 sh - -# Test connectivity -ping 10.40.40.1 -``` - -## Performance Monitoring - -```bash -# Interface statistics -show interfaces ethernet 11 counters -show interfaces ethernet 11 counters rate - -# BGP statistics -show bgp evpn summary -show bgp evpn route-type mac-ip | count - -# System resources -show processes top -show version 
-``` - -## Useful Filters - -```bash -# Grep examples -show bgp evpn summary | grep Estab -show interfaces status | include up -show running-config | section vxlan - -# JSON output (for automation) -show bgp evpn summary | json -show interfaces status | json -``` - -## Lab Topology Reference - -``` - Spine1 -------- Spine2 - | | - +---------+-----------+---+----------+ - | | | | - Leaf1/2 Leaf3/4 Leaf5/6 Leaf7/8 - (VTEP1) (VTEP2) (VTEP3) (VTEP4) - | | | | - Host1 Host2 Host3 Host4 -``` - -## Feature Matrix - -| Feature | VTEP1 | VTEP2 | VTEP3 | VTEP4 | -|------------------|-------|-------|-------|-------| -| L2 VXLAN (VLAN40)| ✓ | - | ✓ | - | -| L3 VXLAN (VRF) | - | ✓ | - | ✓ | -| BGP Border | - | - | - | ✓ | -| MLAG | ✓ | ✓ | ✓ | ✓ | - -## Keyboard Shortcuts (CLI) - -``` -Ctrl+Z - Exit to privileged EXEC mode -Ctrl+C - Interrupt current command -Tab - Command completion -? - Context-sensitive help -``` - -## Reset to Factory - -```bash -# Erase startup config -enable -bash sudo /mnt/flash/zerotouch reset - -# Or manually -enable -write erase -reload -``` - -## Additional Resources - -- Full documentation: `docs/` -- Validation commands: `docs/validation-commands.md` -- Configuration guide: `docs/configuration-guide.md` -- Helper scripts: `scripts/` - -## Support - -For issues or questions: -- Check logs: `show logging` -- Review documentation in `docs/` directory -- Original blog post: https://overlaid.net/2019/01/27/arista-bgp-evpn-configuration-example/ - ---- - -**Tip**: Bookmark this page for quick reference during lab work! diff --git a/docs/validation-commands.md b/docs/validation-commands.md deleted file mode 100644 index 78929d5..0000000 --- a/docs/validation-commands.md +++ /dev/null @@ -1,375 +0,0 @@ -# Validation Commands Guide - -This document provides a comprehensive list of commands to validate the EVPN-VXLAN fabric. 
- -## Table of Contents -- [Underlay Validation](#underlay-validation) -- [Overlay Validation](#overlay-validation) -- [MLAG Validation](#mlag-validation) -- [VXLAN Validation](#vxlan-validation) -- [VRF Validation](#vrf-validation) -- [Troubleshooting](#troubleshooting) - -## Underlay Validation - -### Check BGP IPv4 Unicast Neighbors - -```bash -# On Spine -show bgp ipv4 unicast summary - -# On Leaf -show bgp ipv4 unicast summary -``` - -Expected: All neighbors in `Established` state - -### Verify Loopback Reachability - -```bash -# From any leaf, ping spine loopbacks -ping 10.0.250.1 -ping 10.0.250.2 - -# From spine, ping all leaf loopbacks -ping 10.0.250.11 -ping 10.0.250.12 -# ... etc -``` - -### Check BGP Routes - -```bash -# View all BGP routes -show ip bgp - -# View routes for specific prefix -show ip bgp 10.0.250.0/24 - -# View ECMP paths -show ip route 10.0.250.11 -``` - -Expected: Multiple equal-cost paths via both spines - -### Verify Interface Status - -```bash -# Check all interfaces -show interfaces status - -# Check specific interface -show interfaces ethernet 11 -``` - -## Overlay Validation - -### Check BGP EVPN Neighbors - -```bash -# On Spine -show bgp evpn summary - -# On Leaf -show bgp evpn summary -``` - -Expected: All EVPN neighbors in `Established` state - -### View EVPN Routes - -```bash -# Show all EVPN routes -show bgp evpn - -# Show Type-2 routes (MAC/IP) -show bgp evpn route-type mac-ip - -# Show Type-5 routes (IP Prefix) -show bgp evpn route-type ip-prefix ipv4 - -# Show routes for specific VNI -show bgp evpn vni 110040 -show bgp evpn vni 100001 -``` - -### Check Route Distinguishers and Route Targets - -```bash -# View RD/RT configuration -show running-config | section bgp - -# View imported routes -show bgp evpn route-type ip-prefix ipv4 | grep RT -``` - -## MLAG Validation - -### Check MLAG Status - -```bash -# Overall MLAG status -show mlag - -# MLAG interfaces -show mlag interfaces - -# MLAG config-sanity -show mlag config-sanity 
-``` - -Expected output: -- State: Active -- Negotiation status: Connected -- Peer-link status: Up - -### Verify Dual-Active Detection - -```bash -# Check dual-active detection status -show mlag detail | include dual - -# Verify heartbeat -show mlag detail | include Heartbeat -``` - -### Check Port-Channel Status - -```bash -# View all port-channels -show port-channel summary - -# Detailed port-channel info -show interfaces port-channel 999 -show interfaces port-channel 1 -``` - -## VXLAN Validation - -### Check VXLAN Interface - -```bash -# VXLAN interface summary -show interface vxlan1 - -# Detailed VXLAN info -show vxlan config-sanity -``` - -### Verify VTEPs - -```bash -# Show remote VTEPs -show vxlan vtep - -# Show VXLAN VNI mapping -show vxlan vni - -# Show flood VTEPs -show vxlan flood vtep -``` - -### Check VXLAN Address Table - -```bash -# Show all MAC addresses learned via VXLAN -show vxlan address-table - -# Show MAC addresses for specific VLAN -show mac address-table vlan 40 - -# Show MAC addresses for specific VNI -show vxlan address-table vni 110040 -``` - -### Verify Overlay Learning - -```bash -# Check if EVPN control plane is learning MACs -show bgp evpn route-type mac-ip - -# Compare with local MAC table -show mac address-table dynamic -``` - -## VRF Validation - -### Check VRF Configuration - -```bash -# List all VRFs -show vrf - -# VRF routing table -show ip route vrf gold - -# VRF interfaces -show ip interface vrf gold brief -``` - -### Verify VRF BGP - -```bash -# BGP summary for VRF -show bgp ipv4 unicast vrf gold summary - -# BGP routes in VRF -show bgp ipv4 unicast vrf gold -``` - -### Test VRF Connectivity - -```bash -# Ping from VRF -ping vrf gold 10.78.78.78 - -# Traceroute in VRF -traceroute vrf gold 10.78.78.78 -``` - -### Check VNI to VRF Mapping - -```bash -# Show VRF to VNI mapping -show vxlan vrf - -# Show Type-5 routes for VRF -show bgp evpn route-type ip-prefix ipv4 vrf gold -``` - -## Troubleshooting - -### General Health Checks 
-
-```bash
-# System health
-show version
-show inventory
-show environment all
-
-# Check for errors
-show logging
-show interfaces counters errors
-```
-
-### BGP Troubleshooting
-
-```bash
-# BGP process status
-show ip bgp summary
-
-# BGP neighbor details
-show ip bgp neighbors 10.0.250.1
-
-# BGP update messages
-show bgp evpn neighbors 10.0.250.1 advertised-routes
-show bgp evpn neighbors 10.0.250.1 received-routes
-```
-
-### VXLAN Troubleshooting
-
-```bash
-# VXLAN counters
-show interfaces vxlan1 counters
-
-# VXLAN flood list
-show vxlan flood vtep
-
-# Check for VXLAN errors
-show vxlan counters
-```
-
-### MLAG Troubleshooting
-
-```bash
-# MLAG detailed status
-show mlag detail
-
-# MLAG inconsistencies
-show mlag config-sanity
-
-# Port-channel LACP status
-show lacp interface
-show lacp neighbor
-```
-
-### Packet Capture
-
-```bash
-# Capture BGP packets
-bash tcpdump -i et11 -n port 179
-
-# Capture VXLAN packets
-bash tcpdump -i et11 -n port 4789
-
-# Capture on VXLAN interface
-monitor session vxlan source vxlan1 both
-```
-
-## Useful Show Commands by Category
-
-### Quick Status Commands
-```bash
-show ip interface brief
-show bgp summary
-show vxlan vtep
-show mlag
-```
-
-### Detailed Analysis Commands
-```bash
-show tech-support
-show running-config
-show ip route detail
-show bgp evpn detail
-```
-
-### Real-time Monitoring
-```bash
-watch 1 show bgp evpn summary
-watch 1 show vxlan address-table
-watch 1 show mlag
-```
-
-## Expected Normal Output Examples
-
-### Healthy BGP EVPN Summary (Leaf)
-```
-Neighbor    V  AS     MsgRcvd  MsgSent  InQ  OutQ  Up/Down   State  PfxRcd  PfxAcc
-10.0.250.1  4  65000  50       48       0    0     00:24:30  Estab  10      10
-10.0.250.2  4  65000  49       47       0    0     00:24:25  Estab  10      10
-```
-
-### Healthy MLAG Status
-```
-MLAG Status:
-state : Active
-negotiation status : Connected
-peer-link status : Up
-local-int status : Up
-system-id : c0:01:ca:fe:ba:be
-dual-primary detection : Configured
-```
-
-### Healthy VXLAN Interface
-```
-Vxlan1 is up, line protocol is up (connected)
-  Hardware is Vxlan
-  Source interface is Loopback1 and is active with 10.0.255.11
-  Replication/Flood Mode is headend with Flood List Source: EVPN
-  Remote MAC learning via EVPN
-```
-
-## Tips
-
-1. **Always check both spines and leafs** - Verify configurations are symmetric
-2. **Use 'watch' command** for real-time monitoring during changes
-3. **Check logs** if something doesn't work as expected
-4. **Verify bidirectional** connectivity and routing
-5. **Test failure scenarios** by shutting down interfaces/devices
-
----
-
-For more information, refer to:
-- [Arista EOS EVPN Documentation](https://www.arista.com/en/um-eos/eos-section-41-1-evpn)
-- [Arista VXLAN Configuration Guide](https://www.arista.com/en/um-eos/eos-vxlan)
diff --git a/hosts/README.md b/hosts/README.md
index 5723687..44bdaac 100644
--- a/hosts/README.md
+++ b/hosts/README.md
@@ -31,7 +31,7 @@ iface lo inet loopback
 
 auto bond0
 iface bond0 inet manual
-    bond-mode 4 # LACP (802.3ad)
+    bond-mode 4 # LACP (802.3ad)
     bond-miimon 100
     bond-lacp-rate 1
     bond-slaves eth1 eth2
@@ -71,5 +71,6 @@ No need to manually configure hosts after deployment - these files ensure clean,
 
 ## See Also
 
-- [HOST_INTERFACE_CONFIGURATION.md](../docs/HOST_INTERFACE_CONFIGURATION.md) - Detailed documentation
-- [DEPLOYMENT_GUIDE.md](../DEPLOYMENT_GUIDE.md) - Lab deployment instructions
+- [Main README](../README.md) - Project overview and quick start
+- [TROUBLESHOOTING.md](../TROUBLESHOOTING.md) - Troubleshooting guide
+- [END_TO_END_TESTING.md](../END_TO_END_TESTING.md) - Testing procedures
diff --git a/scripts/cleanup.sh b/scripts/cleanup.sh
deleted file mode 100644
index df34910..0000000
--- a/scripts/cleanup.sh
+++ /dev/null
@@ -1,91 +0,0 @@
-#!/bin/bash
-# Cleanup script for Arista EVPN-VXLAN lab
-
-set -e
-
-# Colors
-RED='\033[0;31m'
-GREEN='\033[0;32m'
-YELLOW='\033[1;33m'
-NC='\033[0m'
-
-print_info() {
-    echo -e "${GREEN}[INFO]${NC} $1"
-}
-
-print_warning() {
-    echo -e "${YELLOW}[WARNING]${NC} $1"
-}
-
-print_error() {
-    echo -e "${RED}[ERROR]${NC} $1"
-}
-
-echo "========================================="
-echo " EVPN-VXLAN Lab Cleanup"
-echo "========================================="
-echo ""
-
-# Check if running as root
-if [ "$EUID" -ne 0 ]; then
-    print_error "Please run with sudo"
-    exit 1
-fi
-
-# Confirm cleanup
-print_warning "This will destroy the lab and clean up all resources!"
-read -p "Are you sure you want to continue? (yes/no): " confirm
-
-if [ "$confirm" != "yes" ]; then
-    print_info "Cleanup cancelled."
-    exit 0
-fi
-
-# Destroy the lab
-print_info "Destroying ContainerLab topology..."
-if containerlab destroy -t evpn-lab.clab.yml --cleanup 2>/dev/null; then
-    print_info "Lab destroyed successfully"
-else
-    print_warning "Lab may not be running or already destroyed"
-fi
-
-# Clean up any remaining containers
-print_info "Checking for remaining lab containers..."
-containers=$(docker ps -a | grep "clab-arista-evpn-fabric" | awk '{print $1}' || true)
-if [ -n "$containers" ]; then
-    print_info "Removing remaining containers..."
-    echo "$containers" | xargs docker rm -f
-else
-    print_info "No remaining containers found"
-fi
-
-# Clean up networks
-print_info "Checking for lab networks..."
-networks=$(docker network ls | grep "evpn-mgmt" | awk '{print $1}' || true)
-if [ -n "$networks" ]; then
-    print_info "Removing lab networks..."
-    echo "$networks" | xargs docker network rm
-else
-    print_info "No lab networks found"
-fi
-
-# Clean up clab directory
-print_info "Cleaning up clab directory..."
-if [ -d "clab-arista-evpn-fabric" ]; then
-    rm -rf clab-arista-evpn-fabric
-    print_info "Removed clab directory"
-fi
-
-# Optional: Clean up docker system
-read -p "Do you want to run docker system prune? (y/N): " prune
-if [[ $prune =~ ^[Yy]$ ]]; then
-    print_info "Running docker system prune..."
-    docker system prune -f
-fi
-
-echo ""
-print_info "Cleanup complete!"
-echo ""
-print_info "To redeploy the lab, run:"
-print_info " sudo ./scripts/deploy.sh deploy"
-echo ""
diff --git a/scripts/deploy.sh b/scripts/deploy.sh
deleted file mode 100644
index 013fbab..0000000
--- a/scripts/deploy.sh
+++ /dev/null
@@ -1,248 +0,0 @@
-#!/bin/bash
-# Deployment script for Arista EVPN-VXLAN ContainerLab
-
-set -e
-
-# Colors for output
-RED='\033[0;31m'
-GREEN='\033[0;32m'
-YELLOW='\033[1;33m'
-NC='\033[0m' # No Color
-
-# Function to print colored output
-print_info() {
-    echo -e "${GREEN}[INFO]${NC} $1"
-}
-
-print_warning() {
-    echo -e "${YELLOW}[WARNING]${NC} $1"
-}
-
-print_error() {
-    echo -e "${RED}[ERROR]${NC} $1"
-}
-
-# Function to check prerequisites
-check_prerequisites() {
-    print_info "Checking prerequisites..."
-
-    # Check if running as root or with sudo
-    if [ "$EUID" -ne 0 ]; then
-        print_error "Please run with sudo"
-        exit 1
-    fi
-
-    # Check if containerlab is installed
-    if ! command -v containerlab &> /dev/null; then
-        print_error "ContainerLab is not installed. Please install it first."
-        print_info "Visit: https://containerlab.dev/install/"
-        exit 1
-    fi
-
-    # Check if docker is running
-    if ! docker info &> /dev/null; then
-        print_error "Docker is not running. Please start Docker first."
-        exit 1
-    fi
-
-    # Check if cEOS image exists
-    if ! docker images | grep -q "ceos.*4.35.0"; then
-        print_warning "cEOS 4.35.0 image not found."
-        print_info "Please import the cEOS image first:"
-        print_info " docker import cEOS64-lab-4.35.0F.tar ceos:4.35.0"
-        read -p "Do you want to continue anyway? (y/N) " -n 1 -r
-        echo
-        if [[ ! $REPLY =~ ^[Yy]$ ]]; then
-            exit 1
-        fi
-    fi
-
-    print_info "All prerequisites met!"
-}
-
-# Function to deploy the lab
-deploy_lab() {
-    print_info "Deploying Arista EVPN-VXLAN lab..."
-
-    # Deploy with containerlab
-    if containerlab deploy -t evpn-lab.clab.yml; then
-        print_info "Lab deployed successfully!"
-        echo ""
-        print_info "Lab Details:"
-        containerlab inspect -t evpn-lab.clab.yml
-        echo ""
-        print_info "Access devices using:"
-        print_info " ssh admin@"
-        print_info " Default password: admin"
-        echo ""
-        print_info "Or use docker exec:"
-        print_info " docker exec -it clab-arista-evpn-fabric-leaf1 Cli"
-    else
-        print_error "Deployment failed!"
-        exit 1
-    fi
-}
-
-# Function to display status
-show_status() {
-    print_info "Lab Status:"
-    containerlab inspect -t evpn-lab.clab.yml
-}
-
-# Function to destroy the lab
-destroy_lab() {
-    print_warning "This will destroy the entire lab!"
-    read -p "Are you sure? (y/N) " -n 1 -r
-    echo
-    if [[ $REPLY =~ ^[Yy]$ ]]; then
-        print_info "Destroying lab..."
-        containerlab destroy -t evpn-lab.clab.yml --cleanup
-        print_info "Lab destroyed successfully!"
-    else
-        print_info "Destruction cancelled."
-    fi
-}
-
-# Function to restart the lab
-restart_lab() {
-    print_info "Restarting lab..."
-    destroy_lab
-    if [[ $? -eq 0 ]]; then
-        sleep 2
-        deploy_lab
-    fi
-}
-
-# Function to show device access info
-show_access_info() {
-    print_info "Device Access Information:"
-    echo ""
-    echo "SSH Access (password: admin):"
-    echo " Spines:"
-    echo " ssh admin@clab-arista-evpn-fabric-spine1"
-    echo " ssh admin@clab-arista-evpn-fabric-spine2"
-    echo ""
-    echo " Leafs:"
-    for i in {1..8}; do
-        echo " ssh admin@clab-arista-evpn-fabric-leaf$i"
-    done
-    echo ""
-    echo "Docker Exec:"
-    echo " docker exec -it clab-arista-evpn-fabric- Cli"
-    echo ""
-    echo "Management IPs:"
-    containerlab inspect -t evpn-lab.clab.yml | grep -E "spine|leaf" | awk '{print $2, $6}'
-}
-
-# Function to run basic validation
-validate_lab() {
-    print_info "Running basic validation..."
-
-    # Check if containers are running
-    if ! docker ps | grep -q "clab-arista-evpn-fabric"; then
-        print_error "No lab containers found. Deploy the lab first."
-        exit 1
-    fi
-
-    print_info "Checking BGP EVPN status on spine1..."
-    docker exec clab-arista-evpn-fabric-spine1 Cli -p 15 -c "show bgp evpn summary"
-
-    print_info "Checking VXLAN status on leaf1..."
-    docker exec clab-arista-evpn-fabric-leaf1 Cli -p 15 -c "show vxlan vtep"
-
-    print_info "Checking MLAG status on leaf1..."
-    docker exec clab-arista-evpn-fabric-leaf1 Cli -p 15 -c "show mlag"
-
-    print_info "Validation complete! Check output above for any issues."
-}
-
-# Main menu
-show_menu() {
-    echo ""
-    echo "========================================="
-    echo " Arista EVPN-VXLAN Lab Manager"
-    echo "========================================="
-    echo "1. Deploy Lab"
-    echo "2. Show Status"
-    echo "3. Destroy Lab"
-    echo "4. Restart Lab"
-    echo "5. Show Access Info"
-    echo "6. Validate Lab"
-    echo "7. Exit"
-    echo "========================================="
-}
-
-# Main script
-main() {
-    # Check prerequisites first
-    check_prerequisites
-
-    # If arguments provided, execute directly
-    if [ $# -gt 0 ]; then
-        case "$1" in
-            deploy)
-                deploy_lab
-                ;;
-            status)
-                show_status
-                ;;
-            destroy)
-                destroy_lab
-                ;;
-            restart)
-                restart_lab
-                ;;
-            access)
-                show_access_info
-                ;;
-            validate)
-                validate_lab
-                ;;
-            *)
-                print_error "Unknown command: $1"
-                echo "Usage: $0 {deploy|status|destroy|restart|access|validate}"
-                exit 1
-                ;;
-        esac
-        exit 0
-    fi
-
-    # Interactive menu
-    while true; do
-        show_menu
-        read -p "Select option [1-7]: " choice
-        case $choice in
-            1)
-                deploy_lab
-                ;;
-            2)
-                show_status
-                ;;
-            3)
-                destroy_lab
-                ;;
-            4)
-                restart_lab
-                ;;
-            5)
-                show_access_info
-                ;;
-            6)
-                validate_lab
-                ;;
-            7)
-                print_info "Exiting..."
-                exit 0
-                ;;
-            *)
-                print_error "Invalid option. Please select 1-7."
-                ;;
-        esac
-
-        echo ""
-        read -p "Press Enter to continue..."
-    done
-}
-
-# Run main function
-main "$@"
diff --git a/scripts/test-connectivity.sh b/scripts/test-connectivity.sh
deleted file mode 100644
index a6fd884..0000000
--- a/scripts/test-connectivity.sh
+++ /dev/null
@@ -1,146 +0,0 @@
-#!/bin/bash
-# Connectivity test script for Arista EVPN-VXLAN lab
-
-set -e
-
-# Colors
-GREEN='\033[0;32m'
-RED='\033[0;31m'
-YELLOW='\033[1;33m'
-NC='\033[0m'
-
-print_test() {
-    echo -e "${YELLOW}[TEST]${NC} $1"
-}
-
-print_pass() {
-    echo -e "${GREEN}[PASS]${NC} $1"
-}
-
-print_fail() {
-    echo -e "${RED}[FAIL]${NC} $1"
-}
-
-# Test counter
-TESTS_RUN=0
-TESTS_PASSED=0
-TESTS_FAILED=0
-
-run_test() {
-    local test_name="$1"
-    local device="$2"
-    local command="$3"
-    local expected="$4"
-
-    TESTS_RUN=$((TESTS_RUN + 1))
-    print_test "$test_name"
-
-    if output=$(docker exec "clab-arista-evpn-fabric-$device" Cli -p 15 -c "$command" 2>&1); then
-        if echo "$output" | grep -q "$expected"; then
-            print_pass "$test_name"
-            TESTS_PASSED=$((TESTS_PASSED + 1))
-            return 0
-        else
-            print_fail "$test_name - Expected pattern not found"
-            TESTS_FAILED=$((TESTS_FAILED + 1))
-            return 1
-        fi
-    else
-        print_fail "$test_name - Command failed"
-        TESTS_FAILED=$((TESTS_FAILED + 1))
-        return 1
-    fi
-}
-
-echo "========================================="
-echo " EVPN-VXLAN Connectivity Tests"
-echo "========================================="
-echo ""
-
-# Test 1: BGP Underlay - Spine to Leaf
-echo "--- Testing BGP Underlay ---"
-run_test "Spine1 BGP IPv4 neighbors" "spine1" "show bgp ipv4 unicast summary" "Estab"
-run_test "Spine2 BGP IPv4 neighbors" "spine2" "show bgp ipv4 unicast summary" "Estab"
-run_test "Leaf1 BGP IPv4 neighbors" "leaf1" "show bgp ipv4 unicast summary" "Estab"
-echo ""
-
-# Test 2: BGP EVPN Overlay
-echo "--- Testing BGP EVPN Overlay ---"
-run_test "Spine1 BGP EVPN neighbors" "spine1" "show bgp evpn summary" "Estab"
-run_test "Spine2 BGP EVPN neighbors" "spine2" "show bgp evpn summary" "Estab"
-run_test "Leaf1 BGP EVPN neighbors" "leaf1" "show bgp evpn summary" "Estab"
-run_test "Leaf3 BGP EVPN neighbors" "leaf3" "show bgp evpn summary" "Estab"
-echo ""
-
-# Test 3: Loopback Reachability
-echo "--- Testing Loopback Reachability ---"
-run_test "Leaf1 can reach Spine1 loopback" "leaf1" "ping 10.0.250.1 repeat 3" "3 received"
-run_test "Leaf1 can reach Spine2 loopback" "leaf1" "ping 10.0.250.2 repeat 3" "3 received"
-run_test "Leaf1 can reach Leaf3 loopback" "leaf1" "ping 10.0.250.13 repeat 3" "3 received"
-echo ""
-
-# Test 4: MLAG Status
-echo "--- Testing MLAG ---"
-run_test "Leaf1 MLAG state" "leaf1" "show mlag" "Active"
-run_test "Leaf2 MLAG state" "leaf2" "show mlag" "Active"
-run_test "Leaf3 MLAG state" "leaf3" "show mlag" "Active"
-run_test "Leaf4 MLAG state" "leaf4" "show mlag" "Active"
-echo ""
-
-# Test 5: VXLAN Interface
-echo "--- Testing VXLAN ---"
-run_test "Leaf1 VXLAN interface" "leaf1" "show interface vxlan1" "line protocol is up"
-run_test "Leaf3 VXLAN interface" "leaf3" "show interface vxlan1" "line protocol is up"
-run_test "Leaf5 VXLAN interface" "leaf5" "show interface vxlan1" "line protocol is up"
-run_test "Leaf7 VXLAN interface" "leaf7" "show interface vxlan1" "line protocol is up"
-echo ""
-
-# Test 6: VXLAN VTEPs Discovery
-echo "--- Testing VTEP Discovery ---"
-run_test "Leaf1 discovers remote VTEPs" "leaf1" "show vxlan vtep" "10.0.255"
-run_test "Leaf3 discovers remote VTEPs" "leaf3" "show vxlan vtep" "10.0.255"
-run_test "Leaf5 discovers remote VTEPs" "leaf5" "show vxlan vtep" "10.0.255"
-echo ""
-
-# Test 7: EVPN Routes
-echo "--- Testing EVPN Routes ---"
-run_test "Leaf1 has EVPN Type-2 routes" "leaf1" "show bgp evpn route-type mac-ip" "RD:"
-run_test "Leaf3 has EVPN Type-5 routes" "leaf3" "show bgp evpn route-type ip-prefix ipv4" "RD:"
-run_test "Leaf7 has EVPN Type-5 routes" "leaf7" "show bgp evpn route-type ip-prefix ipv4" "RD:"
-echo ""
-
-# Test 8: VRF Routing
-echo "--- Testing VRF Routing ---"
-run_test "Leaf3 VRF gold exists" "leaf3" "show vrf" "gold"
-run_test "Leaf3 has routes in VRF gold" "leaf3" "show ip route vrf gold" "10."
-run_test "Leaf7 VRF gold exists" "leaf7" "show vrf" "gold"
-run_test "Leaf7 has routes in VRF gold" "leaf7" "show ip route vrf gold" "10."
-echo ""
-
-# Test 9: VRF Connectivity
-echo "--- Testing VRF Connectivity ---"
-run_test "Leaf3 can reach Leaf7 in VRF gold" "leaf3" "ping vrf gold 10.78.78.1 repeat 3" "received"
-run_test "Leaf7 can reach Leaf3 in VRF gold" "leaf7" "ping vrf gold 10.34.34.1 repeat 3" "received"
-echo ""
-
-# Test 10: ECMP Paths
-echo "--- Testing ECMP ---"
-run_test "Leaf1 has ECMP to remote loopbacks" "leaf1" "show ip route 10.0.250.13" "via"
-echo ""
-
-# Summary
-echo "========================================="
-echo " Test Summary"
-echo "========================================="
-echo "Total Tests Run: $TESTS_RUN"
-echo -e "Tests Passed: ${GREEN}$TESTS_PASSED${NC}"
-echo -e "Tests Failed: ${RED}$TESTS_FAILED${NC}"
-echo "========================================="
-
-if [ $TESTS_FAILED -eq 0 ]; then
-    echo -e "${GREEN}All tests passed!${NC}"
-    exit 0
-else
-    echo -e "${RED}Some tests failed. Check the output above.${NC}"
-    exit 1
-fi