Update troubleshooting guide

Improve formatting and add details for clarity.
2025-11-29 16:39:21 +00:00
parent fb682fdb19
commit c3aef36a8e


@@ -20,11 +20,13 @@ This guide provides systematic troubleshooting steps for Arista EVPN-VXLAN fabri
## 🔍 Troubleshooting Methodology
**Always troubleshoot bottom-up:**
```
Physical Links → MLAG → Underlay BGP → Overlay EVPN → VXLAN → Traffic Flow
```
**For each layer:**
1. ✅ Verify expected state
2. ❌ Identify issues
3. 🔧 Apply fixes
@@ -50,6 +52,7 @@ show interfaces Ethernet11 | include error|drop|discard
```
**Expected Output:**
```
Ethernet11 is up, line protocol is up (connected)
Hardware is Ethernet, address is 001c.7300.000b
@@ -58,6 +61,7 @@ Ethernet11 is up, line protocol is up (connected)
```
**Troubleshooting:**
- `down/down` → Physical issue (cable, peer interface)
- `up/down` → Layer 2 issue (switchport config, STP)
- Check MTU: Should be **9214** on underlay P2P links
@@ -82,6 +86,7 @@ show mlag interfaces
```
**Expected Output (show mlag):**
```
MLAG Configuration:
domain-id : leafs
@@ -101,7 +106,7 @@ dual-primary detection : Configured
**Troubleshooting:**
| Issue | Cause | Fix |
| ------------------------- | ----------------------------------- | ----------------------------------------- |
| state: `Inactive` | Peer-link down | Check Po999 and Ethernet10 |
| negotiation: `Connecting` | VLAN4090 issue | Verify IP addressing, peer-address config |
| peer-link: `Down` | Port-Channel999 down | Check `show port-channel 999` |
@@ -123,12 +128,14 @@ show lacp interface Port-Channel999
```
**Expected Output:**
```
Port Channel Port-Channel999 (Fallback State: Unconfigured):
Active Ports: Ethernet10
```
**Troubleshooting:**
- No active ports → Check `show interfaces Ethernet10`
- Wrong mode → Should be `switchport mode trunk`
- Missing VLANs → Check `switchport trunk group mlag-peer`
@@ -151,12 +158,14 @@ show lacp neighbor
```
**Expected Output (show port-channel 1):**
```
Port Channel Port-Channel1 (Fallback State: individual):
Active Ports: Ethernet1
```
**Expected Output (show mlag interfaces):**
```
local/remote
mlag desc state local remote status
@@ -167,7 +176,7 @@ Active Ports: Ethernet1
**Troubleshooting:**
| Issue | Cause | Fix |
| --------------------- | ---------------------------- | -------------------------------- |
| `inactive` | MLAG peering down | Fix MLAG first (section 2.1) |
| `active-partial` | Remote Po1 down on peer leaf | Check peer leaf's Po1 |
| `configured-inactive` | Missing `mlag 1` config | Add `mlag 1` to Po1 |
@@ -186,6 +195,7 @@ ping vrf default 10.0.3.1 source 10.0.3.0
```
**Expected:**
- Interface: `up/up`
- Ping: Successful
@@ -206,6 +216,7 @@ show ip bgp neighbor 10.0.1.1
```
**Expected Output:**
```
Neighbor   V  AS     MsgRcvd  MsgSent  InQ  OutQ  Up/Down   State  PfxRcd  PfxAcc
10.0.1.1   4  65001  245      243      0    0     02:01:23  Estab  2       2
@@ -224,6 +235,7 @@ show bgp peer-group underlay
```
**Expected neighbors:**
- eBGP to both spines (state: `Estab`)
- iBGP to MLAG peer (state: `Estab`)
@@ -250,10 +262,12 @@ ping 10.0.255.14 source 10.0.255.11
```
**Expected:**
- All pings successful
- RTT < 10ms (virtual environment)
**Troubleshooting:**
```bash
# Check routing table
show ip route
@@ -266,6 +280,7 @@ show ip bgp neighbors 10.0.1.0 advertised-routes
```
**Common issues:**
- Missing `network 10.0.250.X/32` in BGP config
- Missing `network 10.0.255.X/32` (VTEP loopback!)
- BGP neighbor not activated in IPv4 address-family
@@ -283,6 +298,7 @@ show ip route 10.0.250.13 detail
```
**Expected Output:**
```
B E 10.0.250.13/32 [20/0] via 10.0.1.0, Ethernet11
via 10.0.2.0, Ethernet12
@@ -307,6 +323,7 @@ show bgp evpn neighbor 10.0.250.11
```
**Expected:**
- All 8 leafs in `Estab` state
- PfxRcd > 0 (receiving EVPN routes)
@@ -318,6 +335,7 @@ show bgp evpn summary
```
**Expected:**
- Both spines in `Estab` state
- PfxRcd > 0
@@ -345,6 +363,7 @@ show bgp evpn route-type mac-ip
```
Output should show:
- Local MACs (learned on Port-Channel1)
- Remote MACs (from other VTEPs via EVPN)
@@ -355,6 +374,7 @@ show bgp evpn route-type ip-prefix ipv4
```
Output should show:
- Local subnets (e.g., 10.34.34.0/24 on VTEP2)
- Remote subnets (e.g., 10.78.78.0/24 from VTEP4)
@@ -363,6 +383,7 @@ Output should show:
### 4.3 Troubleshoot EVPN Issues
**No EVPN neighbors:**
```bash
# Check if EVPN is activated
show running-config | section evpn
@@ -373,6 +394,7 @@ show running-config | section evpn
```
**No EVPN routes received:**
```bash
# Check route-target configuration
show running-config | section vlan 40
@@ -385,6 +407,7 @@ show running-config | section vlan 40
```
**EVPN routes received but not installed:**
```bash
# Check VXLAN interface
show interfaces Vxlan1
@@ -414,6 +437,7 @@ show vxlan address-table
```
**Expected Output (show interfaces Vxlan1):**
```
Vxlan1 is up, line protocol is up (connected)
Hardware is Vxlan
@@ -428,6 +452,7 @@ Vxlan1 is up, line protocol is up (connected)
```
**Expected Output (show vxlan vtep):**
```
Remote VTEPS for Vxlan1:
@@ -458,6 +483,7 @@ show mac address-table vlan 40
```
**Expected Output:**
```
Mac Address Table
------------------------------------------------------------------
@@ -483,6 +509,7 @@ show vxlan address-table vlan 40
```
**Expected Output:**
```
Vxlan Mac Address Table
----------------------------------------------------------------------
@@ -506,6 +533,7 @@ Both hosts in VLAN 40 (10.40.40.0/24)
#### Step 1: Host Sends Packet
**On host1:**
```bash
docker exec -it clab-arista-evpn-fabric-host1 sh
@@ -520,6 +548,7 @@ ping 10.40.40.103
```
**Expected:**
- bond0: `state UP`
- bond0.40: `state UP`
@@ -540,6 +569,7 @@ show mac address-table dynamic vlan 40
```
**Traffic flow:**
```
host1:bond0.40 → [802.1Q VLAN 40] → leaf1:Eth1 → Po1
```
@@ -567,6 +597,7 @@ show vxlan address-table address 00c1.ab00.0033
```
**Encapsulation:**
```
Original: [Eth: host1→host3][IP: 10.40.40.101→103][ICMP]
@@ -594,6 +625,7 @@ show ip route 10.0.255.13
ECMP: Packet can go via spine1 OR spine2!
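To illustrate why one flow sticks to a single spine while different flows spread across both, here is a minimal Python sketch of per-flow ECMP selection. It is an illustration only: real switches use hardware hash functions, `pick_next_hop` is a hypothetical name, and the source VTEP address and UDP source ports are assumed values. What it does reflect is that VXLAN senders typically vary the outer UDP source port per inner flow, which gives the underlay something to hash on.

```python
import zlib


def pick_next_hop(flow: tuple, next_hops: list[str]) -> str:
    """Pick one next hop per flow by hashing the outer 5-tuple."""
    key = "|".join(str(field) for field in flow).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]


spines = ["spine1", "spine2"]

# Outer headers: (src IP, dst IP, protocol, src port, dst port).
# 10.0.255.13 is the remote VTEP from the guide; the source address and the
# UDP source ports are assumptions made for this example.
flow_a = ("10.0.255.11", "10.0.255.13", 17, 49152, 4789)
flow_b = ("10.0.255.11", "10.0.255.13", 17, 50001, 4789)

for flow in (flow_a, flow_b):
    print(flow[3], "->", pick_next_hop(flow, spines))
```

The same flow always maps to the same spine, so packets within a flow are not reordered. Which spine a given flow lands on depends entirely on the hash.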
**Spine forwards based on outer IP:**
```bash
# On spine1
show ip route 10.0.255.13
@@ -616,12 +648,14 @@ show interfaces Vxlan1 | include packets
```
**Decapsulation:**
```
VXLAN packet → Strip outer IP/UDP/VXLAN headers
→ Original frame: [Eth: host1→host3][IP: 10.40.40.101→103][ICMP]
```
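The VXLAN part of the encapsulation and decapsulation steps can be sketched in a few lines of Python. This is a minimal sketch of the 8-byte VXLAN header only (flags byte with the VNI-valid bit, 24-bit VNI, reserved bits zero, per RFC 7348). It deliberately omits the outer Ethernet/IP/UDP headers. `vxlan_encap` and `vxlan_decap` are hypothetical names, and VNI 110040 is the VLAN 40 mapping used elsewhere in this guide.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag defined in RFC 7348


def vxlan_encap(inner_frame: bytes, vni: int) -> bytes:
    """Prepend an 8-byte VXLAN header to the original Ethernet frame."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits).
    # Word 2: VNI (24 bits) + reserved (8 bits).
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decap(packet: bytes) -> tuple[int, bytes]:
    """Strip the VXLAN header and return (vni, original frame)."""
    flags_word, vni_word = struct.unpack("!II", packet[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag not set")
    return vni_word >> 8, packet[8:]


frame = b"[Eth: host1->host3][IP][ICMP]"  # placeholder for the inner frame
vni, recovered = vxlan_decap(vxlan_encap(frame, 110040))
print(vni, recovered == frame)
```

The receiving VTEP uses the recovered VNI to select the local VLAN before consulting its MAC table.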
**Leaf5 checks MAC table:**
```bash
show mac address-table address 00c1.ab00.0033
@@ -638,6 +672,7 @@ leaf5:Vxlan1 → VLAN 40 → Po1 → Eth1 → host3:bond0.40
```
**On host3:**
```bash
docker exec -it clab-arista-evpn-fabric-host3 sh
@@ -685,6 +720,7 @@ leaf1:Eth11 ──────► spine1 ──────► leaf5:Eth11 ─
### Issue 1: Ping Fails Between Hosts in Same VLAN
**Symptoms:**
- Host1 cannot ping Host3 (both VLAN 40)
- MACs not learning
@@ -723,7 +759,7 @@ show vxlan address-table vlan 40
**Common Causes:**
| Issue | Fix |
| -------------------- | ------------------------------------------------- |
| Port-Channel down | Check LACP, add fallback config |
| MLAG not synced | Fix MLAG peering (VLAN 4090) |
| VNI not configured | Add `vxlan vlan 40 vni 110040` |
@@ -735,6 +771,7 @@ show vxlan address-table vlan 40
### Issue 2: Ping Fails Between VRFs (L3 VXLAN)
**Symptoms:**
- host2 (10.34.34.102) cannot ping host4 (10.78.78.104)
- Both in VRF gold
@@ -762,7 +799,7 @@ show ip virtual-router
**Common Causes:**
| Issue | Fix |
| ---------------------- | --------------------------------------------- |
| SVI not in VRF | Add `vrf gold` under `interface Vlan34` |
| VRF not mapped to VNI | Add `vxlan vrf gold vni 100001` |
| Route-target mismatch | Verify `route-target both evpn 1:100001` |
@@ -773,6 +810,7 @@ show ip virtual-router
### Issue 3: MLAG Port-Channel Inactive
**Symptoms:**
```
show mlag interfaces
# mlag 1: configured-inactive
@@ -797,6 +835,7 @@ show running-config interfaces Port-Channel1
```
**Fix:**
- Ensure BOTH leafs have `mlag 1` configured
- Ensure MLAG peering is up first
- Check peer leaf's Port-Channel status
@@ -806,6 +845,7 @@ show running-config interfaces Port-Channel1
### Issue 4: LACP Not Establishing
**Symptoms:**
```
show port-channel 1
# No Active Ports
@@ -814,6 +854,7 @@ show port-channel 1
```
**Fix:**
```bash
# Add LACP fallback
configure
@@ -823,6 +864,7 @@ interface Port-Channel1
```
**Verify:**
```bash
show port-channel 1
# → Should show Ethernet1 in "Active Ports" (fallback mode)
@@ -837,6 +879,7 @@ show lacp neighbor
### Issue 5: BGP EVPN Neighbors Not Establishing
**Symptoms:**
```
show bgp evpn summary
# Neighbors stuck in "Connect" or "Active" state
@@ -861,6 +904,7 @@ show log | include BGP|EVPN
```
**Common Fixes:**
- Add `neighbor evpn activate` in `address-family evpn`
- Check `update-source Loopback0` is configured
- Verify `ebgp-multihop 3` for leaf-spine peering
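The three fixes above can be checked against a saved running-config before logging in to each leaf. The following is a minimal Python sketch under stated assumptions: `missing_statements` is a hypothetical helper, `evpn` is assumed to be the peer-group name (as the first fix suggests), and the sample configuration, including AS 65011, is invented for illustration.

```python
# Statements the Common Fixes list expects, matched at the end of a config line.
REQUIRED = (
    "neighbor evpn activate",
    "update-source Loopback0",
    "ebgp-multihop 3",
)


def missing_statements(running_config: str) -> list[str]:
    """Return the required statements that no config line ends with."""
    lines = [line.strip() for line in running_config.splitlines()]
    return [
        statement
        for statement in REQUIRED
        if not any(line.endswith(statement) for line in lines)
    ]


# Hypothetical config fragment with ebgp-multihop deliberately left out.
SAMPLE_CONFIG = """\
router bgp 65011
   neighbor evpn peer group
   neighbor evpn update-source Loopback0
   !
   address-family evpn
      neighbor evpn activate
"""

print(missing_statements(SAMPLE_CONFIG))
```

This only confirms that the statements exist somewhere in the text. It does not confirm that they sit under the correct neighbor or address-family.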