# VXLAN Monitoring Without Native Paths

## The Problem

Arista's VXLAN-specific telemetry paths (`arista-exp-eos-vxlan`) don't have well-documented OpenConfig equivalents, and the native paths are not standardized.

## The Solution

**You already have VXLAN visibility** through existing subscriptions! Here's how:

### 1. VXLAN Interface Metrics (Already Collected!)

The `Vxlan1` interface IS your VXLAN endpoint. Our existing `interfaces` subscription captures:

```prometheus
# VXLAN tunnel traffic
gnmic_interfaces_interface_state_counters_in_octets{interface_name="Vxlan1"}
gnmic_interfaces_interface_state_counters_out_octets{interface_name="Vxlan1"}

# VXLAN tunnel errors
gnmic_interfaces_interface_state_counters_in_errors{interface_name="Vxlan1"}
gnmic_interfaces_interface_state_counters_out_errors{interface_name="Vxlan1"}

# VXLAN interface status
gnmic_interfaces_interface_state_oper_status{interface_name="Vxlan1"}
```

### 2. VTEP Reachability (via BGP EVPN!)

BGP EVPN neighbors = VTEP reachability:

```prometheus
# EVPN neighbor state (6 = Established when the session-state enum is
# exported as BGP4-MIB-style numbers 1-6; the peer VTEP is up)
gnmic_bgp_neighbors_neighbor_state_session_state{neighbor_address="10.0.250.13"}

# EVPN routes received = VNI propagation working
gnmic_bgp_neighbors_neighbor_afi_safis_state_prefixes_received{
  neighbor_address="10.0.250.1",
  afi_safi_name="L2VPN_EVPN"
}
```

### 3. Underlay Health = VXLAN Health

If the underlay (spine-leaf) interfaces are up and BGP is established, the VTEPs can reach each other and VXLAN tunnels can form, so underlay health is a prerequisite for overlay health:

```prometheus
# Underlay interfaces to spines (this regex matches Ethernet11 and Ethernet12)
gnmic_interfaces_interface_state_oper_status{
  interface_name=~"Ethernet1[12]",
  role="leaf"
}
```

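The dependency between underlay and overlay can be sketched in Python as a small check over values you have already scraped. The function name, the input shapes, and the "at least one uplink and one EVPN session" rule are illustrative assumptions, not something gnmic or EOS provides:

```python
from typing import Mapping


def vxlan_prerequisites_met(
    uplink_oper_up: Mapping[str, bool],
    evpn_established: Mapping[str, bool],
) -> bool:
    """Return True when the underlay should be able to carry VXLAN traffic.

    uplink_oper_up:   interface name -> True if oper-status is UP
    evpn_established: neighbor address -> True if the BGP session is Established

    Assumption: one working uplink plus one established EVPN session is
    enough for tunnels to form; tighten this if you require redundancy.
    """
    return any(uplink_oper_up.values()) and any(evpn_established.values())


# Hypothetical sample data for one leaf
uplinks = {"Ethernet11": True, "Ethernet12": False}
sessions = {"10.0.250.1": True, "10.0.250.13": False}
print(vxlan_prerequisites_met(uplinks, sessions))
```

A check like this only says the prerequisites are in place; it does not prove that traffic is actually flowing through `Vxlan1`.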
## Grafana Queries for VXLAN Monitoring

### VXLAN Tunnel Bandwidth

```promql
# VXLAN tunnel TX rate (bits/sec)
rate(gnmic_interfaces_interface_state_counters_out_octets{interface_name="Vxlan1"}[1m]) * 8

# VXLAN tunnel RX rate (bits/sec)
rate(gnmic_interfaces_interface_state_counters_in_octets{interface_name="Vxlan1"}[1m]) * 8
```

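To make the `rate(...) * 8` arithmetic concrete, here is a minimal Python sketch of the same idea: sum the counter increases across a window, treat a decrease as a counter reset, divide by elapsed time, and convert octets to bits. The function name and sample values are hypothetical, and unlike PromQL's `rate()` this sketch does not extrapolate to the window edges:

```python
from typing import Sequence, Tuple

# (unix timestamp in seconds, cumulative octet counter)
Sample = Tuple[float, int]


def bits_per_second(samples: Sequence[Sample]) -> float:
    """Average bit rate over a window of octet-counter samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to compute a rate")
    increase = 0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A lower value than before means the counter was reset, so the
        # new value itself is the increase since the reset.
        increase += cur - prev if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("timestamps must be strictly increasing")
    return increase * 8 / elapsed


# Three samples taken 30 s apart: 7500 octets in 60 s
window = [(0.0, 1000), (30.0, 4000), (60.0, 8500)]
print(bits_per_second(window))  # 1000.0
```

The two-sample minimum is why the range in `[1m]` needs to span at least two scrapes of the counter.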
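### VTEP Reachability Matrix

```promql
# Show which VTEPs can reach each other (via EVPN)
# session_state is per neighbor and has no afi_safi_name label, so filter
# to EVPN neighbors by joining on the per-AFI/SAFI prefix metric
gnmic_bgp_neighbors_neighbor_state_session_state == 6  # 6 = Established in OpenConfig BGP
  and on (source, neighbor_address)
gnmic_bgp_neighbors_neighbor_afi_safis_state_prefixes_received{
  afi_safi_name="L2VPN_EVPN"
}
```
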
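The `== 6` comparison only works if the string-valued `session-state` leaf is converted to a number before it reaches Prometheus. A minimal Python sketch of one such mapping, assuming BGP4-MIB-style numbering (the dictionary and function names are hypothetical, and your gnmic processor configuration may use different numbers):

```python
# OpenConfig BGP session-state values in enum order, numbered like the
# BGP4-MIB bgpPeerState object (idle = 1 ... established = 6).
SESSION_STATE_CODES = {
    "IDLE": 1,
    "CONNECT": 2,
    "ACTIVE": 3,
    "OPENSENT": 4,
    "OPENCONFIRM": 5,
    "ESTABLISHED": 6,
}


def session_state_code(state: str) -> int:
    """Translate a session-state string to its numeric code.

    Unknown strings map to 0 so they never compare equal to a real state.
    """
    return SESSION_STATE_CODES.get(state.strip().upper(), 0)


print(session_state_code("ESTABLISHED"))  # 6
```

Whatever mapping you use, keep it in one place so dashboard comparisons and alert thresholds agree on the same number.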
### VNI Count per VTEP

```promql
# Count of EVPN routes = rough proxy for active VNIs
# (the count also includes MAC/IP and other EVPN route types)
gnmic_bgp_neighbors_neighbor_afi_safis_state_prefixes_received{
  afi_safi_name="L2VPN_EVPN"
}
```

### VXLAN Errors

```promql
# VXLAN tunnel errors
rate(gnmic_interfaces_interface_state_counters_in_errors{interface_name="Vxlan1"}[5m])
```

## What You're Missing (and Why It's OK)

### ❌ Not Directly Available:
- Per-VNI packet/byte counters
- Individual VTEP discovery lists
- Flood list details
- VNI-to-VLAN mappings

### ✅ Why It's OK:
1. **Total VXLAN traffic** (Vxlan1 interface) is usually more useful than per-VNI counters
2. **VTEP reachability** is inferred from BGP EVPN neighbor states
3. **VNI health** is inferred from EVPN route counts
4. **Configuration info** (VNI-to-VLAN) rarely changes, so it can live in documentation

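## If You Really Need Native VXLAN Paths

### Discovery Method:

```bash
# SSH to a leaf
ssh admin@172.16.0.25

# Enter bash (from the EOS CLI)
bash

# Try to get native VXLAN paths (candidate paths, not verified).
# Assumes the goarista-style gnmi client and the default gNMI port 6030;
# check `gnmi -h` and adjust the address and credentials for your setup.
gnmi -addr localhost:6030 -username admin get origin=eos_native '/Sysdb/bridging/vxlan/status'
gnmi -addr localhost:6030 -username admin get origin=eos_native '/Smash/bridging/status/vxlanStatus'

# Or use EOS native provider in gnmi config
```
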
### Add to gnmic.yaml (if discovery works):

```yaml
subscriptions:
  arista_vxlan:
    paths:
      # If this works; the `eos_native:` prefix sets the gNMI origin
      - "eos_native:/Sysdb/bridging/vxlan/status"
    mode: stream
    stream-mode: sample
    sample-interval: 30s
    encoding: json  # Note: probably needs 'json' not 'json_ietf'
```

### Add to switch config:

```
management api gnmi
   transport grpc default
   provider eos-native
```

This exposes EOS native (Sysdb/Smash) paths over gNMI alongside the OpenConfig paths.

## Recommended Dashboard Panels

### 1. VXLAN Tunnel Bandwidth (per VTEP)

Shows the total VXLAN-encapsulated traffic sent by each leaf (the `Vxlan1` counters aggregate traffic to all remote VTEPs):

```promql
sum by (source, vtep) (
  rate(gnmic_interfaces_interface_state_counters_out_octets{
    interface_name="Vxlan1",
    role="leaf"
  }[1m]) * 8
)
```

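### 2. VTEP Connectivity Heat Map

Matrix showing which VTEPs can reach each other:

```promql
# session_state has no afi_safi_name label; join on the EVPN prefix metric
gnmic_bgp_neighbors_neighbor_state_session_state
  and on (source, neighbor_address)
gnmic_bgp_neighbors_neighbor_afi_safis_state_prefixes_received{
  afi_safi_name="L2VPN_EVPN"
}
```
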
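The shape of the data behind such a heat map can be sketched in Python as a nested mapping from each reporting switch to its neighbors. The function name, the sample hostnames, and the assumption that 6 means Established are illustrative only:

```python
from typing import Dict, Iterable, Tuple

ESTABLISHED = 6  # assumed numeric code for an Established BGP session


def reachability_matrix(
    samples: Iterable[Tuple[str, str, int]],
) -> Dict[str, Dict[str, bool]]:
    """Build {source: {neighbor_address: is_established}} from samples.

    Each sample is (source, neighbor_address, session_state_value), which
    corresponds to one series of the session-state metric.
    """
    matrix: Dict[str, Dict[str, bool]] = {}
    for source, neighbor, state in samples:
        matrix.setdefault(source, {})[neighbor] = state == ESTABLISHED
    return matrix


# Hypothetical samples from two leaves
samples = [
    ("leaf1", "10.0.250.1", 6),
    ("leaf1", "10.0.250.13", 3),
    ("leaf2", "10.0.250.1", 6),
]
matrix = reachability_matrix(samples)
for source, row in sorted(matrix.items()):
    print(source, row)
```

Rows are the reporting switches and columns are their neighbors, which is the same layout a Grafana heat map or table panel would use.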
### 3. EVPN Route Count (Proxy for VNI Health)

```promql
gnmic_bgp_neighbors_neighbor_afi_safis_state_prefixes_received{
  afi_safi_name="L2VPN_EVPN"
}
```

### 4. VXLAN vs Underlay Traffic Comparison

Compare VXLAN-encapsulated traffic against total underlay traffic:

```promql
# VXLAN traffic (overlay)
sum(rate(gnmic_interfaces_interface_state_counters_out_octets{interface_name="Vxlan1"}[1m])) * 8

# vs

# Total underlay traffic
# (Ethernet.* also matches host-facing ports; narrow the regex to uplinks if needed)
sum(rate(gnmic_interfaces_interface_state_counters_out_octets{interface_name=~"Ethernet.*"}[1m])) * 8
```

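If you want a single number out of this comparison, the two results can be combined into an overlay share. A minimal Python sketch, with a hypothetical function name and made-up rates; the two counters may account for encapsulation overhead differently, so treat the result as approximate:

```python
def overlay_share(overlay_bps: float, underlay_bps: float) -> float:
    """Fraction of underlay bandwidth attributable to VXLAN overlay traffic.

    Returns 0.0 when there is no underlay traffic, to avoid dividing by zero.
    """
    if overlay_bps < 0 or underlay_bps < 0:
        raise ValueError("rates must be non-negative")
    if underlay_bps == 0:
        return 0.0
    return overlay_bps / underlay_bps


# Hypothetical rates: 4 Gbit/s of overlay traffic on 10 Gbit/s of underlay
print(f"{overlay_share(4.0e9, 10.0e9):.0%}")  # 40%
```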
## Summary

**You already have broad VXLAN monitoring coverage** through:
- ✅ Vxlan1 interface metrics (tunnel traffic)
- ✅ BGP EVPN neighbors (VTEP reachability)
- ✅ EVPN route counts (VNI propagation)
- ✅ Underlay interface health (tunnel foundation)

This is **sufficient for production monitoring** and provides the data your Flow Plugin visualization needs.

If you discover the native Arista VXLAN paths, we can add them as an enhancement, but they're not required for a functional monitoring stack.

## Next Steps

1. **Use current config** - It's production-ready
2. **Create VXLAN dashboard** - Use the queries above
3. **Optional: Discover native paths** - If you need per-VNI details later

The beauty of this approach: **It works right now** and covers most of what you need for VXLAN monitoring!