Quick Start Guide - EVPN-VXLAN Monitoring Stack
Prerequisites
- ContainerLab topology deployed with a management network named evpn-mgmt
- Docker and Docker Compose installed
- gNMI enabled on all switches (should already be configured)
Deployment Steps
1. Deploy the Monitoring Stack
# Navigate to monitoring directory
cd monitoring
# Start all services
docker-compose up -d
# Verify all services are running
docker-compose ps
# Expected output:
# NAME STATUS PORTS
# gnmic Up (healthy) 0.0.0.0:9804->9804/tcp
# prometheus Up (healthy) 0.0.0.0:9090->9090/tcp
# grafana Up (healthy) 0.0.0.0:3000->3000/tcp
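To avoid reading the status table by eye, the check above can be scripted. The helper below is a minimal sketch, not part of the lab repository: `unhealthy_services` is a hypothetical name, and it assumes the `docker-compose ps` output has a header on the first line and the container name in the first column, as in the expected output shown above. It also assumes each service defines a healthcheck, since otherwise "(healthy)" never appears.

```shell
# Print the names of services whose status row lacks "(healthy)".
# Reads `docker-compose ps` output on stdin.
# Assumptions: line 1 is a header, the first column is the container name,
# and a row of dashes (older docker-compose releases) is a separator.
unhealthy_services() {
  awk 'NR > 1 && NF > 0 && $1 !~ /^-+$/ && $0 !~ /\(healthy\)/ { print $1 }'
}
```

Usage: `docker-compose ps | unhealthy_services`. Empty output means every listed service reports healthy.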
2. Verify gnmic is Collecting Metrics
# Check gnmic logs
docker logs gnmic
# Should see successful subscription messages like:
# "starting connection to target 'spine1'"
# "target 'spine1' gNMI connection established"
# Check metrics endpoint
curl http://localhost:9804/metrics | grep gnmic_interfaces | head -5
# Should see interface metrics:
# gnmic_interfaces_interface_state_counters_in_octets{...} 12345
# gnmic_interfaces_interface_state_counters_out_octets{...} 67890
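A quick way to confirm that gnmic is exporting more than a handful of series is to count the sample lines for a metric prefix. This is a small sketch; `count_samples` is a hypothetical helper name, and it only assumes the endpoint returns the standard Prometheus text format.

```shell
# Count sample lines whose metric name starts with the given prefix.
# Reads Prometheus text-format metrics on stdin. Comment lines (# HELP, # TYPE)
# start with "#", so they never match a metric-name prefix.
count_samples() {
  prefix=$1
  awk -v p="$prefix" 'index($0, p) == 1 { n++ } END { print n + 0 }'
}
```

Usage: `curl -s http://localhost:9804/metrics | count_samples gnmic_interfaces`. A result of 0 suggests the subscriptions are not delivering interface data yet.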
3. Verify Prometheus is Scraping
# Check Prometheus targets
curl http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | {job, health}'
# Should show gnmic target as "up":
# {
# "job": "gnmic",
# "health": "up"
# }
# Query a specific metric
curl -G http://localhost:9090/api/v1/query \
--data-urlencode 'query=gnmic_interfaces_interface_state_counters_out_octets{source="spine1"}' \
| jq '.data.result[0]'
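The octet counters queried above are cumulative, so a bandwidth figure needs `rate()` and a multiplication by 8 to convert bytes per second to bits per second. The sketch below only builds the PromQL expression; `bps_query` is a hypothetical helper name, and the default 1m window is an assumption that should be widened if the scrape interval is long enough that 1m covers fewer than two samples.

```shell
# Build a PromQL expression for outbound throughput in bits per second.
# $1: value of the "source" label (for example spine1)
# $2: rate window (assumed default: 1m)
bps_query() {
  printf 'rate(gnmic_interfaces_interface_state_counters_out_octets{source="%s"}[%s]) * 8\n' \
    "$1" "${2:-1m}"
}
```

Usage: `curl -sG http://localhost:9090/api/v1/query --data-urlencode "query=$(bps_query spine1)" | jq '.data.result[0]'`.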
4. Access Grafana
- Open browser: http://localhost:3000
- Login (optional): admin/admin
  - Or use anonymous access (Viewer role)
- Navigate to dashboards:
  - Dashboards → Browse
  - Select "EVPN-VXLAN Fabric Flow Topology"
5. Generate Traffic (Optional)
To see bandwidth visualization in action:
# From your lab directory (not monitoring/)
cd ..
# Generate traffic between clients
# (Assumes you have traffic generation scripts)
bash scripts/generate-traffic.sh
Accessing the Stack
Service URLs
| Service | URL | Credentials |
|---|---|---|
| Grafana | http://localhost:3000 | admin/admin or anonymous |
| Prometheus | http://localhost:9090 | None |
| gnmic metrics | http://localhost:9804/metrics | None |
Available Dashboards
- EVPN-VXLAN Fabric Flow Topology (fabric-flow-topology.json)
  - Interactive flowchart of fabric topology
  - Real-time bandwidth overlays on links
  - Spine and leaf interface graphs
- Fabric Overview (fabric-overview.json)
  - General fabric statistics
  - Device health overview
Troubleshooting
Problem: gnmic not collecting data
Check switch gNMI configuration:
# SSH to any switch
ssh admin@172.16.0.1
# Verify gNMI is enabled
show management api gnmi
# Should show:
# Enabled: yes
# Transport: GRPC
If not enabled, add to switch configs:
management api gnmi
transport grpc default
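The gNMI endpoint can also be probed from the monitoring host with the gnmic CLI, without logging in to the switch. The function below is a hedged sketch: `gnmi_probe`, `GNMI_USER`, and `GNMI_PASS` are hypothetical names, port 6030 and the `--insecure` (no TLS) flag are assumptions about this lab, and the gnmic binary is assumed to be installed on the host.

```shell
# Probe a switch's gNMI endpoint with a Capabilities request.
# $1: host or IP of the switch
# $2: gNMI port (6030 is an assumption; adjust to your switch config)
# Credentials come from GNMI_USER / GNMI_PASS (hypothetical variable names).
gnmi_probe() {
  host=$1
  port=${2:-6030}
  if gnmic -a "$host:$port" -u "${GNMI_USER:-admin}" -p "${GNMI_PASS:-}" \
       --insecure capabilities >/dev/null 2>&1; then
    echo "$host: gNMI reachable"
  else
    echo "$host: gNMI unreachable" >&2
    return 1
  fi
}
```

Usage: `GNMI_PASS=... gnmi_probe 172.16.0.1`. A failure here, combined with gNMI showing as enabled on the switch, points to a network or credential problem rather than switch configuration.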
Problem: Prometheus shows no data
Check:
# 1. Verify gnmic is exposing metrics
curl http://localhost:9804/metrics | grep gnmic
# 2. Check Prometheus logs
docker logs prometheus | tail -20
# 3. Check Prometheus config is valid
docker exec prometheus promtool check config /etc/prometheus/prometheus.yml
Problem: Grafana dashboard shows "No Data"
Check:
- Prometheus datasource: Configuration → Data Sources → Prometheus
  - URL should be: http://prometheus:9090
  - Click "Save & Test"; it should show a green "Data source is working" message
- Query in Explore:
  - Menu → Explore
  - Select "Prometheus" datasource
  - Run query: gnmic_interfaces_interface_state_counters_out_octets
  - Should return results
- Time range: Ensure dashboard time range shows recent data (last 1h)
Problem: Flow diagram not rendering
Check:
- Plugin installed:
  docker exec grafana grafana-cli plugins ls | grep agenty
  Should show: agenty-flowcharting-panel
- If missing, reinstall:
  docker-compose down
  docker-compose up -d
Stopping the Stack
# Stop all services
docker-compose down
# Stop and remove volumes (fresh start)
docker-compose down -v
Updating Configuration
Update gnmic subscriptions
- Edit gnmic/gnmic.yaml
- Restart gnmic:
  docker-compose restart gnmic
Update Prometheus scrape config
- Edit prometheus/prometheus.yml
- Reload Prometheus (no restart needed; requires Prometheus to be started with --web.enable-lifecycle):
  curl -X POST http://localhost:9090/-/reload
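A plain `curl -X POST` prints nothing on success, so a failed reload is easy to miss. The wrapper below is a minimal sketch that checks the HTTP status code; `reload_prometheus` is a hypothetical helper name. Any status other than 200 is treated as a failure, and a common cause is the lifecycle API not being enabled on the Prometheus container.

```shell
# Ask Prometheus to reload its configuration and report the outcome.
# $1: base URL (default http://localhost:9090)
reload_prometheus() {
  base=${1:-http://localhost:9090}
  # -w '%{http_code}' prints only the status code; the body is discarded.
  code=$(curl -s -o /dev/null -w '%{http_code}' -X POST "$base/-/reload")
  if [ "$code" = "200" ]; then
    echo "reload ok"
  else
    echo "reload failed (HTTP $code)" >&2
    return 1
  fi
}
```

Usage: `reload_prometheus` after editing prometheus/prometheus.yml. If it fails, `docker-compose restart prometheus` applies the new config instead.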
Update Grafana dashboards
- Edit JSON files in grafana/dashboards/ (or update via the UI and export)
- Restart Grafana:
  docker-compose restart grafana
Next Steps
- Explore metrics: Use Grafana Explore or the Prometheus UI to see all available metrics
- Create custom dashboards: Build specific views for your use cases
- Add alerting: Configure Prometheus alerting rules
- Add more visualizations: Enhanced BGP, VXLAN, and MLAG dashboards
Useful Commands
# View logs for all services
docker-compose logs -f
# View logs for specific service
docker-compose logs -f gnmic
# Restart specific service
docker-compose restart prometheus
# Check resource usage
docker stats gnmic prometheus grafana
# Execute command in container
docker exec -it gnmic sh
Support
- gnmic: https://gnmic.openconfig.net
- Prometheus: https://prometheus.io/docs
- Grafana: https://grafana.com/docs
- Flow Plugin: https://grafana.com/grafana/plugins/agenty-flowcharting-panel/
For issues specific to this lab, check the main repository documentation.