Multi-Site BGP & Routing Audit Automation

Problem

In a large multi-site network, BGP and routing policy should be consistent between equivalent sites — but real configurations drift. A missing route-map at one site, an extra prefix filter at another, a neighbor count that doesn’t match its twin: any of these can silently break reachability or leak routes between VRFs.

Auditing this by hand means logging into dozens of devices, running the same show commands, and eyeballing thousands of lines of output for differences. In practice the audit doesn’t get done, so anomalies go undetected for months until something breaks.

Solution

The solution is an engine that collects BGP and routing state with a small, fixed set of commands per device, parses it into structured data (neighbors, advertised and received prefixes, route-maps, VRF associations), and compares equivalent sites against each other.
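
As a rough illustration of the parsing step, the sketch below turns neighbor rows into structured records. It assumes Cisco IOS-style "show ip bgp summary" output; the actual commands and platform are not stated here, and the names Neighbor, parse_summary and SAMPLE are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed row layout (IOS-style "show ip bgp summary"):
# Neighbor  V  AS  MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down  State/PfxRcd
ROW = re.compile(
    r"^(?P<ip>\d+\.\d+\.\d+\.\d+)\s+\d+\s+(?P<asn>\d+)"
    r"(?:\s+\d+){5}\s+(?P<uptime>\S+)\s+(?P<state>.+?)\s*$"
)


@dataclass(frozen=True)
class Neighbor:
    ip: str
    asn: int
    state: str
    prefixes_received: Optional[int]


def parse_summary(text: str) -> list[Neighbor]:
    """Return one Neighbor per table row; header and banner lines are skipped."""
    neighbors = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue
        last = m["state"]
        if last.isdigit():
            # A number in the last column means the session is up and
            # the value is the received-prefix count.
            neighbors.append(Neighbor(m["ip"], int(m["asn"]), "Established", int(last)))
        else:
            # Otherwise the column holds the session state (Idle, Active, ...).
            neighbors.append(Neighbor(m["ip"], int(m["asn"]), last, None))
    return neighbors


# Made-up sample output, not a real capture.
SAMPLE = """\
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.0.1        4 65001    1200    1180       42    0    0 1d02h          15
10.0.0.2        4 65002       0       0        1    0    0 never    Idle
"""

if __name__ == "__main__":
    for n in parse_summary(SAMPLE):
        print(n)
```

Frozen dataclasses keep the records hashable and comparable, which makes the later site-to-site comparison a matter of plain equality and set operations.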

It surfaces the anomalies that matter: a vpnv4 route filter applied at one site but absent at its twin, neighbor counts that don’t match between twins, prefixes advertised in one place and not another. VRF context, which the raw routing output loses, is reconstructed by joining the routing data back to the device config, so per-VRF policy is compared correctly rather than smeared together.

Architecture

SecureCRT collapsed-command collection  (4 commands / device)
    │  raw BGP + routing state
    ▼
Python parser
    ├── per-neighbor (prefixes in/out, state)
    ├── route-maps / prefix filters
    └── VRF reconstruction via config-join
    │
    ▼
Site-vs-twin comparison  ──→ anomalies (filter mismatch, neighbor delta)
    │
    ▼
Google Sheet  +  email

The engine was validated against a known-good captured fixture before it was trusted against the live fleet.

Key Decisions

Collapsed command set. Four commands per device instead of a long interactive script — a faster crawl, less load on the control plane, and something safe to run inside a change window.

VRF reconstruction via config-join. Routing tables alone lose the VRF intent. Joining the routing output back to the configuration restores per-VRF context so policy is compared apples-to-apples.
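
The join can be sketched roughly as follows, assuming IOS-style configuration in which VRF neighbors sit under "address-family ipv4 vrf NAME" blocks. The function names and the sample config are hypothetical, and keying on neighbor IP alone is a simplification: overlapping addresses across VRFs would need a richer key.

```python
import re

AF_VRF = re.compile(r"^address-family\s+ipv4\s+vrf\s+(\S+)")
NEIGHBOR = re.compile(r"^neighbor\s+(\d+\.\d+\.\d+\.\d+)\s")


def neighbor_vrf_map(config_text: str) -> dict[str, str]:
    """Map neighbor IP -> VRF name by walking the address-family blocks."""
    mapping = {}
    current_vrf = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("address-family"):
            # Entering a block: a VRF name if present, otherwise no VRF.
            m = AF_VRF.match(line)
            current_vrf = m.group(1) if m else None
        elif line == "exit-address-family":
            current_vrf = None
        else:
            m = NEIGHBOR.match(line)
            if m and current_vrf is not None:
                mapping[m.group(1)] = current_vrf
    return mapping


def attach_vrf(neighbors: list[dict], config_text: str) -> list[dict]:
    """Join parsed neighbor rows to the config; unmapped neighbors are 'global'."""
    vrf_of = neighbor_vrf_map(config_text)
    return [{**n, "vrf": vrf_of.get(n["ip"], "global")} for n in neighbors]


# Made-up config fragment for illustration.
SAMPLE_CONFIG = """\
router bgp 65000
 neighbor 10.0.0.1 remote-as 65001
 address-family vpnv4
  neighbor 10.0.0.1 activate
 exit-address-family
 address-family ipv4 vrf CUST-A
  neighbor 192.168.1.2 remote-as 65010
  neighbor 192.168.1.2 activate
 exit-address-family
"""

if __name__ == "__main__":
    rows = [{"ip": "10.0.0.1", "state": "Established"},
            {"ip": "192.168.1.2", "state": "Established"}]
    for row in attach_vrf(rows, SAMPLE_CONFIG):
        print(row)
```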

Twin comparison, not baseline. Equivalent sites are compared to each other rather than to a hand-built baseline — the same consensus philosophy used elsewhere in the toolchain. The fleet defines “normal.”
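
A minimal sketch of the twin comparison, under the assumption that each site has already been reduced to per-VRF neighbor counts and a set of applied filters. VrfState, compare_twins and the site names are hypothetical, not the engine's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class VrfState:
    neighbor_count: int = 0
    # Each filter is a label such as "vpnv4 in RM-FILTER" (assumed encoding).
    filters: set[str] = field(default_factory=set)


def compare_twins(name_a: str, site_a: dict[str, VrfState],
                  name_b: str, site_b: dict[str, VrfState]) -> list[str]:
    """Report per-VRF differences between two sites that should be equivalent."""
    findings = []
    for vrf in sorted(site_a.keys() | site_b.keys()):
        a, b = site_a.get(vrf), site_b.get(vrf)
        if a is None or b is None:
            missing = name_a if a is None else name_b
            findings.append(f"{vrf}: VRF absent at {missing}")
            continue
        if a.neighbor_count != b.neighbor_count:
            findings.append(
                f"{vrf}: neighbor delta {name_a}={a.neighbor_count} "
                f"{name_b}={b.neighbor_count}"
            )
        # Set differences give the filters present on one side only.
        for flt in sorted(a.filters - b.filters):
            findings.append(f"{vrf}: filter '{flt}' at {name_a} but not {name_b}")
        for flt in sorted(b.filters - a.filters):
            findings.append(f"{vrf}: filter '{flt}' at {name_b} but not {name_a}")
    return findings


if __name__ == "__main__":
    site_a = {"CUST-A": VrfState(4, {"vpnv4 in RM-FILTER"})}
    site_b = {"CUST-A": VrfState(3, set())}
    for finding in compare_twins("site-a", site_a, "site-b", site_b):
        print(finding)
```

Neither side is treated as authoritative: the output only states that the twins disagree, and a person decides which one is wrong.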

Fixture-validated first. Parsing routing output is unforgiving; the engine was proven against a known-good capture before anyone trusted its findings on production devices.
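
One way such a fixture check might look: a captured text is parsed and the result compared with a hand-verified expectation before the parser is pointed at live devices. The fixture below is an invented IOS-style fragment, and parse_route_maps and validate_against_fixture are hypothetical names.

```python
import re

# Inline stand-ins for a captured fixture and its hand-verified parse.
FIXTURE = """\
router bgp 65000
 address-family vpnv4
  neighbor 10.0.0.1 activate
  neighbor 10.0.0.1 route-map RM-VPNV4-IN in
  neighbor 10.0.0.2 activate
 exit-address-family
"""
EXPECTED = [("vpnv4", "10.0.0.1", "RM-VPNV4-IN", "in")]

ROUTE_MAP = re.compile(r"^neighbor\s+(\S+)\s+route-map\s+(\S+)\s+(in|out)$")


def parse_route_maps(config_text: str) -> list[tuple[str, str, str, str]]:
    """Return (address-family, neighbor, route-map, direction) tuples."""
    results = []
    afi = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("address-family "):
            afi = " ".join(line.split()[1:])
        elif line == "exit-address-family":
            afi = None
        else:
            m = ROUTE_MAP.match(line)
            if m:
                results.append((afi or "global", m.group(1), m.group(2), m.group(3)))
    return results


def validate_against_fixture() -> bool:
    """Raise if the parser output no longer matches the known-good capture."""
    actual = parse_route_maps(FIXTURE)
    if actual != EXPECTED:
        raise AssertionError(f"parser drifted from fixture: {actual!r} != {EXPECTED!r}")
    return True


if __name__ == "__main__":
    validate_against_fixture()
    print("fixture check passed")
```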

Results

BGP and routing state across dozens of sites audited in a single pass
Caught a real anomaly: a vpnv4 route filter applied at one site and missing at its twin
VRF-aware comparison (per-VRF neighbors and policy), not a flattened view
Findings delivered to a shared dashboard and emailed, so routing drift is visible rather than buried

How This Scales

Continuous route-leak monitoring — run on a schedule and alert on new policy divergence or unexpected prefixes.
Source-of-truth integration — compare live neighbor counts against intended peerings from NetBox or an inventory.
Flap detection — track neighbor state across crawls and alert on instability.
Multi-vendor — extend the parser to other routing platforms behind the same comparison engine.
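
As an example of how one of these extensions could look, flap detection might reduce to counting state changes per neighbor across successive crawls. This is only a sketch of a possible approach; the crawl format (a dict of neighbor IP to state), the function names, and the threshold are assumptions.

```python
def count_transitions(history: list[str]) -> int:
    """Number of times consecutive observations differ."""
    return sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)


def find_flapping(crawls: list[dict[str, str]], threshold: int = 2) -> dict[str, int]:
    """Return neighbors whose state changed at least `threshold` times.

    `crawls` is ordered oldest to newest. A neighbor missing from a crawl
    is recorded as "absent" so disappearances also count as changes.
    """
    neighbors = set().union(*crawls)
    flapping = {}
    for ip in sorted(neighbors):
        history = [crawl.get(ip, "absent") for crawl in crawls]
        changes = count_transitions(history)
        if changes >= threshold:
            flapping[ip] = changes
    return flapping


# Made-up crawl history for illustration.
CRAWLS = [
    {"10.0.0.1": "Established", "10.0.0.2": "Established"},
    {"10.0.0.1": "Idle", "10.0.0.2": "Established"},
    {"10.0.0.1": "Established", "10.0.0.2": "Established"},
]

if __name__ == "__main__":
    print(find_flapping(CRAWLS))
```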

Tech Stack

Collection: SecureCRT scripting (collapsed command set)
Engine: Python (BGP/route parsing, VRF reconstruction, site comparison)
Dashboard: Google Apps Script + Google Sheets
Alerting: scheduled comparison with email