Fixing the iBGP Next-Hop
Diagnose an external prefix that arrives over iBGP but stays INVALID, then fix it with next-hop-self.
Download the lab (topology + configs), unzip it, then from that folder run containerlab deploy -t topology.clab.yml.
This lab boots with a problem already in place. An external prefix, 3.3.3.3/32, is being advertised into your AS from a neighbor, it travels across your iBGP session, and it lands on your internal router r2. So far so good. But when you ask r2 whether it can actually use that route, the answer is no: the prefix is there, but it is INVALID. There is no > next to it, and r2 will not install it.
Your job is to figure out why a route that arrived perfectly fine is unusable, then fix it with a single line. The culprit is one of the most classic gotchas in all of iBGP: the next-hop.
Topology
The key detail: OSPF inside AS 65001 advertises the internal link (10.0.12.0/24) and the loopbacks, but not the eBGP edge subnet (10.0.13.0/24). r2 has no idea that link exists. Hold onto that fact.
Deploy the lab
Download the lab and unzip it (the download includes the topology and the router configs). From inside the unzipped folder, run:
containerlab deploy -t topology.clab.yml
That boots all three routers. The route is already flowing, so head straight for the router where the symptom shows up, r2:
docker exec -it clab-bgp-next-hop-self-r2 vtysh
Step 1: Diagnose on r2
Ask r2 what it knows about the external prefix:
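A likely form of that query, assuming it is the same command the Objectives checklist runs against r2, typed at the r2 vtysh prompt:

```text
! Assumption: same lookup the Objectives checklist uses for r2.
show ip bgp 3.3.3.3/32
```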
The prefix is present and it carries AS_PATH 65002, so it clearly made it across iBGP from r1. But look closely: the next-hop is 10.0.13.3, and there is no > marking it as the best path. FRR even flags it as not valid. The route is in the table but unusable.
Why 10.0.13.3? Because 10.0.13.3 is r3, the original eBGP next-hop. Here is the rule that bites everyone once:
When a router re-advertises an eBGP-learned route to an iBGP peer, it does not change the next-hop.
r1 learned 3.3.3.3/32 with next-hop 10.0.13.3 and handed it to r2 with that same next-hop, unchanged.
Now ask r2 whether it can actually reach that next-hop:
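One way to ask, assuming the same route lookup the Troubleshooting section uses for 1.1.1.1, pointed at the eBGP next-hop instead:

```text
! Assumption: a plain routing-table lookup for the BGP next-hop address.
show ip route 10.0.13.3
```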
You get nothing: the address is unreachable. r2 has no route to 10.0.13.0/24, because that eBGP edge subnet was deliberately left out of OSPF. BGP's rule is strict: if the next-hop is not resolvable in the routing table, the path is invalid and cannot be best. That is exactly what you are seeing. The route exists, but r2 cannot tell where to send the packets, so it refuses to use it.
The fix is not on r2. r2 is doing the right thing. The fix belongs on r1, the router that handed over a next-hop its peer can never reach.
Step 2: Fix on r1
Open r1 in another terminal:
docker exec -it clab-bgp-next-hop-self-r1 vtysh
Tell r1 to advertise itself as the next-hop on its iBGP session to r2:
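A minimal sketch of that change at the r1 vtysh prompt. The AS number 65001 and the peer address 2.2.2.2 come from this lab's text; placing the statement under the IPv4 unicast address family is an assumption about how the lab config is laid out. The neighbor statement is the single line that matters, and the rest is only the context needed to reach it:

```text
! Assumptions: local AS 65001, iBGP peer 2.2.2.2, IPv4 unicast address family.
configure terminal
 router bgp 65001
  address-family ipv4 unicast
   ! Advertise r1's own address as the next-hop toward r2.
   neighbor 2.2.2.2 next-hop-self
  exit-address-family
 end
```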
next-hop-self makes r1 rewrite the next-hop of every route it sends to r2 to its own loopback, 1.1.1.1. And 1.1.1.1 is advertised in OSPF, so r2 absolutely can reach it.
Nudge BGP to re-advertise with the new next-hop:
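The Troubleshooting section names the hard reset for this step, so the command here is most likely the following, run on r1 outside configuration mode:

```text
! Resets every BGP session on r1. Fine in a lab, disruptive in production.
clear ip bgp *
```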
(A soft refresh, clear ip bgp 2.2.2.2 soft out, works too and is gentler in production.)
Step 3: Verify on r2
Back on r2, look at the prefix again:
This time the next-hop is 1.1.1.1 and the path is marked best with a >. Confirm r2 can resolve it and that the route is now installed in the table:
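A sketch of those two checks at the r2 vtysh prompt. The first lookup is the one the Troubleshooting section describes; the second is an assumed equivalent for the external prefix itself:

```text
! Should return an OSPF route to r1's loopback.
show ip route 1.1.1.1
! Assumption: same lookup for the external prefix, now expected as a BGP route.
show ip route 3.3.3.3/32
```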
1.1.1.1 resolves via OSPF, so the BGP next-hop is valid, and 3.3.3.3/32 is now a real, usable route.
✅ Objective 1: r1's iBGP session to r2 (2.2.2.2) is Established.
✅ Objective 2: r2 has a VALID best path to 3.3.3.3/32 with next-hop 1.1.1.1.
Troubleshooting
- Next-hop still 10.0.13.3 on r2? The next-hop-self line may not have taken effect on the session yet. Re-run clear ip bgp * (or clear ip bgp 2.2.2.2 soft out) on r1, then re-check on r2.
- Route still INVALID after the next-hop changed? Confirm r2 can resolve the new next-hop: show ip route 1.1.1.1 must return an OSPF route. If it does not, check that OSPF is up (show ip ospf neighbor) and that both loopbacks are advertised.
- iBGP session not Established? It peers on loopbacks and is sourced from lo. Verify OSPF has converged so 1.1.1.1 and 2.2.2.2 are mutually reachable, then check show ip bgp summary.
Tear down
containerlab destroy -t topology.clab.yml
What you learned
- iBGP does not change the next-hop. A route learned over eBGP is passed to iBGP peers with its original next-hop intact, which is often an address that lives on the edge link and is unknown deep inside your AS.
- A BGP path is only valid if its next-hop is resolvable in the routing table. An unresolvable next-hop means no >, no best path, no installed route, even though the prefix is right there.
- neighbor <peer> next-hop-self tells a router to advertise its own (loopback) address as the next-hop to that iBGP peer, so the peer can resolve it through the IGP. It is the standard fix and is applied on the router injecting external routes into iBGP.
- An alternative fix is to carry the edge subnet into the IGP (advertise 10.0.13.0/24 in OSPF) so every internal router can resolve the original next-hop. That works, but next-hop-self is cleaner: it keeps external link subnets out of your IGP and gives you one well-known next-hop per border router.
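For comparison, the alternative fix might look like this on r1. This is only a sketch: it assumes the lab's OSPF process uses network statements and that everything lives in backbone area 0, neither of which this page confirms.

```text
! Alternative to next-hop-self, run on r1.
! Assumptions: OSPF configured with network statements, single backbone area 0.
configure terminal
 router ospf
  network 10.0.13.0/24 area 0
 end
```

In a real network you would normally also make the r3-facing interface passive in OSPF, so r1 advertises the subnet without trying to form an adjacency with the external neighbor.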
Next: put it all together from a blank slate in the bgp-ibgp-capstone challenge, where you build the IGP underlay, a full iBGP mesh, and pass an external prefix end-to-end across a transit AS.
Objectives
Run each command against your running lab, confirm what you see, and tick it off. Self-assessed for now; a hosted auto-grader will check these for you later.
r1's iBGP session to r2 (2.2.2.2) is Established.
$ docker exec -it clab-bgp-next-hop-self-r1 vtysh -c 'show ip bgp summary'
r2 has a VALID best path to 3.3.3.3/32 with next-hop 1.1.1.1 (after next-hop-self).
$ docker exec -it clab-bgp-next-hop-self-r2 vtysh -c 'show ip bgp 3.3.3.3/32'