Exceptions For Removing Private Autonomous Systems in BGP
Quite a while ago, I was assigned what seemed like a straightforward network task: turning up a simple cross connect with an external company using eBGP for route advertisement. However, this mundane task would lead me to a greater understanding of BGP quirks and exceptions.
To set the stage, our network and the other company’s network needed to advertise routes to each other on a stable and robust cross connect. We set up eBGP peering, ensuring everything was configured correctly. For this example, let’s say my company was using private BGP AS 64512 while the other company was using 65534. The peering came up without any problems. But after further verification, I realized there was a small snag that kept me from writing this off as completed.
The other company was receiving everything except one specific subnet advertisement from our router. It was rather peculiar that all the other subnets were making it over fine. Only this single subnet wasn’t making it across. Everything seemed fine on our end and the engineer I was working with at the other company claimed their configuration was equally sound. I began to doubt everything. What could possibly be wrong?
After much back-and-forth, meticulous troubleshooting, and pulling in other senior engineers, we discovered the culprit: overlapping private BGP AS numbers. Both our network and the external company’s network were using the same private AS number, though not in the obvious way you might think. The overlap caused a peculiar problem that made the remove-private-as command on our Cisco IOS router ineffective for that one subnet.
Understanding Removal
Large enterprise networks frequently use the remove-private-as command at their perimeter to sanitize routing information before it goes out to a vendor or client. The command is designed to strip private BGP AS numbers (64512-65534) from the AS_PATH attribute in BGP route advertisements. This is helpful when setting up peering relationships with other large enterprises whose own private numbering might conflict with yours. The normally useful AS_PATH detail about your internal network is no longer relevant once a route crosses between two separate entities, so it can be safely discarded.
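To make the basic idea concrete, here is a minimal Python sketch of what "stripping private AS numbers" means. It is only a simplified model, not Cisco's implementation: the function names are my own, it covers only the 16-bit private range quoted above, and it deliberately ignores the extra conditions a real router applies before it removes anything.

```python
# Simplified model of private-AS stripping. The names here are hypothetical
# and the range is the 16-bit private range quoted in the article.
PRIVATE_ASN_MIN = 64512
PRIVATE_ASN_MAX = 65534


def is_private_asn(asn: int) -> bool:
    """Return True if asn falls inside the 16-bit private range."""
    return PRIVATE_ASN_MIN <= asn <= PRIVATE_ASN_MAX


def strip_private_asns(as_path: list[int]) -> list[int]:
    """Return a copy of as_path with every private ASN removed."""
    return [asn for asn in as_path if not is_private_asn(asn)]


# A hypothetical internal path made up entirely of private ASNs.
print(strip_private_asns([65010, 65020]))  # prints []
```

The external peer then sees only the AS number the advertising router prepends, with none of the internal numbering behind it.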
But the information cleansing feature of remove-private-as is also a risk. Without deeper knowledge of the other company’s network, there is potential for conflict. According to Cisco’s documentation about removing private autonomous system numbers in BGP, we had hit an edge case that kept this specific subnet from propagating to the other company’s router.
The root cause was that the subnet we were advertising carried the other company’s AS number, 65534, in its AS_PATH attribute. Somewhere deeper in my company’s network, this subnet originated in an area that used the same private BGP AS number as the one in use on the other side of this cross connect. When our router evaluated the subnet for sanitizing with remove-private-as, it saw the other company’s private BGP AS number in the AS_PATH and consequently did not remove the private BGP AS of 65534. Per the same Cisco documentation:
If the AS_PATH contains the AS number of the eBGP neighbor, BGP does not remove the private AS number.
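The quoted exception is easy to express as a small Python sketch. This is a hypothetical model of that one rule only, with names I made up for illustration. The intermediate AS 65010 in the example path is an assumption, since the real path inside our network is not part of this story.

```python
def is_private_asn(asn: int) -> bool:
    """16-bit private range quoted in the article."""
    return 64512 <= asn <= 65534


def remove_private_as(as_path: list[int], neighbor_asn: int) -> list[int]:
    """Hypothetical model of stripping with the neighbor-AS exception."""
    if neighbor_asn in as_path:
        # Exception: the eBGP neighbor's own AS is already in the path,
        # so the path is left untouched.
        return list(as_path)
    return [asn for asn in as_path if not is_private_asn(asn)]


# Assumed internal path: the subnet originates in an area using AS 65534.
path = [65010, 65534]

print(remove_private_as(path, neighbor_asn=65534))  # prints [65010, 65534]
print(remove_private_as(path, neighbor_asn=65000))  # prints []
```

The same path is stripped cleanly toward a neighbor in any other AS, which is why every other subnet made it across and only this one did not.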
With the AS_PATH attribute unmodified, our router sent this subnet advertisement over to the other company’s router. But once the advertisement arrived, the other company’s router dropped it because it saw its own AS number in the AS_PATH, which triggered BGP’s loop prevention mechanism. Their router thought there was a routing loop and correctly discarded the route.
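The receiving side of the problem can be sketched just as briefly. The function below is a simplified, hypothetical model of the default eBGP loop check, and the example path (including the intermediate AS 65010) is assumed for illustration.

```python
def accepts_route(as_path: list[int], local_asn: int) -> bool:
    """Model the default eBGP loop check: reject a path that already
    contains the receiving router's own AS number."""
    return local_asn not in as_path


PEER_ASN = 65534  # the other company's AS in this story

# Unsanitized advertisement: our AS 64512 prepended to the assumed internal path.
print(accepts_route([64512, 65010, 65534], PEER_ASN))  # prints False

# What the peer should have received after successful stripping.
print(accepts_route([64512], PEER_ASN))  # prints True
```

From the peer's point of view nothing is broken. The route looks like one that has already passed through its own AS, so discarding it is the correct behavior.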
Realizing this overlap was the root cause took some time. We initially suspected everything from misconfigurations to legacy hardware issues. Once we pinpointed the overlapping AS numbers as the issue, it was easy to draft a solution. We had to strip the private AS numbers further within our network to resolve the BGP discard issue. Specifically, we made a router further within our network perform the remove-private-as AS_PATH stripping. After implementing the fix, the remove-private-as command worked as intended, and the subnet in question propagated over to the other company without a hitch.
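To see why moving the stripping inward helps, here is a small end-to-end Python sketch. Everything about the topology is an assumption made for illustration: a single inner router in a hypothetical AS 65010 sits between the originating area (AS 65534) and our border router (AS 64512). The model also assumes each router strips first and then prepends its own AS number. It is a sketch of the reasoning, not a replay of our actual configuration.

```python
def is_private_asn(asn: int) -> bool:
    """16-bit private range quoted in the article."""
    return 64512 <= asn <= 65534


def advertise(as_path, local_asn, neighbor_asn, strip_private):
    """Model one eBGP hop: optionally strip private ASNs (unless the
    neighbor's AS is already in the path), then prepend the local AS."""
    if strip_private and neighbor_asn not in as_path:
        as_path = [asn for asn in as_path if not is_private_asn(asn)]
    return [local_asn] + list(as_path)


def accepts_route(as_path, local_asn):
    """Default loop check on the receiving router."""
    return local_asn not in as_path


# Hypothetical topology: origin area -> inner router -> border router -> peer.
INNER_ASN, BORDER_ASN, PEER_ASN = 65010, 64512, 65534
ORIGIN_PATH = [65534]  # path as the inner router learned it


def simulate(inner_strips: bool):
    """Return the path sent to the peer and whether the peer accepts it."""
    at_border = advertise(ORIGIN_PATH, INNER_ASN, BORDER_ASN, inner_strips)
    to_peer = advertise(at_border, BORDER_ASN, PEER_ASN, strip_private=True)
    return to_peer, accepts_route(to_peer, PEER_ASN)


print(simulate(inner_strips=False))  # prints ([64512, 65010, 65534], False)
print(simulate(inner_strips=True))   # prints ([64512], True)
```

In this model the inner router can strip 65534 because its own neighbor, the border router, is in a different AS. By the time the route reaches the border, the overlapping number is gone and the exception no longer applies.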
Lessons Learned
- Double-Check AS Numbers: When dealing with eBGP, always verify that the private AS numbers anywhere in your advertised paths do not overlap with those of your external peers, even on private cross connections. It’s a simple step that can save hours of troubleshooting.
- Understand Your Tools: Knowing how commands like remove-private-as work can help you understand why things break. Cisco’s documentation is incredibly helpful for understanding the nuances of these commands, which don’t always do exactly what their names say. It was an error on my part to think the command wouldn’t have exceptions.
- Expect the Unexpected: Network configurations can be tricky, and what seems like a minor detail can lead to significant issues. Always be prepared for the unexpected and approach troubleshooting with an open mind outside of normal configuration checks.
I find it funny how a small detail like overlapping private BGP AS numbers can cause such a quagmire. It is perhaps inevitable that something intended to be used strictly inside a company’s network would cause these issues when connecting to another network. However, these experiences are what make us better network engineers, as they push us toward a better understanding of the protocol fundamentals. They teach us to be meticulous, to understand the tools at our disposal, and to always plan on learning more.
In the world of networking, it’s often the seemingly insignificant details that cause the biggest headaches. An overlap of private BGP AS numbers taught me a valuable lesson about broadening my view of a network and having a better awareness of the commands I use on a daily basis. By sharing this story, I hope to shed light on a unique yet important BGP situation and help fellow network engineers avoid wasting their maintenance windows.