by Craig Roman
A broadcast storm occurs when a network is overwhelmed by continuous broadcast or multicast traffic. It happens when nodes send broadcast frames onto a network link and the other network devices respond by rebroadcasting them back onto that same link, multiplying the traffic with every pass. Eventually the overload saturates the network and brings communication to a halt. It is a system admin’s worst nightmare (especially on a Monday morning!) and promises a very interesting day, to say the least!
So, what to do when your forecast as the system admin at Findaway World for the day is cloudy with a chance of a broadcast storm??
Weather Today: Findaway World – Solon, Ohio
Mostly clear **WARNING** Chance of broadcast storm
early morning and throughout the afternoon.
High: 67° F Low: 49° F
Feels Like: 58°
But who cares if I can’t check email and connect to NetSuite?!
Best plan: Be prepared. Have a resolution and an escalation process in place. And be thankful that your boss has a network background and is willing to roll up his sleeves and get involved.
When troubleshooting, the first rule is to never go into an issue with preconceived notions about what is going on. A core switch had gone bad last month, and I had replaced it. In a previous life I had port security enabled switch by switch and port by port via a MAC address table, with all inactive ports disabled as a rule, so I wasn’t even considering an issue on the client side. It’s got to be the switch, right? Wrong… Here is the story…
10:53 am. Wham… the network goes down hard for the whole company. No email. No phone. No nuthin’. The first order of business is to isolate the problem, while reminding yourself that when you inherit a network, you might not have the full picture in your initial diagnosis. First step: isolate the core switch and see if the issue persists. No. Next, plug in the secondary core switch and bang, the network goes down again. Next, isolate the secondary switch, unplugging the fiber link as well as the matrix module. But the issue is still there on the second switch!! What is going on?? Hours have now passed and Findaway is not operational, so it’s time to call in the Big Guns: get the network guy who actually designed the network on the phone.
Suddenly your boss remembers that there is a CAT uplink cable to an edge switch that was overlooked and never removed. Run to the wiring closet in the AP Department, console into the edge switch, and do the following:
At the prompt, check port utilization: okay, the edge switch is flat-lined at 100% utilization.
Type show lldp info remote-device and BAM! Ports 16 and 18 report identical remote devices.
The switch is looped.
Unplug port 18 and the network is back.
Do the happy dance.
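For anyone who has never chased a loop this way: on a ProCurve-style switch (the `show lldp info remote-device` syntax above suggests one), the giveaway looks something like the sketch below. The MAC address and column layout here are illustrative, not from the actual incident; check your platform’s exact output:

```
switch# show lldp info remote-device

 LLDP Remote Devices Information

  LocalPort | ChassisId       PortId  SysName
  --------- + --------------- ------- ----------------
  16        | 001122-334455   1       some-device
  18        | 001122-334455   1       some-device
```

When two local ports report the same remote chassis ID and port, the switch is seeing the same neighbor twice: the traffic is going out one port and coming back in the other. That is your loop.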
Now find the location on the patch panel and identify the cubicle in question. Proceed over to the cubicle to find a dumb hub plugged into two RJ45 ports on the wall. “REVERB!” In this case spanning-tree is not going to help.
Yank the hub out of the wall as a co-worker saunters by and says, “Oh yeah, that was plugged in this morning.” Yup, at 10:53 am. : )
Note to self: always strive to be proactive. In the immediate term, enable loop protection on the ports in that department to prevent a repeat incident. Done. And remember, troubleshooting is often an art and not always a science. Be open to a range of possibilities.
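As a rough sketch of that proactive fix on a ProCurve-style switch, it might look like the following. The port range and timer value are assumptions for illustration; verify the exact `loop-protect` syntax for your platform:

```
switch# configure
switch(config)# loop-protect 1-24
switch(config)# loop-protect disable-timer 300
switch(config)# show loop-protect
```

With loop protection enabled, the switch disables a port when it sees its own loop-protect frames come back in, and the disable-timer re-enables the port automatically after the configured interval, so a yanked hub does not leave a port dead forever.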