My small engines shop teacher said it in high school. Countless Air Force electronics instructors said the words when I went through Electronic Warfare school. I myself even harped on it when I became an Air Force instructor, and again years after when I taught basic electronics classes at a local vo-tech center.
Always first do a visual inspection when you’re troubleshooting. Always.
It’s easy to say, and just as easy to blow right past. Like I did yesterday when troubleshooting a wireless bridge link, which cost an extra hour of troubleshooting time.
In this scenario, a farm campus is tied together by three Ubiquiti bridges. It’s an environment that I took over and cleaned up a few years ago. I had my hands full eliminating all the oddball consumer routers that were in way too many places and moving the entire environment to a manageable topology that both I and the owner could understand. I inherited two NanoStation M5 bridge links that were actually pretty well done, or so it seemed. Later, I would add a 900 MHz bridge link to get past a large stand of tall pines for a new connection, but this tale of my own shortcomings focuses on one of the M5 links.
The trouble call was for the single PC in the Robot Barn- a facility used for automatic feeding of dairy cow calves. The PC has two network connections; one goes to the modem that uplinks the robot feeders on proprietary low-voltage protocols, and the other connects to one of the M5s and ultimately back to the Meraki MX that head-ends the network. Basically, nothing was working.
A quick stop at the barn, and I found that the PC was in the kind of shape that comes when someone doesn’t know what they are doing but is trying to fix it anyway. Both adapters had all kinds of oddball, nonsensical settings. I quickly got the dairy application side up so the important robot data was at least being buffered, and it could upload to offsite servers when I got the network link figured out.
It was pretty clear that the PC was not talking back into the network, and neither was my own laptop. But… from the remote end I could get to the far-side bridge admin interface, and see that it showed link down. On the way out of the building, I took a quick look and saw this:
Then, I drove to the other end of the farm to where the root bridge is. As I walked into the building to make sure the root had link-light and such, I got distracted by one of the owners. He told me he had re-arranged some of the power cords and the monitor for the CCTV system, which are co-located with the network equipment, at about the same time the problem started. Ah-hah! I’m highly skeptical of coincidences, and bit right into the probability that THIS MUST BE THE PROBLEM. I sat down, got into the root bridge UI, and started thinking desperate thoughts. Like… even though I can get into the UI on both bridges, maybe one died on the radio side. Or maybe one of the cheap power supplies wasn’t getting it done (despite both bridges eagerly presenting their UIs to me).
For the next hour, I let myself go down goofy rabbit holes. I replaced both bridge power injectors. I dorked with settings on each bridge. I falsely concluded that one bridge or the other was at least corrupted, if not bad. My next step was to take them both down and see if I could reset them and start over getting them to talk. I walked outside with one of the owners to show her where I needed to get access to take down the root bridge- and then felt profoundly stupid.
The root bridge was not where it was supposed to be. It was lying on the metal roof, looking sadder than a country song on a Sunday morning. Remember, I inherited this bridge, along with the others. The “mast mount” was an anemic two sheet metal screws into the thin metal peak of the roof, and it’s amazing it held up as long as it did. Up I scurried, and as it was getting dark I cobbed it back into place with wire, with proper mounting to follow. And the link came back up.
LESSONS:
- When I took over responsibility for this network, I should have looked closer at the shoddy way this bridge was mounted and dealt with it then.
- Whoever hosed up the computer shouldn’t have. The owners will work with the staff to ensure that doesn’t happen again.
- I SHOULD HAVE gotten out of my vehicle and walked immediately to where I could see the root bridge installed, after having verified all at the non-root site was seemingly fine.
- I SHOULD NOT HAVE gotten starry-eyed and jumped to the conclusion that the problem came from things being touched near the network equipment.
Skipping the important visual inspection step at the root end pushed me into a trap of bad judgement that we all land in occasionally. When I realized what had happened, my mind was immediately flooded with voices from the past (including my own) saying yet again “Always do a visual inspection first!”.
Whether you’re looking for a wireless bridge lying on a roof, a burnt-out resistor on a circuit board, a corroded Ethernet jack, or a damaged fiber cable, a quick once-over with the eyes is sound practice before you start digging in on configurations.
Had I followed my own guidance, I would have had my client back in service a lot quicker.
(And yes… I did make sure all of the other bridges were mounted right before I left!)