The Horrible Bags We Hold For WLAN Vendors

Conventional wisdom says that “you get what you pay for” and “buy the best that you can afford” when it comes to quality in networking gear. Yeah… if only. Let me share what one of the most expensive solutions on the market gets you if you’re not careful. No vendor names will be named.

The call comes in. “Suddenly in this one area, I can see the Wi-Fi signal but just can’t get on the network. If I walk down the hallway the same device gets right on.” You look and see that the AP serving the area in question has the same uptime as those around it. The radios are on, and there are clients seemingly associated. Channel utilization is low on both radios, and there is no sign of RF trouble. Hmmm.

So you methodically rule everything out, and the end user who trusts that you keep a tight wireless ship waits. You’re both going on the assumption that the WLAN building blocks that you shell out fat coin for should be an operational foundation that you can trust. But when you’ve factored out all of the realistic possibilities, that little voice in your head starts questioning how solid that foundation is.

Too often, the one thing that we have very little control over (code) is the issue, and we find that suddenly there is a very ugly bag in our collective hand.

Welcome to the bug zone, Axl Rose.

Welcome to the bug zone we got fun and games
We got everything you don’t want- honey, you’ll call us names
We are the people that can’t find code you actually need
If you got the money honey we got your disease
In the bug zone, welcome to the bug zone
Watch it bring your Wi-Fi to its sha na na na na knees knees
I wanna watch your network bleed

(Sorry, Guns N’ Roses- love you guys)

Maybe you open a support case, or take your angst to private channels where you share information with other wireless professionals who live the same pain and are happy to compare notes. However you get there, you do get there… and then you find this sort of thing:

Yikes. Freaking yikes. The fix? (Always) migrate to new code.

That word “migrate” is kinda funny, too. Sounds adventurous… leave where you are, and go to someplace new.  Kind of exotic, even.

But there are no guarantees that Someplace New is any better than Where You Were, especially when it comes to expensive WLAN systems. Yet we find ourselves migratin’ all over the freakin place, outrunning one bug after another. Sigh…

Which brings us to yet another song, by the great Moe Bandy:

You always leave me holding the bag
Don’t you know it’s gettin’ purty heavy to drag
You think it’s funny but it ain’t no gag
How come you always leave me holding the bag

Indeed.

That Which Pisses Us Wireless Folk Off- Vendor Edition

Now there’s a title. And since you’re reading this, you bit on it… Sucka. Now that you’re here, let’s share some observations from the WLAN community over the last few weeks. This is not (totally) a “Lee’s complaining again” blog; it’s more a collection of sentiments from dozens of friends and colleagues from across the Wi-Fi Fruited Plain that stuck with me for one reason or another.

Most of these observations are aimed squarely at our vendors- those who we do business with “above” as we shape their offerings into the systems and services we offer to clients “below”, with us in the middle.

You may not agree with all of these. Perhaps some of your own beefs didn’t make my list. Either way, I’d love to hear from you in the comments section. Now, in no specific order:

  • Marketing claims. OK, we’re starting out with the obvious. Wi-Fi marketing has always been about hype, far-fetchedness, and creative blather. Nothing new under the sun here. I truly hope that your 10x better Wi-Fi is serving up 500 APs per client that are all streaming 62 Netflix movies each simultaneously from a range of 37 miles away from the AP.
  • “Enterprise” switches that don’t stack. Stacking is neither new, nor special. Do your bigger switches stack? Is it not even an option? If not, maybe tone down calling them “enterprise”.
  • Big Bucks for power cords. You got major balls as a vendor if you’re pricing garden variety power cables at $20 per.  Shame on you. Same same for PoE injectors, nothing-special antennas, rack mounts and assorted other parts/pieces that can be gotten for pennies on YOUR dollar elsewhere. C’mon…
  • No version numbers. By now, we all get “cloud”. And most cloud infrastructure vendors ARE using OS version numbers as a point of reference for their customers. The absence of version numbers becomes more onerous as ever more features get added. Give us the damn version number. Do it. Doooooo it.
  • No CPU/Memory/Interface stats. It doesn’t matter what the “thing” is, or whether it’s cloud-managed or not. EVERY interface needs to show statistics and errors, and every thingy needs to show CPU and memory information. Whatever your argument to the contrary may be, I promise that you are wrong.
  • Frequent product name changes. Just stop already.
  • The same stinking model numbers used for everything. Why? Maybe someone has a 3 and 5 fetish out in Silly Valley. It’s confusing, it’s weird, and it’s weirdly confusing in its weirdness, which leaves me confused.
  • The notion that EVERYTHING to do with wireless must be monetized. After a while, we start to feel like pimps as opposed to WLAN admins. I get that vendors need to be creative with new revenue streams, but it can be carried to extremes when applied to the WLAN ecosystem.
  • Too many models. It seems like some vendors must be awarding bonuses to HW developers based on how many different versions of stuff they can turn out, but customers are left confused about what to use when and where and why versus the other thing down the page a bit. Variety is good, but massive variety is not.
  • Complexity. This might be news to some vendors: the ultimate goals in deploying your systems for both us and the end user are STABILITY and WELL-PERFORMING ACCESS. Somewhere, vendors have lost track of that, and they are delivering BLOATED and HYPER-COMPLICATED FRAMEWORKS that place a cornucopia of buggy features higher on the priority list than wireless that simply works as users expect it to.
  • Slow quote/support ticket turnaround. Most times when we ask for pricing or open a case with technical support, it’s because there is a need. As in, we need something. And our assumptions are that our needs will be fielded with some degree of urgency, as we’re all in the business of service at the end of the day. No one likes slow service. No one likes asking over, and over, and over, and over, and over if there are any updates to our need possibly getting addressed.
  • Escalation builds/engineering code bugs. At the WLAN professional level, most of us work off the assumption that if we don’t typically do our jobs right the first time, we may not get follow up work and ultimately may be unemployed. That’s kind of how we see the world. I’m guessing that WLAN code developers play by different rules. ‘Nough said.
  • Bad, deceitful specs. Integrity is what keeps many of us in the game as professionals. Our word is our bond, as they say. Can you imagine telling someone that you can deliver X, but then when they need X, you can actually only provide a fraction of X- and then expecting that person to not be pissed off? Why are networking specs any different? Enough truth-stretching and hyper-qualified performance claims that you have to call a product manager and sign an NDA to get the truth about.
  • Mixed messages. OK, we ALL own this one- not just the vendors. The examples are many- grand platitudes and declarations that might sound elegant and world-changing in our own minds, but then they often fizzle in the light of day. Things like…
    • We need mGig switches for 802.11ac! 
    • We’ll never need more than a Gig uplink for 802.11ac!
    • 2.4 GHz is dead!
    • Boy, there’s a lot of 2.4 GHz-only clients out there!
    • We’re Vendor X, and we’re enterprise-grade!
    • Why do I see Vendor X gear everywhere, mounted wrong and in nonsensical quantities for the situation?
    • That one agency is awesome at interoperability!
    • Why does so much of this stuff NOT interoperate?
    • You must be highly-skilled with $50K worth of licensed WLAN tools or your Wi-Fi will suck!
    • Vendor X sells more Wi-Fi than anyone, most people putting it in are obviously untrained, yet there are lots of happy clients on those networks!
    • Pfft- just put in one AP per classroom. Done!
    • Cloud Wi-Fi is a ripoff!
    • Cloud Wi-Fi saves me soooo much money and headaches!
    • Here’s MY version of “cloud!”
    • Here’s MY version of “cloud!”
    • I freakin hate how buggy this expensive gear is!
    • At least those bugs are numbered on a pretty table!

It goes on and on and on. Always has, always will. Behind the electronics that we bring to life and build systems from are We the People. The humanity involved pervades pretty much everything written here, from all sides and all angles. And I have no doubt that every vendor could write their own blog called “That Which Pisses Us Vendor Folk Off- WLAN Pro Edition”. Touché on that.

Ah well- there’s still nothing I’d rather be doing for a living.

Will Reliability Be Prioritized Before Wi-Fi’s Whizzbang Future Gets Here?

This blog looks forward, but before we go there we need to zoom back to 1983 where I will corrupt John Mellencamp’s “Crumblin’ Down”:

Some features ain’t no damn good
You can’t trust ’em, you can’t love ’em
No good deed goes unpunished
And I don’t mind being their whipping boy
I’ve had that pleasure for years and years

Indeed. I too have had that pleasure for years and years. Whether it’s what comes out of mechanisms that are supposed to ensure that standards and interoperability testing bring harmony to the wireless world (but don’t), or code suck that flows like an avalanche coming down a mountain, I’ve been there and suffered that a-plenty. Somewhere during one of many wireless system malfunctions, the opening lyrics of “Crumblin’ Down” started blaring in my head, usually followed up by Annie Lennox singing this line from 1992’s “Why”:

Why can’t you see this boat is sinking
(this boat is sinking this boat is sinking)

But enough of the musical ghosts trapped in my head, waiting to sing to me when the network breaks. We’re going forward, and as Timbuk3 sang in 1986- The future is so bright I gotta wear shades.

Maybe, maybe not on that.

Super-Systems Become Super-Terrific Systems

Soon, market-leading WLAN vendors will likely unveil grand strategies that finally bring real SDN kinda stuff to the Wi-Fi space. And just like the day is fast coming where you can’t just buy a simple RADIUS server from the same folks (you have to invest in a NAC system then simply NOT use the parts that aren’t RADIUS to get a RADIUS server), one day some Grand Orchestrator of All Networky Things will get its tentacles into our wireless access points and controllers and you might not have a say in that. (Some of this is already happening with specific vendors, but it’s all just warm-up for the big show, in my opinion.)

This magic in the middle will promise API-enabled everything network-wide, so provisioning and on-going operations on LAN and WLAN will be child’s play. The frameworks will have spiffy marketing names, and get pushed heavy as “where our customers should be going”.

Some of you are probably thinking “So what? This is evolution. Deal with it.” I’m down with that, to a point.

What If They Don’t Fix What’s Broke First?

I know well that I’m not alone in feeling a bit behind the 8-ball when it comes to our networking vendors. There are far too many code bugs impacting far too many components, end users, and networking teams. There’s also an entrenched culture that keeps chronically problematic operating systems alive when they should arguably be scrapped, and that keeps the bug factories in full production.

I personally shudder to think what might happen if that grand vision for the future meets the Culture of Suck, and a whole new species of bug is unleashed on end users. Ideally, vendors would take a hard look at their code bases, their developers, and their cultures and ask if what’s in place today is worth rigging up a bunch of APIs to as part of The New Stuff.

As an end user, it terrifies me.

A House Built on Suck Can Not Stand

As a man-of-action-living-in-the-world, I’ve been around.  I’ve seen first-hand what happens during earthquakes to buildings and people when there are no rules governing building quality. I’ve seen carnage and devastation in multiple situations “out there” that all could have been prevented, and when I became Deputy Mayor of my village, I was able to appreciate what our Code Enforcement Officer does to keep people and buildings safe. Often it’s just curbing somebody’s foolish way of doing something.

As silly as it sounds, I’d love to see independent Code Enforcement Officers for the network industry who enforce… well, code quality. They would audit developers, their track records, and the pain inflicted on end users. Any vendor that gets too sloppy gets fined, or has to properly clean up their mess before they can keep developing. Like I said, I know how silly that sounds- but the current culture of poor Quality Assurance and protracted debug sessions at customer expense does not serve as a suitable foundation for the Super-Terrific Systems that are coming our way.

What’s really scary is that vendors tend to go all-in on these initiatives. It’s not like they leave a de-bloated, scalable option (key phrase) for those who don’t want all the Terrific Superness as they develop these monster frameworks of complex functionality.

I’d like to put on my sunglasses for the future of wireless, but if things aren’t cleaned up first for certain vendors, the current cloud over their wireless units is just going to get darker.

Code Bugs Do Have Real World Consequences

I’m not sure if my expectations are just too high for today’s world. When I buy a new vehicle, I don’t want to see surface rust forming two weeks after it leaves the lot. I don’t like the current presidential election and the horrible choice that voters have to make. And I actually expect that network vendors will put out decent code, or at least be very up front and open when significant faults are found. 

You see, those significant faults have real-world consequences. They bring operations to a screeching halt, and diminish organizational credibility. And ill-conceived “work arounds” and cavalier vendor attitudes to the customer’s bug-induced plight just make matters worse.

Here’s a real-world example.

I had a carefully worked-out maintenance window to upgrade both ends of a site-to-site VPN topology that spans Syracuse to London, using my favorite cloud-managed vendor’s gear. I’ve done this procedure at least a half dozen times, and have installed at least 30 of this particular security appliance. My Syracuse work was coordinated with a gent on the other end, and we’d do one end at a time. But… we never got past my end.

I configured the new appliance with what few settings it needed: IP address, gateway, subnet mask, and DNS servers. I saved them, then I waited for the indications that the box had made contact with the cloud and pulled down its updates. But those indications never came.

Like many a networker would do, I went to verify that the settings that I entered were correct. Curiously, there were NO settings saved. OK- maybe I forgot to save… The second try yielded the exact same result as the first. It was time to open a support case- as my maintenance window ticked away and my partner in London waited patiently.
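For anyone who wants to script that "did my settings actually stick?" check, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the setting names, the documentation-range addresses, and the idea that you can read the saved settings back from the box as a dictionary. It is not any vendor's API. It compares what you meant to configure against what the device reports, and also sanity-checks that the gateway sits inside the configured subnet.

```python
from ipaddress import ip_address, ip_interface

# Hypothetical static settings (RFC 5737 documentation addresses, not real ones).
INTENDED = {
    "address": "192.0.2.10",
    "netmask": "255.255.255.0",
    "gateway": "192.0.2.1",
    "dns": ("198.51.100.53", "203.0.113.53"),
}


def diff_settings(intended, reported):
    """Return {name: (intended, reported)} for every setting that did not stick.

    A setting missing from `reported` shows up with None as its reported value.
    """
    return {
        name: (value, reported.get(name))
        for name, value in intended.items()
        if reported.get(name) != value
    }


def gateway_on_link(address, netmask, gateway):
    """True if the gateway lies inside the subnet defined by address/netmask."""
    return ip_address(gateway) in ip_interface(f"{address}/{netmask}").network


if __name__ == "__main__":
    # Simulate the symptom from the story: the box reports nothing saved.
    reported = {}
    for name, (want, got) in diff_settings(INTENDED, reported).items():
        print(f"{name}: wanted {want!r}, device reports {got!r}")
    ok = gateway_on_link(
        INTENDED["address"], INTENDED["netmask"], INTENDED["gateway"]
    )
    print("gateway on link:", ok)
```

A check like this will not fix a firmware bug, but it separates "I fat-fingered the gateway" from "the box threw my settings away" in a few seconds, which matters when the maintenance window is ticking.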

I opened the case, then immediately called the support line (for the sake of expedience). I was told that this particular appliance has a firmware bug straight from the factory and that I’d need to find a DHCP-served network to use because it won’t actually save anything you enter with out-of-box firmware. When I asked if this was documented anywhere, I was told very matter-of-factly “we don’t share that information with customers” and that it shouldn’t be a big deal to just use DHCP.

Grrrrr.

Most places I’ve installed these appliances don’t have DHCP services readily available, because ultimately the appliances use a static IP and eventually ARE the DHCP servers for inside clients. And, I don’t tend to lug around an extra SOHO router just on the off-chance I’ll have to jam something in that can act like a DHCP server to get around a code bug that my vendor doesn’t feel customers need to know about before they actually try to use the product.

Let’s skip to the end:

  • I got to use some of my best “military” language after I realized the gravity of the situation
  • The maintenance window was busted, and the scheduled change didn’t happen
  • I probably lost credibility with my London partner as I was the Guy in Charge for this
  • My vendor has absolutely lost my confidence given the bug, and the “you should just be okay with this” attitude. I’m just not sure I can trust them at this point
  • This vendor had my respect and trust for years, and those have pretty much been undone with this one incident

So… I dragged the appliance off to where I could hook it up to a DHCP server and it could get a firmware upgrade. We’ll have to do the same on the London end, and then reschedule the outage and maintenance.

Sadly, the examples don’t end here. Same vendor- different hardware set. Also dealing with a long-running problem with a feature set that absolutely adds to the appliance’s stratospheric price tag. The work around? Don’t use the feature. The feature that I bought- to use. It’s insanity, and it’s way too frequent.

And I can just deal with that, because code bugs are pretty much a way of life anymore with certain vendors.