11-05-2021 02:14 AM - edited 11-05-2021 02:15 AM
Hi,
I was wondering about the architectural differences between the N3k/N5k/N9k switches.
I can't find good information on it; my Google-fu does not seem to be sufficient.
The reason I ask: we have a problem at a customer where a Nexus 3k (N3548P-10GX to be exact) switch suddenly reloads
when a takeover/giveback action is started on a NetApp MetroCluster. The thing is, the same problem is
documented in the Bug Search Tool for an N9k with the same error:
port_client hap reset.
The explanation is that the port client isn't fast enough at bringing the ports up and then crashes, and because of this the switch reloads.
In later releases they apparently increased the time allowed for the up transition per port, so the port client has enough time to finish without throwing an exception.
The fix is a software update on the N9k, so I'm going to try this first and see what happens.
Source: CSCvo17792 in Bug Search Tool
PS: If someone knows which Nexus the new FI 64 Series is based on, I would appreciate that knowledge as well!
Regards and keep on cracking!
11-05-2021 02:43 AM
Nexus 3500 has a totally different ASIC and architecture compared with Nexus 9000 switches.
So the bug you see there is not that relevant for you, especially considering it is about the port-client (the software component monitoring the hardware) and the affected version is an iNXOS (ACI) image. An upgrade is definitely a good idea if you are facing a software defect. If the problem persists in the latest version, open a TAC case.
The ASIC on FI 6400 is from the same family as the ones on Nexus 9000 EX/FX (it's called Tahoe).
Stay safe,
Sergiu
11-05-2021 03:14 AM
Hi Sergiu,
First, thank you for the information on the topic. But my thirst for knowledge on this topic is still not quenched! If you know some place where I can dive deeper into this, I would appreciate it.
Now back to the meat of the question: I did say that the error is EXACTLY the same on the 3K as described in the bug report for the 9K, the error message being "port_client hap reset". So if this is software related, it could "maybe" be the "same problem" on the 3K, even IF the ASIC/microcontroller architecture is different. Sure, the communication between the components is (or might be) inherently different at the software level because of those differences. But bugs are bugs, right? And they like to appear on every device for which software is made.
Regards and keep on cracking!
11-08-2021 03:47 AM
My recommendation when it comes to learning anything related to Nexus switches (hardware architecture, latest features, troubleshooting, fancy protocols) is Cisco Live. Search for any platform you are interested in and you will find plenty of resources.
About the error: generally speaking, if you run the unified NXOS version (7.x, 9.x), which is the same image on Nexus 3000, 3500 and 9000, then sure, we can assume a software bug will be fixed for all platforms.
However, since you have a Nexus 3500 running NXOS and the bug refers to the ACI image, the fix from this particular bug is 100% not present in NXOS. I say this because for each bug there is a "commit" which represents the fix, and that commit is only pushed to the images or release trains associated with that bug.
There might be other existing bugs for the same problem, bugs which are linked to NXOS images, but the one you shared is not one of them.
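As a rough illustration of that reasoning, here is a minimal Python sketch. It is not a Cisco tool or API; the `BugRecord` and `Device` classes, their field names, and the "aci"/"nxos" labels are made up for the example. The only facts taken from this thread are that CSCvo17792 is tied to the ACI image and that the switch in question is an N3548P-10GX running NXOS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BugRecord:
    """Hypothetical record: a bug ID plus the image families its fix was committed to."""
    bug_id: str
    image_families: frozenset


@dataclass(frozen=True)
class Device:
    """Hypothetical record: a switch model and the image family it runs."""
    model: str
    image_family: str


def fix_applies(bug: BugRecord, device: Device) -> bool:
    # The fix commit only lands in images associated with the bug,
    # so a matching symptom on another image family is not enough.
    return device.image_family in bug.image_families


# From the thread: the bug refers to the ACI image, the switch runs NXOS.
bug = BugRecord("CSCvo17792", frozenset({"aci"}))
n3548 = Device("N3548P-10GX", "nxos")

print(fix_applies(bug, n3548))  # False
```

The same error string on the 3500 would therefore need its own bug ID associated with NXOS images before an upgrade could be expected to carry the fix.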
Take care,
Sergiu