2.1(3c) does not fix CSCuh61202

ekerdman88 · 08-19-2014

After we upgraded to 2.1(3c), the issue described in CSCuh61202 still occurs. We have logged a new bug: https://tools.cisco.com/bugsearch/bug/CSCuq40256

Has anyone else experienced the same?

caseygordon · 08-21-2014

We started upgrading our UCS domains from 2.1(1a) to 2.1(3c) to apply the patched fnic driver that was supposed to fix the issue. We are now seeing fnic aborts that look very similar to CSCuh61202, but the impact appears to be different. In one of our UCS domains, we didn't see any issues for 2+ weeks after upgrading. Then we got reports of high disk latency on VMs and started digging. In another UCS domain, the aborts started right after the UCS upgrade was completed, but we have not seen any signs of disk latency yet. We have four other UCS domains that were upgraded from 2.1(1a) to 2.1(3c), and we do not see the issue in any of them.

ekerdman88 · 08-22-2014

Very interesting. It seems they only half-fixed the bug: it happens less often, but it still happens. We're pushing Cisco for an expedited release of the fix.

kenk · 09-26-2014

Hello,

I wanted to share the latest findings from our UCS engineering team on the above-mentioned defects (CSCuh61202 & CSCuq40256).

CSCuq40256 was found to be a mis-programming of the register (on the IO Module) that controls how long the adapter should pause when the IO Module sends it a pause request. Until a fixed release is available, engineering has confirmed that resetting the DCE of the impacted blade reprograms the register correctly [updated 9/26/2014].

Both defects have the same symptoms, but they have different root causes.

Please monitor the status of CSCuq40256 for updates as they become available.

Regards,

Ken

moehlertGIS · 10-22-2014

We had this issue when we upgraded from 2.1 to 2.2. Now I need to upgrade to 2.2(3b) to protect against the Bash vulnerability. Does anyone know whether we still have to worry about this defect, or can we do the infrastructure upgrade without any 'special' concerns?

kenk · 10-22-2014

Once your blade's adapter firmware is upgraded to 2.1(3a) or above, you can return to the normal upgrade process and order (this assumes that you are referring to CSCuh61202).

Regards,

Ken

moehlertGIS · 10-22-2014

Great! Thank you.

ekerdman88 · 05-19-2015

Not according to this: https://tools.cisco.com/bugsearch/bug/CSCus61659

It says you need to be at least on 2.2(3f) to avoid this PFC issue altogether. It truly feels like Cisco just cannot squash this bug (or a family of bugs).
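
For anyone checking a fleet against the thresholds mentioned in this thread (2.1(3a) for CSCuh61202, 2.2(3f) for the PFC issue in CSCus61659), here is a minimal Python sketch for comparing UCS-style version strings. The function names `parse_ucs_version` and `at_least` are hypothetical, and the ordering rule is an assumption: versions are taken to have the form major.minor(maintenance + optional single patch letter) and to sort numerically, then by letter. This is not a Cisco tool or API, so confirm the real fixed releases in the bug search tool.

```python
import re

# Assumed format: major.minor(maintenance[patch letter]), e.g. "2.1(3c)".
_UCS_VERSION = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]?)\)$")


def parse_ucs_version(text):
    """Turn a string such as '2.1(3c)' into a comparable tuple (2, 1, 3, 'c')."""
    match = _UCS_VERSION.match(text.strip())
    if match is None:
        raise ValueError("unrecognized version string: %r" % text)
    major, minor, maintenance, patch = match.groups()
    # A missing patch letter becomes '', which sorts before any letter.
    return (int(major), int(minor), int(maintenance), patch)


def at_least(version, minimum):
    """Return True if `version` is the same as or newer than `minimum`."""
    return parse_ucs_version(version) >= parse_ucs_version(minimum)


if __name__ == "__main__":
    # Thresholds taken from the posts in this thread, not from a Cisco API.
    for installed in ("2.1(1a)", "2.1(3c)", "2.2(3b)"):
        print(installed,
              "meets 2.1(3a):", at_least(installed, "2.1(3a)"),
              "| meets 2.2(3f):", at_least(installed, "2.2(3f)"))
```

Tuples compare element by element, so the maintenance number is compared as an integer rather than as text, and the patch letter only breaks ties within the same maintenance release.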