IOS shim client 'chasfs' has taken 272 msec (runtime: 272 msec) to process a 'stack chasfs fd' messa...

Wei Ting Lin · ‎07-25-2021

I have a stack of six 9200 switches that is logging an error message.
The version is 16.12.4.

PowerOn uptime is 35 weeks, 1 day, 8 hours, 47 minutes

log:

Jul 26 2021 12:22:08.348 TPE: %IOSXE_INFRA-6-PROCPATH_CLIENT_HOG: IOS shim client 'chasfs' has taken 272 msec (runtime: 272 msec) to process a 'stack chasfs fd' message
Jul 26 2021 12:33:58.901 TPE: %IOSXE_INFRA-6-PROCPATH_CLIENT_HOG: IOS shim client 'chasfs' has taken 260 msec (runtime: 256 msec) to process a 'stack chasfs fd' message
Jul 26 2021 12:46:40.741 TPE: %IOSXE_INFRA-6-PROCPATH_CLIENT_HOG: IOS shim client 'chasfs' has taken 260 msec (runtime: 260 msec) to process a 'stack chasfs fd' message
Jul 26 2021 13:11:12.769 TPE: %IOSXE_INFRA-6-PROCPATH_CLIENT_HOG: IOS shim client 'chasfs' has taken 256 msec (runtime: 256 msec) to process a 'stack chasfs fd' message
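
To see how often and how long the 'chasfs' client is being flagged, the durations can be pulled out of the log lines. The following is only a minimal sketch in Python, assuming the messages always follow the exact layout quoted above; the function names (parse_hog_lines, summarize) are made up for illustration and are not part of any Cisco tool.

```python
import re

# Pattern for the %IOSXE_INFRA-6-PROCPATH_CLIENT_HOG lines quoted in this post.
# Assumption: other IOS-XE releases may word the message differently.
HOG_RE = re.compile(
    r"%IOSXE_INFRA-6-PROCPATH_CLIENT_HOG: IOS shim client '(?P<client>[^']+)' "
    r"has taken (?P<elapsed>\d+) msec \(runtime: (?P<runtime>\d+) msec\) "
    r"to process a '(?P<message>[^']+)' message"
)


def parse_hog_lines(lines):
    """Return one dict per matching log line; non-matching lines are skipped."""
    events = []
    for line in lines:
        match = HOG_RE.search(line)
        if match:
            events.append({
                "client": match.group("client"),
                "message": match.group("message"),
                "elapsed_ms": int(match.group("elapsed")),
                "runtime_ms": int(match.group("runtime")),
            })
    return events


def summarize(events):
    """Return count/min/max/mean of the elapsed times, or None if no events."""
    if not events:
        return None
    elapsed = [event["elapsed_ms"] for event in events]
    return {
        "count": len(elapsed),
        "min_ms": min(elapsed),
        "max_ms": max(elapsed),
        "mean_ms": sum(elapsed) / len(elapsed),
    }


# The four log lines from this post.
SAMPLE = [
    "Jul 26 2021 12:22:08.348 TPE: %IOSXE_INFRA-6-PROCPATH_CLIENT_HOG: IOS shim client 'chasfs' has taken 272 msec (runtime: 272 msec) to process a 'stack chasfs fd' message",
    "Jul 26 2021 12:33:58.901 TPE: %IOSXE_INFRA-6-PROCPATH_CLIENT_HOG: IOS shim client 'chasfs' has taken 260 msec (runtime: 256 msec) to process a 'stack chasfs fd' message",
    "Jul 26 2021 12:46:40.741 TPE: %IOSXE_INFRA-6-PROCPATH_CLIENT_HOG: IOS shim client 'chasfs' has taken 260 msec (runtime: 260 msec) to process a 'stack chasfs fd' message",
    "Jul 26 2021 13:11:12.769 TPE: %IOSXE_INFRA-6-PROCPATH_CLIENT_HOG: IOS shim client 'chasfs' has taken 256 msec (runtime: 256 msec) to process a 'stack chasfs fd' message",
]

summary = summarize(parse_hog_lines(SAMPLE))
print(summary)
```

Run against a longer export of the log, the same summary would show whether the durations stay around 260 msec or grow over time.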

Other people in the discussion threads seem to have run into similar situations. Is this a bug?

Leo Laohoo · ‎07-26-2021

I would avoid 16.12.X because it is very unstable.

16.12.4, in particular, has several PoE/SNMP bugs that cannot be fixed even with the recommended SMU installed.

ron_fourie · ‎09-28-2022

Hi Leo,

What kind of POE issues have you experienced?

I have a number of cameras running on 9200 switches, and it appears that the switch is not providing sufficient PoE for the cameras to fully operate...

Have you found a software version in which this has been resolved?
Cheers,
RF

Leo Laohoo · ‎09-28-2022

@ron_fourie wrote:
What kind of POE issues have you experienced?

Where do I start? I will "try" to keep this as succinct as I can.

One of the biggest issues with 16.12.X, particularly 16.12.4, 16.12.5 and 16.12.6, is how easy it is to crash switch members (or the entire stack) under "normal", day-to-day operation. It is trivial and does not require any effort at all. A lot of the time, plugging in a stacking cable is enough to cause the stack-mgr process to crash and take the entire stack with it in a "blaze of glory". (And if this does happen, that is another story, because there are several bugs whereby switches, routers and WLCs will crash but will not leave any crashinfo, crashlogs, tracelogs, etc. The stack will just incorrectly report "reload by power". Remember, several bugs.)

All it takes is a single port going down/up continuously for several days. If the switch crashes, count yourself "lucky" because that resets the issue back to "0". If it does not crash, then several things will happen:

1. The PoE process will silently crash in the background. PoE will stop working on a particular port or set of adjacent ports.
2. The port will be up/up but 0 traffic will move out of the switch. Like #1, either one port will stop working or a set of adjacent ports will be affected.
3. There is a term we call "ghost ports". In the logs, we would see port(s) go down/up continuously. But upon visual inspection of the switch port, NOTHING is connected to the port. The port is EMPTY.

Now, I said "a single port":

What happens if more than one port is flapping continuously?
What happens if operators use "no logging event link-status" and they do not see the port(s) go down/up?

If the above-mentioned behaviour is found in the logs (and we cannot stop the port flapping in time), we have to reboot 3850 switches every 4 weeks and 9300 switches every 3 months. If we fail to do the proactive reboots, we will get a crash (within 2 months for a 3850 and 4 to 6 months for a 9300). Guaranteed.
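
Finding the flapping ports in the logs can be scripted. What follows is only a rough sketch in Python, assuming the syslog export contains the standard %LINK-3-UPDOWN messages; the helper names, the threshold of 3 and the sample lines are invented for illustration and do not come from a real switch.

```python
import re
from collections import Counter

# Matches the physical link message, e.g.
# "%LINK-3-UPDOWN: Interface GigabitEthernet1/0/1, changed state to down"
LINK_RE = re.compile(
    r"%LINK-3-UPDOWN: Interface (?P<intf>[^,\s]+), changed state to (?P<state>up|down)"
)


def count_link_downs(lines):
    """Count 'changed state to down' events per interface."""
    downs = Counter()
    for line in lines:
        match = LINK_RE.search(line)
        if match and match.group("state") == "down":
            downs[match.group("intf")] += 1
    return downs


def flapping_ports(downs, threshold):
    """Return interfaces with at least `threshold` down events, busiest first."""
    ports = [(intf, n) for intf, n in downs.items() if n >= threshold]
    return sorted(ports, key=lambda item: (-item[1], item[0]))


# Hypothetical sample lines, made up for this example.
SAMPLE = [
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/0/7, changed state to down",
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/0/7, changed state to up",
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/0/7, changed state to down",
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/0/7, changed state to up",
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/0/7, changed state to down",
    "%LINK-3-UPDOWN: Interface GigabitEthernet2/0/3, changed state to down",
    "%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet2/0/3, changed state to down",
]

suspects = flapping_ports(count_link_downs(SAMPLE), threshold=3)
print(suspects)
```

Only the physical %LINK-3-UPDOWN message is counted so that a single flap is not counted twice. A script like this sees nothing on ports configured with "no logging event link-status", which is exactly the blind spot raised in the questions above.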

And reporting the bugs we found brings another set of problems outside our control. We have tried, multiple times, to report the above behaviours to TAC, but the agents working on all of our TAC cases (yes, ALL of our TAC cases) are more interested in closing the cases. TAC seems to think that a workaround of "reboot switch" can be classified as a "solution" and gleefully closes the case. We have brought this to the attention of our Cisco AM/SE and they tried. (Personally, I have a feeling that fixing these bugs is not a priority.)

@ron_fourie wrote:
Have you found a software version in which this has been resolved?

If the switch/stack can support 16.6.X, then downgrade to the latest/last 16.6.X firmware.

IMPORTANT:

Do not forget to read FN - 72323 - Cisco IOS XE Software: QuoVadis Root CA 2 Decommission Might Affect Smart Licensing, Smart Call Home, and Other Functionality. This Field Notice affects any platform (routers, switches, WLCs, etc.) that runs IOS-XE version 16.X.X or 17.X.X. So even if you downgrade to 16.6.X as I suggested, perform the workaround as well. Look at the graph below:

9300, IOS-XE version 16.12.4

The graph above shows the control-plane memory of a 9300 switch member (of a stack) that is on 16.12.4. Notice the slow "increase" in memory utilization from September 2021 until February 2022? That is a "normal" memory leak. From March 2022, the rate of increase went up significantly, and this is due to the Field Notice (above). I applied the "workaround" sometime in April 2022 and it took several days for the memory leak to "stabilize".
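
The difference between a slow leak and a fast one can be put into numbers by fitting a straight line through periodic memory readings. This is just a sketch in Python using an ordinary least-squares slope; the function name growth_per_day is invented, and the sample readings are illustrative values, NOT numbers taken from the graph.

```python
def growth_per_day(samples):
    """Least-squares slope of (day, used_percent) pairs, in percent per day."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(day for day, _ in samples) / n
    mean_y = sum(used for _, used in samples) / n
    sxx = sum((day - mean_x) ** 2 for day, _ in samples)
    if sxx == 0:
        raise ValueError("all samples were taken on the same day")
    sxy = sum((day - mean_x) * (used - mean_y) for day, used in samples)
    return sxy / sxx


# Hypothetical readings (day index, control-plane memory used in percent).
slow_period = [(0, 40.0), (30, 41.5), (60, 43.0), (90, 44.5)]
fast_period = [(0, 45.0), (10, 48.0), (20, 51.0), (30, 54.0)]

slow_rate = growth_per_day(slow_period)
fast_rate = growth_per_day(fast_period)
print(f"slow: {slow_rate:.3f} %/day, fast: {fast_rate:.3f} %/day")
```

A sudden jump in the slope between two periods, like the one described from March 2022, is the kind of change this would flag.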

IOS-XE, regardless of whether it is on 16.X.X or 17.X.X, is extremely buggy. The only time I can have a stack of switches with >5 years of continuous uptime is when my 3850 is on 3.6.X. Any version after 3.X.X (like 16.X.X or 17.X.X) needs to be proactively rebooted. For the 3850, I have to do a proactive reboot every 6 months (up from every 4 weeks). For the 9300 it is every 18 months.

Hope this helps.

jianchenzhang · ‎11-22-2022

Do you have a bug ID for this that you can share?

marce1000 · ‎07-26-2021

- Probably a bug; however, it is logged with a ...-6-... severity level, which is informational (so not too important). You could mask these messages with a logging discriminator in the running configuration, as in:

logging discriminator INFRA severity drops 6 facility drops IOSXE mnemonics drops PROCPATH_CLIENT_HOG

!

logging buffered discriminator INFRA 100000
logging console discriminator INFRA
logging monitor discriminator INFRA
logging host x.x.x.x discriminator INFRA

M.
