03-26-2012 11:24 PM - edited 03-04-2019 03:48 PM
Hi All,
We are using a Cisco 4510R-E switch with two WS-X45-SUP6-E supervisors in SSO mode.
The IOS image running on the chassis is cat4500e-ipbasek9-mz.122-54.SG1.bin.
For the past 3-4 days we have been observing high CPU on this chassis, and the process consuming the most CPU is GaliosObflFilesys.
The output is here:
Yogesh_Cisco4500#sh processes cpu sorted
CPU utilization for five seconds: 59%/0%; one minute: 65%; five minutes: 65%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
80 3582290369 198427228 18053 47.67% 50.69% 50.72% 0 GaliosObflFilesy
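For anyone reading this output: in the header, the first five-second figure is total CPU and the figure after the slash is the share spent at interrupt level, and the uSecs column is the average CPU time per invocation of the process. A minimal Python sketch of both relationships follows. The function names are hypothetical, and the assumption that IOS truncates rather than rounds the uSecs value is mine, not something stated in the thread.

```python
import re

# Matches the header line of "show processes cpu sorted".
HEADER_RE = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)


def parse_cpu_header(line):
    """Split the header into total, interrupt and process-level CPU (hypothetical helper)."""
    m = HEADER_RE.search(line)
    if m is None:
        raise ValueError("not a CPU utilization header line")
    total, interrupt, one_min, five_min = (int(g) for g in m.groups())
    return {
        "five_sec_total": total,
        "five_sec_interrupt": interrupt,
        # CPU spent in scheduled processes = total minus interrupt level
        "five_sec_process": total - interrupt,
        "one_min": one_min,
        "five_min": five_min,
    }


def avg_usecs_per_invocation(runtime_ms, invoked):
    """Average CPU microseconds per invocation (assumes truncation, not rounding)."""
    return runtime_ms * 1000 // invoked


header = parse_cpu_header(
    "CPU utilization for five seconds: 59%/0%; one minute: 65%; five minutes: 65%"
)
print(header["five_sec_process"])                       # 59
print(avg_usecs_per_invocation(3582290369, 198427228))  # 18053
```

With 0% at interrupt level, the load here is process-level, which fits a single process (GaliosObflFilesy) accounting for most of it.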
Apart from this, we are getting the error messages below continuously on the console:
384665: Mar 27 08:41:46.933 IST: %C4K_SWITCHINGENGINEMAN-4-TCAMINTERRUPT: (Suppressed 380 times)flCam0 aPErr interrupt. errAddr: 0x1FD9 dPErr: 0 mPErr: 1 valid: 1
384666: Mar 27 08:41:54.538 IST: %C4K_L3HWFORWARDING-4-FLTCAMPARITYERROR: (Suppressed 766 times)FL Tcam Perr with no FwdEntry Hw index: 8152 Hw entry: Sw entry:
384667: Mar 27 08:42:16.972 IST: %C4K_SWITCHINGENGINEMAN-4-TCAMINTERRUPT: (Suppressed 384 times)flCam0 aPErr interrupt. errAddr: 0x1FD9 dPErr: 0 mPErr: 1 valid: 1
384668: Mar 27 08:42:47.006 IST: %C4K_SWITCHINGENGINEMAN-4-TCAMINTERRUPT: (Suppressed 393 times)flCam0 aPErr interrupt. errAddr: 0x1FD9 dPErr: 0 mPErr: 1 valid: 1
384669: Mar 27 08:42:54.571 IST: %C4K_L3HWFORWARDING-4-FLTCAMPARITYERROR: (Suppressed 763 times)FL Tcam Perr with no FwdEntry Hw index: 8152 Hw entry: Sw entry:
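One way to check whether the parity interrupts keep hitting the same TCAM address is to group the TCAMINTERRUPT messages by errAddr. The sketch below does this in Python; the regular expression is written only against the message format shown above, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Pattern based only on the TCAMINTERRUPT lines quoted in this thread.
TCAM_RE = re.compile(
    r"%C4K_SWITCHINGENGINEMAN-4-TCAMINTERRUPT:\s*"
    r"(?:\(Suppressed (\d+) times\))?.*?errAddr: (0x[0-9A-Fa-f]+)"
)


def summarize_tcam_interrupts(lines):
    """Return {errAddr (int): {"logged": n, "suppressed": n}} (hypothetical helper)."""
    summary = defaultdict(lambda: {"logged": 0, "suppressed": 0})
    for line in lines:
        m = TCAM_RE.search(line)
        if m is None:
            continue  # not a TCAMINTERRUPT message
        addr = int(m.group(2), 16)
        summary[addr]["logged"] += 1
        if m.group(1) is not None:
            summary[addr]["suppressed"] += int(m.group(1))
    return dict(summary)


log = [
    "384665: Mar 27 08:41:46.933 IST: %C4K_SWITCHINGENGINEMAN-4-TCAMINTERRUPT: (Suppressed 380 times)flCam0 aPErr interrupt. errAddr: 0x1FD9 dPErr: 0 mPErr: 1 valid: 1",
    "384667: Mar 27 08:42:16.972 IST: %C4K_SWITCHINGENGINEMAN-4-TCAMINTERRUPT: (Suppressed 384 times)flCam0 aPErr interrupt. errAddr: 0x1FD9 dPErr: 0 mPErr: 1 valid: 1",
    "384668: Mar 27 08:42:47.006 IST: %C4K_SWITCHINGENGINEMAN-4-TCAMINTERRUPT: (Suppressed 393 times)flCam0 aPErr interrupt. errAddr: 0x1FD9 dPErr: 0 mPErr: 1 valid: 1",
]
for addr, counts in summarize_tcam_interrupts(log).items():
    print(hex(addr), addr, counts)
```

For the three lines above this reports a single address, 0x1fd9 (8153 decimal), with 3 logged messages and 1157 suppressed ones. The decimal value sits right next to the Hw index 8152 in the FLTCAMPARITYERROR messages; whether the two refer to the same entry is not confirmed anywhere in this thread.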
We also found that the error messages are generated by the SUP in slot 5, which is in the Active state:
Yogesh_Cisco4500#sh platform software obfl module all
1 : no obfl storage
2 : no obfl storage
3 : no obfl storage
4 : no obfl storage
5 : slot-5: messages=95200748 int-logged=80533419 int-dropped=0 dirty=yes
5 : slot-5: version=1 sector: size=4096 written=47182041 dirty=21
6 : remote supervisor
7 : no obfl storage
8 : no obfl storage
9 : no obfl storage
10 : no obfl storage
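To compare these counters across slots or over time, the output can be turned into key/value pairs. A minimal Python sketch follows, assuming only the layout shown above; the function name is hypothetical, and the meaning of the individual counters is not documented in this thread.

```python
import re


def parse_obfl_modules(text):
    """Return {slot: [dict per line with key=value fields]} (hypothetical helper)."""
    slots = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s*:\s*(.*)", line)
        if m is None:
            continue
        # Collect every key=value token; lines such as "no obfl storage" yield none.
        pairs = dict(re.findall(r"([\w-]+)=(\S+)", m.group(2)))
        if pairs:
            slots.setdefault(int(m.group(1)), []).append(pairs)
    return slots


sample = """\
4 : no obfl storage
5 : slot-5: messages=95200748 int-logged=80533419 int-dropped=0 dirty=yes
5 : slot-5: version=1 sector: size=4096 written=47182041 dirty=21
6 : remote supervisor
"""
for slot, rows in parse_obfl_modules(sample).items():
    print(slot, rows)
```

The two lines for a slot are kept as separate dictionaries because both carry a field named dirty with different values (yes and 21).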
We found bug CSCsv17545, but it should not apply in our case since it is resolved in 12.2(54)SG1.
We would like to know whether this is a software or a hardware issue, and how to isolate it.
02-24-2015 11:00 PM
We have Cisco 4948E switches on which CPU utilization is hitting 99%.
We are also getting similar messages, like the ones below.
Feb 24 2015 19:21:03.227: %C4K_L3HWFORWARDING-4-FLTCAMPARITYERROR: FL Tcam Perr with no FwdEntry Hw index: 2858 Hw entry: Sw entry:
Feb 24 2015 19:21:03.227: %C4K_SWITCHINGENGINEMAN-4-TCAMINTERRUPT: flCam0 aPErr interrupt. errAddr: 0xB2B dPErr: 1 mPErr: 0 valid: 1
#############################################################################
See CPU utilization also:
CPU utilization for five seconds: 100%/0%; one minute: 99%; five minutes: 99%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
78 11050151481080503202 1022 87.18% 82.34% 80.73% 0 GaliosObflFilesy
59 36977361632981193082 0 9.74% 13.88% 13.92% 0 Cat4k Mgmt HiPri
237 104385816 95624411 1091 1.41% 0.32% 0.25% 0 OBFL MSG slot-1
60 10471031151190991537 879 0.47% 1.69% 2.96% 0 Cat4k Mgmt LoPri
#show platform software obfl module 1
1 : slot-1: messages=373929329 int-logged=320753973 int-dropped=0 dirty=yes
1 : slot-1: version=1 sector: size=4096 written=272295620 dirty=413
Also, an interesting thing is that bootflash doesn't show any IOS images:
Error in reading FlashDriverCompactstarting at sector 286
Aborting read operation
Error in reading FlashDriverCompactstarting at sector 286
Aborting read operation
%Error show bootflash: (No such file or directory)
It seems that we are having some issues with bootflash as well.
02-25-2015 02:49 AM
Hi Rizwan,
Here, as per our observation, either the software is unable to recover from these errors (since the Hw index and errAddr in these messages stay the same), or the SUP engine has a hardware problem (since the errors are logged on the Active SUP engine only). We have to isolate the issue with the method below.
Remove the SUP and reinsert it, or, if you have a spare SUP, replace the Active SUP engine with it.
If the same errors are still observed, we need to upgrade the image; if not, the SUP engine needs to be replaced.
In our case, the issue was resolved after removing and reinserting the SUP engine.
CPU was normal too.
Additionally, you can set the bootup diagnostic level to complete, so that a hardware issue on the Active SUP can be caught during bootup itself, and then you can remove the Active SUP engine.
Regards,
YSR.
02-25-2015 06:59 AM
Thank you for your response. It was helpful for understanding the issue.
In my case, though, there is only one supervisor engine.
So what I understand from your suggestion is that we need to remove it and reinsert it.
But there is another challenge in our case: the show bootflash command doesn't show any IOS image.
It throws an error saying there was an error reading at sector 286.
So we suspect that our switch will get stuck in ROMMON mode.
Do you remember whether you also had a problem with bootflash memory while facing this issue, or are we stuck with two independent issues at the same time?
02-25-2015 07:02 AM
No, we didn't face a bootflash memory problem.
Can you check the size of the bootflash using the sh file system command? We need to make sure that the image is present in bootflash.
I hope the IOS image is stored in bootflash and not on disk.
02-25-2015 07:18 AM
Yes Yogesh, we have checked with the show file system command.
In the bootflash section, the used column shows 0 only.
So we think our bootflash is also corrupted.
10-11-2016 01:09 PM
Finally got a solution: we were hit by a bug. We enabled and disabled ip routing, and the CPU utilization problem was solved.