09-29-2015 02:58 AM - edited 03-05-2019 02:24 AM
Good day.
We have two 3945E routers, one for ISP1 and the other for ISP2.
They receive a full BGP view from each ISP and run iBGP between them.
Both routers have almost identical configs and the same IOS: c3900e-universalk9-mz.SPA.152-4.M2.bin
I just discovered that one of them has many RIB failures:
Network             Next Hop          RIB-failure         RIB-NH Matches
211.237.72.0/22     212.188.16.153    Mallocfail - rdb    n/a
211.237.73.0        212.188.16.153    Mallocfail - rdb    n/a
211.237.74.0        212.188.16.153    Mallocfail - rdb    n/a
211.237.76.0        212.188.16.153    Mallocfail - rdb    n/a
211.237.76.0/22     212.188.16.153    Mallocfail - rdb    n/a
211.237.77.0        212.188.16.153    Mallocfail - rdb    n/a
211.237.78.0/23     212.188.16.153    Mallocfail - rdb    n/a
211.237.80.0        212.188.16.153    Mallocfail - rdb    n/a
211.237.112.0/20    212.188.16.153    Mallocfail - rdb    n/a
211.237.128.0/20    212.188.16.153    Mallocfail - rdb    n/a
211.237.164.0       212.188.16.153    Mallocfail - rdb    n/a
211.237.166.0       212.188.16.153    Mallocfail - rdb    n/a
211.237.183.0       212.188.16.153    Mallocfail - rdb    n/a
211.237.189.0       212.188.16.153    Mallocfail - rdb    n/a
etc.
The other router has none.
For example, network 212.176.160.0/20 has its best path via ISP2. When traffic for it arrives at R1 (ISP1), R1 sends it over iBGP to R2, which has a RIB failure for this network:
#sh ip bgp 212.176.160.0
BGP routing table entry for 212.176.160.0/20, version 521559
Paths: (1 available, best #1, table default, RIB-failure(22))
Advertised to update-groups:
2
So traffic for this network bounces between our two 3945Es.
sh mem sum:
R1:
            Head       Total(b)    Used(b)     Free(b)     Lowest(b)   Largest(b)
Processor   1F2FB780   492367072   469573932   22793140    109956      10177324
I/O         C7FB780    313524224   46885776    266638448   266294768   263304668
R2 (with RIB failures):
            Head       Total(b)    Used(b)     Free(b)     Lowest(b)   Largest(b)
Processor   1F2FB780   492367072   476266736   16100336    104944      9723144
I/O         C7FB780    313524224   88988292    224535932   224237872   224297052
Any ideas what the problem is? Both routers were rebooted two days ago.
09-29-2015 06:21 AM
It looks like memory is maxed out on these boxes. Going by those show mem outputs, both are running at around 95% processor memory utilization.
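The utilization figure can be checked from the Total(b) and Used(b) columns of the Processor rows in the show mem output posted above. A minimal Python sketch follows; the function name `utilization_pct` is just an illustrative helper, and the numbers are copied from the original post.

```python
def utilization_pct(total_bytes: int, used_bytes: int) -> float:
    """Return used memory as a percentage of the total pool."""
    return 100.0 * used_bytes / total_bytes

# Processor pool figures (Total(b), Used(b)) copied from the show mem output in the post.
processor_pool = {
    "R1": (492367072, 469573932),
    "R2": (492367072, 476266736),
}

results = {name: round(utilization_pct(total, used), 1)
           for name, (total, used) in processor_pool.items()}

for name, pct in results.items():
    print(f"{name}: {pct}% of processor memory in use")
```

On these figures R2, the router reporting the Mallocfail entries, has slightly less free processor memory than R1.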
Are you seeing anything in the logs related to memory failures?
Do you receive full BGP tables? If so, you could restrict them so that each ISP sends you only a default route, which would bring memory usage down.
There is also a safe harbour image for those devices, which would be a better choice of IOS in case a bug in your current IOS is causing this:
c3900-universalk9_npe-mz.SPA.154-3.M3.bin
09-29-2015 08:45 AM
No, there are no errors in the logs and no monitoring system alarms, so I only discovered this when the flapping occurred.
Yes, we receive the full table. It is possible to receive only a default route, but we need the full table so that BGP can decide which path is better. In the near future we are going to add a third provider and connect each ISP to each 3945 for hardware failover on our side, and let BGP steer traffic in the best way.
Don't you think BGP itself does better traffic balancing and path selection than we could achieve with defaults from each ISP?
Maybe it is possible to add some RAM to the 3945? 16 or 32 GB? Or should we change platform? Which platforms does Cisco recommend for our setup at the moment?
We will change the IOS in the next maintenance window.
09-29-2015 09:00 AM
Nearly all of our routers are dual-homed to two ISPs, or we have two routers per site with separate ISP links from different providers, all running BGP/MPLS, with iBGP between them at the core sites. We use prefix-lists with route-maps and community-lists to allow in only around 1000 routes per router, including a default route. That still lets us select the better path without overwhelming the router with full tables, which run up memory and CPU and are not really required for best-path selection, since we don't need to know about everyone else's routes. Our core sites run 7204s, which will soon be upgraded to ASRs, but the remote BGP/MPLS sites have only 3825s, and with the routes restricted they cope fine.
I'm pretty sure your 3900s can support up to 4 GB of RAM as well, which would also fix the issue.
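The matching logic behind that kind of inbound filter (accept a default route plus a limited set of prefixes, drop everything else) can be sketched in Python. This is only an illustration of the idea, not IOS configuration: the allowed range, the /22 length cap, and the sample prefixes are made-up values for the example, and `accept` is a hypothetical helper.

```python
import ipaddress

# Hypothetical policy values, chosen only for illustration.
ALLOWED_RANGES = [ipaddress.ip_network("211.237.0.0/16")]
MAX_PREFIX_LEN = 22
DEFAULT_ROUTE = ipaddress.ip_network("0.0.0.0/0")

def accept(prefix: str) -> bool:
    """Return True if the received prefix passes the inbound filter."""
    net = ipaddress.ip_network(prefix)
    if net == DEFAULT_ROUTE:
        return True                      # always keep the default route
    if net.prefixlen > MAX_PREFIX_LEN:
        return False                     # drop overly specific routes
    return any(net.subnet_of(rng) for rng in ALLOWED_RANGES)

# Sample received prefixes (illustrative input).
received = ["0.0.0.0/0", "211.237.72.0/22", "211.237.73.0/24", "212.176.160.0/20"]
accepted = [p for p in received if accept(p)]
print(accepted)
```

Everything the filter drops is still reachable through the default route, which is why the routing table and memory footprint shrink without losing connectivity.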
09-29-2015 09:25 AM
http://www.cisco.com/c/en/us/products/collateral/routers/3900-series-integrated-services-routers-isr/data_sheet_c78_553924.html
The 3945E only supports 2 GB. The other 3945 (without the E) might support up to 4 GB, but 2 GB is the maximum that IOS can address.
Still, I think 2 GB should help here.
09-30-2015 01:50 AM
So we need one MEM-3900-1GU2GB (1GB to 2GB DRAM Upgrade (1GB+1GB) for Cisco 3925/3945 ISR) for each router?
09-29-2015 09:02 AM
I don't think the memory can be upgraded beyond 2 GB of DRAM. Right now you have 512 MB, which is not sufficient for the Internet routing table. Earlier the recommendation was at least 1 GB, but 2 GB of DRAM should help in your case.