Re: CGNAT problems ASR 1k

divadko · ‎12-31-2022

Hi all,

I have an interesting issue with an ASR 1006 and CGNAT. It happens once every 2-3 months.

Customers are reporting issues with browsing. They need to refresh the page in the browser several times before it loads. Speed tests are OK, pings are OK, and there is no packet loss; only browsing is affected. I tried changing the DNS server and did some upgrades, but nothing helped.

After around 10 customers called, I found that all of them were using the same public IP from the CGNAT pool. I thought there might be an IP conflict or something similar, so I changed the NAT pool to exclude that public IP.

It helped for 2 months, and now the problem is back with a different IP. The issue is the same, and again all affected customers are using the same IP.

How can I investigate what the problem is? Running clear ip nat translation * solves the problem, but how do I solve it permanently?

If I don't clear NAT, the problem persists until I either clear it or change the customers' IPs so that they are allocated to a different public IP from the pool.
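
One way to confirm that every complaining customer shares a public IP is to group saved translation output by inside global address. The following is a minimal sketch, not something taken from the router: it assumes the usual "Pro / Inside global / Inside local / Outside local / Outside global" column order, and the sample addresses (203.0.113.x, 100.64.0.x) and the function names are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical sample; real output would come from a saved
# "show ip nat translations" capture. All addresses are placeholders.
SAMPLE = """\
Pro  Inside global         Inside local          Outside local         Outside global
tcp  203.0.113.248:1024    100.64.0.10:51514     198.51.100.10:443     198.51.100.10:443
udp  203.0.113.248:2048    100.64.0.20:40000     198.51.100.53:53      198.51.100.53:53
tcp  203.0.113.201:1100    100.64.0.30:50000     198.51.100.10:80      198.51.100.10:80
"""


def group_by_public_ip(text):
    """Map each inside global (public) IP to the set of inside local IPs using it."""
    groups = defaultdict(set)
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and anything that is not a translation row.
        if len(fields) < 3 or fields[0] not in ("tcp", "udp", "icmp"):
            continue
        public_ip = fields[1].rsplit(":", 1)[0]
        private_ip = fields[2].rsplit(":", 1)[0]
        groups[public_ip].add(private_ip)
    return groups


def shared_public_ips(text, complaining):
    """Return {public IP: [complaining private IPs translated to it]}."""
    wanted = set(complaining)
    result = {}
    for public_ip, privates in group_by_public_ip(text).items():
        hit = privates & wanted
        if hit:
            result[public_ip] = sorted(hit)
    return result


if __name__ == "__main__":
    print(shared_public_ips(SAMPLE, ["100.64.0.10", "100.64.0.20"]))
```

If all complaining customers land under a single key, that public address is the one to investigate further.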

These are my NAT statistics:

#sho ip nat statistics 
Total active translations: 670933 (0 static, 670933 dynamic; 670933 extended)
Outside interfaces:
  Port-channel20.900
Inside interfaces: 
  Port-channel20.6, Port-channel20.11, Port-channel20.12, Port-channel20.50
  Port-channel20.61, Port-channel20.70, Port-channel20.71, Port-channel20.100
  Port-channel20.101, Port-channel20.102, Port-channel20.111
  Port-channel20.910, Port-channel20.1114, Port-channel20.1116
  Port-channel20.1127, Port-channel20.1128, Virtual-Template1
Hits: 14323814438622  Misses: 34093656711
Expired translations: 31174201530
Dynamic mappings:
-- Inside Source
[Id: 1] access-list 1 pool WAN_TEST_1 refcount 525899
 pool WAN_TEST_1: id 4, netmask 255.255.255.128
        start ...188 end ....253
        type generic, total addresses 66, allocated 61 (92%), misses 0
longest chain in pool: WAN_TEST_1's addr-hash: 0, average len 0,chains 0/256
[Id: 2] access-list 2 pool Docsis_1 refcount 145040
 pool Docsis_1: id 3, netmask 255.255.255.128
        start ...152 end ....180
        type generic, total addresses 29, allocated 12 (41%), misses 0
longest chain in pool: Docsis_1's addr-hash: 0, average len 0,chains 0/256
nat-limit statistics:
 max entry: max allowed 2147483647, used 670940, missed 0
 All Host Max allowed: 5000
In-to-out drops: 893247694  Out-to-in drops: 424586477
Pool stats drop: 1  Mapping stats drop: 0
Port block alloc fail: 0
IP alias add fail: 0
Limit entry add fail: 0

#show ip nat pool name WAN_TEST_1

NAT Pool Statistics

Pool name WAN_TEST_1, id 4
                              Assigned            Available
  Addresses                         61                    5
  UDP Low Ports                    981                32811
  TCP Low Ports                      6                33786
  UDP High Ports                309741              3948051
  TCP High Ports                597356              3660436

(Low ports are less than 1024. High ports are greater than or equal to 1024.)

Any ideas, please?

balaji.bandi · ‎12-31-2022

What does your NAT config look like?

BB

***** Rate All Helpful Responses *****

How to Ask The Cisco Community for Help

David Kondicz · ‎12-31-2022

Here it is:

ip nat settings mode cgn
no ip nat settings support mapping outside
ip nat settings pap limit 60
ip nat settings nonpatdrop
ip nat log translations syslog
ip nat translation max-entries 2147483647
ip nat translation max-entries all-host 5000
ip nat pool WAN_TEST ....200 ....250 netmask 255.255.255.128
ip nat pool Docsis_1 ....152 ....180 netmask 255.255.255.128
ip nat pool WAN_TEST_1 ....188 ....253 netmask 255.255.255.128
ip nat inside source list 1 pool WAN_TEST_1 overload
ip nat inside source list 2 pool Docsis_1 overload

balaji.bandi · ‎12-31-2022

What do the logs show while users are having the issue, before you clear NAT?

Also look at the blog below for some recommendations (note: I would suggest trying the simple steps first to see if there are any improvements):

http://www.dnzydn.com/2019/02/27/asr1000-cgnat-port-allocation-for-subscribers/

BB

David Kondicz · ‎01-01-2023

There is absolutely nothing in the logs... everything looks OK, but the private IPs mapped to this public IP are not working well.

divadko · ‎01-01-2023

Could the issue be that I don't use "bpa" at the end of this line:

ip nat settings pap limit 60

?

MHM Cisco World · ‎01-01-2023

https://www.dcc.fc.up.pt/~rprior/1920/AR/CiscoDocs/ipaddr-cr-book.pdf

I think you are right; check this command reference.

balaji.bandi · ‎01-01-2023

Based on your requirements, you need to configure and monitor bulk port allocation. Check the guide below:

https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/ipaddr_nat/configuration/xe-16/nat-xe-16-book/iadnat-bpa.html

BB

David Kondicz · ‎01-01-2023

I applied the timeouts from the link you shared and also added bpa to the end of the line in the config.

But these drop counters are still rising every minute:

In-to-out drops: 904684907 Out-to-in drops: 426159528

The interesting thing is that after I made these changes, the active NAT translations dropped from 680000 to 145000.
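
To judge how fast the drop counters are really rising, two saved snapshots of the statistics output can be compared. This is a minimal sketch that only parses the "In-to-out drops / Out-to-in drops" line in the form quoted in this thread; the function names are hypothetical, and because the time between the two snapshots is not stated here, it reports a difference rather than a rate.

```python
import re

# Matches the counter line as it appears in "show ip nat statistics" above.
DROP_RE = re.compile(r"In-to-out drops:\s*(\d+)\s+Out-to-in drops:\s*(\d+)")

# The two snapshots quoted in this thread.
BEFORE = "In-to-out drops: 893247694  Out-to-in drops: 424586477"
AFTER = "In-to-out drops: 904684907 Out-to-in drops: 426159528"


def parse_drops(text):
    """Return (in_to_out, out_to_in) drop counters found in the text."""
    match = DROP_RE.search(text)
    if match is None:
        raise ValueError("drop counters not found")
    return int(match.group(1)), int(match.group(2))


def drop_delta(before, after):
    """Return how much each drop counter grew between two snapshots."""
    before_in, before_out = parse_drops(before)
    after_in, after_out = parse_drops(after)
    return after_in - before_in, after_out - before_out


if __name__ == "__main__":
    in_to_out, out_to_in = drop_delta(BEFORE, AFTER)
    print(f"In-to-out grew by {in_to_out}, out-to-in grew by {out_to_in}")
```

Taking the snapshots a known number of seconds apart and dividing the difference by that interval would give a per-second rate to compare before and after each config change.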

MHM Cisco World · ‎01-01-2023

I think you were on the right track:

ip nat settings pap [limit {1000 | 120 | 250 | 30 | 500 | 60}] [bpa] [set-size set-size] [step-size step-size]
[single-set]

You selected 60, which the command reference describes as:
(Optional) Configures a limit of 60 local addresses per global address by using an
average of 1024 ports.

This means each global address can NAT only 60 local addresses.
Please review the link I shared before and select the right value for your case.

David Kondicz · ‎01-01-2023

I have 3600 customers on 66 public IPs. In this case, pap limit 60 with the default of 1024 ports should be enough. That should really be enough... so I don't know what the problem could be.
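
The sizing argument above can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: every customer counts as exactly one local address, the limit of 60 is applied evenly per global address, and only high ports (1024 and above) are handed out. It does not model bulk port allocation or per-protocol behaviour, and the function names are hypothetical.

```python
import math

CUSTOMERS = 3600                  # figure quoted in the post
PUBLIC_IPS = 66                   # "total addresses 66" for pool WAN_TEST_1
PAP_LIMIT = 60                    # "ip nat settings pap limit 60"
HIGH_PORTS_PER_IP = 65536 - 1024  # ports >= 1024, per protocol, per address


def min_addresses_needed(customers, limit):
    """Fewest global addresses that can hold all customers at the given limit."""
    return math.ceil(customers / limit)


def pool_capacity(public_ips, limit):
    """Most local addresses the pool can serve at the given limit."""
    return public_ips * limit


def ports_per_customer(limit):
    """High ports available per local address if one global address is split evenly."""
    return HIGH_PORTS_PER_IP // limit


if __name__ == "__main__":
    print("addresses needed:", min_addresses_needed(CUSTOMERS, PAP_LIMIT))
    print("pool capacity:", pool_capacity(PUBLIC_IPS, PAP_LIMIT))
    print("ports per customer:", ports_per_customer(PAP_LIMIT))
    # Cross-check against the pool statistics posted earlier in the thread:
    # assigned + available TCP high ports equals 66 addresses x 64512 ports.
    print(597356 + 3660436 == PUBLIC_IPS * HIGH_PORTS_PER_IP)
```

Under these assumptions, 3600 customers need at least 60 addresses, which lines up with the "allocated 61" shown in the pool statistics, and the 66-address pool has room for 3960, so the numbers by themselves do not point to pool exhaustion.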

balaji.bandi · ‎01-01-2023

Drops are a different topic altogether... Did that change resolve the issue you originally reported?

BB

divadko · ‎01-02-2023

Nope. After applying the timeouts and adding bpa to the end of the line, it is even worse. Many more customers called support and reported the same issue. So what can be wrong?

MHM Cisco World · ‎01-02-2023

I will check the whole config tonight.
Happy New Year, friend.

divadko · ‎01-02-2023

OK, so I found an interesting thing.

The customers who report problems are all translated to the same public IP.

I just found that this IP doesn't have an ARP entry on the ASR's upstream router.

For example, the customers with local IPs 100.64.1.130 and .200 are translated to x.x.x.248.

But on the ASR's upstream router, I can see only an incomplete ARP entry for x.x.x.248, without a MAC address, like this: Internet x.x.x.248 0 Incomplete ARPA.

But when I run show ip nat translations | include 100.64.1.130, I can clearly see active translations mapped to x.x.x.248, the address that the upstream router cannot resolve.

After I run clear ip nat translation *, the ARP issue on the upstream router for x.x.x.248 is fixed!

Any ideas? I tried it several times, and there is no other device with the same IP in the network... but this happens with different public IPs too!
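
Since the symptom shows up as an incomplete ARP entry on the upstream router, that table can be scanned for pool addresses in the same state before customers start calling. The sketch below is only an illustration: it assumes the ARP output keeps the column order of the line quoted above (Protocol, Address, Age, Hardware Addr, Type), and the 203.0.113.x addresses, the MAC address, the interface name, and the function names are hypothetical placeholders.

```python
# Hypothetical capture of the upstream router's ARP table; the incomplete
# row has the same shape as the "Internet x.x.x.248 0 Incomplete ARPA" line.
SAMPLE_ARP = """\
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  203.0.113.201           12  0011.2233.4455  ARPA   TenGigabitEthernet0/0/0
Internet  203.0.113.248            0  Incomplete      ARPA
"""


def incomplete_arp_addresses(arp_text):
    """Return addresses whose ARP entry has no resolved MAC address."""
    addresses = []
    for line in arp_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == "Internet" and fields[3] == "Incomplete":
            addresses.append(fields[1])
    return addresses


def unresolved_nat_addresses(arp_text, nat_addresses):
    """Return NAT public addresses in use that the upstream router cannot resolve."""
    unresolved = set(incomplete_arp_addresses(arp_text))
    return sorted(unresolved & set(nat_addresses))


if __name__ == "__main__":
    in_use = ["203.0.113.201", "203.0.113.248"]
    print(unresolved_nat_addresses(SAMPLE_ARP, in_use))
```

Any address this returns is a public IP that still has active translations but is unreachable from upstream, which matches the behaviour described above.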