CSR 1000v low throughput & high CPU usage

Serg_tsk
Level 1

We have a virtual Cisco CSR 1000v router in a cloud, with 8 vCPUs based on Intel Xeon E5-2660 v3 (2.6 GHz). We have a 1 Gbit/s license and the heavy data-plane CPU template:

csr1000v#show platform software cpu alloc
CPU alloc information:

  Control plane cpu alloc: 0

  Data plane cpu alloc: 1-7

  Service plane cpu alloc: 0

  Template used: None
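
For anyone reproducing this allocation: the template is set in config mode with the platform resource command and needs a reload to take effect. A minimal sketch, assuming your image exposes a heavy data-plane keyword (verify the options on your release with "platform resource ?"):

csr1000v(config)# platform resource data-plane-heavy
csr1000v(config)# end
csr1000v# reload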

Our problem is low network throughput over DMVPN + IPsec tunnels. I have never seen network speed above 650 Mbit/s, and at that point the CPU usage is maxed out:

 

CPU utilization for five seconds: 77%, one minute: 77%, five minutes: 77%
Core 0: CPU utilization for five seconds:  1%, one minute:  1%, five minutes:  1%
Core 1: CPU utilization for five seconds: 100%, one minute: 99%, five minutes: 99%
Core 2: CPU utilization for five seconds: 100%, one minute: 99%, five minutes: 99%
Core 3: CPU utilization for five seconds: 100%, one minute: 99%, five minutes: 99%
Core 4: CPU utilization for five seconds: 100%, one minute: 99%, five minutes: 99%
Core 5: CPU utilization for five seconds: 100%, one minute: 99%, five minutes: 99%
Core 6: CPU utilization for five seconds: 17%, one minute: 18%, five minutes: 18%
Core 7: CPU utilization for five seconds: 99%, one minute: 99%, five minutes: 99%
   Pid    PPid    5Sec    1Min    5Min  Status        Size  Name                  
--------------------------------------------------------------------------------
 22424   21934    624%    624%    621%  R       1176592384  ucode_pkt_PPE0   
csr1000v#show platform hardware qfp active datapath utilization summary 
  CPP 0:                     5 secs        1 min        5 min       60 min
Input:     Total (pps)        68889        68538        52964        41879
                 (bps)    663208264    658867352    464399304    409978144
Output:    Total (pps)        68857        68538        52963        41880
                 (bps)    641941992    638171736    462007992    397564136
Processing: Load (pct)           99           99           72           63
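
Doing the math on those counters (my arithmetic, not router output): 663,208,264 bps / 68,889 pps ≈ 9,627 bits ≈ 1,203 bytes average packet size. At that packet size, 1 Gbit/s would be roughly 104,000 pps, i.e. about 50% more packets than the QFP is already handling at 99% load.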

Network speed becomes ~750 Mbit/s if I remove crypto and send packets of 1350 bytes.

Tunnel and crypto configuration:

interface Tunnel200
 bandwidth 10000
 ip address 10.10.200.229 255.255.252.0
 no ip redirects
 ip mtu 1396
 ip bandwidth-percent eigrp 1 100
 ip nhrp authentication <secret>
 ip nhrp map 10.10.200.1 9.9.9.9
 ip nhrp map multicast 9.9.9.9
 ip nhrp network-id 200
 ip nhrp nhs 10.10.200.1
 ip tcp adjust-mss 1356
 delay 1050
 tunnel mode gre multipoint
 tunnel key 200
 tunnel path-mtu-discovery
 tunnel protection ipsec profile <profile-name> shared
!
crypto ipsec transform-set <ts-name> esp-3des esp-sha-hmac
 mode transport
!
crypto ipsec profile <profile-name>
 set transform-set <ts-name>
 set pfs group5

I tried AES-256 crypto, but the result was the same as with 3DES.
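
One thing I have not tried yet is a combined-mode cipher. Assuming the image supports GCM transforms, AES-GCM does encryption and authentication in a single pass, so it may cost fewer cycles per packet than 3DES or AES-CBC plus a separate SHA HMAC (the transform-set name here is just a placeholder):

crypto ipsec transform-set <gcm-ts-name> esp-gcm 256
 mode transport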

 

Our service provider can't offer any way to eliminate this problem. Can I obtain 1 Gbit/s throughput in this case? Thank you.

 

With best regards,

Sergey Kanovskiy.

20 Replies

v.sylantiev
Level 1

Hello, did you find any solution to this problem?

I have the same problem.

I am trying to use a CSR1000v 17.3 on VMware ESXi 6.7 U3, on an HP DL360 G8 with dual E5-2697 v2 CPUs.

The license is Security, 1 Gbit/s.

But when it runs 200-300 Mbit/s of DMVPN with IPsec, it takes all 8 cores and latency rises to 100-200 ms.
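
As a first step, it may help to capture the same counters shown in the original post so the numbers can be compared directly:

show platform software cpu alloc
show platform hardware qfp active datapath utilization summary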

Serg_tsk
Level 1

Hello. I don't have a solution, but we accepted the fact that a virtual router uses a lot of resources for encapsulation and encryption/decryption. In my mind, those tasks are not "native" to a general-purpose CPU; they call for ASICs and specialized chips. We have come to terms with the fact that 1 Gbit/s is marketing, not real performance. The current rate in our system is 750 Mbit/s at full load.

With best regards, Sergey

"Those tasks is not "native" for general purpose CPU and those tasks requires ASICs and specialized chips in my mind."

As an aside, some years (decades, but who's counting) ago, before Cisco started including crypto hardware as standard, and while waiting for a Cisco crypto board, I tested a 3DES-encrypted tunnel on a 7200 (?). Without a crypto module, less than T1 throughput drove the CPU to 100%. After the crypto module was installed, it was able to run a T1 at full rate with "only" a 10 to 20% CPU increase.

I.e., specialized chips do make a difference.

TomasGahura2939
Level 1

Hi,

we are facing the same issue on a C8000v with a 5 Gbit/s license. We are experiencing high latency and packet loss even at 3 Gbit/s throughput. It depends on the number of packets coming through; usually a lot of smaller packets is the issue. With big packets we can get the full 5 Gbit/s without any issue. We had 8 vCPUs assigned to the VM, but while testing we found out it works better with only 2 vCPUs. Maybe a workaround, maybe not, but it is working fine even at high throughput.

Bug ID: CSCve73211

Smaller packets, due to the higher PPS rate, do often create higher loads. So, seeing a difference based on packet size isn't really too surprising.

(Years [decades] ago, network hardware vendors would often document throughput using 1500-byte Ethernet packets. Back then, getting vendors to document throughput for 64-byte Ethernet packets was sometimes like "pulling teeth".)

Seeing an improvement using 2 CPUs vs. 8, though, is surprising.

I agree, it was a surprise to me as well. We used to run 8 vCPUs and had issues. Now we have only 2 and it's running without any problem (well, 5 Gbit/s causes VMware to run turbo on that one CPU), but it has been working fine.

C8000v#show platform hardware qfp active datapath utilization sum
  CPP 0:                     5 secs        1 min        5 min       60 min
Input:     Total (pps)       262002       223601       229342       210894
                 (bps)   1701509024   1222316728   1170410248   1012115936
Output:    Total (pps)       261431       222967       228700       210252
                 (bps)   1701576032   1221695768   1168853848   1010488736
Processing: Load (pct)           25           23           24           23

C8000v#show platform software status control-processor brief     
Load Average
 Slot  Status  1-Min  5-Min 15-Min
  RP0 Healthy   0.49   0.48   0.50

Memory (kB)
 Slot  Status    Total     Used (Pct)     Free (Pct) Committed (Pct)
  RP0 Healthy  8087728  2679648 (33%)  5408080 (67%)   4040940 (50%)

CPU Utilization
 Slot  CPU   User System   Nice   Idle    IRQ   SIRQ IOwait
  RP0    0   3.31   1.41   0.00  95.25   0.00   0.01   0.00
         1  38.41   7.06   0.00  54.51   0.00   0.00   0.00

It looks much better with only 2 vCPUs. Maybe in the future we will go to 4, but for now it's working fine.
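
For reference, the same back-of-envelope math as earlier in the thread: 1,701,509,024 bps / 262,002 pps ≈ 6,494 bits ≈ 812 bytes average packet size, which fits the observation that many smaller packets (higher PPS) are what hurt.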
