From C6500 to C9500: ACL TCAM above maximum, what are the consequences?

thibaultm
Level 1

Hello, I previously opened a discussion about calculating TCAM utilization,
and I got this very good document:
https://www.cisco.com/c/en/us/support/docs/switches/catalyst-9500-series-switches/217266-validate-security-acls-on-catalyst-9000.html
Based on it, I have calculated that we will be above the maximum TCAM allocation for Ingress IPv4 Access Control Entries.

To see how we can tweak TCAM entries with custom SDM templates, I used the SW configuration guide :
https://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst9500/software/release/17-9/configuration_guide/sys_mgmt/b_179_sys_mgmt_9500_cg/configuring_sdm_templates.html#Cisco_Task.dita_c0f5c41f-f363-4f10-9605-3d19ef379b4a

In that guide I see (in Table 4) that the maximum for ingress ACL is 26624
(it has to be divisible by 2048).

But the problem is that this value covers both:
Security Ingress IPv4 Access Control Entries*
Security Ingress Non-IPv4 Access Control Entries*

The SW configuration guide offers no way to configure one value for IPv4 and another for non-IPv4, so I assumed there is a fixed ratio between the two that can't be configured. I went further and assumed that the ratio* is constant:
with 26624 for 'acl-ingress' it gives 15530 entries for Security Ingress IPv4 Access Control Entries and
11093 for Security Ingress Non-IPv4 Access Control Entries.
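
The two figures above can be reproduced with a quick calculation. This is only a sketch: the 7:5 split is inferred from the numbers in this post (15530 / 26624 is roughly 7/12), it is not a value stated in the configuration guide, and the names `split_acl_ingress`, `IPV4_SHARE` and `NON_IPV4_SHARE` are made up for illustration.

```python
BLOCK = 2048        # acl-ingress has to be divisible by this value
IPV4_SHARE = 7      # assumed ratio, inferred from the figures in this post
NON_IPV4_SHARE = 5  # assumed ratio, inferred from the figures in this post


def split_acl_ingress(total):
    """Split an acl-ingress value into (IPv4, non-IPv4) entries, rounding down."""
    if total % BLOCK != 0:
        raise ValueError(f"{total} is not divisible by {BLOCK}")
    parts = IPV4_SHARE + NON_IPV4_SHARE
    return total * IPV4_SHARE // parts, total * NON_IPV4_SHARE // parts


ipv4, non_ipv4 = split_acl_ingress(26624)
print(ipv4, non_ipv4)  # 15530 11093
```

If the real split on the switch differs, only the two share constants need to change.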

The issue is that, with its ACLs, the customer is above 15530 entries.
In fact, counting plain TCAM entries it is not above the limit, but with the L4OPs and VCUs the consumption goes over it, and the document states:

VCU Exhaustion
Once over the L4OPs limit or out of VCUs, the software performs ACL expansion and creates new ACE entries in order to perform equivalent action without using VCUs.
Once this happens TCAM can become exhausted from these added entries.

But what is not clear is where in the TCAM those extra ACEs are taken from.
If it is from the Non-IPv4 space, great, but if it is only from Security Ingress IPv4 Access Control Entries,
the customer will be above 15530.

So my problem is that I cannot anticipate the behavior.
If the ACEs can be taken from anywhere in the TCAM, it is OK, because apart from ACLs the customer's usage is not big,
but if they come from Security Ingress IPv4 Access Control Entries*, there will be erratic behavior with a full TCAM.

I'm afraid that seeing the consequences will require real traffic, and I can only think of production traffic, which means an impact on production.
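
To make the two hypotheses concrete, here is a minimal sketch that checks a load against each one. The pool sizes are my estimates from earlier in this post; the load figures are placeholders, not the customer's real numbers, and the function name `fits` is hypothetical.

```python
# Pool sizes taken from the estimate earlier in this post.
IPV4_POOL = 15530
NON_IPV4_POOL = 11093


def fits(ipv4_aces, expanded_aces, non_ipv4_aces, shared_pool):
    """Return True if the ACL load fits under the given hypothesis."""
    if shared_pool:
        # Hypothesis A: expanded entries may land anywhere in the ingress ACL space.
        total = ipv4_aces + expanded_aces + non_ipv4_aces
        return total <= IPV4_POOL + NON_IPV4_POOL
    # Hypothesis B: expanded entries count against the IPv4 pool only.
    return (ipv4_aces + expanded_aces <= IPV4_POOL
            and non_ipv4_aces <= NON_IPV4_POOL)


# Placeholder load, NOT the customer's real figures.
load = dict(ipv4_aces=14000, expanded_aces=3000, non_ipv4_aces=500)
print(fits(**load, shared_pool=True))
print(fits(**load, shared_pool=False))
```

With these placeholder values the load fits under hypothesis A and overflows under hypothesis B, which is exactly the uncertainty I am trying to resolve.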

I know it's not easy to answer, but I would very much appreciate any input, or even feedback on a comparable issue on a Cat9500H.

* By the way (feature request), I think this ratio should be made configurable, because it's too bad to be in a difficult situation with IPv4 traffic when nothing is consumed for IPv6.


Joseph W. Doherty
Hall of Fame

Quickly skimming other Cisco information about TCAM exhaustion, and about the impact of L4OP exhaustion on TCAM exhaustion (an issue that seems applicable to other Cisco switches), L4OP exhaustion will generate many "standard" TCAM entries for an individual ACE (which would not be generated if there was no L4OP exhaustion).  It's the "many" additional entries that can quickly exhaust TCAM.

The number of additional TCAM entries would depend on the original ACE, and possibly, the actual switch's hardware.

As to what portion of TCAM would be utilized, I believe it would be the same TCAM portion supporting ACEs not requiring any L4OPs.

From ACL and QoS TCAM Exhaustion Avoidance on Catalyst 4500 Switches:

 

Also, if the L4Op limit is exceeded, the specific ACE is expanded in the TCAM. Additional TCAM utilization results. This ACE serves as an example:

access-list 101 permit tcp host 8.1.1.1 range 10 20 any 
With this ACE in an ACL, the switch uses only one entry and one L4Op. However, if six L4Ops are already used in this ACL, this ACE is expanded to 10 entries in the hardware. Such an expansion can potentially use up a lot of entries in the TCAM. Careful use of these L4Ops prevents TCAM overflow.

Note: If this case involves the Supervisor Engine V-10GE and WS-C4948-10GE, eight previously used L4Ops in the ACL results in the ACE expansion.
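
As a rough illustration of why one range ACE can turn into several TCAM entries, here is a sketch of the generic, textbook way to cover a port range with value/mask pairs. This is not Cisco's actual expansion algorithm, and `range_to_prefixes` is a made-up name; for ports 10 to 20 it yields 4 pairs, whereas the quoted document reports 10 entries, so the real hardware evidently counts differently.

```python
def range_to_prefixes(lo, hi, bits=16):
    """Cover the port range lo..hi with aligned power-of-two blocks.

    Returns a list of (value, mask) pairs, one per ternary entry.
    """
    entries = []
    while lo <= hi:
        # Largest block that is aligned at lo (the whole space if lo is 0).
        size = lo & -lo if lo else 1 << bits
        # Shrink the block until it no longer overshoots hi.
        while size > hi - lo + 1:
            size >>= 1
        mask = ((1 << bits) - 1) & ~(size - 1)
        entries.append((lo, mask))
        lo += size
    return entries


for value, mask in range_to_prefixes(10, 20):
    print(f"value={value:5d} mask=0x{mask:04x}")
```

The count depends heavily on where the range boundaries fall, which matches the remark above that the number of additional entries depends on the original ACE.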

 

Also, from TCAM Resource Issue Workarounds Explained:

 

ACE Expansion Threshold

ACEs using L4 operators  - range, gt, lt, neq. There are two ways for software to handle L4 operators:

1. Allocate L4op (hardware resource) and program LOU register (another hardware resource)
2. Expand the ACE into multiple eq entries (i.e., CL TCAM entries)

Global command hardware access-list lou resource threshold controls when option 1 vs option 2 occurs for an ACE. The expansion threshold controls when expansion occurs, the default threshold is 5. If an ACE can be expanded into <=5 CL TCAM entries, no L4op allocated.

Pros/cons:


Expansion results in more TCAM entry consumption
L4op/LOU usage limited by L4ops per label (10) and LOU registers (208)
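
The threshold rule quoted above can be sketched as a small decision function. This is only a model under a simplifying assumption of mine: one eq entry per matched port. The quoted text does not say how the platform actually counts, and the names `eq_entries_needed` and `choose` are hypothetical.

```python
MAX_PORT = 65535
DEFAULT_THRESHOLD = 5  # default expansion threshold per the quoted text


def eq_entries_needed(op, port, port_hi=None):
    """Number of eq entries needed, assuming one entry per matched port."""
    if op == "range":
        return port_hi - port + 1
    if op == "gt":
        return MAX_PORT - port   # ports port+1 .. 65535
    if op == "lt":
        return port              # ports 0 .. port-1
    if op == "neq":
        return MAX_PORT          # every port except one
    raise ValueError(f"unsupported operator: {op}")


def choose(op, port, port_hi=None, threshold=DEFAULT_THRESHOLD):
    """Return 'expand' (option 2) or 'l4op' (option 1) for an ACE."""
    needed = eq_entries_needed(op, port, port_hi)
    return "expand" if needed <= threshold else "l4op"


print(choose("range", 10, 12))   # 3 ports: small enough to expand
print(choose("range", 10, 20))   # 11 ports: uses an L4op instead
```

Under this model, small ranges are expanded and save an L4op, while wide operators such as neq always consume one.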

 

Possibly, the above two snippets will provide additional insight.

The moral of the story: first, don't exhaust resources; unfortunately, predicting resource needs isn't easy.

Besides monitoring resource usage on a Cisco switch, with any ACL it might help, if the device doesn't do it automatically, to optimize the ACL (reduce its number of ACEs), either manually or with a tool.

thibaultm
Level 1

Thanks a lot for your reply and all its interesting content.

Absolutely, all switches are subject to TCAM exhaustion, but according to show tcam counts, the 6500
I will be migrating from seems to have 32768 entries for ingress/egress IPv4 (IPv6?).
The Catalyst 9000 series has a more fragmented and dedicated TCAM space: TCAM entries are separated
along IPv4 and non-IPv4, and ingress and egress.
So even if the overall quantity is very significant too,
you can fully use one of these types of TCAM entries while the others go unused.
That's why I'm convinced that more flexibility is needed, and
porting the Nexus feature of adjusting IPv4 vs non-IPv4 ACEs through a custom SDM template
would help a lot!

Your C4500 example is very interesting, but if the Cat9k had the same behaviour, I imagine it would be described
in the first document I listed above, no?

Summarizing ACLs is a very good idea; the document is very interesting, I'll dig into it.

Speaking of which, the customer told me he even tried AI for ACL simplification (Copilot, I think),
but the result is not functionally equivalent: it is more permissive,
so it changes the specification.


@thibaultm wrote:

Thanks a lot for your reply and all its interesting content.


I have even more, possibly interesting, material about ACL optimization.

Found mention of an ACL optimization feature on the Cisco FWSM.  Found an old (unanswered) forum community posting asking for some more information.  Found another mention of the Cisco FWSM ACL optimization, in the relevant device section.  Even found this short paper about ACL optimization.  Lastly, it appears Cisco's security manager can perform various ACL optimizations.

I suspect, on platforms that can do ACLs in hardware, like those using TCAMs, "sloppy" ACL constructs don't impact performance, just TCAM resource needs, and as long as you don't run short of TCAM space, no need to really optimize an ACL.

On platforms that do the ACLs in software, "sloppy" ACLs, or those not "tuned" for usage patterns, slow throughput.  So, also some years back, Cisco introduced the Turbo ACL feature, to speed up software processing of ACLs.


@thibaultm wrote:

Absolutely, all switches are subject to TCAM exhaustion, but according to show tcam counts, the 6500
I will be migrating from seems to have 32768 entries for ingress/egress IPv4 (IPv6?).
The Catalyst 9000 series has a more fragmented and dedicated TCAM space: TCAM entries are separated
along IPv4 and non-IPv4, and ingress and egress.
So even if the overall quantity is very significant too,
you can fully use one of these types of TCAM entries while the others go unused.
That's why I'm convinced that more flexibility is needed, and
porting the Nexus feature of adjusting IPv4 vs non-IPv4 ACEs through a custom SDM template
would help a lot!


Certainly, improvements might be made, but any hardware vendor tries to match a (sellable) product to what the market (believes) it needs.  Nexus has a bit of a different market orientation than Catalysts, which, just guessing, is possibly why Nexus has some TCAM management superior to that of the 6500s or 9Ks.  Even in the 9K series, there are subtle differences between model series, and even within the same series, such as differences in their ASICs/UADPs.

In my readings, lower tier switches, like the Catalyst 3850, also have (apparently) an inferior TCAM ACL setup compared to even the later Catalyst 4K sups.  (No real surprise there.)  Many will focus on PPS and/or bandwidth capacity, but there is much more, architecturally, that can distinguish between various switches.  Much of that, though, matters only for uncommon usage cases, but if you do have an uncommon usage, there is often a huge "performance" difference between network devices that share other common attributes.  So it's also unsurprising that a C4500 and a C9K aren't exactly alike.


@thibaultm wrote:

Speaking of which, the customer told me he even tried AI for ACL simplification (Copilot, I think),
but the result is not functionally equivalent: it is more permissive,
so it changes the specification.


To me, that's also unsurprising.  "AI", to me, appears to be currently mostly an "Idiot-Savant".  If there's some kind of (good) knowledge base it can draw upon, it does very well, but if not, it seems to stumble, without even realizing it's incorrect.  The difficulty I, and possibly you, have finding information might be the same issue an AI has.
