
How to prove pause frames?

nicholaskriz
Level 1

I have a pair of Cisco Nexus 5596 switches which are connected to a Dell Compellent SAN. I am trying to prove that the SAN is sending MAC Pause frames (flow control) to the Nexus. The Nexus shows the "RxPause" counter incrementing for each interface connected to the SAN. However, the support representative from Dell Compellent does not see the corresponding "TxPause" counter on the SAN incrementing. 

In other words: the Nexus says it is receiving pause frames, but the SAN says it is not sending them. This is a direct fiber connection, so one of the two devices is lying. I'm pretty sure it's the SAN, but I need more evidence to show it.

I attempted a traffic capture using the "monitor session" commands, but later learned that MAC pause frames are not forwarded in a SPAN session like this, because the receiving interface intercepts and handles them before they can be forwarded.

My question is this: how can I demonstrate beyond a doubt that I am receiving these pause frames? Ideally I'd like to have this in a PCAP file because I know the Dell support folks will believe what they can see in Wireshark. Failing that, what else can I do?

Thanks!

Nick

1 Accepted Solution


Pause frames are consumed at the interface ASIC level and never sent to the CPU. They will not be captured by SPAN or Ethanalyzer.

The Rx pause frame counter is definite evidence that the switch is receiving the pauses. I'm not sure whether it's feasible for you to install a tap on the fiber to capture these frames, but that would only delay resolving the actual issue.

-Raj
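
For anyone who does get a tap capture, here is a minimal Python sketch of what to look for in the raw bytes. It relies only on the IEEE 802.3x frame layout (reserved destination MAC 01:80:C2:00:00:01, EtherType 0x8808, opcode 0x0001, then a 16-bit pause time in quanta of 512 bit times). The function names and the example source MAC are made up for illustration, and the parser assumes an untagged frame that starts at the destination MAC.

```python
import struct

PAUSE_DST_MAC = bytes.fromhex("0180c2000001")  # reserved MAC Control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PAUSE_OPCODE = 0x0001
QUANTUM_BITS = 512  # one pause quantum is 512 bit times


def parse_pause_frame(frame: bytes):
    """Return the pause time in quanta if `frame` is a PAUSE frame, else None.

    Assumes `frame` starts at the destination MAC (no preamble, no VLAN tag).
    """
    if len(frame) < 18:
        return None
    dst = frame[0:6]
    # Bytes 12..17: EtherType, MAC Control opcode, pause time (all big-endian)
    ethertype, opcode, quanta = struct.unpack("!HHH", frame[12:18])
    if dst != PAUSE_DST_MAC:
        return None
    if ethertype != MAC_CONTROL_ETHERTYPE or opcode != PAUSE_OPCODE:
        return None
    return quanta


def pause_seconds(quanta: int, link_bps: float) -> float:
    """Convert a pause time in quanta to seconds for a given link speed."""
    return quanta * QUANTUM_BITS / link_bps


def build_example_frame(src_mac: bytes, quanta: int) -> bytes:
    """Build a synthetic PAUSE frame (without FCS) for testing the parser."""
    header = PAUSE_DST_MAC + src_mac + struct.pack(
        "!HHH", MAC_CONTROL_ETHERTYPE, PAUSE_OPCODE, quanta
    )
    return header + bytes(60 - len(header))  # zero-pad to the 60-byte minimum


if __name__ == "__main__":
    frame = build_example_frame(bytes.fromhex("001122334455"), 0xFFFF)
    quanta = parse_pause_frame(frame)
    print(quanta, pause_seconds(quanta, 10e9))
```

On a 10 Gb/s link one quantum is 51.2 ns, so the maximum pause time of 65535 quanta asks the peer to stop sending for roughly 3.36 ms. In Wireshark, the display filter `eth.type == 0x8808` should isolate the same frames in a tap capture.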


20 Replies

Hi Nick,

I have not tried to capture PAUSE frames myself, but I have used the Ethanalyzer for a similar purpose. I suggest you look at the Cisco Nexus 5000 Troubleshooting Guide for information about the Ethanalyzer built into the Nexus switch.

Cisco Nexus 5000 Troubleshooting Guide - Ethanalyzer and SPAN

This capture utility may be able to capture the frames you are looking for.

BPDUs should be seen on the inbound-hi interface (eth4), but do try the inbound-low interface (eth3) if you are not seeing them.

Hope this helps.

Cheers,

JF


Thanks JF. I did look through the guide you linked to, but was unable to find what I was looking for in there. I will try the captures again to see if it is visible, but based on Raj's answer I don't think I'll find anything.

-Nick

nicolassadeg
Level 1

We have exactly the same problem...

Cisco Nexus 5548UP, with a Dell Compellent SAN connected.

Performance is really bad with multipath, and the Rx pause counter is increasing on the switch side.

Be careful with the counters on the array side! We were on a WebEx with support yesterday (and have had a case open for a month now...). Yes, the Tx pause counter is not increasing on the array side, but neither is the Tx packet counter... It seems they have a problem with their counters!

Which Compellent model do you have, and which SCOS version?

We ran some tests on SCOS 6.5, and the results were much better than on 6.6 and 6.7.

If you agree, we would be happy to be in contact with you ;)

On my side, I have an optical TAP for doing a packet capture on the fiber. I will do it tomorrow and keep you posted!

Nicolas

Thanks for the reply Nicolas! Sorry you are having the same problem, but at least I know I'm not the only one!

I believe we are on 6.5.20 for SCOS. I believe the model is CT-SC040. I am not the storage admin so I am not 100% certain.

If you are able to perform a capture with an optical tap, that would be a big stride forward in showing Dell that they are having this problem. I haven't been able to get anyone to believe me! I will keep an eye on this forum. If you are able to demonstrate the pause frames, I would be extremely grateful.

-Nick

Hi Nick,

OK, we have an SC8000, now on 6.7.

Unfortunately, the capture was not relevant.

Just to be sure that we have exactly the same problem: do you have a performance problem linked to those pause frames, or only the pause frames? It can be normal to see some when flow control is enabled.

And do you have Chelsio 10G dual-port network cards in it? What we have seen is that when both ports on those cards are in use, we have performance problems with pause frames. When we move one of the ports to another card, it works as expected without any pauses...

On the Dell side, ask them to look at the TxFrame counter when they are looking at TxPause. As I told you, for us neither counter was increasing while there was traffic. If it is the same for you, it proves that both counters are useless...

Keep in touch,

Nicolas

Well that's the rub, isn't it? I can't tell you if I have a performance issue due to pause frames or not because I can't say 100% that I have pause frames in the first place. Yes, I do appear to be having performance problems with our Dell Compellent SAN. Yes, my Cisco Nexus tells me it is receiving pause frames. Then again, Dell says they aren't sending them...

All that said, I don't believe I am having performance issues because of the pause frames. I think I'm having issues, and the pause frames would be a great indicator that the problem is on the SAN. Perhaps one of these days my SAN vendor will be able to give me an explanation.

I will certainly have them look at the TxFrame counter. If that isn't incrementing, it would also mean that either they are sending zero packets (which is obviously not true), or that their counters are all messed up. Great proof that I need deeper investigation!

-N
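
The counter check discussed above can be written down as a small Python sketch. This is only an illustration of the reasoning, not a vendor tool: `CounterSample`, `assess_counters`, and the numbers in the example are hypothetical, and the two samples are assumed to be read off the array's own statistics by hand at two points in time.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """One reading of the sender's counters (values typed in by hand)."""
    tx_frames: int
    tx_pause: int


def assess_counters(before: CounterSample, after: CounterSample,
                    traffic_was_flowing: bool) -> str:
    """Decide whether a flat TxPause counter can be trusted.

    If TxFrame does not move while traffic is known to be flowing, the
    counters themselves are suspect, so a flat TxPause proves nothing.
    """
    delta_frames = after.tx_frames - before.tx_frames
    delta_pause = after.tx_pause - before.tx_pause
    if traffic_was_flowing and delta_frames <= 0:
        return "unreliable: TxFrame did not move while traffic was flowing"
    if delta_pause > 0:
        return "sender reports pause frames"
    return "sender reports no pause frames"


if __name__ == "__main__":
    # Hypothetical readings taken a minute apart under load
    print(assess_counters(CounterSample(0, 0), CounterSample(0, 0), True))
```

The point of checking TxFrame first is that it gives a known-good baseline: the array is obviously sending frames, so a counter that says otherwise discredits its neighbors too.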

Nicholas,

I am currently experiencing the same issue, but with my Juniper 5100 Fabric connected to my client's Dell Compellent SAN. I am also experiencing performance issues on the interfaces reporting the most pause frames. I think the SAN cannot write internally as fast as it claims, and we are not getting any "real" support from Dell either...

Best Regards,

Richard

Hi Richard,

If you have a performance issue and the tx pause counter is increasing a lot on the switch side, on the ports facing the array, then the array is the problem.

And if you have Chelsio T3 network cards with SCOS 6.6 or later, you have the problem that we had.

Dell has an internal warning about that problem, so they should be able to solve it!

PS: we now have a second performance problem, mostly on reads, with the Chelsio T5...

Did you ever receive additional information from Dell regarding this issue?

Hi,

For us, the first issue (write latency) was solved by changing the Chelsio network card from T3 to T5.

The second issue (read latency) was solved by enabling Tx flow control on the ports facing the Compellent (initially they recommended leaving it off...).

Hi Nicolas,

We are experiencing an issue on our SC8000 (connected to Nexus 5548) which sounds exactly like yours.. Read performance is excellent (2100MB/s) but write performance is horrible (200MB/s with 100ms latency when using larger blocks)

It seems to have started after upgrading to 6.7 from 6.5 and having an extra iSCSI NIC installed in each controller.. Do you have any case reference or anything regarding that internal warning I can point Copilot towards?

Hi Mark, Do you know if you have Chelsio T3 network cards? If yes, it will be the same problem :) Our case reference was SR 925374479. They have an important internal notice about this. Regards, Nicolas

The Compellent unit we have has Chelsio T3s integrated into the controller, so we can't swap them out.  I've asked our storage team to go back to Dell with this information and see if we can swap out controllers for a unit with the T5 cards built in.