
slow CIFS write performance

krattiger
Level 1

Hi Community

I'm facing a problem with writing files to a CIFS share.

Reading warmed data from the same CIFS share is fast: a 4.5MB Word document takes about 0.5 seconds to download from the share. Uploading the same file takes more than 6 seconds.

The timing as well as the uploads and downloads are scripted, so this behavior shows up consistently, day and night.
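A minimal sketch of what such a scripted timing test could look like in Python. This is not the script actually used here; the function name `timed_copy` and the paths in the usage note are hypothetical placeholders.

```python
import shutil
import time


def timed_copy(src, dst):
    """Copy src to dst and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    return time.perf_counter() - start
```

On a Windows client the share would be addressed by its UNC path, for example `timed_copy(r"C:\temp\test.doc", r"\\fileserver\share\test.doc")` for an upload and the reversed arguments for a download. Both paths are made up for illustration.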

When I check "show stat conn opt" for the download, the session looks perfect, with 99% reduction. The uploads also look quite good, but compared to the download they take ages.

Btw. the file server is a NetApp device and the client is a Windows XP PC.

Any troubleshooting suggestions or well-known issues on that?

Kind Regards

-Lukas

16 Replies

Bhavin Yadav
Cisco Employee

Hi Lukas,

Before we come to any decision, a few more details will help us determine whether you are facing a known issue. If you are using 4.1.5f or an older version, you might be hitting a known defect that resulted from an MS security update.

Defect details:

http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCtg75908

Again, it depends on which version you are running on the edge and server side, plus the behaviour of the problem.

Hope this helps.

Regards,

Hi

So here is some more information.

We are running WAAS version 4.2.3b. The lines have various bandwidths, from 2Mb/s up to 100Mb/s. Latency is 8-10ms.

The thing I don't really understand is the big difference between read and write operations.

Kind Regards

-Lukas

Hi Lukas,

The WAAS solution is for optimization and latency mitigation on WAN links. Generally, 8-10 msec latency is considered very low for wide area networks.

For your original question: a WRITE operation is usually slower than a READ operation, mostly due to the nature of the CIFS protocol itself rather than WAAS. Let's say a user edits a file, makes a change to it, and writes it back; that operation will require a full write over the WAN. An MS application is going to initially write the changed file as a temporary file and then to the original file name before sending it out. The write operation is going to be slower than pulling from cache, because the file has to cross the WAN and is subject to latency.
 
However, it will be much faster than a non-WAAS write operation because
you have optimization (TFO+DRE+LZ) as well as CIFS AO handling locally some of the
chattiness of CIFS.
 
On the write operation, we don't have the benefit of read-ahead, nor the benefit of serving from cache locally. We are writing as fast as the client can give us data and as fast as it can be sent over the WAN, while buffering to speed up what the client gives us. The bits still have to be sent over the WAN so the file server can store the new file. No way around that. But we can still get the benefit of TFO/DRE/LZ.
 
Bottom line, READ and WRITE operations are not directly comparable.
This link might help to understand some CIFS semantics:
http://www.cisco.com/en/US/docs/app_ntwk_services/waas/waas/v421/configuration/guide/filesvr.html#wp1042382

Hope this helps

Ashfaq-

Dear Ashfaq

I agree with your post. Under normal conditions my writes are about 4x slower than my reads, not the 28x slower that I am seeing here.

Let me give you some more information:

It takes about 1 min to download a 100MB file from a CIFS share. This result is the same whether the files are hosted on the NetApp or on the Win2k3 server.
Uploading the same 100MB file to the CIFS share hosted on the Win2k3 server takes a little longer, but still less than 2 min.
These results are expected.
Uploading the file to the NetApp CIFS share takes more than 10 times longer -> 10 min. This is with full optimization and CIFS AO.
To test whether CIFS AO together with the small latency is causing the issue, we disabled CIFS AO and used only TFO, DRE & LZ. With these settings, the transfer takes 16 min.
Using a transfer with pass-through reduces the time to 1 min.
Removing the acceleration between branch and DC improves the speed about 16 times -> 1 min.
To test whether the issue is caused by a “caching” thing, we enabled only TFO. The upload with only TFO also takes 16 min.
It looks like enabling TCP optimization alone is enough to break the performance in this specific case.
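To put the transfer times in this post into perspective, here is a small sketch that converts them into effective throughput. It assumes 1 MB = 10^6 bytes and uses the rounded minute values quoted here, so the results are rough estimates only; the function name is made up for illustration.

```python
def throughput_mbit_per_s(size_mbyte, minutes):
    """Effective throughput in Mbit/s for size_mbyte megabytes moved in the given minutes."""
    return size_mbyte * 8 / (minutes * 60)


# Transfer times in minutes for the 100MB file, as quoted in this post.
measurements = {
    "download": 1,
    "upload to NetApp, full optimization + CIFS AO": 10,
    "upload to NetApp, TFO only": 16,
    "upload to NetApp, pass-through": 1,
}

for name, minutes in measurements.items():
    print(f"{name}: {throughput_mbit_per_s(100, minutes):.1f} Mbit/s")
```

With these numbers the pass-through upload runs at roughly 13 Mbit/s, while the TFO-only upload drops below 1 Mbit/s.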
Kind Regards

-Lukas

Hi Lukas,

This seems odd.

What version of NetApp are you running?

Do you have DNS configured so that it can resolve the NetApp server?

Is it inline or WCCP?

If you are not seeing any routing loops and your DNS resolves the server name OK, then I would suggest opening a TAC case with the sysreport of both WAEs and, if possible, a packet capture on both WAEs and on the client machine.

Thanks

Ashfaq-

Is there any news about the outcome of this issue? We actually observe the same situation with WAAS 4.3.3 and a NetApp 3140 with ONTAP 7.3.3.

We shut down CIFS AO, DRE and LZ step by step. Even with TFO only, CIFS write performance is degraded by a factor > 10. After switching CIFS to pass-through, writes reached the performance typical for the bandwidth.

Other filers (Windows 2008 and other NetApps on different hardware) are running with good performance.

Hello,

A couple of basics we need to find out to address this issue.

1. What is the version of NetApp?

2. What sort of connection do you see when you use "sh stat conn" on the WAE? Is it TCDL or TDL?

3. Do you think you have SMB signing turned on on the filers? You can use type-tail on the cifs_err.log file under the \local\local1\errorlog\cifs folder to see whether it shows any SMB signing errors.

4. What happens if you create a policy to disable CIFS AO for the NetApp filer? Do you get better performance?

Regards.

NetApp hardware is 3140

ONTAP version 7.3.3

WAAS 4.3.3

When the problem was reported, the connection was in status TCDL.

Disabling CIFS AO, DRE and LZ (step by step) didn't improve the situation. Only pass-through resulted in performance typical for the line bandwidth and delay. Our assumption is a basic TCP flow related issue.

We cross-checked performance against an MS 2008 filer (excellent result).

Hi Peter

We've been able to track down that a software upgrade on the NetApp was the source of the problem.

Initially the customer used NetApp ONTAP 7.3.2, which worked fine. After the upgrades to 7.3.5p1 and 7.3.5.1 we started seeing the slow CIFS performance.

Changing the TCP window size to 65k on the NetApp solved the problem.

Some side notes: We had about 30 branch sites with less than 10Mb/s which didn't see the slowness. Only sites with 100Mb/s and low latency had the slowness.
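As a rough aid for reasoning about window sizes on links like these, here is a sketch of the general TCP arithmetic: a single connection can have at most one window of data in flight per round trip. It assumes "65k" means 65535 bytes and ignores window scaling and any buffering done by the WAEs, so it does not by itself explain the behavior seen here. Both function names are made up for illustration.

```python
def max_throughput_mbit_per_s(window_bytes, rtt_ms):
    """Upper bound for one TCP connection: one full window per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6


def bandwidth_delay_product_bytes(link_mbit_per_s, rtt_ms):
    """Bytes that must be in flight to keep a link of this speed full."""
    return link_mbit_per_s * 1e6 / 8 * (rtt_ms / 1000)


# Assumed values: a 65535-byte window and the 10ms latency from this thread.
print(f"{max_throughput_mbit_per_s(65535, 10):.1f} Mbit/s max per connection")
print(f"{bandwidth_delay_product_bytes(100, 10):.0f} bytes in flight for 100 Mbit/s")
print(f"{bandwidth_delay_product_bytes(10, 10):.0f} bytes in flight for 10 Mbit/s")
```

At 10ms round-trip time a 100Mb/s link needs about ten times more data in flight than a 10Mb/s link, which is at least consistent with window-related effects showing up only on the fast sites.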

I don't think that the WAAS Release is the issue. We had WAAS Release 4.2.3a installed and the issue is still there with 4.3.3.

Kind Regards

-Lukas

Thanks for sharing the update, Lukas.

Peter: Please make sure the NetApp side is clean. If disabling CIFS AO did not help, this is not related to CIFS AO. This may be an overall issue. You may want to open a TAC case to look into it in more detail. Please also make sure that there are no duplex / routing issues.

Regards.

Hello,

In preparation for a CCO ticket we set up a test lab with two WAVE 274 devices. We could not reproduce the problem. The problem only occurs with a 7341 in the core. The WAE 7341 was running in a standard setup. We checked the 7341 and found that the port channel link to the switch was running in round-robin mode. After switching to a src-dst-ip-port based mode, CIFS write performance improved dramatically. We implemented the change in the field, and first tests indicate that CIFS writing now runs at LAN-like performance, even over WAN links with high delay and poor bandwidth.
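To illustrate the difference between the two load-balancing modes, here is a minimal sketch. Round robin sends consecutive packets of the same TCP flow over different member links, so they can arrive out of order, while a hash over source/destination IP and port keeps each flow on one link. Reordering as the cause of the slow writes is an assumption that was not confirmed here, and the CRC32 hash below is only a stand-in, not the algorithm the WAE or the switch actually uses.

```python
import zlib
from itertools import count


def round_robin_selector(num_links):
    """Return a function that assigns each packet to the next link in turn."""
    counter = count()

    def select(flow):
        return next(counter) % num_links

    return select


def flow_hash_selector(num_links):
    """Return a function that picks the link from a hash of the flow tuple."""

    def select(flow):
        key = "|".join(str(field) for field in flow).encode()
        return zlib.crc32(key) % num_links

    return select


# Hypothetical flow: (src ip, dst ip, src port, dst port)
flow = ("10.1.1.10", "10.2.2.20", 49152, 445)

rr = round_robin_selector(2)
fh = flow_hash_selector(2)
rr_links = [rr(flow) for _ in range(6)]  # link chosen for six packets of one flow
fh_links = [fh(flow) for _ in range(6)]
print("round robin:", rr_links)
print("flow hash:  ", fh_links)
```

In the round-robin case the six packets alternate between the two links, whereas the hash-based selector returns the same link for every packet of the flow.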

On CCO I could not find any best practice regarding the configuration of WAAS, port channel and load balancing. Is any public document available?

Many thanks in advance. Peter

Hi Peter,

I'm facing some similar performance issues and wondered if you had any joy with documentation of the port channel and load balancing question?

I see from my set up that all my devices are using the factory default round-robin and I'm considering the change to src-dst-ip-port.

Many thanks in advance.

Regards,


Marc

Hi Marc,

- our implementation is running fine since we switched port channel LB on the WAAS devices to src-dst-ip-port.

- after internal discussions with BE our Cisco SE agrees with the decision to prefer src-dst-ip-port.

- no official Cisco document seems to be available

Regards

Peter

Thanks Peter. I will write that up ready for a global change and feed back my experience to the community.
