TFTP connection timed out copying image to 4900M

Terence_O
Level 1

Running into an issue when trying to download an updated IOS image file to bootflash: from a TFTP server.

Pings to server from 4900M switch are good

Pings to 4900M switch from server are good

copy startup-config to tftp is successful.

copy tftp bootflash:  times out

 

example output:

SWITCH#copy tftp: bootflash:

Address or name of remote host []? 10.20.30.12

Source filename []? cat4500e-entservices-mz.152-4.E9.bin

Destination filename [cat4500e-entservices-mz.152-4.E9.bin]?

Accessing tftp://10.20.30.12/cat4500e-entservices-mz.152-4.E9.bin...

ifs_check_file 359 CPU_i86 0

ifs_check_file 361 cpu 183

ifs_check_file 362 cpu_family -1

Loading cat4500e-entservices-mz.152-4.E9.bin from 10.20.30.12 (via Vlan100): !O !OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO

OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO... [timed out]

%Error reading tftp://10.20.30.12/cat4500e-entservices-mz.152-4.E9.bin (Connection timed out)

 

 

tftp server address: 10.20.30.12    -  virtual server

4900 switch address: 10.20.30.2

 

SVI on 4900

interface Vlan100

 mtu 9198

 ip address 10.20.30.2 255.255.255.0

 ip access-group 110 out

 no ip redirects

 no ip unreachables

 standby 110 ip 10.20.30.1

 standby 110 timers msec 200 msec 600

 standby 110 priority 120

 standby 110 preempt delay reload 300

end

 

ACL 110 is just a tracking acl and does not filter traffic.

SWITCH#sh ip access-l 110

Extended IP access list 110   

    10 permit ip any any

 

The 4900M is scheduled for replacement in the near future, but upgrades are needed to address CVEs.


Richard Burts
Hall of Fame

The output you provided does give us a clue about the issue. As the copy starts you get !OO. The ! indicates a successful transfer. The O indicates out-of-order packets. We do not know anything about the topology here, but it looks to me like there are some issues on the connection from the 4900 to the server.
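As a rough way to quantify what the progress line is saying, the characters can simply be counted. This is a minimal sketch, assuming the progress string has been pasted from the console by hand; the function name `summarize_tftp_progress` is hypothetical, and the reading of `.` as a timeout mark is my understanding of the IOS copy output rather than something shown in this thread.

```python
from collections import Counter


def summarize_tftp_progress(progress: str) -> dict:
    """Count the progress marks printed by an IOS TFTP copy.

    Assumed meanings: '!' = successful transfer, 'O' = out-of-order
    packet, '.' = timeout. Any other character (spaces, text) is ignored,
    so pass only the progress marks, not the trailing '... [timed out]'.
    """
    counts = Counter(ch for ch in progress if ch in "!O.")
    total = sum(counts.values())
    return {
        "ok": counts["!"],
        "out_of_order": counts["O"],
        "timeouts": counts["."],
        "out_of_order_ratio": counts["O"] / total if total else 0.0,
    }


# Shortened sample in the same shape as the output posted above.
print(summarize_tftp_progress("!O !OOOOOOOO"))
```

A ratio close to 1.0, as in the posted output, suggests that almost nothing after the first blocks arrived in the expected sequence.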

 

HTH

Rick

The architecture looks like this, top down, with 4900s as the core of a small datacenter, 9Ks as distribution, and Dell switches at the access layer:

 

4900sw1      4900sw2

 

9300sw1      9300sw2

 

access sw1    access sw2 .....access swN

 

Dell/EMC Virtual environment

Thanks for the additional information. Here are some comments about what I notice, what I am thinking, and what I might suggest:

- the switch and the server are in the same subnet (same vlan 100). So we do not have to be concerned with any layer 3 routing type issues.

- connectivity between the switch and the server is good for things that are fairly quick - such as ping or even copying a config file.

- but connectivity seems problematic for things that are big - like copying an image file.

- typically ping is short and quick. But I wonder, if you did an extended ping with a repeat count of several thousand, whether you might see some pattern of dropped packets? And I wonder whether the behavior might change if you ran 2 or 3 of the extended pings at the same time?

- would I be correct in assuming that there might be more than 1 layer 2 path from the switch to the server? 

- could you find the layer 2 path between the devices? Using the MAC address of the server, do show mac address-table on the switches between the server and the switch to find the path from the switch to the server. Then, using the MAC address of the switch, do the same to find the path from the server to the switch. Is the path the same going and coming? I wonder whether the path might change while the image file is being copied?

- I wonder if there is any possibility that there is spanning tree instability that might provoke the problem?

- I wonder, if you check the logs on the switches between the server and the 4900, whether there are any messages that might relate to this behavior?

- I remember working with a customer doing code upgrades to a bunch of routers and switches. These were not in the same subnet, and in fact many of them were over a WAN connection that was pretty heavily used. I had the experience many times of starting an image copy to a remote device, watching it run for a good amount of time, and then seeing it time out. I discovered that if I were careful about when I did the image copy (looking for times when traffic on the WAN was lower) I could get the image copies to work better. But then I found a better solution. Instead of using TFTP (which uses UDP and is not a reliable transport), if I used something with a reliable transport (like FTP, SCP, or HTTP) I could run the image copies any time of the day and they worked! So ultimately my suggestion is that perhaps you might want to use a different protocol to copy the image files.
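For the extended ping idea in the list above, a long run of result marks is hard to judge by eye. The following is a minimal sketch, assuming the `!` (reply) and `.` (timeout) marks from the IOS ping output have been pasted into a string; the function name `analyze_ping` is hypothetical.

```python
def analyze_ping(marks: str) -> dict:
    """Summarize IOS ping result marks: '!' = reply, '.' = timeout."""
    # Ignore line breaks and anything else that is not a result mark.
    marks = "".join(ch for ch in marks if ch in "!.")
    sent = len(marks)
    lost = marks.count(".")

    # Track the longest run of consecutive timeouts to spot bursts.
    longest = current = 0
    for ch in marks:
        current = current + 1 if ch == "." else 0
        longest = max(longest, current)

    return {
        "sent": sent,
        "lost": lost,
        "loss_pct": 100.0 * lost / sent if sent else 0.0,
        "longest_loss_burst": longest,
    }


print(analyze_ping("!!!!.!!..!"))
```

Isolated single drops and long bursts point in different directions, so the burst length may be more telling than the overall percentage.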

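The hop-by-hop MAC table lookup suggested in the list above can be sketched as a small walk over collected data. Everything here is illustrative: the MAC address, port names, and the `mac_tables` and `links` structures are hypothetical stand-ins for what you would gather by hand from each switch.

```python
def trace_l2_path(start, mac, mac_tables, links, max_hops=16):
    """Follow the port on which `mac` is learned, switch by switch.

    mac_tables: {switch: {mac: port}}      (hypothetical, collected by hand)
    links:      {(switch, port): neighbor} (hypothetical inter-switch links)
    Returns a list of (switch, port) hops; port is None if the MAC is unknown.
    """
    path = []
    seen = set()
    switch = start
    while switch is not None and switch not in seen and len(path) < max_hops:
        seen.add(switch)  # guards against walking around a loop
        port = mac_tables.get(switch, {}).get(mac)
        path.append((switch, port))
        if port is None:
            break
        # An edge port has no entry in links, which ends the walk.
        switch = links.get((switch, port))
    return path


server_mac = "aaaa.bbbb.cccc"  # hypothetical server MAC
mac_tables = {
    "4900sw1": {server_mac: "Te1/1"},
    "9300sw1": {server_mac: "Te1/0/1"},
    "access-sw1": {server_mac: "Gi1/0/10"},
}
links = {
    ("4900sw1", "Te1/1"): "9300sw1",
    ("9300sw1", "Te1/0/1"): "access-sw1",
}
print(trace_l2_path("4900sw1", server_mac, mac_tables, links))
```

Running the same walk from the server side with the switch's MAC, and repeating both during a copy, would show whether the forward and return paths match and whether they stay stable.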
HTH

Rick

Some interesting points you've brought up. I will definitely run some extended ping tests to see what happens. Also, I was in the process of looking into pathing between my TFTP server and core switch, but had two cutovers to prepare for. So, first thing back in the office I'll dig into the communications path question. One other thing I am looking into is that we're running HSRP with vPC, which is working solidly for us, but I have to see whether this is somehow contributing to the "O" (out-of-order) results.

Where is the TFTP server located in the network?

Deepak Kumar
VIP Alumni

Hi,

@Richard Burts has explained this very well. This error is common when you are working with TFTP over L3 routing or a slow network. I would suggest you switch to an FTP server.
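To put a number on why TFTP is sensitive to loss and delay on large files: in the base protocol (RFC 1350) data moves in 512-byte blocks, and each block must be acknowledged before the next is sent. The sketch below is only arithmetic under that assumption. The file size and round-trip time are hypothetical values, not taken from this thread, and a negotiated larger block size would change the result.

```python
def tftp_data_packets(file_size_bytes: int, block_size: int = 512) -> int:
    """Number of DATA packets for a lock-step TFTP transfer.

    The transfer ends with a block shorter than block_size, so a file
    that is an exact multiple still needs one final zero-length block.
    """
    return file_size_bytes // block_size + 1


def min_transfer_seconds(file_size_bytes: int, rtt_seconds: float,
                         block_size: int = 512) -> float:
    """Lower bound on transfer time: one round trip per block, no loss."""
    return tftp_data_packets(file_size_bytes, block_size) * rtt_seconds


image_size = 100 * 1024 * 1024  # hypothetical 100 MiB image
print(tftp_data_packets(image_size))
print(min_transfer_seconds(image_size, rtt_seconds=0.001))
```

Every one of those packets is a chance for a drop or a reordering to stall the copy, which is the practical argument for a TCP-based protocol such as FTP or SCP.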

 

At the same time, what error does the TFTP server report?

 

Additional check: if there are several paths to reach the TFTP server, try specifying the source interface.

Regards,
Deepak Kumar,
Don't forget to vote and accept the solution if this comment will help you!

Terence_O
Level 1

Issue was found on the TFTP server itself.  After digging and testing and finally going the route to try and test an SCP type server, it was found that the utility server had two different TFTP services running for some reason. After shutting down both and restarting only one, I was able to successfully transfer the file with no "O".  Thanks to all who gave feedback on this. 
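For anyone wanting to check for the same condition, the idea is simply to look for more than one process listening on the TFTP port (UDP 69). This is a minimal sketch, assuming the listener list has already been collected with whatever tool the server's operating system provides; the record layout `(protocol, port, pid, name)` and the process names are hypothetical.

```python
def find_duplicate_listeners(listeners, protocol="udp", port=69):
    """Return the matching listeners if more than one is bound to the port.

    listeners: iterable of (protocol, port, pid, name) tuples
               (hypothetical format, filled in from the server's own tools).
    """
    matches = [rec for rec in listeners if rec[0] == protocol and rec[1] == port]
    return matches if len(matches) > 1 else []


# Hypothetical listener records for a utility server.
listeners = [
    ("udp", 69, 1204, "tftp-service-a"),
    ("udp", 69, 3310, "tftp-service-b"),
    ("tcp", 22, 880, "sshd"),
]
for rec in find_duplicate_listeners(listeners):
    print(rec)
```

An empty result means at most one service is bound to the port, which is the state the transfer finally succeeded in.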

Thanks for the update. Glad that you found the problem and solved it. And for explaining the issue.  Thank you for marking this question as solved. This will help other participants in the community to identify discussions which have helpful information. This community is an excellent place to ask questions and to learn about networking. I hope to see you continue to be active in the community.

HTH

Rick