06-12-2020 12:26 PM
Running into an issue when trying to download an updated IOS image file to bootflash: from a TFTP server.
Pings to server from 4900M switch are good
Pings to 4900M switch from server are good
copy startup-config to tftp is successful.
copy tftp bootflash: times out
example output:
SWITCH#copy tftp: bootflash:
Address or name of remote host []? 10.20.30.12
Source filename []? cat4500e-entservices-mz.152-4.E9.bin
Destination filename [cat4500e-entservices-mz.152-4.E9.bin]?
Accessing tftp://10.20.30.12/cat4500e-entservices-mz.152-4.E9.bin...
ifs_check_file 359 CPU_i86 0
ifs_check_file 361 cpu 183
ifs_check_file 362 cpu_family -1
Loading cat4500e-entservices-mz.152-4.E9.bin from 10.20.30.12 (via Vlan100): !O !OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO... [timed out]
%Error reading tftp://10.20.30.12/cat4500e-entservices-mz.152-4.E9.bin (Connection timed out)
tftp server address: 10.20.30.12 - virtual server
4900 switch address: 10.20.30.2
SVI on 4900
interface Vlan100
mtu 9198
ip address 10.20.30.2 255.255.255.0
ip access-group 110 out
no ip redirects
no ip unreachables
standby 110 ip 10.20.30.1
standby 110 timers msec 200 msec 600
standby 110 priority 120
standby 110 preempt delay reload 300
end
ACL 110 is just a tracking acl and does not filter traffic.
SWITCH#sh ip access-l 110
Extended IP access list 110
10 permit ip any any
The 4900M is scheduled for replacement in the near future, but upgrades are needed to address CVEs.
06-12-2020 03:21 PM
Thanks for the additional information. Here are some comments about what I notice, what I am thinking, and what I might suggest:
- The switch and the server are in the same subnet (same VLAN 100), so we do not have to be concerned with any layer 3 routing issues.
- Connectivity between the switch and the server is good for things that are fairly quick, such as ping or even copying a config file.
- But connectivity seems problematic for things that are big, like copying an image file.
- A typical ping is short and quick. If you ran an extended ping with several thousand packets, would you see some pattern of dropped packets? And would the behavior change if you ran 2 or 3 extended pings at the same time?
- Would I be correct in assuming that there might be more than one layer 2 path from the switch to the server?
- Could you find the layer 2 path between the devices? Using the MAC address of the server, run show mac address-table on each switch between the 4900 and the server to trace the path from the switch to the server. Then, using the MAC address of the switch, do the same to trace the path from the server to the switch. Is the path the same in both directions? Could the path change while the image file is being copied?
- Is there any possibility of spanning tree instability that might provoke the problem?
- If you check the logs on the switches between the server and the 4900, are there any messages that might relate to this behavior?
- I remember working with a customer doing code upgrades to a bunch of routers and switches. These were not in the same subnet, and in fact many of them were across a WAN connection that was pretty heavily used. Many times I started an image copy to a remote device, watched it run for a good amount of time, and then saw it time out. I discovered that if I was careful about when I did the image copy (looking for times when traffic on the WAN was lower), the copies worked better. But then I found a better solution. Instead of using TFTP (which runs over UDP, an unreliable transport), I used something with a reliable transport (like FTP, SCP, or HTTP), and I could run the image copies at any time of day and they worked. So ultimately my suggestion is that you might want to use a different protocol to copy the image files.
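For anyone following along, the checks above might look roughly like this on the 4900M. This is only a sketch: the addresses, VLAN, and filename come from the original post, while the MAC address placeholder, the "admin" username, the password placeholder, and the assumption that the server runs an FTP service are mine. FTP is shown because SCP would also need a crypto (k9) image on the switch.

```
! Sustained ping with larger packets to look for a pattern of drops
! (run it from two or three sessions at once to add load)
SWITCH#ping 10.20.30.12 repeat 5000 size 1400

! Find the server's MAC address, then follow it port by port toward the server
SWITCH#show ip arp 10.20.30.12
SWITCH#show mac address-table address <server-mac>

! Check for recent spanning tree topology changes in the VLAN
SWITCH#show spanning-tree vlan 100 detail | include ieee|occurr|from|is exec

! Look for log messages around the time of the failed copy
SWITCH#show logging

! Copy the image over FTP (TCP) instead of TFTP
SWITCH#configure terminal
SWITCH(config)#ip ftp username admin
SWITCH(config)#ip ftp password <password>
SWITCH(config)#end
SWITCH#copy ftp://10.20.30.12/cat4500e-entservices-mz.152-4.E9.bin bootflash:

! Afterwards, compare the hash with the one published for the image
SWITCH#verify /md5 bootflash:cat4500e-entservices-mz.152-4.E9.bin
```

The MAC address lookup would need to be repeated on each switch along the path, and the same idea applies in the other direction using the MAC address of Vlan100 on the 4900.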
06-12-2020 12:47 PM
The output you provided does give us a clue about the issue. As the copy starts you get a ! followed by a long run of O. The ! indicates a block that was received successfully, and the O indicates a packet that was received out of order. We do not know anything about the topology here, but it looks to me like there is a problem on the path between the 4900 and the server.
06-12-2020 01:49 PM
The architecture looks like this top down with 4900s as core of a small datacenter, 9Ks as distribution, and Dell Switches at the access layer:
4900sw1 4900sw2
9300sw1 9300sw2
access sw1 access sw2 .....access swN
Dell/EMC Virtual environment
06-14-2020 06:34 PM
Some interesting points you've brought up. I will definitely run some extended ping tests to see what happens. I was also in the process of looking into the path between my TFTP server and the core switch, but had two cutovers to prepare for, so first thing back in the office I'll dig into the communications path question. One other thing I am looking into: we're running HSRP with vPC, which has been solid for us, but I have to see whether it is somehow contributing to the "O" (out of order) results.
06-12-2020 11:42 PM
Hi,
As @Richard Burts has explained very well, this error is common when using TFTP across L3 routing or a slow network. I would suggest switching to an FTP server.
Also, what error does the TFTP server itself report?
Additional check: if there are several paths to reach the TFTP server, try setting the source interface for TFTP on the switch.
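A minimal sketch of the source interface setting, assuming Vlan100 (the SVI from the original post) is the interface you want the switch to source its TFTP traffic from:

```
SWITCH#configure terminal
SWITCH(config)#ip tftp source-interface Vlan100
SWITCH(config)#end
```

This only pins the source address the switch uses for TFTP. It would not change which layer 2 path the frames take.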
06-18-2020 08:05 AM
The issue was found on the TFTP server itself. After digging and testing, and finally going the route of trying an SCP-type server, it was found that the utility server had two different TFTP services running for some reason. After shutting down both and restarting only one, I was able to transfer the file successfully with no "O". Thanks to all who gave feedback on this.
06-18-2020 09:55 AM
Thanks for the update. Glad that you found the problem and solved it. And for explaining the issue. Thank you for marking this question as solved. This will help other participants in the community to identify discussions which have helpful information. This community is an excellent place to ask questions and to learn about networking. I hope to see you continue to be active in the community.