
Docker0 interface in ISE 2.2 causing problems

Upgraded a customer's ISE to 2.2 yesterday and we ran into a real surprise. After the upgrade of the first ISE node to 2.2, it started complaining that it couldn't reach DNS servers and other services. We had previously upgraded their ISE lab without any problems. We couldn't figure out why the production environment behaved this way until we did some tracing from the ISE CLI and realized the IP packets never left the box.

A quick look at the ISE interfaces and routing table revealed a brand-new interface (new in ISE 2.2):

 

customer-ise2.2-node/admin# sh int

docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500

        inet 172.17.0.1  netmask 255.255.0.0  broadcast 0.0.0.0

        ether 02:42:95:d3:20:9c  txqueuelen 0  (Ethernet)

        RX packets 0  bytes 0 (0.0 B)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 0  bytes 0 (0.0 B)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

 

GigabitEthernet 0

        flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500

        inet 172.26.50.99  netmask 255.255.255.0  broadcast 172.26.50.255

        inet6 fe80::20c:29ff:fea3:2de6  prefixlen 64  scopeid 0x20<link>

        ether 00:0c:29:a3:2d:e6  txqueuelen 1000  (Ethernet)

        RX packets 199152  bytes 86053852 (82.0 MiB)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 199707  bytes 124938725 (119.1 MiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

 

GigabitEthernet 1

        flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500

        inet6 fe80::20c:29ff:fea3:2df0  prefixlen 64  scopeid 0x20<link>

        ether 00:0c:29:a3:2d:f0  txqueuelen 1000  (Ethernet)

        RX packets 1959  bytes 184239 (179.9 KiB)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 16  bytes 1296 (1.2 KiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

 

GigabitEthernet 2

        flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500

        inet6 fe80::20c:29ff:fea3:2dfa  prefixlen 64  scopeid 0x20<link>

        ether 00:0c:29:a3:2d:fa  txqueuelen 1000  (Ethernet)

        RX packets 1960  bytes 184299 (179.9 KiB)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 16  bytes 1296 (1.2 KiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

 

GigabitEthernet 3

        flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500

        inet6 fe80::20c:29ff:fea3:2d04  prefixlen 64  scopeid 0x20<link>

        ether 00:0c:29:a3:2d:04  txqueuelen 1000  (Ethernet)

        RX packets 1960  bytes 184299 (179.9 KiB)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 16  bytes 1296 (1.2 KiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

 

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536

        inet 127.0.0.1  netmask 255.0.0.0

        inet6 ::1  prefixlen 128  scopeid 0x10<host>

        loop  txqueuelen 0  (Local Loopback)

        RX packets 9695456  bytes 3782997962 (3.5 GiB)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 9695456  bytes 3782997962 (3.5 GiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

 

 

customer-ise2.2-node/admin# sh ip ro

 

Destination          Gateway              Iface              

-----------          -------              -----              

default              172.26.50.1          eth0               

172.26.50.0/24       0.0.0.0              eth0               

172.17.0.0/16        0.0.0.0              docker0          

 

 

WHY ON EARTH does Cisco address this internal docker0 interface with 172.17.0.0/16?! And this network address cannot be changed; there is no configuration for it in either the CLI or the GUI.

My customer had their DNS servers within the 172.17.0.0/16 range, which now literally got blackholed by this new awesome docker0 interface, since it is locally connected to the ISE node.

A workaround (that does work) is to create two static routes in the ISE CLI, for 172.17.0.0/17 and 172.17.128.0/17, and point them at the default gateway. But seriously, should that really be needed?
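For reference, a rough sketch of what those two static routes look like from ADE-OS config mode, using this node's default gateway 172.26.50.1 from the routing table above (verify the exact syntax on your ISE version before relying on it):

customer-ise2.2-node/admin# configure terminal
customer-ise2.2-node/admin(config)# ip route 172.17.0.0 255.255.128.0 gateway 172.26.50.1
customer-ise2.2-node/admin(config)# ip route 172.17.128.0 255.255.128.0 gateway 172.26.50.1
customer-ise2.2-node/admin(config)# end

Both /17s are more specific than the /16 claimed by docker0, so traffic to the 172.17.0.0/16 range is forwarded out GigabitEthernet 0 again instead of being blackholed.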

 

So how come we did not see this in the lab? By total coincidence, the management network of the ISE nodes in the lab was in the 172.17.0.0/16 range. The ISE upgrader must have noticed this, because it assigned a different /16 range to the docker0 interface in the lab so it wouldn't collide with the management IP.

Since we had no services in that range, we did not notice this new behavior.

 

 

Why this docker0 interface cannot use the 127.0.0.0/8 range is beyond me, because it seems to be for internal ISE communications only.

Oh, and there is no word about this in the release notes or anywhere else for that matter. When upgrading the remaining nodes in the ISE 2.2 cluster, some of the other nodes picked different networks, such as 172.18.0.0/16.

5 Replies

David Thoben
Level 1

Hi Andreas, Hi All,

I'm experiencing the same issue with the docker network selection. :-(

My client is running its identity stores (AD/LDAP) in 172.18.0.0....

I can confirm the workaround, applied BEFORE the upgrade.

Workaround:

ISE02/dthoben# sh ip route

Destination          Gateway              Iface
-----------          -------              -----
default              172.17.122.254       eth0
172.17.122.0/23      0.0.0.0              eth0
172.18.0.0/17        172.17.129.254       eth0   (static route)
172.18.128.0/17      172.17.129.254       eth0   (static route)
172.18.0.0/16        0.0.0.0              docker0

Regards.

David Thoben

Update:

Just configure multiple interfaces to push the docker interface's range along several times (before the upgrade).

-OR-

Configure a large subnet onto a single interface to push the docker interface to a usable position (before the upgrade). The resulting routing table looks like this:

ISE02/dthoben# sh ip route

Destination          Gateway              Iface
-----------          -------              -----
default              172.17.144.254       eth0
172.17.144.0/24      0.0.0.0              eth0
172.18.144.0/24      0.0.0.0              eth1
172.19.144.0/24      0.0.0.0              eth3
172.20.144.0/24      0.0.0.0              eth2
172.21.0.0/16        0.0.0.0              docker0
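For illustration only, here is roughly how the extra interface addressing behind a layout like the one above could be applied from ADE-OS config mode before the upgrade; the host addresses are hypothetical examples and the prompts/syntax may differ slightly between versions:

ISE02/dthoben# configure terminal
ISE02/dthoben(config)# interface GigabitEthernet 1
ISE02/dthoben(config-GigabitEthernet)# ip address 172.18.144.10 255.255.255.0
ISE02/dthoben(config-GigabitEthernet)# exit
ISE02/dthoben(config)# interface GigabitEthernet 2
ISE02/dthoben(config-GigabitEthernet)# ip address 172.20.144.10 255.255.255.0
ISE02/dthoben(config-GigabitEthernet)# exit
ISE02/dthoben(config)# interface GigabitEthernet 3
ISE02/dthoben(config-GigabitEthernet)# ip address 172.19.144.10 255.255.255.0
ISE02/dthoben(config-GigabitEthernet)# end

With 172.17.x, 172.18.x, 172.19.x and 172.20.x all occupied by real interfaces, docker skips ahead and picks 172.21.0.0/16, which keeps it clear of the identity stores in 172.18.0.0.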

Regards,

David Thoben

Bug ID CSCve08815 has been raised to track the issue. No fix is available yet.

Appears to be fixed as of 18 May 2017.  They have changed the docker route to use 169.254.0.0/24.

dgoodenberger
Level 1

I ran into this as well when upgrading my dev ISE appliances. The FTP repository that I use for backing up the ISE appliances uses a 172.17 IP address and was broken by the 2.1 to 2.2 upgrade. I added more specific routes via the CLI to work around this. However, when I went to upgrade my production appliances, I pre-added the specific routes, and the 2.2 upgrade failed on the secondary PAN with a "500 internal error". I removed the routes, and the 2.2 upgrade was then successful. Once the upgrade of all nodes is successful, I will re-add the routes to all appliances. I know this is a docker default, but hopefully Cisco can come up with a fix to minimize the impact of this in the future.
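A minimal sketch of that remove/re-add sequence, assuming the same /17 split described earlier in the thread and a hypothetical default gateway of 10.0.0.1 (substitute your own; the exact "no" form may vary by version):

! remove the specific routes before starting the upgrade
no ip route 172.17.0.0 255.255.128.0 gateway 10.0.0.1
no ip route 172.17.128.0 255.255.128.0 gateway 10.0.0.1
! re-add them once every node is on 2.2
ip route 172.17.0.0 255.255.128.0 gateway 10.0.0.1
ip route 172.17.128.0 255.255.128.0 gateway 10.0.0.1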