
APIC Failing Ping

Rem Markov
Level 1

Hey! 

So we have 3 APICs that are extremely slow, both in the CLI and the GUI, so we wanted to upgrade them, but the upgrade failed for unknown reasons.

While trying to understand and troubleshoot this, we ran into something weird.

phmoapc-166014-24# acidiag cluster
Admin password:

Running...

Checking Wiring and UUID: OK
Checking AD Processes: Running
Checking All Apics in Commission State: OK
Checking All Apics in Active State: OK
Checking Fabric Nodes: OK
Checking Apic Fully-Fit: Not Fully Fit Apics: IFC-1 IFC-2 IFC-3
Checking Shard Convergence: OK
Checking Leadership Degration: Non optimal leader for shards : 3:1,3:2,3:4,3:5,3:7,3:8,3:10,3:11,3:13,3:16,3:17,3:19,3:20,3:22,3:23,3:25,3:26,3:28,3:31,6:1,6:2,6:4,6:5,6:7,6:8,6:10,6:11,6:13,6:16,6:17,6:19,6:20,6:22,6:23,6:25,6:26,6:28,6:31,6:32,9:1,9:2,9:4,9:5,9:7,9:8,9:10,9:11,9:13,9:16,9:17,9:19,9:20,9:22,9:23,9:25,9:26,9:28,9:29,9:31,10:1,10:2,10:4,10:5,10:7,10:8,10:10,10:11,10:13,10:14,10:16,10:17,10:19,10:20,10:22,10:23,10:25,10:26,10:28,10:31,11:1,11:2,11:4,11:5,11:7,11:8,11:10,11:11,11:13,11:14,11:16,11:17,11:19,11:20,11:22,11:23,11:25,11:26,11:28,11:31,14:1,14:2,14:4,14:5,14:7,14:8,14:10,14:11,14:13,14:14,14:16,14:17,14:19,14:20,14:22,14:23,14:25,14:26,14:28,14:31,16:1,16:2,16:4,16:5,16:7,16:10,16:11,16:13,16:14,16:16,16:17,16:19,16:20,16:22,16:23,16:25,16:26,16:28,16:31,22:1,22:2,22:4,22:5,22:7,22:8,22:10,22:11,22:13,22:14,22:16,22:17,22:19,22:20,22:22,22:25,22:28,22:31,22:32,23:1,23:2,23:4,23:5,23:7,23:8,23:10,23:11,23:13,23:14,23:16,23:17,23:19,23:20,23:22,23:23,23:25,23:26,23:28,23:31,33:1,34:1,34:2,34:4,34:5,34:7,34:8,34:10,34:11,34:13,34:16,34:17,34:19,34:20,34:22,34:23,34:25,34:26,34:28,34:31,35:1,35:2,35:4,35:5,35:7,35:8,35:10,35:11,35:13,35:14,35:16,35:17,35:19,35:22,35:25,35:26,35:28,35:31,36:1,39:1,39:2,39:4,39:5,39:7,39:8,39:10,39:11,39:13,39:14,39:16,39:17,39:19,39:20,39:22,39:23,39:25,39:26,39:28,39:31
Ping OOB IPs:
APIC-1: 192.168.199.9 - OK
APIC-2: 192.168.199.10 - OK
APIC-3: 192.168.199.11 - OK
Ping Infra IPs:
APIC-1: 10.0.0.1 - OK
APIC-2: 10.0.0.2 - OK
APIC-3: 10.0.0.3 - Ping failed
Checking APIC Versions: Cluster Version:5.2(7g) Imcompatible Apics: IFC-1(5.2(4e)) IFC-2(5.2(4e)) IFC-3(5.2(4e))
Checking SSL: OK
Full file system(s): None

Done!

For some reason the ping between the infra IPs fails. Does someone have a clue where to begin searching for the error?
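If you want to pull the failing entries out of the `acidiag cluster` output programmatically instead of eyeballing it, here is a minimal sketch (plain Python; the sample text is pasted from the output above):

```python
import re

# "Ping Infra IPs" section pasted from the `acidiag cluster` output above
sample = """\
Ping Infra IPs:
APIC-1: 10.0.0.1 - OK
APIC-2: 10.0.0.2 - OK
APIC-3: 10.0.0.3 - Ping failed
"""

def failed_infra_pings(text):
    """Return (apic, ip) pairs whose ping status is anything other than OK."""
    failures = []
    for m in re.finditer(r"(APIC-\d+): (\S+) - (.+)", text):
        apic, ip, status = m.groups()
        if status.strip() != "OK":
            failures.append((apic, ip))
    return failures

print(failed_infra_pings(sample))  # [('APIC-3', '10.0.0.3')]
```

The same pattern works on the OOB section, since both use the `APIC-n: ip - status` layout.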

1 Accepted Solution

Accepted Solutions

Rem Markov
Level 1

I finally found the problem.

APIC3 and APIC1 each had 4 connections to the leafs instead of 2.

A VIC 1455 is used on the APICs. All 4 ports on this NIC were cabled to the leafs.

After shutting 2 of the ports, it worked!
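For anyone hitting the same thing: the APIC bonds its fabric-facing ports, so one sanity check is to count the bond's slave interfaces and compare against the two uplinks you expect. A minimal sketch, assuming the bonding status is readable at the usual Linux path `/proc/net/bonding/bond0` (the sample text and interface names below are hypothetical):

```python
# Hypothetical excerpt of /proc/net/bonding/bond0 on an APIC whose
# VIC 1455 had all four ports cabled (interface names are assumed)
sample = """\
Bonding Mode: fault-tolerance (active-backup)
Slave Interface: eth2-1
Slave Interface: eth2-2
Slave Interface: eth2-3
Slave Interface: eth2-4
"""

def fabric_slave_count(text):
    """Count 'Slave Interface' entries in bonding status text."""
    return sum(line.startswith("Slave Interface") for line in text.splitlines())

count = fabric_slave_count(sample)
if count != 2:
    print(f"WARNING: expected 2 fabric uplinks, found {count}")
```

Anything other than two slaves per APIC is worth investigating before blaming the fabric itself.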



21 Replies

Robert Burns
Cisco Employee

Is the 3rd APIC in the same or a different Pod as 1 & 2?

Robert

Rem Markov
Level 1

All in the same pod.

Check both fabric links from APIC3 CLI:

acidiag run lldptool in eth2-1 | grep topo
acidiag run lldptool in eth2-2 | grep topo

If all looks fine, initiate a ping from APIC3 > APIC1 / APIC2's TEP address

Robert 

phmoapc-166018-25# acidiag run lldptool in eth2-1 | grep topo
topology/pod-1/paths-103/pathep-[eth1/1]
topology/pod-1/node-103


phmoapc-166018-25# acidiag run lldptool in eth2-2 | grep topo
topology/pod-1/paths-104/pathep-[eth1/1]
topology/pod-1/node-104

From APIC3 and one other APIC:

acidiag avread
acidiag rvread

Robert

Hi @Rem Markov ,

TIP: Let me modify @Robert Burns ' suggestion.

Instead of issuing the commands...

acidiag avread
acidiag rvread

... ditch the acidiag part and just issue the commands:

avread
rvread

You'll get a much more readable output, and (in the case of rvread) some extra information. And BTW - fnvread also gives virtually the same information as acidiag fnvread and saves typing.

Caveat: avread  does not give you timestamps or the complete serial number. (Which is why the output is more readable)

 

RedNectar aka Chris Welsh.
Forum Tips: 1. Paste images inline - don't attach. 2. Always mark helpful and correct answers, it helps others find what they need.

APIC 3:
```

phmoapc-166018-25# avread
Cluster:
-------------------------------------------------------------------------
fabricDomainName MedOne
discoveryMode PERMISSIVE
clusterSize 3
version apic-5.2(7g)
drrMode OFF
operSize 3

APICs:
-------------------------------------------------------------------------
APIC 1 APIC 2 APIC 3
version 5.2(4e) 5.2(4e) 5.2(4e)
address 10.0.0.1 10.0.0.2 10.0.0.3
oobAddress 192.168.199.9/24 192.168.199.10/24 192.168.199.11/24
routableAddress 0.0.0.0 0.0.0.0 0.0.0.0
tepAddress 10.0.0.0/16 10.0.0.0/16 10.0.0.0/16
podId 1 1 1
chassisId 8ebc498a-.-7af699ea 043f74d8-.-67674e20 596a5378-.-1857d6f8
cntrlSbst_serial (APPROVED,WMP251900CG) (APPROVED,WMP251900D4) (APPROVED,WMP251900BK)
active YES YES YES
flags cra- cra- cra-
health 255 255 255
phmoapc-166018-25# rvread
\- unexpected state; /-unexpected mutator;
[replica state grid elided: services 1-49 across replicas 1-32, no entries flagged as unexpected]
Replicas are in expected states and are mutated by proper apic's
---------------------------------------------
clusterTime=<diff=0 common=2023-07-11T06:03:53.660+00:00 local=2023-07-11T06:03:53.660+00:00 pF=<displForm=0 offsSt=0 offsVlu=0 lm(t):3(2022-06-06T07:34:08.367+00:00)>>


Non optimal leader for shards : 3:1,3:2,3:4,3:5,3:7,3:8,3:10,3:11,3:13,3:14,3:16,3:17,3:19,3:20,3:22,3:23,3:25,3:26,3:28,3:29,3:31,3:32,6:1,6:2,6:4,6:5,6:7,6:8,6:10,6:11,6:13,6:14,6:16,6:17,6:19,6:20,6:22,6:23,6:25,6:26,6:28,6:29,6:31,6:32,9:1,9:2,9:4,9:5,9:7,9:8,9:10,9:11,9:13,9:14,9:16,9:17,9:19,9:20,9:22,9:23,9:25,9:26,9:28,9:29,9:31,9:32,10:1,10:2,10:4,10:5,10:7,10:8,10:10,10:11,10:13,10:14,10:16,10:17,10:19,10:20,10:22,10:23,10:25,10:26,10:28,10:29,10:31,10:32,11:1,11:2,11:4,11:5,11:7,11:8,11:10,11:11,11:13,11:14,11:16,11:17,11:19,11:20,11:22,11:23,11:25,11:26,11:28,11:29,11:31,11:32,14:1,14:2,14:4,14:5,14:7,14:8,14:10,14:11,14:13,14:14,14:16,14:17,14:19,14:20,14:22,14:23,14:25,14:26,14:28,14:29,14:31,14:32,16:1,16:2,16:4,16:5,16:7,16:8,16:10,16:11,16:13,16:14,16:16,16:17,16:19,16:20,16:22,16:23,16:25,16:26,16:28,16:29,16:31,16:32,22:1,22:2,22:4,22:5,22:7,22:8,22:10,22:11,22:13,22:14,22:16,22:17,22:19,22:20,22:22,22:23,22:25,22:26,22:28,22:29,22:31,22:32,23:1,23:2,23:4,23:5,23:7,23:8,23:10,23:11,23:13,23:14,23:16,23:17,23:19,23:20,23:22,23:23,23:25,23:26,23:28,23:29,23:31,23:32,33:1,34:1,34:2,34:4,34:5,34:7,34:8,34:10,34:11,34:13,34:14,34:16,34:17,34:19,34:20,34:22,34:23,34:25,34:26,34:28,34:29,34:31,34:32,35:1,35:2,35:4,35:5,35:7,35:8,35:10,35:11,35:13,35:14,35:16,35:17,35:19,35:20,35:22,35:23,35:25,35:26,35:28,35:29,35:31,35:32,36:1,39:1,39:2,39:4,39:5,39:7,39:8,39:10,39:11,39:13,39:14,39:16,39:17,39:19,39:20,39:22,39:23,39:25,39:26,39:28,39:29,39:31,39:32
---------------------------------------------
clusterTime=<diff=1 common=2023-07-11T06:03:54.448+00:00 local=2023-07-11T06:03:54.447+00:00 pF=<displForm=0 offsSt=0 offsVlu=0 lm(t):3(2022-06-06T07:34:08.367+00:00)>>

```
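Side note: that wall of `service:shard` pairs in the "Non optimal leader" line is easier to reason about when grouped by service id. A throwaway sketch in plain Python (using only the first few entries copied from the line above):

```python
from collections import Counter

# First few entries from the "Non optimal leader for shards" line above
shards = "3:1,3:2,3:4,6:1,6:2,9:1,9:2,10:1"

def shards_per_service(csv):
    """Count non-optimal shards grouped by the service id (part before ':')."""
    return Counter(entry.split(":")[0] for entry in csv.split(","))

print(shards_per_service(shards))  # Counter({'3': 3, '6': 2, '9': 2, '10': 1})
```

On the full line this makes it obvious that nearly every service has non-optimal leaders, i.e. the problem is cluster-wide rather than tied to one service.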

 

 

APIC 2:
```

phmoapc-166014-25# avread
Cluster:
-------------------------------------------------------------------------
fabricDomainName MedOne
discoveryMode PERMISSIVE
clusterSize 3
version apic-5.2(7g)
drrMode OFF
operSize 3

APICs:
-------------------------------------------------------------------------
APIC 1 APIC 2 APIC 3
version 5.2(4e) 5.2(4e) 5.2(4e)
address 10.0.0.1 10.0.0.2 10.0.0.3
oobAddress 192.168.199.9/24 192.168.199.10/24 192.168.199.11/24
routableAddress 0.0.0.0 0.0.0.0 0.0.0.0
tepAddress 10.0.0.0/16 10.0.0.0/16 10.0.0.0/16
podId 1 1 1
chassisId 8ebc498a-.-7af699ea 043f74d8-.-67674e20 596a5378-.-1857d6f8
cntrlSbst_serial (APPROVED,WMP251900CG) (APPROVED,WMP251900D4) (APPROVED,WMP251900BK)
active YES YES YES
flags cra- cra- cra-
health 255 255 255
phmoapc-166014-25# rvread
\- unexpected state; /-unexpected mutator;
[replica state grid elided: services 1-49 across replicas 1-32, no entries flagged as unexpected]
Replicas are in expected states and are mutated by proper apic's
---------------------------------------------
clusterTime=<diff=198200 common=2023-07-11T06:05:21.084+00:00 local=2023-07-11T06:02:02.884+00:00 pF=<displForm=0 offsSt=0 offsVlu=0 lm(t):3(2022-06-06T07:34:08.367+00:00)>>


Non optimal leader for shards : 3:1,3:2,3:4,3:5,3:7,3:8,3:10,3:11,3:13,3:14,3:16,3:17,3:19,3:20,3:22,3:23,3:25,3:26,3:28,3:29,3:31,3:32,6:1,6:2,6:4,6:5,6:7,6:8,6:10,6:11,6:13,6:14,6:16,6:17,6:19,6:20,6:22,6:23,6:25,6:26,6:28,6:29,6:31,6:32,9:1,9:2,9:4,9:5,9:7,9:8,9:10,9:11,9:13,9:14,9:16,9:17,9:19,9:20,9:22,9:23,9:25,9:26,9:28,9:29,9:31,9:32,10:1,10:2,10:4,10:5,10:7,10:8,10:10,10:11,10:13,10:14,10:16,10:17,10:19,10:20,10:22,10:23,10:25,10:26,10:28,10:29,10:31,10:32,11:1,11:2,11:4,11:5,11:7,11:8,11:10,11:11,11:13,11:14,11:16,11:17,11:19,11:20,11:22,11:23,11:25,11:26,11:28,11:29,11:31,11:32,14:1,14:2,14:4,14:5,14:7,14:8,14:10,14:11,14:13,14:14,14:16,14:17,14:19,14:20,14:22,14:23,14:25,14:26,14:28,14:29,14:31,14:32,16:1,16:2,16:4,16:5,16:7,16:8,16:10,16:11,16:13,16:14,16:16,16:17,16:19,16:20,16:22,16:23,16:25,16:26,16:28,16:29,16:31,16:32,22:1,22:2,22:4,22:5,22:7,22:8,22:10,22:11,22:13,22:14,22:16,22:17,22:19,22:20,22:22,22:23,22:25,22:26,22:28,22:29,22:31,22:32,23:1,23:2,23:4,23:5,23:7,23:8,23:10,23:11,23:13,23:14,23:16,23:17,23:19,23:20,23:22,23:23,23:25,23:26,23:28,23:29,23:31,23:32,33:1,34:1,34:2,34:4,34:5,34:7,34:8,34:10,34:11,34:13,34:14,34:16,34:17,34:19,34:20,34:22,34:23,34:25,34:26,34:28,34:29,34:31,34:32,35:1,35:2,35:4,35:5,35:7,35:8,35:10,35:11,35:13,35:14,35:16,35:17,35:19,35:20,35:22,35:23,35:25,35:26,35:28,35:29,35:31,35:32,36:1,39:1,39:2,39:4,39:5,39:7,39:8,39:10,39:11,39:13,39:14,39:16,39:17,39:19,39:20,39:22,39:23,39:25,39:26,39:28,39:29,39:31,39:32
---------------------------------------------
clusterTime=<diff=198200 common=2023-07-11T06:05:21.848+00:00 local=2023-07-11T06:02:03.648+00:00 pF=<displForm=0 offsSt=0 offsVlu=0 lm(t):3(2022-06-06T07:34:08.367+00:00)>>

```

And lastly, do a ping from APIC3 ==> APIC1 & 2 using its TEP address.

Robert

Ping from APIC3 to APIC1:

phmoapc-166018-25# ping 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.

--- 10.0.0.1 ping statistics ---
9 packets transmitted, 0 received, 100% packet loss, time 8133ms

Ping from APIC3 to APIC2:

phmoapc-166018-25# ping 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=61 time=0.179 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=61 time=0.127 ms
64 bytes from 10.0.0.2: icmp_seq=3 ttl=61 time=0.202 ms

--- 10.0.0.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2051ms
rtt min/avg/max/mdev = 0.127/0.169/0.202/0.033 ms

Let's complete the test and ping from APIC2 => APIC1 and then to APIC3 (to see if initiating traffic in the reverse direction has any impact).

So far we know we have an infra connectivity issue between APIC3 & APIC1.  Let's get a clear understanding of the full extent.

Robert
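For anyone following along, the pings posted so far can be summarized in a small reachability table (a throwaway sketch that just restates the results from this thread):

```python
# Infra TEP ping results observed so far in this thread (src -> dst)
results = {
    ("APIC-3", "APIC-1"): False,  # 100% packet loss
    ("APIC-3", "APIC-2"): True,
    ("APIC-2", "APIC-1"): True,
    ("APIC-2", "APIC-3"): True,
}

def broken_pairs(res):
    """Return the src/dst pairs that failed, in insertion order."""
    return [pair for pair, ok in res.items() if not ok]

print(broken_pairs(results))  # [('APIC-3', 'APIC-1')]
```

This makes the pattern explicit: only the APIC3 <-> APIC1 path is broken, which points at something specific to those two controllers' fabric links rather than a general infra problem.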

Ping apic2 to apic1:

phmoapc-166014-25# ping 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=61 time=0.224 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=61 time=0.289 ms

--- 10.0.0.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.224/0.256/0.289/0.036 ms

Ping apic2 to apic3:
phmoapc-166014-25# ping 10.0.0.3
PING 10.0.0.3 (10.0.0.3) 56(84) bytes of data.
64 bytes from 10.0.0.3: icmp_seq=1 ttl=61 time=0.110 ms
64 bytes from 10.0.0.3: icmp_seq=2 ttl=61 time=0.192 ms

--- 10.0.0.3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.110/0.151/0.192/0.041 ms

Immediately after successfully pinging APIC3 from APIC2, does the reverse work? The address would be resolved if that's the issue.

Robert

 

I didn't check immediately, but it does work in general. The problem is that APIC3 and APIC1 can't ping each other.

 

 

Robert Burns
Cisco Employee

I'd also suggest a controlled reboot of each controller, starting with APIC3. Don't proceed with the next controller until the cluster returns to the fully fit state.

Robert
