Cisco Employee

CTCM spinning up only one DI node

 

Hi,

 

Twice now I have faced the issue where a request from NSO for two DI nodes for an ISCR configuration only spins up one DI node.

 

After the last failed attempt, I stopped and restarted the OCTL/CPS. Then I re-onboarded CTCM into NSO and tried again, with the same results.

Currently our CTCM server at 172.18.142.15 (ctcm-admin/cisco123) is in the state of having spun up one DI node when we requested two.

 

Chassis Name: 'Gateway1', Chassis Id: di-000

Chassis Name: 'Gateway2', Chassis Id: di-001

[ctcm-admin@rtp-mercury-vm1 ~(keystone_admin)]$ nova list --all-tenants

+--------------------------------------+----------------+--------+------------+-------------+--------------------------------------------------+

| ID                                   |Name           | Status | Task State | Power State | Networks                                         |

+--------------------------------------+----------------+--------+------------+-------------+--------------------------------------------------+

| 0a9aa43b-03dc-4213-be4d-7523ad7a40d1 | cartridge--4e2 | ACTIVE | -          | Running     | core=172.16.180.112                              |

| 287312c3-e5dd-412e-ad29-6d411edadacc | di-000cf--5e0  | ACTIVE | -          | Running     | core=172.16.180.114; di-internal=172.16.1.174    |

| d636d9a3-ce40-4d4e-a889-c408d872b22c | di-000cf--7bf  | ACTIVE | -          | Running     | core=172.16.180.113; di-internal=172.16.1.173    |

| adcc0204-a159-4ae8-9157-2e0932c438ee | di-000sf--60b  | ACTIVE | -          | Running     | di-internal=172.16.1.176; service1=172.16.184.32 |

| ee77472a-6c01-4411-b876-7187fbecfcd2 | di-000sf--9c4  | ACTIVE | -          | Running     | di-internal=172.16.1.178; service1=172.16.184.34 |

| 4acdc515-bc50-4984-a6d0-32893f810b9a | di-000sf--b23  | ACTIVE | -          | Running     | di-internal=172.16.1.175; service1=172.16.184.31 |

| f5cd5a07-aa7e-4eca-9a2d-e7d4ee63e55e | di-000sf--d24  | ACTIVE | -          | Running     | di-internal=172.16.1.177; service1=172.16.184.33 |

| 373012ca-d30c-47ae-839b-e60c78073829 | octl-01        | ACTIVE | -          | Running     | core=172.16.180.207, 10.19.1.61                  |

+--------------------------------------+----------------+--------+------------+-------------+--------------------------------------------------+

[ctcm-admin@rtp-mercury-vm1 ~(keystone_admin)]$

 

NSO is reporting an error:

 

<ERROR> 2015-11-10 23:09:25,074 QtcmHandler$startCTCMThread pool-201-thread-1: - Oops! An error has occured in Create CTCM thread! null

java.lang.Exception

        at com.cisco.nso.qtcmhandler.QTCMInvoker.start(QTCMInvoker.java:41)

        at com.cisco.nso.qtcmhandler.QtcmHandler$startCTCMThread.run(QtcmHandler.java:177)

        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

        at java.lang.Thread.run(Thread.java:745)

<ERROR> 2015-11-10 23:09:25,074 QtcmHandler$startCTCMThread pool-201-thread-1: - Oops! Create CTCM request was not successfully executed. Check logs for more information!

<INFO> 2015-11-10 23:09:25,074 QtcmHandler pool-201-thread-1: - Setting HLRFS status data for MG-2 to ERROR with description Error in HLRFS MG-2 – null

 

Attached are the cluster tech-support and the full NSO mobility function pack log.

 

The time of the event is shown in the above logs (Eastern Time).

 

The testbed is currently in this state and I will leave it as such tonight.

   

Please provide advice on troubleshooting this issue. This is with the new updated CTCM for sprint.

 

 

JA

 

1 ACCEPTED SOLUTION

Cisco Employee

Re: CTCM spinning up only one DI node

 

Hi Shash,

I had a look at this. It turns out it's a known issue; have a look at this bug's Eng-notes for a workaround (bullets 4 & 5):

 

http://cdetsweb-prd.cisco.com/apps/dumpcr?identifier=CSCux06092

 

 

Best regards,

Michiel

 


12 Replies
Cisco Employee

Re: CTCM spinning up only one DI node

 

Hi Dilbag,

 

 

I had a quick look at the NSO MFP logs; we see an error response for the "Start" request from CTCM. The complete error response is attached.

 

I have highlighted a section of the response below, which I hope will let CTCM engineers figure out what might be wrong:

 

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>Traceback (most recent call last):</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>File "/opt/ctcm/partners/cisco-qvpc/repo/bin/qvpc-di", line 762, in &lt;module&gt;</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>sys.exit(main())</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>File "/opt/ctcm/partners/cisco-qvpc/repo/bin/qvpc-di", line 743, in main</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>cpvc_di_start(Application(args.ws), env, args.chassis_name, args.sf_count, args.billing, args.username, args.password, args.snmp)</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>File "/opt/ctcm/partners/cisco-qvpc/repo/bin/qvpc-di", line 564, in cpvc_di_start</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>vip = vip_create(app, mgmt_ip, port_name, net, publicnet, tenant)</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>File "/opt/ctcm/partners/cisco-qvpc/repo/bin/qvpc-di", line 421, in vip_create</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>vip_delete(app, port_name, mgmt_network)</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>File "/opt/ctcm/partners/cisco-qvpc/repo/bin/qvpc-di", line 444, in vip_delete</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>vip = publicnet_cls.get_floating_ip_for_port(port_id)</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>File "/opt/ctcm/partners/ctcm/phoenix/lib/libopenstack/publicnet.py", line 172, in get_floating_ip_for_port</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>floating_ip_addr = cmd_get_value("port_id", port_id, 3, *{} floatingip-list*, OS_NET_CMD)</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>File "/opt/ctcm/partners/ctcm/phoenix/lib/libopenstack/__init__.py", line 224, in cmd_get_value</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>raise KeyError(_("No objects from *{}* match *{}*. Please ensure object exists").format(cmd, name))</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>KeyError: "No objects from *{} floatingip-list* match *b243e908-76f1-4b56-9883-cfd9ea90033b*. Please ensure object exists"</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>

  <qvpc-bootstrap:output xmlns:qvpc-bootstrap="http://cisco.com/ns/diadem/qvpc-bootstrap">

    <qvpc-bootstrap:line>/usr/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.</qvpc-bootstrap:line>

  </qvpc-bootstrap:output>
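For what it's worth, the traceback bottoms out in cmd_get_value(), which raises a KeyError when the floatingip-list lookup finds no row whose port_id matches. A rough sketch of that failure mode follows; the function body and the sample rows are my reconstruction for illustration, not the actual libopenstack code (which shells out to the neutron CLI):

```python
def cmd_get_value(key, value, result_field, rows):
    """Return `result_field` from the first row whose `key` equals `value`.

    Mirrors the behaviour in the traceback: when no row matches,
    a KeyError is raised rather than None being returned, which is
    what aborts vip_delete() and, in turn, the whole qvpc-di start.
    """
    for row in rows:
        if row.get(key) == value:
            return row[result_field]
    raise KeyError("No objects from *floatingip-list* match *{0}*. "
                   "Please ensure object exists".format(value))

# Invented sample data standing in for `neutron floatingip-list` output.
floatingips = [
    {"port_id": "a262889c-36c8-4813-a7e3-3046f9d6b06f",
     "floating_ip_address": "10.19.1.61"},
]
```

So the port id b243e908-... in the KeyError above is one that has a port object but no corresponding floating IP entry.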

 

Thanks

Shash

 


Cisco Employee

Re: CTCM spinning up only one DI node

 

Hi Team,

 

We are unable to recover.

 

We have tried:

 

 

1.       qvpc-bootstrap stop, and then ctcm-actions stop/start

2.       qvpc-bootstrap stop, and then ctcm-actions stop/remove/setup/start

 

 

We continue to see this issue on our CTCM setup. Every time we try to start up a chassis, we hit the floating-ip issue:

Last login: Wed Nov 11 11:43:01 2015 from 10.0.240.11

[ctcm-admin@rtp-mercury-vm1 ~(keystone_admin)]$ neutron floatingip-list

+--------------------------------------+------------------+---------------------+--------------------------------------+

| id                                   | fixed_ip_address | floating_ip_address | port_id                              |

+--------------------------------------+------------------+---------------------+--------------------------------------+

| 1d32496c-a812-4a5b-a365-5bf25d39c6b7 | 172.16.180.207   | 10.19.1.61          | a262889c-36c8-4813-a7e3-3046f9d6b06f |

| 44a839f0-dae5-4209-984b-f1ce3dedf07b | 172.16.180.3     | 10.19.1.64          | 300b0bf6-e5b5-410c-8a01-2b016de8c696 |

| 96d28cfe-e238-4778-9d84-71b8eaee5111 | 172.16.180.2     | 10.19.1.62          | ad9e1495-1b81-4437-8feb-5b54b9f2e671 |

| cdc27cc4-53e2-4483-950c-e2f74ac61ef0 |                  | 10.19.1.63          |                                      |

+--------------------------------------+------------------+---------------------+--------------------------------------+

[ctcm-admin@rtp-mercury-vm1 ~(keystone_admin)]$

 

Please advise.

 

JA

 

Cisco Employee

Re: CTCM spinning up only one DI node

 

Hi,

   

This is indeed the issue, so we will clean up the testbed and move forward. Hopefully this does not occur regularly.

 

JA

[ctcm-admin@rtp-mercury-vm1 ~(keystone_admin)]$ neutron floatingip-list

+--------------------------------------+------------------+---------------------+--------------------------------------+

| id                                   | fixed_ip_address | floating_ip_address | port_id                              |

+--------------------------------------+------------------+---------------------+--------------------------------------+

| 33df102c-1a0b-442d-9319-83e495c008a1 | 172.16.180.207   | 10.19.1.61          | 7a0347c5-e09c-45ea-84d8-60042ba6a556 |

| 5ac31e1b-3a6e-43c8-9c68-45d02f7e8298 |                  | 10.19.1.63          |                                      |

| e0ff8366-7fb1-44af-bbc7-8f06c8f47c58 | 172.16.180.2     | 10.19.1.62          | cf0d934c-a6f9-4daf-9d93-1601c0ecfa4c |

+--------------------------------------+------------------+---------------------+--------------------------------------+

[ctcm-admin@rtp-mercury-vm1 ~(keystone_admin)]$

Cisco Employee

Re: CTCM spinning up only one DI node

 

Folks:

 

Just as an FYI, this system is patched with the attached file to fix CSCuw80043.

 

Basically the equivalent of _5 build for sprint.

 

I am not sure whether the fix may have created conditions more favorable for running into CSCux06092, but if so, let me know and I will back out the patch.

 

 

Regards,

Jim

 

Cisco Employee

Re: CTCM spinning up only one DI node

 

Julie Ann,

 

Did you delete the port first? Let me know if you need help.

 

Thanks, Shaheed

 

Cisco Employee

Re: CTCM spinning up only one DI node

Yes

Cisco Employee

Re: CTCM spinning up only one DI node

 

Just some logs from server.log around the time of Julie Ann's activities. Looks like a problem with the installer and the port API.

 

2015-11-11 11:13:59.809 26887 INFO neutron.api.v2.resource [req-34675000-8b89-439e-990c-25bdd8374a4f None] delete failed (client error): Port 90938b27-ac05-43ad-bc99-b0092f8353c1 has owner network:router_interface and therefore cannot be deleted directly via the port API.

2015-11-11 11:14:10.955 26911 INFO neutron.api.v2.resource [req-3d066969-5b62-481a-91e3-bb76e58bf5e5 None] delete failed (client error): Port a24f359d-5088-41bd-a7fe-aa474561f27c has owner network:router_interface and therefore cannot be deleted directly via the port API.

2015-11-11 11:16:42.647 26264 WARNING neutron.plugins.ml2.managers [req-980bcf81-9a6e-4aa3-8e93-ea9716844f59 None] Failed to bind port f54def95-a69c-4582-a26a-ed9765352ca4 on host rtpc1b1-ctrl

2015-11-11 11:16:43.384 27010 WARNING neutron.plugins.ml2.rpc [req-5e40bf32-9e01-4488-a5c1-917abd1fed08 None] Device tapf54def95-a6 requested by agent lb0025b5b5002a on network 306a1a97-b498-4dc3-9aaf-40bb436709c1 not bound, vif_type: binding_failed

2015-11-11 11:16:44.030 27008 WARNING neutron.plugins.ml2.rpc [req-b127853c-870a-4ff7-b45d-9d7f3a620208 None] Device f54def95-a69c-4582-a26a-ed9765352ca4 requested by agent ovs-agent-rtpc1b1-ctrl on network 306a1a97-b498-4dc3-9aaf-40bb436709c1 not bound, vif_type: binding_failed

2015-11-11 11:30:33.313 26916 WARNING keystonemiddleware.auth_token [-] Authorization failed for token

2015-11-11 11:30:33.313 26916 WARNING keystonemiddleware.auth_token [-] Authorization failed for token

2015-11-11 11:30:50.282 26888 WARNING keystonemiddleware.auth_token [-] Authorization failed for token

2015-11-11 11:30:50.282 26888 WARNING keystonemiddleware.auth_token [-] Authorization failed for token


2015-11-11 12:01:05.385 26905 INFO neutron.api.v2.resource [req-08d00fc8-d365-4c85-ad23-3f4de56f901a None] update failed (client error): Bad floatingip request: Cannot add floating IP to port on subnet 4ce9a742-59a1-45c8-b7b2-8746f0057546 which has no gateway_ip

2015-11-11 12:01:10.491 26929 INFO neutron.api.v2.resource [req-aa588d1f-c955-420d-bdb5-d4b9f3d796f1 None] update failed (client error): Bad floatingip request: Cannot add floating IP to port on subnet 4ce9a742-59a1-45c8-b7b2-8746f0057546 which has no gateway_ip

2015-11-11 12:01:10.520 26929 WARNING keystonemiddleware.auth_token [-] Authorization failed for token

2015-11-11 12:01:10.520 26929 WARNING keystonemiddleware.auth_token [-] Authorization failed for token

2015-11-11 12:01:15.661 26885 INFO neutron.api.v2.resource [req-b24afa53-4192-4960-b7c6-6d69ac52c4e3 None] update failed (client error): Bad floatingip request: Cannot add floating IP to port on subnet 4ce9a742-59a1-45c8-b7b2-8746f0057546 which has no gateway_ip
2015-11-11 12:01:20.911 26906 INFO neutron.api.v2.resource [req-76e5bc58-c6ad-498a-ac4b-cb049a8ffb5c None] update failed (client error): Bad floatingip request: Cannot add floating IP to port on subnet 4ce9a742-59a1-45c8-b7b2-8746f0057546 which has no gateway_ip
2015-11-11 12:01:26.090 26931 INFO neutron.api.v2.resource [req-255ee5ca-80f9-4e0e-a055-11703366e13e None] update failed (client error): Bad floatingip request: Cannot add floating IP to port on subnet 4ce9a742-59a1-45c8-b7b2-8746f0057546 which has no gateway_ip
2015-11-11 12:01:32.220 26884 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:01:32.220 26884 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:01:47.144 26912 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:01:47.144 26912 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:01:47.705 26950 WARNING neutron.plugins.ml2.managers [req-7294a9d2-a3a4-4b94-a46a-cc9244434eb2 None] Failed to bind port 95ec158b-0333-419a-b038-ebcc45367bfb on host rtpc1b1-ctrl
2015-11-11 12:01:49.609 26986 WARNING neutron.plugins.ml2.rpc [req-5e40bf32-9e01-4488-a5c1-917abd1fed08 None] Device tap95ec158b-03 requested by agent lb0025b5b5002a on network 306a1a97-b498-4dc3-9aaf-40bb436709c1 not bound, vif_type: binding_failed
2015-11-11 12:01:50.213 26991 WARNING neutron.plugins.ml2.rpc [req-b127853c-870a-4ff7-b45d-9d7f3a620208 None] Device 95ec158b-0333-419a-b038-ebcc45367bfb requested by agent ovs-agent-rtpc1b1-ctrl on network 306a1a97-b498-4dc3-9aaf-40bb436709c1 not bound, vif_type: binding_failed
2015-11-11 12:01:50.607 26913 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:01:50.607 26913 WARNING keystonemiddleware.auth_token [-] Authorization failed for token

2015-11-11 12:02:49.914 26916 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:02:51.355 26929 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:02:51.355 26929 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:02:57.718 26902 INFO neutron.api.v2.resource [req-620ec2f9-313a-4a85-b0ff-86e956d931e0 None] update failed (client error): Bad floatingip request: Cannot add floating IP to port on subnet d69e9927-81ed-4ef8-ab6b-d44758b28e36 which has no gateway_ip
2015-11-11 12:03:02.799 26890 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:03:02.800 26890 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:03:02.955 26899 INFO neutron.api.v2.resource [req-ee6f90a9-abbc-44e1-a085-7a9b3cebce01 None] update failed (client error): Bad floatingip request: Cannot add floating IP to port on subnet d69e9927-81ed-4ef8-ab6b-d44758b28e36 which has no gateway_ip
2015-11-11 12:03:04.664 26913 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:03:04.664 26913 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:03:04.901 26880 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:03:04.902 26880 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:03:05.039 26913 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:03:05.040 26913 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:03:05.338 26919 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:03:05.339 26919 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:03:05.576 26896 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:03:05.577 26896 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:03:08.071 26893 INFO neutron.api.v2.resource [req-713203f4-8b24-48db-9084-a72af63908f3 None] update failed (client error): Bad floatingip request: Cannot add floating IP to port on subnet d69e9927-81ed-4ef8-ab6b-d44758b28e36 which has no gateway_ip
2015-11-11 12:03:13.219 26895 INFO neutron.api.v2.resource [req-392d0bb4-c3ed-4401-a386-b2db93aa7856 None] update failed (client error): Bad floatingip request: Cannot add floating IP to port on subnet d69e9927-81ed-4ef8-ab6b-d44758b28e36 which has no gateway_ip
2015-11-11 12:03:18.355 26906 INFO neutron.api.v2.resource [req-536d9872-e19b-4a9c-bc0a-33c6c2876856 None] update failed (client error): Bad floatingip request: Cannot add floating IP to port on subnet d69e9927-81ed-4ef8-ab6b-d44758b28e36 which has no gateway_ip
2015-11-11 12:03:42.377 26897 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:03:42.377 26897 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:03:55.698 26894 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:03:55.699 26894 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:08:24.787 26878 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:08:24.788 26878 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:31:09.270 26925 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:31:09.270 26925 WARNING keystonemiddleware.auth_token [-] Authorization failed for token
2015-11-11 12:44:27.200 26878 WARNING keystonemiddleware.auth_token [-] Authorization failed for token

Cisco Employee

Re: CTCM spinning up only one DI node

 

Julie Ann,

 

Jim and I looked at the system, and all seemed well:

 

• There are two chassis, with all VMs active:

[ctcm-admin@rtp-mercury-vm1 ~(keystone_admin)]$ qvpc-di list                                                                                                                                
...
Chassis Name: 'Gateway1', Chassis Id: di-000
Chassis Name: 'Gateway2', Chassis Id: di-001
[ctcm-admin@rtp-mercury-vm1 ~(keystone_admin)]$ orchestration subscription list
...
     cartridge-proxy: applicationInstances 1, groupInstances 0, clusterInstances 1, members 1 (Active 1)
              di-001: applicationInstances 1, groupInstances 5, clusterInstances 6, members 6 (Active 6)
              di-000: applicationInstances 1, groupInstances 5, clusterInstances 6, members 6 (Active 6)

• We checked that the floating IPs were working OK:

[ctcm-admin@rtp-mercury-vm1 ~(keystone_admin)]$ qvpc-di show Gateway1
...
Chassis name:          'Gateway1'
Chassis ID:             di-000
Billing size:           16 GB
Username:               admin
Password:               admin
SNMP community:         public
Management IP:          10.19.1.62
Management private IP:  172.16.180.2
SF Count:               4
Slot       Instance                             Registered          Address
di-000-001 e02f2412-2431-4d97-8b83-d654433ebbe5 2015-11-11T17:01:14 172.16.180.18
di-000-002 872e3338-ac5d-4887-8519-bc9a2b0ec870 2015-11-11T17:01:54 172.16.180.19
di-000-003 732ab22e-fd05-4d68-b79b-8f4d29d34dcd 2015-11-11T17:02:54 172.16.1.21
di-000-004 e7e8e395-94af-4638-a9c2-aa961d2ea643 2015-11-11T17:02:54 172.16.1.19
di-000-005 8a509a51-d3db-4630-bc2d-6792fddba06d 2015-11-11T17:02:57 172.16.1.22
di-000-006 a2bd25fc-fceb-4a5a-8bea-5b45fc2a0ada 2015-11-11T17:02:59 172.16.1.20
[ctcm-admin@rtp-mercury-vm1 ~(keystone_admin)]$ qvpc-di show Gateway2
...
Chassis name:          'Gateway2'
Chassis ID:             di-001
Billing size:           16 GB
Username:               admin
Password:               admin
SNMP community:         public
Management IP:          10.19.1.64
Management private IP:  172.16.180.3
SF Count:               4
Slot       Instance                             Registered          Address
di-001-001 9472b01c-8a2d-4536-8ac5-a21c6cc13b82 2015-11-11T17:03:06 172.16.180.21
di-001-002 ccc0122e-b819-4369-86f1-88f4e33deaca 2015-11-11T17:02:54 172.16.180.20
di-001-003 3c5e1dba-a198-40ad-b327-6373df26227e 2015-11-11T17:05:03 172.16.1.20
di-001-004 a638fd6e-303d-4e3c-b124-391f48eb79b6 2015-11-11T17:05:04 172.16.1.22
di-001-005 f891ea16-1cfd-45fc-94fe-ae0a22553260 2015-11-11T17:05:06 172.16.1.19
di-001-006 422678d9-c26d-4f75-8f92-3eaad7501a46 2015-11-11T17:05:06 172.16.1.21
[ctcm-admin@rtp-mercury-vm1 ~(keystone_admin)]$ ssh root@10.19.1.62
...
Cisco Systems QvPC-DI Intelligent Mobile Gateway
root@10.19.1.62's password:
Permission denied, please try again.
root@10.19.1.62's password:
Permission denied, please try again.
root@10.19.1.62's password:
Received disconnect from 10.19.1.62: 2: Too many authentication failures for root from 10.19.1.138 port 33353 ssh2
[ctcm-admin@rtp-mercury-vm1 ~(keystone_admin)]$
[ctcm-admin@rtp-mercury-vm1 ~(keystone_admin)]$
[ctcm-admin@rtp-mercury-vm1 ~(keystone_admin)]$ ssh root@10.19.1.64
...
Cisco Systems QvPC-DI Intelligent Mobile Gateway
root@10.19.1.64's password:

 

We then tried, but were unable to correlate the errors in the neutron logs that Jim reported with anything that CTCM did. We did locate two instances of the issue that has since been resolved. I took Jim through the failure scenario in detail, and the recovery procedure. In brief, a failure scenario looks like this:

 

 

  1. A qvpc-bootstrap “start” fails with an error related to floatingip allocation (usually because you have run out of floatingips).
  2. CTCM leaves a named port like this in existence:

  

 

[ctcm-admin@rtp-mercury-vm1 ~(keystone_admin)]$ neutron port-list

 

+--------------------------------------+---------------------------------------+-------------------+---------------------------------------------------------------------------------------+

 

| id | name | mac_address       | fixed_ips |

 

+--------------------------------------+---------------------------------------+-------------------+---------------------------------------------------------------------------------------+

 

...

 

| 300b0bf6-e5b5-410c-8a01-2b016de8c696 | di-001-mgmt-vip | fa:16:3e:d5:c5:cd | {"subnet_id": "7926dd15-bb85-4fbd-a605-

 

 

+--------------------------------------+---------------------------------------+-------------------+-------------------------------------------

 

 

  3. But crucially, if you list the floatingips, there will be no entry for the port-id listed:

 

 

[ctcm-admin@rtp-mercury-vm1 ~(keystone_admin)]$ neutron floatingip-list  | grep 300b0bf6-e5b5-410c-8a01-2b016de8c696

 

 

The recovery procedure is to delete the orphaned port by hand, and then “stop” the affected chassis:

 

 

  • neutron port-delete di-001-mgmt-vip
  • qvpc-bootstrap stop chassis-name Gateway55

  

 

Since we were unable to see the problem, I left things at this point; please let us know right away if you see this again.
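The detection step above can be scripted if this recurs. This is only a sketch over canned output; the IDs below are copied from this thread, and in practice you would feed it live `neutron port-list` and `neutron floatingip-list` output (the second name is invented for illustration):

```shell
#!/bin/sh
# Canned stand-ins for `neutron port-list` (id + name) and for the
# port_id column of `neutron floatingip-list`; replace with live output.
ports='300b0bf6-e5b5-410c-8a01-2b016de8c696 di-001-mgmt-vip
a262889c-36c8-4813-a7e3-3046f9d6b06f octl-01-mgmt-vip'
fip_port_ids='a262889c-36c8-4813-a7e3-3046f9d6b06f'

# A *-mgmt-vip port with no matching floatingip entry is orphaned and
# is a candidate for `neutron port-delete <name>` plus a chassis stop.
orphaned=$(printf '%s\n' "$ports" | while read -r id name; do
  case "$name" in
    *-mgmt-vip)
      printf '%s\n' "$fip_port_ids" | grep -q "^$id\$" || echo "$name" ;;
  esac
done)
echo "orphaned ports: $orphaned"
```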

 

Thanks, Shaheed

 

Cisco Employee

Re: CTCM spinning up only one DI node

 

Shaheed, thanks for getting on this and reviewing the specifics of the ddts.

 

 

Julie Ann,

 

 

I was able to delete the chassis and then add them back in. Were you looking to remove them and re-set up CTCM, or has that been done? If so, the chassis are up and ready to go.

 

 

Regards,

 

Jim

 

Cisco Employee

Re: CTCM spinning up only one DI node

 

+ diadem-orch-users

 

 

From what I see in the logs, it looks like a CTCM issue.  It appears that NSO sent the request to CTCM, but the response from CTCM indicated that the command failed:

 

 

Nso-funcpack-mobility.log

 

<INFO> 2015-11-10 23:09:25,074 QTCMInvoker pool-201-thread-1: - Start Response obj: QTCMResponse [success=false, managementAddress=/0.0.0.0, chassis=[], sfCount=null, instance=null, errorMessage=null]

 

<ERROR> 2015-11-10 23:09:25,074 QtcmHandler$startCTCMThread pool-201-thread-1: - Oops! An error has occured in Create CTCM thread! null

 

java.lang.Exception

 

at com.cisco.nso.qtcmhandler.QTCMInvoker.start(QTCMInvoker.java:41)

 

at com.cisco.nso.qtcmhandler.QtcmHandler$startCTCMThread.run(QtcmHandler.java:177)

 

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

 

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

 

at java.lang.Thread.run(Thread.java:745)

 

<ERROR> 2015-11-10 23:09:25,074 QtcmHandler$startCTCMThread pool-201-thread-1: - Oops! Create CTCM request was not successfully executed. Check logs for more information!

 

<INFO> 2015-11-10 23:09:25,074 QtcmHandler pool-201-thread-1: - Setting HLRFS status data for MG-2 to ERROR with description Error in HLRFS MG-2 - null

 

 

wso2carbon.log

 

In the wso2carbon.log, I only see mentions of di-000 and no mention of di-001.

 

 

In CTCM's cvpc.log file, we see the two commands to start the two gateways, but they are followed by errors when trying to show the second instance.

 

 

Cvpc.log

 

2015-11-10 23:03:50,253 cvpc[174] DEBUG: Original args ['confd', 'mode', 'di', 'action', 'start', 'chassis-name', 'Gateway1', 'sf-instances', '4', 'billing-size', '16', 'admin-username', 'admin', 'admin-password', 'admin', 'snmp-community', 'public']

 

2015-11-10 23:03:50,253 cvpc[214] DEBUG: Processed args ['di', 'start', '--billing-size', '16', '--admin-username', 'admin', '--admin-password', 'admin', '--snmp-community', 'public', 'Gateway1', '4']

 

2015-11-10 23:03:50,255 cvpc[44] DEBUG: /opt/ctcm/partners/cisco-qvpc/repo/bin/qvpc-di start --billing 16 --username admin --password admin --snmp public Gateway1 4

 

2015-11-10 23:04:37,087 cvpc[44] DEBUG: /opt/ctcm/partners/cisco-qvpc/repo/bin/qvpc-di show Gateway1

 

2015-11-10 23:04:44,298 cvpc[174] DEBUG: Original args ['confd', 'mode', 'di', 'action', 'start', 'chassis-name', 'Gateway2', 'sf-instances', '4', 'billing-size', '16', 'admin-username', 'admin', 'admin-password', 'admin', 'snmp-community', 'public']

 

2015-11-10 23:04:44,298 cvpc[214] DEBUG: Processed args ['di', 'start', '--billing-size', '16', '--admin-username', 'admin', '--admin-password', 'admin', '--snmp-community', 'public', 'Gateway2', '4']

 

2015-11-10 23:04:44,300 cvpc[44] DEBUG: /opt/ctcm/partners/cisco-qvpc/repo/bin/qvpc-di start --billing 16 --username admin --password admin --snmp public Gateway2 4

 

2015-11-10 23:05:00,904 cvpc[54] ERROR: status 1

 

2015-11-10 23:05:00,905 cvpc[44] DEBUG: /opt/ctcm/partners/cisco-qvpc/repo/bin/qvpc-di show Gateway2

 

2015-11-10 23:05:08,165 cvpc[54] ERROR: status 1

 

2015-11-10 23:05:08,166 cvpc[44] DEBUG: /opt/ctcm/partners/cisco-qvpc/repo/bin/qvpc-di show Gateway2

 

2015-11-10 23:05:15,321 cvpc[54] ERROR: status 1

 

 

Thanks,

 

Sean

 

Cisco Employee

Re: CTCM spinning up only one DI node

 

Sean,

 

Can I have a look? If not, then please gather the logs from the ctcm-admin user's $HOME/logs folder and provide those.

 

Thanks, Shaheed

 
