
service logic calling another service logic - refcount issue at delete

ben_piret
Level 1

Hello community,

I am seeing some strange behaviour in the following scenario: I have a service logic (l3vpn) that builds L3 links (sub-interface and BGP) on my MPLS network, and another service logic (vpnconfig) that builds the VPN-level config (VRF and address-family). Normally, users are supposed to create the vpnconfig service before the l3vpn service.

I had the (maybe bad?) idea that when the VRF is not defined on the PE, the l3vpn logic would simply call the vpnconfig logic in Python, via

root.vpnconfig__vpnconfig[SERVICENAME].pedevice.create(pe_device_name)

to add the PE that lacks the VRF-level config.
It works fine at config time, but it introduces a micro-outage at delete time in the following sequence:
1) I add an L3VPN link on a PE that is not yet configured: the l3vpn logic calls the vpnconfig logic and both configs get deployed. So far so good.
2) I add another L3VPN link on the same PE. This time the vpnconfig logic is not called (because the PE is now configured as per 1) and the link gets deployed. So far so good.
3) I delete the link created in 1). The commit dry-run shows that the VRF-level part of the configuration gets removed, although the refcount (service-meta-data) on the device configuration is 2:
    /* Refcount: 2 */
    /* Backpointer: [ /vpnconfig:vpnconfig[vpnconfig:name='DUMMYVRF'] ] */
    vrf DUMMYVRF {
        /* Refcount: 2 */
        /* Originalvalue: 9208:722 */
        rd 9208:722;
        export {
            /* Refcount: 2 */
            /* Originalvalue: export-public-loop-only-management */
            map export-public-loop-only-management;
        }
        route-target {
            /* Refcount: 2 */
            /* Backpointer: [ /vpnconfig:vpnconfig[vpnconfig:name='DUMMYVRF'] ] */
            import 9208:200;
        }
    }
nsoadmin@ncs% commit dry-run outformat native #### see the 'no ip vrf DUMMYVRF' in the lines below
native {
device {
name R13B5032WIN0
data no interface TenGigabitEthernet0/0/1.210000
ip vrf DUMMYVRF
no export map export-public-loop-only-management
no route-target import 9208:200
exit
router bgp 9208
address-family ipv4 unicast vrf DUMMYVRF
no neighbor 10.20.30.10 remote-as 65534
exit-address-family
!
!
no ip vrf DUMMYVRF
router bgp 9208
address-family ipv4 unicast vrf DUMMYVRF
no redistribute connected
exit-address-family
!
!
}
}
In the end, if I do the real commit, the NED performs a rollback because it detects the mistake, but it is still a rollback: for a short while the VRF is deleted from the PE before being re-injected, which creates a micro-outage for the link created in 2).
It could be a conceptual mistake, which I would accept, but why is NSO removing my VRF if the refcount is 2?
Do you think it is a bug?
Thanks in advance for your help.
Kind regards.
Benoit
 
4 Replies

ben_piret
Level 1

Sorry, I got mixed up: the refcount is indeed 1, but the owner of the refcount is the vpnconfig service:

    /* Refcount: 1 */
    /* Backpointer: [ /vpnconfig:vpnconfig[vpnconfig:name='DUMMYVRF'] ] */
    vrf DUMMYVRF {
        /* Refcount: 1 */
        rd 9208:722;
        export {
            /* Refcount: 1 */
            map export-public-loop-only-management;
        }
        route-target {
            /* Refcount: 1 */
            /* Backpointer: [ /vpnconfig:vpnconfig[vpnconfig:name='DUMMYVRF'] ] */
            import 9208:200;
        }
    }

So it is a bit less illogical, but I still don't understand why it gets deleted when I delete the L3VPN link, since it is owned by vpnconfig. Is there another layer of service-meta-data where I could see that this piece of config is owned by vpnconfig, which is itself owned by the L3VPN of link 1)?

Hi,

As a next step I would do a dry run with outformat cli. The way NSO works is that the FASTMAP algorithm first decides how the DB has to change, and that is what commit dry-run outformat cli shows. Then the NED takes that diff and builds a CLI script from it. So you should be able to see whether the intent was to remove the VRF or not.

If you see that NSO does not want to remove the VRF and the command is introduced by the NED, then it is possibly a bug if the command is unnecessary, or it might be there to satisfy a device CLI config requirement. I don't know which requirement that could be, but you could check by taking the output of the dry-run with outformat native, editing it to remove the line you highlighted, and then pasting it to the device to see how it reacts.

There is a part of your question I did not understand, about the rollback, but to see exactly what is going on between NSO and the device you can turn on trace with

devices device PE1 trace raw
commit

That will show all that is sent and received from the device.

hazad
Cisco Employee

When you run root.vpnconfig__vpnconfig[SERVICENAME].pedevice.create(pe_device_name) from your l3vpn service, you create an instance of the vpnconfig service. That service instance is the one creating the VPN config on the device, which you've confirmed through the service-meta-data. The vpnconfig service instance will, however, be owned by the l3vpn service. You will find service-meta-data on the vpnconfig service instance pointing to the l3vpn service instance.

Since you only create the vpnconfig instance when there is none yet, it is only ever created by the first l3vpn instance. So the vpnconfig is owned by the first l3vpn instance alone, and it is deleted as soon as you delete that first l3vpn instance, even if other l3vpn instances still need it. That is not what you want, I suppose: you want it to remain for as long as you have l3vpn services. I believe the solution here is to simply remove that condition and always create the vpnconfig instance through the l3vpn service, regardless of whether it already exists. By doing this, all subsequent l3vpn instances will bump the refcount, preventing the vpnconfig from being deleted until all l3vpn services are deleted.
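
To make the ownership argument concrete, here is a small toy model in plain Python. It is only a sketch of the reference-counting idea and not NSO's actual FASTMAP implementation: the function names and the link names are hypothetical, and it assumes that each l3vpn link which issues the create() call adds one reference to the shared vpnconfig entry.

```python
def vpnconfig_owners(l3vpn_links, always_create):
    """Return the l3vpn links that end up owning the shared PE entry.

    Toy model only (assumption, not NSO code): every link that issues
    the create() call becomes an owner, and the refcount is simply the
    number of owners.
    """
    owners = []
    for link in l3vpn_links:
        vrf_already_present = bool(owners)
        # Conditional variant: only the first link calls create().
        # Unconditional variant: every link calls create().
        if always_create or not vrf_already_present:
            owners.append(link)
    return owners


def vrf_survives(owners, deleted_link):
    """The shared entry stays only while at least one owner remains."""
    remaining = [link for link in owners if link != deleted_link]
    return len(remaining) > 0


# Hypothetical names for the links from steps 1) and 2) of the question.
links = ["link1", "link2"]

conditional = vpnconfig_owners(links, always_create=False)
unconditional = vpnconfig_owners(links, always_create=True)

# Delete the first link in both variants and see whether the VRF stays.
print(conditional, vrf_survives(conditional, "link1"))      # ['link1'] False
print(unconditional, vrf_survives(unconditional, "link1"))  # ['link1', 'link2'] True
```

With the condition in place, link1 is the only owner, so deleting it takes the VRF with it. Without the condition, link2 still holds a reference and the VRF stays.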

ben_piret
Level 1

Hi both,

thanks for your answers.

For the moment I have removed this "feature", as it creates more problems than it solves.

Benoit