07-25-2023 06:56 AM - edited 11-20-2024 04:15 AM
Introduction
Telemetry configuration & troubleshooting steps
Below is a list of telemetry sensor paths that can be used for system health monitoring on the ASR9K and NCS5500 platforms.
Configuration
Sample
telemetry model-driven
destination-group telemetrycollection
address-family ipv4 <SERVER IP> port <PORT>
encoding self-describing-gpb
protocol tcp
sensor-group SYSTEM_5mins
sensor-path Cisco-IOS-XR-wdsysmon-fd-oper:system-monitoring
sensor-path Cisco-IOS-XR-shellutil-oper:system-time/uptime
sensor-path Cisco-IOS-XR-asr9k-fab-health-oper:fabric-health-stats
sensor-path Cisco-IOS-XR-pfi-im-cmd-oper:interfaces/interface-briefs
sensor-path Cisco-IOS-XR-pfi-im-cmd-oper:interfaces/interface-summary
sensor-path Cisco-IOS-XR-wd-oper:watchdog/nodes/node/memory-state
sensor-path Cisco-IOS-XR-wdsysmon-fd-oper:system-monitoring/cpu-utilization
sensor-path Cisco-IOS-XR-asr9k-fsi-oper:fabric-stats/nodes/node/statses/stats
sensor-path Cisco-IOS-XR-procmem-oper:processes-memory/nodes/node/process-ids/process-id
sensor-path Cisco-IOS-XR-pfi-im-cmd-oper:interfaces/interface-xr/interface/interface-statistics
sensor-path Cisco-IOS-XR-asr9k-np-oper:hardware-module-np/nodes/node/nps/np/load-utilization
sensor-path Cisco-IOS-XR-sysadmin-asr9k-envmon-ui:environment/oper/power/location/pem_attributes
sensor-path Cisco-IOS-XR-asr9k-xbar-oper:cross-bar-stats/nodes/node/cross-bar-table/sm15-stats/sm15-stat
sensor-path Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/cache/protocols/protocol
!
subscription telemetrycollection_SYSTEM_5mins
sensor-group-id SYSTEM_5mins sample-interval 300000
destination-id telemetrycollection
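When the same sample has to be rolled out to many routers, it can help to generate it from a template. The snippet below is a minimal sketch, not an official tool: `render_mdt_config` and its parameters are hypothetical names, the one-space indentation is an assumption about how the nested configuration is normally displayed, and the IP address and port in the usage lines are placeholders. It uses a single variable for the sensor-group name so that the `sensor-group` definition and the `sensor-group-id` reference in the subscription always match.

```python
def render_mdt_config(dest_group, server_ip, port, sensor_group,
                      sensor_paths, subscription, interval_ms):
    """Return model-driven telemetry configuration as one string.

    All argument names are illustrative; the emitted keywords are the
    ones used in the sample configuration above.
    """
    lines = [
        "telemetry model-driven",
        f" destination-group {dest_group}",
        f"  address-family ipv4 {server_ip} port {port}",
        "   encoding self-describing-gpb",
        "   protocol tcp",
        f" sensor-group {sensor_group}",
    ]
    # One sensor-path line per YANG path
    lines += [f"  sensor-path {path}" for path in sensor_paths]
    lines += [
        f" subscription {subscription}",
        # Same name as the sensor-group defined above
        f"  sensor-group-id {sensor_group} sample-interval {interval_ms}",
        f"  destination-id {dest_group}",
    ]
    return "\n".join(lines)


config = render_mdt_config(
    dest_group="telemetrycollection",
    server_ip="192.0.2.10",   # placeholder collector address
    port=57500,               # placeholder collector port
    sensor_group="SYSTEM_5mins",
    sensor_paths=[
        "Cisco-IOS-XR-shellutil-oper:system-time/uptime",
        "Cisco-IOS-XR-wdsysmon-fd-oper:system-monitoring/cpu-utilization",
    ],
    subscription="telemetrycollection_SYSTEM_5mins",
    interval_ms=300000,
)
print(config)
```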
Verification
Verify sensors
run mdt_exec -s "Cisco-IOS-XR-wd-oper:watchdog/nodes/node" -c <cadence>
RP/0/RP1/CPU0:ASR9K_1#run mdt_exec -s "Cisco-IOS-XR-asr9k-np-oper:hardware-module-np/nodes/node" -h
usage: mdt_sub_show [options]
-s yang path to subscribe to
-c cadence
-d file to dump the output to
-n exit after n messages, use with -d option
-h help
show telemetry model-driven subscription
Subscription: telemetrycollection_SYSTEM_5mins State: ACTIVE >>> Check the state
-------------
Sensor groups:
Id Interval(ms) State
SYSTEM_5mins 300000 Resolved
Destination Groups:
Id Encoding Transport State Port Vrf IP
telemetrycollection self-describing-gpb tcp Active 20191 10.41.17.45
No TLS
RP/0/RP1/CPU0:ASR9K_1#sho telemetry model-driven subscription telemetrycollection_SYSTEM_5mins internal
Thu Nov 11 15:27:48.511 EST
Subscription: telemetrycollection_SYSTEM_5mins
-------------
State: ACTIVE
Sensor groups:
Id: SYSTEM_5mins
Sample Interval: 300000 ms
Sensor Path: Cisco-IOS-XR-wd-oper:watchdog/nodes/node
Sensor Path State: Resolved >>> Check that every sensor path is in the Resolved state. If not, verify that the configuration and the sensor path are valid
Sensor Path: Cisco-IOS-XR-wdsysmon-fd-oper:system-monitoring
Sensor Path State: Resolved
Sensor Path: Cisco-IOS-XR-shellutil-oper:system-time/uptime
Sensor Path State: Resolved
Sensor Path: Cisco-IOS-XR-wdsysmon-fd-oper:system-monitoring/cpu-utilization
Sensor Path State: Resolved
Sensor Path: Cisco-IOS-XR-controller-optics-oper:optics-oper/optics-ports/optics-port/optics-info
Sensor Path State: Resolved
Sensor Path: Cisco-IOS-XR-clns-isis-oper:isis/instances/instance/levels/level/adjacencies/adjacency
Sensor Path State: Resolved
Sensor Path: Cisco-IOS-XR-alarmgr-server-oper:alarms/brief/brief-card/brief-locations/brief-location/active
Sensor Path State: Resolved
Destination Groups:
Group Id: telemetrycollection
Destination IP: 10.41.17.45
Destination Port: 20191
Encoding: self-describing-gpb
Transport: tcp
State: Active
No TLS
Total bytes sent: 3286633286 >>> Bytes sent to the external server
Total packets sent: 45201 >>> Packets sent to the external server
Last Sent time: 2021-11-11 15:27:47.754358358 -0500
Collection Groups:
------------------
Id: 877
Sample Interval: 300000 ms
Encoding: self-describing-gpb
Num of collection: 117
Collection time: Min: 26 ms Max: 42 ms
Total time: Min: 26 ms Avg: 30 ms Max: 42 ms
Total Deferred: 0
Total Send Errors: 0 >>>>> Check for send errors
Total Send Drops: 0 >>>>> Check for send drops
Total Other Errors: 0
No data Instances: 0
Last Collection Start:2021-11-11 15:27:37.744457358 -0500
Last Collection End: 2021-11-11 15:27:37.744490358 -0500
Sensor Path: Cisco-IOS-XR-wd-oper:watchdog/nodes/node
Sysdb Path: /oper/wd/node/*/overload_state
Count: 118 Method: GET Min: 26 ms Avg: 30 ms Max: 42 ms
Item Count: 1061 Status: Active
Missed Collections:0 send bytes: 466176 packets: 118 dropped bytes: 0
success errors deferred/drops >>>>> Check for any drops or errors
Gets 1061 0
List 118 0
Datalist 0 0
Finddata 118 0
GetBulk 0 0
Encode 0 0
Send 0 0
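The checks marked with ">>>" in the output above can be automated once the CLI output has been captured as text. The following is a rough sketch under stated assumptions: `find_problems` is a hypothetical helper, the field labels are copied from the output shown above, and the sample text (including the "Not Resolved" state and the second sensor path) is made up for illustration. Real output may differ between releases.

```python
import re

# "Sensor Path: X" immediately followed by "Sensor Path State: Y"
PATH_STATE = re.compile(r"Sensor Path:\s*(\S+)\s+Sensor Path State:\s*(.*)")
# Counters that are expected to stay at zero
COUNTER = re.compile(
    r"(Total Send Errors|Total Send Drops|Total Other Errors|Missed Collections):\s*(\d+)"
)


def find_problems(output):
    """Return a list of findings from 'show telemetry ... internal' text."""
    problems = []
    for path, state in PATH_STATE.findall(output):
        state = state.strip()
        if state.lower() != "resolved":
            problems.append(f"{path}: sensor path state is {state}")
    for name, value in COUNTER.findall(output):
        if int(value) > 0:
            problems.append(f"{name} = {value}")
    return problems


# Illustrative text only; the second path is not a real YANG path.
SAMPLE = """\
  Sensor Path:        Cisco-IOS-XR-wd-oper:watchdog/nodes/node
  Sensor Path State:  Resolved
  Sensor Path:        Cisco-IOS-XR-example-oper:not/a/real/path
  Sensor Path State:  Not Resolved
  Total Send Errors:  0
  Total Send Drops:   3
"""

for finding in find_problems(SAMPLE):
    print(finding)
```

An empty list means that none of the checked fields indicated a problem; it does not prove that the collector is actually receiving data.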
-rw-r--r--. 1 root root 4694 Feb 23 11:13 /tmp/tcpdump/10.41.17.45.48480.5432.cl.1645594986
Missed collections were observed on 6.6.3. This is expected and can be treated as noise, because 6.6.3 runs the collection timer in relative mode by default.
Behaviour with 6.6.3 for a single sensor collection (relative mode, the default):
T0 collection start time is 00:00 hrs
T1 collection end time is 00:03 hrs
Ts sample interval is 15 min
The next collection starts at 00:18 hrs, that is T1 + Ts, which is 3 minutes later than the nominal interval. Over a day, some missed collections should therefore be expected for each sensor.
Behaviour with 7.3.2 for a single sensor collection (strict mode, the default):
T0 collection start time is 00:00 hrs
T1 collection end time is 00:03 hrs
Ts sample interval is 15 min
The next collection starts at 00:15 hrs, that is T0 + Ts. The collection time is not added.
In XR version 7.0.1, as part of the path concurrency and congestion management feature, the relative timer behavior was removed and the default was set to the strict timer. Strict is now the only mode, whether or not ‘strict timer’ is configured.
In 6.6.3, a relative timer is used by default, and in that case the missed collection statistic is just noise.
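The timer arithmetic above can be reproduced with a few lines of Python. This is only an illustration of the two formulas using the example values from this post (3-minute collection, 15-minute interval); `next_collection_start` is a hypothetical helper and does not model how the router itself counts missed collections.

```python
from datetime import datetime, timedelta


def next_collection_start(start, duration, interval, strict):
    """Strict timer: next start = start + interval.
    Relative timer: next start = collection end + interval."""
    if strict:
        return start + interval
    return start + duration + interval


t0 = datetime(2021, 11, 11, 0, 0)   # T0, collection start
duration = timedelta(minutes=3)     # T1 - T0, time taken by the collection
interval = timedelta(minutes=15)    # Ts, configured sample interval

relative = next_collection_start(t0, duration, interval, strict=False)
strict = next_collection_start(t0, duration, interval, strict=True)
print(relative.strftime("%H:%M"))   # 00:18
print(strict.strftime("%H:%M"))     # 00:15

# Collections that fit into one day with each timer, for these values
day = timedelta(days=1)
print(day // (duration + interval)) # 80 with the relative timer
print(day // interval)              # 96 with the strict timer
```

With these example values the relative timer yields 16 fewer collections per day than a fixed 15-minute cadence would, which is consistent with the drift described above.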
Logs Collection
If any issue is seen, collect logs for further triage.
ASR9K/NCS55xx Recommended CLI | Telemetry Sensor Paths | Remarks
show redundancy | Cisco-IOS-XR-shellutil-oper:system-time/uptime |
show pfm location all | Cisco-IOS-XR-pfm-oper:platform-fault-manager/racks/rack/slots/slot/hardware-fault-devices/hardware-fault-type/hardware-fault-info | Enhancement in 7.3.2
show install active summary | Cisco-IOS-XR-spirit-install-instmgr-oper:software-install/active | ALL
show version | Cisco-IOS-XR-spirit-install-instmgr-oper:software-install/version | ALL
show platform | Cisco-IOS-XR-plat-chas-invmgr-ng-oper:platform/racks/rack/slots/slot/state | ALL
show memory summary location all | Cisco-IOS-XR-nto-misc-shmem-oper:memory-summary/nodes/node/summary |
show shmwin summary location <> | Cisco-IOS-XR-nto-misc-shmem-oper:memory-summary/nodes/node/detail |
show watchdog memory-state location all | State of the memory (Normal, Severe, Critical): | Working in 7.x.x release
| Cisco-IOS-XR-wd-oper:watchdog/nodes/node/memory-state |
| Available memory in each LC/RP and state of the memory: |
| Cisco-IOS-XR-wd-oper:watchdog/nodes/node/memory-state/free-memory |
show process memory | Cisco-IOS-XR-procmem-oper:processes-memory/nodes/node/process-ids/process-id |
| Cisco-IOS-XR-nto-misc-shprocmem-oper:processes-memory/nodes/node/job-ids/job-id/job-id |
show controllers fabric fia errors ingress | Cisco-IOS-XR-asr9k-fsi-oper:fabric-stats/nodes/node/statses/stats |
show controllers fabric fia errors egress | |
show controllers fabric fia stats location <loc_name> | |
show controllers fabric fia stats | |
show controllers fabric fia drops | |
show controllers fabric fia bridge stats | |
show controllers fabric crossbar statistics instance all location 0/<>/CPU0 | Cisco-IOS-XR-asr9k-xbar-oper:cross-bar-stats/nodes/node/cross-bar-table/sm15-stats/sm15-stat | ASR9k
| Cisco-IOS-XR-asr9k-xbar-oper:cross-bar-stats/nodes/node/cross-bar-table |
show controller fabric health location <no> | Cisco-IOS-XR-asr9k-fab-health-oper:fabric-health-stats | ASR9k
show asic-errors all location <no> | Cisco-IOS-XR-asic-error-oper:asic-errors/nodes/node/asic-information/all-instances/all-error-path/summary | ASR9k
show env temperature | Cisco-IOS-XR-asr9k-sc-envmon-oper:environmental-monitoring/racks/rack/slots/slot/modules/module/sensor-types/sensor-type[type='temp']/sensor-names/sensor-name/value-detailed | ASR9k
admin show env temperature | Cisco-IOS-XR-sysadmin-asr9k-envmon-ui:environment/oper/temperatures |
admin show env voltage | Cisco-IOS-XR-sysadmin-asr9k-envmon-ui:environment/oper/voltage | ASR9k
admin show env power | Cisco-IOS-XR-sysadmin-asr9k-envmon-ui:environment/oper/power | ASR9k
admin show env fan | Cisco-IOS-XR-sysadmin-asr9k-envmon-ui:environment/oper/fan | ASR9k
admin show env current | Cisco-IOS-XR-sysadmin-asr9k-envmon-ui:environment/oper/current | ASR9k
show controllers npu internaltcam location <>, show controllers npu externaltcam location <> | Cisco-IOS-XR-fia-internal-tcam-oper:controller/dpa/nodes/node/internal-tcam-resources | NCS5508
show controllers npu voq-usage interface all instance all location all | Cisco-IOS-XR-fretta-bcm-dpa-npu-stats-oper:dpa/stats/nodes/node/npu-numbers/npu-number/display/interface-handles/interface-handle | NCS5508
show controllers npu stats voq base <base-no> instance all location all | |
show controllers npu resources all location all | Cisco-IOS-XR-platforms-ofa-oper:ofa/stats/nodes/node/Cisco-IOS-XR-NCS-BDplatforms-npu-resources-oper:hw-resources-datas/hw-resources-data | NCS5508
Resources polled: Ext-tcam-ipv4, ext-tcam-ipv6, ext-tcam-ipv6-short, ext-tcam-ipv6-long, fec, ecmp-fec, lem, lpm, encap resource monitoring: | |
OOR Status | Cisco-IOS-XR-platforms-ofa-table-stats-oper:ofa/stats/nodes/node/table-datas | NCS5508
show controllers npu resources fec location | |
show controllers npu stats traps-all instance all location all | Cisco-IOS-XR-fretta-bcm-dpa-npu-stats-oper:dpa/stats/nodes/node/npu-numbers/npu-number/display/trap-ids/trap-id | NCS5508
show filesystem | Cisco-IOS-XR-shellutil-filesystem-oper:file-system/node | ALL
show reboot history location | Cisco-IOS-XR-linux-os-reboot-history-oper:reboot-history/node | ALL
show route vrf all summary | Cisco-IOS-XR-ip-rib-ipv4-oper:rib/rib-table-ids/rib-table-id/summary-protos/summary-proto | ALL
show cef tables | Cisco-IOS-XR-ip-rib-ipv6-oper:ipv6-rib/rib-table-ids/rib-table-id/summary-protos/summary-proto |
| Cisco-IOS-XR-fib-common-oper:fib/nodes/node/protocols/protocol/cef-tables/cef-table/table-id |
show alarms detail system {{active/history}} | Cisco-IOS-XR-alarmgr-server-oper:alarms/brief/brief-card/brief-locations/brief-location/history | ALL
| Cisco-IOS-XR-alarmgr-server-oper:alarms/brief/brief-card/brief-locations/brief-location/active |
show isis fast-reroute summary | Cisco-IOS-XR-clns-isis-oper:isis/instances/instance/topologies/topology/frr-summary | ALL
show cef drops | Cisco-IOS-XR-fib-common-oper:fib-statistics/nodes/node/drops | ALL
show mpls interfaces | Cisco-IOS-XR-mpls-lsd-oper:mpls-lsd/interfaces/interface | ALL
show mpls traffic-eng tunnels summary | Cisco-IOS-XR-mpls-te-oper:mpls-te/tunnels/summary | ALL
show rsvp interface | Cisco-IOS-XR-ip-rsvp-oper:rsvp/interface-briefs/interface-brief | ALL
show mpls traffic-eng counters signaling all | Cisco-IOS-XR-mpls-te-oper:mpls-te/signalling-counters/signalling-summary | ALL
show rsvp counters messages | Cisco-IOS-XR-ip-rsvp-oper:rsvp/counters/interface-messages/interface-message | ALL
show mpls ldp summary | Cisco-IOS-XR-mpls-ldp-oper:mpls-ldp/nodes/node/bindings-summary-all | ALL
show mpls ldp summary | Cisco-IOS-XR-mpls-ldp-oper:mpls-ldp/global/active/default-vrf/summary | ALL
show cef drops | Cisco-IOS-XR-fib-common-oper:fib-statistics/nodes/node/drops | ALL
| Cisco-IOS-XR-fib-common-oper:fib/nodes/node/protocols/protocol/resource | ALL
show cef vrf all {{ipv4/ipv6/mpls/etc}} | Cisco-IOS-XR-fib-common-oper:fib/nodes/node/protocols/protocol/vrfs/vrf/summary | ALL
show controllers np load all | Cisco-IOS-XR-asr9k-np-oper:hardware-module-np/nodes/node/nps/np/load-utilization | ASR9k
show controllers np fast-drop all | Cisco-IOS-XR-asr9k-np-oper:hardware-module-np/nodes/node/nps/np/fast-drop |
show controllers np counters all | Cisco-IOS-XR-asr9k-np-oper:hardware-module-np/nodes/node/nps/np/counters |
show segment-routing traffic-eng forwarding policy | Cisco-IOS-XR-infra-xtc-agent-oper:xtc/policy-forwardings/policy-forwarding | ALL
show segment-routing traffic-eng policy | Cisco-IOS-XR-infra-xtc-agent-oper:xtc/policies/policy | ALL
show segment-routing traffic-eng ipv4 topology summary | Cisco-IOS-XR-infra-xtc-agent-oper:xtc/topology-summaries/topology-summary | ALL
show segment-routing traffic-eng policy summary | Cisco-IOS-XR-infra-xtc-agent-oper:xtc/policy-summary | ALL
show lpts pifib hardware police location <no> | Cisco-IOS-XR-asr9k-lpts-oper:platform-lptsp-ifib/nodes/node/police/police-info | ASR9k
| Cisco-IOS-XR-asr9k-lpts-oper:platform-lptsp-ifib/nodes/node/stats |
| Cisco-IOS-XR-asr9k-lpts-oper:platform-lptsp-ifib-static/node-statics/node-static/stats |
show memory summary | Cisco-IOS-XR-nto-misc-oper:memory-summary/nodes/node/summary | ALL
show reboot history location <no> | Cisco-IOS-XR-linux-os-reboot-history-oper:reboot-history/node | ALL
show processes cpu | Cisco-IOS-XR-wdsysmon-fd-oper:system-monitoring | ALL
Device uptime | Cisco-IOS-XR-shellutil-oper:system-time/uptime | ALL
show controllers optics <> | Cisco-IOS-XR-controller-optics-oper:optics-oper/optics-ports/optics-port/optics-info | ALL
show processes cpu | Cisco-IOS-XR-wdsysmon-fd-oper:system-monitoring/cpu-utilization | ALL
show interfaces <> accounting | Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/cache/protocols/protocol | ALL
show policy-map interface <name> ingress | Cisco-IOS-XR-qos-ma-oper:qos/interface-table/interface/input/service-policy-names/service-policy-instance/statistics | ALL
show policy-map interface <name> egress | Cisco-IOS-XR-qos-ma-oper:qos/interface-table/interface/output/service-policy-names/service-policy-instance/statistics | ALL
Power | Cisco-IOS-XR-sysadmin-fretta-envmon-ui:environment/oper/power/ | NCS55XX
admin show environment power | |
admin show environment power location <> | |
Fan | Cisco-IOS-XR-sysadmin-fretta-envmon-ui:environment/oper/fan | NCS55XX
admin show environment fan | |
Current | Cisco-IOS-XR-sysadmin-fretta-envmon-ui:environment/oper/current | NCS55XX
admin show environment current location | |
Voltage | Cisco-IOS-XR-sysadmin-fretta-envmon-ui:environment/oper/voltage | NCS55XX
admin show environment voltages location | |
Temperature | Cisco-IOS-XR-sysadmin-fretta-envmon-ui:environment/oper/temperatures | NCS55XX
admin show environment temperature location | |
FPD status | Cisco-IOS-XR-show-fpd-loc-ng-oper:show-fpd/locations/location | NCS55XX
show fpd package | |
Memory available in different partitions | Cisco-IOS-XR-sysadmin-show-media:ShowMedia | ALL
show platform | Cisco-IOS-XR-plat-chas-invmgr-ng-oper:platform/racks/rack/slots/slot/state | ALL
CPU process | The sensor path below provides CPU utilization for all LCs, RPs, and processes: | ALL
show processes cpu | Cisco-IOS-XR-wdsysmon-fd-oper:system-monitoring/cpu-utilization |
show processes cpu thread | |
| The sensor path below can be used to monitor CPU utilization for a specific process. This example is for bgp on the RP: |
| Cisco-IOS-XR-wdsysmon-fd-proc-oper:process-monitoring/nodes/node[node-name=0/RP0/CPU0]/process-name/proc-cpu-utilizations/proc-cpu-utilization[process-name=bgp] |
Redundancy | Cisco-IOS-XR-infra-rmf-oper:redundancy | ALL
show redundancy | |
Sensor paths for sysadmin controller commands | admin show controller switch statistics: | NCS55XX
| Cisco-IOS-XR-sysadmin-controllers-ncs5500:controller/switch/oper/statistics/detail |
| admin show controller switch trunk: |
| Cisco-IOS-XR-sysadmin-controllers-ncs5500:controller/switch/oper/trunk |
Sensor path for SFE ASIC errors | Command: admin show asic-errors SFE all all | NCS55XX
| Cisco-IOS-XR-sysadmin-asic-errors-ael:asic-errors/show-all-instances |
ASIC errors | Cisco-IOS-XR-fretta-bcm-dpa-npu-stats-oper:dpa/stats/nodes/node/asic-statistics | NCS55XX
| Cisco-IOS-XR-fretta-bcm-dpa-npu-stats-oper:dpa/stats/nodes/node/asic-statistics/asic-statistics-for-npu-ids/asic-statistics-for-npu-id |
Fabric errors/counters | Cisco-IOS-XR-dnx-driver-fabric-plane-oper:fabric | NCS55XX
Interface counters | Interface counters for every interface in the router: | ALL
show interface <> | Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/cache/generic-counters |
Interface data rate (PPS/bps) | Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/data-rate | ALL
| Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/cache/data-rate >>> Use this sensor in a different sensor group from the generic counters |
| Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface[interface-name=Bun*]/cache/data-rate >>> Example: this fetches the data rate for bundle interfaces |
show interface brief | Cisco-IOS-XR-pfi-im-cmd-oper:interfaces/interface-briefs |
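Because the Remarks column indicates the platform each sensor path applies to, the table can be filtered to build a per-platform sensor group. The sketch below is one possible approach and makes several assumptions: `paths_for` is a hypothetical helper, the rows are a three-row excerpt of the table in "CLI | sensor path | remark" form, and the remark is matched literally (so "NCS5508" and "NCS55XX" are treated as different platforms).

```python
# Three rows copied from the table above, as "CLI | sensor path | remark"
ROWS = """\
show version | Cisco-IOS-XR-spirit-install-instmgr-oper:software-install/version | ALL
show controllers np load all | Cisco-IOS-XR-asr9k-np-oper:hardware-module-np/nodes/node/nps/np/load-utilization | ASR9k
FPD status | Cisco-IOS-XR-show-fpd-loc-ng-oper:show-fpd/locations/location | NCS55XX
"""


def paths_for(platform, table=ROWS):
    """Return sensor paths whose remark is ALL or the given platform."""
    wanted = {"all", platform.lower()}
    paths = []
    for line in table.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) < 3:
            continue  # not a complete table row
        _cli, path, remark = cells[:3]
        # Skip rows whose second cell is a description rather than a path
        if path.startswith("Cisco-IOS-XR-") and remark.lower() in wanted:
            paths.append(path)
    return paths


for path in paths_for("ASR9k"):
    print(f"sensor-path {path}")
```

Rows with an empty Remarks cell are skipped by this sketch, so they would need to be reviewed by hand before being added to a sensor group.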