cancel
Showing results for 
Search instead for 
Did you mean: 
cancel
40276
Views
25
Helpful
107
Comments
xthuijs
Cisco Employee
Cisco Employee

Introduction

Understanding how to debug PPPoE in IOS-XR for the ASR9K

Problem Description

Detailed step through guide for debugging PPPoE sessions.

Understanding BNG Architecture

Before we start troubleshooting sessions it is important to understand the architecture of how things link together. This overview below shows the different components as it related to PPP(oE).

In the next modules the various debugs will be elaborated on and some common things that generally go wrong.

policy-arch.JPG

1)    1) A session is initiated in PPPoE by the reception of a PADI packet in PPPoE. This PADI is a broadcast packet hitting the control plane. If the session is terminating on a physical (sub)interface, the PPPoE is handled on the linecard, if the session terminates as part of a bundle (sub)interface, PPPoE is handled by the RSP.

2)    2) The reception of a PADI triggers a “session-start” event in the control policy, during the session-start event we need to apply the dynamic template that holds at least the LCP specific parameters for when the session continues to PPP phase. Also we could do “pre-authentication” here based on PPPoE tag information in the session-start event.

3)    3) After we have sent the PADO to the client, and the client selected us as BNG, a PADR will be received which triggers the session-activate event. In this event we need a template, either provided during session start with NCP parameters (like unnumbered info), or more specific template info can be provided during this session-activate event. At this time the subscriber interface is created when we transmit the PADS and are now commited for 3 mins to that subscriber session (in IOS, the ncp timeout).

4)    4) The session moves now to the PPP LCP phase and will try to complete the LCP based on the LCP parameters defined in the dynamic-template provided during the session-start event.

5)    5) During LCP we generally negotiate some sort of authentication protocol and when we do, we enter the authentication phase. While the control policy is still in session-activate state we are starting our CHAP or PAP exchange to retrieve user credentials.

6)    6) Even if CHAP/PAP is not negotiated we can still do an authorization request, but probably not on line username/password in the absence of identity retrieval by PPP/Auth. This authen action is defined in the session-start event of the control policy

7)    7) In XR, in the absence of local authentication capabilities, we need to use radius (or tacacs/less common) to initiate the access-request

8)    8) The radius interaction will give us a response, or not when it times out. Either case, an event is triggered in the policy-plane again that we can trigger on to provide further directives or we “can live” with the response radius has given us.

9)    9) Events triggered are authen-failure (access-reject) or authen-no-response (timeout from our radius-server/list). If the response is a success/accept, we are continuing the session and start NCP.

10)  10) IPCP is started and when completed the route is installed.

This document will focus on debugging and understanding each of these components in in-depth level.

PPPoE

pppoe-disc.JPG

This picture above shows the 4 important packets from PPPoE.

The PADI is a broadcast packet, the PADO is a packet with SMAC of the BNG and the DMAC of the subscriber.

The subscriber will send a unicast PADR packet with the DMAC of the BNG it wants to establish a session with, followed by a PADS from the BNG with the pppoe session id.

This session id is unique for the segment and is to be used for EVERY packet that is sent forward from this session.

Note: The BNG will verify the SMAC, arriving access interface and pppoe session ID to prevent spoofing

The session ID is part of the 8 byte pppoe header that is slapped on to the packet (hence the 1492 maximum MTU size when pppoe encap is used.

PPPoE header explained:


MAC header

PPPoE header

Data :::

MAC header: This is a standard ethernet II header with ethertype 0x8863 for PPPoE control (PADx messages) and 0x8864 for PPPoE data packets, including PPP/LCP.

PPPoE header:


00

01

02

03

04

05

06

07

08

09

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

Version

Type

Code

Session   ID

Length

Data :::

Version. 4 bits.
Protocol version. Must be set to 1.

Type. 4 bits.
Must be set to 1.

Code. 8 bits.

Session ID. 16 bits, unsigned.

Length. 16 bits.
Size of the Data field in bytes.

Data. Variable length.

Debugging PPPoE

First step in debugging is to evaluate the PPPoE protocol and packets.

Command: Debug pppoe protocol

Provides Basic packet level debugging from a pppoe protocol point of view

RP/0/RSP0/CPU0:Sep  1 15:27:09.632 : pppoe_ma[346]: [PADI-Recv]: Bundle-Ether100.100 peer-mac 0019.2f43.9a38

0019.2f43.9a38: this is the subscriber mac-address

RP/0/RSP0/CPU0:Sep  1 15:27:09.632 : pppoe_ma[346]: [PADI-Recv]:    vlan-id-outer 100

RP/0/RSP0/CPU0:Sep  1 15:27:09.632 : pppoe_ma[346]: [PADI-Recv]:    Service-name:

RP/0/RSP0/CPU0:Sep  1 15:27:09.633 : pppoe_ma[346]: [PADO-Sent]: Bundle-Ether100.100 peer-mac 0019.2f43.9a38

RP/0/RSP0/CPU0:Sep  1 15:27:09.633 : pppoe_ma[346]: [PADO-Sent]:    vlan-id-outer 100

RP/0/RSP0/CPU0:Sep  1 15:27:09.634 : pppoe_ma[346]: [PADR-Recv]: Bundle-Ether100.100 peer-mac 0019.2f43.9a38

RP/0/RSP0/CPU0:Sep  1 15:27:09.634 : pppoe_ma[346]: [PADR-Recv]:    vlan-id-outer 100

RP/0/RSP0/CPU0:Sep  1 15:27:09.634 : pppoe_ma[346]: [PADR-Recv]:    Service-name:

RP/0/RSP0/CPU0:Sep  1 15:27:09.853 : pppoe_ma[346]: [PADS-Sent]: Bundle-Ether100.100 peer-mac 0019.2f43.9a38

RP/0/RSP0/CPU0:Sep  1 15:27:09.853 : pppoe_ma[346]: [PADS-Sent]:    vlan-id-outer 100

The  format of the debug is:

NODE : Time : <process>[pid] : [PACKET-direction] Interface mac-addr

RP/0/RSP0/CPU0:Sep  1 15:27:09.632 : pppoe_ma[346]: [PADI-Recv]: Bundle-Ether100.100 peer-mac 0019.2f43.9a38

In this example the PPPoE session completed perfect, though little information on the packet contents are provided.

Things to look for are:

Is pppoe completing properly to a PADS

Is the interface shown expected

Is the mac address expected

Is the vlan (stack) what we’d expect?

Command: Debug pppoe packet

Provides detailed packet contents from pppoe

PADI

RP/0/RSP0/CPU0:Sep  1 15:37:36.372 : pppoe_ma[346]: Bundle-Ether100.100: I dst ffff.ffff.ffff src 0019.2f43.9a38: len 46 0x1109

0000000401010000000000000000000000000000000000000000000000000000000000000000000000000000

Although the protocol is not provided here because debug pppoe protocol is off, we can tell that this is a PADI because the packet is a broadcast destination on L2 and also the packet starts with 1109 the “09” being the PADI packet

The added 0’s are expected here also because the PADx packets are very small and we need to add padding to them.

The LEN provided (46) is the Ether header length.

The PPPoE length is highlighted in ORANGE

PADO

RP/0/RSP0/CPU0:Sep  1 15:37:36.372 : pppoe_ma[346]: Bundle-Ether100.100: O dst 0019.2f43.9a38 src b4a4.e392.208b: len 21 0x11070000000f010100000102000

741394b2d424e47

PADR

RP/0/RSP0/CPU0:Sep  1 15:37:36.374 : pppoe_ma[346]: Bundle-Ether100.100: I dst b4a4.e392.208b src 0019.2f43.9a38: len 46 0x11190000000f0102000741394b2

d424e470101000000000000000000000000000000000000000000000000000000

PADS

RP/0/RSP0/CPU0:Sep  1 15:37:36.591 : pppoe_ma[346]: Bundle-Ether100.100: O dst 0019.2f43.9a38 src b4a4.e392.208b: len 10 0x11651ed0000401010000


In RED is the assigned pppoe session ID. When the subscriber interface is created that should match this number in decimal, in this case bundle-ether100.100.pppoe7888

Packet type list:

Type              Code              Direction

PADI               0x09               In only

PADO             0x07               Out only

PADR             0x19               In only

PADS             0x65               Out only

PADT              0xa7               In and Out

Things to look for are:

Are the packet types expected

Are the lengths matching up for the packet to the actual content

Are we expecting the packets in the direction they are meant to be send

Is the mac address expected

Is the vlan (stack) what we’d expect?

Command: debug pppoe session detail

RP/0/RSP0/CPU0:Sep  1 15:51:22.917 : pppoe_ma[346]: Session: Bundle-Ether100.100: Initializing new session

RP/0/RSP0/CPU0:Sep  1 15:51:22.917 : pppoe_ma[346]: Session: Creating and inserting a new session idb on parent interface Bundle-Ether100.100, with se

ssion id 7889

RP/0/RSP0/CPU0:Sep  1 15:51:22.917 : pppoe_ma[346]: Session: Successfully added new session idb to database of parent Bundle-Ether100.100. Parent sess

ion IDB count = 3

RP/0/RSP0/CPU0:Sep  1 15:51:22.918 : pppoe_ma[346]: Session: Queuing interface create on parent interface Bundle-Ether100.100, with session id 7889

LC/0/0/CPU0:Sep  1 15:51:23.038 : pppoe_ea[288]: Interface: 0x0007b920: Creating and inserting a new intf idb

LC/0/1/CPU0:Sep  1 15:51:23.039 : pppoe_ea[288]: Interface: 0x0007b920: Creating and inserting a new intf idb

LC/0/0/CPU0:Sep  1 15:51:23.039 : pppoe_ea[288]: Interface: 0x0007b920: Successfully added new intf idb to database

LC/0/0/CPU0:Sep  1 15:51:23.039 : pppoe_ea[288]: Interface: 0x0007b920: INTF CREATE received

LC/0/0/CPU0:Sep  1 15:51:23.039 : pppoe_ea[288]: Interface: 0x0007b920: INTF CREATE(1), idb lookups/creates completed successfully

LC/0/1/CPU0:Sep  1 15:51:23.040 : pppoe_ea[288]: Interface: 0x0007b920: Successfully added new intf idb to database

LC/0/1/CPU0:Sep  1 15:51:23.040 : pppoe_ea[288]: Interface: 0x0007b920: INTF CREATE received

LC/0/1/CPU0:Sep  1 15:51:23.040 : pppoe_ea[288]: Interface: 0x0007b920: INTF CREATE(1), idb lookups/creates completed successfully

LC/0/0/CPU0:Sep  1 15:51:23.044 : pppoe_ea[288]: Interface: 0x0007b920: INTF CREATE(2), Hardware programming completed successfully

LC/0/1/CPU0:Sep  1 15:51:23.045 : pppoe_ea[288]: Interface: 0x0007b920: INTF CREATE(2), Hardware programming completed successfully

LC/0/0/CPU0:Sep  1 15:51:23.045 : pppoe_ea[288]: Interface: 0x0007b920: INTF CREATE(3), IMP interface create completed successfully

LC/0/1/CPU0:Sep  1 15:51:23.046 : pppoe_ea[288]: Interface: 0x0007b920: INTF CREATE(3), IMP interface create completed successfully

RP/0/RSP0/CPU0:Sep  1 15:51:23.050 : pppoe_ma[346]: Session: 0x00000000: Received interface create cb: No error

RP/0/RSP0/CPU0:Sep  1 15:51:23.052 : pppoe_ma[346]: Session: Bundle-Ether100.100.pppoe7889: Inserting a session idb into global database

RP/0/RSP0/CPU0:Sep  1 15:51:23.053 : pppoe_ma[346]: Session: Bundle-Ether100.100.pppoe7889: Successfully added new session idb to global database. Ses

sion IDB count = 3

RP/0/RSP0/CPU0:Sep  1 15:51:23.053 : pppoe_ma[346]: Session: Bundle-Ether100.100.pppoe7889: Received MTU notification, with MTU actual 1500

RP/0/RSP0/CPU0:Sep  1 15:51:23.057 : pppoe_ma[346]: Session: Received AAA batch start notication

RP/0/RSP0/CPU0:Sep  1 15:51:23.057 : pppoe_ma[346]: Session: Bundle-Ether100.100.pppoe7889: Received AAA Session Create cb: No error

RP/0/RSP0/CPU0:Sep  1 15:51:23.058 : pppoe_ma[346]: Session: Received AAA batch end notication

RP/0/RSP0/CPU0:Sep  1 15:51:23.075 : pppoe_ma[346]: Session: Received SubDB batch start notication

RP/0/RSP0/CPU0:Sep  1 15:51:23.075 : pppoe_ma[346]: Session: Bundle-Ether100.100.pppoe7889: Received Activate Config cb: No error

RP/0/RSP0/CPU0:Sep  1 15:51:23.076 : pppoe_ma[346]: Session: Received SubDB batch end notication

RP/0/RSP0/CPU0:Sep  1 15:51:23.159 : pppoe_ma[346]: Session: Bundle-Ether100.100.pppoe7889: Interface ready call succeeded

RP/0/RSP0/CPU0:Sep  1 15:51:23.188 : pppoe_ma[346]: Session: Received AAA batch start notication

RP/0/RSP0/CPU0:Sep  1 15:51:23.188 : pppoe_ma[346]: Session: Bundle-Ether100.100.pppoe7889: Received AAA Session Up Complete cb

RP/0/RSP0/CPU0:Sep  1 15:51:23.189 : pppoe_ma[346]: Session: Received AAA batch end notication

This command is not necessarily that useful for normal troubleshooting, but it does signify the MTU being set on the interface and the session creation from PPPoE internally.

Common things to watch out for:

PPPoE session generally won’t establish for various reasons, here are some gotcha’s that I have ran into many many times

  • Service name matching: is the pppoe service name matching what the BBA group is configured for?
  • PPPoE throttling
  • Control plane policing
  • Incorrect vlan (combo)
  • Malformed tags

PPP

PPP consists of 3 phases, LCP, Authentication and NCP in order to establish a session allowing the transport of L3 protocols over it.

During LCP initial link parameters are agreed on such as MRU (like MTU), authentication protocol. If an Authentication protocol is agreed on, then we enter the Auth phase in which credentials are exchanged that likely are being handed off to radius for verification.

After that the NCP phase start which allows us to establish an L3 protocol connection such as IPCP for ipv4 or OSICP for CLNS etc. A9K supports only IPCP and soon IPV6CP (with xr430).

ppp_setup.jpg

During LCP and NCP the exchange of options is done via the following packets. Each of these packets contains options.

CONFIGURE-REQUEST

A packet sent with options that a side of the ppp connection likes to propose

CONFIGURE-ACK

A response to a request indicating that all options in the request are acceptable

CONFIGURE-REJECT

A response to a request indicating that an option can’t be honoured (eg one side does MLP the other side does not, we reject the option

CONFIGURE-NAK

A response to a request indicating that the options in the NAK packet can be fulfilled, but the option value is not desirable. Eg one side proposes CHAP authentication and the other side can only do PAP.

Requests and responses are linked together by a field in the packet called the “ID”.

After the sessions is completed, LCP echo requests, a keepalive mechanism, are exchanged and have to be acknowledged.

Keepalives from the peer are responded by the 9k in hardware. The 9k also originates keepalives which have to be responded by the peer.

Debugging PPP

Some helpful hints explaining the debugs from the component

Command: Debug ppp negotiation

RP/0/RSP0/CPU0:Sep 15 12:43:18.244 : PPP-MA[343]: LCP: Bundle-Ether100.100.pppoe1: [Initial]: Up Event

RP/0/RSP0/CPU0:Sep 15 12:43:18.248 : PPP-MA[343]: LCP: Bundle-Ether100.100.pppoe1: [Initial]: Change to state Closed

In this debug start, the location is RSP0 because PPP for bundle sessions as indicated by the interface are terminated by the RSP. Sessions terminating on a phy interface are terminated on the linecard.

I’ll remove the node and interface moving forward for space issues in the debug.

LCP debug

.244 LCP: [Initial]: Up Event

.248 LCP: [Initial]: Change to state Closed

.249 LCP: [Closed]: Down Event

.250 LCP: [Closed]: Change to state Initial

.250 LCP: [Initial]: Up Event

.251 LCP: [Initial]: Change to state Closed

.358 LCP: [Closed]: Open Event

.358 LCP: [Closed]: Initialize-Restart-Counter

.358 LCP: [Closed]: O CONFREQ id 1 len 19

.359 LCP: [Closed]:    MRU 1492 (0x010405d4)

.359 LCP: [Closed]:    AuthProto CHAP (0x0305c22305)

.359 LCP: [Closed]:    MagicNumber 0x1e7824c4 (0x05061e7824c4)

Identify the protocol state LCP, the LCP state itself in brackets [ ], the direction of the packet O output, I input the packet type (confreq) and the ID. At this point we expect a response from the peer on our ID 1.

.360 LCP: [Closed]: Change to state Req-Sent

.360 LCP: [Req-Sent]: Open Event

.361 LCP: [Req-Sent]: I CONFREQ id 1 len 25

.361 LCP: [Req-Sent]:    MRU 1492 (0x010405d4)

.362 LCP: [Req-Sent]:    MagicNumber 0x659ff390 (0x0506659ff390)

.362 LCP: [Req-Sent]:    MRRU 1524 (0x110405f4)

.362 LCP: [Req-Sent]:    EndpointDisc 1 Local (0x13070174657374)

We have move to state request sent, and we are getting an incoming configure request, also with ID 1, we are expected to send a response to ID1 now as well. Note that I deliberately let the client request options for MRRU and EndpointDisc, used for multilink that we don’t support on the 9k.

.363 LCP: Peer's MRRU: Option received on non-MP interface (reject)

.363 LCP: Peer's ED: Option received on non-MP interface (reject)

PPP identifies already the unsupported option

.363 LCP: [Req-Sent]: O CONFREJ id 1 len 15

.364 LCP: [Req-Sent]:    MRRU 1524 (0x110405f4)

.364 LCP: [Req-Sent]:    EndpointDisc 1 Local (0x13070174657374)

Here we are sending a REJECT on ID1,which is a response to the peers request ID1 in BLUE

We are rejecting the options for MLP. We should expect a new request to come in from the peer with a new offer. Hopefully, the peer is not persistent in its multilink request. If it is, and requests it again, we reject it again. How long that can go on? That will be the ppp max-configure configuration knob, default is 3.

After 3 attempts to configure the link we will terminate the ppp session.

.373 LCP: [Req-Sent]: I CONFACK id 1 len 19

.373 LCP: [Req-Sent]:    MRU 1492 (0x010405d4)

.373 LCP: [Req-Sent]:    AuthProto CHAP (0x0305c22305)

.374 LCP: [Req-Sent]:    MagicNumber 0x1e7824c4 (0x05061e7824c4)

The peer did like out proposal and acknowledged it. If the peer didn’t like our CHAP authentication protocol but can do PAP, it would have sent us a PPP CONFNAK packet with authproto option in there. This would indicate to us to switch auth protocols IF we are configured for that.

If not, and we are persistent on CHAP, we end up in a dead lock also and after default of 3 attempts to configure we drop the session

.374 LCP: [Req-Sent]: Initialize-Restart-Counter

.374 LCP: [Req-Sent]: Change to state Ack-Rcvd

.374 LCP: [Ack-Rcvd]: I CONFREQ id 2 len 14

.374 LCP: [Ack-Rcvd]:    MRU 1492 (0x010405d4)

.375 LCP: [Ack-Rcvd]:    MagicNumber 0x659ff390 (0x0506659ff390)

We are getting a new proposal in from the peer, it retracted the mlp options, note that the ID has changed to 2. The ID MUST increase on every cycle sent.

.375 LCP: [Ack-Rcvd]: O CONFACK id 2 len 14

.375 LCP: [Ack-Rcvd]:    MRU 1492 (0x010405d4)

.375 LCP: [Ack-Rcvd]:    MagicNumber 0x659ff390 (0x0506659ff390)

.375 LCP: [Ack-Rcvd]: Change to state Open

That we can honor, so we send an ack on that proposal id2. At this point in time we have sent an ACK and received an ACK.

.376 LCP: [Open]: Report This-Layer-Up

IPCP debug

This handshake is the same as LCP, with the same packet types, just for a different protocol. IPCP instead of LCP.

.397 IPCP: [Initial]: I CONFREQ id 1 len 22

.398 IPCP: [Initial]:    Address 0.0.0.0 (0x030600000000)

.398 IPCP: [Initial]:    PrimaryDNS 0.0.0.0 (0x810600000000)

.398 IPCP: [Initial]:    SecondaryDNS 0.0.0.0 (0x830600000000)

.398 IPCP: [Initial]: Conf-Req packet stalled

.400 IPCP: [Initial]: Open Event

.400 IPCP: [Initial]: Change to state Starting

.402 IPCP: [Starting]: Report This-Layer-Started

.402 IPCP: [Starting]: Up Event

.402 IPCP: [Starting]: Initialize-Restart-Counter

.403 IPCP: [Starting]: O CONFREQ id 1 len 10

.403 IPCP: [Starting]:    Address 101.101.1.1 (0x030665650101)

Our address is the interface we are unnumbered to from the session, derived from the dynamic template or radius.

.404 IPCP: [Starting]: Change to state Req-Sent

.404 IPCP: [Req-Sent]: Restarting stalled Conf-Req packet

.405 IPCP: [Req-Sent]: I CONFREQ id 1 len 22

.405 IPCP: [Req-Sent]:    Address 0.0.0.0 (0x030600000000)

.405 IPCP: [Req-Sent]:    PrimaryDNS 0.0.0.0 (0x810600000000)

.405 IPCP: [Req-Sent]:    SecondaryDNS 0.0.0.0 (0x830600000000)

.406 IPCP: [Req-Sent]: I CONFACK id 1 len 10

.406 IPCP: [Req-Sent]:    Address 101.101.1.1 (0x030665650101)

.406 IPCP: [Req-Sent]: Initialize-Restart-Counter

.406 IPCP: [Req-Sent]: Change to state Ack-Rcvd

.408 IPCP: Peer's Primary DNS address: 0.0.0.0 (reject)

.408 IPCP: Peer's Secondary DNS address: 0.0.0.0 (reject)

.408 IPCP: [Ack-Rcvd]: O CONFREJ id 1 len 16

.408 IPCP: [Ack-Rcvd]:    PrimaryDNS 0.0.0.0 (0x810600000000)

.408 IPCP: [Ack-Rcvd]:    SecondaryDNS 0.0.0.0 (0x830600000000)

Client requested dns servers and we don’t have any info for them.

.409 IPCP: [Ack-Rcvd]: I CONFREQ id 2 len 10

.409 IPCP: [Ack-Rcvd]:    Address 0.0.0.0 (0x030600000000)

Client requests an ip address

.410 IPCP: [Ack-Rcvd]: O CONFNAK id 2 len 10

.410 IPCP: [Ack-Rcvd]:    Address 199.1.1.1 (0x0306c7010101)

We provide a pool address.

The client can also request a real address, but that if we ack that be a semi static ip address, so many providers prefer to renegotiate the address and send a nak always on the address option and provide a new pool address.

.411 IPCP: [Ack-Rcvd]: I CONFREQ id 3 len 10

.411 IPCP: [Ack-Rcvd]:    Address 199.1.1.1 (0x0306c7010101)

.411 IPCP: [Ack-Rcvd]: O CONFACK id 3 len 10

.411 IPCP: [Ack-Rcvd]:    Address 199.1.1.1 (0x0306c7010101)

.411 IPCP: [Ack-Rcvd]: Change to state Open

.412 IPCP: [Open]: Report This-Layer-Up

LCP echos debug

RP/0/RSP0/CPU0:Sep 15 12:43:19.405 LCP: [Open]: I ECHOREQ id 1 len 12 magic 0x659ff390

RP/0/RSP0/CPU0:Sep 15 12:43:19.405 LCP: [Open]: O ECHOREP id 1 len 12 magic 0x1e7824c4

Incoming echo requests and replies, note that the ID is incrementing every time.

Also note the magic numbers in this packet which are the Magics negotiated during LCP.

A9K does NOT do magic number validation.

RP/0/RSP0/CPU0:Sep 15 12:43:29.646 LCP: [Open]: I ECHOREQ id 2 len 12 magic 0x659ff390

RP/0/RSP0/CPU0:Sep 15 12:43:29.646 LCP: [Open]: O ECHOREP id 2 len 12 magic 0x1e7824c4

Note: I requested the lcp echo debug to be moved over to a different debug command. At the time of writing this article it is part of the debug ppp negotiation


AAA

radius.JPG

When the control policy determines that user authentication is in order, an access-request with several attributes is sent to the RADIUS server.

The radius server can check these attributes and reply back with either an ACCESS-ACCEPT or an ACCESS-REJECT.

The Reject can have a reply message stating the reason for reject, but that is optional.

An ACCEPT can come with reply attributes that will instruct the BNG to apply features or session parameters on the session

Debugging AAA

Some helpful hints explaining the debugs from the component

Command: Debug radius

Command: Debug aaa authentication/authorization/accounting

Common things to watch out for:

Some things that can go wrong with RADIUS are:

  • When the secret key is misconfigured, the authenticator of the radius header will be wrong. Some radius-servers reply then always with a reject, some radius-servers will silently discard the message.
  • Radius packets are sent with an 8 bit ID, which is not enough for large scale, we extend that with the source port, is show radius showing bad authenticators? Likely you have ID overload. Make sure your radius server supports extended source ports
  • Are the reply items properly set for based on the session type? For instance IP sessions do not come up when the service-type is set to framed or outbound.
  • When a feature is attempted to be applied, such as an ACL that doesn’t exist, should we reject the session or keep it alive unrestricted.

NOTE: The radius-profile for an IPoE session should NOT include any service-type. Having Framed-User as service type in an access accept for an IP subscriber will cause the session to fail.

PPP/PPPoE Control Policy Events

handling-events.jpg

-->

The  format of the debug is:

NODE : Time : <process>[pid] : [PACKET-direction] Interface mac-addr

RP/0/RSP0/CPU0:Sep  1 15:27:09.632 : pppoe_ma[346]: [PADI-Recv]: Bundle-Ether100.100 peer-mac 0019.2f43.9a38

Comments
h.younis
Level 1
Level 1

Hello 

We have been facing a problem with the PPPOE with radius, and I think it's accounting problem, where the accounting debugging is failure. that's what the debug result is 

 AAA/ACCT/CLIENT(0003ADC2): recv 1000000000bps xmit 1000000000bps
Apr 16 09:40:48.507: AAA/ACCT/HC(0003ADC2): Update PPPoE/7C000C06
Apr 16 09:40:48.507: AAA/ACCT/HC(0003ADC2): no HC PPPoE/7C000C06
Apr 16 09:40:48.507: AAA/ACCT/EVENT/(0003ADC2): CALL STOP
Apr 16 09:40:48.508: AAA/ACCT(0003ADC2) reccnt 0, osr 0
Apr 16 09:55:14.631: AAA/ACCT/EVENT/(0003AECF): CALL STOP
Apr 16 09:55:14.631: AAA/ACCT/CALL STOP(0003AECF): Sending stop requests
Apr 16 09:55:14.631: AAA/ACCT(0003AECF): Send all stops
Apr 16 09:55:14.631: AAA/ACCT/NET(0003AECF): STOP
Apr 16 09:55:14.631: AAA/ACCT/NET(0003AECF): Method list not found
Apr 16 09:55:14.631: AAA/ACCT(0003AECF): del node, session 241349
Apr 16 09:55:14.631: AAA/ACCT/NET(0003AECF): free_rec, count 0
Apr 16 09:55:14.631: /AAA/ACCTNET(0003AECF) reccnt 0, csr TRUE, osr 0
Apr 16 09:55:14.631: AAA/ACCT/NET(0003AECF): Last rec in db, intf not enqueued
Apr 16 09:55:15.041: AAA/ACCT/CLIENT(0003AED1): recv 1000000000bps xmit 1000000000bps
Apr 16 09:55:15.041: AAA/ACCT/HC(0003AED1): Register PPPoE/82000D0A 64 bit counter support not configured
Apr 16 09:55:15.041: AAA/ACCT/HC(0003AED1): Update PPPoE/82000D0A
Apr 16 09:55:15.041: AAA/ACCT/HC(0003AED1): no HC PPPoE/82000D0A
Apr 16 09:55:15.041: AAA/ACCT/EVENT/(0003AED1): CALL START
Apr 16 09:55:15.041: Getting session id for NET(0003AED1) : db=7FDCB4A8A948
Apr 16 09:55:15.041: AAA/ACCT(00000000): add node, session 241351
Apr 16 09:55:15.041: AAA/ACCT/NET(0003AED1): add, count 1
Apr 16 09:55:15.041: AAA/ACCT/EVENT/(0003AED1): ATTR REPLACE
Apr 16 09:55:15.041: AAA/ACCT(0003AED1): Accounting response status = FAILURE
Apr 16 09:55:15.041: AAA/ACCT(0003AED1): Send NEWINFO accounting notification to EM successfully
Apr 16 09:55:15.041: AAA/ACCT/EVENT/(0003AED1): ATTR REPLACE
Apr 16 09:55:15.041: AAA/ACCT/EVENT/(0003AED1): ATTR REPLACE
Apr 16 09:55:15.041: AAA/ACCT/EVENT/(0003AED1): NET DOWN
Apr 16 09:55:15.041: AAA/ACCT/NET(0003AED1): Method list not found
Apr 16 09:55:15.041: AAA/ACCT(0003AED1): del node, session 241351
Apr 16 09:55:15.041: AAA/ACCT/NET(0003AED1): free_rec, count 0
Apr 16 09:55:15.041: /AAA/ACCTNET(0003AED1) reccnt 0, csr FALSE, osr 0
Apr 16 09:55:15.041: AAA/ACCT/CLIENT(0003AED1): recv 1000000000bps xmit 1000000000bps
Apr 16 09:55:15.041: AAA/ACCT/HC(0003AED1): Update PPPoE/82000D0A
Apr 16 09:55:15.041: AAA/ACCT/HC(0003AED1): no HC PPPoE/82000D0A
Apr 16 09:55:15.042: AAA/ACCT/EVENT/(0003AED1): CALL STOP
Apr 16 09:55:15.042: AAA/ACCT(0003AED1) reccnt 0, osr 0
Apr 16 09:55:15.144: AAA/ACCT/EVENT/(0003AED2): CALL START
Apr 16 09:55:15.144: Getting session id for NET(0003AED2) : db=7FDCB5146BE0
Apr 16 09:55:15.144: AAA/ACCT(00000000): add node, session 241352
Apr 16 09:55:15.144: AAA/ACCT/NET(0003AED2): add, count 1
Apr 16 09:55:15.144: Getting session id for NONE(0003AED2) : db=7FDCB5146BE0
Apr 16 09:55:17.028: AAA/ACCT/CLIENT(0003AED3): recv 1000000000bps xmit 1000000000bps
Apr 16 09:55:17.028: AAA/ACCT/HC(0003AED3): Register PPPoE/EA000D0C 64 bit counter support not configured
Apr 16 09:55:17.028: AAA/ACCT/HC(0003AED3): Update PPPoE/EA000D0C
Apr 16 09:55:17.028: AAA/ACCT/HC(0003AED3): no HC PPPoE/EA000D0C
Apr 16 09:55:17.028: AAA/ACCT/EVENT/(0003AED3): CALL START
Apr 16 09:55:17.028: Getting session id for NET(0003AED3) : db=7FDCB50F5BE0
Apr 16 09:55:17.028: AAA/ACCT(00000000): add node, session 241353
Apr 16 09:55:17.028: AAA/ACCT/NET(0003AED3): add, count 1
Apr 16 09:55:17.029: AAA/ACCT/EVENT/(0003AED3): ATTR REPLACE
Apr 16 09:55:17.029: AAA/ACCT(0003AED3): Accounting response status = FAILURE
Apr 16 09:55:17.029: AAA/ACCT(0003AED3): Send NEWINFO accounting notification to EM successfully
Apr 16 09:55:17.029: AAA/ACCT/EVENT/(0003AED3): ATTR REPLACE
Apr 16 09:55:17.029: AAA/ACCT/EVENT/(0003AED3): ATTR REPLACE
Apr 16 09:55:17.029: AAA/ACCT/EVENT/(0003AED3): NET DOWN
Apr 16 09:55:17.029: AAA/ACCT/NET(0003AED3): Method list not found
Apr 16 09:55:17.029: AAA/ACCT(0003AED3): del node, session 241353
Apr 16 09:55:17.029: AAA/ACCT/NET(0003AED3): free_rec, count 0
Apr 16 09:55:17.029: /AAA/ACCTNET(0003AED3) reccnt 0, csr FALSE, osr 0 

and that's what show radius stasts result 

Zawya_Street#sho radius stat
Auth. Acct. Both
Maximum inQ length: NA NA 1
Maximum waitQ length: NA NA 1
Maximum doneQ length: NA NA 0
Total responses seen: 13 0 13
Packets with responses: 13 0 13
Packets without responses: 0 0 0
Access Rejects : 2
Average response delay(ms): 200 0 200
Maximum response delay(ms): 1036 0 1036
Number of Radius timeouts: 0 0 0
Duplicate ID detects: 0 0 0
Buffer Allocation Failures: 0 0 0
Maximum Buffer Size (bytes): 0 0 0
Malformed Responses : 0 0 0
Bad Authenticators : 0 0 0
Unknown Responses : 0 0 0
Source Port Range: (2 ports only)
1645 - 1646
Last used Source Port/Identifier:
1645/13
1646/0

Elapsed time since counters last cleared: 1w1d15h10m
Radius Latency Distribution:
<= 2ms : 0 0
3-5ms : 0 0
5-10ms : 0 0
10-20ms: 0 0
20-50ms: 7 0
50-100m: 3 0
>100ms : 3 0

Current inQ length : 0
Current doneQ length: 0

we facing this problem after we changed the physical connection of our DSLAM, can you help 

thanks in advanced 

xthuijs
Cisco Employee
Cisco Employee

oh this looks like ios? the info shows that there are 2 access-rejects seen. and the radius is responding and has responded to all requests. when you say physical connection chnaged it suggests to me that the nas-port-id or username composed/using phy port info has changed (like rid/cid) and therefore the radius checking on these items is blcoking the user.

a debug radius shows you the attributes and packets sent. it likely shows a changed username/nas-port and a radius access-reject as response.

we need to update the radius database witht eh correct username probably.

xander

xthuijs
Cisco Employee
Cisco Employee

for the keepalive issue, I would like to look into this more. it could be that the dlink is not checking correctly on the magic number of the keepalive. but it could be the echo request or the echo reply. would recommend to file a tac case on that so we can evaluate what is not going right there so we can possibly do something about it to make it operate a bit better. the standard workaround here is the idle timer from ppp.

error 691 is very generic btw. best to run a full debug to see who denies the session. if it is radius, we need to find the solution there I guess.

xander

h.younis
Level 1
Level 1

Thank you for your fast reply 

We didn't change any of our configuration so the radius usernames is the same, we use the same billing system for more than one node and we are facing this issue only with that node

that's what debug radius and debug pppoe events and data result is 

PPPoE/5300075D
Apr 16 12:21:26.391: AAA/ACCT/HC(0003BB71): no HC PPPoE/5300075D
Apr 16 12:21:26.391: AAA/ACCT/EVENT/(0003BB71): CALL START
Apr 16 12:21:26.391: Getting session id for NET(0003BB71) : db=7FDCB1ACDAC8
Apr 16 12:21:26.391: AAA/ACCT(00000000): add node, session 244583
Apr 16 12:21:26.391: AAA/ACCT/NET(0003BB71): add, count 1
Apr 16 12:21:26.391: AAA/ACCT/EVENT/(0003BB71): ATTR REPLACE
Apr 16 12:21:26.391: AAA/ACCT(0003BB71): Accounting response status = FAILURE
Apr 16 12:21:26.391: AAA/ACCT(0003BB71): Send NEWINFO accounting notification to EM successfully
Apr 16 12:21:26.391: AAA/ACCT/EVENT/(0003BB71): ATTR REPLACE
Apr 16 12:21:26.391: AAA/ACCT/EVENT/(0003BB71): ATTR REPLACE
Apr 16 12:21:26.391: AAA/ACCT/EVENT/(0003BB71): NET DOWN
Apr 16 12:21:26.391: AAA/ACCT/NET(0003BB71): Method list not found
Apr 16 12:21:26.391: AAA/ACCT(0003BB71): del node, session 244583
Apr 16 12:21:26.391: AAA/ACCT/NET(0003BB71): free_rec, count 0
Apr 16 12:21:26.391: /AAA/ACCTNET(0003BB71) reccnt 0, csr FALSE, osr 0
Apr 16 12:21:26.391: AAA/ACCT/CLIENT(0003BB71): recv 1000000000bps xmit 1000000000bps
Apr 16 12:21:26.391: AAA/ACCT/HC(0003BB71): Update PPPoE/5300075D
Apr 16 12:21:26.391: AAA/ACCT/HC(0003BB71): no HC PPPoE/5300075D
Apr 16 12:21:26.391: AAA/ACCT/EVENT/(0003BB71): CALL STOP
Apr 16 12:21:26.391: AAA/ACCT(0003BB71) reccnt 0, osr 0
Apr 16 12:21:27.099: AAA/ACCT/EVENT/(0003BB70): CALL STOP
Apr 16 12:21:27.099: AAA/ACCT/CALL STOP(0003BB70): Sending stop requests
Apr 16 12:21:27.099: AAA/ACCT(0003BB70): Send all stops
Apr 16 12:21:27.099: AAA/ACCT/NET(0003BB70): STOP
Apr 16 12:21:27.099: AAA/ACCT/NET(0003BB70): Method list not found
Apr 16 12:21:27.099: AAA/ACCT(0003BB70): del node, session 244582
Apr 16 12:21:27.099: AAA/ACCT/NET(0003BB70): free_rec, count 0
Apr 16 12:21:27.099: /AAA/ACCTNET(0003BB70) reccnt 0, csr TRUE, osr 0
Apr 16 12:21:27.099: AAA/ACCT/NET(0003BB70): Last rec in db, intf not enqueued
Apr 16 12:21:32.062: AAA/ACCT/EVENT/(0003BB72): CALL START
Apr 16 12:21:32.062: Getting session id for NET(0003BB72) : db=7FDCB4F7D848
Apr 16 12:21:32.062: AAA/ACCT(00000000): add node, session 244584
Apr 16 12:21:32.062: AAA/ACCT/NET(0003BB72): add, count 1
Apr 16 12:21:32.062: Getting session id for NONE(0003BB72) : db=7FDCB4F7D848
Apr 16 12:21:37.438: AAA/ACCT/EVENT/(0003BB6D): EXEC DOWN
Apr 16 12:24:15.312: PPPoE 0: I PADR R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
Apr 16 12:24:15.313: Service tag: NULL Tag
Apr 16 12:24:15.313: PPPoE : encap string prepared
Apr 16 12:24:15.313: [714]PPPoE 13258: Access IE handle allocated
Apr 16 12:24:15.313: [714]PPPoE 13258: AAA get retrieved attrs
Apr 16 12:24:15.313: [714]PPPoE 13258: AAA get nas port details
Apr 16 12:24:15.313: [714]PPPoE 13258: AAA get dynamic attrs
Apr 16 12:24:15.313: [714]PPPoE 13258: AAA unique ID 3BBA9 allocated
Apr 16 12:24:15.313: [714]PPPoE 13258: No AAA accounting method list
Apr 16 12:24:15.313: [714]PPPoE 13258: Service request sent to SSS
Apr 16 12:24:15.313: [714]PPPoE 13258: Created, Service: None R:7cad.74bc.ff05 L:ec43.f674.3348 Gi0/0/5
Apr 16 12:24:15.313: [714]PPPoE 13258: State NAS_PORT_POLICY_INQUIRY Event SSS DISCONNECT
Apr 16 12:24:15.313: [714]PPPoE 0: O PADS R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
Apr 16 12:24:15.313: [714]PPPoE 13258: Destroying R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
Apr 16 12:24:15.313: [714]PPPoE 13258: AAA get dynamic attrs
Apr 16 12:24:15.313: [714]PPPoE 13258: AAA account stopped
Apr 16 12:24:23.281: PPPoE 0: I PADR R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
Apr 16 12:24:23.281: Service tag: NULL Tag
Apr 16 12:24:23.281: PPPoE : encap string prepared
Apr 16 12:24:23.281: [715]PPPoE 13259: Access IE handle allocated
Apr 16 12:24:23.281: [715]PPPoE 13259: AAA get retrieved attrs
Apr 16 12:24:23.281: [715]PPPoE 13259: AAA get nas port details
Apr 16 12:24:23.281: [715]PPPoE 13259: AAA get dynamic attrs
Apr 16 12:24:23.281: [715]PPPoE 13259: AAA unique ID 3BBAC allocated
Apr 16 12:24:23.281: [715]PPPoE 13259: No AAA accounting method list
Apr 16 12:24:23.281: [715]PPPoE 13259: Service request sent to SSS

that's our configuration for the aaa 

aaa new-model
!
!
aaa group server radius BSISP_BILLING
server 192.168.47.2 auth-port 1812 acct-port 1813
server 192.168.47.131 auth-port 1812 acct-port 1813
ip radius source-interface Loopback1
!
aaa authentication login BSISP_RADIUS group radius local
aaa authentication ppp PPPOE group radius
aaa authorization network PPPOE group radius
aaa accounting exec PPPOE
action-type start-stop
group radius
!
aaa accounting network PPPOE
action-type start-stop
group radius
!
!
!
!
!
aaa server radius dynamic-author
server-key 7 101D5149544E405B
port 3799
auth-type any

and that's the virtual-temp int 

interface Virtual-Template1
mtu 1492
ip address 10.1.0.1 255.255.192.0
ip nat inside
peer default ip address pool ecare
ppp authentication pap chap PPPOE
ppp authorization PPPOE
ppp accounting PPPOE
ip virtual-reassembly

with enabling the pppoe under the correct interface 

thank you

h.younis
Level 1
Level 1

that's my show radius server group result 

Sharecount = 1 sg_unconfigured = FALSE
Type = standard Memlocks = 1
Server(192.168.47.2:1812,1813) Transactions:
Authen: 0 Author: 0 Acct: 0
Server_auto_test_enabled: FALSE
Keywrap enabled: FALSE
Server(192.168.47.131:1812,1813) Transactions:
Authen: 0 Author: 0 Acct: 0
Server_auto_test_enabled: FALSE
Keywrap enabled: FALSE

xthuijs
Cisco Employee
Cisco Employee

your logging is heavily rate limited, some pieces are missing.

can you configure no logging rate-limit and run a test session with

debug pppoe ev/err/pa

debug ppp nego/authen

debug radius

I should be able to tell you more with that.

xander

h.younis
Level 1
Level 1

I configured the no logging rate-limited command and the debugs you requested

that was the output

 [484]PPPoE 14029: State NAS_PORT_POLICY_INQUIRY Event SSS DISCONNECT
Apr 16 13:49:48.799: [484]PPPoE 0: O PADS R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
contiguous pak, size 86
EC 43 F6 74 33 48 7C AD 74 BC FF 05 88 63 11 65
00 00 00 42 01 01 00 00 01 03 00 04 32 35 33 32
01 04 00 10 11 EB 75 C6 9A FD 17 BD C3 F0 2A 1C
6A 8B 90 EE 02 02 00 1E 41 43 3A 20 43 61 6E 6E
6F 74 20 6F 70 65 6E 20 ...
Apr 16 13:49:48.799: [484]PPPoE 14029: Destroying R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
Apr 16 13:49:48.799: [484]PPPoE 14029: AAA get dynamic attrs
Apr 16 13:49:48.799: [484]PPPoE 14029: AAA account stopped
Apr 16 13:49:50.785: PPPoE 0: I PADR R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
contiguous pak, size 52
7C AD 74 BC FF 05 EC 43 F6 74 33 48 88 63 11 19
00 00 00 20 01 01 00 00 01 03 00 04 32 35 33 32
01 04 00 10 11 EB 75 C6 9A FD 17 BD C3 F0 2A 1C
6A 8B 90 EE
Apr 16 13:49:50.785: Service tag: NULL Tag
Apr 16 13:49:50.785: PPPoE : encap string prepared
contiguous pak, size 20
EC 43 F6 74 33 48 7C AD 74 BC FF 05 88 64 11 00
36 CE 00 00
Apr 16 13:49:50.785: [485]PPPoE 14030: Access IE handle allocated
Apr 16 13:49:50.785: [485]PPPoE 14030: AAA get retrieved attrs
Apr 16 13:49:50.785: [485]PPPoE 14030: AAA get nas port details
Apr 16 13:49:50.785: [485]PPPoE 14030: Error adjusting nas port format did
Apr 16 13:49:50.785: [485]PPPoE 14030: AAA get dynamic attrs
Apr 16 13:49:50.785: [485]PPPoE 14030: AAA unique ID 3C2CE allocated
Apr 16 13:49:50.785: [485]PPPoE 14030: No AAA accounting method list
Apr 16 13:49:50.785: [485]PPPoE 14030: Service request sent to SSS
Apr 16 13:49:50.785: [485]PPPoE 14030: Created, Service: None R:7cad.74bc.ff05 L:ec43.f674.3348 Gi0/0/5
Apr 16 13:49:50.785: [485]PPPoE 14030: State NAS_PORT_POLICY_INQUIRY Event SSS DISCONNECT
Apr 16 13:49:50.785: [485]PPPoE 0: O PADS R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
contiguous pak, size 86
EC 43 F6 74 33 48 7C AD 74 BC FF 05 88 63 11 65
00 00 00 42 01 01 00 00 01 03 00 04 32 35 33 32
01 04 00 10 11 EB 75 C6 9A FD 17 BD C3 F0 2A 1C
6A 8B 90 EE 02 02 00 1E 41 43 3A 20 43 61 6E 6E
6F 74 20 6F 70 65 6E 20 ...
Apr 16 13:49:50.786: [485]PPPoE 14030: Destroying R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
Apr 16 13:49:50.786: [485]PPPoE 14030: AAA get dynamic attrs
Apr 16 13:49:50.786: [485]PPPoE 14030: AAA account stopped
Apr 16 13:49:54.769: PPPoE 0: I PADR R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
contiguous pak, size 52
7C AD 74 BC FF 05 EC 43 F6 74 33 48 88 63 11 19
00 00 00 20 01 01 00 00 01 03 00 04 32 35 33 32
01 04 00 10 11 EB 75 C6 9A FD 17 BD C3 F0 2A 1C
6A 8B 90 EE
Apr 16 13:49:54.769: Service tag: NULL Tag
Apr 16 13:49:54.769: PPPoE : encap string prepared
contiguous pak, size 20
EC 43 F6 74 33 48 7C AD 74 BC FF 05 88 64 11 00
36 CF 00 00
Apr 16 13:49:54.769: [486]PPPoE 14031: Access IE handle allocated
Apr 16 13:49:54.769: [486]PPPoE 14031: AAA get retrieved attrs
Apr 16 13:49:54.769: [486]PPPoE 14031: AAA get nas port details
Apr 16 13:49:54.769: [486]PPPoE 14031: Error adjusting nas port format did
Apr 16 13:49:54.769: [486]PPPoE 14031: AAA get dynamic attrs
Apr 16 13:49:54.769: [486]PPPoE 14031: AAA unique ID 3C2CF allocated
Apr 16 13:49:54.769: [486]PPPoE 14031: No AAA accounting method list
Apr 16 13:49:54.769: [486]PPPoE 14031: Service request sent to SSS
Apr 16 13:49:54.769: [486]PPPoE 14031: Created, Service: None R:7cad.74bc.ff05 L:ec43.f674.3348 Gi0/0/5
Apr 16 13:49:54.770: [486]PPPoE 14031: State NAS_PORT_POLICY_INQUIRY Event SSS DISCONNECT
Apr 16 13:49:54.770: [486]PPPoE 0: O PADS R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
contiguous pak, size 86
EC 43 F6 74 33 48 7C AD 74 BC FF 05 88 63 11 65
00 00 00 42 01 01 00 00 01 03 00 04 32 35 33 32
01 04 00 10 11 EB 75 C6 9A FD 17 BD C3 F0 2A 1C
6A 8B 90 EE 02 02 00 1E 41 43 3A 20 43 61 6E 6E
6F 74 20 6F 70 65 6E 20 ...
Apr 16 13:49:54.770: [486]PPPoE 14031: Destroying R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
Apr 16 13:49:54.770: [486]PPPoE 14031: AAA get dynamic attrs
Apr 16 13:49:54.770: [486]PPPoE 14031: AAA account stopped

I really appreciate your cooperation  

xthuijs
Cisco Employee
Cisco Employee

the communication between client and server is broken here:

BNG sends a PADS (with session ID zero, not valid btw) and the response is constantly a PADR.

88 63 | 11 | 65 | 00 00 | 00 42

(is pppoe control, pads, session id 00-00 and pak length 00-42)

this means that either or both sides retain an embryonic session.

ideally you want the BNG to send a PADT here to clean up the client and have him start with a PADI.

you can try to either reboot your client or you can try to set the sessions auto cleanup under the bba-group to forcefully send PADT's at lost connections like this.

(the client is not going through its statemachine correctly and hangs in PADR).

(it should have a timer to see a PADS with a session ID coming in to commit to the session and start LCP.)

xander

h.younis
Level 1
Level 1

I configured the session auto cleanup in my core router, I reboot my core once but it didn't work, you think I should try and reboot my DSLAM ? 

the output after the session auto cleanup is like that 

m console by bsisp on vty1 (41.252.47.207)
Apr 16 14:57:02.679: PPPoE 0: I PADI R:ec43.f674.3348 L:ffff.ffff.ffff Gi0/0/5
contiguous pak, size 32
FF FF FF FF FF FF EC 43 F6 74 33 48 88 63 11 09
00 00 00 0C 01 01 00 00 01 03 00 04 32 36 36 37
Apr 16 14:57:02.679: Service tag: NULL Tag
Apr 16 14:57:02.679: PPPoE 0: O PADO, R:7cad.74bc.ff05 L:ec43.f674.3348 Gi0/0/5
Apr 16 14:57:02.679: Service tag: NULL Tag
contiguous pak, size 68
EC 43 F6 74 33 48 7C AD 74 BC FF 05 88 63 11 07
00 00 00 30 01 01 00 00 01 03 00 04 32 36 36 37
01 02 00 0C 5A 61 77 79 61 5F 53 74 72 65 65 74
01 04 00 10 11 EB 75 C6 9A FD 17 BD C3 F0 2A 1C
6A 8B 90 EE
Apr 16 14:57:02.683: PPPoE 0: I PADR R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
contiguous pak, size 52
7C AD 74 BC FF 05 EC 43 F6 74 33 48 88 63 11 19
00 00 00 20 01 01 00 00 01 03 00 04 32 36 36 37
01 04 00 10 11 EB 75 C6 9A FD 17 BD C3 F0 2A 1C
6A 8B 90 EE
Apr 16 14:57:02.683: Service tag: NULL Tag
Apr 16 14:57:02.683: PPPoE : encap string prepared
contiguous pak, size 20
EC 43 F6 74 33 48 7C AD 74 BC FF 05 88 64 11 00
39 2B 00 00
Apr 16 14:57:02.683: [89]PPPoE 14635: Access IE handle allocated
Apr 16 14:57:02.683: [89]PPPoE 14635: AAA get retrieved attrs
Apr 16 14:57:02.683: [89]PPPoE 14635: AAA get nas port details
Apr 16 14:57:02.683: [89]PPPoE 14635: Error adjusting nas port format did
Apr 16 14:57:02.683: [89]PPPoE 14635: AAA get dynamic attrs
Apr 16 14:57:02.683: [89]PPPoE 14635: AAA unique ID 3C884 allocated
Apr 16 14:57:02.683: [89]PPPoE 14635: No AAA accounting method list
Apr 16 14:57:02.683: [89]PPPoE 14635: Service request sent to SSS
Apr 16 14:57:02.683: [89]PPPoE 14635: Created, Service: None R:7cad.74bc.ff05 L:ec43.f674.3348 Gi0/0/5
Apr 16 14:57:02.684: [89]PPPoE 14635: State NAS_PORT_POLICY_INQUIRY Event SSS DISCONNECT
Apr 16 14:57:02.684: [89]PPPoE 0: O PADS R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
contiguous pak, size 86
EC 43 F6 74 33 48 7C AD 74 BC FF 05 88 63 11 65
00 00 00 42 01 01 00 00 01 03 00 04 32 36 36 37
01 04 00 10 11 EB 75 C6 9A FD 17 BD C3 F0 2A 1C
6A 8B 90 EE 02 02 00 1E 41 43 3A 20 43 61 6E 6E
6F 74 20 6F 70 65 6E 20 ...
Apr 16 14:57:02.684: [89]PPPoE 14635: Destroying R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
Apr 16 14:57:02.684: [89]PPPoE 14635: AAA get dynamic attrs
Apr 16 14:57:02.684: [89]PPPoE 14635: AAA account stopped
Apr 16 14:57:04.671: PPPoE 0: I PADR R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
contiguous pak, size 52
7C AD 74 BC FF 05 EC 43 F6 74 33 48 88 63 11 19
00 00 00 20 01 01 00 00 01 03 00 04 32 36 36 37
01 04 00 10 11 EB 75 C6 9A FD 17 BD C3 F0 2A 1C
6A 8B 90 EE
Apr 16 14:57:04.671: Service tag: NULL Tag
Apr 16 14:57:04.671: PPPoE : encap string prepared
contiguous pak, size 20
EC 43 F6 74 33 48 7C AD 74 BC FF 05 88 64 11 00
39 2C 00 00
Apr 16 14:57:04.671: [90]PPPoE 14636: Access IE handle allocated
Apr 16 14:57:04.671: [90]PPPoE 14636: AAA get retrieved attrs
Apr 16 14:57:04.671: [90]PPPoE 14636: AAA get nas port details
Apr 16 14:57:04.671: [90]PPPoE 14636: Error adjusting nas port format did
Apr 16 14:57:04.671: [90]PPPoE 14636: AAA get dynamic attrs
Apr 16 14:57:04.671: [90]PPPoE 14636: AAA unique ID 3C885 allocated
Apr 16 14:57:04.671: [90]PPPoE 14636: No AAA accounting method list
Apr 16 14:57:04.671: [90]PPPoE 14636: Service request sent to SSS
Apr 16 14:57:04.671: [90]PPPoE 14636: Created, Service: None R:7cad.74bc.ff05 L:ec43.f674.3348 Gi0/0/5
Apr 16 14:57:04.672: [90]PPPoE 14636: State NAS_PORT_POLICY_INQUIRY Event SSS DISCONNECT
Apr 16 14:57:04.672: [90]PPPoE 0: O PADS R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
contiguous pak, size 86
EC 43 F6 74 33 48 7C AD 74 BC FF 05 88 63 11 65
00 00 00 42 01 01 00 00 01 03 00 04 32 36 36 37
01 04 00 10 11 EB 75 C6 9A FD 17 BD C3 F0 2A 1C
6A 8B 90 EE 02 02 00 1E 41 43 3A 20 43 61 6E 6E
6F 74 20 6F 70 65 6E 20 ...
Apr 16 14:57:04.672: [90]PPPoE 14636: Destroying R:ec43.f674.3348 L:7cad.74bc.ff05 Gi0/0/5
Apr 16 14:57:04.672: [90]PPPoE 14636: AAA get dynamic attrs
Apr 16 14:57:04.672: [90]PPPoE 14636: AAA account stopped

Thank you 

xthuijs
Cisco Employee
Cisco Employee

nah no need to reboot the dslam, the issue is that the PADS getting sent is still having an empty session ID and the BNG not being able to setup the session.

there is an SSS error (subscriber service switch).

it would be best to keep the existing debugs we already have enabled, but also add sss err/ev in it and some more sss debugs applicable to see why SSS is not able to continue with the session.

if SSS is not able to continue with the session it should have sent a PADT and not a PADS with no session ID. it leaves this client dangling.

when you rerun that test to collect SSS debugs, make sure the client is rebooted so it starts fresh from PADI stage.

xander

h.younis
Level 1
Level 1

I reboot my client and I did the same debugs 

Apr 17 10:33:25.016: PPPoE 0: I PADR R:0022.6be0.86da L:7cad.74bc.ff05 Gi0/0/5
contiguous pak, size 56
7C AD 74 BC FF 05 00 22 6B E0 86 DA 88 63 11 19
00 00 00 24 01 01 00 00 01 03 00 08 88 10 08 00
61 61 61 61 01 04 00 10 87 27 C9 49 8C CA A6 7B
14 6D 47 1B 97 CB 5E C4
Apr 17 10:33:25.016: [474]PPPoE 3477: Error adjusting nas port format did
Apr 17 10:33:25.017: [474]PPPoE 0: O PADS R:0022.6be0.86da L:7cad.74bc.ff05 Gi0/0/5
contiguous pak, size 90
00 22 6B E0 86 DA 7C AD 74 BC FF 05 88 63 11 65
00 00 00 46 01 01 00 00 01 03 00 08 88 10 08 00
61 61 61 61 01 04 00 10 87 27 C9 49 8C CA A6 7B
14 6D 47 1B 97 CB 5E C4 02 02 00 1E 41 43 3A 20
43 61 6E 6E 6F 74 20 6F ...
Apr 17 10:33:25.074: PPPoE 0: I PADT R:0022.6be0.86da L:7cad.74bc.ff05 Gi0/0/5
contiguous pak, size 36
7C AD 74 BC FF 05 00 22 6B E0 86 DA 88 63 11 A7
00 00 00 10 01 03 00 08 88 10 08 00 61 61 61 61
01 01 00 00
Apr 17 10:33:51.997: %LDP-5-NBRCHG: LDP Neighbor 7.7.7.7:0 (5) is DOWN (Received error notification from peer: Holddown time expired)
Apr 17 10:33:55.371: %LDP-5-NBRCHG: LDP Neighbor 7.7.7.7:0 (5) is UP
Apr 17 10:34:25.677: PPPoE 0: I PADT R:0022.6be0.86da L:ffff.ffff.ffff Gi0/0/5
contiguous pak, size 36
FF FF FF FF FF FF 00 22 6B E0 86 DA 88 63 11 A7
00 00 00 10 01 03 00 08 88 10 08 00 61 61 61 61
01 01 00 00
Apr 17 10:34:25.678: PPPoE 0: I PADI R:0022.6be0.86da L:ffff.ffff.ffff Gi0/0/5
contiguous pak, size 36
FF FF FF FF FF FF 00 22 6B E0 86 DA 88 63 11 09
00 00 00 10 01 01 00 00 01 03 00 08 88 10 08 00
61 61 61 61
Apr 17 10:34:25.678: Service tag: NULL Tag
Apr 17 10:34:25.678: PPPoE 0: O PADO, R:7cad.74bc.ff05 L:0022.6be0.86da Gi0/0/5
Apr 17 10:34:25.678: Service tag: NULL Tag
contiguous pak, size 72
00 22 6B E0 86 DA 7C AD 74 BC FF 05 88 63 11 07
00 00 00 34 01 01 00 00 01 03 00 08 88 10 08 00
61 61 61 61 01 02 00 0C 5A 61 77 79 61 5F 53 74
72 65 65 74 01 04 00 10 87 27 C9 49 8C CA A6 7B
14 6D 47 1B 97 CB 5E C4
Apr 17 10:34:25.691: PPPoE 0: I PADR R:0022.6be0.86da L:7cad.74bc.ff05 Gi0/0/5
contiguous pak, size 56
7C AD 74 BC FF 05 00 22 6B E0 86 DA 88 63 11 19
00 00 00 24 01 01 00 00 01 03 00 08 88 10 08 00
61 61 61 61 01 04 00 10 87 27 C9 49 8C CA A6 7B
14 6D 47 1B 97 CB 5E C4
Apr 17 10:34:25.691: Service tag: NULL Tag
Apr 17 10:34:25.691: PPPoE : encap string prepared
contiguous pak, size 20
00 22 6B E0 86 DA 7C AD 74 BC FF 05 88 64 11 00
0D 96 00 00
Apr 17 10:34:25.691: [475]PPPoE 3478: Access IE handle allocated
Apr 17 10:34:25.691: [475]PPPoE 3478: AAA get retrieved attrs
Apr 17 10:34:25.691: [475]PPPoE 3478: AAA get nas port details
Apr 17 10:34:25.691: [475]PPPoE 3478: Error adjusting nas port format did
Apr 17 10:34:25.692: [475]PPPoE 3478: AAA get dynamic attrs
Apr 17 10:34:25.692: [475]PPPoE 3478: AAA unique ID 4494 allocated
Apr 17 10:34:25.692: [475]PPPoE 3478: No AAA accounting method list
Apr 17 10:34:25.692: [475]PPPoE 3478: Service request sent to SSS
Apr 17 10:34:25.692: [475]PPPoE 3478: Created, Service: None R:7cad.74bc.ff05 L:0022.6be0.86da Gi0/0/5
Apr 17 10:34:25.692: [475]PPPoE 3478: State NAS_PORT_POLICY_INQUIRY Event SSS DISCONNECT
Apr 17 10:34:25.692: [475]PPPoE 0: O PADS R:0022.6be0.86da L:7cad.74bc.ff05 Gi0/0/5
contiguous pak, size 90
00 22 6B E0 86 DA 7C AD 74 BC FF 05 88 63 11 65
00 00 00 46 01 01 00 00 01 03 00 08 88 10 08 00
61 61 61 61 01 04 00 10 87 27 C9 49 8C CA A6 7B
14 6D 47 1B 97 CB 5E C4 02 02 00 1E 41 43 3A 20
43 61 6E 6E 6F 74 20 6F ...
Apr 17 10:34:25.692: [475]PPPoE 3478: Destroying R:0022.6be0.86da L:7cad.74bc.ff05 Gi0/0/5
Apr 17 10:34:25.692: [475]PPPoE 3478: AAA get dynamic attrs
Apr 17 10:34:25.692: [475]PPPoE 3478: AAA account stopped
Apr 17 10:34:25.751: PPPoE 0: I PADT R:0022.6be0.86da L:7cad.74bc.ff05 Gi0/0/5
contiguous pak, size 36
7C AD 74 BC FF 05 00 22 6B E0 86 DA 88 63 11 A7
00 00 00 10 01 03 00 08 88 10 08 00 61 61 61 61
01 01 00 00
Apr 17 10:34:32.369: %SYS-5-CONFIG_I: Configured from console by bsisp 

xthuijs
Cisco Employee
Cisco Employee

ok the client at least now responds properly with a PADT.

the situation for the bng is still the same, generating PADS messages with 00-00 session ID. 

before that it got a problem from SSS already and that may be the case of the sessionID zero.

now you mentioned that you were using the asr900 series. I am not sure if these are meant to terminate pppoe sessions, I know they can be client.

I am suspecting that the pppoe subsystem making a system call for the sessionID is failing due to non platform support and therefore we see that error and PADS 00-00.

It is best to check in with a tac case to see if the asr900 is meant to support pppoe-server. if so, the bug is the sessionID 0, if not yeah it is that reason I gave above.

cheers

xander

h.younis
Level 1
Level 1

Actually I'm using ASR 1002x, and yes it does support the pppoe-server I was using it for like 2 weeks before now but it doesn't support the debug sss so I can't know exactly what the problem with the SSS

xthuijs
Cisco Employee
Cisco Employee

ok for sure the ASR1002 can do pppoe server side, so good on that.

what worries me is that you have no sss debugs whatsoever, that suggests to me that the subsystem is not there. could it be that you have an ip base image instead of the image that supports pppoe/server?

you could check the cmd below.

regards

xander

CPE#show subsys | i pppoe      

pppoe               Protocol    1.000.001     

pppoe_fwd           Protocol    1.000.001     

pppoe_client        Protocol    1.000.001     

pppoe_server        Protocol    1.000.001     

pppoe_ipfib_ios     Protocol    1.000.001     

pppoemib            Management  1.000.000     

pppoe_fwd_registry  Registry    1.000.001     

pppoe_registry      Registry    1.000.001     

pppoemib_registry   Registry    1.000.001     

CPE#show subsys | i sss 

pppoa_sss           Library     1.000.001     

sss                 Protocol    1.000.001     

vpdn_ssm_sss_seghandProtocol    1.000.001     

pptp_ssm_sss_seghandProtocol    1.000.001     

ppp_sss             Protocol    1.000.001     

sss_mgr_xconnect_atoProtocol    1.000.001     

sss_mgr_xconnect_iniProtocol    1.000.001     

sss_mgr_xconnect_l2tProtocol    1.000.001     

sss_pm_xconnect_initProtocol    1.000.001     

sss_mgr_xconnect_ac_Protocol    1.000.001     

sss_pm_sgf_init     Protocol    1.000.001     

sss_pm_vpdn_init    Protocol    1.000.001     

sss_pm_rm_init      Protocol    1.000.001     

sss_mgr_vpdn_init   Protocol    1.000.001     

sss_pm_template_authProtocol    1.000.000     

sss_switching_registRegistry    1.000.001     

sss_registry        Registry    1.000.001     

sss_feature_registryRegistry    1.000.001     

l2x_sss_switching_reRegistry    1.000.001     

ac_sss_switching_regRegistry    1.000.001

h.younis
Level 1
Level 1

That was the result 

Zawya_Street#sho subsys | i pppoe
pppoe Protocol 1.000.001

#sho subsys | i sss
pppoa_sss Library 1.000.001
qos_peruser_dp_sss Library 1.000.001
sss Protocol 1.000.001
sss_pm_aaa_author Protocol 1.000.001
hqf_sss_subsys Protocol 1.000.001
hqf_sa_sss_subsys Protocol 1.000.001
ip_sss_switching_hw Protocol 1.000.001
sss_lterm_switching_hw Protocol 1.000.001
idle_timeout_dp_sss Protocol 1.000.001
idle_timeout_dp_lib_sss Protocol 1.000.001
iedge_acct_dp_sss Protocol 1.000.001
iedge_acct_dp_sss_lib Protocol 1.000.001
iedge_policing_dp_sss Protocol 1.000.001
acl_peruser_dp_sss Protocol 1.000.001
portbundle_dp_sss Protocol 1.000.001
tmpl_eot_dp_sss Protocol 1.000.001
keepalive_dp_sss Protocol 1.000.001
l4redirect_dp_sss Protocol 1.000.001
ffr_dp_sss Protocol 1.000.001
prepaid_idle_timeout_dp_sss Protocol 1.000.001
time_monitor_dp_sss Protocol 1.000.001
volume_monitor_dp_sss Protocol 1.000.001
sss_if_config Protocol 1.000.001
sss_ip_config Protocol 1.000.001
sss_pm_idmgr Protocol 1.000.001
sss_pm_profile Protocol 1.000.001
sss_pm_ancp Protocol 1.000.001
sss_pm_isg_parser Protocol 1.000.001
sss_pm_qos_parser Protocol 1.000.001
sss_pm_pacl_parser Protocol 1.000.001
sss_pm_service Protocol 1.000.001
sss_mgr_lterm_init Protocol 1.000.001
sss_pm_lterm_init Protocol 1.000.001
sss_mgr_null_fsp_init Protocol 1.000.001
sss_pm_sgf_init Protocol 1.000.001
sss_pm_vpdn_init Protocol 1.000.001
sss_pm_rm_init Protocol 1.000.001
sss_mgr_vpdn_init Protocol 1.000.001
sss_pm_template_author Protocol 1.000.000
sss_lterm_ccm Protocol 1.000.001
sss_lterm_switching Protocol 1.000.001
pptp_ssm_sss_seghandler Protocol 1.000.001
sss_policy_ha_ccm Management 1.000.001
sss_feature_registry Registry 1.000.001
sss_registry Registry 1.000.001
sss_svm_registry Registry 1.000.001
sss_template_ha_registry Registry 1.000.001
sss_policy_ha_registry Registry 1.000.001
sss_switching_registry Registry 1.000.001
sss_pm_rabapol_ha_registry Registry 1.000.001
iedge_accounting_dp_sss_registry Registry 1.000.001
sss_mobility_registry Registry 1.000.001
sss_lterm_registry Registry 1.000.001
sss_policy_rule_registry Registry 1.000.001
sss_policy_registry Registry 1.000.001
sss_policy_aaa_author_registry Registry 1.000.001
sss_pm_ancp_registry Registry 1.000.001
ip_sss_switching_registry Registry 1.000.001
idle_timeout_dp_sss_registry Registry 1.000.001
ac_sss_switching_registry Registry 1.000.001
l2x_sss_switching_registry Registry 1.000.001

Getting Started

Find answers to your questions by entering keywords or phrases in the Search bar above. New here? Use these resources to familiarize yourself with the community:

Quick Links