1. The ESP does the fragmentation.
2. L2TPv3 processing is also done in the ESP.
The ASR1k does not forward any IP packets through the RP. Forwarding is either done in the ESP or it is not supported.
For the L2TPv3 case, the packets are received from the SPA, the L2TPv3 headers are added, and then the packet is routed in the ESP. If fragmentation is needed, it occurs after routing, on the encapsulated packet.
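
To make the ordering concrete, here is a rough Python sketch of the size arithmetic: encapsulate first, route, then fragment the outer packet. It is only a model, not how the ESP is implemented. The function names are made up for illustration, and it assumes L2TPv3 directly over IPv4 with a 20-byte outer IP header, a 4-byte session ID and no cookie.

```python
# Illustrative model only: names and header sizes are assumptions, not ASR1k internals.
IP_HEADER_LEN = 20       # outer IPv4 header, no options (assumed)
L2TPV3_HEADER_LEN = 4    # session ID only, no cookie (assumed)


def encapsulate(frame_len):
    """Return the outer IP packet length after adding the L2TPv3 and IP headers."""
    return IP_HEADER_LEN + L2TPV3_HEADER_LEN + frame_len


def fragment(packet_len, mtu):
    """Return the lengths of the IPv4 fragments needed to fit packet_len into mtu."""
    if packet_len <= mtu:
        return [packet_len]
    payload = packet_len - IP_HEADER_LEN
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    max_chunk = (mtu - IP_HEADER_LEN) // 8 * 8
    sizes = []
    while payload > 0:
        chunk = min(max_chunk, payload)
        sizes.append(IP_HEADER_LEN + chunk)
        payload -= chunk
    return sizes


def forward(frame_len, egress_mtu):
    """Encapsulate, then (after the routing lookup, omitted here) fragment if needed."""
    outer_len = encapsulate(frame_len)
    return fragment(outer_len, egress_mtu)


if __name__ == "__main__":
    print(forward(1000, 1500))  # fits after encapsulation, so one packet
    print(forward(1500, 1500))  # encapsulated packet exceeds the MTU, so it is split
```

The point of the sketch is that the fragments are pieces of the outer (encapsulated) IP packet, so a frame that fits the MTU on its own can still end up fragmented once the tunnel headers are added.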