When you do a transfer, it's a Network Hold and not a User Hold.
If your configuration uses unicast MOH, streams are sent directly from the MOH server to the endpoint that requests an MOH audio stream.
If it's multicast, streams are sent from the MOH server to a multicast group IP address. Endpoints that request an MOH audio stream can join the multicast group as needed.
Call Manager first looks for the audio source at the line level. If that is set to None, it checks the phone level, followed by the common device configuration (CDC) and finally the cluster-wide service parameters.
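The lookup order above can be pictured as a simple "first value that is set wins" search. This is only a rough sketch in Python, not anything Call Manager exposes: the function name, the parameter names, and the use of None to stand for an unset audio source are all assumptions made for illustration.

```python
from typing import Optional


def resolve_audio_source(
    line: Optional[int],
    phone: Optional[int],
    common_device_config: Optional[int],
    cluster_default: int,
) -> int:
    """Return the audio source ID from the most specific level that has one set.

    None stands for a level where the audio source is left unset
    (a hypothetical modelling choice for this sketch).
    """
    # Check the levels from most specific to least specific.
    for candidate in (line, phone, common_device_config):
        if candidate is not None:
            return candidate
    # Nothing set on the line, phone, or CDC: use the cluster-wide default.
    return cluster_default
```

For example, resolve_audio_source(None, 3, 5, 1) would pick the phone-level value because the line level is unset.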
In simplest terms, the holder's configuration determines which audio file to play, and the holdee's configuration determines which resource or server will play that file. As illustrated by the example in the figure below, if phones A and B are on a call and phone B (holder) places phone A (holdee) on hold, phone A will hear the MoH audio source configured for phone B (Audio-source2). However, phone A will receive this MoH audio stream from the MRGL (resource or server) configured for phone A (MRGL A).
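The holder/holdee split can be sketched in a few lines of Python. This is an illustrative model only: the Phone class and moh_for_hold function are hypothetical, and "Audio-source1" and "MRGL B" are made-up values added so both phones are fully populated (only Audio-source2 and MRGL A come from the example above).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Phone:
    name: str
    audio_source: str  # audio source played to parties this phone puts on hold
    mrgl: str          # media resource group list this phone draws MoH servers from


def moh_for_hold(holder: Phone, holdee: Phone) -> Tuple[str, str]:
    """Return (audio source to play, MRGL that supplies the stream)."""
    # The holder decides WHAT is played; the holdee decides WHERE it comes from.
    return holder.audio_source, holdee.mrgl


phone_a = Phone("A", "Audio-source1", "MRGL A")  # "Audio-source1" is hypothetical
phone_b = Phone("B", "Audio-source2", "MRGL B")  # "MRGL B" is hypothetical

# Phone B (holder) places phone A (holdee) on hold.
audio, mrgl = moh_for_hold(holder=phone_b, holdee=phone_a)
print(f"Phone A hears {audio}, streamed via {mrgl}")
```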

Because the configured MRGL determines the server from which a unicast-only device will receive the MoH stream, you must configure unicast-only devices with an MRGL that points to a unicast MoH resource or media resource group (MRG). Likewise, a device capable of multicast should be configured with an MRGL that points to a multicast MRG containing an MoH server configured for multicast.
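One way to sanity-check a design against this rule is to model each MRGL as a list of MRG names and flag each MRG as multicast or unicast. The sketch below does that in Python. It is a simplified assumption rather than how Call Manager stores or evaluates its configuration, and every name in it (mrgl_matches_device, the MRG names, the dictionary layout) is hypothetical.

```python
from typing import Dict, List


def mrgl_matches_device(
    mrgl: List[str],
    mrg_is_multicast: Dict[str, bool],
    device_uses_multicast: bool,
) -> bool:
    """Return True if the MRGL contains at least one MRG of the type the device needs."""
    # A unicast-only device needs a unicast MRG; a multicast-capable
    # device should find a multicast MRG in its list.
    return any(mrg_is_multicast[mrg] == device_uses_multicast for mrg in mrgl)


# Hypothetical MRGs: True means the MRG is set up for multicast MoH.
mrgs = {"MRG-Unicast-HQ": False, "MRG-Multicast-HQ": True}

print(mrgl_matches_device(["MRG-Unicast-HQ"], mrgs, device_uses_multicast=False))
print(mrgl_matches_device(["MRG-Unicast-HQ"], mrgs, device_uses_multicast=True))
```

The second call returns False because a multicast-capable device is paired with an MRGL that holds only a unicast MRG, which is the mismatch described above.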