This is probably a question you need to ask Callrex. If you're using SPAN-based recording, then their application most likely isn't differentiating one RTP stream from another properly. If you're using BiB (Built-in Bridge) recording, then something else is going on, because the Jabber client never receives RTP packets from other devices and therefore couldn't forward them to Callrex.
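To illustrate what "differentiating one RTP stream from another" means on a SPAN port, here is a rough sketch of how mirrored packets could be separated. It is not how Callrex actually works (I have no visibility into their code); the function names and the tuple layout of the input are made up for the example. The only thing taken from the RTP spec (RFC 3550) is the 12-byte fixed header layout, where the SSRC field identifies the sender of a stream.

```python
import struct
from collections import defaultdict


def parse_rtp_header(payload: bytes):
    """Return (payload_type, sequence, timestamp, ssrc), or None if not RTP v2.

    Only the 12-byte fixed header defined in RFC 3550 is read here.
    """
    if len(payload) < 12:
        return None
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", payload[:12])
    if b0 >> 6 != 2:  # top two bits carry the RTP version, which must be 2
        return None
    return b1 & 0x7F, seq, ts, ssrc


def group_streams(packets):
    """Group mirrored UDP packets into separate RTP streams.

    `packets` is a hypothetical iterable of
    (src_ip, src_port, dst_ip, dst_port, udp_payload) tuples, e.g. as
    extracted from a capture taken on the SPAN destination port.
    """
    streams = defaultdict(list)
    for src_ip, src_port, dst_ip, dst_port, payload in packets:
        header = parse_rtp_header(payload)
        if header is None:
            continue  # not RTP (or truncated), ignore it
        _, seq, _, ssrc = header
        # Key on the addresses/ports AND the SSRC so that audio from
        # two different phones never lands in the same stream.
        streams[(src_ip, src_port, dst_ip, dst_port, ssrc)].append(seq)
    return dict(streams)
```

If a recorder keyed only on, say, the destination IP, audio from unrelated calls that share that address could end up mixed into one recording, which would sound a lot like "other voices".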
Just to rule out the basics, you may want to try calling the Jabber user whose recordings contain the "other voices" and talk to them for a bit. It's possible that the user's microphone is just picking up background noise and there is nothing wrong with the recording.