Hi Amit,
match protocol rtp video will match only video streams, not audio.
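As a rough sketch, the two NBAR matches would sit in separate classes like this. The class-map names are made up for illustration and are not from your configuration:

```
! Hypothetical class names, for illustration only
class-map match-all RTP-VIDEO
 match protocol rtp video
!
class-map match-all RTP-AUDIO
 match protocol rtp audio
```

A policy-map can then treat the two classes differently, so video does not end up in the voice class.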
Regarding your second question:
Matching on rtp audio is fine, but it should be just one part of a full end-to-end QoS policy; you should not rely on it alone.
When deploying QoS there are some best practices, such as marking as close to the source as possible. Most VoIP end devices and servers already mark their traffic: audio as DSCP 46 (EF) and signaling as DSCP 24 (CS3) at Layer 3, or, in the case of phones, with CoS values at Layer 2 as well (typically CoS 5 for audio and CoS 3 for signaling).
You should also configure QoS on the Catalyst switches. Since the traffic is already marked, you need to enable trust on the switches and ensure the traffic is prioritised and the markings are carried through to the routers. At the routers you can classify the traffic based on markings, protocol (like rtp audio), source/destination, and several other criteria. It is then sent across the WAN with the appropriate markings, gets preferential treatment in the Service Provider's network, and the markings are maintained throughout.
So just to summarise: yes, matching rtp audio should catch the audio, but for QoS to work effectively you should deploy it based on a wider policy that makes sure voice traffic is prioritised at all possible levels.
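To make that flow a bit more concrete, here is a minimal sketch, not a drop-in configuration. The assumptions: the switch is an older Catalyst that uses the mls qos syntax (newer platforms configure trust differently or trust by default), the interface names, class and policy names are invented, and the percentages are placeholders you would size for your own links.

```
! --- Access switch (assumes mls qos syntax; platform dependent) ---
mls qos
interface GigabitEthernet0/1
 ! Trust the DSCP value the phone or server has already set
 mls qos trust dscp
!
! --- WAN router (names and percentages are placeholders) ---
class-map match-any VOICE
 ! Catch audio by marking, or by NBAR if the marking was lost
 match ip dscp ef
 match protocol rtp audio
class-map match-any SIGNALING
 match ip dscp cs3
!
policy-map WAN-EDGE
 class VOICE
  ! Low-latency queue for voice
  priority percent 30
 class SIGNALING
  bandwidth percent 5
 class class-default
  fair-queue
!
interface Serial0/0
 service-policy output WAN-EDGE
```

The point of match-any in the VOICE class is that either the marking or the rtp audio match is enough, so the protocol match acts as a backup rather than the only thing you depend on.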
Hope it helps.
Terry
Please rate if you find it helpful.