Greetings Ting,
I don't believe the packet buffer size on the F1 cards is published externally, sorry. What I can say is that it is smaller than on an M1 linecard as F1s are optimised for minimal latency.
Traffic received on an F1 module that needs to be routed is sent over an internal port-channel which, by default, consists of ports on all M1 modules present in the VDC. Routed traffic is therefore load-balanced automatically across the available M1 forwarding engines, in a way similar to EtherChannel hashing.
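As a rough illustration of what per-flow hashing across a set of modules looks like, here is a minimal Python sketch. It is not the actual F1/M1 hardware algorithm: the hash function (CRC32), the flow fields, the slot numbers and the helper name pick_m1_module are all assumptions made up for the example.

```python
import zlib
from collections import Counter


def pick_m1_module(flow, m1_modules):
    """Map a flow to one module slot by hashing its fields.

    Illustrative only: real hardware uses its own hash inputs and function.
    """
    # Build a stable byte string from the flow fields, then hash it.
    key = "|".join(str(field) for field in flow).encode()
    return m1_modules[zlib.crc32(key) % len(m1_modules)]


# Hypothetical M1 slots taking part in the internal port-channel.
m1_modules = [1, 2, 3]

# Hypothetical flows: (src IP, dst IP, protocol, src port, dst port).
flows = [("10.0.0.%d" % (i % 250), "10.1.0.1", 6, 1024 + i, 80)
         for i in range(1000)]

# Count how many flows land on each module.
distribution = Counter(pick_m1_module(f, m1_modules) for f in flows)
for slot in m1_modules:
    print("module", slot, "flows:", distribution[slot])
```

The point of the sketch is that every packet of a given flow always maps to the same module, while different flows are spread across all modules in the list. Changing the list changes where the flows land.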
You can manually configure which M1 modules are used with the 'hardware proxy layer-3 routing ..' command. The following section of the configuration guide has more information:
http://www.cisco.com/en/US/docs/switches/datacenter/sw/5_x/nx-os/unicast/configuration/guide/l3_route.html#wp1090728
Hope you find this useful,
/Phil