I could be wrong, but I have a feeling the hardware queue size is determined by interface speed. So, say you decrease the hardware queue size to 5: that would cause more traffic to be put in memory, which wouldn't be optimal. Once again, I don't know the answer, so I'm taking a guess.
The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.
In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.
It depends on your Bc and Be. (In your case, as you're using shape average, Be shouldn't apply.)
The shaper allows ingress-to-egress traffic at line rate and/or might dequeue at its maximum PPS rate. A burst that "fits" within Bc (or, with shape peak, Bc+Be) could queue in the tx-ring.
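To put rough numbers on that, here is a minimal Python sketch of the arithmetic. The CIR, Tc and packet sizes are hypothetical values chosen for illustration, not taken from your config, and actual Bc/Tc defaults vary by platform and IOS version. It also ignores that the tx-ring drains at line rate while the burst arrives, so it overstates real tx-ring occupancy.

```python
def bc_from_cir(cir_bps: int, tc_ms: int) -> int:
    """Committed burst in bits for a given CIR (bits/s) and Tc (ms): Bc = CIR * Tc."""
    return cir_bps * tc_ms // 1000


def burst_packets(bc_bits: int, packet_bytes: int) -> int:
    """Whole packets of packet_bytes that fit within a burst of bc_bits."""
    return bc_bits // (packet_bytes * 8)


def fits_in_tx_ring(bc_bits: int, packet_bytes: int, tx_ring_size: int) -> bool:
    """True if a full Bc burst of same-sized packets fits in the tx-ring at once."""
    return burst_packets(bc_bits, packet_bytes) <= tx_ring_size


if __name__ == "__main__":
    # Hypothetical shaper: 100 Mbps CIR with a 4 ms Tc (assumed values).
    bc = bc_from_cir(100_000_000, 4)
    for size in (1500, 64):
        print(size, burst_packets(bc, size), fits_in_tx_ring(bc, size, 256))
```

With these assumed values, a Bc burst of full-size packets is far smaller than a 256-entry tx-ring, while a burst of minimum-size packets would exceed it.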
Considering the performance capacity of an 891, and assuming you don't have LAN ingress bandwidth that oversubscribes the egress gig, I doubt there would be a need for a deep tx-ring.
BTW, some of the later IOS versions are supposed to minimize the tx-ring when a service policy is applied. (This is to minimize FIFO tx-ring queuing, before packets are placed into an interface policy [without shaper].) Are you sure the tx-ring size is still 256?