how can that be?
It's unclear, at least to me, what you're asking. Is your question about TCP window scaling and window sizes, or about the lower than expected throughput?
Assuming your question is about lower than expected throughput, there are many possible reasons. First, you mention a "few duplicate packets". Even a few can dramatically impact throughput.
Second, you didn't detail your environment or how long your test runs. TCP takes time to "ramp up" its "speed", and the higher the latency, the longer that takes.
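To give a feel for the ramp-up, here's a rough sketch that counts how many round trips an idealized slow start needs before the congestion window covers the bandwidth-delay product. The function name, the 10-segment initial window, the 1460-byte MSS, and the "doubles every RTT with no loss" model are all my assumptions for illustration, not a description of your stack.

```python
import math

def rtts_to_fill_pipe(bandwidth_bps, rtt_s, mss=1460, init_cwnd=10):
    """Round trips for an idealized slow start to reach the BDP.

    Assumptions (illustrative only): cwnd doubles every RTT, no loss,
    no receive-window limit, fixed MSS and initial window.
    """
    bdp_bytes = bandwidth_bps * rtt_s / 8          # bytes in flight to fill the path
    target_segments = math.ceil(bdp_bytes / mss)   # window needed, in segments
    cwnd = init_cwnd
    rounds = 0
    while cwnd < target_segments:
        cwnd *= 2                                  # slow start: double per RTT
        rounds += 1
    return rounds

if __name__ == "__main__":
    for rtt_ms in (1, 10, 100):
        n = rtts_to_fill_pipe(1_000_000_000, rtt_ms / 1000)
        print(f"1 Gbps, {rtt_ms} ms RTT: {n} RTTs, about {n * rtt_ms} ms to ramp up")
```

In this model a short test on a high-latency path spends a noticeable share of its time below full rate, and any loss (or duplicate that triggers a retransmit) makes the real figure worse.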
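Third, how are you measuring throughput, and against what expectation? TCP payload throughput will always be lower, in bps, than what the "wire" supports, because of L2 and L3 overhead (plus TCP's own header), and the smaller the packets, the larger that overhead's share.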
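As a rough illustration of that overhead, the sketch below estimates the best-case TCP payload rate on Ethernet. I'm assuming standard Ethernet framing (14-byte header, 4-byte FCS, 8-byte preamble, 12-byte inter-frame gap, so 38 bytes per frame), IPv4 and TCP headers of 20 bytes each with no options, and full-size packets. The function name and defaults are mine; adjust them for VLAN tags, IPv6, or TCP options such as timestamps.

```python
def max_tcp_goodput_bps(link_bps, mtu=1500, ip_hdr=20, tcp_hdr=20, l2_overhead=38):
    """Best-case TCP payload rate for full-size packets on Ethernet.

    Assumptions (illustrative only): every packet is MTU-sized, no loss,
    no retransmits, and ACK traffic does not compete on the same direction.
    """
    payload = mtu - ip_hdr - tcp_hdr       # TCP payload bytes per packet
    on_wire = mtu + l2_overhead            # bytes the link is busy per packet
    return link_bps * payload / on_wire

if __name__ == "__main__":
    full = max_tcp_goodput_bps(1_000_000_000)
    small = max_tcp_goodput_bps(1_000_000_000, mtu=576)
    print(f"1500-byte MTU: {full / 1e6:.1f} Mbps of payload")
    print(f" 576-byte MTU: {small / 1e6:.1f} Mbps of payload")
```

So even a perfect test on gigabit Ethernet tops out a bit under 950 Mbps of payload, and smaller packets lower that further.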