Do Snapshots contribute to the deduplication savings in HXDP?

RedNectar

Hi Experts,

Actually, this question is directed squarely at the HyperFlex developers, so if you know anyone in that camp, please ask them to read this.

The question: 

Do Snapshots contribute to the deduplication savings in HXDP?

The evidence I've gathered says that they don't.

AND THAT'S A WORRY FOR ME - and probably every other HyperFlex user out there.

Actually, the questions I'd really like answered are:

  • How does HyperFlex calculate the deduplication ratio? 
  • What files are actually included in the formula?
  • What files are excluded from the calculation, and why? (like: Why aren't snapshots included?)

The background:

I have the luxury of access to a HyperFlex lab - just three HX220 hybrid nodes, but for about a week it has been mine. All mine! And I've been running a lot of experiments trying to better understand how space is allocated in HX.

I love a picture - this one shows I have 3 nodes, RF=2, total capacity 9.04 TB (which, as the HyperFlex Capacity Management white paper explains, is REALLY 9.04 TiB)

[Image: HX Connect summary showing 3 nodes, RF=2, 9.04 TB total capacity]
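For anyone wondering about the TB/TiB quibble, here is the arithmetic I mean (my own illustration of the labelling issue in footnote [1], not HX code):

```python
# Illustration of the TB-vs-TiB labelling described in footnote [1]:
# a value displayed as 9.04 "TB" is really 9.04 TiB (binary units).
TIB = 2 ** 40   # tebibyte in bytes
TB = 10 ** 12   # terabyte in bytes

capacity_tib = 9.04
capacity_bytes = capacity_tib * TIB
print(f"{capacity_tib} TiB = {capacity_bytes / TB:.2f} TB")   # -> 9.04 TiB = 9.94 TB
```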

But what I can't figure out is how I can have a VM allocated 8 TiB [1] consuming 96.67 TiB in a 9 TiB system with a deduplication rate of 0%.

I was EXPECTING to see about a 92% deduplication rate: (96.67 - 8) / 96.67 ≈ 0.92, or perhaps (96.67 - 7.8) / 96.67 ≈ 0.92.
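To make the expectation explicit, this is the calculation I'm doing. It's the conventional "savings" formula; whether HXDP actually uses it is exactly what I'm asking:

```python
# Conventional savings formula: (data written - unique data) / data written.
# Figures are from my HX Connect screenshots; the formula itself is my
# assumption about how HXDP *might* calculate the ratio.
written = 96.67   # TiB consumed by the VM plus its twelve snapshots
unique = 8.0      # TiB of genuinely unique data (one full copy of the disk)

savings = (written - unique) / written
print(f"Expected deduplication savings: {savings:.0%}")   # -> 92%
```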

 

[Image: HX Connect showing 96.67 TiB consumed and 0% deduplication savings]

The whole story

I set up a VM with a huge drive capacity, then started filling it with random data (using cat /dev/urandom > RandomFile2), which probably explains why I have 0% compression. When the VM was at about 50% capacity, I set up HXDP snapshots to be taken every hour.
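Incidentally, 0% compression is exactly what you'd expect from /dev/urandom data. A quick way to convince yourself (my own check, nothing HX-specific):

```python
import os
import zlib

random_block = os.urandom(1 << 20)    # 1 MiB of random bytes, like /dev/urandom output
repeated_block = b"A" * (1 << 20)     # 1 MiB of trivially compressible data

print(len(zlib.compress(random_block)) / len(random_block))      # ~1.0 -> no savings
print(len(zlib.compress(repeated_block)) / len(repeated_block))  # ~0.001 -> big savings
```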

It took about 12 hours to actually fill the disk - so I've ended up with thirteen .vmdk files ranging in size from about 5.4 TiB to almost 8 TiB.

And since I was writing huge amounts (4+ TiB), I can be sure that that 4+ TiB is not stuck in some write log somewhere (which could explain why the usage seen in HX Connect does not always match the actual sum of the file sizes).

[Image: datastore file listing showing the thirteen .vmdk files]
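For anyone who wants to repeat the comparison, this is roughly how I tallied the .vmdk sizes against what HX Connect reports. The datastore path here is a hypothetical example; substitute your own VM's folder:

```python
from pathlib import Path

# Hypothetical datastore path - adjust for your environment.
vm_folder = Path("/vmfs/volumes/hx-datastore/BigVM")

# Sum the base disk plus snapshot .vmdk files and report the total in TiB.
total_bytes = sum(f.stat().st_size for f in vm_folder.glob("*.vmdk"))
print(f"Sum of .vmdk sizes: {total_bytes / 2**40:.2f} TiB")
```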

In fact, the whole plan was to see if I could validate how compression is calculated.  But this little exercise has not helped me in the slightest!

Which brings me to the questions I'd really like answered. And they are:

  • How does HyperFlex calculate the deduplication ratio? 
  • What files are actually included in the formula?
  • What files are excluded from the calculation, and why? (like: Why aren't snapshots included?)

On a final note, please don't refer me to the Capacity Management in Cisco HyperFlex white paper I mentioned earlier. I've read it thoroughly, spent some hours pointing out discrepancies and mistakes in the document, and sent my comments to the technical writers. Unless, of course, you can refer me to an updated version with the answers to my questions nicely documented - which is where this information really should be.

 

1. [VMware displays TiB values but labels them TB]

RedNectar aka Chris Welsh.
Forum Tips: 1. Paste images inline - don't attach. 2. Always mark helpful and correct answers, it helps others find what they need.