My name is Philipp C. Heckel and I write about nerdy things.

Posts Tagged / Deduplication


  • Jul 22 / 2019
  • Comments Off on Deduplicating NTFS file systems (fsdup)
Programming

Deduplicating NTFS file systems (fsdup)

At my work, we store hundreds of thousands of block-level backups for our customers. Since our customer base is mostly Windows focused, most of these backups are copies of NTFS file systems. As of today, we’re not performing any data deduplication on these backups, which is pretty crazy considering that how well you’d think a Windows OS will probably dedup.

So I started on a journey to attempt to dedup NTFS. This blog post briefly describes my journey and thoughts, but also introduces a tool called fsdup I developed as part of a 3 week proof-of-concept. Please note that while the tool works, it’s highly experimental and should not be used in production!

Continue Reading

  • May 20 / 2013
  • 3
Cloud Computing, Distributed Systems, Security, Synchronization

Minimizing remote storage usage and synchronization time using deduplication and multichunking: Syncany as an example

This post introduces my Master’s thesis “Minimizing remote storage usage and synchronization time using deduplication and multichunking: Syncany as an example”. I submitted the thesis in January 2012, and now found a little time to post it here.

The key goal of this thesis was to determine the suitability of deduplication for end-user applications — particularly for my synchronization application Syncany. As part of this work, the thesis introduces Syncany, a file synchronizer designed with security and provider independence as a core part of its architecture.

Continue Reading