Minimizing remote storage usage and synchronization time using deduplication and multichunking: Syncany as an example
Download as PDF: This article is a web version of my Master’s thesis. Feel free to download the original PDF version.
C. List of Variables Recorded
The following table contains a detailed explanation of each variable recorded in experiment 1 (cf. chapter 6.4.1). For each dataset version and each configuration, one set of these variables was saved.
Measured Variable | Description |
totalDurationSec | Duration of the chunking and indexing process for the run |
totalChunkingDurationSec | Duration of the chunking process for the run (excludes index lookups and other management tasks) |
totalDatasetFileCount | Number of analyzed files in the run, i.e. the number of files in this version of the dataset |
totalChunkCount | Number of created chunks from the files analyzed |
totalDatasetSize | Size in bytes of all analyzed files in the run |
totalNewChunkCount | Number of chunks that have not been found in the index (new chunks, negative chunk index lookup) |
totalNewChunkSize | Size in bytes of all the new chunks |
totalNewFileCount | Number of new files (negative file index lookup) |
totalNewFileSize | Size in bytes of all the new files |
totalDuplicateChunkCount | Number of chunks that have been found in the index (positive chunk index lookup) |
totalDuplicateChunkSize | Size in bytes of all duplicate chunks |
totalDuplicateFileCount | Number of duplicate files found during this run (positive file index lookup) |
totalDuplicateFileSize | Size in bytes of duplicate files during this run |
totalMultiChunkCount | Number of multichunks created from the new chunks |
totalMultiChunkSize | Size in bytes of the created multichunks |
totalIndexSize | Size in bytes of the incremental index file for this run |
totalCpuUsage | CPU usage during this run in percent |
tempDedupRatio | Temporal deduplication ratio, excluding the size of the index; calculated by dividing the cumulative input bytes by the cumulative size of the generated multichunks in bytes (both summed from t0 to tn) |
tempSpaceReducationRatio | Temporal space reduction in percent, excluding the size of the index; calculated as one minus the inverse of the temporal deduplication ratio |
tempDedupRatioInclIndex | Temporal deduplication ratio, including the size of the index |
tempSpaceRedRatioInclIndex | Temporal space reduction in percent, including the size of the index |
recnstChunksNeedBytes | Size in bytes of chunks needed to reconstruct the current dataset version if no previous version has been downloaded before |
recnstMultChnksNeedBytes | Size in bytes of multichunks needed to reconstruct the current dataset version if no previous version has been downloaded before |
recnstMultOverhDiffBytes | Difference in bytes between the required size of multichunks and the size of chunks |
recnst5NeedMultChnkBytes | Size in bytes of multichunks needed to reconstruct the current dataset version if the last five dataset versions are missing |
recnst5NeedChunksBytes | Size in bytes of chunks needed to reconstruct the current dataset version if the last five dataset versions are missing |
recnst5OverheadBytes | Difference in bytes between the required size of multichunks and the size of chunks (if five versions are missing) |
recnst10NeedMultChnkBytes | Size in bytes of multichunks needed to reconstruct the current dataset version if the last ten dataset versions are missing |
recnst10NeedChunksBytes | Size in bytes of chunks needed to reconstruct the current dataset version if the last ten dataset versions are missing |
recnst10OverhBytes | Difference in bytes between the required size of multichunks and the size of chunks (if ten versions are missing) |
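The derived variables in the table (the temporal deduplication ratio, the temporal space reduction and the reconstruction overhead) follow directly from the recorded byte counts. The following Python sketch illustrates the arithmetic described above; it is not the code used in the experiments. The function and parameter names are hypothetical, and the way the index size enters the "InclIndex" variants (added to the multichunk bytes in the denominator) is an assumption, since the table does not spell it out.

```python
from itertools import accumulate


def temporal_ratios(dataset_sizes, multichunk_sizes, index_sizes=None):
    """Return one (dedup_ratio, space_reduction_percent) pair per version.

    dataset_sizes:    totalDatasetSize per version (t0..tn), in bytes
    multichunk_sizes: totalMultiChunkSize per version, in bytes
    index_sizes:      totalIndexSize per version, in bytes (optional).
                      Assumption: for the "InclIndex" variants the index
                      bytes are added to the stored multichunk bytes.
    """
    if index_sizes is None:
        index_sizes = [0] * len(dataset_sizes)

    # Cumulative input bytes and cumulative stored bytes from t0 to tn.
    cumulative_input = accumulate(dataset_sizes)
    cumulative_stored = accumulate(
        m + i for m, i in zip(multichunk_sizes, index_sizes)
    )

    results = []
    for total_in, total_stored in zip(cumulative_input, cumulative_stored):
        # Deduplication ratio: input bytes divided by stored bytes.
        ratio = total_in / total_stored
        # Space reduction: one minus the inverse ratio, in percent.
        reduction_percent = (1 - 1 / ratio) * 100
        results.append((ratio, reduction_percent))
    return results


def reconstruction_overhead(multichunk_bytes_needed, chunk_bytes_needed):
    """Extra bytes downloaded because chunks are packed into multichunks."""
    return multichunk_bytes_needed - chunk_bytes_needed
```

For example, two dataset versions of 100 bytes each, where only the first version produces 50 bytes of multichunks, give a ratio of 2 (50 % space reduction) after the first version and a ratio of 4 (75 % space reduction) after the second. The sketch assumes that the first version stores at least one byte; otherwise the division fails.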
Hi,
I would love to see an ebook version of your thesis (epub or mobi). Would that be possible?
Thanks
@JP: Is there a simple way to compile it from LaTeX format? If it is, I can definitely make one for you. Just let me know :-)
Hi Philipp:
Good morning. Would it be possible to receive a PDF version of your thesis?
Cheers
Madhavan