Minimizing remote storage usage and synchronization time using deduplication and multichunking: Syncany as an example
Download as PDF: This article is a web version of my Master’s thesis. Feel free to download the original PDF version.
B. Pre-Study Folder Statistics
To get a better idea of what kind of data users would store in their Syncany folders, the pre-study asked users and developers to run a small Java program that collects data about file types and sizes. In particular, they were asked to run the following commands on a folder containing the type of files they would store in Syncany. Because many of them might already use Dropbox, they were given the option to choose their Dropbox folder.
    wget http://syncany.org/thesis/FileTreeStatCSV.java
    javac FileTreeStatCSV.java
    java FileTreeStatCSV ~/SomeFolder
    Analyzing directory /home/username/SomeFolder ...
    Saving to syncany-size-categories.csv ...
    Saving to syncany-type-categories.csv ...
After running the commands, they were asked to upload the CSV files using an interface on the Syncany Web site. The program generated two CSV files, one containing data about the file types and one representing a histogram of existing file sizes:
Excerpt of syncany-type-categories.csv:
|
Excerpt of syncany-size-categories.csv:
|
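The kind of statistics these two files hold can be illustrated with a minimal Java sketch. It is not the FileTreeStatCSV program used in the pre-study: the class name `FolderStatsSketch`, the helpers `extensionOf` and `sizeBucket`, the power-of-two size buckets, and the CSV column names are all assumptions made for illustration. The sketch walks a folder, counts files per extension, and counts files per size bucket.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Iterator;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Illustrative sketch only; NOT the FileTreeStatCSV program from the pre-study.
final class FolderStatsSketch {

    // Hypothetical helper: lower-cased file extension, or "" if there is none.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot <= 0 || dot == fileName.length() - 1) {
            return ""; // no dot, hidden file such as ".bashrc", or trailing dot
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Hypothetical bucketing: smallest power of two that is >= the file size.
    static long sizeBucket(long sizeInBytes) {
        if (sizeInBytes <= 1) {
            return 1;
        }
        return Long.highestOneBit(sizeInBytes - 1) << 1;
    }

    // Walks the tree below root and counts regular files per type and per size bucket.
    static void collect(Path root, Map<String, Long> byType, Map<Long, Long> bySize)
            throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            Iterator<Path> files = paths.filter(Files::isRegularFile).iterator();
            while (files.hasNext()) {
                Path file = files.next();
                byType.merge(extensionOf(file.getFileName().toString()), 1L, Long::sum);
                bySize.merge(sizeBucket(Files.size(file)), 1L, Long::sum);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        Map<String, Long> byType = new TreeMap<>();
        Map<Long, Long> bySize = new TreeMap<>();
        collect(root, byType, bySize);

        // Column names are assumptions, not the layout of the original CSV files.
        System.out.println("extension,fileCount");
        byType.forEach((ext, count) -> System.out.println(ext + "," + count));
        System.out.println("sizeUpToBytes,fileCount");
        bySize.forEach((bucket, count) -> System.out.println(bucket + "," + count));
    }
}
```

Saved as FolderStatsSketch.java, the sketch can be run with `java FolderStatsSketch.java ~/SomeFolder` on Java 11 or later. It prints both tables to standard output instead of writing CSV files, which keeps it free of side effects on the analyzed folder.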
Hi,
I would love to see an ebook version of your thesis (epub or mobi). Would that be possible?
thanks
@JP: Is there a simple way to compile it from LaTeX format? If there is, I can definitely make one for you. Just let me know :-)
Hi Philipp:
Good morning. Would it be possible to receive a PDF version of your thesis?
cheers
Madhavan