Version control systems like CVS or Subversion are designed to keep track of the changes to a project and to make it possible to revert to old revisions if something goes wrong. In contrast to regular relational databases, these systems are made only for adding new content to a repository, not for removing data from it. In fact, SVN has no built-in way to delete old content; doing so usually requires removing entire revisions from the repository or even creating a new one.
But what happens if you accidentally commit a password or other sensitive information to a repository? This post explains how to remove this confidential data permanently from the repository by simply overwriting it in old revisions, i.e. without having to remove revisions or create a new repository.
1. Introduction
1.1. Disclaimer
The following actions might lead to data loss. I am not responsible for anything that goes wrong because of my description.
1.2. Requirements
It is absolutely necessary to have root access to the SVN repository. That means not only access through the svnadmin command, but full command line access to the files, particularly to the “repos” directory.
If you do not have root access to the repository, you cannot remove any data from the repository! In that case, contact your SVN administrator.
1.3. Example Scenario
For this example, let’s assume you accidentally committed the file config.cfg with a plain text password 123secret a while ago (in revision 12). The repository is currently at revision 25 and you just realized that the password was in there all the time:
    # Config file "config.cfg"
    username = someone
    password = 123secret
    ...
2. Local machine: Identify the affected revisions in the working copy
2.1. Fix and commit the affected file
The following commands are performed on your local machine within the working copy of the project, i.e. on the client machine.
Before we start tinkering and forging the SVN history and its repository, first fix the affected file and commit a new revision to the repository. In most cases, people are not going to look in old revisions of a config file, so the faster you commit a new version, the less likely it is that someone sees it!
    cd ~/Dev/yourproject
    vi config.cfg
    # Change password to something else
    svn commit -m "config update"
    ...
    Transmitting file data .
    Committed revision 26.
2.2. Identify the affected file versions locally
In most cases you will probably realize right away that you just committed something confidential to the SVN repository. In this case, you only have to fix one single version of that file and it is pretty clear which revision is affected.
In other cases, however, the affected file might be in the repository for many revisions before you realize it. If this is the case, there might be multiple revisions of the file in the repository and each of these versions needs to be fixed. To identify the possibly affected versions of the file, you can peek into the logs:
    svn log config.cfg
    ------------------------------------------------------------------------
    r22 | someone | 2011-01-25 01:36:12 +0100 (Tue, 25 Jan 2011) | 1 line

    update xy config
    ------------------------------------------------------------------------
    r12 | someone | 2011-01-05 00:45:19 +0100 (Wed, 05 Jan 2011) | 1 line

    added connection details to config
    ------------------------------------------------------------------------
    ...
In this case, the file has been altered in the two revisions 12 and 22. Both might include the password and are stored in the repository, i.e. both potentially need to be corrected.
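To check which of the logged revisions really contain the secret, the log output can be parsed and each revision inspected with `svn cat`. The following is only a sketch: the function names `revisions_from_log` and `revisions_containing` are made up for this example, and it assumes it is run inside the working copy.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the revision numbers found in `svn log`
# output read from stdin (lines such as "r22 | someone | ...").
revisions_from_log() {
    sed -n 's/^r\([0-9][0-9]*\) | .*/\1/p'
}

# Hypothetical helper: list the revisions of FILE whose content contains
# SECRET. Must be run inside the working copy.
revisions_containing() {
    local file=$1 secret=$2 rev
    svn log -q "$file" | revisions_from_log | while read -r rev; do
        # </dev/null keeps svn from consuming the list of revisions on stdin.
        if svn cat -r "$rev" "$file" </dev/null | grep -qF -- "$secret"; then
            echo "r$rev"
        fi
    done
}

# Usage (reads the secret without echoing it or storing it in the history):
#   read -rs -p 'Secret: ' secret; echo
#   revisions_containing config.cfg "$secret"
```

Reading the secret with `read -rs` instead of typing it as an argument avoids leaving it in the shell history.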
2.3. Get MD5 checksums of the affected versions
SVN ensures the integrity of its repository by saving MD5 checksums of all the files and their versions. Since it is now clear which revisions might be affected, you need to get the current checksums of these file versions and calculate checksums for the new corrected (“forged”) versions. In short, you need to do the following for each affected version:
- Retrieve the version and calculate its MD5 checksum
- Make a copy of the file, replace the confidential information with “x”s and calculate the MD5 checksum of the new file.
- Remember or copy all the checksums and versions into a file.
In this example, we’ll have to get the checksums for revisions 12 and 22 of the config.cfg file. The code below only shows what to do for revision 22; revision 12 is analogous:
First, get the current checksum of revision 22:
    svn --revision 22 update config.cfg
    ...
    At revision 22.
Find the checksum using the md5sum utility:
    md5sum config.cfg
    0e28c6c8342649c290400567130f657b  config.cfg
Copy the ‘wrong’ config file and correct the new file using vi:
    cp config.cfg /tmp/config.cfg-22
    vi /tmp/config.cfg-22
    # Overwrite the password with "xxxxxxxxx" (same length as the old password!!)
Then get the new checksum:
    md5sum /tmp/config.cfg-22
    459a78e2eae02b28f810f9fdebdc5b52  /tmp/config.cfg-22
Then repeat this for revision 12.
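The copy, mask and checksum steps can also be scripted, which avoids accidentally changing the file length in an editor. This is a minimal sketch, assuming GNU coreutils and an ASCII secret; the helper name `mask_secret` and the temporary file names are invented for the example, and the demo uses a stand-in config file rather than a real checkout.

```shell
#!/usr/bin/env bash

# Hypothetical helper: write a copy of FILE to OUT in which every occurrence
# of SECRET is overwritten by the same number of "x" characters.
mask_secret() {
    local file=$1 secret=$2 out=$3 mask escaped
    # Build a string of "x" with the same length as the secret.
    mask=$(printf '%*s' "${#secret}" '' | tr ' ' 'x')
    # Escape regex metacharacters so sed matches the secret literally.
    escaped=$(printf '%s' "$secret" | sed 's|[][\.*^$/]|\\&|g')
    sed "s/${escaped}/${mask}/g" "$file" > "$out"
}

# Demo on a stand-in for revision 22 of config.cfg.
workdir=$(mktemp -d)
printf '# Config file "config.cfg"\nusername = someone\npassword = 123secret\n' \
    > "$workdir/config.cfg"

mask_secret "$workdir/config.cfg" 123secret "$workdir/config.cfg-22"

# Old and new checksum, to be noted down for the later steps.
md5sum "$workdir/config.cfg" "$workdir/config.cfg-22"

# Both files must have exactly the same size.
wc -c "$workdir/config.cfg" "$workdir/config.cfg-22"
```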
3. SVN repository: Correct the affected files
In this step, we finally start altering the repository. All the actions are performed on the server machine as root user inside the actual SVN repository directory, so be sure not to confuse it with your local machine.
3.1. Make a repository backup
Creating some sort of backup is crucial, since we are about to change the binary revision files of the Subversion repository. The easiest way to do this is to back up the whole repository folder of your project, e.g. /path/to/svn/repos/yourproject. However, if its total size is too big, you can also choose to back up only the files identified in 3.2.
    # Make backup of project; Note the "-a" parameter to keep the permissions.
    mkdir /backups
    cp -a /path/to/svn/repos/yourproject /backups
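Since the backup is the only way back if the edit goes wrong, it may be worth comparing it against the original right after copying. A small sketch, where `backup_repo` is a hypothetical helper name and the paths in the usage comment are the placeholder paths from above:

```shell
#!/usr/bin/env bash

# Hypothetical helper: copy the repository directory REPO into DEST
# (keeping permissions) and verify that the copy is identical.
backup_repo() {
    local repo=$1 dest=$2
    mkdir -p "$dest"
    cp -a "$repo" "$dest"/ &&
        # diff -qr exits non-zero if any file differs or is missing.
        diff -qr "$repo" "$dest/$(basename "$repo")"
}

# Usage:
#   backup_repo /path/to/svn/repos/yourproject /backups
```

No commits should reach the repository while the copy is running, otherwise the comparison may report differences.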
3.2. Verify affected versions
After the backup, we need to verify that we really need to change all the versions we identified earlier. To do that, navigate to the “revs” folder inside the repository and grep for the password:
    cd /path/to/svn/repos/yourproject/db/revs/0
    grep 123secret *
    Binary file 12 matches
    Binary file 22 matches
The matching files are the revisions that contain the password, and hence also the files that need to be “corrected”. Note that sometimes not all the versions identified through the “svn log” command appear in this list. That happens when the file was only moved, or when only other parts of it were altered: in those cases the line with the password is not stored again in that revision file.
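The search can be wrapped in a small function that looks through all sub-directories of “revs” and takes the password from a prompt instead of the command line. This is a sketch only; `find_revs_with_secret` is a made-up name, and it assumes the revision files are stored unpacked and the password appears in them as plain bytes, as in the grep output above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print all files below REVS_DIR that contain SECRET.
find_revs_with_secret() {
    local revs_dir=$1 secret=$2
    # -r: descend into sub-directories, -l: print only the file names,
    # -F: treat the secret as a fixed string, not a regular expression.
    grep -rlF -- "$secret" "$revs_dir" | sort
}

# Usage (the secret is read from a prompt, so it never ends up in the
# shell history):
#   read -rs -p 'Secret: ' secret; echo
#   find_revs_with_secret /path/to/svn/repos/yourproject/db/revs "$secret"
```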
3.3. Replace the password and checksums
Since the SVN revision files are binary, we need a hex editor to edit them. Hence install hexedit, and then simply replace the password and checksums as identified before:
    apt-get install hexedit
    hexedit 22

    # Hex editor opens the file for revision 22
    # Replace passwords and checksums
    # Repeat this for revision 12
Hexedit is not the easiest editor to use. So here is a step-by-step of what you need to do:
- Hit TAB, then CTRL-S to search
- Enter the password 123secret and hit return
- Overwrite the password with xxxxxxxxx (same length!)
- Hit CTRL-W to save (CTRL-S is the search key in hexedit)
- Repeat the previous four steps for each occurrence of the password.
- Do the same for the old checksum “0e28c6c8342649c290400567130f657b”, and replace it with the new one calculated in 2.3, “459a78e2eae02b28f810f9fdebdc5b52”
- Hit CTRL-X to quit
- Repeat this for all affected revisions
That’s the complete magic. If checked out, the revisions 12 and 22 (and of course also their succeeding versions) will show xxxxxxxxx instead of the initially committed password.
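As a non-interactive alternative to the hexedit session, the same-length overwrite can be sketched with `grep` and `dd`. This is not the method used above, just one possible way to script it; it assumes GNU grep and GNU dd, ASCII search strings, and the function name `replace_same_length` is invented here. Run it only on files you have backed up.

```shell
#!/usr/bin/env bash

# Hypothetical helper: overwrite every occurrence of OLD in the binary FILE
# with NEW, in place. OLD and NEW must have the same length so that no byte
# offset inside the revision file changes.
replace_same_length() {
    local file=$1 old=$2 new=$3 offsets offset
    if [ "${#old}" -ne "${#new}" ]; then
        echo "replacement must have the same length as the original" >&2
        return 1
    fi
    # -a: treat binary as text, -b -o: print the byte offset of each match,
    # -F: fixed string. The offsets are collected before anything is written.
    offsets=$(LC_ALL=C grep -aboF -- "$old" "$file" | cut -d: -f1)
    for offset in $offsets; do
        # conv=notrunc overwrites bytes at the offset without truncating.
        printf '%s' "$new" |
            dd of="$file" bs=1 seek="$offset" conv=notrunc status=none
    done
}

# Usage, inside the "revs" directory, with the values from this example:
#   replace_same_length 22 123secret xxxxxxxxx
#   replace_same_length 22 0e28c6c8342649c290400567130f657b \
#                          459a78e2eae02b28f810f9fdebdc5b52
```

The length check is the important part of the design: the function refuses to run if the replacement would shift any bytes.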
4. Test locally
Now test locally if you can switch between revisions and everything works without error messages:
    svn --revision 12 update
    ...
    At revision 12.
    grep password config.cfg
    password = xxxxxxxxx
If you did everything as the tutorial says, you shouldn’t get any errors. If you forgot to replace checksums or you changed something that you weren’t supposed to change in the SVN revision file, you might get an error like below. However, if that happens, you can always go back to your backup and try it again…
    svn --revision 12 update
    svn: Checksum mismatch while reading representation:
    expected: f85abfd8b63fa7ab68abc9364f2d339e
    actual: de6f581d115197baebc43c3975b9e396
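In addition to the client-side test, the repository can be checked on the server with `svnadmin verify`, which reads through the stored revisions and reports corruption. A minimal sketch; the wrapper name `verify_repo` is made up and the path in the usage comment is the placeholder used earlier.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper: verify the repository at REPO and report the result.
verify_repo() {
    local repo=$1
    if svnadmin verify --quiet "$repo"; then
        echo "repository OK: $repo"
    else
        echo "verification FAILED: $repo (consider restoring the backup)" >&2
        return 1
    fi
}

# Usage (on the server):
#   verify_repo /path/to/svn/repos/yourproject
```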
5. Bash history cleanup
In step 3.2 we typed the plain text password into bash. As you might know, this leaves traces in the ~/.bash_history file. Delete them by opening the file and removing the corresponding lines. Make sure that you do not use the search function of Vim, since Vim keeps a search history of its own. If you do, also delete the history of Vim in ~/.viminfo.
    vi ~/.bash_history
    # Remove everything that contains the password
    # Do NOT use the search function, but search manually!
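If editing the history by hand is too error-prone, the affected lines can also be filtered out with a few commands. This is a sketch under the assumption that the history file is a plain text file with one command per line; `scrub_history` is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# Hypothetical helper: remove every line containing SECRET from HISTFILE.
scrub_history() {
    local histfile=$1 secret=$2 tmp status
    tmp=$(mktemp)   # mktemp creates the file readable by the owner only
    grep -vF -- "$secret" "$histfile" > "$tmp"
    status=$?
    # grep exits with 1 when no line is left over; only >1 is a real error.
    if [ "$status" -gt 1 ]; then
        rm -f "$tmp"
        return 1
    fi
    # Write back through redirection so the file keeps its permissions.
    cat "$tmp" > "$histfile"
    rm -f "$tmp"
}

# Usage (again reading the secret from a prompt):
#   read -rs -p 'Secret: ' secret; echo
#   scrub_history ~/.bash_history "$secret"
```

Keep in mind that a running bash session holds its history in memory and writes it to the file on exit, so the session in which the password was typed has to be cleaned as well, for example with `history -c`.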
Hint: If you don’t want a command to be stored in the .bash_history file, simply insert a leading space. This only works if the HISTCONTROL variable contains ignorespace or ignoreboth.
-rw-r--r-- 1 www-data www-data 11741494623 Mar 26 10:06 /home/repos/db/revs/121129
This revision file is 11 GB. None of the files or directories in this revision are required. How do I get rid of it?
Can I truncate it like this, or will that affect the revisions created after it?
>/home/repos/db/revs/121129
My current revision is 121406.
Please let me know how to get rid of this revision, as it is affecting my backups and maintenance.
thanks in advance.