An upcoming feature of OpenZFS (and ZFS on Linux, ZFS on FreeBSD, …) is At-Rest Encryption, a feature that allows you to securely encrypt your ZFS file systems and volumes without having to provide an extra layer of devmappers and such. To give you a brief overview of what the feature can do, I thought I’d write a short post about it.
The current ZFS encryption implementation is not (yet) merged into the upstream repository (as of January 2017). There is a pretty big pull request which is still being reviewed, but because the feature is so incredibly cool (and because my colleague Tom Caputi developed it), I thought a sneak preview was absolutely necessary.
0. This post
This post demonstrates a feature that has not yet been released. For the demos I will focus on ZFS on Linux on an Ubuntu 16.04 based machine. The code I run is available in the official repos (links below) or in my forks (SPL & ZFS, use branch “blogpost” for both).
1. Introduction
At-rest encryption is a new feature in ZFS (zpool set feature@encryption=enabled <pool>) that will automatically encrypt almost all data written to disk using modern authenticated ciphers (AEAD) such as AES-CCM and AES-GCM.
The CLI makes it incredibly easy to enable encryption on a per dataset/volume basis (zfs create -o encryption=on <dataset>). The keys used for encryption can be inherited or set manually for a dataset. Keys can be loaded from different sources (prompt or file) and various input formats are available (raw, hex or passphrase). Keys and key sources can be changed after the dataset/volume creation without re-encrypting the data, because the user-supplied keys are never used to encrypt data directly.
The encryption parameters and key status of a dataset/volume are represented in various properties (encryption=<on|aes-128-gcm|...>, keysource=<raw|hex|passphrase>,<prompt|file>, keystatus=<none|available|unavailable>, pbkdf2iters=<n>).
Many normal ZFS commands are available even if the key of a dataset is not loaded, meaning that administrators can manage the pool without having to know the keys. For instance: a pool can be scrubbed (zpool scrub <pool>) without the keys, and datasets and snapshots can be listed (zfs list -rt). In future releases, zfs send and zfs recv will also work even if the key is not available.
Having built-in support for encryption at the file system level is huge. It means that you no longer have to use dm-crypt if you want to encrypt your data on disk, and you can still manage your pools even if keys are not loaded. Many thanks to Tom Caputi for bringing us this incredible feature.
1.1. What’s encrypted
All important pieces are encrypted (actual data and metadata, ACLs, permissions, directory listings, …), while some things are unencrypted to allow managing pools more easily.
Here’s a listing of what’s encrypted and what’s not:
Encrypted | Not Encrypted |
---|---|
|
|
1.2. Crypto Details
Note: This section describes the nitty gritty crypto details. You can safely skip it if you just want to use the feature.
Crypto concepts are always a bit hard to explain without confusing everyone. Tom has done an excellent job explaining the ZFS encryption crypto concept in his talk and it is visualized very nicely in his slides (PDF; or on Google Drive: original, mirror).
If you don’t have the time to watch the entire talk, let me try to summarize the concepts from one of his slides:
Normal / non-dedup case: Before the plaintext block data (or metadata) is written, it is encrypted using AES (in CCM or GCM mode, depending on the -o encryption=.. property) with a 128/192/256-bit encryption key (default is AES-CCM-256). The 96-bit initialization vector (IV) used for CCM/GCM is randomly generated using the standard Linux PRNG, and it is never reused. The encryption key itself is derived from the encrypted master key (see below) using the key derivation function HKDF. The 64-bit salt used for HKDF is randomly generated (using the above-mentioned PRNG) and stored with the encryption key in a volatile salt cache. The encryption key is reused (for performance reasons) until it goes stale.
Dedup case: If deduplication is enabled, the algorithm behaves slightly differently, because it has to produce the same ciphertext for the same plaintext (given the same master key). To achieve that, the salt and the IV are not randomly generated, but instead a 160-bit HMAC of the plaintext is used: the first 64 bits are used as the salt, the remaining 96 bits are used as the IV. The 256-bit HMAC key is randomly generated (using the above-mentioned PRNG), and stored alongside the master key.
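To make the bit arithmetic of the dedup case concrete, here is a minimal shell sketch that splits a 160-bit value into a 64-bit salt and the remaining 96 bits (160 - 64 = 96), which matches the 96-bit IV size used for CCM/GCM. The digest below is a made-up placeholder, not real HMAC output; the actual HMAC is computed inside ZFS and is not shown here.

```shell
#!/bin/sh
# Sketch only: split a 160-bit value (40 hex characters) into salt and IV.
# The value below is a made-up placeholder, NOT real HMAC output.
digest_hex="00112233445566778899aabbccddeeff00112233"

# 64-bit salt = first 16 hex characters
salt_hex=$(printf '%s' "$digest_hex" | cut -c 1-16)
# remaining 96 bits (IV) = last 24 hex characters
iv_hex=$(printf '%s' "$digest_hex" | cut -c 17-40)

echo "salt (64 bit): $salt_hex"
echo "iv   (96 bit): $iv_hex"
```

Because both values depend only on the plaintext and the HMAC key, identical blocks end up with identical salt and IV, which is what makes deduplication of ciphertext possible.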
Master key: The master key is randomly generated (using above mentioned PRNG) and it is never exposed to the user directly. Instead, the master key is encrypted (with the same cipher and mode with a 256-bit key) using a user provided wrapping key. This wrapping key is provided via a file (as hex or raw, see -o keysource=.. property) or via a password prompt. If a passphrase is supplied by the user, the wrapping key is derived using the password-based key derivation function PBKDF2 (using 100k iterations by default, or whatever you specify in the property -o pbkdf2iters=..).
If you want to know more, I highly suggest watching Tom Caputi’s ZFS encryption talk, or reviewing the slides (PDF; or on Google Drive: original, mirror).
2. Using ZFS encryption
Using the encryption feature is pretty simple. All the relevant commands and properties are described in great detail in the ZFS man page (man zfs), but here’s an excerpt of what you need to know.
2.1. Enabling the feature on the pool
Assuming you have installed a version of ZFS with encryption support (if not, follow the steps in section 3, Compile and install, at your own risk), you need to turn the feature on for your pool. I’ll create a test pool called testpool for this post:
    # Creating a test pool
    $ truncate -s 1G block
    $ zpool create testpool $(pwd)/block

    # Turning on encryption (you NEED a ZFS version that supports this!)
    $ zpool set feature@encryption=enabled testpool
2.2. Creating an encrypted dataset
Once you’ve enabled the feature on the pool, you can create encrypted datasets and volumes. To do that, you need to pass the two properties -o encryption=.. -o keysource=.. to the zfs create command. Depending on your preferences, you may also pass -o pbkdf2iters=..:
    $ zfs create \
        -o encryption=<off | on | aes-128-ccm | aes-192-ccm | aes-256-ccm | aes-128-gcm | aes-192-gcm | aes-256-gcm> \
        -o keysource=<raw | hex | passphrase>,<prompt | file://...> \
        -o pbkdf2iters=<n> \
        <dataset>
The -o encryption=.. property controls the ciphersuite (cipher, key length and mode). The default is aes-256-ccm, which is used if you specify -o encryption=on.
The -o keysource=.. property controls the format in which the key is provided and where it is loaded from. The key can be formatted as raw bytes, as a hex representation or as a user passphrase. It can be read from a file (file://...) or provided via a prompt, which appears when you first create the dataset, when you mount it (zfs mount) or when you load the key manually (zfs key -l). Unless you want to automate things, -o keysource=passphrase,prompt seems like a good option.
The -o pbkdf2iters=.. property is only used if a passphrase is used (-o keysource=passphrase,..). It controls the number of PBKDF2 iterations. Higher is better, as it slows down potential dictionary attacks on the passphrase. The default is -o pbkdf2iters=100000.
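If you script dataset creation, it can help to sanity-check the property values before handing them to zfs create. The following is a minimal sketch, assuming only the values listed in this post are valid. The function names valid_encryption and valid_keysource are made up for illustration, and the check is purely syntactic (it does not verify that a key file exists).

```shell
#!/bin/sh
# Sketch: syntactic checks for the property values described in this post.
# Function names are hypothetical; accepted values are taken from the post.

valid_encryption() {
    case "$1" in
        off|on) return 0 ;;
        aes-128-ccm|aes-192-ccm|aes-256-ccm) return 0 ;;
        aes-128-gcm|aes-192-gcm|aes-256-gcm) return 0 ;;
        *) return 1 ;;
    esac
}

valid_keysource() {
    # Expected shape: <raw|hex|passphrase>,<prompt|file://...>
    case "$1" in
        raw,prompt|hex,prompt|passphrase,prompt) return 0 ;;
        raw,file://?*|hex,file://?*|passphrase,file://?*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: both values are well-formed, so this prints a confirmation.
if valid_encryption aes-128-gcm && valid_keysource hex,file:///dev/shm/enc2key; then
    echo "property values look well-formed"
fi
```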
Here are a few examples of how to create encrypted datasets and volumes (ZVOLs):
Creating an encrypted dataset, using the defaults:
    $ zfs create \
        -o encryption=on \
        -o keysource=passphrase,prompt \
        testpool/enc1

    # This will ask you to enter/confirm a password.
Creating an encrypted child dataset, which inherits all parameters and keys from its parent:
    $ zfs create testpool/enc1/encinherit
Creating an encrypted dataset, using AES/GCM with 128-bit encryption keys and a wrapping key loaded from a file (encoded as hex):
    $ echo 0000111122223333444455556666777788889999AAAABBBBCCCCDDDDEEEEFFFF > /dev/shm/enc2key
    $ zfs create \
        -o encryption=aes-128-gcm \
        -o keysource=hex,file:///dev/shm/enc2key \
        testpool/enc2
    # No prompts for a password!
Creating an encrypted dataset, using AES/GCM with a 256-bit key loaded from a file (not encoded):
    $ head -c 32 /dev/urandom > /dev/shm/enc3key
    $ zfs create \
        -o encryption=aes-256-gcm \
        -o keysource=raw,file:///dev/shm/enc3key \
        testpool/enc3
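The hex example above uses a fixed, easily guessable string. For anything beyond a demo you would want a random key, so here is a minimal sketch that generates 32 random bytes and stores them hex-encoded (64 hex characters, the same length as the example key). The temporary file location is an assumption; pick whatever path suits your setup.

```shell
#!/bin/sh
# Sketch: generate a random 256-bit key and store it hex-encoded.
# The temporary path is an assumption, not something ZFS requires.
umask 077                                   # key file readable by owner only
keyfile=$(mktemp)

# 32 random bytes -> 64 hex characters, with spaces and newlines stripped
head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n' > "$keyfile"
echo >> "$keyfile"                          # trailing newline, as `echo` produces in the example above

# Verify that the file holds only hex digits
key=$(cat "$keyfile")
case "$key" in
    *[!0-9a-f]*) echo "unexpected characters in key" >&2 ;;
esac
echo "wrote ${#key} hex characters to $keyfile"
```

The resulting file could then be referenced with -o keysource=hex,file://<path>, in the same way as /dev/shm/enc2key above.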
Creating an encrypted ZFS volume (ZVOL), using the defaults with one million PBKDF2 rounds:
    $ zfs create \
        -V 10M \
        -o encryption=on \
        -o keysource=passphrase,prompt \
        -o pbkdf2iters=1000000 \
        testpool/enc4

    # This will ask you to enter/confirm a password.
2.3. Reading the encryption properties
Once you’ve created a dataset or volume, you can query its encryption properties like you normally would:
    $ zfs get -p encryption,keystatus,keysource,pbkdf2iters
    testpool/enc1             encryption   aes-256-ccm                  local
    testpool/enc1             keystatus    available                    -
    testpool/enc1             keysource    passphrase,prompt            local
    testpool/enc1             pbkdf2iters  100000                       local
    testpool/enc1/encinherit  encryption   aes-256-ccm                  inherited from testpool/enc1
    testpool/enc1/encinherit  keystatus    available                    -
    testpool/enc1/encinherit  keysource    passphrase,prompt            inherited from testpool/enc1
    testpool/enc1/encinherit  pbkdf2iters  100000                       inherited from testpool/enc1
    testpool/enc2             encryption   aes-128-gcm                  local
    testpool/enc2             keystatus    available                    -
    testpool/enc2             keysource    hex,file:///dev/shm/enc2key  local
    testpool/enc2             pbkdf2iters  1000000                      local
    testpool/enc3             encryption   aes-256-gcm                  local
    testpool/enc3             keystatus    available                    -
    testpool/enc3             keysource    raw,file:///dev/shm/enc3key  local
    testpool/enc3             pbkdf2iters  0                            -
    testpool/enc4             encryption   aes-256-ccm                  local
    testpool/enc4             keystatus    available                    -
    testpool/enc4             keysource    passphrase,prompt            local
    testpool/enc4             pbkdf2iters  1000000                      local
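Output in this shape is easy to post-process. As a minimal sketch, the hypothetical helper datasets_with_keystatus below reads rows of name, property, value and source on stdin and prints the datasets whose keystatus matches a given value. The sample rows are copied from the output above.

```shell
#!/bin/sh
# Sketch: list datasets whose keystatus matches a given value.
# Reads `zfs get` style rows (name, property, value, source) on stdin.
# `datasets_with_keystatus` is a hypothetical helper name.
datasets_with_keystatus() {
    awk -v want="$1" '$2 == "keystatus" && $3 == want { print $1 }'
}

# Example using rows copied from the output above:
datasets_with_keystatus available <<'EOF'
testpool/enc1 encryption aes-256-ccm local
testpool/enc1 keystatus available -
testpool/enc2 keystatus available -
testpool/enc2 keysource hex,file:///dev/shm/enc2key local
EOF
```

On a live system the input would presumably come from something like zfs get -H -r keystatus testpool, and asking for unavailable instead of available would list the datasets that still need their keys loaded.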
2.4. Importing a pool, mounting datasets and loading keys
If a dataset is encrypted, the read-only property keystatus represents the status of the key, and thereby also whether the dataset can be used (mounted, written to, read from …). It can be either none (unencrypted dataset), available (the key is loaded) or unavailable (the key is not loaded).
When a pool is imported using zpool import, encrypted datasets are left unmounted, because their keys are not automatically loaded. Only if the -l option is passed will the keys be loaded and the encrypted datasets mounted (where the key is actually available):
    $ zpool import testpool -d . -l
    Enter passphrase for 'testpool/enc1': (enter password)
    Key load error: Failed to open key material file        # << Key file does not exist!
    Enter passphrase for 'testpool/enc4': (enter password)
Instead of using zpool import -l ..., you can manually load the keys for individual datasets and volumes using zfs key -l:
    $ zfs key -l testpool/enc4
    Enter passphrase for 'testpool/enc4': (enter password)

    $ zfs mount testpool/enc4
    # Does not prompt. Just mounts!
If zfs mount is called on an encrypted dataset whose key is unavailable, it will prompt you:
    $ zfs mount testpool/enc4
    Enter passphrase for 'testpool/enc4': (enter password)
Unloading a key of a mounted dataset won’t work, because it’s still in use. The dataset has to be unmounted first:
    # Does not work because dataset is mounted
    $ zfs key -u testpool/enc4
    Key unload error: Dataset is busy.

    # Works like a charm
    $ zfs umount testpool/enc4
    $ zfs key -u testpool/enc4
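Since the order matters (unmount first, then unload the key), it may be convenient to wrap both steps in a small helper. The following is a minimal sketch using the zfs key -u syntax from this post. The names run and lock_dataset are made up, and DRY_RUN defaults to 1 so that the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch: unmount a dataset and then unload its key, in that order.
# DRY_RUN defaults to 1, so commands are only printed, never executed.
# `run` and `lock_dataset` are hypothetical helper names.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

lock_dataset() {
    # The key can only be unloaded once the dataset is no longer mounted.
    run zfs umount "$1" && run zfs key -u "$1"
}

lock_dataset testpool/enc4
```

With DRY_RUN=0 the helper would actually run both commands, and the && ensures the key unload is only attempted if the unmount succeeded.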
That’s essentially all the magic. If you want to know more, I suggest reading the ZFS man page (man zfs).
3. Compile and install
If you want to try the current implementation (before it is released), here are a few steps to compile and install it yourself on an Ubuntu 16.04-based system. Other systems will be very similar, but not identical. You can consult the Building ZFS wiki page for details.
Warning: Please be sure to only perform these steps on a test machine or a throw-away VM, because this will replace the ZFS kernel modules.
First, install all the build dependencies:
    apt install libtool zlib1g-dev attr uuid-dev libblkid-dev libattr1-dev autoconf
Once that’s done, compile and install SPL using the steps below. All SPL changes relevant to ZFS encryption have been merged upstream (as of January 2017), so this should “just work”. If it doesn’t (e.g. because the code has changed; you may be reading this in the future …), you may want to use my forked version instead (see SPL and ZFS, use branch “blogpost” for both):
    # Compile and install SPL
    #
    # If this does not work anymore, use my forked version instead:
    # $ git clone https://github.com/binwiederhier/spl.git
    # $ git checkout blogpost
    #
    git clone https://github.com/zfsonlinux/spl.git
    cd spl
    ./autogen.sh
    ./configure
    make
    make install
    cd ..
Next, compile and install ZFS using Tom’s fork. If the ZFS encryption pull request has been merged, you may just want to use the upstream master branch:
    # Compile and install ZFS (Tom's fork)
    #
    # If this does not work anymore, use my forked version instead:
    # $ git clone https://github.com/binwiederhier/zfs.git
    # $ git checkout blogpost
    #
    git clone https://github.com/tcaputi/zfs.git
    cd zfs
    ./autogen.sh
    ./configure --prefix=/usr      # << Don't forget the --prefix
    make
    make install
    cd ..
Now the ZFS modules should be built in /lib/modules/$(uname -r)/extra, so all you have to do is load them. Be sure to remove the old modules first:
    # Remove existing modules (repeat until successful!)
    modprobe -r splat
    modprobe -r zavl
    modprobe -r zcommon
    modprobe -r zunicode
    modprobe -r znvpair
    modprobe -r icp
    modprobe -r spl
    modprobe -r zfs

    # Insert newly compiled modules (all of these must succeed!)
    cd /lib/modules/$(uname -r)/extra
    insmod avl/zavl.ko
    insmod unicode/zunicode.ko
    insmod spl/spl.ko
    insmod nvpair/znvpair.ko
    insmod zcommon/zcommon.ko
    insmod icp/icp.ko
    insmod zfs/zfs.ko
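To double-check that all seven modules inserted above are really loaded, a small filter over lsmod output can help. This is a minimal sketch; missing_modules is a made-up helper name, and it assumes the usual lsmod layout with a header row followed by one module name per line in the first column.

```shell
#!/bin/sh
# Sketch: report which of the modules inserted above are missing.
# Reads `lsmod`-style text on stdin; `missing_modules` is a hypothetical helper name.
missing_modules() {
    loaded=$(awk 'NR > 1 { print $1 }')     # skip the header row, keep module names
    for mod in zavl zunicode spl znvpair zcommon icp zfs; do
        printf '%s\n' "$loaded" | grep -qx "$mod" || echo "$mod"
    done
}

# Usage on a live system (prints nothing if everything is loaded):
#   lsmod | missing_modules
```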
If that succeeded, be sure to yell “hooray” before you move on, because these steps took me a while to get right when I did it for the first time.
You can now use the feature (as described above).
4. FAQ
4.1. Is deduplication supported?
Yes, it is supported. However, there is some information that is leaked due to the nature of deduplication. See more in Tom’s talk.
4.2. Can I change the password? Will data be re-encrypted?
Yes, the password can be changed. The data does not get re-encrypted, because the password is merely used to decrypt a master key.
4.3. Does it work with TPM?
No, not yet.
5. Links
- OpenZFS Roadmap (with ZFS encryption on it)
- Hacker News discussion about ZFS encryption
- GitHub pull request for ZFS encryption (ZFS on Linux)
- GitHub pull request for ZFS encryption (OpenZFS)
- Talk at OpenZFS Developer Summit by Tom Caputi about ZFS encryption (video)
- Instructions on how to compile / build ZFS
- Slides used for the talk at the OpenZFS Developer Summit (PDF; or on Google Drive: original, mirror)
1. Thanks for the guide – however you forgot a ‘make’ in the installation instructions for ZFS, right before ‘make install’
2. Have you already tried out booting from an encryption-enabled pool with grub? Not sure if I’m doing anything wrong, but grub refuses to detect ZFS as my root filesystem :(
Woaaa, thanks for this write-up.
Can you tell something about PAM integration? It would be so nice to automatically decrypt/encrypt /home/username at login/logout.
I don’t think that a PAM module is part of this at all. It’d be nice to have one though. You could write one :-)
Great post. Will the encryption features be available on existing pools after the next zfs release? Being able to do a zpool upgrade and then turn this feature on for an existing dataset or zvol would obviously be great.
Yes, you’ll be able to enable it for existing pools.
Hello,
Great post! Thanks a lot. I am really curious to get my hands on this. I completed the compilation guide and the installation without problems (no errors etc.). All modules were loaded successfully, but when I try to create a new volume I get an “invalid property ‘keysource'” error. It’s as if I am running ZFS without encryption support. Any ideas?
Hi Philipp,
Great post, do you know how far the upstream process is? – I find it hard to figure out myself
thx
Tom tells me that it is approved and scheduled to be merged in after the next release of ZFS, so it’ll be released with the release after next.
Svetlin: try this
zfs create -o encryption=on -o keylocation=prompt -o keyformat=passphrase testpool/enc1
(you must have compiled Tom’s latest version and not Phillip’s)
Above you mentioned encryption should be merged in the next release or the one after, which I believe we are at. I’m running Arch Linux with the latest kernel and ZFS package, zfs-linux 0.7.0_4.12.3_1-2 but I don’t think encryption is included yet?
Do you have any idea when we might see this implemented?
I also wanted to ask your opinion on something if you don’t mind. I am looking into syncing backups with a friend over the internet with zfs send/receive. We don’t want to be able to see each others’ data. In your opinion, what would be the best way with zfs send/receive? I was thinking dm-crypt or native with zfs encryption? I’m leaning towards native because it would be the least hassle. Or maybe you have a better idea? Thanks!!
A little birdy told me that it will be merged within the next few days. Officially it’ll be released for 0.7.1 but once it’s merged I’d say you can start using it.
For your use case, I’d say definitely go with native zfs encryption. Everything else is a hassle. dmcrypt does work, but you’ll have to nest filesystems and loop devices, which is annoying…
I see 0.7.1 has been released, but no encryption yet? – any ETA. for final release
Thanks
Oh, i was waiting for the encryption feature…
Can I still follow the instructions above to use encryption?
When will it be merged ?
Merged!
Is it possible to upgrade an existing pool with encryption? Supposing you could do that, what happens to remote snapshots? Can you use an incremental send to update those, or will you have to rebuild it entirely from scratch.
Is the final merged version compatible with the version in your fork?
I can’t for the love of me seem to get the merged version to work. Your fork however works fine.
I can’t seem to get this to build on Ubuntu 16. Seems like kernel modules are disabled?
checking whether modules can be built… no
configure: error: *** Unable to build an empty module.
Any ideas?
Thanks for this very informative article.
I wanted to mention the build process outlined in this article seems like it doesn’t work anymore (at least I couldn’t get it to work), however building from the ZFS on Linux main repository works fine now and has the encryption feature. Some of the zfs command syntax is different than listed here too.
I’ve put together some build instructions here: https://datacenteroverlords.com/2017/12/17/zfs-on-linux-with-encryption-part-2/
I hope people find it useful.
Mailing with Tom Caputi, I learned that the current git master does indeed work with Ubuntu 16.04 if you follow these instructions. SPL does not need to be built anymore for that:
1. add or change /etc/depmod.d/ubuntu.conf to ‘search extra updates ubuntu built-in’
2. compile only the zfsonlinux git master as described above, with “configure --prefix=/usr”
3. run ‘depmod -a; modprobe -r zfs; modprobe zfs’ to insert the new zfs module.
Smooth Sailin’
Sadly, Debian (buster) still does not have encryption implemented in ZFS as of today!
ZFS upstream has not released a version with ZFS encryption yet, so none of the distros have it.
zfs native encryption is part of 0.8 zfs on linux, see: https://github.com/zfsonlinux/zfs/releases/tag/zfs-0.8.0-rc1
Oh nice; I hadn’t seen that 0.8 was released yet. Well, almost, since this is an RC still. Still, good to know that it’s finally out :-)
With the new zfs native encryption in 0.8 the creation of encrypted datasets works a bit different than described above – maybe you can correct this, Philipp. More specifically the option keysource has been split into the two options keyformat and keylocation:
zfs create -o encryption=aes-256-gcm -o keyformat=raw -o keylocation=file:///dev/keys/my_key_file my_pool/my_dataset
Is there a way to get this to prompt for a key automatically on boot or will I need to modify the zfs-mount service file?
Why are the words ‘root’ and ‘filesystem’ not found at all in this article?
So is ZFS cryptography safe if ZFS is the rootfs?
Thanks :)
I’ve never tried it as rootfs. So I can’t tell you if it’ll work or not.
zfs-0.8.0
@behlendorf behlendorf released this a day ago · 15 commits to master since this release
New Features
Native encryption #5769
!!
Thanks, after a lot of searching for what is admittedly still somewhat adventurous territory on linux (until Ubuntu 19.10 lands in post-beta anyway) this guide sheds a lot of light on the mechanics of handling encryption and keys.
Kudos!
Thanks for writing this article. ZFS native encryption is working great on FreeBSD 13 in my home environment.
Also, it was interesting to follow the progress of this implementation: https://blog.esp0x31.io/zfs-encrypted-backups/#some-interesting-milestones