Using Let’s Encrypt for internal servers
Let’s Encrypt is a revolutionary new certificate authority that provides free certificates in a completely automated process. These certificates are issued via the ACME protocol. Over the last 2 years or so, the Internet has widely adopted Let’s Encrypt — over 50% of the web’s SSL/TLS certificates are now issued by Let’s Encrypt.
But while there are many tools to automatically renew certificates for publicly available webservers (certbot, simp_le, I wrote about how to do that 3 years back), it’s hard to find any useful information about how to issue certificates for internal, non-Internet-facing servers and/or devices with Let’s Encrypt.
This blog post describes how to issue Let’s Encrypt certificates for internal servers. At my work, we issued a certificate for each of our 90,000+ BCDR appliances using this exact mechanism.
Hello Hacker News, first time on the HN front page! I feel honored, yeyy! I responded to all of the concerns in the comments section.
If you’re looking for an implementation of this idea, you may find localtls interesting. I have not tested it myself, but it seems to do similar things to what I am describing here.
1. How does it work?
To issue a certificate through Let’s Encrypt, you must prove that you either own the website you want to issue the certificate for, or that you own the domain it runs on. Typically, automated tools like certbot use the HTTP challenge to prove site ownership using the .well-known directory. While this works beautifully if the site is Internet-facing (and Let’s Encrypt can verify the HTTP challenge files via a simple HTTP request), it doesn’t work if your server runs on 10.1.1.4 or any other internal address.
The DNS challenge solves this problem by letting you prove domain ownership through the DNS TXT record _acme-challenge.example.com. Let’s Encrypt will verify that the record matches what it expects and issue your certificate if it all adds up.
So really the magic ingredients to issuing certificates for internal, non-Internet-facing machines are:
- A dedicated DNS zone for all your internal devices (here: example.com, with one record per device, e.g. xi8qz.example.com), and a dynamic DNS server to manage this zone
- An ACME client capable of using Let’s Encrypt’s DNS challenge to prove domain ownership
2. Example: An internal server 10.1.1.4, aka. xi8qz.example.com
The following diagram shows how we have implemented our Let’s Encrypt integration for our backup appliances. Each appliance (read: internal server) is behind a NAT and carries its own local IP address.
The general approach is simple: The appliance regularly reaches out to our control server to ensure that it can be reached via its own subdomain. If its local IP address changes, it triggers an update of its own subdomain. In addition, it checks regularly if the certificate is still valid, and requests a renewal if it’s outdated.
Here’s a bit more detail to this process:
For this example, let’s assume we’re trying to issue a certificate for an appliance with the identifier xi8qz and the local IP address 10.1.1.4. From the perspective of this appliance, there are two requests to be made:
- Steps 1-3: First, it needs to set/update its own DNS domain (here: xi8qz.example.com). This domain will later be used as a common name (CN) in the certificate. On top of that, it needs to make sure that this record is updated every time the server’s IP address changes.
- Steps 4-14: It needs to regularly check if the local certificate needs to be renewed and request a renewal if it’s time. If there is no certificate yet, that counts as needing a “renewal”, too.
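The two checks above can be sketched as a small decision function on the appliance. This is only an illustration, not the code we run: the names `ApplianceState` and `plan_actions`, the action strings, and the 30-day renewal window are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional

# Assumption: renew 30 days before expiry. The article does not state the window.
RENEW_BEFORE = timedelta(days=30)


@dataclass
class ApplianceState:
    local_ip: str                        # current local address, e.g. "10.1.1.4"
    published_ip: Optional[str]          # address currently in the A record, if any
    cert_not_after: Optional[datetime]   # expiry of the installed certificate, if any


def plan_actions(state: ApplianceState, now: datetime) -> List[str]:
    """Return the requests the appliance should send to the control server."""
    actions = []
    # Steps 1-3: keep the A record in sync with the local IP address.
    if state.published_ip != state.local_ip:
        actions.append("update_dns")
    # Steps 4-14: there is no certificate yet, or it expires soon.
    if state.cert_not_after is None or state.cert_not_after - now <= RENEW_BEFORE:
        actions.append("request_certificate")
    return actions


# A brand-new appliance has neither a DNS record nor a certificate.
fresh = ApplianceState(local_ip="10.1.1.4", published_ip=None, cert_not_after=None)
print(plan_actions(fresh, datetime.now(timezone.utc)))
```

A fresh appliance ends up with both actions, while one whose record is current and whose certificate is still valid for months ends up with none.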
Let’s now examine these steps in greater detail.
2.1. Prerequisites: Assigning a domain for each machine (steps 1-3)
As mentioned above, we need to give each appliance a proper domain name in order to be able to prove ownership to Let’s Encrypt, so we need to buy a domain (here: example.com) and delegate its NS records to our DDNS server:
    $ dig +short NS example.com
    ddns1.mycompany.com.
On top of that, we need the ability to dynamically add and remove records from it (via an API of some sort). I’ve previously written about how to spin up your own DDNS server, if you are interested.
Once that’s all set up, we need to make sure that the machine’s A record is updated whenever its IP address changes. For our internal machine, let’s assign xi8qz.example.com as its domain. If everything’s working properly, you should be able to resolve this domain to its IP address using a normal DNS query:
    $ dig +short xi8qz.example.com
    10.1.1.4
2.2. Requesting a certificate (steps 4-14)
Assuming you now control the DNS zone for example.com completely and you can quickly edit it dynamically, you’re all set for actually issuing certificates for your local device domain via Let’s Encrypt.
For our example appliance, it will regularly check if the existing certificate is still valid (step 4). If there is no certificate or the existing one is about to expire, the device will generate a keypair and a certificate signing request (CSR) using its assigned hostname (here: xi8qz.example.com) as a CN, and it’ll send that CSR to the control server (step 5).
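Since the control server should never blindly trust a device, it has to decide whether a CSR is legitimate before it talks to Let’s Encrypt. How exactly that is done depends on your setup; the following is a minimal sketch under two assumptions that are not part of the article: every device was provisioned with a secret that the control server knows, and a device may only request a certificate for its own subdomain. The function name `is_authorized` and the `known_secrets` mapping are hypothetical.

```python
import hmac
from typing import Dict

ZONE = "example.com"  # the dedicated zone used throughout the article


def is_authorized(device_id: str, presented_secret: str, requested_cn: str,
                  known_secrets: Dict[str, str]) -> bool:
    """Decide whether a device may request a certificate for requested_cn."""
    expected_secret = known_secrets.get(device_id)
    if expected_secret is None:
        return False  # unknown device
    # Constant-time comparison to avoid leaking the secret via timing.
    if not hmac.compare_digest(expected_secret.encode(), presented_secret.encode()):
        return False
    # A device may only ask for its own subdomain, never for someone else's.
    return requested_cn.lower() == f"{device_id}.{ZONE}"


# Hypothetical provisioning data for the example appliance.
secrets_db = {"xi8qz": "provisioned-secret"}
print(is_authorized("xi8qz", "provisioned-secret", "xi8qz.example.com", secrets_db))
```

A real deployment would also parse the CSR and compare the names inside it against the assigned hostname; that part is left out here to keep the sketch free of third-party libraries.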
After authorizing the request (an important step not shown in the diagram!), the control server requests a DNS challenge for the given domain from the ACME API via the Pre-Authorization/new-authz API call (step 6). The ACME API responds with a DNS challenge (step 7). If all goes well, this looks something like this:
    {
      "identifier": {
        "type": "dns",
        "value": "xi8qz.example.com"
      },
      "status": "pending",
      "expires": "2018-04-15T21:26:29Z",
      "challenges": [
        {
          "type": "dns-01",
          "status": "pending",
          "uri": "https://acme-staging.api.letsencrypt.org/acme/challenge/VtjihR4X8nLAj4MDwI...",
          "token": "aLptEKAeUOajkiGrx-kkbjUX4b1MC..."
        },
        // ...
      ],
      // ...
    }
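An authorization response can list several challenge types, so the control server first has to pick the dns-01 entry. Here is a rough sketch of that step; the function name `pick_dns01_challenge` is made up for this example, and the tokens in the sample body are placeholders because the real ones above are truncated (the `// ...` lines are also elisions, not valid JSON).

```python
import json
from typing import Any, Dict


def pick_dns01_challenge(authorization: Dict[str, Any]) -> Dict[str, Any]:
    """Return the dns-01 challenge from an ACME authorization object."""
    for challenge in authorization.get("challenges", []):
        if challenge.get("type") == "dns-01":
            return challenge
    raise ValueError("authorization offers no dns-01 challenge")


# Trimmed-down version of the response shown above; tokens are placeholders.
response_body = """
{
  "identifier": {"type": "dns", "value": "xi8qz.example.com"},
  "status": "pending",
  "challenges": [
    {"type": "http-01", "status": "pending", "token": "placeholder-http-token"},
    {"type": "dns-01", "status": "pending", "token": "placeholder-dns-token"}
  ]
}
"""

authorization = json.loads(response_body)
challenge = pick_dns01_challenge(authorization)
# The TXT record always lives under the _acme-challenge label of the domain.
record_name = "_acme-challenge." + authorization["identifier"]["value"]
print(record_name, challenge["token"])
```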
Using this response, the control server must set a DNS TXT record at _acme-challenge.xi8qz.example.com (step 8) and notify the ACME API that the challenge response has been placed (step 9).
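The value of that TXT record is not the token itself. For dns-01, the ACME specification (RFC 8555, section 8.4) defines it as the base64url-encoded SHA-256 digest of the key authorization, which is the token joined by a dot with the thumbprint of the ACME account key. A minimal sketch, assuming the thumbprint has already been computed elsewhere (most ACME libraries do this for you); the function names and the sample values are placeholders:

```python
import base64
import hashlib


def b64url(data: bytes) -> str:
    """base64url without padding, as used throughout ACME."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def dns01_txt_value(token: str, account_key_thumbprint: str) -> str:
    """Compute the TXT record content for a dns-01 challenge."""
    key_authorization = f"{token}.{account_key_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return b64url(digest)


# Placeholder inputs: a real token comes from the challenge response, and the
# thumbprint is derived from the control server's ACME account key.
txt_value = dns01_txt_value("placeholder-dns-token", "placeholder-thumbprint")
print("_acme-challenge.xi8qz.example.com. TXT", txt_value)
```

Because the account key is part of the value, only the holder of that key (here: the control server) can produce a valid record, even if someone else learns the token.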
Once the challenge response has been verified by Let’s Encrypt (steps 10-11), the certificate can finally be requested using the CSR (steps 12-13).
After Let’s Encrypt responds with a certificate, you’ll see something like this on the wire:
    -----BEGIN CERTIFICATE-----
    MIIGEjCCBPqgAwIBAgISAyk2izMz7OXSqHeZhg+rUR5uMA0GCSqGSIb3DQEBCwUA
    MEoxCzAJBgNVBAYTAlVTMRYwFAYDVQQKEw1MZXQncyBFbmNyeXB0MSMwIQYDVQQD
    ...
If decoded with openssl, we can see that it’s the real deal:

    $ openssl x509 -in www.crt -text -noout
    Certificate:
        Data:
            Version: 3 (0x2)
            Serial Number:
                03:29:36:8b:33:33:ec:e5:d2:a8:77:99:86:0f:ab:51:1e:6e
        Signature Algorithm: sha256WithRSAEncryption
            Issuer: C=US, O=Let's Encrypt, CN=Let's Encrypt Authority X3
            Validity
                Not Before: Jul 18 23:37:35 2018 GMT
                Not After : Oct 16 23:37:35 2018 GMT
            Subject: CN=xi8qz.example.com
            Subject Public Key Info:
                Public Key Algorithm: rsaEncryption
                    Public-Key: (2048 bit)
                    Modulus:
                        00:be:69:df:28:04:9c:2b:e9:94:72:c3:de:a6:fd:
                        a4:38:93:be:43:a7:81:8b:dc:9a:be:19:0d:c0:d1:
                        ...
This certificate is then returned to the machine (step 14). After the webserver of the appliance/server has been restarted, its web interface can be accessed via HTTPS in the browser or on the command line:

    $ curl -v https://xi8qz.example.com/login
    *   Trying 10.1.1.4...
    * TCP_NODELAY set
    * Connected to xi8qz.example.com (10.1.1.4) port 443 (#0)
    * ALPN, offering h2
    * ALPN, offering http/1.1
    * successfully set certificate verify locations:
    *   CAfile: /etc/ssl/certs/ca-certificates.crt
      CApath: /etc/ssl/certs
    * TLSv1.2 (OUT), TLS handshake, Client hello (1):
    * TLSv1.2 (IN), TLS handshake, Server hello (2):
    * TLSv1.2 (IN), TLS handshake, Certificate (11):
    * TLSv1.2 (IN), TLS handshake, Server key exchange (12):
    * TLSv1.2 (IN), TLS handshake, Server finished (14):
    * TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
    * TLSv1.2 (OUT), TLS change cipher, Client hello (1):
    * TLSv1.2 (OUT), TLS handshake, Finished (20):
    * TLSv1.2 (IN), TLS handshake, Finished (20):
    * SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
    * ALPN, server accepted to use http/1.1
    * Server certificate:
    *  subject: CN=xi8qz.example.com
    *  start date: Jul 18 23:37:35 2018 GMT
    *  expire date: Oct 16 23:37:35 2018 GMT
    *  subjectAltName: host "xi8qz.example.com" matched cert's "xi8qz.example.com"
    *  issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
    *  SSL certificate verify ok.
    > GET /login HTTP/1.1
    > Host: xi8qz.example.com
    > User-Agent: curl/7.58.0
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Date: Sun, 05 Aug 2018 17:38:49 GMT
    < Server: Apache/2.4.18 (Ubuntu)
    ...
3. Deployment considerations: Let’s Encrypt rate limits
If you are considering implementing this mechanism for a large number of servers, it’s important that you use the Let’s Encrypt staging environment for testing and, more importantly, that you take their rate limits into account.
By default, Let’s Encrypt only allows you to issue 20 certificates per week for the same domain or the same account. To increase this number, you have to either request a higher rate limit or get your domain added to the public suffix list (note: adding your domain here has other implications!).
Due to these rate limits, it is vital that you spread out the initial deployment enough to stay under the rate limit, and that you leave enough room for future servers to be added. Also consider renewals in the initial rollout plan.
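To get a feeling for the numbers, here is a back-of-the-envelope calculation. It is only a sketch: the function names are made up, and the 60-day renewal interval is an assumption (the certificates themselves are valid for 90 days).

```python
import math


def rollout_weeks(devices: int, weekly_limit: int) -> int:
    """Weeks needed to issue one certificate per device at the given limit."""
    return math.ceil(devices / weekly_limit)


def sustainable_fleet(weekly_limit: int, renewal_interval_days: int) -> int:
    """Largest fleet whose renewals alone stay within the weekly limit."""
    # Each device needs one certificate per renewal interval, and that
    # interval spans renewal_interval_days / 7 weeks of quota.
    return (weekly_limit * renewal_interval_days) // 7


print(rollout_weeks(90_000, 20))   # weeks for the initial rollout at the default limit
print(sustainable_fleet(20, 60))   # devices that fit if each renews every 60 days
print(sustainable_fleet(20, 90))   # upper bound if renewing only at expiry
```

At the default limit of 20 per week, a 90,000-device rollout would take 4,500 weeks, and renewals alone cap the fleet at a few hundred devices, so a higher rate limit is not optional at this scale.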
4. Summary
As you can see it’s not really rocket science.
We first assigned each appliance (aka. internal server) a public domain name using our own dynamic DNS server and a dedicated DNS zone. Using the server’s assigned domain (here: xi8qz.example.com), we then used Let’s Encrypt’s free certificate offering and their DNS challenge to issue a certificate for that server.
By doing that for all internal servers, we can provide secure communication in our internal IT infrastructure without having to deploy a custom CA cert or having to pay for certificates.
Thanks for the informative article. There had been a joke going around the Internet, “how do you know you’re about to log into a piece of critical infrastructure” with the answer being “the invalid cert warning in your browser.” As you point out, ACME is a solution to automate generation and deployment for certificates, especially internally from a local CA, but your use of LE here (a public as opposed to internal CA) is interesting as it eliminates what would have been the problem of distributing trust for an internal CA, especially as many of your endusers may be external partners. That said, I think a primary consideration within many organizations to move in the direction of an internal CA for internal infrastructure is a preference to use internal DNS zones for internal network space as opposed to split-horizon DNS, which has its own implications to be considered. Thanks again for the great write-up!
But it reveals internal information like naming convention, IP ranges etc.
See http://pdns.daloo.de/search.php?alike=1&q=datto for example
Greetings from good old Frankfurt
Ha. Hi! It does indeed, got a better idea though?
Also, there is no naming scheme (it’s random), and the internal IPs are from tens of thousands of different networks, so they’re all 10.x and 192.168.x … I’m all ears if you have a better idea.
“Someone” has to decide if exposing this information is fine or running your own CA (with the additional work) suits them better. I guess many factors will have to be put into account. I don’t make this kind of decisions :)
I guess the new LE wild card cert support should remove Alexander Georgiev’s concerns – one cert for all the machines so no need to have all their DNS details exposed.
In my company’s case, we can’t use a wildcard cert, because the appliances that my company sells belong to different customers. Having only one cert/key would mean they’d be able to mitm.
Why not set the A record for the external DNS server to something like 1.1.1.1 and the internal DNS server has the correct IP address (or a view on the same DNS Server). This way you won’t leak internal IP addresses to the Internet.
Robert, thanks for the comment. Your approach would certainly work if the 65k servers were in one single controlled network. However, the use case we made this for is different:
My company sells backup appliances which our customers place in their company network (much like you would place a router in your own network). Since we do not control anything other than our appliance inside our customers’ networks, we went this route to provide a secure web interface to our appliance. And since the 65k servers/appliances (now more like 80k) are located in tens of thousands of different networks, leaking the internal IP isn’t bad at all — especially since there is no way to correlate them with the customer.
For normal “internal servers” within a company, I’d probably recommend using a wildcard cert and an internal DNS server.
The real magic is implementing such internal certificates with Gitlab. It seems impossible.
If you’re using split DNS, all the public DNS can point to the same system, as long as you’re happy distributing the certs to your internal systems. Which is relatively trivial.
You could just use the bzed-dehydrated puppet module. Of course that needs a working puppet installation….
Thanks for detailing the steps you take to accomplish all this. Any chance you would consider open-sourcing the code/scripts that you use to execute all these steps?
What is the control server in this example? No information provided.
Re split DNS: If you control the infrastructure, there are tons of other solutions. If you do not, this is the only one I could think of. The only thing we control in our customers’ network is our own backup appliance …
Re open source: The code that executes these steps is unfortunately proprietary code, but the idea is pretty simple. You can use any “ACME client” in your favorite language (Google will help you there), and you can use a simple DynDNS such as the one I described in my other post.
Here’s a small snippet from the API endpoint that our appliances call which I realize is not thaaat helpful, but it shows that we execute the exact steps that I described in the article. And it’s real code, not pseudo code:
Re “what is the control server”: Since we do not control the devices that need the certificate, we do not blindly trust them ever. The control server is the only server talking to the Let’s Encrypt APIs.
Again about Split-DNS:
You don’t need to control more than the appliance at your customers.
Your external DNS resolves blabla.mydomain.com to one of your own IPs (does not matter which). For ACME, you use the DNS TXT method.
On your internal DNS server (the server your systems use to get the IP to access backup appliances) blabla.mydomain.com resolves to the real IP address.
You can access the appliances via their DNS name, but no one on the Internet is able to resolve DNS Name and so search for your appliances via that way. If you use DynDNS for the appliance to update the DNS Entry that’s also not a problem, just only update the internal DNS.
The term “internal DNS” is a little misleading in this case.
Great to see I’m not alone. I came to this solution around late 2017, automating my internal certificate refreshes in this very same way. Since the scripts were developed in house, we also extended them to apply to other non web server systems as well – pretty much anything that uses SSL, including mail servers, routers, firewalls and many other products from major names like Cisco and F5. It is truly great what you can achieve when you control the entire infrastructure and are a tad creative. Great documentation!
Have you looked into acme-dns ( https://github.com/joohoi/acme-dns )? I think that could have simplified and automated the approach even further.
Thanks a lot for your work. It is a typical issue for IoT devices embedding a web console and until then, I never found a sound solution.
Now:
– the “internal server” is in your customer premise.
– “Let’s encrypt” is the real public service (not a local instance)
– The DNS .example.com must be any public service and you have to pay for the name
– The DNS xx123.example.com could be at your company or at your customer premise.
– I did not understand if you setup up a single root name x12-custom1.datto-appliance.com
– or if all have a different name x12.customer1.com ? which is more costly.
– The “control server” is in your company, right? Not at your customer premise.
Thanks again
Sorry I just found out looking at http://pdns.daloo.de/search.php?alike=1&q=datto so all your customer access something.dattolocal.net
I guess you could even fill it with random data if you had not already many customers
Re id/wno / acme-dns: I have not looked at that yet. I’ll check it out.
Re Bruno V:
> I did not understand if you setup up a single root name x12-custom1.datto-appliance.com
As you found out already, we have one domain for all customers. The subdomain part is a random string.
> The “control server” is in your company, right? Not at your customer premise.
Correct. We are the only ones talking to Let’s Encrypt. Otherwise we’d have to share the LE credentials with our customers, which we don’t want to do, for obvious reasons.
Thanks for the interesting article. DDns is easy enough to setup.
What command do you use to get certbot to work in this scenario, or do you have to write scripts for it?
I’ve never used certbot, so I cannot comment on that unfortunately. We implemented this ourselves using the ACME API. I’d love to share the code but it’s company internal code. You can probably find good libraries these days for Let’s Encrypt/ACME and use the diagram in the article to call the API endpoints in order.
This library has all the API calls you need afaik, although it’s a little dated and barely testable. But it’d be a good starting point: https://github.com/skoerfgen/CertLE
Sorry I can’t help more.
The flow is: new-authz
Sorry for my lack of knowledge, I’m just starting with such networking…
I understood how you are getting certificates for your company’s appliances.
But if your registered DNS subdomains are pointing to internal addresses, how do you access those appliances inside your customer’s network?
Thank you for the interesting article! We’re trying to figure out how to build something similar for our internal network. In general, it’s all figured out, and we’re now at a point where we need to make sure that it’s actually secure. In particular, clients need to be able to provide some kind of identification for the control server to authorize a request.
> After authorizing the request (an important step not shown in the diagram!), the control server requests a DNS challenge for the given domain from the ACME API […].
How did you implement this authorization step? For new servers is there a bootstrap phase, which includes deployment of a private key of some sort?
@Nakkle: Something like that. Correct.
Sorry but this article did not help me at all. It is very confusing and just assumes all the vital steps needed to actually make this work with no actual examples from start to finish. I am simply trying to enable https with let’s encrypt on a local non-internet facing server. I have a domain and local dns server.
@Evin, you could try using https://github.com/Corollarium/localtls
@Philipp, thanks for the great write up. Have a look at the nice FOS implementation it seems to have informed, but wasn’t linked here, yet.
For people coming here from search engines, without reading the article, the linked GitHub project implements a minimal, lesser-than-secure DNS server with HTTP API to produce and offer Let’s Encrypt certificates to be retrieved for services running under .local’ish addresses.
Hey cool I didn’t know this project existed. Thanks for linking it. Will link it in the post.
Very nice article and clever solution. So the DNS server for example.com will resolve the appliance’s web address to the local IP address within the customer’s network?
I believe the following (self hosted portal) software solution simplifies what is explained in this tutorial:
https://github.com/RealTimeLogic/SharkTrust
Hi, I’m curious to understand which strategy you applied to “circumvent” Let’s Encrypt’s rate limits.
At 20 renewals per week you can renew 1,040 (52 × 20) device certificates every year. Considering that Let’s Encrypt certificates last 90 days, the number of renewal requests is higher (so the number of devices you can renew is lower).
You are doing this for 90,000 devices so… did you ask for a higher rate limit? Did you use multiple domains?
Thank you.
You can request a higher rate limit yes. There’s a form. It’s quite easy to find.
On top of that, we do donate to Let’s Encrypt, but we could have gotten that higher rate limit even without that.