My name is Philipp C. Heckel and I write about nerdy things.

Using Let’s Encrypt for internal servers


Linux, Security


Let’s Encrypt is a revolutionary new certificate authority that provides free certificates in a completely automated process. These certificates are issued via the ACME protocol. Over the last 2 years or so, the Internet has widely adopted Let’s Encrypt — over 50% of the web’s SSL/TLS certificates are now issued by Let’s Encrypt.

But while there are many tools to automatically renew certificates for publicly available webservers (certbot, simp_le; I wrote about how to do that 3 years back), it’s hard to find any useful information about how to issue Let’s Encrypt certificates for internal, non-Internet-facing servers and devices.

This blog post describes how to issue Let’s Encrypt certificates for internal servers. At Datto, we issued a certificate for each of our 65,000+ BCDR appliances using this exact mechanism.



1. How does it work?

To issue a certificate through Let’s Encrypt, you must prove that you either own the website you want to issue the certificate for, or that you own the domain it runs on. Typically, automated tools like certbot use the HTTP challenge to prove site ownership using the .well-known directory. While this works beautifully if the site is Internet-facing (and Let’s Encrypt can verify the HTTP challenge files via a simple HTTP request), it doesn’t work if your server runs on 10.1.1.4 or any other internal address.

The DNS challenge solves this problem by letting you prove domain ownership through the DNS TXT record _acme-challenge.example.com. Let’s Encrypt will verify that the record matches what it expects and issue your certificate if it all adds up.
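To make the TXT record more concrete, here is a minimal Python sketch of how the record name and value are derived under the current ACME specification (RFC 8555, dns-01 challenge). The function names are made up for illustration, and the token and account key thumbprint are placeholders; in practice the token comes from the ACME server and the thumbprint from your account key.

```python
import base64
import hashlib


def challenge_record_name(domain: str) -> str:
    """Name of the TXT record that the CA will query for the given domain."""
    return f"_acme-challenge.{domain}"


def dns01_txt_value(token: str, account_thumbprint: str) -> str:
    """Value of the TXT record for a dns-01 challenge.

    The key authorization is the challenge token, a dot, and the
    base64url-encoded thumbprint of the ACME account key.
    """
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # ACME uses base64url without the trailing "=" padding.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


if __name__ == "__main__":
    # Placeholder inputs, not real challenge data.
    print(challenge_record_name("xi8qz.example.com"))
    print(dns01_txt_value("placeholder-token", "placeholder-thumbprint"))
```

Most ACME clients compute this value for you; the sketch only shows that the record content is a plain hash and contains nothing secret.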

So really, the magic ingredients for issuing certificates for internal, non-Internet-facing machines are:

  • A dedicated DNS zone for all your internal devices (here: example.com, with device names such as xi8qz.example.com), and a dynamic DNS server to manage this zone
  • An ACME client capable of using Let’s Encrypt’s DNS challenge to prove domain ownership

2. Example: An internal server 10.1.1.4, aka. xi8qz.example.com

The following diagram shows how we have implemented our Let’s Encrypt integration for our Datto backup appliances. Each appliance (read: internal server) is behind a NAT and carries its own local IP address.

The general approach is simple: The appliance regularly reaches out to our control server to ensure that it can be reached via its own subdomain. If its local IP address changes, it triggers an update of its own subdomain. In addition, it checks regularly if the certificate is still valid, and requests a renewal if it’s outdated.

Here’s a bit more detail to this process:

For this example, let’s assume we’re trying to issue a certificate for an appliance with the identifier xi8qz and the local IP address 10.1.1.4. From the perspective of this appliance, there are two requests to be made:

  • Steps 1-3: First, it needs to set/update its own DNS domain (here: xi8qz.example.com). This domain will later be used as a common name (CN) in the certificate. On top of that, it needs to make sure that this record is updated every time the server’s IP address changes.
  • Steps 4-14: It needs to regularly check if the local certificate needs to be renewed and request a renewal if it’s time. If there is no certificate yet, the initial issuance is handled like a renewal.

Let’s now examine these steps in greater detail.

2.1. Prerequisites: Assigning a domain for each machine (steps 1-3)

As mentioned above, we need to give each appliance a proper domain name in order to be able to prove ownership to Let’s Encrypt, so we need to buy a domain (here: example.com) and delegate its NS records to our DDNS server:

On top of that, we need the ability to dynamically add and remove records from it (via an API of some sort). I’ve previously written about how to spin up your own DDNS server, if you are interested.

Once that’s all set up, we need to make sure that the machine’s A record is updated whenever its IP address changes. For our internal machine, let’s assign xi8qz.example.com as its domain. If everything’s working properly, you should be able to resolve this domain to its IP address using a normal DNS query:
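In Python, such a lookup can be sketched as follows. This is only an illustration: the helper name `resolve_internal` and the private-address sanity check are assumptions of this sketch, not part of the setup described here. The resolver is injectable so that the function can be tried without a live DNS zone.

```python
import ipaddress
import socket
from typing import Callable


def resolve_internal(
    hostname: str,
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> str:
    """Resolve hostname to an IPv4 address and check that it is a private one."""
    address = resolver(hostname)
    # Optional sanity check: an appliance behind a NAT should publish
    # a private address such as 10.x.x.x or 192.168.x.x.
    if not ipaddress.ip_address(address).is_private:
        raise ValueError(f"{hostname} resolves to non-private address {address}")
    return address
```

Calling `resolve_internal("xi8qz.example.com")` performs a real DNS query through the system resolver; passing a custom `resolver` is useful for testing.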

2.2. Requesting a certificate (steps 4-14)

Assuming you now control the DNS zone for example.com completely and you can quickly edit it dynamically, you’re all set for actually issuing certificates for your local device domain via Let’s Encrypt.

For our example appliance, it will regularly check if the existing certificate is still valid (step 4). If there is no certificate or the existing one is about to expire, the device will generate a keypair and a certificate signing request (CSR) using its assigned hostname (here: xi8qz.example.com) as a CN, and it’ll send that CSR to the control server (step 5).
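The validity check in step 4 could look roughly like this in Python. It is a sketch under assumptions: the 30-day threshold and the function name are made up for illustration, and the expiry date is assumed to have been read from the local certificate beforehand (`None` meaning that no certificate exists yet).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed threshold: renew once fewer than 30 days of validity remain.
RENEW_BEFORE = timedelta(days=30)


def needs_renewal(
    not_after: Optional[datetime],
    now: datetime,
    renew_before: timedelta = RENEW_BEFORE,
) -> bool:
    """Return True if a certificate must be requested or renewed."""
    if not_after is None:
        # No certificate yet: the first issuance is treated like a renewal.
        return True
    return now >= not_after - renew_before


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(needs_renewal(now + timedelta(days=10), now))
```

Passing `now` explicitly instead of reading the clock inside the function keeps the check easy to test.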

After authorizing the request (an important step not shown in the diagram!), the control server requests a DNS challenge for the given domain from the ACME API via the Pre-Authorization/new-authz API call (step 6). The ACME API responds with a DNS challenge (step 7). If all goes well, this looks something like this:

Using this response, the control server must set a DNS TXT record at _acme-challenge.xi8qz.example.com (step 8) and notify the ACME API that the challenge response has been placed (step 9).

Once the challenge response has been verified by Let’s Encrypt (steps 10-11), the certificate can finally be requested using the CSR (steps 12-13).
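Put together, steps 6-13 on the control server can be sketched in Python as shown here. The `AcmeService` and `DdnsService` interfaces, their method names, and the settle delay are hypothetical stand-ins for whatever ACME client and DDNS API you use; they are not a real library's API.

```python
import time
from typing import Callable, Protocol


class AcmeService(Protocol):
    """Hypothetical wrapper around an ACME client."""

    def request_dns_challenge(self, domain: str) -> str: ...
    def notify_challenge_placed(self, domain: str) -> None: ...
    def wait_for_verification(self, domain: str) -> None: ...
    def request_certificate(self, csr: str) -> str: ...


class DdnsService(Protocol):
    """Hypothetical wrapper around the dynamic DNS server's API."""

    def set_txt(self, name: str, value: str) -> None: ...
    def remove_txt(self, name: str) -> None: ...


def issue_certificate(
    domain: str,
    csr: str,
    acme: AcmeService,
    ddns: DdnsService,
    settle_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Run the DNS challenge for domain and return the issued certificate."""
    record = f"_acme-challenge.{domain}"
    txt_value = acme.request_dns_challenge(domain)  # steps 6-7
    ddns.set_txt(record, txt_value)                 # step 8
    try:
        sleep(settle_seconds)                       # let the record propagate
        acme.notify_challenge_placed(domain)        # step 9
        acme.wait_for_verification(domain)          # steps 10-11
        return acme.request_certificate(csr)        # steps 12-13
    finally:
        # Clean up the challenge record even if issuance failed.
        ddns.remove_txt(record)
```

Removing the TXT record in a `finally` block is a design choice of this sketch: a failed attempt does not leave stale challenge records in the zone.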

After Let’s Encrypt responds with a certificate, you’ll see something like this on the wire:

If decoded with openssl, we can see that it’s the real deal:

This certificate is then returned to the machine (step 14). After the webserver of the appliance/server has been restarted, its web interface can be accessed via HTTPS in the browser or on the command line:

3. Deployment considerations: Let’s Encrypt rate limits

If you are considering implementing this mechanism for a large number of servers, it’s important that you use the Let’s Encrypt staging environment for testing and, more importantly, that you consider their rate limits.

By default, Let’s Encrypt only allows you to issue 20 certificates per week for the same domain or the same account. To increase this number, you have to either request a higher rate limit or get your domain added to the public suffix list (note: adding your domain here has other implications!).

Due to these rate limits, it is vital that you spread out the initial deployment enough to stay under the rate limit, and that you leave enough room for future servers to be added. Also consider renewals in the initial rollout plan.
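A bit of arithmetic helps with planning. The Python sketch below rests on stated assumptions: the function names are made up, the 60-day renewal interval is an example value, and it pessimistically assumes that renewals count against the same weekly limit as new certificates.

```python
import math


def rollout_weeks(devices: int, weekly_limit: int) -> int:
    """Weeks needed to issue one certificate per device, ignoring renewals."""
    return math.ceil(devices / weekly_limit)


def sustainable_fleet(weekly_limit: int, renewal_interval_days: int) -> int:
    """Largest fleet whose renewals alone fit under the weekly limit.

    Each device needs one certificate every renewal_interval_days, so it
    consumes 7 / renewal_interval_days certificates per week on average.
    """
    return (weekly_limit * renewal_interval_days) // 7


if __name__ == "__main__":
    print(rollout_weeks(65_000, 20))   # 3250
    print(sustainable_fleet(20, 60))   # 171
```

With the default limit of 20 certificates per week, a fleet of 65,000 devices would take 3,250 weeks to roll out, which illustrates why a deployment of that size cannot work under the default limit.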

4. Summary

As you can see, it’s not really rocket science.

We first assigned each appliance (aka. internal server) a public domain name using our own dynamic DNS server and a dedicated DNS zone. Using the server’s assigned domain (here: xi8qz.example.com), we then used Let’s Encrypt’s free certificate offering and their DNS challenge to issue a certificate for that server.

By doing that for all internal servers, we can provide secure communication in our internal IT infrastructure without having to deploy a custom CA cert or having to pay for certificates.

23 Comments

  1. Lincoln DeCoursey

    Thanks for the informative article. There had been a joke going around the Internet, “how do you know you’re about to log into a piece of critical infrastructure” with the answer being “the invalid cert warning in your browser.” As you point out, ACME is a solution to automate generation and deployment for certificates, especially internally from a local CA, but your use of LE here (a public as opposed to internal CA) is interesting as it eliminates what would have been the problem of distributing trust for an internal CA, especially as many of your endusers may be external partners. That said, I think a primary consideration within many organizations to move in the direction of an internal CA for internal infrastructure is a preference to use internal DNS zones for internal network space as opposed to split-horizon DNS, which has its own implications to be considered. Thanks again for the great write-up!



  2. Philipp C. Heckel

    Ha. Hi! It does indeed, got a better idea though?

    Also, there is no naming scheme (it’s random), and the internal IPs are from tens of thousands of different networks, so they’re all 10.x and 192.168.x … I’m all ears if you have a better idea.


  3. Alexander Georgiev

    “Someone” has to decide if exposing this information is fine or if running your own CA (with the additional work) suits them better. I guess many factors will have to be taken into account. I don’t make these kinds of decisions :)


  4. James Horsley

    I guess the new LE wild card cert support should remove Alexander Georgiev’s concerns – one cert for all the machines so no need to have all their DNS details exposed.


  5. Philipp C. Heckel

    In my company’s case, we can’t use a wildcard cert, because the appliances that my company sells belong to different customers. Having only one cert/key would mean they’d be able to mitm.


  6. Robert

    Why not set the A record for the external DNS server to something like 1.1.1.1 and the internal DNS server has the correct IP address (or a view on the same DNS Server). This way you won’t leak internal IP addresses to the Internet.


  7. Philipp C. Heckel

    Robert, thanks for the comment. Your approach would certainly work if the 65k servers were in one single controlled network. However, the use case we made this for is different:

    My company sells backup appliances which our customers place in their company network (much like you would place a router in your own network). Since we do not control anything other than our appliance inside our customers’ networks, we went this route to provide a secure web interface to our appliance. And since the 65k servers/appliances (now more like 80k) are located in tens of thousands of different networks, leaking the internal IP isn’t bad at all — especially since there is no way to correlate them with the customer.

    For normal “internal servers” within a company, I’d probably recommend using a wildcard cert and an internal DNS server.


  8. Hans Werfer

    The real magic is implementing such internal certificates with Gitlab. It seems impossible.


  9. mrkunkel

    If you’re using split DNS, all the public DNS can point to the same system, as long as you’re happy distributing the certs to your internal systems. Which is relatively trivial.


  10. Bernd Zeimetz

    You could just use the bzed-dehydrated puppet module. Of course that needs a working puppet installation….


  11. Eric

    Thanks for detailing the steps you take to accomplish all this. Any chance you would consider open-sourcing the code/scripts that you use to execute all these steps?


  12. AGB

    What is the control server in this example? No information provided.


  13. Philipp C. Heckel

    Re split DNS: If you control the infrastructure, there are tons of other solutions. If you do not, this is the only one I could think of. The only thing we control in our customers’ network is our own backup appliance …


  14. Philipp C. Heckel

    Re open source: The code that executes these steps is unfortunately proprietary, but the idea is pretty simple. You can use any “ACME client” in your favorite language (Google will help you there), and you can use a simple DynDNS such as the one I described in my other post.

    Here’s a small snippet from the API endpoint that our appliances call which I realize is not thaaat helpful, but it shows that we execute the exact steps that I described in the article. And it’s real code, not pseudo code:

    $domain = $this->getVerifiedDomain($id, $csr);
    $this->logger->info('Requesting new certificate for ' . $domain . ' …');

    $challenge = $this->acmeService->requestDnsChallenge($domain);
    $this->ddnsService->set(
        "_acme-challenge.$domain",
        'TXT',
        $challenge->getDigest(),
        self::CHALLENGE_RECORD_PRIORITY,
        self::CHALLENGE_RECORD_TTL_SECONDS,
        self::CHALLENGE_RECORD_EXPIRY_SECONDS
    );

    $this->time->sleep(self::CHALLENGE_PLACED_SETTLE_SECONDS);

    $this->acmeService->notifyChallengePlaced($challenge);
    $this->acmeService->waitForChallengeVerification($challenge);
    $certificate = $this->acmeService->requestCertificate($csr);

    $this->ddnsService->remove("_acme-challenge.$domain", 'TXT');

    return $certificate;


  15. Philipp C. Heckel

    Re “what is the control server”: Since we do not control the devices that need the certificate, we do not blindly trust them ever. The control server is the only server talking to the Let’s Encrypt APIs.


  16. Robert Penz

    Again about Split-DNS:

    You don’t need to control more than the appliance at your customers.

    Your external DNS resolves blabla.mydomain.com to one of your own IPs (does not matter which). For ACME, you use the DNS TXT method.
    On your internal DNS server (the server your systems use to get the IP to access backup appliances) blabla.mydomain.com resolves to the real IP address.

    You can access the appliances via their DNS name, but no one on the Internet is able to resolve the DNS name and search for your appliances that way. If you use DynDNS for the appliance to update the DNS entry, that’s also not a problem; just update only the internal DNS.

    The term “internal DNS” is a little misleading in this case.


  17. Alex H

    Great to see I’m not alone. I arrived at this solution around late 2017, automating my internal certificate refreshes in this very same way. Since the scripts were developed in house, we also extended them to apply to other non-webserver systems as well: pretty much anything that uses SSL, including mail servers, routers, firewalls and many other products from major names like Cisco and F5. It is truly great what you can achieve when you control the entire infrastructure and are a tad creative. Great documentation!



  18. Bruno V.

    Thanks a lot for your work. It is a typical issue for IoT devices embedding a web console and until then, I never found a sound solution.
    Now:
    – the “internal server” is in your customer premise.
    – “Let’s encrypt” is the real public service (not a local instance)
    – The DNS .example.com must be any public service and you have to pay for the name
    – The DNS xx123.example.com could be at your company or at your customer premise.
    – I did not understand if you setup up a single root name x12-custom1.datto-appliance.com
    – or if all have a different name x12.customer1.com ? which is more costly.
    – The “control server” is in your company, right? Not at your customer premise.

    Thanks again




  19. Philipp C. Heckel

    Re Bruno V:

    > I did not understand if you setup up a single root name x12-custom1.datto-appliance.com
    As you found out already, we have one domain for all customers. The subdomain part is a random string.

    > The “control server” is in your company, right? Not at your customer premise.
    Correct. We are the only ones talking to Let’s Encrypt. Otherwise we’d have to share the LE credentials with our customers, which we don’t want to do, for obvious reasons.

