Last time, I mentioned how this website is set up on Azure.
It's a simple static website, so the setup is just to upload the static assets to an Azure Storage Account and set up an Azure CDN with a custom domain name in front of it. I also wanted to use my own custom TLS certificate for it sourced from Let's Encrypt, which means I needed to set up an automatic renewal workflow for said cert. But the web server is in Azure CDN's control so I can't run something turnkey like Certbot. Besides, I wanted to do this as much as possible by myself rather than relying on third-party stuff anyway, so I implemented my own ACME renewal workflow that runs periodically in an Azure Function.
I'll write more details about the Azure setup later.
The Azure setup for renewing the Let's Encrypt cert ran successfully for the first time yesterday, so this is the right time to talk about it.
What do you need to set up a static web site hosted on Azure?
An Azure Storage account.
This Storage account is used to host the files served by the static website. You have to enable it to serve static websites, after which it automatically gets a contained named
$web
. Any files you put inside this container are served by the Storage account's blob HTTP endpoint.An Azure CDN profile and endpoint.
Since I want to use my own custom domain instead of the Storage account's blob HTTP endpoint, I also provisioned an Azure CDN profile in front of the Storage account. This also means every HTTP request does not go to the single Storage account in the US, but to the CDN cache that has endpoints all over the world.
Note that the CDN endpoint's origin type must be set to "Custom origin" and point to the Storage account's web endpoint. You don't want to set it to "Storage", because then the container name becomes part of the URL, like
https://cdnendpoint.azureedge.net/$web/index.html
A custom domain. Configure your domain's DNS to add a CNAME record pointing to the CDN endpoint, and configure the CDN endpoint itself to accept the custom domain.
An Azure KeyVault to host the Let's Encrypt account key, and the HTTPS certificate itself.
An Azure Function app to run the ACME cert renewal workflow.
An Azure DNS server used for dns-01 challenges. This server does not need to serve the whole domain; it's only used to serve the dns-01 challenge's TXT record.
An Azure Function app that ensures the CDN custom domain is using the latest cert from the KeyVault.
Azure Storage accounts for each of the two Function apps. You could use the same Storage account for both, and even use the same Storage account as the one hosting the website.
Design of the Let's Encrypt auto-renewal Function app
While Azure CDN does support provisioning and using a certificate automatically (via DigiCert), I found this process very unreliable. It's supposed to be that you just select the "CDN managed" option and Azure reaches out to Digicert and provisions the cert. However I waited many hours and this never happened. If you search around, you'll find other people with this problem got it resolved by having Azure customer support manually resend the request to DigiCert.
Eventually I did see the cert provisioned by DigiCert on crt.sh, but by then I'd given up on it and aborted the process from the Azure end.
So this was a good opportunity to use Let's Encrypt instead.
Function apps are limited in what programming languages they support. I wanted to only use one of the GA languages and not the preview ones, so I had a choice between Java, JavaScript and any .Net Core language. I decided to go with F#, as that is the most modern and type-safe language among the choices I had.
You can find the code for both Functions here.
The
Acme
Function periodically checks if the cert in the KeyVault is close to expiry. If it is, the Function requests a new cert from the ACME endpoint and uploads it to the KeyVault.The
UpdateCdnCertificate
Function periodically checks if the CDN custom domain is using the latest version of the cert in the KeyVault. If it isn't, the Function updates the CDN custom domain so that it does.
Each Function's directory also has an ARM deployment template and documentation for how to deploy it.
The reason there are two distinct Functions is that the two are
independent of each other. The Acme
Function completes the
dns-01 challenge, and thus only needs to work with the Azure DNS server
to complete the challenge. (It requests a wildcard cert, so it couldn't
complete an http-01 challenge anyway.) It doesn't need to know that this
cert is then used with a CDN custom endpoint; that's the
UpdateCdnCertificate
Function's responsibility.
Also, as a general principle, I did not want to use any of the existing Azure libraries for interacting with its REST API. I've had bad experiences with them in the past given that they pull in megabytes of dependencies and frequently have conflicts with the versions of those dependencies. I would also have to keep on top of their new releases / CVEs and update the dependency versions. Instead, I just wrote the minimal amount of code I needed to directly make HTTP requests to the REST API endpoints for the operations I cared about.
I did however have to depend on the
Microsoft.NET.Sdk.Functions
package, since it contains the
types and attributes you need to write the Functions so that they can be
loaded from the host.
Accessing Azure resources
The Function app needs OAuth2 tokens from Azure Active Directory, one for each Azure resource that it wants to access. There are two ways of getting these tokens:
Use an Azure Service Principal (SP).
Create an SP and save its
appId
andpassword
in your Function app's settings, along with your subscription's "tenant ID". TheappId
is the "client ID", and thepassword
is the "client secret". The app uses them by sending an HTTP POST request tohttps://login.microsoftonline.com/${TENANT_ID}/oauth2/token
with a URL-encoded form body that looks like:grant_type=client_credentials&client_id=${CLIENT_ID}&client_secret=${CLIENT_SECRET}&resource=${RESOURCE}
This is the only option for testing the Functions locally.
Use the app's "Managed Service Identity" (MSI).
There will be two environment variables set on its process named
MSI_ENDPOINT
andMSI_SECRET
. The app sends an HTTP GET request to${MSI_ENDPOINT}?resource=${RESOURCE}&api-verson=2017-09-01
with aSecret
header set to${MSI_SECRET}
.Initially, Linux Function apps did not support MSI. As of 2019-08-25, they do.
In either case, the app should get a 200 OK
response,
with a JSON body that looks like this:
{
"access_token": "...",
"token_type": "..."
}
The app then constructs an HTTP Authorization
header
that looks like
Authorization: ${TOKEN_TYPE} ${ACCESS_TOKEN}
, and uses this
header for all requests to that resource.
The value of the RESOURCE
component depends on what
Azure resource the app wants to operate on:
For working with the Azure Management API,
RESOURCE
ishttps://management.azure.com
For working with the contents of an Azure KeyVault,
RESOURCE
ishttps://vault.azure.net
. This does not apply to operations on the KeyVault itself, which are part of the management API and use the Management APIAuthorization
header.
See here for the official documentation of how to
construct these Authorization
headers.
Azure REST API
Here are links to the REST API docs for the specific operations that the Function app uses.
CDN
Use the Management API
Authorization
header for all requests.Get the certificate name and version that a custom domain is currently set to use
The documentation of the response body is outdated and does not mention the
properties.customHttpsParameters
value. Specifically, the response will be a JSON object that looks like:{ "properties": { "customHttpsParameters": { ... } } }
This
properties.customHttpsParameters
value is the same as theUserManagedHttpsParameters
object described here.Note that, despite their names, the
properties.customHttpsParameters.certificateSourceParameters.secretName
and.secretVersion
values are not specific to KeyVault secrets and also apply to KeyVault certificates. (The private key of a KeyVault certificate is implicitly a KeyVault secret.)Set the certificate name and version that a custom domain should use
Use the
UserManagedHttpsParameters
form of the request body, not theCdnManagedHttpsParameters
form, and setprotocolType
toServerNameIndication
.The documentation is wrong about the API version. The API version must be
2018-04-02
or higher, not2017-10-12
as the documentation suggests. If you use2017-10-12
then the CDN API will ignore thecertificateSource
andcertificateSourceParameters
and start provisioning a "CDN managed" cert from DigiCert. (This mistake is also present in the example in the ARM specs repository.)This operation is asynchronous (returns
202 Accepted
) and can take many hours to complete, so it's possible for the Function to time out waiting for it to complete. It may be sufficient to poll it for a few minutes to ensure it doesn't fail, and then assume it will eventually succeed.
DNS
Use the Management API
Authorization
header for these requests.KeyVault
Use the KeyVault API
Authorization
header for all requests.Get a certificate's version and expiry
If there are multiple versions of this cert, the response only contains the latest one.
-
Only the
value
field is required. KeyVault will automatically set the attributes likeexp
andnbf
by parsing the certificate.If the cert of this name already exists, the original cert will be marked an older version of the cert.
-
It's useful to set the
contentType
to a MIME type likeappliction/octet-stream
so that the portal does not try to render it as text. Otherwise only thevalue
field is required.
Miscellaneous caveats
Using the Azure SDKs / clients
If you do want to use the Azure SDKs or clients to have your Azure CDN use your custom KeyVault cert, note that support for "user managed" certs is not complete in all of them.
The .Net SDK only started supporting the feature in 2019-03, so ensure you use a version of
Microsoft.Azure.Management.Cdn
that includes that commit.As of 2020-03, the
Enable-AzureCdnCustomDomainHttps
PowerShell command still does not support it. It only supports "CDN-managed" certs.The
az
CLI tool hasaz cdn custom-domain enable-https --custom-domain-https-parameters
, but does not explain how to set thecustom-domain-https-parameters
parameter. This GitHub issue from 2019-07 assumed it would be a JSON object, but ran into trouble using it anyway. As of 2020-03, it is apparently being worked on.
If you do use an SDK or client, make sure it doesn't end up using the
"CDN managed" certs, either because it doesn't let you specify KeyVault
certificate source parameters, or because it internally uses an API
version lower than 2018-04-02
Why use the Linux runtime for the Function apps?
When the Acme
Function app receives the cert from Let's
Encrypt, it combines the cert with the private key and uploads it to the
KeyVault. Then it tells the CDN to use this cert.
When I was initially coding up the Function app, I was doing it on Windows. I had no problem with combining and uploading the cert to the KeyVault, but the CDN API to make CDN use the cert would fail. To be sure, I also did it manually from the Azure Portal, and it failed in the same way:
The server (leaf) certificate doesn't include a private key or the size of the private key is smaller than the minimum requirement.
I was sure the cert I uploaded to KeyVault absolutely did have a private key, and that the key was 4096 bits, so this error did not make any sense. The error message also didn't say anything about what "the minimum requirement" might be. The closest thing to any requirements I could find was this page that lists the CAs that Azure CDN allows, and it does contain DST Root CA X3 (Let's Encrypt's parent CA).
I filed a support request on 2019-05-13. Surprisingly, from 2019-05-14, the error message from the CDN API changed to:
We were unable to read the private key for the certificate provided. The server (leaf) certificate private key may be corrupted.
The timing was probably a coincidence, but it did at least make it seem the key length was a red herring. However, I was still confident that the KeyVault certificate did contain a private key. To be even more confident, I spun up a local nginx server and it was able to use the cert without any problems.
There was one more red herring in the subsequent back-and-forth between me and the product developers (via customer support). Specifically, the product developers said:
Verify the uploaded certificate is a KeyVault Secret, not a KeyVault Certificate
... as if implying that CDN does not support using KeyVault certificates, only secrets. But I did not believe this, since even CDN's own documentation for user-managed certificates explicitly talks about using KeyVault certificates. (The KeyVault certificate object obviously does not contain the private key that Azure CDN would need to serve HTTPS, but every certificate that's uploaded to KeyVault also implicitly creates a KeyVault secret that holds the full certificate, including its private key. The KeyVault certificate object contains a reference to this KeyVault secret object, and Azure CDN uses this to determine the KeyVault secret.)
Eventually they realized the issue was that the private key in the
cert was not marked "exportable", so even though it existed CDN was not
able to access it. This is apparently a Windows-specific feature that
works by adding an msPKI-Private-Key-Flag
attribute to the
private key, that contains a CT_FLAG_EXPORTABLE_KEY
flag. Windows and Windows tooling checks for the presence of this
attribute and flag, and artificially rejects access to the private key
if the flag is unset.
There are search results that mention using the
System.Security.Cryptography.X509Certificates.X509KeyStorageFlags.Exportable
flag. But since I was using
System.Security.Cryptography.X509Certificates.RSACertificateExtensions.CopyWithPrivateKey
to generate the combined cert, I could not see where I would set this
attribute. Setting it on the original public cert's
X509Certificates2
object did not help, and there was no way
to do it on the System.Security.Cryptography.RSA
object for
the private key either.
I thought about using a third-party library, but BouncyCastle
was the only one I'd heard
of and it doesn't support .Net Core. There is an unofficial fork Portable.BouncyCastle
that claims to
support .Net Core but still
internally uses API that is only implemented in .Net Framework, so I
could not assume that code which compiled would be guaranteed to work.
In any case, I didn't want to use a third-party library for the same
reason I didn't want to use the Azure .Net SDK - worrying about keeping
the dependency up-to-date.
Since this is a Windows-specific attribute and only artificially
prevents accessing the private key, I figured that non-Windows tooling
would not have this problem. Indeed, openssl ignores the attribute and
can export the key just fine, which is why my nginx server had no
problem using the cert despite the "non-exportable" private key.
Furthermore, .Net Core uses openssl to implement the
System.Security.Cryptography
API on Linux, and I confirmed
that CDN was able to use the cert just fine when I generated it with
RSACertificateExtensions.CopyWithPrivateKey
on Linux.
So I decided to not waste any more time figuring out the right incantation of API to make it work on Windows, and settled on using the Linux runtime for the Function app.