Digital signatures and certificates are widely used to prove the authenticity of data, and they underpin much of the security of the Internet. Most famously, they protect websites, assuring users that they are connected to the genuine site. They are also used in many other places, including code signing and e-mail signing and encryption.
Certificates link a signature to a specific originator; an attacker who can forge one of these certificates can make web browsers trust spoofed websites, have arbitrary code trusted, and fake protected e-mail.
The process of verifying a digital signature used with a certificate is relatively expensive. Two things must be verified: not only must the digital signature be valid, but the certificate linking it to the originator must be valid too. Without the latter step, an attacker could sign data with a key of their own and claim to be any originator. The process is shown in Figure 1.
In Figure 1, the inputs to the validation process are the digital signature, the data that it signs (such as an e-mail, an application or an HTTPS connection), and a certificate. The certificate contains the public key that can verify the signature, details about the originator, and links to a Public Key Infrastructure (PKI) that is used to authenticate the certificate as genuine. These inputs feed into the two verifications that must take place: verifying the certificate and verifying the digital signature. The certificate and its links to the environment’s trusted PKI are used to verify the certificate. With the certificate verified, the public key it contains can be trusted and used, together with the digital signature itself and the data that the signature purportedly signs, to verify the digital signature.
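The two-step flow described above can be sketched in a few lines of Python. This is only an illustration of the ordering of the checks: the `Certificate` type and the two injected functions, `verify_certificate` and `verify_signature`, are hypothetical stand-ins for a real PKI and cryptographic library, not any particular API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Certificate:
    """Hypothetical, heavily simplified certificate."""
    subject: str       # details about the originator
    public_key: bytes  # key used to verify the originator's signatures
    raw: bytes         # encoded certificate exactly as it was received


def validate_signed_data(
    data: bytes,
    signature: bytes,
    cert: Certificate,
    verify_certificate: Callable[[Certificate], bool],
    verify_signature: Callable[[bytes, bytes, bytes], bool],
) -> bool:
    """Return True only if both verifications succeed."""
    # Step 1: check the certificate against the trusted PKI.
    if not verify_certificate(cert):
        return False
    # Step 2: only now is the public key trusted, so use it to
    # check the signature over the data.
    return verify_signature(cert.public_key, signature, data)
```

Note that the signature check is never reached if the certificate check fails, because the public key cannot be trusted until the certificate has been verified.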
To improve performance, systems can remember a certificate once they have validated it. If more data then arrives from the same originator (with the same certificate), there is no need to repeat the certificate validation step, and only one of the two verifications need be performed. But a mistake in matching up certificates can undermine the validation process, and this is what happened in Windows’ certificate validation engine. The flaw was disclosed as CVE-2022-34689, published late last year.
Windows’ certificate validation engine implements this optimization, though it is disabled by default and an application has to choose to use it (specifically, the optimization is enabled by passing the CERT_CHAIN_CACHE_END_CERT flag to the CertGetCertificateChain API). When enabled, the validation engine may end up storing large numbers of validated certificates, so it needs to be able to search the stored certificates quickly.
This is done by creating an index that can be searched quickly, rather than by comparing a new certificate against each stored one in turn. The index is built by using a hash function to give each certificate a small identifier that is treated as unique. The index is a sorted map from the hash value to the certificate. A sorted map can be searched quickly, so it is quick to check whether a certificate has already been validated.
When new data arrives with a digital signature that uses a certificate, the certificate is hashed to determine its identifier, and the store’s index is searched for it. If it is found, there is no need to validate the certificate, as its presence in the store indicates it has already been successfully validated. If it is not found, the certificate is put through the validation process and, if that succeeds, it and its hash value are added to the store. This works fine, but it assumes no two certificates have the same hash value.
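The lookup logic described above can be sketched as follows. This is a minimal illustration, not Windows’ implementation: the class name, the `validate` callback and the `full_validations` counter are all hypothetical, and a Python dict stands in for the sorted map.

```python
import hashlib
from typing import Callable, Dict


class ValidatedCertCache:
    """Toy store of validated certificates, indexed by their hash."""

    def __init__(self, validate: Callable[[bytes], bool]) -> None:
        self._validate = validate            # the expensive full validation
        self._index: Dict[bytes, bytes] = {}  # hash value -> certificate
        self.full_validations = 0            # counts expensive checks

    @staticmethod
    def _key(cert_bytes: bytes) -> bytes:
        # MD5 is used here only because it is the weak hash the
        # article discusses; it must not be relied on for security.
        return hashlib.md5(cert_bytes).digest()

    def is_trusted(self, cert_bytes: bytes) -> bool:
        key = self._key(cert_bytes)
        if key in self._index:
            # Hit: trusted on the strength of the hash alone. The
            # certificate bytes themselves are never compared.
            return True
        self.full_validations += 1
        if not self._validate(cert_bytes):
            return False
        self._index[key] = cert_bytes
        return True
```

The weakness is visible in the hit branch: any certificate whose hash matches a stored entry is accepted without being validated and without being compared to the stored certificate.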
An ideal hash algorithm would never calculate the same hash value for two different pieces of data. In practice, because a hash value has a fixed size and the data can be any size, different data can give the same result; when this happens, it is called a hash collision. With a good hash algorithm, though, it is extremely unlikely that different data taken at random will have the same hash value.
Additionally, for a completely secure hash algorithm, it is not possible (technically, it is “computationally infeasible”) to create some data that hashes to a specific value. This would mean it is not possible to create a certificate that has the same hash as another. In practice, however, weaknesses have been discovered in older hash algorithms, such as MD5, that make this possible, and this is the reason Windows’ optimization was flawed. The use of MD5 means an attacker can create a certificate of their own that has the same hash as a genuine certificate that Windows has already validated. When Windows checks the attacker’s certificate, it finds the certificate’s hash in the index and so assumes it has been successfully validated.
One of the techniques for generating hash collisions is known as the “Chosen Prefix” method. This involves injecting arbitrary data into the attacker’s certificate such that it ends up having the same hash value as a genuine certificate (Figure 2). The arbitrary data is added in such a way that it is ignored when the certificate is looked at — it is redundant and has no effect on the processing of the certificate; it only changes the data’s hash.
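The effect of injecting ignored data can be illustrated with a toy example. The sketch below is not the Chosen Prefix technique, which needs specialized tooling to run against MD5. Instead it assumes a deliberately weak, hypothetical 16-bit hash (`weak_hash`) so that a simple search can find padding bytes that make a forged certificate collide with a genuine one. The certificate strings are made up for illustration.

```python
import hashlib
from itertools import count


def weak_hash(data: bytes) -> bytes:
    """Deliberately weak 16-bit hash, for illustration only."""
    return hashlib.sha256(data).digest()[:2]


def find_colliding_padding(genuine: bytes, forged_body: bytes) -> bytes:
    """Search for padding that makes forged_body + padding
    hash to the same value as the genuine certificate."""
    wanted = weak_hash(genuine)
    for n in count():
        padding = n.to_bytes(8, "big")
        if weak_hash(forged_body + padding) == wanted:
            return padding


if __name__ == "__main__":
    genuine = b"subject=example.com;key=GENUINE"
    # The trailing field is assumed to be ignored by certificate processing.
    forged_body = b"subject=example.com;key=ATTACKER;ignored="
    forged = forged_body + find_colliding_padding(genuine, forged_body)
    print(forged != genuine, weak_hash(forged) == weak_hash(genuine))
```

With only 16 bits of hash the search finishes almost instantly. The weaknesses in MD5 are what bring a comparable result within an attacker’s reach against a 128-bit hash.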
This is where Forcepoint’s Zero Trust Content Disarm and Reconstruction (CDR) can help.
The Zero Trust CDR solution is designed to deliver clean, malware-free data, but it does not work by looking for malware. Instead, it extracts the useful business information from the original data and then discards that data, building new data to carry the extracted information. Any malware is discarded along with the original data because it is not useful. The new data is safe to deliver because it is built in the normal way that applications expect.
Zero Trust CDR thwarts “Chosen Prefix” hash collision attacks because the arbitrary data that is injected to make the hashes collide is not useful and so gets discarded. This means the newly built certificate that is delivered does not have the hash of any other certificate. So an attacker’s attempt to fool Windows into trusting their certificate is thwarted because they cannot get their own certificate to have the same hash as a trusted one (Figure 3).
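The extract-and-rebuild idea can be sketched with a toy certificate format. This is an assumption-laden illustration and not how the product is implemented: the `name=value;...` encoding, the `KNOWN_FIELDS` list and both function names are hypothetical.

```python
import hashlib

# Hypothetical list of the fields that carry useful information.
KNOWN_FIELDS = ("subject", "issuer", "public_key")


def parse_fields(raw: bytes) -> dict:
    """Extract name=value pairs from the toy encoding."""
    fields = {}
    for item in raw.decode("ascii").split(";"):
        name, sep, value = item.partition("=")
        if sep:
            fields[name] = value
    return fields


def rebuild(raw: bytes) -> bytes:
    """Build a new certificate from the useful fields only.
    Anything else, such as injected padding, is discarded."""
    fields = parse_fields(raw)
    kept = (f"{name}={fields[name]}" for name in KNOWN_FIELDS if name in fields)
    return ";".join(kept).encode("ascii")


if __name__ == "__main__":
    received = b"subject=example.com;issuer=Example CA;public_key=ABCD;padding=9f3a"
    rebuilt = rebuild(received)
    print(rebuilt)
    # The injected padding is gone, so the hash is no longer the
    # value the attacker engineered.
    print(hashlib.md5(received).hexdigest() != hashlib.md5(rebuilt).hexdigest())
```

Because the rebuilt data contains only the fields that were extracted, the attacker no longer controls the bytes that are hashed.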
Another example of Zero Trust CDR protecting systems that use certificates was described in our earlier article about CVE-2022-0778. There, a dangerous cryptographic elliptic curve is embedded into a certificate, which causes a validating application to hang when it attempts to understand the curve. Zero Trust CDR also inherently protects against another elliptic curve cryptography vulnerability, CVE-2020-0601. In that case the structures inside a certificate are made inconsistent, fooling some software into using an unauthorized weak elliptic curve. Zero Trust CDR builds a consistent certificate, using only the strong, authorized curve, and so prevents the attack.
Zero Trust CDR is often thought of as a defense against vulnerabilities in applications handling complex data formats, like Microsoft Word and PDF. But formats such as digital signatures and certificates are also relatively complex, so vulnerabilities can appear in the applications that handle them. Using Zero Trust CDR to deliver normal, consistent data helps defend these applications.