SSL/TLS: Difference between revisions

From OSDev.wiki
Jump to navigation Jump to search
[unchecked revision][unchecked revision]
Content deleted Content added
m Bot: Replace deprecated source tag with syntaxhighlight
 
(19 intermediate revisions by 3 users not shown)
Line 7: Line 7:
'''WARNING''': implementing your own TLS layer is no guarantee of security. It is indeed recommended to never even write your own implementation of known, secure cryptographic algorithms as multiple attacks have been known to exploit some faults in the implementation. Writing your own TLS layer is however useful if you want to understand how SSL/TLS works and/or if you want to access Websites which are only available through HTTPS.
'''WARNING''': implementing your own TLS layer is no guarantee of security. It is indeed recommended to never even write your own implementation of known, secure cryptographic algorithms as multiple attacks have been known to exploit some faults in the implementation. Writing your own TLS layer is however useful if you want to understand how SSL/TLS works and/or if you want to access Websites which are only available through HTTPS.


There are a few tools that can assist you when developing your own TLS layer. First of all, [http://www.wireshark.org Wireshark] is a free tool that captures network traffic and explains in details how the different packets are composed, down to the signification of each byte (save the encrypted parts). Also, Python can an invaluable tool to prototype and verify your cryptographic algorithms (you might want to write a prototype of a TLS connection in Python first). Python indeed natively supports very large integers (e.g. 1024-bit ones), and it has several libraries for things it cannot do out of the box such as [https://www.dlitz.net/software/pycrypto/ PyCrypto] which contains most cryptographic primitives required or [https://pypi.python.org/pypi/scapy-ssl_tls/1.2.2 Scapy SSL] that allows you to forge SSL packets. These two tools can greatly help testing how TLS works.
There are a few tools that can assist you when developing your own TLS layer. First of all, [http://www.wireshark.org Wireshark] is a free tool that captures network traffic and explains in details how the different packets are composed, down to the signification of each byte (save the encrypted parts). Also, Python can be an invaluable tool to prototype and verify your cryptographic algorithms (you might want to write a prototype of a TLS connection in Python first). Python indeed natively supports operations over very large integers like 1024-bit integers, and it has several cryptography libraries such as [https://www.dlitz.net/software/pycrypto/ PyCrypto] or [https://pypi.python.org/pypi/scapy-ssl_tls/1.2.2 Scapy SSL] that allows you to forge SSL packets. These two tools can greatly help testing how TLS works.


==Cryptography==
==Cryptography recap==

A SSL/TLS connection is actually using a whole set of cryptographic algorithms called a cipher suite. On top of that, SSL/TLS does not support one but multiple cipher suites. An SSL/TLS connection might use a completely different cipher suite depending on what the client and server support. Fully supporting TLS would actually require to implement a whole series of cipher suites. Fortunately, implementing only a few popular cipher suites is enough for most cases. You can use SSL Labs [https://www.ssllabs.com/ssltest/ SSL test] to check what cipher suites are supported by various Web servers.

===Cryptography recap===


Here are the main types of cryptographic algorithms:
Here are the main types of cryptographic algorithms:
Line 37: Line 33:
* Asymmetric encryption (e.g. RSA): one party generates a private/public key pair and makes the public key readily available. Anybody can encrypt data using that public key, but only the owner of the private key can decrypt it
* Asymmetric encryption (e.g. RSA): one party generates a private/public key pair and makes the public key readily available. Anybody can encrypt data using that public key, but only the owner of the private key can decrypt it
* Symmetric encryption (e.g. AES): both parties need to use a shared secret key to encrypt and decrypt data
* Symmetric encryption (e.g. AES): both parties need to use a shared secret key to encrypt and decrypt data
* Signing (RSA): one party generates a private/public key pair and makes the public key readily available. Only the owner of the private key can sign data, but anybody with the public key can verify that the signature matches the data
* Signing (e.g. RSA): one party generates a private/public key pair and makes the public key readily available. Only the owner of the private key can sign data, but anybody with the public key can verify that the signature matches the data
* Message Authentication Cipher aka MAC (e.g. HMAC): generates a signature using a secret key
* Message Authentication Cipher aka MAC (e.g. HMAC): generates a signature using a secret key
* Cryptographic hash (e.g. SHA1, SHA256): generates a signature of some data, but it is very hard to find another data that would generate the same signature
* Cryptographic hash (e.g. SHA1, SHA256): generates a signature of some data, but it is very hard to find another data that would generate the same signature


==Cipher Suites==
===How TLS uses those===


A SSL/TLS connection is actually using a whole set of cryptographic algorithms called a cipher suite. On top of that, SSL/TLS does not support one but multiple cipher suites. An SSL/TLS connection might use a completely different cipher suite depending on what the client and server support. Fully supporting TLS would actually require to implement a whole series of cipher suites. Fortunately, implementing only a few popular cipher suites is enough for most cases.
We will study how things work with TLS version 1.2 using the TLS_DHE_RSA_AES_128_CBC_SHA cipher suite. This cipher suite indicates the algorithm used for the key exchange (DHE, using RSA for verification), for the actual encryption/decryption (AES 128-bit in CBC mode) and verification (HMAC+SHA1). This cipher suite thus requires to implement the following:

You can get an exhaustive list of the cipher suites available [http://www.thesprawl.org/research/tls-and-ssl-cipher-suites/ here], and use SSL Labs' [https://www.ssllabs.com/ssltest/ SSL test] to check what cipher suites are supported by various Web sites.

A TLS cipher suite is denoted as TLS_[key exchange protocol]_WITH_[encryption]_[authentication]. Because TLS is using symmetric encryption, the two parties first need to determine a common secret key over a non-secure connection. This is where the key exchange protocol is used. Then the data is encrypted. Last but not least, it is authenticated using a MAC, in order to make sure that data was not tampered with.

We will study how things work with TLS version 1.2 when the TLS_DHE_RSA_WITH_AES_128_CBC_SHA cipher suite is used. This cipher suite indicates the algorithms used for the key exchange (DHE, using RSA for verification), for the actual encryption/decryption (AES 128-bit in CBC mode) and verification (HMAC+SHA1). This cipher suite thus requires to implement the following:


* The Diffie-Hellman Ephemeral (DHE) key exchange protocol. This protocols relies on [https://en.wikipedia.org/wiki/Modular_exponentiation modular exponentiation] over very large numbers, although it is possible to get past it if security is not your primary goal
* The Diffie-Hellman Ephemeral (DHE) key exchange protocol. This protocols relies on [https://en.wikipedia.org/wiki/Modular_exponentiation modular exponentiation] over very large numbers, although it is possible to get past it if security is not your primary goal
Line 53: Line 55:
Note that you can easily find on the Internet source code for AES, SHA1, SHA256 and HMAC.
Note that you can easily find on the Internet source code for AES, SHA1, SHA256 and HMAC.


This cipher suite is not the strongest available, but is still relatively popular and shows the key mechanisms of a secure TLS interaction. Another cipher suite useful to implement is TLS_RSA_AES_128_CBC_SHA. The only difference is that the key exchange is using RSA instead of Diffie-Hellman. People interested in implementing a stronger suite can look at TLS_ECDHE_RSA_AES_128_GCM which requires to implement the Elliptic Curve version of Diffie-Hellman as well as the Galois Counter Mode (GCM) instead of the easier-to-implement CBC mode.
This cipher suite is not the strongest available, but is still relatively popular and shows the key mechanisms of a secure TLS interaction. Another cipher suite useful to implement is TLS_RSA_WITH_AES_128_CBC_SHA. The only difference is that the key exchange is using RSA instead of Diffie-Hellman. People interested in implementing a stronger suite can look at TLS_ECDHE_RSA_WITH_AES_128_GCM which requires to implement the Elliptic Curve version of Diffie-Hellman as well as the Galois Counter Mode (GCM) instead of the easier-to-implement CBC mode.


==TLS Packets==
Aside from the cipher suite, TLS defines its own PRF (Pseudo-Random Function) which is used to generate pseudo-random data. Here is an implementation example in Python:

<source lang="python">
def HMAC_hash(secret, val):
h = HMAC.new(secret, digestmod=SHA256)
h.update(val)
return h.digest()

def P_hash(secret, seed, size):
A = seed
result = ''
while size > 0:
A = HMAC_hash(secret, A)
result += HMAC_hash(secret, A+seed)
size -= 32
return result

def PRF(secret, label, seed, size):
return P_hash(secret, label+seed, size)[0:size]
</source>

The PRF here will generate [size] bytes of pseudo-random data based on the secret, the label and the seed.

Note the use of SHA256, even though the cipher suite specifies SHA1. TLS 1.2 requires at least SHA256 for its PRF (SHA384 if the cipher suite is using SHA384).

==Handshake==


Any communication in TLS starts with a 5-byte TLS Record header:
Any communication in TLS starts with a 5-byte TLS Record header:


<source lang="C">
<syntaxhighlight lang="C">
typedef struct __attribute__((packed)) {
typedef struct __attribute__((packed)) {
uint8_t content_type;
uint8_t content_type;
uint16_t version;
uint16_t version;
uint16_t length;
uint16_t length;
} TLSRecordHeader;
} TLSRecord;
</syntaxhighlight>
</source>


This header may be followed by another TLS header, such as a TLS Handshake header. Like for a TCP connection, a TLS connection starts with a handshake between the client and the server:
This header may be followed by another TLS header, such as a TLS Handshake header or a TLS Application Data header.


=== Record Types ===
* The client sends a Client Hello message, including a list of 32-byte list of random data and the list of its supported cipher suites. In our example we only send one supported cipher suite (code 0x0033)
* The server responds with a Server Hello message, telling the client what cipher suite is going to be used as well as its own 32-byte list of random data
* The server sends its certificates. These are used by the client to verify that it is actually talking to the site it thinks it is talking to, as opposed to a malicious site
* The server sends a Server Key Exchange message, initiating the key exchange and signing it with its public key
* The server sends a Server Hello Done message, indicating it is waiting for the client
* The client sends a Client Key Exchange message, containing its part of the key exchange transaction
* The client sends a Change Cipher Spec message
* The client sends a Encrypted Handshake Message
* The server sends a Change Cipher Spec
* The server sends a Encrypted Handshake Message
* The client and the server can communicate by exchanging encrypted Application Data messages


{| class="wikitable"
The Change Cipher Spec message tells the other party its is OK with the terms of the handshake.
|-
! Value (Hex)
! Description
|-
| 0x14
| Change Cipher Spec
|-
| 0x15
| Alert
|-
| 0x16
| Handshake
|-
| 0x17
| Application Data
|}


The Encrypted Handshake messages are the first ones to be sent encrypted. They contain a hash of the initial handshake messages and are here to ensure these were not tampered with.


=== Handshake Records===
Any subsequent communication is of type Application Data and encrypted.


<syntaxhighlight lang="C">
==Key Exchange==
typedef struct __attribute__((packed)) {
uint8_t handshake_type;
uint8_t[3] length;
} HandshakeRecordHeader;
</syntaxhighlight>


{| class="wikitable"
TLS encryption is performed using symmetric encryption. The client and server thus need to agree on a secret key. This is done in the key exchange protocol.
|-

! Value (Hex)
In our example, TLS is using the DHE/RSA algorithms: the Diffie-Hellman Ephemeral protocol is used to come up with the secret key, and the server is using the RSA protocol to sign the numbers it sends to the client (the signature is linked to its SSL certificate) to ensure that a third party cannot inject a malicious number. The upside of DHE is that it is using a temporary key that will be discarded afterwards. Key exchange protocols such as DH or RSA are using numbers from the SSL certificate. As a result, a leak of the server's private key (for example through [https://en.wikipedia.org/wiki/Heartbleed Heartbleed]) means that a previously recorded SSL/TLS encryption can be decrypted. Ephemeral key exchange protocols such as DHE or ECDHE offer so-called forward secrecy and are safe even if the server's private key is later compromised.
! Description

|-
Diffie-Hellman Ephemeral works as follows:
| 0x00

| Hello Request
* The server comes up with a secret number y, with a number g and a modulo p (p typically being a 1024 bit integer) and sends (p, g, pubKey=g<sup>y</sup> mod p) to the client in its "Server Key Exchange" message. It also sends a signature of the Diffie-Hellman parameters (see SSL Certificate section)
|-
* The client comes up with a secret number x and sends pubKey=g<sup>x</sup> mod p to the server in its "Client Key Exchange" message
| 0x01
* The client and server derive a common key premaster_secret = (g<sup>x</sup>)<sup>y</sup> mod p = (g<sup>y</sup>)<sup>x</sup> mod p = g<sup>xy</sup> mod p. If p is large enough, it is extremely hard for anyone knowing only g<sup>x</sup> and g<sup>y</sup> (which were transmitted in clear) to find that key.
| Client Hello

|-
Because computing g<sup>xy</sup> mod p using 1024-bytes integers can be tedious in most programming languages, if security is not a concern, one way to avoid this is to use x=1. This way, premaster_secret is just g<sup>y</sup> mod p, a value directly sent by the server. The security in such a case is of course compromised.
| 0x02

| Server Hello
premaster_key is however only a first step. Both client and server uses the PRF function to come up with a 48-byte master secret. The PRF function is used once again to generate a 104-bytes series of data which will represent all the secret keys used in the conversation (the length may differ depending on the cipher suite used):
|-

| 0x0B
<source lang="python">
| Certificate
# g_y, g and p are provided in the Server Key Exchange message
|-
# The client determines x
| 0x0C
premaster_secret = pow(g_y, x, p)
| Server Key Exchange

|-
# client_random and sever_random are the 32-bytes random data from the Client Hello and Server Hello messages
| 0x0D
master_secret = PRF(premaster_secret, "master secret", client_random + server_random, 48)
| Certificate Request
keys = PRF(master_secret, "key expansion", server_random + client_random, 104)
|-

| 0x0E
# The MAC keys are 20 bytes because we are using HMAC+SHA1
| Server Hello Done
client_write_MAC_key = keys[0:20]
|-
server_write_MAC_key = keys[20:40]
| 0x0F
# The client and server keys are 16 bytes because we are using AES 128-bit aka a 128 bit = 16 bytes key
| Certificate Verify
client_write_key = keys[40:56]
|-
server_write_key = keys[56:72]
| 0x10
# The IVs are always 16 bytes because AES encrypts blocks of 16 bytes
| Client Key Exchange
client_write_IV = keys[72:88]
|-
server_write_IV = keys[88:104]
| 0x14
</source>
| Finished

|}
Note how different secret keys are used for the client and for the server, as well as for encryption and to compute the MAC.

==Another Key Exchange: Elliptical Curve Diffie Hellman==

If Diffie-Hellman is a very powerful algorithm, it requires very large numbers to be considered secure (1024-bit at minimum). A variant is Elliptical Curve Diffie-Hellman, which is much harder to break even with 256-bit numbers. Numerous TLS cipher suites now rely on the ECDHE_RSA key exchange instead of DHE_RSA.

ECDH works as follows: consider a point Q = (x, y) on a curve y<sup>2</sup> = x<sup>3</sup> + a.x + b mod p. Both parties come up with secret numbers d1 and d2, and will send each other d1.Q and d2.Q (d1.Q means adding Q to itself d1 times). The shared secret key is d1.d2.Q.

TLS can use the ECDHE key exchange to come up with an ephemeral shared secret key the following way:

* The server indicates in the Server Key Exchange message what type of curve is going to be used (secp256r1 is a very common one). This tells what parameters a, b, p and G to use (see [http://www.secg.org/sec2-v2.pdf] to see the domain parameters for each curve)
* The server comes up with a random 256-bit number (or whatever the curve says) server_secret and sends pubKey = G*server_secret in the Server Key Exchange message. pubKey is sent as a 65-bytes block composed of 0x04 | Gx | Gy (both numbers being 32-bytes long)
* The client comes up with a random 256-bit number client_secret and sends pubKey = G*client_secret in the Client Key Exchange message. pubKey is sent in the same format as the server's
* Both parties will derive premaster_secret by computing server_pubKey * client_secret = client_pubKey * server_secret = G * client_secret * server_secret and taking the x coordinate of this result
* Once premaster_secret is determined, the rest of the computation works the same regardless of the key exchange protocol used

Regarding how to compute elliptic curve point multiplication, Wikipedia offers [https://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication more details]. Note that, because we are only dealing with large integers, you should use [https://en.wikipedia.org/wiki/Modular_multiplicative_inverse modular multiplicative inverse] instead of divisions.

If you want to test Elliptic Curves in Python, [https://pypi.python.org/pypi/tinyec TinyEC] is a very useful package (along with the [https://github.com/alexmgr/tinyec source code] in pure Python):

<source lang="python">
import tinyec.ec as ec
import tinyec.registry as reg

# Get the domain parameters for the named curve specified in the Server Key Exchange message
c = reg.get_curve("secp256r1")

# Comes up with a random 256-bit (32 bytes) client_secret
# c.g is the default point on the elliptic curve (defined through the domain parameters)
# We multiply it with client_secret to obtain the public key
client_pubKey = c.g * client_secret
# Retrieved from the Server Key Exchange message
server_pubKey = ...

premaster_secret = (server_pubKey * client_secret).x
</source>

==SSL Certificate (optional)==

In order to prevent a Man-In-The-Middle attack (MITM), the server will sign the Diffie-Hellman parameters it sent to the client. Because the client may have never contacted the server before (and thus cannot securely obtain its public key), the client and server rely on a trusted third party known as a Certificate Authority (CA).

In order to verify the signature using the RSA algorithm, the client need to do the following:

* Retrieve the Certificate message sent by the server, which contains one or more certificates (look at a such a packet in Wireshark)
* Verify that the first certificate's RDN sequence (signedCertificate/subject:rdnSequence/rdnSequence) contains the Web site the client is trying to contact
* Get the RSA e and n values from the first certificate's public key (signedCertificate/subjectPublicKeyInfo/subjectPublicKey). Those parameters are encoded using the [https://en.wikipedia.org/wiki/Abstract_Syntax_Notation_One ASN.1] format (as a verification, e is very often 65537, or 0x10001)
* Compute the hash of the whole DH parameters (as sent by the server) preceded with the client and server random data. The certificate indicates what type of hash to use (signedCertificate/subjectPublicKeyInfo/algorithm):
* Compute signature<sup>e</sup> mod n, convert it to a string and take the last 20 bytes
* Both computations should be the same
* Because this certificate is probably generated by an intermediate CA, the client needs to verify that certificate
* Compute the hash of the whole signedCertificate section and repeat the operation using the next certificate
* Follow the certificate chain up to the end. The last certificate should belong to a root CA (any TLS implementation should contain a list of the root CAs and their public key) and is self-signed

==Encrypted Handshake Message==

The TLS handshake is concluded with the two parties sending a hash of the complete handshake exchange, in order to ensure that a middleman did not try to conduct a downgrade attack.

If your TLS client technically does not have to verify the Encrypted Handshake Message sent by the server, it needs to send a valid Encrypted Handshake Message of its own, otherwise the server will abort the TLS session.

Here is what the client needs to do to create :

* Compute a SHA256 hash of a concatenation of all the handshake communications (or SHA384 if the PRF is based on SHA384). This means the Client Hello, Server Hello, Certificate, Server Key Exchange, Server Hello Done and Client Key Exchange messages. Note that you should concatenate only the handshake part of each TLS message (i.e. strip the first 5 bytes belonging to the TLS Record header)
* Compute PRF(master_secret, "client finished", hash, 12) which will generate a 12-bytes hash
* Append the following header which indicates the hash is 12 bytes: 0x14 0x00 0x00 0x0C
* Encrypt the 0x14 0x00 0x00 0x0C | [12-bytes hash] (see the Encrypting / Decrypting data section). This will generate a 64-bytes ciphertext
* Send this ciphertext wrapped in a TLS Record


==TLS Handshake==
The server will use a similar algorithm, with two notable differences:


* It needs to compute a hash of the same handshake communications as the client as well as the decrypted "Encrypted Handshake Message" message sent by the client (i.e. the 16-bytes hash starting with 0x1400000C)
Like for a TCP connection, a TLS connection starts with a handshake between the client and the server to establish the cipher suite used. See [[TLS Handshake]] for more details.
* It will call PRF(master_secret, "server finished", hash, 12)


==TLS Encryption==
==Encrypting / Decrypting data==


Once the handshake has completed, all data sent both way is encrypted using the algorithms and secret keys agreed upon during the TLS Handshake. See [[TLS Encryption]] for more information.
Any encrypted data in this example is using AES 128-bit in CBC mode. AES encrypts 128-bit (16 bytes) blocks of data using a 128, 192 or 256-bit secret key. The CBC mode tells how to use AES to encrypt some plaintext which is not 16-bytes long.


==Math operations on large integers==
The following steps needs to be implemented:


Most cipher suites require to perform operations on large integers (e.g. 1024-bit integers) - something that low-level languages such as C/C++ do not handle out of the box. You can either port an existing library (such as [https://gmplib.org/ GMP]) to your operating system or write your own large integer library.
* Create an intermediary plaintext which concatenates:
** The 8-bytes sequence number. This number is 0 for handshake messages, 1 for the first application data message, 2 for the next application data message, etc.
** The 1-byte content type (0x16 for a handshake message, 0x17 for an application data message)
** The TLS version (0x0303)
** The 2-bytes plaintext length
** The original plaintext
* Compute the MAC on that intermediary plaintext using HMAC+SHA1 and the client/server_write_MAC_key
* The final plaintext will be the concatenation of [original plaintext] + [20-bytes MAC] + [CBC padding]. Because AES-CBC only encrypts data whose size is a multiple of 16, the CBC padding is composed of bytes to fill to 16 (16 full bytes if the plaintext size is already a multiple of 16). The value of each of those padding bytes is the length of the padding + 1. So in the case of a 16-bytes plaintext, the final plaintext would be [16-bytes plaintext] | [20 bytes MAC] | 0x0B0B0B0B0B0B0B0B0B0B0B0B
* Come up with a random 16-bytes IV (or you can use client/server_write_IV)
* Encrypt the final plaintext using the client/server_write_key and this IV
* The ciphertext is the concatenation of IV + ciphertext


The most common operation (used among others by the RSA and Diffie-Hellman Ephemeral key exchange) is modular exponentiation i.e. computing a<sup>b</sup> mod c. This operation requires to implement multiplication, addition and modulo (which in turns requires to implement comparison, multiplication and subtraction) over large integers. You can find on Wikipedia an [https://en.wikipedia.org/wiki/Modular_exponentiation#Right-to-left_binary_method algorithm] for modular exponentiation which is quite efficient even when using large numbers.
<source lang="python">
from Crypto.Hash import *
from Crypto.Cipher import AES


For better performance, it is recommended to use the full power of 32 (or 64-bit) numbers instead of performing operations one bit at a time. Here is an example:
def to_n_bytes(number, size):
h = '%x' % number
s = ('0'*(size*2 - len(h)) + h).decode('hex')
return s


<syntaxhighlight lang="C">
def encrypt(plaintext, iv, key_AES, key_MAC, seq_num, content_type):
typedef struct {
hmac = HMAC.new(key_MAC, digestmod=SHA)
uint16_t size; // number of 32-bit words
plaintext_to_mac = to_n_bytes(seq_num, 8) + to_n_bytes(content_type, 1) + '\x03\x03' + to_n_bytes(len(plaintext), 2) + plaintext
uint32_t *data;
hmac.update(plaintext_to_mac)
} LargeInt;
mac_computed = hmac.digest()


// Adds a to b and stores the result in a
cipher = AES.new(key_AES, AES.MODE_CBC, iv)
void LargeInt_add(LargeInt *a, LargeInt *b) {
plaintext += mac_computed
uint64_t carry = 0;
padding_length = 16 - (len(plaintext) % 16)
uint32_t *carry32 = (uint32_t*)&carry;
if padding_length == 0:
int size;
padding_length = 16
if (a->size > b->size) size = b->size;
else size = a->size;


for (int i=0; i< size; i++) {
padding = chr(padding_length - 1) * padding_length
carry = (uint64_t)a->data[i] + (uint64_t)b->data[i] + carry;
ciphertext = cipher.encrypt(plaintext + padding)
a->data[i] = carry32[0];
carry >>= 32;
}


if (size+1 <= a->size)
return ciphertext
a->data[size] = carry32[0];
}
</syntaxhighlight>


The same principle can be applied to multiplication: you can use the traditional [https://en.wikipedia.org/wiki/Multiplication_algorithm#Long_multiplication long multiplication algorithm], but instead of multiplying decimal digits (or even worse, bits), you can instead 32-bit "digits" which drastically reduce the number of operations to perform.
def decrypt(message, key_AES, key_MAC, seq_num, content_type, debug=False):
iv = message[0:16]
cipher = AES.new(key_AES, AES.MODE_CBC, iv)
decoded = cipher.decrypt(message[16:])


===Mind the endian===
padding = to_int(decoded[-1:]) + 1
plaintext = decoded[0:-padding-20]
mac_decrypted = decoded[-padding-20:-padding]


Remember that the data sent across the network is in big endian, whereas x86 computers use little endian. Last but not least, the premaster_secret passed to the PRF should be in big endian. Don't forget to perform the necessary conversions.
hmac = HMAC.new(key_MAC, digestmod=SHA)
plaintext_to_mac = to_n_bytes(seq_num, 8) + to_n_bytes(content_type, 1) + '\x03\x03' + to_n_bytes(len(plaintext), 2) + plaintext
hmac.update(plaintext_to_mac)
mac_computed = hmac.digest()


== See Also ==
if debug:
print('Decrypted: [' + decoded.encode('hex') + ']')
print('Plaintext: [' + plaintext.encode('hex') + ']')
print('MAC (decrypted): ' + to_hex(mac_decrypted))
print('MAC (computed): ' + to_hex(mac_computed))
print('')


* [https://tools.ietf.org/html/rfc5246 The TLS 1.2 Specifications]
return plaintext
</source>


[[Category:Network Protocols]]
[[Category:Network Protocols]]

Latest revision as of 05:43, 9 June 2024

This page is a work in progress.
This page may thus be incomplete. Its content may be changed in the near future.

SSL/TLS is a protocol used to ensure a secure connection in various standard networking protocols (HTTP, FTP, etc.). Even though people talk about SSL, this protocol has been since mostly replaced with TLS (versions 1.0, 1.1 or 1.2). SSL should not be used anymore as it is not considered secure.

In order to setup an HTTPS connection, SSL/TLS is used between TCP and HTTP. In other word, the HTTP command sent by the Web browser and the HTML returned by the server are encrypted using SSL/TLS.

WARNING: implementing your own TLS layer is no guarantee of security. It is indeed recommended to never even write your own implementation of known, secure cryptographic algorithms as multiple attacks have been known to exploit some faults in the implementation. Writing your own TLS layer is however useful if you want to understand how SSL/TLS works and/or if you want to access Websites which are only available through HTTPS.

There are a few tools that can assist you when developing your own TLS layer. First of all, Wireshark is a free tool that captures network traffic and explains in details how the different packets are composed, down to the signification of each byte (save the encrypted parts). Also, Python can be an invaluable tool to prototype and verify your cryptographic algorithms (you might want to write a prototype of a TLS connection in Python first). Python indeed natively supports operations over very large integers like 1024-bit integers, and it has several cryptography libraries such as PyCrypto or Scapy SSL that allows you to forge SSL packets. These two tools can greatly help testing how TLS works.

Cryptography recap

Here are the main types of cryptographic algorithms:

Public/private key Secret key No key
Encryption Asymmetric encryption Symmetric encryption
Verification Signing Message Authentication Cipher Cryptographic hash
  • Asymmetric encryption (e.g. RSA): one party generates a private/public key pair and makes the public key readily available. Anybody can encrypt data using that public key, but only the owner of the private key can decrypt it
  • Symmetric encryption (e.g. AES): both parties need to use a shared secret key to encrypt and decrypt data
  • Signing (e.g. RSA): one party generates a private/public key pair and makes the public key readily available. Only the owner of the private key can sign data, but anybody with the public key can verify that the signature matches the data
  • Message Authentication Cipher aka MAC (e.g. HMAC): generates a signature using a secret key
  • Cryptographic hash (e.g. SHA1, SHA256): generates a signature of some data, but it is very hard to find another data that would generate the same signature

Cipher Suites

A SSL/TLS connection is actually using a whole set of cryptographic algorithms called a cipher suite. On top of that, SSL/TLS does not support one but multiple cipher suites. An SSL/TLS connection might use a completely different cipher suite depending on what the client and server support. Fully supporting TLS would actually require to implement a whole series of cipher suites. Fortunately, implementing only a few popular cipher suites is enough for most cases.

You can get an exhaustive list of the cipher suites available here, and use SSL Labs' SSL test to check what cipher suites are supported by various Web sites.

A TLS cipher suite is denoted as TLS_[key exchange protocol]_WITH_[encryption]_[authentication]. Because TLS is using symmetric encryption, the two parties first need to determine a common secret key over a non-secure connection. This is where the key exchange protocol is used. Then the data is encrypted. Last but not least, it is authenticated using a MAC, in order to make sure that data was not tampered with.

We will study how things work with TLS version 1.2 when the TLS_DHE_RSA_WITH_AES_128_CBC_SHA cipher suite is used. This cipher suite indicates the algorithms used for the key exchange (DHE, using RSA for verification), for the actual encryption/decryption (AES 128-bit in CBC mode) and verification (HMAC+SHA1). This cipher suite thus requires to implement the following:

  • The Diffie-Hellman Ephemeral (DHE) key exchange protocol. This protocols relies on modular exponentiation over very large numbers, although it is possible to get past it if security is not your primary goal
  • Encryption and decryption using AES 128-bit in CBC mode
  • The SHA1 and SHA256 cryptographic hashing algorithm
  • HMAC, a Message Authentication Code (MAC). A MAC is similar to a cryptographic hash function except that it requires a secret key
  • Optional: if you want to verify the server certificate, you will need to implement the RSA algorithm, which also relies on modular exponentiation as well as SHA1/SHA256/SHA384 (depending on the certificate chain)

Note that you can easily find on the Internet source code for AES, SHA1, SHA256 and HMAC.

This cipher suite is not the strongest available, but is still relatively popular and shows the key mechanisms of a secure TLS interaction. Another cipher suite useful to implement is TLS_RSA_WITH_AES_128_CBC_SHA. The only difference is that the key exchange is using RSA instead of Diffie-Hellman. People interested in implementing a stronger suite can look at TLS_ECDHE_RSA_WITH_AES_128_GCM which requires to implement the Elliptic Curve version of Diffie-Hellman as well as the Galois Counter Mode (GCM) instead of the easier-to-implement CBC mode.

TLS Packets

Any communication in TLS starts with a 5-byte TLS Record header:

typedef struct __attribute__((packed)) {
	uint8_t content_type;
	uint16_t version;
	uint16_t length;
} TLSRecordHeader;

This header may be followed by another TLS header, such as a TLS Handshake header or a TLS Application Data header.

Record Types

Value (Hex) Description
0x14 Change Cipher Spec
0x15 Alert
0x16 Handshake
0x17 Application Data


Handshake Records

typedef struct __attribute__((packed)) {
	uint8_t handshake_type;
	uint8_t[3] length;
} HandshakeRecordHeader;
Value (Hex) Description
0x00 Hello Request
0x01 Client Hello
0x02 Server Hello
0x0B Certificate
0x0C Server Key Exchange
0x0D Certificate Request
0x0E Server Hello Done
0x0F Certificate Verify
0x10 Client Key Exchange
0x14 Finished

TLS Handshake

Like for a TCP connection, a TLS connection starts with a handshake between the client and the server to establish the cipher suite used. See TLS Handshake for more details.

TLS Encryption

Once the handshake has completed, all data sent both way is encrypted using the algorithms and secret keys agreed upon during the TLS Handshake. See TLS Encryption for more information.

Math operations on large integers

Most cipher suites require to perform operations on large integers (e.g. 1024-bit integers) - something that low-level languages such as C/C++ do not handle out of the box. You can either port an existing library (such as GMP) to your operating system or write your own large integer library.

The most common operation (used among others by the RSA and Diffie-Hellman Ephemeral key exchange) is modular exponentiation i.e. computing ab mod c. This operation requires to implement multiplication, addition and modulo (which in turns requires to implement comparison, multiplication and subtraction) over large integers. You can find on Wikipedia an algorithm for modular exponentiation which is quite efficient even when using large numbers.

For better performance, it is recommended to use the full power of 32 (or 64-bit) numbers instead of performing operations one bit at a time. Here is an example:

typedef struct {
	uint16_t size;  // number of 32-bit words
	uint32_t *data;
} LargeInt;

// Adds a to b and stores the result in a
void LargeInt_add(LargeInt *a, LargeInt *b) {
	uint64_t carry = 0;
	uint32_t *carry32 = (uint32_t*)&carry;
	int size;
	if (a->size > b->size) size = b->size;
	else size = a->size;

	for (int i=0; i< size; i++) {
		carry = (uint64_t)a->data[i] + (uint64_t)b->data[i] + carry;
		a->data[i] = carry32[0];
		carry >>= 32;
	}

	if (size+1 <= a->size)
		a->data[size] = carry32[0];
}

The same principle can be applied to multiplication: you can use the traditional long multiplication algorithm, but instead of multiplying decimal digits (or even worse, bits), you can instead 32-bit "digits" which drastically reduce the number of operations to perform.

Mind the endian

Remember that the data sent across the network is in big endian, whereas x86 computers use little endian. Last but not least, the premaster_secret passed to the PRF should be in big endian. Don't forget to perform the necessary conversions.

See Also