Primitives and Interfaces

We next define (informally, but then more formally), two important pieces of the language used in Tink, the Primitive and the Interface.

Primitive

A primitive is a mathematical object corresponding to all algorithms performing some task securely. For example, the AEAD primitive consists of all encryption algorithms which satisfy the security properties which Tink requires of an Aead.

We stress that primitives are not bound to a programming language, or a specific way of accessing them. Instead, one should think of the primitive as the purely mathematical object. For example, if we consider AEAD, fundamentally it will consist of pairs of functions, one which performs encryption, and one which performs decryption.

Interfaces

An interface is a way in which we provide users access to a primitive. For example, we expect that in the future Tink will provides a Mac interface, but also a StreamingMac interface, which allows to compute the mac of data which is not directly loaded into memory.

Note that we explicitly distinguish interfaces and primitives here. This should make clear that the mathematical object to which these two interfaces give access are the same.

Formal definitions

For most readers, the above intuitive explanations are probably enough. Nevertheless, we feel that it can be important sometimes to provide formal definitions of these concepts.

Cryptographic functions

The concept of a cryptographic function is not as important as the concept of a primitive, but we need to introduce it to formally define primitive.

Cryptographic Function

A cryptographic function is a map

\[ f: {\bf K} \times {\bf R} \times {\bf I} \to {\bf O}\]

from a set \({\bf K}\) (the key space), a set \({\bf R} = \{0,1\}^{\infty}\) (randomness, which we assume to be the set of infinite bitstrings), and a set \({\bf I}\) (the input space), to a set \({\bf O}\) (the output space).

It will become clear later why we added a specific randomness parameter.

As an example, we show one possibility how these concepts can be mapped to AES-GCM. For each valid key size \(s_k\), nonce size \(s_n\), and tag size \(s_t\), AES-GCM consists of two cryptographic functions, one for encryption, and one for decryption. Both will have the same key space \({\bf K} = \{0,1\}^{s_k}\).

For the encryption function \(\mathrm{Enc}\), the first \(s_n\) bits of randomness will be used to select the nonce.

Let \({\bf B} = \{0,1\}^8\) denote a byte. The input space of the encryption function is the pairs \({\bf I} = {\bf B}^{*} \times {\bf B}^{*}\) of pairs of byte strings of arbitrary length. The first element of the pair is meant to be the message, the second element is the associated data. The AES-GCM standard has an upper limit on the lengths of the inputs, but we prefer to allow arbitrary lengths, and instead add a special error symbol \(\bot\) to the output space. The output space then becomes \({\bf O} = {\bf B}^* \cup \{\bot\}\), where we arbitrarily define the result of successful computations as the concatenation \((\mathrm{IV} \| \mathrm{ciphertext} \| \mathrm{tag})\) as given in the standard, and output \(\bot\), in case some input is too long. Hence, for a fixed key, the encryption function becomes of type \(\mathrm{Enc}_k : {\bf R} \times {\bf B}^* \times {\bf B}^* \rightarrow {\bf B}^* \cup \{\bot\}\).

For the decryption function \(\mathrm{Dec}\) the key space is the same. The input space coincidentally is the same: \({\bf I} ={\bf B}^* \times {\bf B}^*\), but now the first element is meant to be the output of the encryption function, while the second one is still the associated data.

The output space also happens to be the same \({\bf O} = {\bf B}^* \cup \{\bot\}\) (again a coincidence). The interpretation is somewhat different, as \(\bot\) usually denotes an authentication error (though it will also be the output in case some input is too long).

We stress that the above formalization is not the only option to formalize the standard. For example, one could consider the nonce a part of the input, instead of reading it from the randomness (which results in a very different primitive). Alternatively, one could define the output as a triple containing the nonce, the ciphertext, and the tag (instead of the concatenation). Or one could restrict the key space (somewhat arbitrarily) to \({\bf K} = \{0,1\}^{128} \cup \{0,1\}^{256}\).

Cryptographic Algorithm:

A (symmetric) cryptographic algorithm is a tuple

\[(f_1, ... f_k)\]

of cryptographic functions, where all functions have the same key space. The type of the cryptographic algorithm is the tuple \((({\bf I}_1, {\bf O}_1), \ldots, ({\bf I}_k, {\bf O}_k))\).

For example, for each valid triple \((s_k, s_n, s_t)\) of key, nonce, and tag size, AES-GCM\({}_{s_k, s_n, s_t}\) is a cryptographic algorithm with the two functions \(\mathrm{Enc}\) and \(\mathrm{Dec}\) described above.

Primitives and interfaces

We next define a cryptographic primitive.

Primitive
A primitive is a set of cryptographic algorithms, where all the algorithms have the same type \((({\bf I}_1, {\bf O}_1), \ldots, ({\bf I}_k, {\bf O}_k))\), and the key spaces of the algorithms are pairwise disjoint.

As an example, consider the \(\mathrm{AEAD}\) primitive in Tink. It has multiple algorithms, among those are AES-GCM for key sizes 128 and 256 bits, with nonce size 96 bits, AES-EAX with some key sizes, and XChaCha20Poly1305. They have disjoint key spaces, but all provide the same cryptographic functions \(\mathrm{Enc}\) and \(\mathrm{Dec}\). (We do not see a purpose in somehow collapsing different key sizes of AES-GCM in this formal discussion, but of course one could do so).

Defining primitives

The usual way of thinking of primitives is to first define properties of the cryptographic functions, and then simply considering the primitive to be all such algorithms.

For example, for AEAD we would say that \(\mathrm{Dec}_k(\mathrm{Enc}_k(m, a), a) = m\) is 'always' satisfied (except e.g. if the plaintext \(m\) is too long). In addition, we have security properties; for example, for a random key, the encryption is semantially secure.

The AEAD primitive is then simply the set of all cryptographic algorithms which satisfy these properties. In other words, in practice, when we define a specific primitive, we define it based on properties. We do not give a list of algorithms, as the definition suggests.

Interfaces

An interface in Tink gives access to a primitive, in the sense that it allows to compute an element of the output space from the input space. For example, consider the AEAD interface in Java:

public interface Aead {
  byte[] encrypt(byte[] plaintext, byte[] associated_data) throws GeneralSecurityException;
  byte[] decrypt(byte[] ciphertext, byte[] associated_data) throws GeneralSecurityException;
}

Note that we do not give access to the randomness. Instead, we allow the user to provide elements of the input space. Disallowing access to the randomness is of course on purpose.1

Tink sometimes offers multiple interfaces for a single primitive. This can be very useful, as requirements sometimes differ. Still, doing this comes at a price: in general, the more interfaces one offers, the lower interoperability is. For example, imagine that someone writes a library based on Tink that requires the user to pass in an Aead object (to encrypt something internally). If Tink offers too many different interfaces to the \(\mathrm{AEAD}\) primitive, chances are high that the user does not have an instance ready which works for the key the user picked and the library at the same time. Hence, adding more interfaces is a trade-off.


  1. AEAD ciphers have the property that they are secure against chosen ciphertext attacks, which is guaranteed only if there is no reuse of the nonce. The Aead interface in Tink is designed such that it prevents nonce reuse: the user cannot provide a nonce as input for encryption, instead, a new nonce is randomly generated for each encrypt operation.