
Overview on Selective Encryption of Image and Video: Challenges and Perspectives

Abstract

In traditional image and video content protection schemes, called fully layered, the whole content is first compressed. Then, the compressed bitstream is entirely encrypted using a standard cipher (DES, AES, IDEA, etc.). The specific characteristics of this kind of data (high transmission rate with limited allowed bandwidth) make standard encryption algorithms inadequate. Another limitation of fully layered systems is that they alter the whole bitstream syntax, which may disable some codec functionalities. Selective encryption is a new trend in image and video content protection. It consists of encrypting only a subset of the data. The aim of selective encryption is to reduce the amount of data to encrypt while preserving a sufficient level of security. This computational saving is very desirable, especially in constrained communications (real-time networking, high-definition delivery, and mobile communications with devices of limited computational power). In addition, selective encryption allows preserving some codec functionalities such as scalability. This tutorial is intended to give an overview of selective encryption algorithms. The theoretical background of selective encryption, potential applications, challenges, and perspectives are presented.

1. Introduction

Because of the explosion of networks and the huge amount of content transmitted over them, securing video content is becoming more and more important. A traditional approach to content access control is to first encode the data with a standard compressor and then to perform full encryption of the compressed bitstream with a standard cipher (DES, AES, IDEA, etc.). In this scheme, called fully layered, compression and encryption are totally disjoint processes. The media stream is processed as classical text data, with the assumption that all symbols or bits in the plaintext are of equal importance. This scheme is relevant when the transmission of the content is unconstrained. In situations where only a few resources are available (real-time networking, high-definition delivery, low memory, low power, or limited computation capabilities), this approach seems inadequate. Shannon [1] pointed out the specific characteristic of image and video content, a high transmission rate within a limited allowed bandwidth, which explains the inadequacy of standard cryptographic techniques for such content. Another limitation of the fully layered scheme is that it alters the original bitstream syntax; therefore, many functionalities of the encoding scheme may be disabled (e.g., scalability). Some recent works explored a new way of securing the content, variously named partial encryption, selective encryption, soft encryption, or perceptual encryption, by applying encryption to a subset of the bitstream. The main goal of selective encryption is to reduce the amount of data to encrypt while achieving a required level of security. An additional feature of selective encryption is to preserve some functionalities of the original bitstream (e.g., scalability). The general approach is to separate the content into two parts. The first part is the public part; it is left unencrypted and made accessible to all users. The second part is the protected part; it is encrypted, and only authorized users have access to it. One important feature of selective encryption is to make the protected part as small as possible.

How to define the public and protected parts depends on the target application. In some applications (video on demand, database search, etc.), it could be desirable to encourage customers to buy the content. For this purpose, only a soft visual degradation is applied, so that an attacker would still understand the content but prefer to pay to access the full-quality unencrypted content. However, for sensitive data (e.g., military images and videos), hard visual degradation could be desirable to completely disguise the visual content. The peak signal-to-noise ratio (PSNR) is the common criterion used to evaluate visual degradation.

This paper is intended to give an overview of state-of-the-art selective encryption algorithms. We introduce selective encryption in close link to Shannon's work on information theory in Section 1.1. Evaluation criteria for selective encryption algorithms are presented in Section 1.2. In Section 1.3, we give one classification of selective encryption algorithms. Section 2 proposes potential applications of selective encryption. In Section 3, we present a summary of different selective encryption algorithms, their advantages, and their limitations. In Section 4, based on the previous discussion, we discuss the principal challenges and perspectives for selective encryption.

1.1. Shannon and Selective Encryption

In [2], Lookabaugh pointed out the close link between selective encryption and Shannon's work on communication and security [1]. It is well known that the statistics of image and video data differ greatly from those of classical text data. Indeed, image and video data are strongly correlated and have strong spatial/temporal redundancy. In addition, contrary to banking information or other highly sensitive information, image and video content has a high information rate with low value from the security point of view. Shannon highlighted the relationship between source statistics and ciphertext security: a secure encryption scheme should remove all the redundancies in the plaintext, so that no exploitable correlation is observed in the ciphertext. Shannon introduced the equivocation function as a measure of how uncertain a cryptanalyst is of the plaintext after observing a set of ciphertexts. Figure 1 illustrates this definition. The unicity distance is defined as the minimum number of ciphertext blocks required to yield a unique solution in a ciphertext-only attack; it is given by

$$ n_0 = \frac{H(K)}{D} \qquad (1) $$

where H(K) is the key entropy and D is the plaintext redundancy (per symbol). From this, we can say that the less redundant the source is, the more secure the ciphertext is. Shannon favors a fully layered system (see Figure 2), where perfect lossless compression is first performed to remove "all" redundancy from the plaintext (a perfect compressor achieves a rate equal to the source entropy), and then full encryption is applied. Shannon argues that the compressor should be perfect; this means that, given a plaintext P, let C be its compression by the perfect compressor. We can split C into two parts, C1 and C2. Then, let E(C1) and E(C2) be the encryptions of C1 and C2 by the encryption algorithm E (see Figure 2). Perfect compression implies that if we know only C2, then C1 is completely unpredictable. This can be demonstrated by contradiction: if the statement were false, then an extra prediction block would yield additional compression of C1 based on C2, which is impossible since the compression is assumed to be perfect [3]. This result is very interesting: consider a configuration where only a subset of the compressed bitstream requires protection (say C1); we can then replace the encryption block by a selective encryption one. Only the protected subset C1 is encrypted (as illustrated in Figure 3), and the security of the ciphertext is preserved for the same reasons discussed above, under the assumption that all redundancies of the source were removed: C1 is protected and unpredictable from the public part C2 because the compressor is perfect.
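As a purely illustrative numeric check of equation (1) (the values below are assumptions, not taken from the cited works), a 128-bit key and an English-like redundancy of about 3.2 bits per character give a unicity distance of roughly 40 characters:

```python
# Toy check of equation (1) with assumed, illustrative values.
H_K = 128.0  # key entropy H(K) in bits (assumption: 128-bit key)
D = 3.2      # plaintext redundancy D in bits per character (assumed, English-like)

n0 = H_K / D  # minimum ciphertext needed for a unique solution
print(f"unicity distance ~ {n0:.0f} characters of ciphertext")  # ~40
```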

Figure 1
figure 1

Key equivocation function.

Figure 2
figure 2

Fully layered system: the whole compressed bitstream is encrypted.

Figure 3
figure 3

In perfect compression configuration, a subset of the bitstream can be encrypted; protected part is not predictable from the public one.

Hence, good compression greatly helps the security of selective encryption. The only question that remains is which part to encrypt to obtain a desired visual degradation. In Shannon's theory, the energy of the "perfectly" compressed plaintext is uniformly distributed; thus, encrypting a fraction of the compressed plaintext would yield the same fraction of distortion on the ciphertext. However, most existing compression algorithms are not perfect and concentrate information energy unevenly in the bitstream; for example, in JPEG, the bits that encode the DC coefficients have a stronger impact on the reconstruction quality than those of the AC coefficients. In wavelet-based compression algorithms, most of the signal energy is concentrated in the lower resolutions. One advantage of energy concentration is that it gives a hint about which part of the bitstream to encrypt. Most state-of-the-art selective encryption algorithms exploit this energy concentration.
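As a rough, self-contained illustration of this energy concentration (not taken from the cited works; SciPy is assumed to be available), the following sketch applies an orthonormal 2D DCT to a smooth synthetic 8x8 block and measures how much energy falls into the DC coefficient and the low-frequency corner:

```python
# Sketch: how a DCT concentrates the energy of a smooth block
# into a few low-frequency coefficients.
import numpy as np
from scipy.fft import dctn

rng = np.random.default_rng(0)
x, y = np.meshgrid(np.arange(8), np.arange(8))
block = 100 + 5 * x + 3 * y + rng.normal(0, 2, (8, 8))  # smooth 8x8 "image block"

coeffs = dctn(block, norm="ortho")   # orthonormal 2D DCT-II
energy = coeffs ** 2
total = energy.sum()

print("fraction of energy in the DC coefficient :", energy[0, 0] / total)
print("fraction in the 3x3 low-frequency corner :", energy[:3, :3].sum() / total)
```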

This gap between theoretical selective encryption, which assumes perfect compression, and existing selective encryption algorithms makes the security aspect more difficult to evaluate. In most cases, visual degradation is used as the exclusive security measure of selective encryption, by assuming that harder visual distortion implies more security. It turns out that this argument is not valid, as can be observed in the related works discussed below.

1.2. Evaluation Criteria

We need to define a set of evaluation criteria that will help in evaluating and comparing selective encryption algorithms. Some criteria listed below are gathered from the literature; we also introduce new criteria that were not considered previously.

(I) Tunability (T)

Most of the algorithms proposed in the literature use a static definition of the encrypted part and of the encryption parameters. This property limits the usability of an algorithm to a restricted set of applications. It is very desirable to be able to dynamically define the encrypted part and the encryption parameters with respect to different applications and requirements.

(II) Visual Degradation (VD)

This criterion measures the perceptual distortion of the cipher image (or video) with respect to the plain image (or video). It assumes that the cipher image (or video) can be decoded and viewed without decryption, an assumption that is not satisfied by all existing algorithms. In some applications, it could be desirable to achieve only a moderate visual degradation, so that a viewer would still understand the content but prefer to pay to access the unencrypted version. However, for sensitive data (e.g., military images/videos), high visual degradation could be desirable to completely disguise the visual content. For this reason, the tunability property is very important, since it allows tuning the visual degradation of the encrypted content depending on the target application and requirements. The peak signal-to-noise ratio (PSNR) is the main metric used in the literature to measure visual degradation. Visual degradation is a subjective criterion, which is why it is difficult to define a threshold for acceptable visual distortion for a given application.
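For reference, a minimal PSNR computation between a plain image and its (selectively) encrypted version could look like the sketch below; the peak value of 255 assumes 8-bit samples.

```python
# Sketch: PSNR, the metric most commonly used to report visual degradation.
import numpy as np

def psnr(reference: np.ndarray, distorted: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; lower values mean harder degradation."""
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```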

(III) Cryptographic Security (CS)

Most of the research works on selective encryption evaluate the security level based only on visual degradation. In [5], Tang proposes a selective encryption algorithm based on DES encryption of DC coefficients and replacing the zigzag scan of the AC coefficients by a random permutation. The visual degradation achieved is very high, but the cryptographic security of the algorithm is very weak as pointed out in [6, 7]. The cryptographic security should rely on

  1. (i)

    the encryption key (of a well-scrutinized encryption algorithm),

  2. (ii)

    unpredictability of the encrypted part.

This criterion will be explained in more detail in Section 4.1.2.

(IV) Encryption Ratio (ER)

This criterion measures the ratio between the size of the encrypted part and the whole data size. Selective encryption aims to minimize the encryption ratio.

(V) Compression Friendliness (CF)

A selective encryption algorithm is considered compression friendly if it has no or very little impact on data compression efficiency. Some selective encryption algorithms impact data compressibility or introduce additional data that is necessary for decryption. It is desirable that this impact remains limited.

(VI) Format Compliance (FC)

The encrypted bitstream should remain compliant with the compression format: any standard decoder should be able to decode the encrypted bitstream without decryption. This property is very important because it allows preserving some features of the compression algorithm used (e.g., scalability).

(VII) Error Tolerance (ET)

This criterion has received little attention in the literature, yet it is very desirable, especially over error-prone networks. Since standard ciphers are required to have a strong avalanche effect, a single bit error occurring in the encrypted bitstream during transmission will propagate to many other bits after decryption. This causes decoding failures or significant distortion of the plain data at the receiver side. A challenge is to design a secure selective encryption algorithm that trades off a strong avalanche effect against error tolerance.
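The sketch below illustrates this effect under the assumption that the third-party pycryptodome package is installed (the key and payload are arbitrary): a single flipped ciphertext bit in AES-CBC corrupts a whole 16-byte block, plus one bit of the following block, after decryption.

```python
# Sketch: error propagation of a chained cipher (AES-CBC) after a 1-bit channel error.
import os
from Crypto.Cipher import AES  # pycryptodome (assumed available)

key, iv = os.urandom(16), os.urandom(16)
plain = b"selective encryption demo payload".ljust(48, b"\0")  # three 16-byte blocks

ciphertext = AES.new(key, AES.MODE_CBC, iv).encrypt(plain)
corrupted = bytearray(ciphertext)
corrupted[20] ^= 0x01  # flip a single bit inside the second ciphertext block

decoded = AES.new(key, AES.MODE_CBC, iv).decrypt(bytes(corrupted))
damaged = sum(a != b for a, b in zip(plain, decoded))
print(f"{damaged} plaintext bytes damaged by a single-bit channel error")
```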

1.3. Classification of Selective Encryption Algorithms

One possible classification of selective encryption algorithms is relative to when encryption is performed with respect to compression. This classification is adequate since it has intrinsic consequences on the behavior of selective encryption algorithms. We consider three classes of algorithms, as follows.

(I) Precompression

Selective encryption algorithms of this class perform encryption before compression (resp., decompression before decryption) (see Figure 4). Note that these algorithms are inherently format compliant and generally inapplicable with lossy compression. Finally, in most cases, performing encryption prior to compression causes bandwidth expansion, which adversely impacts compression efficiency. Hence, this class of algorithms is generally not compression friendly.

Figure 4
figure 4

Precompression approach.

(II) Incompression

Selective encryption algorithms from this class perform joint compression and encryption (resp., joint decompression and decryption) (see Figure 5). Algorithms from this class imply modifications of both encoder and decoder which may adversely impact format compliance and compression friendliness.

Figure 5
figure 5

Incompression approach.

(III) Postcompression

Selective encryption algorithms of this class perform compression before encryption (resp., decryption before decompression) (see Figure 6). This class of algorithms is generally compression friendly; only a small overhead may be introduced to send the encryption key or some information about the encryption. Encryption and decryption do not require modifications at the encoder or decoder sides. Finally, it was suggested in [8] that the postcompression class is inherently non-format-compliant. In this paper, we give examples of existing algorithms that achieve format compliance by using pattern-constrained encryption.

Figure 6
figure 6

Postcompression approach.

2. Applications

Digital multimedia content is increasingly used over networks and public channels (cable, satellite, wireless networks, Internet, etc.), which are insecure transmission media. Many applications that exploit these channels (pay-TV, videoconferencing, medical imaging, etc.) need to rely on access control systems to protect their content. Standard cryptographic techniques can guarantee a high level of security, but at the cost of expensive implementations and significant transmission delays. Selective encryption comes as an alternative that aims at providing sufficient security with an important gain in computational complexity and delay. This opens a variety of possible applications for selective encryption; below, we give a set of potential ones.

(I) Mobile Communication

PDAs, mobile phones, and other mobile terminals are more and more used for multimedia communication (voice, image, video, etc.) while still requiring copyright protection and access control. Their moderate resolution, computational power, and limited battery life call for reducing the computational complexity of encryption in order to save battery life, silicon area, and cost. Image and video content has lower value than, for example, banking information; thus, it is not necessary to encrypt the whole data. It is enough to degrade the content quality so that people would prefer to buy a full-quality version.

(II) Monitoring Encrypted Content

One can imagine situations where the encrypted content itself is usable for monitoring. For example, in many applications such as military imaging, video surveillance (where some faces have to be scrambled), or media audience measurement, identifying partially encrypted content without decryption can be desirable.

(III) Multiple Encryptions

Efficient overlay of more than one encryption system within a single bitstream can be very desirable. Consider a TV broadcaster that uses the proprietary encryption system of one supplier and wants to introduce the encryption systems of new, independent suppliers; it would like to optimize bandwidth use by avoiding the duplication of every channel on the network. Selective encryption can be very helpful here: only a small fraction of each channel, the part that will be encrypted, is duplicated. Each duplicated part goes through one supplier's equipment and is encrypted by its encryption system. The remaining (shared) part is sent once over the network, in the clear. Sony's Passage system proposed for the US cable market is a concrete example of this application [9]. This solution is particularly desirable when the suppliers are not willing to agree on a shared scrambling solution as done in DVB Simulcrypt [10].

(IV) Transcodability/Scalability of Encrypted Content

These are very desirable properties in image and video communication. Some compression algorithms, such as JPEG-2000, allow natural transcodability/scalability thanks to their embedded-bitstream nature. For other algorithms, it is necessary to decompress and recompress at a lower bitrate at intermediate routers of the transmission channel. When the content is fully encrypted, decryption, decompression, recompression at a lower bitrate, and reencryption are needed at intermediate routers. This may cause important transmission delays and defeats the security of the system, since access to the encryption key is needed at the network nodes. Selective encryption can be a good response to this problem. Encrypting a small fraction of the content while sending the remainder in the clear allows transcodability and scalability without access to the encryption keys: the basic part (needed by all users) is sent in the clear (unencrypted), while the encrypted enhancement part is sent only to authorized users who paid to access the full-quality content.

(V) Database Search

Selectively encrypted content can be used as low-quality previews that are made public. Such previews can serve as a catalog from which users select content and pay to be able to decrypt and view it.

(VI) Renewable Security Systems

In their eternal battle against pirates, digital rights management systems have to periodically update their technologies and equipment throughout the network. Changing the whole infrastructure would be very costly. Selective encryption can avoid the burden of having to change a whole system. Thanks to the computational savings of selective encryption, it becomes possible to move to software solutions, which are less expensive and can be updated easily and economically.

3. Related Work

3.1. Precompression

Tang, 1996. The basic idea of the selective encryption algorithm proposed in [5] is to selectively encrypt the I-frames of the MPEG stream: the DC coefficients are encrypted with DES (preferably in CBC mode to avoid dictionary attacks), and the standard zigzag scan of the AC coefficients is replaced by a random permutation. This is done before compression.
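The sketch below illustrates only the permutation idea (it is not Tang's exact construction, and the key-seeded shuffle is a stand-in for a cryptographic permutation generator): the 63 AC positions of an 8x8 block are scanned in a key-derived order instead of the standard zigzag order.

```python
# Sketch: replacing the fixed zigzag scan of the AC coefficients
# by a key-derived pseudorandom permutation.
import random

def keyed_ac_order(key: int) -> list[int]:
    """Return a key-derived permutation of the 63 AC positions (0 is the DC)."""
    order = list(range(1, 64))
    random.Random(key).shuffle(order)  # key-seeded pseudorandom shuffle
    return order

def scan_ac(coeffs_zigzag: list[float], key: int) -> list[float]:
    """Emit the AC coefficients in the keyed order instead of the zigzag order."""
    return [coeffs_zigzag[i] for i in keyed_ac_order(key)]
```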

  1. (a)

    Tunability: the algorithm is not tunable since encryption parameters are static.

  2. (b)

    Visual degradation: since intraframes are very important in MPEG compression (all B- and P-frames are computed relative to I-frames), encrypting them achieves high visual degradation.

  3. (c)

    Cryptographic security: the zigzag scan of the AC coefficients used in I-frame encoding is replaced by a pseudorandom permutation. The statistics of the AC coefficients are preserved; therefore, ciphertext-only, chosen-, and known-plaintext attacks are feasible and allow recovering all AC coefficients. Qiao et al. [6] and Uehara and Safavi-Naini [7] propose cryptanalytic (chosen-plaintext) attacks on this approach. The DC coefficients can be set to a fixed value while still yielding a comprehensible result, and a chosen- or known-plaintext attack can then be conducted to reconstruct the AC coefficients and obtain a semantically good reconstruction [11]. Two conclusions can be drawn. First, energy concentration is not systematically a good criterion for selective encryption. Second, high visual distortion does not imply a high security level.

  4. (d)

    Encryption ratio: not specified.

  5. (e)

    Compression friendliness: the nonoptimal scanning of the DCT coefficients introduces loss in compression efficiency of about 40% [6]. Indeed, this adversely affects Huffman encoding (due to distortion of the probability distribution of run-lengths for AC coefficients).

  6. (f)

    Format compliance: the proposed scheme is compliant to JPEG and MPEG standards.

  7. (g)

    Error tolerance: the proposed algorithm is not tolerant to errors that occur at DC coefficients. The avalanche effect of DES in CBC mode causes important error propagation.

  8. (h)

    Data type: image and video.

Shi and Bhargava, 1998. In [12], the authors proposed the video encryption algorithm (VEA), which uses a secret key to randomly change the signs of all DCT coefficients in an MPEG stream (this is justified by the fact that DCT sign bits are very random, and thus neither predictable nor compressible). In [13], the authors present a new version of VEA with reduced computational complexity; it consists in encrypting the sign bits of the differential values of the DC coefficients of I-frames and the sign bits of the differential values of the motion vectors of B- and P-frames.
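A minimal sketch of this sign-scrambling idea follows (illustrative, not the authors' implementation; the key-seeded generator stands in for the secret key stream). Note that descrambling applies exactly the same operation with the same key.

```python
# Sketch: flipping DCT coefficient signs according to a key-derived bit stream.
import numpy as np

def scramble_signs(coeffs: np.ndarray, key: int) -> np.ndarray:
    """Flip the sign of each coefficient whose key bit is 1 (an involution)."""
    rng = np.random.default_rng(key)            # stand-in key stream generator
    keybits = rng.integers(0, 2, size=coeffs.shape)
    return np.where(keybits == 1, -coeffs, coeffs)
```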

  1. (a)

    Tunability: not tunable, the proposed algorithm relies on static parameters.

  2. (b)

    Visual degradation: high visual degradation due to the encryption of DCT coefficients and motion vectors.

  3. (c)

    Cryptographic security: the first version of VEA [12] is only secure if the secret key is used once. Otherwise, knowing one plaintext and the corresponding ciphertext, the secret key can be computed by XORing their DCT sign bits. Both versions of VEA are vulnerable to chosen-plaintext attacks: for [12], it is feasible to create a repetitive/periodic pattern and compute its inverse DCT; the encryption of the resulting image reveals the key length and even allows computing the secret key.

  4. (d)

    Encryption ratio: not specified.

  5. (e)

    Compression friendliness: not specified.

  6. (f)

    Format compliance: the encrypted bitstream is MPEG compliant.

  7. (g)

    Error tolerance: any error in the motion vector bits may have an important adverse impact on the decodability of the bitstream.

  8. (h)

    Data type: video.

Shi, Wang and Bhargava, 1999. In [14], a new version of the modified VEA presented in [13] is proposed, called the real-time video encryption algorithm (RVEA). It encrypts selected sign bits of the DC coefficients and/or sign bits of the motion vectors using DES or IDEA. Sixty-four sign bits are encrypted per frame (starting with the DC coefficients because they concentrate most of the frame energy).

  1. (a)

    Tunability: not tunable.

  2. (b)

    Visual degradation: changing the sign bit of one DC coefficient affects all the following ones in an I-frame, since they are differentially encoded; the same applies to the motion vectors in P- and B-frames, where a sign change alters not only the direction but also the motion magnitude, since they too are differentially encoded. The visual degradation achieved is therefore very high.

  3. (c)

    Cryptographic security: limiting the encryption to 64 sign bits per frame is not sufficient from the security point of view. Indeed, for high-resolution videos with a high bitrate, these 64 bits represent a very small fraction of the data.

  4. (d)

    Encryption ratio: only 64 bits are encrypted per frame; thus, the encryption reduction depends on the frame bitrate.

  5. (e)

    Compression friendliness: not specified.

  6. (f)

    Format compliance: the proposed scheme is MPEG compliant.

  7. (g)

    Error tolerance: poor error tolerance is achieved due to motion information encryption.

  8. (h)

    Data type: video.

Podesser, Schmidt and Uhl, 2002. In [15], selective bitplane encryption (using AES) is proposed. Several experiments were conducted on 8-bit grayscale images, and the main results are the following: (1) encrypting only the MSB bitplane is not secure, since a replacement attack is possible [15]; (2) encrypting the first two MSB bitplanes gives hard visual degradation; and (3) encrypting three bitplanes gives very hard visual degradation.
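The selection step can be sketched as follows on an 8-bit grayscale image; the XOR keystream below is only a placeholder for the AES encryption used in [15].

```python
# Sketch: scrambling only the most significant bitplanes of an 8-bit image.
import numpy as np

def encrypt_top_bitplanes(img: np.ndarray, key: int, n_planes: int = 3) -> np.ndarray:
    """XOR-scramble the n_planes most significant bitplanes of a uint8 image."""
    rng = np.random.default_rng(key)                     # placeholder keystream
    mask = (0xFF << (8 - n_planes)) & 0xFF               # e.g. 0b11100000 for 3 planes
    noise = rng.integers(0, 256, img.shape, dtype=np.uint8) & np.uint8(mask)
    return img.astype(np.uint8) ^ noise                  # low bitplanes untouched
```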

  1. (a)

    Tunability: the algorithm is not tunable; a fixed number of bits need to be encrypted to guarantee confidentiality.

  2. (b)

    Visual degradation: for an 8-bit-per-pixel uncompressed image, hard visual degradation (a PSNR of about 9 dB) can be observed when at least the 3 most significant bitplanes are encrypted.

  3. (c)

    Cryptographic security: even when a secure cipher (AES) is used, the proposed selective encryption algorithm is vulnerable to replacement attacks [15]. Such an attack does not break AES but replaces the encrypted data with intelligible content. It is worth noting that visual distortion is a subjective criterion and does not measure security, as this example illustrates.

  4. (d)

    Encryption ratio: at least 3 bitplanes out of 8 (i.e., 37.5%) of the bitstream have to be encrypted using AES to achieve sufficient security.

  5. (e)

    Compression friendliness: the algorithm is intended for uncompressed data. Moreover, encrypting the MSB bitplanes introduces an important bandwidth expansion, which adversely impacts the compressibility of the encrypted images.

  6. (f)

    Format compliance: as a precompression algorithm, it is format compliant.

  7. (g)

    Error tolerance: the avalanche effect of AES causes important error propagation.

  8. (h)

    Data type: uncompressed image.

Zeng and Lei, 2003. In [16], selective encryption in the frequency domain (DCT and wavelet domains) is proposed. The general scheme consists of selectively scrambling coefficients using different primitives (selective bit scrambling, block shuffling, and/or rotation).

(I) Wavelet Transform Case

The proposed scheme combines two primitives.

  1. (i)

    Selective bit scrambling: this is a bitplane-level selective encryption. The bits of each coefficient are partitioned into a sign bit, which is very random, uncorrelated with neighboring coefficients' sign bits, and thus highly unpredictable; significance bits (the first nonzero magnitude bit and the preceding zero bits, if any), which give a range for the coefficient value, have low entropy, and are therefore highly compressible; and refinement bits (all remaining bits), which are uncorrelated with neighboring coefficients and randomly distributed. The authors propose to randomly scramble the sign bits and the refinement bits. The encryption algorithm is not specified.

  2. (ii)

    Block shuffling: the basic idea is to shuffle blocks of coefficients in a way that preserves some local spatial correlation; this can achieve sufficient security without compromising compression efficiency. Each subband is split into equal-sized blocks (the block size can differ from one subband to another). Within the same subband, the blocks are shuffled according to a shuffling table generated using a secret key (this table can differ from one subband to another or from one frame to another). Since the shuffling is block based, most local 2D subband statistics are expected to be preserved, and compression is not greatly impacted.

    1. (a)

      Tunability: not tunable.

    2. (b)

      Visual degradation: high visual degradation is achieved. Indeed, a coefficient change at a low resolution propagates to larger areas at higher resolutions.

    3. (c)

      Cryptographic security: attacking the lowest pyramid level of the wavelet decomposition is much simpler (small block size and high energy concentration); this helps reconstructing the subsequent levels by correlation.

    4. (d)

      Encryption ratio: about 20% of the data has to be encrypted.

    5. (e)

      Compression friendliness: little impact on compression efficiency is observed (less than 5%).

    6. (f)

      Format compliance: the algorithm proposed is fully compliant to DWT-based compression since the encryption is performed in the transform domain prior to compression.

    7. (g)

      Error tolerance: depends on the encryption algorithm used to scramble sign bits.

    8. (h)

      Data type: image and video.

(II) DCT Transform Case

The DCT coefficients can be considered as individual local frequency components located in some subband. The same scrambling operations as described above (block shuffling and sign-bit changes) can be applied to these "subbands." I-, B-, and P-frames are processed in different manners. For I-frames, the image is first split into segments of macroblocks (e.g., a segment can be a slice); the blocks/macroblocks of a segment can be spatially disjoint and chosen at random spatial positions within the frame. Within each segment, the DCT coefficients at the same frequency location are shuffled together (in order to preserve the coefficient distribution properties). Then, the sign bits of the AC coefficients are randomly changed, and the DC coefficients (which are always positive for intracoded blocks) are flipped with respect to a threshold (e.g., half the maximum DC value). There may be many intracoded blocks in P- and B-frames; the DCT coefficients of these intracoded blocks are shuffled in the same way. The sign bits of the motion vectors are also scrambled.
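The within-segment shuffling can be sketched as follows (illustrative only, not the authors' implementation): coefficients occupying the same frequency position are permuted across the blocks of a segment with a key-derived permutation.

```python
# Sketch: shuffling DCT coefficients of the same frequency across the blocks of a segment.
import numpy as np

def shuffle_frequency_bands(blocks: np.ndarray, key: int) -> np.ndarray:
    """blocks: (n_blocks, 64) zigzag-ordered DCT coefficients of one segment."""
    rng = np.random.default_rng(key)
    shuffled = blocks.copy()
    for freq in range(blocks.shape[1]):        # one permutation per frequency position
        perm = rng.permutation(blocks.shape[0])
        shuffled[:, freq] = blocks[perm, freq]
    return shuffled
```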

  1. (a)

    Tunability: not tunable.

  2. (b)

    Visual degradation: high visual degradation is achieved. Indeed, most of the image energy is concentrated in the DC coefficients; thus, encrypting them considerably affects the image content.

  3. (c)

    Cryptographic security: vulnerable to chosen and known plaintext attacks since it is based only on permutations. In addition, replacing the DC coefficients with a fixed value still gives an intelligible version of the image.

  4. (d)

    Encryption ratio: if we consider only the AC sign bit encryption, it represents 16 to 20% of the data, which is relatively high [16].

  5. (e)

    Compression friendliness: a bitrate increase by about 20% is observed.

  6. (f)

    Format compliance: compliant with JPEG and MPEG standards.

  7. (g)

    Error tolerance: depends on the encryption algorithm used to scramble sign bits.

  8. (h)

    Data type: image and video.

Van de Ville, Philips, Van de Walle, and Lemahieu, 2004. A particular orthonormal transform is used in this proposal: the discrete prolate spheroidal sequences (DPSSs) [17], a basis adapted to the representation of band-limited signals (which is the case for 2D images). A bandwidth-preserving scrambling is proposed: the image signal is projected onto the DPSS basis, and the transform coefficients are then scrambled using an orthonormal (thus energy-preserving) transform.

  1. (a)

    Tunability: not tunable.

  2. (b)

    Visual degradation: depends on the number of coefficients to scramble.

  3. (c)

    Cryptographic security: a large key space is obtained thanks to the use of equivalent Hadamard matrices in the scrambling. However, statistical correlations exist between the coefficients to encrypt; this leakage has been exploited to mount an error-concealment-based attack (ECA) [18]. Moreover, the Hadamard-matrix-based encryption has insufficient diffusion, which leads to an effective reduction of the key space: experimental results show that when trying 100 random keys, the best recovered image exhibits low visual degradation compared to the unencrypted one.

  4. (d)

    Encryption ratio: variable, it depends on the number of coefficients to scramble.

  5. (e)

    Compression friendliness: this proposal keeps bandwidth expansion limited. However, its major drawback is that the encryption is lossy: the encryption process implies a rounding operation that induces a precision loss (making it inadequate for lossless compression).

  6. (f)

    Format compliance: as a precompression algorithm, it is format compliant.

  7. (g)

    Error tolerance: important error propagation due to the avalanche property of Hadamard matrices used in encryption.

  8. (h)

    Data type: image.

3.2. Incompression

Meyer and Gadegast, 1995. This algorithm, called SECMPEG, is proposed for MPEG selective encryption and modifies the MPEG stream [19]. It uses RSA or DES (in CBC mode) and implements four levels of security.

  1. (i)

    Encrypting all stream headers.

  2. (ii)

    Encrypting all stream headers and all DC and lower AC coefficients of intracoded blocks.

  3. (iii)

    Encrypting I-frames and all I-blocks in P- and B-frames.

  4. (iv)

    Encrypting all the bitstreams.

    1. (a)

      Tunability: the algorithm can be considered as tunable since many security levels are allowed.

    2. (b)

      Visual degradation: the encrypted content is not MPEG compliant, and thus cannot be viewed without decryption.

    3. (c)

      Cryptographic security: many security levels can be obtained. Encrypting only stream headers is not sufficient since this part is easily predictable.

    4. (d)

      Encryption ratio: the number of intracoded blocks in P- and B-frames can be of the same order as the number of blocks in I-frames. This considerably reduces the efficiency of the selective encryption scheme [20].

    5. (e)

      Compression friendliness: no impact is observed on the compression efficiency.

    6. (f)

      Format compliance: the encoder proposed is not MPEG compliant since it requires major additions and changes to the standard; a special encoder/decoder is required to read unencrypted SECMPEG streams.

    7. (g)

      Error tolerance: the ciphers used for encryption have important avalanche properties, especially in CBC mode. Hence, poor error tolerance is achieved.

    8. (h)

      Data type: video.

Wu and Kuo, 2001. In [11, 21], based on a set of observations, the authors point out that energy concentration does not imply intelligibility concentration. They discuss the technique proposed by Tang [5] and show that, by setting the DC coefficients to a fixed value and recovering the AC coefficients (by known- or chosen-plaintext attacks), a semantically good reconstruction of the image is obtained; even using only a very small fraction of the AC coefficients does not fully destroy the semantic content of the image. The authors argue that both orthogonal-transform-based compression algorithms followed by quantization and compression algorithms that end with an entropy-coding stage are bad candidates for selective encryption. They investigate another approach that turns entropy coders into ciphers, and propose two schemes for the most popular entropy coders: multiple Huffman tables (MHTs) for the Huffman coder and multiple state indices (MSI) for the QM arithmetic coder.

(I) MHT

The authors propose a method using multiple Huffman coding tables. Four Huffman tables are published, and millions of different tables are generated using a technique called Huffman tree mutation [11, 21].
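A toy sketch of the idea follows; the four tables are invented prefix codes (not the published mutated JPEG tables), and the key-seeded generator stands in for the secret table-selection sequence.

```python
# Sketch: multiple Huffman tables (MHT) - the key decides which codebook encodes each symbol.
import random

TABLES = [
    {"a": "0",   "b": "10",  "c": "110", "d": "111"},
    {"a": "10",  "b": "0",   "c": "111", "d": "110"},
    {"a": "110", "b": "111", "c": "0",   "d": "10"},
    {"a": "111", "b": "110", "c": "10",  "d": "0"},
]

def mht_encode(symbols: str, key: int) -> str:
    """Encode each symbol with a table chosen by a key-derived index stream."""
    prng = random.Random(key)
    return "".join(TABLES[prng.randrange(len(TABLES))][s] for s in symbols)

print(mht_encode("abcd", key=42))  # undecodable without knowing the table sequence
```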

  1. (a)

    Tunability: not tunable.

  2. (b)

    Visual degradation: very high visual degradation can be achieved.

  3. (c)

    Cryptographic security: Gillman and Rivest [22] showed that decoding a Huffman coded bitstream without any knowledge about the Huffman coding tables would be very difficult. However, the basic MHT is vulnerable to known and chosen plaintext attacks as pointed out in [23].

  4. (d)

    Encryption ratio: variable, it depends on the size of the data to encrypt. Indeed, the larger the data is, the smaller the relative size of the Huffman table will be.

  5. (e)

    Compression friendliness: no impact on compression is observed, the encryption does not affect the probability distribution of symbols.

  6. (f)

    Format compliance: not compliant, the decoder needs to decrypt the Huffman table to be able to decompress.

  7. (g)

    Error tolerance: since Huffman coding relies on variable-length codes, any single codeword error may propagate to many subsequent codewords.

  8. (h)

    Data type: image and video.

(II) MSI

The arithmetic QM coder is based on an initial state index; the idea is to select 4 published initial state indices and to use them in a random but secret order.

  1. (a)

    Tunability: not tunable.

  2. (b)

    Visual degradation: very high visual degradation can be achieved.

  3. (c)

    Cryptographic security: high security level; it is very difficult to decode the bitstream without knowing the state indices used to initialize the QM coder.

  4. (d)

    Encryption ratio: very low encryption ratio is achieved. However, the computation cost is relatively high; this is due to multiple updates in the QM coder states.

  5. (e)

    Compression friendliness: a little effect on compression efficiency is observed. This is due to multiple initializations of the QM coder due to initial state index changing.

  6. (f)

    Format compliance: not compliant. It is impossible to decode without the encryption key.

  7. (g)

    Error tolerance: frequent reset of state indices allows high error tolerance.

  8. (h)

    Data type: image and video.

Wen, Severa, Zeng, Luttrel, and Jin, 2002. A general selective encryption approach for fixed- and variable-length codes (FLC and VLC) is proposed in [24]. The FLC and VLC codewords corresponding to fields that carry important information are selected. Each codeword in the VLC table (and in the FLC table if the FLC code space is not full) is assigned a fixed-length code index. To encrypt the concatenation of some VLC (or FLC) codewords, only the indices are encrypted (using DES), and the encrypted concatenated indices are then mapped back to different but valid codewords.
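The following toy sketch captures only the index-mapping idea (the codebook is invented, and a modular shift stands in for DES on the indices): a codeword is mapped to its fixed-length index, the index is encrypted, and the result is mapped back to another valid codeword, which may be longer than the original, as discussed under compression friendliness below.

```python
# Sketch: encrypting VLC codewords through their fixed-length indices.
VLC_TABLE = ["0", "10", "110", "1110", "1111"]       # toy VLC codebook
INDEX = {cw: i for i, cw in enumerate(VLC_TABLE)}

def encrypt_codeword(codeword: str, key: int) -> str:
    """Map codeword -> index, 'encrypt' the index, map back to a valid codeword."""
    idx = INDEX[codeword]
    enc = (idx + key) % len(VLC_TABLE)               # stand-in for DES on the index
    return VLC_TABLE[enc]

print(encrypt_codeword("0", key=3))                  # a short codeword may become longer
```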

  1. (a)

    Tunability: not tunable.

  2. (b)

    Visual degradation: very high visual degradation can be achieved.

  3. (c)

    Cryptographic security: acceptable security level based on the secrecy of the Huffman table.

  4. (d)

    Encryption ratio: good encryption reduction (<15%).

  5. (e)

    Compression friendliness: the encryption process compromises compression efficiency. Indeed, some short VLC codewords (the most probable/frequent ones) can be replaced by longer ones, which runs counter to the idea of entropy coding.

  6. (f)

    Format compliance: the proposed scheme is fully compliant with any compression algorithm that uses a VLC or FLC entropy coder.

  7. (g)

    Error tolerance: any error affecting one variable length code may potentially propagate to subsequent codewords.

  8. (h)

    Data type: image and video.

Pommer and Uhl, 2003. The algorithm proposed in [25] is based on AES encryption of the header information of the wavelet packet encoding of an image; this header specifies the subband tree structure.

  1. (a)

    Tunability: not tunable.

  2. (b)

    Visual degradation: the encrypted content cannot be viewed without decryption.

  3. (c)

    Cryptographic security: not secure against chosen-plaintext attacks. Because the statistical properties of the wavelet coefficients are preserved by the encryption, the approximation subband can be reconstructed. This gives the attacker the size of the approximation subband (lowest resolution), and the neighboring subbands can then be reconstructed, since close subbands contain highly correlated coefficients.

  4. (d)

    Encryption ratio: the encrypted part represents a very small fraction of the bitstream.

  5. (e)

    Compression friendliness: the subband tree is pseudorandomly generated. This adversely impacts the compression efficiency.

  6. (f)

    Format compliance: not format compliant; the encoder does not use a standard wavelet packet decomposition.

  7. (g)

    Error tolerance: the avalanche effect of AES cipher causes poor error tolerance.

  8. (h)

    Data type: image.

Lian, Sun, and Wang, 2004. A selective encryption algorithm is proposed for the JPEG2000 standard [26]. A quality factor controls the strength of the encryption, which is performed in a bottom-up order where detail data (high-resolution coefficients) are encrypted first. The algorithm consists of three steps.

(I) Selective Sign Bit Encryption

A selected number of sign bits is encrypted using a chaotic stream cipher; the quality factor tunes this number.

(II) Intra-Bitplane Permutation

For each bitplane of each code block, a pseudorandom space-filling curve (PR-SFC) is used to permute the bits of that bitplane. The algorithm appears to use the same SFC for all bitplanes of a given code block; hence, it amounts to a simple coefficient permutation, which is not secure against ciphertext-only, chosen-, and known-plaintext attacks [27, 28]. Each group of 4 bits of a stripe column forms a unit element for the permutation (to remain compliant with the JPEG2000 standard). The SFC is chosen to preserve the spatial correlation of the DWT coefficients. The quality factor tunes the number of code blocks to be intra-permuted.

(III) Interblocks Permutation

Code blocks within the same subband are permuted using a particular 2D chaotic map, the Cat map. If the quality factor is above a certain threshold, no intercodeblock permutation is performed.

  1. (a)

    Tunability: dynamic encryption parameters can be fine tuned to control visual distortion.

  2. (b)

    Visual degradation: the encryption strength (and hence the visual degradation) can be fine tuned using a quality factor.

  3. (c)

    Cryptographic security: low diffusion effect, the ciphertext is not key sensitive enough. In addition, SFC is vulnerable to ciphertext-only, chosen- and known-plaintext attacks [27, 28].

  4. (d)

    Encryption ratio: variable, it depends on the parameters selected for encryption.

  5. (e)

    Compression friendliness: because the encoding of a bitplane depends on the encoding of the previous bitplanes, independently encrypting each bitplane of a code block inevitably impacts the compression performance of the arithmetic coder.

  6. (f)

    Format compliance: JPEG2000 compliant.

  7. (g)

    Error tolerance: chaotic stream ciphers allow high error tolerance since each sign bit is independently scrambled by a XOR.

  8. (h)

    Data type: image and video.

Grangetto, Magli, and Olmo, 2006. The basic approach proposed in [29] is a randomization of the arithmetic coder, achieved by randomly swapping the most probable symbol (MPS) and least probable symbol (LPS) intervals. Since only the interval magnitude matters for encoding, the compression performance remains unchanged. Both total and selective encryption are possible by choosing the layers or resolution levels to encrypt. Selective region encryption is also possible, since JPEG2000 is a codeblock-based algorithm: to encrypt a region of interest, the encryption is applied to the codeblocks contributing to the precincts of the considered region.

  1. (a)

    Tunability: selective to full encryption is allowed. Selective region encryption is allowed with dynamic selection of codeblocks to encrypt.

  2. (b)

    Visual degradation: depends on the number of codeblocks to be encrypted.

  3. (c)

    Cryptographic security: low security; a brute-force attack is feasible. Indeed, trying 30 million random keys allows retrieving the secret encryption key.

  4. (d)

    Encryption ratio: variable, depends on the number of codeblocks to be encrypted.

  5. (e)

    Compression friendliness: no impact on compression.

  6. (f)

    Format compliance: fully compliant to JPEG2000.

  7. (g)

    Error tolerance: since arithmetic coding is context based, any error will propagate to subsequent contexts and adversely impact the probability computations.

  8. (h)

    Data type: image and video.

Bergeron and Lamy-Bergot, 2005. A syntax-compliant encryption algorithm is proposed for H.264/AVC [30]. Encryption is inserted within the encoder. To achieve syntax compliance, selected compliant codewords are randomly permuted with other compliant codewords; the shift used for the permutation is derived from an AES cipher operated in counter mode.

  1. (a)

    Tunability: not tunable.

  2. (b)

    Visual degradation: a PSNR drop of 25 to 30 dB is achieved. However, blocks at the border of video frames cannot be encrypted; this leakage could matter in some applications.

  3. (c)

    Cryptographic security: the main drawback of this scheme is the lack of cryptographic security. Indeed, the security of the encrypted bitstream no longer depends on the AES cipher alone; it depends on the size of the set of compliant codewords. Hence, the diffusion of the AES cipher is reduced to the size of the plaintext space. In addition, a bias is introduced in the ciphertext; this bias depends on the key size and the plaintext space size.

  4. (d)

    Encryption ratio: the paper does not give precise values for the overall encryption ratio. However, it is mentioned that about 25% of I-slices and 10–15% of P-slices are encrypted. Since intracoded slices can represent 30–60% of the stream, the encryption ratio is expected to be relatively high.

  5. (e)

    Compression friendliness: a negligible overhead (0.1%) is introduced by the insertion of the encryption key.

  6. (f)

    Format compliance: the encrypted bitstream is decodable by any standard decoder without decryption. However, for decryption, a modified decoder is required.

  7. (g)

    Error tolerance: the randomness of the permutation causes poor error tolerance. Indeed, one single bit error could result in many bit errors if the new permuted codewords have many different bits.

  8. (h)

    Data type: video.

Engel and Uhl, 2006. In [31], a lightweight JPEG2000 encryption scheme is proposed. Only the lower resolutions are compressed with the classical dyadic wavelet transform. For the higher resolutions, the algorithm relies on a secret transform domain constructed with anisotropic wavelet packets (AWPs). The aim of this proposal is to allow transparent encryption for applications requiring a low-resolution preview: the low resolution is accessible to all users and decodable with any JPEG2000-compliant codec.

  1. (a)

    Tunability: limited tunability is permitted; only lightweight encryption is allowed. Indeed, this algorithm does not allow encrypting the lower resolutions. It is intended for particular applications with a public thumbnail preview.

  2. (b)

    Visual degradation: high visual degradation is achievable.

  3. (c)

    Cryptographic security: the encryption key space is very large, ensuring a high security level.

  4. (d)

    Encryption ratio: very low, only the subband tree structure is kept secret.

  5. (e)

    Compression friendliness: only a slight drop in compression performance can be observed.

  6. (f)

    Format compliance: not compliant with JPEG2000; the encrypted bitstream is not decodable without the secret wavelet transform.

  7. (g)

    Error tolerance: it offers poor error tolerance since any error in the encrypted parameters for generating random AWP would severely impact the decoding of the bitstream.

  8. (h)

    Data type: image and video.

3.3. Postcompression

Spanos and Maples, 1995. The Aegis mechanism is proposed in [32]; it consists of DES (CBC mode) encryption of the intraframes, the video stream header (all the decoding initialization parameters: frame size, frame rate, bitrate, etc.), and the ISO 32-bit end code of the MPEG stream. Experiments conducted by the authors show the importance of selective encryption for achieving acceptable end-to-end delay in high-bitrate video transmission. It is also shown that full encryption creates bottlenecks (long end-to-end delays and buffer overflows) in high-bitrate distributed video applications.

  1. (a)

    Tunability: no tunability is allowed.

  2. (b)

    Visual degradation: the encrypted content is not MPEG compliant, and thus cannot be viewed without decryption.

  3. (c)

    Cryptographic security: Agi and Gong [33] showed that this algorithm has low security: encrypting only I-frames offers limited protection because of the correlation between frames; some blocks are intracoded in P- and B-frames, and P- and B-frames corresponding to the same I-frame are highly correlated. They also underlined that it is unwise to encrypt stream headers, since these are predictable and can be broken with plaintext-ciphertext pairs. Alattar and Al-Regib [34], apparently unaware of Agi and Gong's work [33], stressed the same security leakage.

  4. (d)

    Encryption ratio: I-frames alone occupy about 30 to 60% of the whole video stream, which is quite high; thus, no important encryption saving is achieved. It is suggested that reducing the I-frame frequency could achieve a better encryption ratio; on the other hand, this adversely impacts compression performance and increases the acquisition delay in case of a channel change.

  5. (e)

    Compression friendliness: the encryption is performed after compression, thus no impact is observed on the compression efficiency.

  6. (f)

    Format compliance: the resulting bitstream is not MPEG compliant; encrypting the end code conceals the MPEG syntax.

  7. (g)

    Error tolerance: DES in CBC mode offers poor error tolerance due to its avalanche property.

  8. (h)

    Data type: video.

Alattar and Al-Regib, 1999. In [34], the security of the Spanos and Maples algorithm [32] is evaluated. It is argued that motion information has to be disguised when it is very important to protect (e.g., in military applications); the Spanos and Maples algorithm [32] reveals motion information, especially when many blocks are intracoded in P- and B-frames. The proposed technique is an enhancement of the method proposed in [32] and requires the transmission of additional information. The scheme consists of the following steps.

  1. (i)

    Take all I-blocks and parse the obtained stream into 64-bit segments; encrypt all of them using DES. If the last segment is shorter than 64 bits, leave it unencrypted.

  2. (ii)

    For predicted blocks in P- and B-frames:

  3. (iii)

    Group all predicted block headers in one header sub-bitstream.

  4. (iv)

    Group all prediction block data in one data sub-bitstream.

  5. (v)

    Parse the header sub-bitstream into 64-bit segments and DES encrypt them.

  6. (vi)

    Concatenate the encrypted header sub-bitstream with the data sub-bitstream.

  7. (vii)

    To allow decoding, the length of the header sub-bitstream is transmitted in each slice (in the user section of each slice); this introduces a slight overhead.

    1. (a)

      Tunability: no tunability is allowed.

    2. (b)

      Visual degradation: the encrypted content is not MPEG compliant, and thus cannot be viewed without decryption.

    3. (c)

      Cryptographic security: the algorithm can be considered as secure enough.

    4. (d)

      Encryption ratio: high encryption ratio is required (intracoded blocks represent 30% to 60% of the bitstream).

    5. (e)

      Compression friendliness: a slight overhead is introduced to indicate the header sub-bitstream length.

    6. (f)

      Format compliance: no MPEG compliant; a parser module has to be implemented to interface the encryption/decryption system with the MPEG-1 encoder/decoder.

    7. (g)

      Error tolerance: poor error tolerance is achieved due to avalanche property of DES cipher.

    8. (h)

      Data type: video.

Cheng and Li, 2000. In [35], selective encryption is proposed for the quadtree compression algorithm. The compressor output is partitioned into two parts: an "important part" consisting of the quadtree structure, and an "unimportant part" consisting of the leaf values. No encryption algorithm is specified; only the important part is encrypted.

  1. (a)

    Tunability: not tunable.

  2. (b)

    Visual degradation: high visual degradation can be achieved only for images with a high information rate (many colors, details, etc.), whereas quadtree compression is most efficient at low bitrates (for images with little information).

  3. (c)

    Cryptographic security: no encryption algorithm is specified in [35]. Independently of the encryption algorithm used, a brute-force attack is practical for low-information images, whose quadtree structure is very simple.

  4. (d)

    Encryption ratio: a low encryption ratio (about 14%) is required for typical images with low information content. For high-bitrate images, the encrypted part can reach about 50%.

  5. (e)

    Compression friendliness: the encryption is performed after compression, no impact on the compression efficiency is observed.

  6. (f)

    Format compliance: quadtree is not part of any compression standard.

  7. (g)

    Error tolerance: depends on the encryption primitive used to encrypt quadtree structure.

  8. (h)

    Data type: image.

Cheng and Li, 2000. The wavelet-based compression algorithm SPIHT partitions the data into two parts [35]. The first part can be considered the "important part"; it consists of the significance information (of coefficients and sets) for the two highest levels of the pyramid and of the initial threshold parameter n of the significance computation (a coefficient is significant at level n if its magnitude is at least 2^n). The second part is the "unimportant part"; it consists of the sign bits and refinement bits. No encryption algorithm is specified; only the important part is encrypted.

  1. (a)

    Tunability: not tunable.

  2. (b)

    Visual degradation: the algorithm is not format compliant, and therefore the encrypted content cannot be viewed without the decryption key.

  3. (c)

    Cryptographic security: if the two highest resolutions are very small, brute force attack becomes possible to guess the initial threshold and significance information.

  4. (d)

    Encryption ratio: due to the energy concentration obtained by the DWT, only 7% of the bitstream is encrypted.

  5. (e)

    Compression friendliness: no impact on compression efficiency.

  6. (f)

    Format compliance: SPIHT is not part of any compression standard. In addition, since SPIHT algorithm is context based, no decoding/processing is possible without the knowledge of the first significance bits.

  7. (g)

    Error tolerance: poor error tolerance is achieved due to the context nature of SPIHT.

  8. (h)

    Data type: image and video.

Droogenbroeck and Benedett, 2002. The JPEG Huffman coder terminates runs of zeros with codewords/symbols in order to approach the entropy; appended bits are added to these codewords to fully specify the magnitudes and signs of the nonzero coefficients. Only these appended bits are encrypted (using DES or IDEA) [36].

  1. (a)

    Tunability: not tunable.

  2. (b)

    Visual degradation: high visual degradation is achievable.

  3. (c)

    Cryptographic security: about 92% of the data is encrypted using well-scrutinized symmetric ciphers. It would be very difficult to break the encryption algorithm or try to predict the encrypted part.

  4. (d)

    Encryption ratio: very high encryption ratio is required (about 92%).

  5. (e)

    Compression friendliness: the encryption is separated from the Huffman coder and has no impact on the compression efficiency.

  6. (f)

    Format compliance: JPEG compliant.

  7. (g)

    Error tolerance: poor error tolerance is achieved due to avalanche property of symmetric ciphers used.

  8. (h)

    Data type: image.

Sadourny and Conan, 2003. In [37], a signaling scheme is proposed for JPSEC [38]. JPSEC is Part 8 of JPEG2000, also called secure JPEG2000; an important effort has been made in JPSEC to provide a standardized framework for security tools and services such as selective encryption, authentication, and integrity. The signaling scheme proposed in [37] is intended to support selective encryption in JPSEC. Two marker segments are used: the security components description (SCD), which signals the presence of protected parts in the bitstream and their associated encryption parameters, and the codestream security information (CSI), which signals, for each individual protected part, its encryption parameters such as the protection method and some integrity data (hash values, signatures, etc.).

  1. (a)

    Tunability: high flexibility is allowed by the signaling information to encrypt different parts with different encryption parameters.

  2. (b)

    Visual degradation: the tunability of the scheme allows tunable visual degradation.

  3. (c)

    Cryptographic security: depends on encryption parameters.

  4. (d)

    Encryption ratio: depends on encryption parameters.

  5. (e)

    Compression friendliness: the paper presents only a few overhead tests on encrypted data. A single set of encryption parameters is tested, yielding a signaling overhead of 104 bytes. The size of the overhead needs to be measured with respect to the image file size and with different encryption parameters.

  6. (f)

    Format compliance: JPEG2000 and JPSEC compliant.

  7. (g)

    Error tolerance: depends on the encryption parameters; for example, in the experiments presented in [37], DES is used in CFB mode, which yields poor error tolerance due to the chaining mode.

  8. (h)

    Data type: image.

Wu and Deng, 2004. The proposed encryption scheme [39] is a JPEG2000-compliant algorithm which iteratively encrypts codeblock contributions to packets (CCPs). The encryption process acts on the CCPs (in the packet data) using stream ciphers or block ciphers. The described proposal is mainly based on a stream cipher with modular arithmetic addition, the keystream being generated with RC4. Each CCP is iteratively encrypted until it contains no forbidden codewords (two-byte values in the range [0xFF90, 0xFFFF]); this range is reserved for marker codes, which are needed for error resiliency and resynchronization.
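
A hedged sketch of the iteration idea follows; the byte-wise modular addition of a random keystream stands in for the RC4-based stream cipher of [39], and the compliance test only checks for the forbidden two-byte range:

```python
import os

FORBIDDEN_LOW, FORBIDDEN_HIGH = 0xFF90, 0xFFFF

def compliant(data: bytes) -> bool:
    """A CCP is compliant if no two consecutive bytes form a value in [0xFF90, 0xFFFF]."""
    return all(not (FORBIDDEN_LOW <= ((data[i] << 8) | data[i + 1]) <= FORBIDDEN_HIGH)
               for i in range(len(data) - 1))

def add_keystream(data: bytes, ks: bytes, sign: int) -> bytes:
    """Byte-wise modular addition (+1) or subtraction (-1) of the keystream."""
    return bytes((d + sign * k) % 256 for d, k in zip(data, ks))

def encrypt_ccp(ccp: bytes, ks: bytes) -> bytes:
    """Re-encrypt until the ciphertext is compliant again."""
    data = add_keystream(ccp, ks, +1)
    while not compliant(data):
        data = add_keystream(data, ks, +1)
    return data

def decrypt_ccp(ct: bytes, ks: bytes) -> bytes:
    """No round count needs to be signaled: the original CCP is the first compliant
    result reached while undoing rounds, because every intermediate round result
    produced during encryption was non-compliant by construction."""
    data = add_keystream(ct, ks, -1)
    while not compliant(data):
        data = add_keystream(data, ks, -1)
    return data

ccp = os.urandom(64)
while not compliant(ccp):          # the toy plaintext must itself be compliant, like a real CCP
    ccp = os.urandom(64)
keystream = os.urandom(len(ccp))
assert decrypt_ccp(encrypt_ccp(ccp, keystream), keystream) == ccp
```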

  1. (a)

    Tunability: not tunable.

  2. (b)

    Visual degradation: depends on the number of CCPs encrypted.

  3. (c)

    Cryptographic security: iterative encryption of CCPs may give a hint for side channel attacks (e.g., timing attack).

  4. (d)

    Encryption ratio: depends on the number of CCPs encrypted. However, the number of iterations per CCP increases exponentially with the CCP length [40], which increases the overall effective encryption ratio.

  5. (e)

    Compression friendliness: no impact on compression.

  6. (f)

    Format compliance: fully compliant with the JPEG2000 bitstream, preserving scalability and error resiliency, which are desirable properties of JPEG2000.

  7. (g)

    Error tolerance: the use of RC4 causes important error propagation.

  8. (h)

    Data type: image and video.

Norcen and Uhl, 2003. JPEG2000 produces an embedded bitstream in which the most important data is sent at the beginning. Based on these observations, the proposed scheme consists of AES encryption of selected packet data [41]. The algorithm uses two optional markers, the start of packet (SOP) marker 0xFF91 and the end of packet header (EPH) marker 0xFF92, to identify packet data, which is then encrypted using AES. CFB mode is used because the packet data has variable length. The experiments were conducted on two kinds of images (lossy and lossless compressed), with different progression orders (resolution and layer progression orders). The evaluation criterion was the visual degradation obtained for a given amount of encrypted data. It was found that for lossy compressed images, layer progression gives better results; for lossless compressed images, resolution progression gives better results.
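
A rough sketch of the packet-data selection is given below, assuming SOP/EPH markers are present; the parsing is deliberately simplified (the body of a packet is taken to run from an EPH marker to the next SOP marker), and the third-party `cryptography` package provides AES-CFB:

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

SOP, EPH = b"\xff\x91", b"\xff\x92"

def aes_cfb_encrypt(data: bytes, key: bytes, iv: bytes) -> bytes:
    enc = Cipher(algorithms.AES(key), modes.CFB(iv)).encryptor()
    return enc.update(data) + enc.finalize()

def encrypt_packet_bodies(codestream: bytes, key: bytes, iv: bytes) -> bytes:
    """Encrypt the bytes between each EPH marker and the following SOP marker.
    Note that the CFB output may itself contain marker-like byte pairs, which is
    exactly why this approach is not format compliant (see the format compliance
    item below); a fresh IV per packet would also be needed in practice."""
    out = bytearray(codestream)
    pos = 0
    while True:
        eph = codestream.find(EPH, pos)
        if eph < 0:
            break
        body_start = eph + len(EPH)
        next_sop = codestream.find(SOP, body_start)
        body_end = next_sop if next_sop >= 0 else len(codestream)
        out[body_start:body_end] = aes_cfb_encrypt(codestream[body_start:body_end], key, iv)
        pos = body_end
    return bytes(out)

key, iv = os.urandom(16), os.urandom(16)
# Illustrative "codestream": two packets, each SOP segment + packet header + EPH + body.
fake = (SOP + b"\x00\x04\x00\x00" + b"HDR1" + EPH + b"packet-body-1"
        + SOP + b"\x00\x04\x00\x01" + b"HDR2" + EPH + b"packet-body-2")
print(encrypt_packet_bodies(fake, key, iv).hex())
```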

  1. (a)

    Tunability: not tunable.

  2. (b)

    Visual degradation: high visual degradation is achievable by encrypting 20% of the data.

  3. (c)

    Cryptographic security: visual degradation is not the only criterion that characterizes the security of an algorithm; in [5], for example, the visual degradation achieved is very high while the security of the algorithm is very weak.

  4. (d)

    Encryption ratio: 20% of the data is encrypted to achieve an acceptable level of visual degradation. However, only resolution and layer progressions are considered.

  5. (e)

    Compression friendliness: no impact on compression.

  6. (f)

    Format compliance: not JPEG2000 compliant. Indeed, forbidden codewords in the range [0xFF90, 0xFFFF] can be generated by AES in CFB mode.

  7. (g)

    Error tolerance: AES in CFB mode has poor error tolerance.

  8. (h)

    Data type: image and video.

Stütz and Uhl, 2006. In [40], the algorithm proposed by Wu and Deng [39] is revisited. The complexity of the iterative encryption of CCPs was underestimated in [39]; Stütz and Uhl gave a more exact formulation of the CCP length distribution and hence of the encryption complexity [40]. The number of rounds needed to achieve a compliant codestream increases exponentially with the CCP length. In addition, experiments were conducted to test the practicality of iterative encryption of both CCPs and whole packets. Iterative encryption of CCPs can perform well if the compression parameters are selected carefully (use of sufficiently many quality layers and/or small precincts with small codeblocks). On the other hand, reducing the codeblock size severely impacts compression performance. For iterative encryption of whole packets, it was shown that the distribution of packet lengths makes it impractical. This shows that Wu and Deng's approach is not general for JPEG2000 compressed images and that special care has to be taken when selecting the compression parameters.

  1. (a)

    Tunability: not tunable.

  2. (b)

    Visual degradation: depends on the number of CCPs encrypted.

  3. (c)

    Cryptographic security: iterative encryption of CCPs may give a hint for side channel attacks (e.g., timing attack).

  4. (d)

    Encryption ratio: depends on the number of CCPs encrypted.

  5. (e)

    Compression friendliness: small codeblocks adversely impact compression performance; the MQ coder performs better on large codeblocks. In addition, for small packets, the packet headers and marker sequences (e.g., SOP and EPH) represent an important fraction of the bitstream.

  6. (f)

    Format compliance: JPEG2000 compliant, but the proposed technique is not applicable to arbitrary compression parameters (many quality layers and small codeblocks are needed).

  7. (g)

    Error tolerance: the use of RC4 causes important error propagation.

  8. (h)

    Data type: image and video.

Engel, Stütz, and Uhl, 2007. In [42], a syntax-compliant encryption method is proposed for JPEG2000. Each codeblock contribution to packet (CCP), or CCP segment, is independently encrypted. The method is based on a new format-compliant encryption scheme called ciphertext switching encryption (CSE). A stream cipher is used for encryption with backward checking: each time a forbidden codeword is generated by encryption, the offending byte is switched back to plaintext, and the neighboring codewords are re-checked for compliance. This process is iterated until no forbidden codeword remains.
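
To give a feel for the switching step, here is a deliberately simplified, forward-only toy (it is not the CSE procedure of [42]: the real method also re-checks and switches previously produced bytes so that the output is guaranteed compliant and decryptable). The toy only shows how bytes end up being left in plaintext:

```python
import os

def forbidden(prev: int, cur: int) -> bool:
    """Two consecutive bytes form a forbidden JPEG2000 codeword in [0xFF90, 0xFFFF]."""
    return prev == 0xFF and cur >= 0x90

def toy_switching_encrypt(plaintext: bytes, keystream: bytes):
    """Forward-only toy: whenever XOR-encrypting the current byte would create a
    forbidden codeword with the previous output byte, keep the plaintext byte instead."""
    out = bytearray()
    switched = 0
    for p, k in zip(plaintext, keystream):
        c = p ^ k
        if out and forbidden(out[-1], c):
            c = p                    # "switch" the ciphertext byte back to plaintext
            switched += 1
        out.append(c)
    return bytes(out), switched

data = os.urandom(4096)              # stand-in for CCP data (assumed compliant)
_, switched = toy_switching_encrypt(data, os.urandom(len(data)))
print(f"{switched} byte(s) left in plaintext out of {len(data)}")
```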

  1. (a)

    Tunability: not tunable.

  2. (b)

    Visual degradation: high visual degradation can be achieved.

  3. (c)

    Cryptographic security: each time a forbidden codeword is generated, it is switched back to plaintext. In addition, switching impacts previously encrypted bytes, so a backward check is necessary for each switched byte. The number of bytes sent in plaintext can be unpredictable.

  4. (d)

    Encryption ratio: depends on the number of CCPs encrypted. However, significant memory is required to buffer the previously encrypted bytes for the backward check, which is performed after each byte is encrypted.

  5. (e)

    Compression friendliness: one major advantage of this scheme is its compression friendliness. Indeed, a negligible overhead of 11 bytes is introduced: only a short global IV (initial value) is inserted in the bitstream main header. This global IV is used to generate the IVs for the independent encryption of the CCPs.

  6. (f)

    Format compliance: JPEG2000 compliant with fine granularity scalability.

  7. (g)

    Error tolerance: the main drawback of this scheme is the need to perform backward checking and switching when necessary. Indeed, a single byte error could impact the decryption of the whole CCP due to the dependency between encrypted bytes.

  8. (h)

    Data type: image and video.

Table 1 summarizes the related work with respect to each criterion described above. The main symbols used are as follows:

  1. (i)

    "+" for satisfied criterion,

  2. (ii)

    "-" for nonsatisfied criterion,

  3. (iii)

    "H" for high,

  4. (iv)

    "V" for variable, it is appreciated that visual degradation is variable in order to adapt to different application requirements,

  5. (v)

    "?" for nonspecified.

It is desirable that visual degradation is variable and dynamically tunable to adapt to different application requirements. Encryption ratio needs to be minimized. Grayed boxes indicate unsatisfied criteria.

4. Discussion and Perspectives

4.1. Discussion

As we can see from the state-of-the-art summary and Table 1, trading off all the aforementioned criteria is a challenging task. We can observe that tunability, cryptographic security, and error tolerance are the main unsatisfied criteria. In the following sections, each of these criteria is discussed.

4.1.1. Tunability

Selective encryption algorithms based on static encryption parameters do not allow tunability. Tunability is a desirable property especially for content protection systems targeting different applications with different requirements in terms of security or visual degradation and different devices with different capabilities in terms of memory, computational power, or display capabilities. It is therefore appreciated to design a tunable selective encryption algorithm with dynamic encryption parameters. Signaling information can be inserted within the bitstream in order to indicate the location of encrypted parts and encryption primitives and functionalities that are used.

4.1.2. Cryptographic Security

Very few papers have proposed a serious evaluation of the security of selective encryption algorithms. In most cases, visual distortion (measured using the PSNR) is used as the exclusive criterion for this purpose. However, visual degradation remains a subjective measure. In addition, it has been shown that some selective encryption algorithms that yield important visual distortion may have important security leakages [17, 18]. Cryptanalysis of selective encryption algorithms relies on key recovery (if the encryption key space is not large enough) or on prediction of the encrypted part. Hence, cryptographic security should rely on

  1. (i)

    the encryption key (of a well-scrutinized encryption algorithm);

  2. (ii)

    unpredictability of the encrypted part.

As shown in Section 1.1, postcompression algorithms are better suited to selective encryption from the security point of view. Indeed, compression eliminates data correlation, which reduces the predictability of the encrypted part.

Very few works have been reported on the unpredictability of the encrypted part. The security of a selective encryption algorithm depends on how much and which parts of a message we have to encrypt to ensure that brute force on the encryption key space is easier than a brute force attack on the plaintext itself. Otherwise, the attacker could bypass encryption and concentrate his effort on predicting the plaintext. It is hard to find an absolute measure for security. Instead, we define indirect measures that could approximate the security of a selective encryption algorithm. Examples of such measures are entropy, unicity distance, guesswork, and the α-work factor [43]. Entropy, as suggested by Shannon [1], measures the message uncertainty; it defines the message randomness. It is used to calculate the unicity distance [1], which is an approximation of the minimum number of ciphertexts needed in a ciphertext-only attack to yield a unique solution. Guesswork, as suggested in [44, 45], measures the expected number of guesses that an optimal brute force attacker (one with perfect knowledge of the symbol probability distribution) must perform to find the plain message. In [44, 45], the authors showed that it is not possible to find simple bounds for guesswork (and the α-work factor) based on entropy. They found that guesswork can be arbitrarily large while entropy tends to zero. In [44], the author considers entropy inappropriate as a confidentiality measure against ciphertext-only attacks. Based on these observations, [43] proposes guesswork as a measure of the confidentiality of selectively encrypted messages. We investigate the implications of these results for postcompression selective encryption algorithms.
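
Before doing so, a small numerical illustration of two of these measures may help; the snippet below evaluates the entropy and the guesswork $W(X) = \sum_{i} i\, p_{i}$ of two toy distributions (the distributions themselves are illustrative):

```python
import math
from typing import Sequence

def entropy(p: Sequence[float]) -> float:
    """Shannon entropy in bits of a probability distribution."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def guesswork(p: Sequence[float]) -> float:
    """Expected number of guesses of an optimal brute force attacker
    who tries values in decreasing order of probability."""
    ordered = sorted(p, reverse=True)
    return sum(i * pi for i, pi in enumerate(ordered, start=1))

uniform = [1 / 256] * 256             # e.g., one perfectly compressed byte
skewed = [0.9] + [0.1 / 255] * 255    # a highly redundant source
print(entropy(uniform), guesswork(uniform))   # 8.0 bits, (256 + 1) / 2 = 128.5 guesses
print(entropy(skewed), guesswork(skewed))     # both drop sharply for the redundant source
```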

We consider a message $M$, compressed by a "perfect compressor." $M$ is composed of $N$ symbols. We arbitrarily choose $n$ symbols that will be encrypted ($n \le N$); $X$ designates the encrypted part. The remainder of the message is left unencrypted (Figure 7). The encryption ratio is given by

$$\mathrm{ER} = \frac{n}{N}. \qquad (2)$$

We will evaluate the difficulty for an attacker to guess the encrypted part in a brute force attack and try to find conditions that make a brute force attack on the key space easier than an optimal brute force attack on the plaintext space. We assume that the attacker knows the length and the location of the encrypted part and is able to recognize when a right guess occurs.

Figure 7: Selectively encrypting a message $M$; only gray units are encrypted.

Perfect compression implies that all source redundancies are eliminated and that all symbols in the compressed message are independent and identically distributed. Hence, $X$ can be considered as a discrete random variable that takes its values in $\mathcal{A}^{n}$, with $\mathcal{A}$ being the symbol space and $A = |\mathcal{A}|$ its cardinality. The attacker would try to guess the value of $X$ by trying all possible values in decreasing order of their probabilities $p_{1} \ge p_{2} \ge \dots \ge p_{A^{n}}$; the guesswork is given by

$$W(X) = \sum_{i=1}^{A^{n}} i\, p_{i}. \qquad (3)$$

Note that for perfect compression, all values are equally probable, $p_{i} = 1/A^{n}$; this gives a guesswork

$$W(X) = \frac{A^{n} + 1}{2}. \qquad (4)$$

Now, if we consider the guesswork on the key space (of $k$ bits), we would have

$$W(K) = \frac{2^{k} + 1}{2}. \qquad (5)$$

From (4) and (5), we can conclude that a brute force attack on the message space is harder than key guessing if $W(X) \ge W(K)$. In other terms,

$$A^{n} \ge 2^{k}. \qquad (6)$$

This yields a minimum number of symbols to encrypt, $n \ge k / \log_{2} A$; for byte symbols ($A = 2^{8}$), this is a minimum number of encrypted bytes

$$n_{\min} = \left\lceil \frac{k}{8} \right\rceil. \qquad (7)$$

This result is fundamental especially for postcompression algorithms that perform encryption on entropy coded data. Since entropy coders can be considered, to a certain extent, as perfect compressors, it is required to encrypt at least $n_{\min} = \lceil k/8 \rceil$ bytes, where $k$ is the key length in bits. This minimum value gives the optimal encryption ratio while achieving cryptographic security. Such a result could be used to optimize the encryption ratio in some proposals for JPEG2000 selective encryption, where selected packet data are encrypted [37, 39–42]. As codeblock contributions to packets (CCPs) are compressed independently and each CCP can be considered as "perfectly compressed," it is then required to encrypt only $n_{\min}$ bytes per CCP to achieve the same visual degradation while still guaranteeing cryptographic security. An important encryption ratio reduction could then be achieved.
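
As a worked instance of (7) under the assumptions above (byte symbols, so $A = 256$), the following snippet computes the minimum number of bytes to encrypt for a few common key lengths; the figures say nothing about any particular codec, only about the guesswork argument:

```python
import math

def min_encrypted_symbols(key_bits: int, alphabet_size: int = 256) -> int:
    """Minimum number of perfectly compressed symbols to encrypt so that guessing
    the encrypted part is at least as hard as brute-forcing the key: n >= k / log2(A)."""
    return math.ceil(key_bits / math.log2(alphabet_size))

for k in (56, 128, 256):                      # e.g., DES, AES-128, AES-256 key lengths
    print(f"{k}-bit key -> encrypt at least {min_encrypted_symbols(k)} byte(s) per CCP")
# 56-bit key -> 7 bytes, 128-bit key -> 16 bytes, 256-bit key -> 32 bytes
```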

4.1.3. Error Tolerance

A main challenge in selective encryption algorithms is to design secure schemes that are error tolerant. Since most standard ciphers have a strong avalanche effect, they provide poor error tolerance. Indeed, in networks prone to errors, a single bit error in the encrypted part will result in many erroneous bytes in the decrypted part. This is due to the diffusion property of ciphers. Error tolerance and security seem to have antagonistic behaviors.

As a consequence, it is important to trade off security and error tolerance. It is then preferable to avoid chaining modes of operation [37, 41]. AES in CTR mode, or any other cipher that encrypts data blocks independently, offers a good balance between security and error tolerance.
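
To make the trade-off concrete, the sketch below (using the third-party `cryptography` package; key, IV, and data are random placeholders) flips a single ciphertext bit and counts how many decrypted bytes are corrupted under CTR and under CFB:

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def corrupted_bytes(mode_factory, data: bytes, key: bytes) -> int:
    """Encrypt, flip one ciphertext bit, decrypt, and count corrupted plaintext bytes."""
    iv = os.urandom(16)
    enc = Cipher(algorithms.AES(key), mode_factory(iv)).encryptor()
    ct = bytearray(enc.update(data) + enc.finalize())
    ct[40] ^= 0x01                                      # a single bit error in transmission
    dec = Cipher(algorithms.AES(key), mode_factory(iv)).decryptor()
    pt = dec.update(bytes(ct)) + dec.finalize()
    return sum(a != b for a, b in zip(pt, data))

key, data = os.urandom(16), os.urandom(256)
print("CTR:", corrupted_bytes(modes.CTR, data, key), "corrupted byte(s)")   # expected: 1
print("CFB:", corrupted_bytes(modes.CFB, data, key), "corrupted byte(s)")   # expected: ~17
```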

4.2. Perspectives and Future Works

Although an important and rich variety of selective encryption algorithms have been proposed in the literature, we believe that many research areas remain open in this field.

  1. (i)

    Can we design a selective encryption for any compression algorithm? We believe that some compression algorithms are more cooperative and could be better candidates for selective encryption. For example, compared to MPEG, JPEG2000 is a very good candidate for selective encryption; this is due to its flexibility (embedded encoding, block-based coding, many progression orders, local region access, etc.). These properties can be very useful in designing a flexible selective encryption algorithm in order to meet a larger set of requirements and target more applications. In future works, we will focus on designing selective encryption algorithms for JPEG2000.

  2. (ii)

    Can we build a rule of thumb to design a good selective encryption algorithm? The study presented here highlights the bad choices to avoid when designing a selective encryption algorithm. For example, a selective encryption scheme that relies only on random permutations is totally insecure, since it is easily breakable by chosen-plaintext attacks. Energy concentration does not mean intelligibility concentration, and therefore selectively encrypting low-frequency coefficients does not necessarily give a sufficient level of security or visual degradation.

  3. (iii)

    Can we design a selective encryption that can be used in any kind of application? We believe that it is feasible to design a flexible selective encryption algorithm that is tunable and allows trading off a certain number of parameters in order to target a large set of applications. The algorithms proposed in [26, 37] are good examples.

References

  1. Shannon CE: Communication theory of secrecy systems. Declassified Report, 1946

  2. Lookabaugh T, Sicker DC, Keaton DM, Guo WY, Vedula I: Security analysis of selectively encrypted MPEG-2 streams. Multimedia Systems and Applications VI, September 2003, Orlando, Fla, USA, Proceedings of SPIE 5241: 10-21.

  3. Lookabaugh T: Selective encryption, information theory, and compression. Proceedings of the 38th Asilomar Conference on Signals, Systems and Computers, November 2004, Pacific Grove, Calif, USA 1: 373-376.

  4. Lookabaugh T, Sicker DC: Selective encryption for consumer applications. IEEE Communications Magazine 2004,42(5):124-129.

  5. Tang L: Methods for encrypting and decrypting MPEG video data efficiently. Proceedings of the 4th ACM International Multimedia Conference and Exhibition, November 1996, Boston, Mass, USA 219-229.

  6. Qiao L, Nahrstedt K, Tam M-C: Is MPEG encryption by using random list instead of zigzag order secure? Proceedings of the IEEE International Symposium on Consumer Electronics (ISCE '97), December 1997, Singapore 226-229.

  7. Uehara T, Safavi-Naini R: Chosen DCT coefficients attack on MPEG encryption scheme. Proceedings of IEEE Pacific Rim Conference on Multimedia, December 2000, Sydney, Australia 316-319.

  8. Socek D, Kalva H, Magliveras SS, Marques O, Culibrk D, Furht B: New approaches to encryption and steganography for digital videos. Multimedia Systems 2007,13(3):191-204. 10.1007/s00530-007-0083-z

  9. Baumgartner J: Deciphering the CA conundrum. Communications Engineering and Design March 2003.

  10. Giachetti J-L, Lenoir V, Codet A, Cutts D, Sager J: Common conditional access interface for digital video broadcasting decoders. IEEE Transactions on Consumer Electronics 1995,41(3):836-841. 10.1109/30.468076

  11. Wu C-P, Kuo C-CJ: Fast encryption methods for audiovisual data confidentiality. Multimedia Systems and Applications III, November 2001, Boston, Mass, USA, Proceedings of SPIE 4209: 284-295.

  12. Shi C, Bhargava B: A fast MPEG video encryption algorithm. Proceedings of the 6th ACM International Conference on Multimedia, September 1998, Bristol, UK 81-88.

  13. Shi C, Bhargava B: An efficient MPEG video encryption algorithm. Proceedings of the 17th IEEE Symposium on Reliable Distributed Systems (SRDS '98), October 1998, West Lafayette, Ind, USA 381-386.

  14. Shi C, Wang SY, Bhargava B: MPEG video encryption in real-time using secret key cryptography. Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA '99), June-July 1999, Las Vegas, Nev, USA 191-201.

  15. Podesser M, Schmidt HP, Uhl A: Selective bitplane encryption for secure transmission of image data in mobile environments. Proceedings of the 5th Nordic Signal Processing Symposium (NORSIG '02), October 2002, Tromsø, Norway

  16. Zeng W, Lei S: Efficient frequency domain selective scrambling of digital video. IEEE Transactions on Multimedia 2003,5(1):118-129. 10.1109/TMM.2003.808817

  17. Van de Ville D, Philips W, Van de Walle R, Lemahieu I: Image scrambling without bandwidth expansion. IEEE Transactions on Circuits and Systems for Video Technology 2004,14(6):892-897. 10.1109/TCSVT.2004.828325

  18. Li S, Li C, Lo K-T, Chen G: Cryptanalysis of an image scrambling scheme without bandwidth expansion. IEEE Transactions on Circuits and Systems for Video Technology 2008,18(3):338-349.

  19. Meyer J, Gadegast F: Security mechanisms for multimedia data with the example MPEG-1 video. Project Description of SECMPEG, Technical University of Berlin, Germany, May 1995

  20. Qiao L, Nahrstedt K: A new algorithm for MPEG video encryption. Proceedings of the 1st International Conference on Imaging Science, Systems and Technology (CISST '97), July 1997, Las Vegas, Nev, USA 21-29.

  21. Wu C-P, Kuo C-CJ: Efficient multimedia encryption via entropy codec design. Security and Watermarking of Multimedia Contents III, January 2001, San Jose, Calif, USA, Proceedings of SPIE 4314: 128-138.

  22. Gillman DW, Rivest RL: On breaking a Huffman code. IEEE Transactions on Information Theory 1996,42(3):972-976. 10.1109/18.490558

  23. Zhou J, Liang Z, Chen Y, Au OC: Security analysis of multimedia encryption schemes based on multiple Huffman table. IEEE Signal Processing Letters 2007,14(3):201-204.

  24. Wen J, Severa M, Zeng W, Luttrell MH, Jin W: A format-compliant configurable encryption framework for access control of video. IEEE Transactions on Circuits and Systems for Video Technology 2002,12(6):545-557. 10.1109/TCSVT.2002.800321

  25. Pommer A, Uhl A: Selective encryption of wavelet-packet encoded image data: efficiency and security. Multimedia Systems 2003,9(3):279-287. 10.1007/s00530-003-0099-y

  26. Lian S, Sun J, Wang Z: Perceptual cryptography on JPEG2000 compressed images or videos. Proceedings of the 4th International Conference on Computer and Information Technology (CIT '04), September 2004, Wuhan, China 78-83.

  27. Bertlisson M, Brickell EF, Ingemarsson I: Cryptanalysis of video encryption based on space-filling curves. In Proceedings of the Workshop on the Theory and Application of Cryptographic Techniques on Advances in Cryptology (EUROCRYPT '89), April 1989, Houthalen, Belgium, Lecture Notes in Computer Science. Volume 434. Springer; 403-411.

  28. Massoudi A, Lefèbvre F, Joye M: Cryptanalysis of a video scrambling based on space filling curves. Proceedings of IEEE International Conference on Multimedia and Expo (ICME '07), July 2007, Beijing, China 1683-1686.

  29. Grangetto M, Magli E, Olmo G: Multimedia selective encryption by means of randomized arithmetic coding. IEEE Transactions on Multimedia 2006,8(5):905-917.

  30. Bergeron C, Lamy-Bergot C: Compliant selective encryption for H.264/AVC video streams. Proceedings of the 7th IEEE Workshop on Multimedia Signal Processing (MMSP '05), October 2005, Shanghai, China 1-4.

  31. Engel D, Uhl A: Lightweight JPEG2000 encryption with anisotropic wavelet packets. Proceedings of IEEE International Conference on Multimedia and Expo (ICME '06), July 2006, Toronto, Canada 2177-2180.

  32. Spanos GA, Maples TB: Performance study of a selective encryption scheme for the security of networked, real-time video. Proceedings of the 4th International Conference on Computer Communications and Networks (ICCCN '95), September 1995, Las Vegas, Nev, USA 2-10.

  33. Agi I, Gong L: An empirical study of secure MPEG video transmissions. Proceedings of the Symposium on Network and Distributed System Security, February 1996, San Diego, Calif, USA 137-144.

  34. Alattar AM, Al-Regib GI: Evaluation of selective encryption techniques for secure transmission of MPEG-compressed bit-streams. Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS '99), May-June 1999, Orlando, Fla, USA 4: 340-343.

  35. Cheng H, Li X: Partial encryption of compressed images and videos. IEEE Transactions on Signal Processing 2000,48(8):2439-2451. 10.1109/78.852023

  36. Van Droogenbroeck M, Benedett R: Techniques for a selective encryption of uncompressed and compressed images. Proceedings of Advanced Concepts for Intelligent Vision Systems (ACIVS '02), September 2002, Ghent, Belgium 90-97.

  37. Sadourny Y, Conan V: A proposal for supporting selective encryption in JPSEC. IEEE Transactions on Consumer Electronics 2003,49(4):846-849. 10.1109/TCE.2003.1261164

  38. ISO/IEC : JPSEC commission draft 2.0. ISO/IEC/JTC1/SC29/WG 1, N3397, 2004

  39. Wu Y, Deng RH: Compliant encryption of JPEG2000 codestreams. Proceedings of the International Conference on Image Processing (ICIP '04), October 2004, Singapore 5: 3439-3442.

  40. Stütz T, Uhl A: On format-compliant iterative encryption of JPEG2000. Proceedings of the 8th IEEE International Symposium on Multimedia (ISM '06), December 2006, San Diego, Calif, USA 985-990.

  41. Norcen R, Uhl A: Selective encryption of the JPEG2000 bitstream. In Communications and Multimedia Security, Lecture Notes in Computer Science. Volume 2828. Springer, Berlin, Germany; 2003:194-204. 10.1007/978-3-540-45184-6_16

  42. Engel D, Stütz T, Uhl A: Format-compliant JPEG2000 encryption with combined packet header and packet body protection. Proceedings of the Multimedia and Security Workshop (MM&Sec '07), September 2007, Dallas, Tex, USA 87-96.

  43. Lundin R, Lindskog S, Brunstrom A, Fischer-Hübner S: Measuring confidentiality of selectively encrypted messages using guesswork. Proceedings of the 3rd Swedish National Computer Networking Workshop (SNCNW '05), November 2005, Halmstad, Sweden 99-102.

  44. Pliam JO: Ciphers and their products: group theory in private key cryptography, Ph.D. thesis. University of Minnesota, Minneapolis, Minn, USA; 1999.

  45. Malone D, Sullivan WG: Guesswork and entropy. IEEE Transactions on Information Theory 2004,50(3):525-526. 10.1109/TIT.2004.824921

Author information

Correspondence to A. Massoudi.

Electronic supplementary material

13635_2008_45_MOESM1_ESM.pdf

Table 1: Summary of related work with respect to each criterion; grayed boxes indicate unsatisfied criteria.(PDF 525 KB)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Massoudi, A., Lefebvre, F., De Vleeschouwer, C. et al. Overview on Selective Encryption of Image and Video: Challenges and Perspectives. EURASIP J. on Info. Security 2008, 179290 (2008). https://doi.org/10.1155/2008/179290
