Countering the Adversarial Attack of Double JPEG Compression

Abstract

Many Internet users, especially minors, suffer from unwanted or unauthorised exposure to pornographic images. Automatic detection of pornographic images has therefore been developed to mitigate the situation. However, the posters of pornographic content may create adversarial images to evade detection. The attack and defence turn into a more general rivalry: the detection of unilateral modification of images. In this report, adversarial attack strategies are derived from several example pornographic-image detection techniques. Double JPEG compression is a typical method of creating adversarial composite images by pasting a small part of one image onto another. The proposed detection uses discrete cosine transform (DCT) coefficients, rather than raw pixels, as the input of a convolutional neural network (CNN). The proposed method localises the modified area with high accuracy.

Introduction

The rapid development of the Internet brings ever more content to everyone surfing online. Unfortunately, some of that content may be illegal or unsuitable for particular viewers. Pornographic images are an example of inappropriate material that is not suitable for underage children. Therefore, for the protection of Internet users and for forensic purposes, detecting and isolating pornographic images is necessary. Accurate detection is the basis of filtering inappropriate content. Originally, pornographic images were detected manually. Manual labelling is slow and may place mental strain on the operators. Therefore, automatic detection techniques based on image classification were introduced into pornographic-image filtering. Nian et al. reviewed several traditional automatic detection techniques alongside a new CNN-based proposal.

Traditional detection

Most traditional techniques are based on visual modality. There are roughly three generations: rule-based, image-retrieval-based and deep-learning-based techniques. Rule-based techniques are straightforward in design, with all rules set by humans. For example, Yin et al. demonstrate a rule based on skin area: the algorithm filters out non-skin-colour areas, then extracts the potential skin area according to texture features and fractal dimension. In general, the false positives and negatives of rule-based techniques are obvious to humans; for instance, a large area of naked skin has no exact relationship with pornographic content, yet the rule treats it as such. Among retrieval-based methods, Shih et al. proposed retrieving similar pictures from a database to detect pornographic images. The algorithm extracts shape, texture, colour and other features, searches the database for the 100 most similar labelled images, and counts how many of them are pornographic. If that count exceeds a threshold, the image is reported as pornographic.
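
To make the rule-based idea concrete, the following is a minimal Python sketch of a skin-area rule using OpenCV. The HSV thresholds and the coverage threshold are illustrative assumptions, and Yin et al.'s texture and fractal-dimension steps are omitted; it serves only to show how crude such a rule is.

```python
import numpy as np
import cv2  # OpenCV, used here for colour-space conversion

def skin_area_ratio(image_bgr):
    """Estimate the fraction of pixels inside a simple HSV skin-colour
    range. The thresholds below are illustrative, not Yin's values."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    lower = np.array([0, 40, 60], dtype=np.uint8)    # hypothetical lower bound
    upper = np.array([25, 180, 255], dtype=np.uint8) # hypothetical upper bound
    mask = cv2.inRange(hsv, lower, upper)            # 255 where skin-coloured
    return mask.mean() / 255.0

def rule_based_flag(image_bgr, threshold=0.4):
    """Naive rule: flag the image if skin coverage exceeds a threshold.
    This illustrates the weakness noted above: high skin area does not
    imply pornographic content."""
    return skin_area_ratio(image_bgr) > threshold
```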

CNN-based detection

CNN is an improvement on traditional neural networks. In image recognition, each pixel may be related to its adjacent pixels, and adjacent pixels may form patterns that improve recognition accuracy. The input and output layers of a CNN are the same as in an ordinary neural network. In the hidden part, a few pairs of convolution and pooling layers are connected to the input layer, while the remaining hidden layers resemble those of an ordinary network. Sub-squares of the original image (generally 3x3 or 5x5 pixels) are "filtered" in the convolutional layer by taking the inner product of each sub-square with a filter box of the same size. The pooling layers simply select the largest value in each square (max pooling). Consequently, a CNN requires far more computation than an ordinary neural network. Nian et al. improve on the plain CNN by reusing the structure of CaffeNet to reduce training time, while adding an auxiliary training set and random modification of input images to enhance accuracy. The input layer is also restricted to 227x227 pixels, with a sliding window moved across the image. The improved network provides 98.6% accuracy, outperforming the previous methods in pornographic-image detection.
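
The filtering and pooling described above can be illustrated with a short NumPy sketch; the 8x8 input, random 3x3 kernel and 2x2 pooling window are arbitrary choices for demonstration.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2D convolution: slide the kernel over the image and take
    the inner product at each position (no padding, stride 1)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Max pooling: keep the largest value in each size x size square."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size  # drop ragged edges
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

image = np.random.rand(8, 8)
kernel = np.random.rand(3, 3)            # a 3x3 filter box
pooled = max_pool(conv2d_valid(image, kernel))
print(pooled.shape)                      # (3, 3)
```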

Adversarial attack

Although neural-network-based machine learning techniques provide high accuracy in image classification, there are several ways to construct adversarial images that mislead the networks. Current attacks are roughly categorised by their target. Akhtar et al. have reviewed several typical attack modes.

  • For attacking classification and recognition, there are box-constrained L-BFGS, the fast gradient sign method (FGSM; a sketch of FGSM follows this list), the one-pixel attack, DeepFool, Houdini, etc.
  • For attacks beyond classification and recognition, there are attacks on autoencoders and generative models, attacks on recurrent neural networks (RNN), attacks on deep reinforcement learning, attacks on facial attributes, etc.
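
As an illustration of the first category, the following is a minimal PyTorch sketch of FGSM under stated assumptions: a differentiable `model`, a `loss_fn` such as cross-entropy, and inputs normalised to the range [0, 1].

```python
import torch

def fgsm_attack(model, loss_fn, x, y, eps=0.03):
    """Fast gradient sign method: perturb the input in the direction that
    increases the loss, then clip to stay in a valid image range.
    `model`, `loss_fn` and the 0..1 pixel range are assumptions here."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    with torch.no_grad():
        x_adv = x_adv + eps * x_adv.grad.sign()  # one signed-gradient step
        x_adv = x_adv.clamp(0.0, 1.0)            # keep pixels valid
    return x_adv.detach()
```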

Digital forensics

Digital forensics studies how digital evidence is "collected, preserved, examined, or transferred in a manner safeguarding the accuracy and reliability of the evidence"; to achieve this, "law enforcement and digital forensic units must establish and maintain an effective quality assurance system." According to Lampson's summary of computer security, a secure system is studied from three aspects: "specification/policy, implementation/mechanism, correctness/assurance". Unlike software or network security, pornographic-image detection is insensitive to most common security properties, including secrecy, availability, isolation, exclusion and restriction. Although operations such as access control or authentication may moderately help forensics in anti-tampering and related areas, the integrity aspect affects forensics far more. Integrity, "controlling how information changes or resources are used", ensures that information is original and has not been modified unilaterally. For image adversaries, since the adversary's operation is not restricted to any particular computer, authentication and auditing systems on a target computer will not help detect image modifications intended to mislead classification. The defender must retrieve clues from the potentially modified image, which is the only source of information. More specifically, double JPEG compression is one attack method that modifies the content of images to mislead forensic systems. The modification may easily deceive the human eye; therefore, computer-aided techniques can help reveal such unilateral modifications.

Double JPEG compression

Detection of double-JPEG-compressed images is a typical application of blind forensics, which has no source information other than the final image. Typically, blind forensics relies on "statistical and geometrical features, interpolation effects, or feature inconsistencies" to "verify the authenticity of images". Previous double JPEG compression detection techniques mainly determine whether an image is double compressed or not, e.g. by analysing the primary quantization matrix or by analysis based on the generalized Benford's law. Wang et al. developed a new approach that uses a CNN to locate the modified region of double-compressed images.
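
The Benford-style analysis mentioned above can be sketched as follows: compare the leading-digit distribution of the non-zero DCT coefficients against the generalized Benford's law p(d) = N * log10(1 + 1/(s + d^q)). The parameters N, s, q and the synthetic Laplacian coefficients below are illustrative assumptions, not values from the cited work.

```python
import numpy as np

def first_digit_histogram(dct_coeffs):
    """Distribution of leading digits of non-zero quantised DCT coefficients."""
    mags = np.abs(dct_coeffs[dct_coeffs != 0]).astype(np.int64)
    digits = np.array([int(str(m)[0]) for m in mags])  # leading digit 1..9
    hist = np.bincount(digits, minlength=10)[1:10].astype(float)
    return hist / hist.sum()

def generalized_benford(d, n=1.0, s=0.0, q=1.0):
    """p(d) = n * log10(1 + 1/(s + d**q)); n, s, q are placeholders that
    would be fitted per quality factor in practice."""
    return n * np.log10(1.0 + 1.0 / (s + np.asarray(d, dtype=float) ** q))

# Synthetic stand-in for one image's quantised DCT coefficients.
coeffs = np.random.laplace(0.0, 20.0, (64, 64)).round()
observed = first_digit_histogram(coeffs)
expected = generalized_benford(np.arange(1, 10))
# A large deviation from the fitted law hints at double compression.
print(np.abs(observed - expected).sum())
```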

Attack strategies

The basic principle of double JPEG compression is to paste a small part of one image onto a larger one. In this section, QF1 always refers to the quality factor of the original JPEG image that is decompressed, and QF2 to the quality factor used to recompress the composite image. The steps are as follows (a code sketch follows the list):

  • Cut and copy a region A1 from image A (of any format).
  • Decompress a JPEG image B, whose quality factor is QF1, and insert A1 into B. Let B1 denote the unchanged background region of B.
  • Resave the new composite image C in the JPEG format, with a JPEG quality factor QF2.
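
A minimal Pillow sketch of these three steps is given below; the file names, crop box and QF2 = 75 are placeholder assumptions.

```python
from PIL import Image

donor = Image.open("A.png").convert("RGB")    # image A, any format
a1 = donor.crop((100, 100, 228, 228))         # region A1 (128x128 here)

background = Image.open("B.jpg")              # JPEG image B, quality QF1
composite = background.copy()                 # decompressed in memory
composite.paste(a1, (50, 50))                 # insert A1 into B

# Resave as JPEG with a second quality factor QF2: the background region
# B1 is now compressed twice, while the pasted region A1 only once.
composite.save("C.jpg", format="JPEG", quality=75)  # QF2 = 75
```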

Detection strategies

The proposed detection strategy is based on the discrete cosine transform (DCT). As shown in figure 2, DCT coefficient histograms roughly follow a generalized Gaussian distribution, while the histograms of double-compressed images with QF1 < QF2 have periodically missing values, and those with QF1 > QF2 show a periodic pattern of peaks and valleys. There is a 1/64 chance of a false negative when the pasted area (A1 in figure 1) aligns exactly with the 8x8-pixel compression grid; this probability is low and is ignored in this approach. Exploiting these histogram characteristics, the proposed method uses DCT coefficient histogram values, an 11x9 feature for each block, as the input of the CNN.
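
A sketch of how such an 11x9 histogram feature might be computed is given below. The choice of the first nine AC frequencies in raster order and the clipping range [-5, 5] (giving 11 bins) are assumptions made to match the stated feature size, not necessarily Wang's exact construction.

```python
import numpy as np
from scipy.fft import dctn

def block_dct(image_gray):
    """8x8 blockwise 2D DCT, as used in JPEG."""
    h, w = (d - d % 8 for d in image_gray.shape)  # trim to multiples of 8
    blocks = image_gray[:h, :w].reshape(h // 8, 8, w // 8, 8).swapaxes(1, 2)
    return dctn(blocks, axes=(-2, -1), norm="ortho")

def histogram_feature(coeffs, n_freq=9, n_bins=11):
    """An 11x9 feature: for each of the first n_freq AC frequencies, an
    n_bins histogram of rounded coefficients clipped to [-5, 5]."""
    flat = coeffs.reshape(-1, 64)                 # blocks x 64 frequencies
    feat = np.zeros((n_bins, n_freq))
    for k in range(n_freq):
        vals = np.clip(np.round(flat[:, k + 1]), -5, 5).astype(int)
        feat[:, k] = np.bincount(vals + 5, minlength=n_bins)
    return feat / max(flat.shape[0], 1)           # normalise by block count

feat = histogram_feature(block_dct(np.random.rand(64, 64) * 255))
print(feat.shape)                                 # (11, 9), fed to the CNN
```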

The result of detection is shown in figure 4. In brief, a higher QF2 and a higher image resolution yield higher accuracy, and performance is better when QF1 > QF2. The proposed method automatically localises the modified section and, with the deep CNN model, handles even small blocks well.

Related works

Defences of adversarial attacks on neural-network-based image classification and recognition are roughly divided into three main categories: using a modified training or test set, modifying the neural network itself, and using external models. Unlike the proposed method, which accepts only DCT coefficients as CNN input, the defence strategies summarised here generally take the image (or part of the original image) as input. The pornographic-image detection of Nian et al. is a typical example of data-randomisation training with a modified training set. Nian applied flipping, Gaussian blur, light change, random lighting noise and RGB channel change to the original images. None of these modifications removes the pornographic suggestion for a human viewer, but each may mislead the CNN. By applying them to the training set, Nian shows that the model detects pornographic images carrying such modifications with higher accuracy. Typically, the data-randomisation approach provides better accuracy and training efficiency than brute-force adversarial training. However, its drawback is obvious: training takes much longer because the training set grows significantly, and the model cannot resist any modification that was not applied to the training set.
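
A sketch of such a data-randomisation pipeline using torchvision transforms follows; the parameter values are illustrative (Nian's exact settings are not given here), and the RGB channel change is omitted for brevity.

```python
from torchvision import transforms

# Illustrative augmentations mirroring those described above.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),           # flipping
    transforms.GaussianBlur(kernel_size=5),           # Gaussian blur
    transforms.ColorJitter(brightness=0.3),           # light change
    transforms.ColorJitter(hue=0.1, saturation=0.3),  # random lighting noise
    transforms.ToTensor(),
])
# Applied on the fly, each epoch sees a different variant of every training
# image, so the effective training-set size grows without storing copies.
```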

Bayar et al. propose a deep-learning-based universal image manipulation detection method by adding a new convolutional layer. This approach is a typical defence strategy of modifying the neural network without any processing of the training set. Compared with Wang's method, Bayar's proposal focuses on detecting different types of modification with one universal model, without detailed localisation. Although it cannot localise the modified region, it could serve as a first round of detection, whose result could then be combined with a more specific detector such as Wang's. Liu et al. propose a more specific method that focuses on detecting copy-move forgery, a particular kind of image modification. Their idea is based on the colour filter array (CFA), a grid-based 2D pattern with green cells arranged in diagonal stripes. The input consists of 32x32 prediction-error patches, and the convolutional layer applies 50 kernels of 7x7 pixels with a stride of 1 pixel. The pooling layer combines both max and min pooling. The method achieves roughly 95% accuracy; compared with Wang's method, however, its input images are all of very high resolution, which increases the accuracy.
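
As an illustration of the modified-network category, the following PyTorch sketch shows a constrained first convolutional layer in the spirit of Bayar's proposal: after each optimiser step, every filter is projected so that its centre weight is -1 and its remaining weights sum to 1, pushing the layer toward prediction-error features. The exact constraint and projection schedule here are assumptions, not a verbatim reproduction of Bayar's layer.

```python
import torch
import torch.nn as nn

class ConstrainedConv2d(nn.Conv2d):
    """Sketch of a Bayar-style constrained convolution (assumed form):
    centre weight fixed to -1, remaining weights normalised to sum to 1."""
    def constrain(self):
        with torch.no_grad():
            w = self.weight               # shape (out, in, k, k)
            c = w.shape[-1] // 2          # centre index of the kernel
            w[:, :, c, c] = 0.0           # exclude centre from the sum
            s = w.sum(dim=(2, 3), keepdim=True)
            w /= s                        # off-centre weights sum to 1
            w[:, :, c, c] = -1.0          # fix centre weight to -1

layer = ConstrainedConv2d(3, 12, kernel_size=5, padding=2)
layer.constrain()  # re-apply after every optimiser step during training
```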

Conclusion

Starting from a CNN-based pornographic-image detection method, this report analyses and evaluates more general image modification detection methods. The main method provides accurate detection of double JPEG compression, which improves credibility from a digital forensics perspective. However, limited by the current state of the art, most proposals focus on specific kinds of adversarial attacks; combining different tools and techniques is therefore necessary in real-world practice.

Future works

The methods analysed in this report are based on detecting modifications of the images. However, another type of attack targets the neural network model itself. Although protecting models from unauthorised access helps prevent adversaries from analysing them, in many practical settings models are deployed on end devices without serious protection. Upgradable or dynamic models may help reduce the risk of the network core being compromised. Furthermore, new types of adversarial examples may mislead both humans and computers. Since most current methods require manually labelling the images in the training set, such newly designed adversarial examples may directly degrade the accuracy of the training set before training even begins. To resist these new attacks, techniques with better resistance are necessary for digital forensics.
