Using Color Texture Sparsity for Facial Expression Recognition
1. Introduction
- This research presents a new facial expression recognition (FER) method which exploits the effectiveness of color information and sparse representation.
- Proposed FER method:
  - To extract face features, we compute color vector differences between color pixels, which effectively capture changes in face appearance (e.g., skin texture).
  - With the extracted features, a sparse representation based classification (SRC) framework is adopted for FER.
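As a sketch of the extraction idea, color vector differences between neighboring pixels can be computed as below. This is only illustrative: the function name, the horizontal-neighbor layout, and the radius parameter are assumptions, not the paper's exact operator.

```python
import numpy as np

def color_vector_differences(img, radius=1):
    """Difference between each pixel's color vector and its horizontal
    neighbor at `radius` pixels; the norm of the difference captures the
    strength of local appearance change (e.g., skin texture)."""
    center = img[:, :-radius, :].astype(float)    # H x (W - radius) x 3
    neighbor = img[:, radius:, :].astype(float)
    diff = neighbor - center                      # color vector differences
    magnitude = np.linalg.norm(diff, axis=2)      # per-pixel change strength
    return diff, magnitude
```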
- Through extensive comparative experiments using two public FER databases (DBs), we validate that our color texture features are well suited to sparse representation for improving FER accuracy.
  - Our color texture features considerably improve the recognition accuracy obtained by sparse representation compared to other features (e.g., local binary patterns (LBP)) under realistic recognition conditions (e.g., low-resolution faces).
  - The use of our features yields high discrimination capability and sparsity, justifying the high recognition accuracies obtained. Further, the proposed FER outperforms five other state-of-the-art FER methods.
2. Facial Expression Recognition using Sparsity of Color Texture Features
- In this section, we describe how color texture features are applied to sparse representation for FER.
- The following figure illustrates the generic framework for creating a dictionary with color texture features.
  - Color conversion: a given RGB training face image is converted into face images in different color bands (e.g., the YIQ color space).
  - From the color band images, different kinds of color texture features are obtained.
  - We extract histogram-based features, which are known to be robust to face registration errors in sparse representation based classification.
  - After reducing the dimensionality of each feature vector, the dictionary is generated by fusing the complementary feature vectors at the feature level.
Fig. 1. Generic framework of creating a dictionary with color texture features. The YIQ color space is used as an example.
A. Extracting Color Texture Features for Dictionary
- Our feature extraction is based on computing color vector differences between pixels.
  - The most distinctive characteristic is the use of contrast information from color pixel differences for extracting skin textures.
- Feature extraction schemes: color magnitude patterns and color directional patterns.
  - The two kinds of patterns help characterize local changes of facial expression more accurately, since they encode the strength of a texture complementarily with its spatial structure.
  - Color magnitude patterns (differences of color vector norms): color sign pattern (CSP) and color magnitude pattern (CMP).
  - Color directional patterns (differences of color vector directions): color directional pattern (CDP) and color directional arc pattern (CDAP).
A color vector, denoted by c, is defined at the center pixel location of each local region image.
Fig. 2. (a) Color vector representation and (b) the four types of features.
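The split between magnitude-type and direction-type information can be sketched as follows. The exact CSP/CMP/CDP/CDAP encodings in the paper differ; the computations below are only illustrative assumptions showing norm-difference cues versus angular cues.

```python
import numpy as np

def local_color_patterns(center, neighbors):
    """Sketch of the magnitude/direction split for one local region.
    center: (3,) color vector c; neighbors: (P, 3) neighbor color vectors.
    Illustrative only: the paper's actual pattern definitions differ."""
    n_norm = np.linalg.norm(neighbors, axis=1)
    c_norm = np.linalg.norm(center)
    norm_diff = n_norm - c_norm
    sign_bits = (norm_diff >= 0).astype(int)      # CSP-like: sign of norm diff
    magnitude = np.abs(norm_diff)                 # CMP-like: size of norm diff
    cos = neighbors @ center / (n_norm * c_norm + 1e-12)
    angle = np.arccos(np.clip(cos, -1.0, 1.0))    # CDP-like: angular change
    return sign_bits, magnitude, angle
```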
B. FER Using Dictionary of Color Texture Features
- In this section, we describe the creation of a dictionary with our color texture features and FER using the dictionary.
- Dictionary construction given the individual pattern histogram vectors:
  - To satisfy the under-determined condition of the dictionary in SRC, Fisher Linear Discriminant Analysis (FLDA) is applied to the individual features.
  - Each feature is normalized to zero mean and unit variance.
  - The individual features are concatenated (feature-level fusion).
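A minimal sketch of the normalization and fusion steps, assuming the per-feature training matrices have already been reduced (e.g., by FLDA). The function name is an assumption, and the final unit-norm column step is the customary SRC convention rather than something stated in the text.

```python
import numpy as np

def build_dictionary(feature_sets):
    """Fuse per-feature training matrices into one SRC dictionary.
    feature_sets: list of (d_k, n) arrays (columns = training samples),
    assumed already dimensionality-reduced so the fused matrix stays
    under-determined."""
    fused = []
    for F in feature_sets:
        mu = F.mean(axis=1, keepdims=True)
        sd = F.std(axis=1, keepdims=True) + 1e-12
        fused.append((F - mu) / sd)        # zero mean, unit variance
    D = np.vstack(fused)                   # feature-level fusion
    return D / np.linalg.norm(D, axis=0, keepdims=True)
```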
- Since both micro and macro skin textures can be used to distinguish between different facial expressions, we propose to use multi-resolution analysis for our color texture features.
Fig. 3. Two important characteristics (i.e., the fine wrinkles between the eyebrows and the principal lines on both sides of the nose) used to distinguish 'Disgust' from 'Squint'.
- To this end, we define multiple face features, each generated using a particular radius size denoted by R_m. Here, m is the radius index and M is the number of radii used.
  - The multi-resolution face feature vector f is obtained by concatenating the single-resolution face feature vectors: f = [f_1; f_2; ...; f_M].
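The multi-resolution concatenation can be sketched as below, where `extract` is a hypothetical placeholder for the single-resolution feature extractor:

```python
import numpy as np

def multiresolution_feature(extract, img, radii=(1, 3, 5, 7, 9)):
    """f = [f_1, ..., f_M]: concatenate the single-resolution features f_m
    computed at each radius R_m. `extract(img, r)` is a placeholder for
    the single-resolution extractor."""
    return np.concatenate([extract(img, r) for r in radii])
```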
Fig. 4. Framework of sparse representation based classification for FER.
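The SRC stage itself can be sketched as follows. The sparse code is computed here with ISTA, an assumption since the solver is not specified in the text; the function name, regularization weight, and iteration count are likewise illustrative.

```python
import numpy as np

def src_classify(D, labels, y, lam=0.01, iters=500):
    """Sparse representation based classification (sketch).
    D: (d, n) dictionary with unit-norm columns; labels: (n,) class of
    each column; y: (d,) test feature. Finds a sparse code for y, then
    returns the class with the smallest reconstruction residual."""
    L = np.linalg.norm(D, 2) ** 2              # Lipschitz constant of gradient
    x = np.zeros(D.shape[1])
    for _ in range(iters):                     # ISTA: min 0.5||y-Dx||^2 + lam||x||_1
        g = x + D.T @ (y - D @ x) / L
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)
    classes = np.unique(labels)
    res = [np.linalg.norm(y - D[:, labels == c] @ x[labels == c])
           for c in classes]
    return classes[int(np.argmin(res))], x
```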
3. Experiment
- Facial expression DBs used: CMU Multi-PIE DB and BU-3DFE DB.
- For comparative evaluation, three sparse representation based FER methods using other features were considered:
  - Raw pixels, the LBP feature, and a color based feature called Local Color Vector Binary Pattern (LCVBP).
  - For the FER methods using LBP and LCVBP features, we adopted uniform LBP operations with P = 8 and R = 2.
  - For our color texture features, we use LBP operations with P = 8 and R in {1, 3, 5, 7, 9} for the multi-resolution face feature, which were chosen empirically.
  - To guarantee a fair comparison, low-dimensional feature extraction (i.e., FLDA) was applied to the raw pixel data, LBP, LCVBP, and our proposed color texture features.
- To investigate the stability of the proposed method in terms of FER accuracy under different color representations, three widely used color spaces were used in our experiments: RGB, YCbCr, and CIELuv.
Fig. 5. (a) CMU Multi-PIE DB and (b) BU-3DFE DB
A. Experiment 1: Comparisons of Recognition Accuracy under Different FER Challenges
- The objective of this experiment is to present comparative results demonstrating the effectiveness of the proposed FER under challenging conditions: illumination and face resolution changes.
- Under illumination change with the Multi-PIE DB
  - Multi-PIE DB: contains severe illumination changes due to different illumination settings (e.g., with or without flash, different illumination directions, etc.).
  - Results:
    - Sparse representation using color texture features performs better than that using grayscale based features.
    - Our color texture features achieve the highest recognition accuracy under illumination change.
    - Multi-resolution analysis of our features proves effective for improving FER accuracy.
Table 1. Identification rates obtained using the CMU Multi-PIE DB.
- Under face resolution change with the BU-3DFE DB
  - Evaluation of the robustness of the proposed FER under changes in face image resolution.
  - BU-3DFE DB: very challenging due to a variety of ethnic/racial ancestries and expression intensities.
  - Results:
    - SRC using raw pixels fails to work with the BU-3DFE DB.
    - With LBP, recognition performance drops significantly as the face resolution decreases.
    - In contrast, our features yield feasible recognition performance even at extremely low resolutions.
    - This shows that SRC using our color based features is much more robust to low-resolution faces.
Table 2. Identification rates obtained using the BU-3DFE DB. Face images of 60x60 and 30x30 pixels were obtained by down-sampling the original 120x120-pixel face images. To obtain enough features, the low-resolution images were resized back to the original size (120x120 pixels) by interpolation.
B. Experiment 2: Separability and Sparsity Analysis of the Proposed Color Texture Features
- For successful sparse representation classification, the dictionary needs to be discriminative (i.e., the training samples in the dictionary should have small within-class variation but large between-class variation).
- Investigation of the discrimination capability of the dictionary:
  - We compared the distribution (visualized in 3D feature space) of our features with those of three other features (raw pixel, LBP, and LCVBP features).
  - For this analysis, Principal Component Analysis (PCA) was separately applied to each face feature.
- The following figures demonstrate that our features have sufficiently high expression discrimination capability even in a low-dimensional feature space (see Fig. 6).
Fig. 6. Illustration of 3D distributions for four distinct features: (a) raw pixel, (b) LBP, (c) LCVBP, and (d) our color texture feature.
- Analysis of sparsity
  - From the above observations, it is naturally expected that, when the color texture features are used, a given test face image is mostly represented by training face images of the true expression class, which is desirable for classification.
  - To quantify this, the following measure, termed sparse coefficient concentration in true class (SCCTC), was devised in this research to measure the usefulness of features in terms of class-wise sparsity:

        SCCTC = ||delta_c(x)||_1 / ||x||_1,

    where x is the sparse coefficient vector and delta_c indicates the characteristic function that selects the coefficients associated with the true class c.
  - If the SCCTC value is 1, the non-zero sparse coefficients are concentrated only on the true class.
  - If the SCCTC value is 0, none of the non-zero sparse coefficients lie in the true class.
- Since SCCTC quantifies how strongly the coefficients are concentrated on the true class for a given test face image, it reflects the probability that the true class will coincide with the class suggested by the recognition.
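One way to compute a measure consistent with the description above (the ratio of l1 coefficient mass in the true class to the total l1 mass, so the value is 1 when all non-zero coefficients lie in the true class and 0 when none do); the function and variable names are assumptions:

```python
import numpy as np

def scctc(x, labels, true_class):
    """Sparse coefficient concentration in true class: l1 mass of the
    coefficients belonging to the true class over total l1 mass."""
    total = np.abs(x).sum()
    return 0.0 if total == 0 else np.abs(x[labels == true_class]).sum() / total
```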
Table 3. Comparison of SCCTC between different features, corresponding to the results in Table 1. The SCCTC value for each feature type was computed by averaging the SCCTCs over all test face images. Single-resolution face features were used for the proposed color texture features.
C. Experiment 3: Comparison of Recognition Accuracy among Different Pattern Combinations
- We performed an additional experiment to evaluate different combinations of our color texture features.
  - Color sign patterns achieve the best identification rates among the individual patterns.
  - Recognition performance can be improved by adding color magnitude patterns.
  - The combination of all four kinds of patterns yields even better recognition results.
Table 4. Identification rates obtained for different pattern combinations. The RGB color space was used for the color representation.
D. Experiment 4: Comparison with State-of-the-Art FER Methods using the BU-3DFE DB
- In this experiment, we used the BU-3DFE DB for comparison with other state-of-the-art FER results:
  - the Context method, Gabor method, Local Gabor Binary Pattern (LGBP), RGB Tensor, and Tensor Perceptual Color Framework (TPCF).
Table 5. Comparison of identification rates between the proposed method and state-of-the-art methods on the BU-3DFE DB.
4. Conclusion
- This research investigated the contribution of color information to sparse representation for improving facial expression recognition (FER) performance.
- To exploit the sparsity of color information for FER, we presented a method for accurately capturing facial skin textures derived from color vectors.
- Experimental results showed that the proposed FER, which relies on color texture sparsity, is highly robust to realistic recognition conditions.
- Our future work includes comparing our features with other state-of-the-art features, and investigating a systematic way of selecting effective color spaces to further improve sparsity based FER.
* Contact Person: Prof. Yong Man Ro (ymro@kaist.ac.kr)
1. Seung Ho Lee, Konstantinos N. Plataniotis, and Yong Man Ro, "Intra-class Variation Reduction Using Training Expression Images for Sparse Representation Based Facial Expression Recognition," IEEE Transactions on Affective Computing (Under Revision), 2014.
2. Seung Ho Lee, Hyungil Kim, Konstantinos N. Plataniotis, and Yong Man Ro, "Using Color Texture Sparsity for Facial Expression Recognition," IEEE International Conference on Automatic Face and Gesture Recognition, 2013.
3. Seung Ho Lee, Jae Young Choi, Yong Man Ro, and Konstantinos N. Plataniotis, "Local Color Vector Binary Patterns from Multichannel Face Images for Face Recognition," IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 2347-2353, 2012.
- Video: facial expression recognition in video.