Intra-Class Variation Reduction Using Training Expression Images for Subject Independent Facial Expression Recognition

2013. 08. 01 ~



1. Introduction

•  Facial expressions provide plenty of information about emotions, intentions, and other internal states.

-   The ability to recognize a person's facial expressions could give rise to a wide range of applications.


•  The sparse representation classifier (SRC) has been reported to outperform other widely used classifiers (e.g., support vector machine (SVM)) for facial expression recognition (FER).


•  Many methods use SRC with appearance features (such as local binary pattern (LBP), Gabor wavelet, and local phase quantization (LPQ)).

-   However, using appearance features directly is not straightforward.

-   The facial identity of a subject is often confused with the facial expression (Fig. 1).

Fig. 1.  Example of sparse representation to show the inappropriateness of directly using appearance features for FER.


•  One of the most straightforward ways to remove the confusion between facial identity and facial expression:

-   Use the difference information between the query face and the neutral face of the same subject.

-   The face is thereby normalized: the effect of facial identity is reduced and the facial expression is emphasized.




Fig. 2.  Two kinds of difference information between query face and its neutral face

(a) Image difference between query face and its neutral face

(b) Feature point displacement between query face and its neutral face


•  Problems that can be encountered in real applications

-   Problem 1) The query subject is not always present in the training data.

✓  In real applications, faces of many unknown subjects can be input to an FER system.


-   Problem 2) The neutral state of a face is not always available in the training data.


•  Our method to solve the problems

-   To solve Problem 1, we generate an imaginary face image that is used to normalize the query face image.

✓  The imaginary face image is called an intra-class variation (ICV) image.

✓  Facial expression images of various subjects in the training data are combined to generate the ICV image.

✓  Because the combination is chosen to approximate the query face, the ICV image is similar in identity to the query face.


-   To solve Problem 2, ICV images are generated from non-neutral (i.e., expressive) training face images (see Fig. 3(c)).

✓  We verify that ICV images obtained from non-neutral face images are about as effective as those obtained from neutral face images.






Fig. 3.  Examples of generated ICV images.

(a) Query face image. (b) ICV image of neutral expression. (c) ICV image of non-neutral (happy) expression.


2. Extracting expression features of query face image using ICV image

•  ICV image generation

-   Training expression images are linearly combined to approximate the query face image.

-   The resulting approximation is the ICV image, which looks similar to the query face image in identity and illumination (Fig. 4).

Fig. 4. ICV image generation by combining training expression images
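
The linear combination described above can be sketched as a small least-squares problem. This is only a minimal illustration under stated assumptions: the function name `generate_icv_image`, the array layout, and the small ridge penalty are hypothetical choices made here, and the exact approximation used by the proposed method is not specified in this summary.

```python
import numpy as np

def generate_icv_image(query, train_images, ridge=1e-3):
    """Approximate `query` by a linear combination of training images.

    query        : (h, w) array, the query face image
    train_images : (n, h, w) array, training faces of ONE expression class,
                   taken from various subjects
    ridge        : small L2 penalty (an assumption, for numerical stability)
    Returns the approximated (ICV-like) image and the combination weights.
    """
    n = train_images.shape[0]
    A = train_images.reshape(n, -1).T.astype(float)   # (pixels, n)
    y = query.reshape(-1).astype(float)               # (pixels,)
    # Closed-form solution of min_w ||y - A w||^2 + ridge * ||w||^2.
    w = np.linalg.solve(A.T @ A + ridge * np.eye(n), A.T @ y)
    icv = (A @ w).reshape(query.shape)
    return icv, w

# Toy usage with random arrays standing in for aligned face images.
rng = np.random.default_rng(0)
train = rng.random((5, 8, 8))
query = rng.random((8, 8))
icv, weights = generate_icv_image(query, train)
print(icv.shape, weights.shape)
```

In this sketch one ICV image is produced per expression class by passing the training images of that class only; real face images would need to be aligned and cropped to a common size first.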


•  Expression feature extraction: subtracting the ICV image from the query face image

-   The appearance of facial identity is reduced and the expression difference is emphasized (Fig. 5).

-   The resulting expression feature is used for FER.


Fig. 5. Expression feature obtained as the difference between the query face image and the ICV image.
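
In symbols, the generation and subtraction steps of this section can be written as follows. The notation is introduced here for illustration only and is not taken from the source: $y$ is the vectorized query face image, the columns of $A_c$ are the vectorized training images of expression $c$, $\tilde{y}_c$ is the ICV image, and $f_c$ is the expression feature. The plain least-squares objective is likewise an assumption, since the exact approximation is not stated in this summary.

```latex
\begin{align}
\hat{w}_c   &= \arg\min_{w} \; \lVert y - A_c w \rVert_2^2 , \\
\tilde{y}_c &= A_c \hat{w}_c , \\
f_c         &= y - \tilde{y}_c .
\end{align}
```

Since $\tilde{y}_c$ is built to resemble $y$ in identity and illumination, these components largely cancel in $f_c$, which is why the difference emphasizes expression.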



3. Sparsity Based Facial Expression Recognition


Fig. 6. The frameworks for SRC using ICV images.



•  Sparse representation classification (SRC) to deal with a variety of expressions

-   In the expression features, noisy information such as identity and illumination is reduced.

-   However, some variation in expression still remains (e.g., variation in expression intensity).

-   Sparsity-based classification, which is robust to such variation, is therefore adopted.


•  Sparsity-based classification using multiple expression features

-   Multiple expression features are obtained by using different ICV images (see Fig. 6).

-   Each expression feature is analyzed independently by sparse representation.

-   The individual sparse solutions are combined, and the expression label is determined

✓  by finding the expression class with the largest average sparse coefficient (in the combined sparse solution).
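
The decision rule above can be sketched as follows. This is a hedged illustration, not the implementation used in the experiments: the sparse solver is a plain ISTA iteration for an L1-regularized least-squares problem, the sparse solutions are combined by simple concatenation, and all names (`ista_lasso`, `classify_by_average_coefficient`, `lam`) are hypothetical.

```python
import numpy as np

def ista_lasso(D, f, lam=0.05, n_iter=500):
    """Minimize 0.5 * ||f - D x||^2 + lam * ||x||_1 with plain ISTA."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)      # 1 / Lipschitz constant
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - step * (D.T @ (D @ x - f))        # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return x

def classify_by_average_coefficient(dictionaries, features, labels, lam=0.05):
    """Fuse several sparse solutions and pick the class with the largest mean.

    dictionaries[k] : (dim, n) training expression features for the k-th ICV type
    features[k]     : (dim,) query expression feature for the k-th ICV type
    labels          : (n,) class label of each dictionary column (same for all k)
    """
    labels = np.asarray(labels)
    # One independent sparse solution per expression feature.
    solutions = [ista_lasso(D, f, lam) for D, f in zip(dictionaries, features)]
    combined = np.concatenate(solutions)          # assumed combination rule
    combined_labels = np.tile(labels, len(solutions))
    classes = np.unique(labels)
    # Average sparse coefficient per expression class in the combined solution.
    scores = [combined[combined_labels == c].mean() for c in classes]
    best = classes[int(np.argmax(scores))]
    return best, dict(zip(classes.tolist(), scores))
```

Each dictionary here is assumed to hold training expression features computed with the same ICV type as the corresponding query feature, so that coefficients from different ICV types vote for the same set of expression classes.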



4. Results

<Databases (DBs)>

•  The CK+ and CMU Multi-PIE DBs are used for the subject-independent recognition experiments.



Fig. 7.  Examples of the facial expression images used. (a) CK+ DB. (b) CMU Multi-PIE DB (with severe illumination change).


<Organization of experiment>

•  Experiment 1) To determine how many training subjects are needed for subject-independent FER with ICV images

•  Experiment 2) To evaluate the effectiveness of ICV images generated from non-neutral expressions

•  Experiment 3) To evaluate the effectiveness of the proposed method in comparison with recent methods
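
All three experiments are subject-independent, that is, the query subject does not appear in the training data. A minimal sketch of such a split is given below; the function name, the number of folds, and the subject identifiers are hypothetical, and the actual evaluation protocol is not specified in this summary.

```python
import random
from collections import defaultdict

def subject_independent_folds(subject_ids, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs in which no subject appears in both."""
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    for k in range(n_folds):
        # Every n_folds-th subject (starting at k) is held out for testing.
        test_subjects = set(subjects[k::n_folds])
        test_idx = [i for s in subjects if s in test_subjects
                    for i in by_subject[s]]
        train_idx = [i for s in subjects if s not in test_subjects
                     for i in by_subject[s]]
        yield train_idx, test_idx

# Toy usage: one subject identifier per image.
ids = ["s1", "s1", "s2", "s3", "s3", "s4", "s5", "s5", "s6", "s7"]
for train_idx, test_idx in subject_independent_folds(ids, n_folds=3):
    print(len(train_idx), len(test_idx))
```

Splitting by subject rather than by image is what keeps ICV generation and dictionary construction from ever seeing the query identity.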


<Experimental result>

•  Experiment 1) Recognition rates for a varying number of subjects used to generate ICV images (on CMU Multi-PIE DB)

-   The proposed method is feasible for subject-independent recognition even when only a limited number of subjects is available in the training set.

-   It is clearly better than SRC with appearance features.

•  Experiment 2) Comparison of recognition rates when a single expression is used to generate ICV images (on CMU Multi-PIE DB)

-   Using a non-neutral expression for the ICV image yields recognition performance similar to that obtained with the neutral expression.

-   The ICV images of one expression (e.g., Neutral) are complementary to those of the other expressions (e.g., Smile), which leads to improved FER.

•  Experiment 3) Comparison with state-of-the-art methods (on CK+ DB)


[R1] A. R. Rivera, J. R. Castillo, and O. Chae, Local Directional Number Pattern for Face Analysis, IEEE Trans. Image Processing, vol. 22, no. 5, pp. 1740-1752, 2013.

[R2] C. L. S. Naika, S. S. Jha, P. K. Das, and S. B. Nair, Automatic Facial Expression Recognition Using Extended AR-LBP, Wireless Network and Computational Intelligence, vol. 292, Part. 3, pp. 244-252, 2012.

[R3] X. Huang, G. Zhao, W. Zheng, and M. Pietikäinen, Spatiotemporal Local Monogenic Binary Patterns for Facial Expression Recognition, IEEE Signal Processing Letters, vol. 19, no. 5, pp. 243-246, 2012.

[R4] Y. Li, S. Wang, Y. Zhao, and Q. Ji, Simultaneous Facial Feature Tracking and Facial Expression Recognition, IEEE Trans. Image Processing, vol. 22, no. 7, pp. 2559-2573, 2013.

[R5] S. Yang and B. Bhanu, Understanding Discrete Facial Expressions in Video Using an Emotion Avatar Image, IEEE Trans. Syst., Man, Cybern. B, vol. 42, no. 4, pp. 980-992, 2012.

[R6] S. W. Chew, S. Lucey, P. Lucey, S. Sridharan, and J. F. Cohn, Improved Facial Expression Recognition via Uni-Hyperplane Classification, IEEE Int'l Conf. on Computer Vision and Pattern Recognition (CVPR), 2012.


5. Conclusions

•  For a query face image, the proposed method generated an intra-class variation (ICV) image from the training face images of each expression.

-   The ICV image had an appearance similar to the query face image in identity and illumination.

-   The ICV image represented the same expression as the associated training face images.

•  Experiments showed that the proposed method is effective for subject-independent recognition under illumination variation.


* Contact Person: Prof. Yong Man Ro (

1.     S. H. Lee, K. N. Plataniotis, and Y. M. Ro, Intra-Class Variation Reduction Using Training Expression Images for Sparse Representation Based Facial Expression Recognition, IEEE Transactions on Affective Computing, 2014.

2.     S. H. Lee, S. Y. Park, H. Kim, and Y. M. Ro, Partial Matching of Face Sequence Using Over-complete Dynamic Dictionary for Facial Expression Recognition, IEEE International Conference on Image Processing, 2015 (submitted).

3.     S. H. Lee, H. Kim, K. N. Plataniotis, and Y. M. Ro, Using Color Texture Sparsity for Facial Expression Recognition, IEEE International Conference on Automatic Face and Gesture Recognition, 2013.