Using Color Texture Sparsity for Facial Expression Recognition
1. Introduction
- This research presents a new facial expression recognition (FER) method which exploits the effectiveness of color information and sparse representation.
- Proposed FER method:
  - To extract face features, we compute color vector differences between color pixels, which effectively capture changes in face appearance (e.g., skin texture).
  - With the extracted features, a sparse representation based classification (SRC) framework is adopted for FER.
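As a sketch of the extraction idea, color vector differences between neighboring pixels can be computed as below. This is only illustrative: the function name, the horizontal-neighbor layout, and the radius parameter are assumptions, not the paper's exact operator.

```python
import numpy as np

def color_vector_differences(img, radius=1):
    """Difference between each pixel's color vector and its horizontal
    neighbor at `radius` pixels; the norm of the difference captures the
    strength of local appearance change (e.g., skin texture)."""
    center = img[:, :-radius, :].astype(float)    # H x (W - radius) x 3
    neighbor = img[:, radius:, :].astype(float)
    diff = neighbor - center                      # color vector differences
    magnitude = np.linalg.norm(diff, axis=2)      # per-pixel change strength
    return diff, magnitude
```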
- Through extensive comparative experiments using two public FER databases (DBs), we validate that our color texture features are well suited to sparse representation for improving FER accuracy.
  - Our color texture features considerably improve the recognition accuracy obtained by sparse representation compared to other features (e.g., local binary patterns (LBP)) under realistic recognition conditions (e.g., low-resolution faces).
  - The use of our features yields high discrimination capability and sparsity, justifying the high recognition accuracies obtained. Further, the proposed FER outperforms five other state-of-the-art FER methods.
2. Facial Expression Recognition using Sparsity of Color Texture Features
- In this section, we describe how color texture features are applied to sparse representation for FER.
- The following figure illustrates the generic framework for creating a dictionary with color texture features.
  - Color conversion: a given RGB training face image is converted into face images in different color bands (e.g., the YIQ color space).
  - From the color band images, different kinds of color texture features are obtained.
  - We extract histogram-based features, which are known to be robust to face registration errors in sparse representation based classification.
  - After reducing the dimensionality of each feature vector, the dictionary is generated by fusing the complementary feature vectors at the feature level.
Fig. 1. Generic framework of creating a dictionary with color texture features. The YIQ color space is used as an example.
A. Extracting Color Texture Features for Dictionary
- Our feature extraction is based on computing color vector differences between pixels.
  - The most distinctive characteristic is the use of contrast information from color pixel differences for extracting skin textures.
- Feature extraction schemes: color magnitude patterns and color directional patterns.
  - The two kinds of patterns help characterize local changes of facial expression more accurately, since they encode the strength of a texture complementarily with its spatial structure.
  - Color magnitude patterns (differences of color vector norms): color sign pattern (CSP) and color magnitude pattern (CMP).
  - Color directional patterns (differences of color vector directions): color directional pattern (CDP) and color directional arc pattern (CDAP).
A color vector, denoted by c, is defined at the center pixel location of each local region image.
Fig. 2. (a) Color vector representation and (b) the four types of features.
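The split between magnitude-type and direction-type information can be sketched as follows. The exact CSP/CMP/CDP/CDAP encodings in the paper differ; the computations below are only illustrative assumptions showing norm-difference cues versus angular cues.

```python
import numpy as np

def local_color_patterns(center, neighbors):
    """Sketch of the magnitude/direction split for one local region.
    center: (3,) color vector c; neighbors: (P, 3) neighbor color vectors.
    Illustrative only: the paper's actual pattern definitions differ."""
    n_norm = np.linalg.norm(neighbors, axis=1)
    c_norm = np.linalg.norm(center)
    norm_diff = n_norm - c_norm
    sign_bits = (norm_diff >= 0).astype(int)      # CSP-like: sign of norm diff
    magnitude = np.abs(norm_diff)                 # CMP-like: size of norm diff
    cos = neighbors @ center / (n_norm * c_norm + 1e-12)
    angle = np.arccos(np.clip(cos, -1.0, 1.0))    # CDP-like: angular change
    return sign_bits, magnitude, angle
```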
B. FER Using Dictionary of Color Texture Features
- In this section, we describe the creation of a dictionary with our color texture features and FER using the dictionary.
- Dictionary construction given the individual pattern histogram vectors:
  - To satisfy the under-determined condition of the dictionary in SRC, Fisher Linear Discriminant Analysis (FLDA) is applied to the individual features.
  - Each feature is normalized to zero mean and unit variance.
  - The individual features are concatenated (feature-level fusion).
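A minimal sketch of the normalization and fusion steps, assuming the per-feature training matrices have already been reduced (e.g., by FLDA). The function name is an assumption, and the final unit-norm column step is the customary SRC convention rather than something stated in the text.

```python
import numpy as np

def build_dictionary(feature_sets):
    """Fuse per-feature training matrices into one SRC dictionary.
    feature_sets: list of (d_k, n) arrays (columns = training samples),
    assumed already dimensionality-reduced so the fused matrix stays
    under-determined."""
    fused = []
    for F in feature_sets:
        mu = F.mean(axis=1, keepdims=True)
        sd = F.std(axis=1, keepdims=True) + 1e-12
        fused.append((F - mu) / sd)        # zero mean, unit variance
    D = np.vstack(fused)                   # feature-level fusion
    return D / np.linalg.norm(D, axis=0, keepdims=True)
```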
- Since both micro and macro skin textures can be used to distinguish between different facial expressions, we propose to use multi-resolution analysis for our color texture features.
Fig. 3. Two important characteristics (i.e., the fine wrinkles between the eyebrows and the principal lines on both sides of the nose) used to distinguish 'Disgust' from 'Squint'.
- To this end, we define multiple face features, each generated using a particular radius size denoted by R_m. Here, m is the radius index and M is the number of radii used.
  - The multi-resolution face feature vector f is obtained by concatenating the single-resolution face feature vectors: f = [f_1; f_2; ...; f_M].
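The multi-resolution concatenation can be sketched as below, where `extract` is a hypothetical placeholder for the single-resolution feature extractor:

```python
import numpy as np

def multiresolution_feature(extract, img, radii=(1, 3, 5, 7, 9)):
    """f = [f_1, ..., f_M]: concatenate the single-resolution features f_m
    computed at each radius R_m. `extract(img, r)` is a placeholder for
    the single-resolution extractor."""
    return np.concatenate([extract(img, r) for r in radii])
```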
Fig. 4. Framework of sparse representation based classification for FER.
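The SRC stage itself can be sketched as follows. The sparse code is computed here with ISTA, an assumption since the solver is not specified in the text; the function name, regularization weight, and iteration count are likewise illustrative.

```python
import numpy as np

def src_classify(D, labels, y, lam=0.01, iters=500):
    """Sparse representation based classification (sketch).
    D: (d, n) dictionary with unit-norm columns; labels: (n,) class of
    each column; y: (d,) test feature. Finds a sparse code for y, then
    returns the class with the smallest reconstruction residual."""
    L = np.linalg.norm(D, 2) ** 2              # Lipschitz constant of gradient
    x = np.zeros(D.shape[1])
    for _ in range(iters):                     # ISTA: min 0.5||y-Dx||^2 + lam||x||_1
        g = x + D.T @ (y - D @ x) / L
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)
    classes = np.unique(labels)
    res = [np.linalg.norm(y - D[:, labels == c] @ x[labels == c])
           for c in classes]
    return classes[int(np.argmin(res))], x
```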
3. Experiment
- Facial expression DBs used: CMU Multi-PIE DB and BU-3DFE DB.
- For comparative evaluation, three sparse representation based FER methods using other features were considered:
  - Raw pixels, the LBP feature, and a color based feature called Local Color Vector Binary Pattern (LCVBP).
  - For the FER methods using LBP and LCVBP features, we adopted uniform LBP operations with P = 8 and R = 2.
  - For our color texture features, we use LBP operations with P = 8 and R in {1, 3, 5, 7, 9} for the multi-resolution face feature, which were chosen empirically.
  - To guarantee a fair comparison, low-dimensional feature extraction (i.e., FLDA) was applied to the raw pixel data, LBP, LCVBP, and our proposed color texture features.
- To investigate the stability of the proposed method in terms of FER accuracy under different color representations, three widely used color spaces were used in our experiments: RGB, YCbCr, and CIELuv.
Fig. 5. (a) CMU Multi-PIE DB and (b) BU-3DFE DB
A. Experiment 1: Comparisons of Recognition Accuracy under Different FER Challenges
- The objective of this experiment is to present comparative results demonstrating the effectiveness of the proposed FER under challenging conditions: illumination and face resolution changes.
- Under illumination change with the Multi-PIE DB
  - Multi-PIE DB: contains severe illumination changes due to different illumination settings (e.g., with or without flash, different illumination directions, etc.).
  - Results:
    - Sparse representation using color texture features performs better than that using grayscale based features.
    - Our color texture features achieve the highest recognition accuracy under illumination change.
    - Multi-resolution analysis of our features proves effective for improving FER accuracy.
Table 1. Identification rates obtained using the CMU Multi-PIE DB.
- Under face resolution change with the BU-3DFE DB
  - Evaluation of the robustness of the proposed FER under changes in face image resolution.
  - BU-3DFE DB: very challenging due to a variety of ethnic/racial ancestries and expression intensities.
  - Results:
    - SRC using raw pixels fails to work with the BU-3DFE DB.
    - With LBP, recognition performance drops significantly as the face resolution decreases.
    - In contrast, our features yield feasible recognition performance even at extremely low resolutions.
    - This shows that SRC using our color based features is much more robust to low-resolution faces.
Table 2. Identification rates obtained using the BU-3DFE DB. Face images of 60x60 and 30x30 pixels were obtained by down-sampling the original 120x120-pixel face images. To obtain enough features, the low-resolution images were resized back to the original size (120x120 pixels) by interpolation.
B. Experiment 2: Separability and Sparsity Analysis of the Proposed Color Texture Features
- For successful sparse representation classification, the dictionary needs to be discriminative (i.e., the training samples in the dictionary should have small within-class variation but large between-class variation).
- Investigation of the discrimination capability of the dictionary:
  - We compared the distribution (visualized in 3D feature space) of our features with those of three other features (raw pixel, LBP, and LCVBP features).
  - For this analysis, Principal Component Analysis (PCA) was separately applied to each face feature.
- The following figures demonstrate that our features have sufficiently high expression discrimination capability even in a low-dimensional feature space (see Fig. 6).
Fig. 6. Illustration of 3D distributions for four distinct features: (a) raw pixel, (b) LBP, (c) LCVBP, and (d) our color texture feature.
- Analysis of sparsity
  - From the above observations, it is naturally expected that, when the color texture features are used, a given test face image is mostly represented by training face images of the true expression class, which is desirable for classification.
  - To quantify this, the following measure, termed sparse coefficient concentration in true class (SCCTC), was devised in this research to measure the usefulness of features in terms of class-wise sparsity:

        SCCTC = ||delta_c(x)||_1 / ||x||_1,

    where x is the sparse coefficient vector and delta_c indicates the characteristic function that selects the coefficients associated with the true class c.
  - If the SCCTC value is 1, the non-zero sparse coefficients are concentrated only on the true class.
  - If the SCCTC value is 0, none of the non-zero sparse coefficients lie in the true class.
- Since SCCTC quantifies how strongly the coefficients are concentrated on the true class for a given test face image, it reflects the probability that the true class will coincide with the class suggested by the recognition.
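One way to compute a measure consistent with the description above (the ratio of l1 coefficient mass in the true class to the total l1 mass, so the value is 1 when all non-zero coefficients lie in the true class and 0 when none do); the function and variable names are assumptions:

```python
import numpy as np

def scctc(x, labels, true_class):
    """Sparse coefficient concentration in true class: l1 mass of the
    coefficients belonging to the true class over total l1 mass."""
    total = np.abs(x).sum()
    return 0.0 if total == 0 else np.abs(x[labels == true_class]).sum() / total
```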
Table 3. Comparison of SCCTC between different features, corresponding to the results in Table 1. The SCCTC value for each feature type was computed by averaging the SCCTCs over all test face images. Single-resolution face features were used for the proposed color texture features.
C. Experiment 3: Comparison of Recognition Accuracy among Different Pattern Combinations
- We performed an additional experiment to evaluate different combinations of our color texture features.
  - Color sign patterns achieve the best identification rates among the individual patterns.
  - Recognition performance can be improved by adding color magnitude patterns.
  - The combination of all four kinds of patterns yields even better recognition results.
Table 4. Identification rates obtained for different pattern combinations. The RGB color space was used for the color representation.
D. Experiment 4: Comparison with State-of-the-Art FER Methods using the BU-3DFE DB
- In this experiment, we used the BU-3DFE DB for comparison with other state-of-the-art FER results:
  - the Context method, Gabor method, Local Gabor Binary Pattern (LGBP), RGB Tensor, and Tensor Perceptual Color Framework (TPCF).
Table 5. Comparison of identification rates between the proposed method and state-of-the-art methods on the BU-3DFE DB.
4. Conclusion
- This research investigated the contribution of color information to sparse representation for improving facial expression recognition (FER) performance.
- To exploit the sparsity of color information for FER, we presented a method for accurately capturing facial skin textures derived from color vectors.
- Experimental results showed that the proposed FER, which relies on color texture sparsity, is highly robust to realistic recognition conditions.
- Our future work includes comparing our features with other state-of-the-art features, and investigating a systematic way of selecting effective color spaces to further improve sparsity based FER.
* Contact Person: Prof. Yong Man Ro (ymro@kaist.ac.kr)
1. Seung Ho Lee, Konstantinos N. Plataniotis, and Yong Man Ro, "Intra-class Variation Reduction Using Training Expression Images for Sparse Representation Based Facial Expression Recognition," IEEE Transactions on Affective Computing (Under Revision), 2014.
2. Seung Ho Lee, Hyungil Kim, Konstantinos N. Plataniotis, and Yong Man Ro, "Using Color Texture Sparsity for Facial Expression Recognition," IEEE International Conference on Automatic Face and Gesture Recognition, 2013.
3. Seung Ho Lee, Jae Young Choi, Yong Man Ro, and Konstantinos N. Plataniotis, "Local Color Vector Binary Patterns from Multichannel Face Images for Face Recognition," IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 2347-2353, 2012.
- Video: facial expression recognition in video.