‘Human Parsing’ & ‘Human Pose Estimation’ - Two Major Areas of Analysis of the Human Body
‘Human Parsing’ and ‘Human Pose Estimation’ are two major areas of human body analysis, and both are used in many modern applications. In spite of the strong correlation between these two tasks, there has been hardly any systematic study of the linkage between them.
In this paper, Prof. Dong and his colleagues present a single framework that unites these two tasks. They propose the concepts of ‘Semantic regions’ and ‘Parselets’, which allow the parsing and estimation tasks to be expressed as a mathematical algorithm. There have been several related works on pose estimation that use a combination of models and templates.
However, as the number of human poses increases, the number of parameters also keeps increasing, which the earlier models cannot handle. In human parsing too, the related studies have not explored the task completely, because the linkage between human structure and appearance has not been successfully established. In this paper, a ‘Hybrid Parsing Model’ (HPM) is proposed to unify both tasks. It uses the Mixture of Joint-Group Templates (MJGT) framework and maximum a posteriori (MAP) estimation to provide a strong correlation between the positions.
The Parselets, which are generated by a segmentation algorithm and carry robust semantics, help to integrate the human parsing and human pose tasks through an ‘And-Or’ graph. For pose estimation, the fourteen joints are divided into five groups, as there is no direct correspondence between Parselets and joints. There exist three pairwise geometric relations, namely ‘Parselet-MJGT’, ‘Parselet-Parselet’ and ‘Parent-Child’. To address these geometric complexities, the Grid Layout Feature (GLF) model is incorporated, in which the spatial distribution of a region's pixels is used to measure its mask. In this manner, the centroid can be computed and the linkage between the two tasks can be established.
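The idea of describing a mask by the spatial distribution of its pixels can be illustrated with a small sketch. This is not the paper's GLF definition: it assumes a simplified, hypothetical feature in which the mask's extent is split into a grid and each cell records the fraction of foreground pixels it contains. The names `mask_centroid` and `grid_layout_feature` and the `grid` parameter are made up for illustration.

```python
from typing import List, Tuple

Mask = List[List[int]]  # binary mask: 1 marks a pixel that belongs to the region


def mask_centroid(mask: Mask) -> Tuple[float, float]:
    """Return the (row, col) centroid of the foreground pixels."""
    total = row_sum = col_sum = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                total += 1
                row_sum += r
                col_sum += c
    if total == 0:
        raise ValueError("mask has no foreground pixels")
    return row_sum / total, col_sum / total


def grid_layout_feature(mask: Mask, grid: int = 2) -> List[float]:
    """Fraction of foreground pixels falling in each cell of a grid x grid
    partition of the mask (an assumed simplification, not the paper's GLF)."""
    height, width = len(mask), len(mask[0])
    counts = [0] * (grid * grid)
    total = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                # Map the pixel to its grid cell, clamping to the last cell.
                cell_r = min(r * grid // height, grid - 1)
                cell_c = min(c * grid // width, grid - 1)
                counts[cell_r * grid + cell_c] += 1
                total += 1
    if total == 0:
        raise ValueError("mask has no foreground pixels")
    return [n / total for n in counts]


example_mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
centroid = mask_centroid(example_mask)
feature = grid_layout_feature(example_mask, grid=2)
```

In this toy mask, most of the pixels sit in the top-left cell, so the feature concentrates there, while the centroid summarises the region's overall position.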
To calculate the positions and scales of the upper-level nodes, the initial layout can be converted into a tree model in which all the ‘And’ and ‘Leaf’ nodes are treated as super nodes. In this scenario, although cycles are present in the first and second layers, the algorithm can be computed faster. Here the ‘Learning Framework’ is used for Parselet selection and helps to unify pose estimation and human parsing. Experiments are conducted on two recent data sets using the probability of a correct pose (PCP) evaluation criterion.
Complementary metrics are used to compare the output with previous studies. The results show that, by using GLF and Parselets, this method successfully detects the joint positions even when clothing obstructs them. The method is hence reported to be superior to the previous models across all metrics.
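The PCP criterion mentioned above can be sketched in a few lines. The summary does not say which PCP variant the paper uses, so this sketch assumes the commonly used strict, per-part form: a predicted limb counts as correct when both of its endpoints lie within half the ground-truth limb length of the corresponding ground-truth endpoints. The function names and the `alpha` threshold parameter are illustrative only.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Limb = Tuple[Point, Point]  # a body part given by its two endpoint joints


def limb_is_correct(pred: Limb, truth: Limb, alpha: float = 0.5) -> bool:
    """Strict PCP test (assumed variant): both predicted endpoints must lie
    within alpha * ground-truth limb length of the true endpoints."""
    limit = alpha * math.dist(truth[0], truth[1])
    return (math.dist(pred[0], truth[0]) <= limit
            and math.dist(pred[1], truth[1]) <= limit)


def pcp(preds: List[Limb], truths: List[Limb], alpha: float = 0.5) -> float:
    """Fraction of limbs that pass the strict PCP test."""
    if len(preds) != len(truths):
        raise ValueError("preds and truths must have the same length")
    if not truths:
        raise ValueError("no limbs to evaluate")
    correct = sum(limb_is_correct(p, t, alpha) for p, t in zip(preds, truths))
    return correct / len(truths)


# Two ground-truth limbs of length 10 and their predictions.
truth_limbs = [((0.0, 0.0), (0.0, 10.0)), ((0.0, 0.0), (10.0, 0.0))]
pred_limbs = [((1.0, 0.0), (0.0, 12.0)), ((0.0, 6.0), (10.0, 0.0))]
score = pcp(pred_limbs, truth_limbs)
```

Here the first limb is within tolerance at both ends, while the second misses at one endpoint by more than half the limb length, so only half of the limbs count as correct.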