<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>0717-5000</journal-id>
<journal-title><![CDATA[CLEI Electronic Journal]]></journal-title>
<abbrev-journal-title><![CDATA[CLEIej]]></abbrev-journal-title>
<issn>0717-5000</issn>
<publisher>
<publisher-name><![CDATA[Centro Latinoamericano de Estudios en Informática]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S0717-50002014000200011</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[Accuracy and Efficiency Performance of the ICP Procedure Applied to Sign Language Recognition]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Paulino da Silva]]></surname>
<given-names><![CDATA[Juarez]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Lamar]]></surname>
<given-names><![CDATA[Marcus Vinicius]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Bordim]]></surname>
<given-names><![CDATA[Jacir Luiz]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
</contrib-group>
<aff id="A01">
<institution><![CDATA[University of Brasília, Computer Science Department]]></institution>
<addr-line><![CDATA[Brasília, DF]]></addr-line>
<country>Brazil</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>08</month>
<year>2014</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>08</month>
<year>2014</year>
</pub-date>
<volume>17</volume>
<numero>2</numero>
<fpage>11</fpage>
<lpage>11</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.edu.uy/scielo.php?script=sci_arttext&amp;pid=S0717-50002014000200011&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.edu.uy/scielo.php?script=sci_abstract&amp;pid=S0717-50002014000200011&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.edu.uy/scielo.php?script=sci_pdf&amp;pid=S0717-50002014000200011&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[This work addresses the problem of recognizing the American Sign Language (ASL) hand alphabet relying only on depth information acquired from an RGB-D sensor. To accomplish this goal, a novel Iterative Closest Point (ICP) based recognition methodology is proposed, in which the inputs and outputs of the alignment are comprehensively analyzed as determinants of efficiency and accuracy. Next, a novel classification technique, denoted Approximated KB-fit, is proposed to efficiently handle the space complexity of the database template matching. The overall recognition accuracy reached 99.04% in a cross-validation workbench with 520 distinct input depth images. The achieved frame rate was 7.41 FPS on a 2.4 GHz single-processor machine.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[3D Shape Congruence]]></kwd>
<kwd lng="en"><![CDATA[ASL Hand Alphabet Recognition]]></kwd>
<kwd lng="en"><![CDATA[ICP Alignment]]></kwd>
<kwd lng="en"><![CDATA[Pattern Recognition]]></kwd>
<kwd lng="en"><![CDATA[Template Matching Architecture]]></kwd>
</kwd-group>
</article-meta>
</front><body><![CDATA[ <div class="maketitle">                                                                 <b><font face="Verdana" size="4">Accuracy and Efficiency Performance of the ICP Procedure Applied to Sign Language Recognition</font></b>    <div class="author" >   <font face="Verdana" size="2"><span  class="ptmb7t-x-x-120">Juarez Paulino da Silva Júnior</span> <br /> <span  class="ptmr7t-x-x-120">University of Brasília, Computer Science Department,</span> <br />            <span  class="ptmr7t-x-x-120">Brasília, DF, Brazil, 70910-900</span> <br />  <span  class="ptmri7t-x-x-120"><a href="mailto:juarez.paulino@gmail.com">juarez.paulino@gmail.com</a></span><br class="and" /><span  class="ptmb7t-x-x-120">Marcus Vinicius Lamar</span> <br /> <span  class="ptmr7t-x-x-120">University of Brasília, Computer Science Department,</span> <br />            <span  class="ptmr7t-x-x-120">Brasília, DF, Brazil, 70910-900</span> <br />          <span  class="ptmri7t-x-x-120"><a href="mailto:lamar@unb.br">lamar@unb.br</a> </span><br class="and" /><span  class="ptmb7t-x-x-120">Jacir Luiz Bordim</span> <br /> <span  class="ptmr7t-x-x-120">University of Brasília, Computer Science Department,</span> <br />            <span  class="ptmr7t-x-x-120">Brasília, DF, Brazil, 70910-900</span> <br />                   <span  class="ptmri7t-x-x-120"><a href="mailto:bordim@unb.br">bordim@unb.br</a> </span> </font></div><font face="Verdana" size="2"><br /> </font>     <div class="date" ></div>    </div>        <div  class="abstract"  >     <div class="center"  > <!--l. 81-->    <p >     <div class="minipage">    <div class="center"  > <!--l. 81-->    <p > <font face="Verdana" size="2"> <!--l. 81--></font>    <p ><font face="Verdana" size="2"><span  class="ptmb7t-">Abstract</span></font></div> <!--l. 82-->    ]]></body>
<body><![CDATA[<p ><font face="Verdana" size="2">This work addresses the problem of recognizing the American Sign Language (ASL) hand alphabet relying only on depth information acquired from an RGB-D sensor. To accomplish this goal, a novel Iterative Closest Point (ICP) based recognition methodology is proposed, in which the inputs and outputs of the alignment are comprehensively analyzed as determinants of efficiency and accuracy. Next, a novel classification technique, denoted <span  class="ptmri7t-">Approximated KB-fit</span>, is proposed to efficiently handle the space complexity of the database template matching. The overall recognition accuracy reached 99.04% in a cross-validation workbench with 520 distinct input depth images. The achieved frame rate was 7.41 FPS on a <img  src="/img/revistas/cleiej/v17n2/2a110x.png" alt="2.4 "  class="math" >&#x00A0;GHz single-processor machine.<br  class="newline" /> </font>     <div class="center"  > <!--l. 81-->    <p > <font face="Verdana" size="2"> <!--l. 81--></font>    <p ><font face="Verdana" size="2"><span  class="ptmb7t-">Portuguese Abstract</span></font></div>    <font face="Verdana" size="2">Este trabalho aborda o problema do reconhecimento do alfabeto manual da Língua de Sinais Americana (do Inglês, American Sign Language - ASL) utilizando apenas a informação de profundidade adquirida por um sensor RGB-D. Para atingir este objetivo, é proposta uma nova metodologia de reconhecimento baseada no algoritmo ICP (do Inglês, Iterative Closest Point). As entradas e saídas do alinhamento ICP são amplamente analisadas como determinantes da eficiência e acurácia do método. Em seguida, uma nova técnica de classificação, denotada por Ajuste Aproximado por K-Baldes, é proposta para a estratégia de casamento de modelos. Esta técnica permite lidar eficientemente com a elevada complexidade computacional da busca no espaço da base de dados. 
A acurácia geral do reconhecimento atingiu um desempenho de 99,04% em um ambiente de validação cruzada com 520 entradas diferentes de imagens de profundidade. A taxa de quadros alcançada foi de 7,41 FPS realizada em uma máquina baseada em um processador único de 2,4 GHz.</font></div></div> </div> <!--l. 101-->    <p ><font face="Verdana" size="2"><span  class="ptmb7t-">Keywords:  </span>3D Shape Congruence, ASL Hand Alphabet Recognition, ICP Alignment, Pattern Recognition, Template Matching Architecture. <br  class="newline" />Portuguese Keywords: Alinhamento ICP, Arquitetura de Casamento de Modelos, Congruência de Formas 3D, Reconhecimento de Padrões, Reconhecimento do Alfabeto Manual da ASL.<br  class="newline" />Received 2013-11-15, Revised 2014-02-20, Accepted 2014-02-20 </font>    <p><font face="Verdana" size="2"><span class="titlemark">1    </span> <a   id="x1-10001"></a>Introduction</font></p> <!--l. 111-->    <p ><font face="Verdana" size="2">The recent introduction of low-cost sensor devices with real-time RGB-D image acquisition has fostered many innovative computer vision works. In particular, the use of depth data can robustly simplify typical and difficult tasks such as image segmentation, occlusion handling or data interpretation in environments with poor illumination&#x00A0;<span class="cite"><a name="br1">[</a><a  href="#Xkinfu:2011:2">1</a><a name="br2">,</a>&#x00A0;<a  href="#Xsuarez:2012">2</a>]</span>. <!--l. 
118--></font>    <p >   <font face="Verdana" size="2">Furthermore, the spatial information of the viewed scenes substantially increases the application possibilities of expressive vision-based algorithms in research areas like 3D reconstruction, augmented reality and 3D body and object pose tracking&#x00A0;<span class="cite"><a name="br1">[</a><a  href="#Xkinfu:2011:2">1</a><a name="br3">,</a>&#x00A0;<a  href="#Xkinfu:2011">3</a><a name="br4">,</a>&#x00A0;<a  href="#Xoikonomidis:2012">4</a>]</span>. <!--l. 123--></font>    <p >   <font face="Verdana" size="2">Depth data is also of particular interest to hand gesture recognition. Although one can find accurate and efficient solutions in the literature, they often require additional hardware or accessories which may be expensive or demand a complex setup. Moreover, these solutions impose constraints that make the user's interaction with the system less natural&#x00A0;<span class="cite"><a name="br5">[</a><a  href="#Xmitra:2007">5</a>]</span>. Alternatively, solutions based only on intensity images usually lack robustness under varying lighting conditions or have low accuracy when distinguishing gesture images with similar colors and shapes, even when they present distinct three-dimensional compositions&#x00A0;<span class="cite"><a name="br2">[</a><a  href="#Xsuarez:2012">2</a>]</span>. Thus, it is expected that spatial data acquisition through a low-cost RGB-D sensor may be a more natural and robust way to handle the aforementioned problems. <!--l. 133--></font>    <p >   <font face="Verdana" size="2">Sign language recognition is considered the most complex category in the gesture recognition domain&#x00A0;<span class="cite"><a name="br6">[</a><a  href="#Xalahdal:2012">6</a>]</span>, since it deals with a large number of static and dynamic postures which can be very similar in shape and involve not only hands, but also face, torso and arms. 
This work is restricted to the recognition of the American Sign Language (ASL) alphabet letters, a set consisting of 26 static hand posture gestures (in order to adapt the temporal sequence gestures from letters &#8216;J&#8217; and &#8216;Z&#8217; to a static form, it is presumed that they are captured on their final posture positions), as shown in Figure <a  href="#x1-1001r1">1<!--tex4ht:ref: fig:alphabet --></a>. <!--l. 140--></font>    ]]></body>
<body><![CDATA[<p >   <hr class="figure">    <div class="figure"  > <!--l. 142-->    <p >     <div class="caption"  ><font face="Verdana" size="2"><span class="id"><a   id="x1-1001r1" href="/img/revistas/cleiej/v17n2/2a11f1.jpg">Figure&#x00A0;1:</a> </span><span   class="content">ASL hand alphabet. Reproduced from&#x00A0;<span class="cite"><a name="br7">[</a><a  href="#Xwikipedia_alphabet">7</a>]</span>.</span></font></div><!--tex4ht:label?: x1-1001r1 -->    <!--l. 145-->    <p >   </div><hr class="endfigure"> <!--l. 148-->    <p >   <font face="Verdana" size="2">To solve this more restricted problem using spatial information, a system must process an input depth image from the sensor and assign it to one of the 26 possible representative classes of the ASL letter alphabet. One simple, but effective, approach is to apply the template matching architecture&#x00A0;<span class="cite"><a name="br8">[</a><a  href="#Xzabulis:2009">8</a><a name="br9">,</a>&#x00A0;<a  href="#Xrau:2012">9</a>]</span> as a classification tool, which consists of: (i) <span  class="ptmri7t-">matching </span>the test data to a selected representative set of the database models; and (ii) <span  class="ptmri7t-">comparing</span> these images in a pairwise fashion to deduce the level of similarity between both images. The reference models that provide the best values for a predefined <span  class="ptmri7t-">metric of correspondence </span>are used to identify the class to which the test data belongs. Though it is not the fastest process for image recognition, this architecture allows a full and detailed analysis of a given matching procedure&#x00A0;<span class="cite"><a name="br2">[</a><a  href="#Xsuarez:2012">2</a>]</span>. 
Within this scope, a natural matching procedure is to directly compare the aligned test and model images. Iterative Closest Point&#x00A0;<span class="cite"><a name="br10">[</a><a  href="#Xbesl:1992">10</a>]</span> (ICP) is an algorithm that performs such alignment and allows the quantification of the correspondence between the test and model pair. Other works have already tried the ICP as a matching procedure&#x00A0;<span class="cite"><a name="br11">[</a><a  href="#Xtrindade:2012">11</a>]</span> in the same context, but discarded it, claiming that the ICP was not suitable to retrieve good comparison metrics while performing the classification. <!--l. 163--></font>    <p >   <font face="Verdana" size="2">This work proposes an investigation of the ICP as a 3D shape matching algorithm applied to the recognition of the ASL alphabet letters. The presented results and methodology aim to explore the inputs and outputs of the ICP procedure and to identify a set of parameters to improve <span  class="ptmri7t-">accuracy </span>(how many correct matches it can identify) and <span  class="ptmri7t-">efficiency </span>(average recognition speed). <!--l. 169--></font>    <p >   <font face="Verdana" size="2">Once the ICP has properly performed the data alignment, the next step is to establish a comparison mechanism to infer similarities. As will be shown, the choice of correspondence metric directly affects the recognition accuracy. Experimental results show that the proposed methodology attains an accuracy level over <img  src="/img/revistas/cleiej/v17n2/2a111x.png" alt="99% "  class="math" >. According to Table <a  href="#x1-4001r1">1<!--tex4ht:ref: tab:0 --></a>, this accuracy surpasses other similar works in ASL alphabet recognition by at least 10 percentage points. 
Besides that, most real applications of hand alphabet recognition require near real-time processing (<img  src="/img/revistas/cleiej/v17n2/2a112x.png" alt="&#x2248; 15 "  class="math" > FPS). In fact, the ICP is a slow iterative technique and, unless it is somewhat improved, it is not recommended for real-time applications&#x00A0;<span class="cite"><a name="br12">[</a><a  href="#Xrusinkiewicz:2001">12</a>]</span>. By imposing performance constraints as input parameters to the ICP, the proposed implementation obtained a <img  src="/img/revistas/cleiej/v17n2/2a113x.png" alt="10 "  class="math" >-fold speedup while maintaining a reasonable accuracy. Likewise, based on the described approach, the template matching architecture should be efficiently modified, since its naive brute force (denoted here as <span  class="ptmri7t-">best-fit</span>) runs at only <img  src="/img/revistas/cleiej/v17n2/2a114x.png" alt="0.20 "  class="math" > FPS (Table <a  href="#x1-23002r5">5<!--tex4ht:ref: tab:3 --></a>). In this aspect, the proposed Approximated <img  src="/img/revistas/cleiej/v17n2/2a115x.png" alt="K "  class="math" >Bucket<span  class="ptmri7t-">-fit </span>classification limits the space complexity of the applied database and contributes a 40-fold improvement in recognition speed. <!--l. 185--></font>    <p >   <font face="Verdana" size="2">Given the considerations above, the main contributions of this paper are: </font>      <dl class="enumerate">        <dd><font face="Verdana" size="2">1. 
</font></dd>        <dd  class="enumerate"><font face="Verdana" size="2">The  proposal  and  analysis  of  <img  src="/img/revistas/cleiej/v17n2/2a116x.png" alt="5 "  class="math" > output  metrics  (<span  class="ptmri7t-">Alignment  Matrix  Norm</span>,  <span  class="ptmri7t-">Maximum  Distance  Threshold</span>,      <span  class="ptmri7t-">Minimum Inliers</span>, <span  class="ptmri7t-">Maximum Inliers </span>and <span  class="ptmri7t-">Mean Inliers</span>), </font>      </dd>        <dd><font face="Verdana" size="2">2. </font></dd>        <dd  class="enumerate"><font face="Verdana" size="2">Evaluation of the ICP input parameters and their effects on accuracy and recognition speed;        </font>      </dd>        <dd><font face="Verdana" size="2">3. </font></dd>        <dd  class="enumerate"><font face="Verdana" size="2">The proposal of an efficient algorithm for the template matching (<span  class="ptmri7t-">Approximated</span> <img  src="/img/revistas/cleiej/v17n2/2a117x.png" alt="K "  class="math" ><span  class="ptmri7t-">Bucket</span>-fit) as a classification      tool.</font></dd></dl> <!--l. 194-->    <p >   <font face="Verdana" size="2">The rest of the paper is organized as follows. Section&#x00A0;<a  href="#x1-20002">2<!--tex4ht:ref: sec:dois --></a> presents the related works while Section&#x00A0;<a  href="#x1-50003">3<!--tex4ht:ref: sec:background --></a> gives an overview of the ICP. Section&#x00A0;<a  href="#x1-150005">5<!--tex4ht:ref: sec:two --></a> details the ICP enhancements for ASL recognition. Section&#x00A0;<a  href="#x1-190006">6<!--tex4ht:ref: sec:three --></a> presents the experimental scenarios along with their respective simulations and detailed results. Finally, Section&#x00A0;<a  href="#x1-240007">7<!--tex4ht:ref: sec:four --></a> concludes the paper, including future directions. </font>        ]]></body>
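<body><![CDATA[<p ><font face="Verdana" size="2">The brute-force (<span  class="ptmri7t-">best-fit</span>) template matching described in this section can be sketched as follows. This is only an illustration under stated assumptions: the names best_fit and mean_closest_distance are hypothetical, and the dissimilarity used here is a plain mean closest-point distance computed without any ICP alignment, standing in for the correspondence metrics studied in the paper.</font></p>
<pre>
```python
import numpy as np


def mean_closest_distance(test_cloud, model_cloud):
    """Mean distance from each test point to its closest model point.

    Hypothetical stand-in for an ICP-based correspondence metric;
    no alignment is performed here.
    """
    diffs = test_cloud[:, None, :] - model_cloud[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return dists.min(axis=1).mean()


def best_fit(test_cloud, templates, dissimilarity=mean_closest_distance):
    """Compare the test cloud against every (label, cloud) template and
    return the label of the template with the smallest dissimilarity."""
    label, _ = min(templates, key=lambda item: dissimilarity(test_cloud, item[1]))
    return label


# Toy database with two synthetic classes (random clouds, not real depth data).
rng = np.random.default_rng(0)
templates = [
    ("A", rng.normal(size=(40, 3))),
    ("B", rng.normal(size=(40, 3)) + 5.0),
]
# Test cloud: a slightly perturbed copy of the "B" template.
test_cloud = templates[1][1] + 0.01 * rng.normal(size=(40, 3))
predicted = best_fit(test_cloud, templates)
```
</pre>
<p ><font face="Verdana" size="2">Every classification visits all templates in the database, so the cost of one recognition grows linearly with the database size; this is the cost that a bucket-based approximation aims to limit.</font></p>]]></body>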
<body><![CDATA[<p><font face="Verdana" size="2"><span class="titlemark">2    </span> <a   id="x1-20002"></a>Related Work</font></p> <!--l. 203-->    <p ><font face="Verdana" size="2">Many related works share common problems and objectives with the present paper. Although sign language recognition is a well studied topic in computer vision research&#x00A0;<span class="cite"><a name="br2">[</a><a  href="#Xsuarez:2012">2</a><a name="br6">,</a>&#x00A0;<a  href="#Xalahdal:2012">6</a>]</span>, most of the early solid works did not establish complete solutions to the problem when it requires accuracy, efficiency and natural user interaction altogether. In this context, the use of depth data with real-time acquisition allows not only the fast object segmentation, as it might be naively used, but also a significant improvement on the accuracy of the recognition. This section presents a brief compilation of related works which successfully apply depth data on the recognition of hand and sign language gestures.                                                                                                                                                                                     <!--l. 213--></font>    <p >        <p><font face="Verdana" size="2"><span class="titlemark">2.1    </span> <a   id="x1-30002.1"></a>General Techniques</font></p> <!--l. 215-->    <p ><font face="Verdana" size="2">Some techniques explore the properties of 3D data acquisition, applying them on the gesture recognition. In <span class="cite"><a name="br13">[</a><a  href="#Xfujimura:2006">13</a>]</span>, the authors acquire the depth stream data, analyze the morphology, position and orientation of hand and fingers, and apply a constraint-matching to recognize a few common hand shapes in Japanese Sign Language. Although it is possible to find optimal constraints that improve system accuracy on small sets, it is still a hard task to scale this type of classification in larger scenarios. <!--l. 
222--></font>    <p >   <font face="Verdana" size="2">Another set of approaches to sign recognition from depth images includes Support Vector Machines (SVM)&#x00A0;<span class="cite"><a name="br14">[</a><a  href="#Xkeskin:2011">14</a>]</span>, K-Nearest Neighbors (KNN)&#x00A0;<span class="cite"><a name="br15">[</a><a  href="#Xvan:2011">15</a>]</span>, Average Neighborhood Margin Maximization (ANMM)&#x00A0;<span class="cite"><a name="br16">[</a><a  href="#Xuebersax:2011">16</a>]</span>, Hidden Markov Models (HMM)&#x00A0;<span class="cite"><a name="br17">[</a><a  href="#Xliwicki:2009">17</a>]</span>, and Artificial Neural Networks (ANN)&#x00A0;<span class="cite"><a name="br14">[</a><a  href="#Xkeskin:2011">14</a><a name="br18">,</a>&#x00A0;<a  href="#Xkonda:2012">18</a>]</span>. These approaches rely on specific feature extraction and extensive offline training, and produce fast estimates for real-time applications. However, augmenting or diversifying the number of gestures often requires a new training process, and a poor feature specification tends to be a main drawback in obtaining high accuracy. <!--l. 232--></font>    <p >   <font face="Verdana" size="2">A recently employed technique is <span  class="ptmri7t-">Random Decision Forests </span>(RDF)&#x00A0;<span class="cite"><a name="br14">[</a><a  href="#Xkeskin:2011">14</a><a name="br19">,</a>&#x00A0;<a  href="#Xpugeault:2011">19</a>]</span>. It performs classification at real-time frame rates but retrieves only a confidence level for each trained class, which, most of the time, is not accurate enough given a defined set of features. Pugeault and Bowden&#x00A0;<span class="cite"><a name="br19">[</a><a  href="#Xpugeault:2011">19</a>]</span> built an interactive system that uses Gabor filters to extract features prior to RDF classification. 
Their results presented interesting ambiguity classes for the ASL alphabet letters (Figure <a  href="#x1-3001r2">2<!--tex4ht:ref: fig:aemnst --></a>), highlighted by the sets <img  src="/img/revistas/cleiej/v17n2/2a118x.png" alt="{A, E,M, N,S,T } "  class="math" > and <img  src="/img/revistas/cleiej/v17n2/2a119x.png" alt="{U,R } "  class="math" >, from which they got low confidence levels and, even with the use of depth data, could not be addressed. <!--l. 240--></font>    <p >   <hr class="figure">    <div class="figure"  > <!--l. 242-->    <p >     ]]></body>
<body><![CDATA[<div class="caption"  ><font face="Verdana" size="2"><span class="id"><a   id="x1-3001r2" href="/img/revistas/cleiej/v17n2/2a11f2.jpg">Figure&#x00A0;2:</a> </span><span   class="content">Illustration of ambiguous classes in the ASL alphabet. These letters are represented by a closed fist and differ only in the thumb position, leading to higher confusion levels. Adapted from&#x00A0;<span class="cite"><a name="br19">[</a><a  href="#Xpugeault:2011">19</a>]</span>.</span></font></div><!--tex4ht:label?: x1-3001r2 -->    <!--l. 246-->    <p >   </div><hr class="endfigure">        <p><font face="Verdana" size="2"><span class="titlemark">2.2    </span> <a   id="x1-40002.2"></a>ICP Specific Related Methodologies</font></p> <!--l. 251-->    <p ><font face="Verdana" size="2">Besl and McKay&#x00A0;<span class="cite"><a name="br10">[</a><a  href="#Xbesl:1992">10</a>]</span> observed that ICP could be useful for assessing the congruence of two distinct 3D geometric shapes, and not only as an alignment procedure. However, due mainly to the slow iterative process, few studies have investigated the indirect outputs of the ICP procedure and applied them to hand shape matching. <!--l. 257--></font>    <p >   <font face="Verdana" size="2">A more remarkable set of contributions from the ICP procedure may be borrowed from research on biometrics. Amor <span  class="ptmri7t-">et al.</span>&#x00A0;<span class="cite"><a name="br20">[</a><a  href="#Xamor:2006">20</a>]</span> built a classical probe-and-gallery model and successfully recognized face depth images from arbitrary points of view using the ICP shape matching. 
In <span class="cite"><a name="br21">[</a><a  href="#Xping:2005">21</a>]</span>, concerned with efficiency, the authors proposed a spatial voxel indexation model associated with a database repository. This association was then used to perform constant time computation of the closest points in an ICP iteration. Both works use only the common mean square distance error to identify shape congruences. Moreover, ICP studies on 3D shape biometrics often rely on body parts which are more statically fixed when compared to the high degree of freedom (DOF) context of hand gestures, so it is simpler to achieve higher accuracy (Table <a  href="#x1-4001r1">1<!--tex4ht:ref: tab:0 --></a>). <!--l. 267--></font>    <p >   <font face="Verdana" size="2">In a more recent work, Trindade <span  class="ptmri7t-">et al.</span>&#x00A0;<span class="cite"><a name="br11">[</a><a  href="#Xtrindade:2012">11</a>]</span> developed a system to recognize manual alphabet letters in the Portuguese Sign Language. The authors conducted experiments with acquired depth data and used them for shape matching with a naive ICP approach. They stated that, because of the limited information from the acquired point cloud, their ICP implementation was not well suited to perform same class sign matching. However, they did not present concrete results of this analysis, nor did they describe their experiments in depth. </font>        <div class="table"> <!--l. 275-->    <p >  <hr class="float">    <div class="float"  >                                                             <div class="caption"  ><font face="Verdana" size="2"><span class="id"><a   id="x1-4001r1" href="/img/revistas/cleiej/v17n2/2a1110x.png">Table&#x00A0;1:</a> </span><span   class="content">Comparative results between related works and techniques.</span></font></div><!--tex4ht:label?: x1-4001r1 -->    </div><hr class="endfloat" />    </div> <!--l. 298-->    ]]></body>
<body><![CDATA[<p >   <font face="Verdana" size="2">Table <a  href="#x1-4001r1">1<!--tex4ht:ref: tab:0 --></a> summarizes the main works discussed in this section. The underlying approaches of these works are presented along with the recognition scenario where they were applied and the accuracy reported by their experiments. <!--l. 302--></font>    <p >   <font face="Verdana" size="2">From the discussion above, none of the related works has systematically analyzed the ICP as a possible matching procedure for 3D hand shape recognition. Also, only a few works explore the ICP efficiency in near real-time contexts, as it requires optimizations over the naive processing. None of them, however, is directly applied to sign language recognition. </font>        <p><font face="Verdana" size="2"><span class="titlemark">3    </span> <a   id="x1-50003"></a>Background Overview</font></p> <!--l. 312-->    <p ><font face="Verdana" size="2">Iterative Closest Point (ICP)&#x00A0;<span class="cite"><a name="br10">[</a><a  href="#Xbesl:1992">10</a>]</span> is the dominant fine registration algorithm in the literature. It aims to retrieve an accurate solution to the Euclidean rigid motion between two 3D point surfaces. The rigid motion (transformation) is recovered by means of the rotation and translation components that bring the two point sets to the same spatial orientation. The ICP algorithm works by iteratively minimizing a cost function of the distances computed between selected corresponding points in the two surfaces (Figure <a  href="#x1-5001r3">3<!--tex4ht:ref: fig:ICP --></a>). The cost function is usually a mean square distance metric, for which the algorithm has been shown to always converge to the nearest local minimum&#x00A0;<span class="cite"><a name="br10">[</a><a  href="#Xbesl:1992">10</a>]</span>. <!--l. 324--></font>    <p >   <hr class="figure">    <div class="figure"  > <!--l. 
326-->    <p >    <div class="caption"  ><font face="Verdana" size="2"><span class="id"><a   id="x1-5001r3" href="/img/revistas/cleiej/v17n2/2a11f3.jpg"> Figure&#x00A0;3:</a> </span><span   class="content">Illustration of the rigid motion acquisition from corresponding points.</span></font></div><!--tex4ht:label?: x1-5001r3 -->    <!--l. 329-->    <p >   </div><hr class="endfigure"> <!--l. 331-->    <p >   <font face="Verdana" size="2">In its simplest form, and assuming a cost function based on point-to-point distance between correspondences, an iterative step of the alignment includes the following: </font>      <ul class="itemize1">      <li class="itemize"><font face="Verdana" size="2">Starting with an estimate <img  src="/img/revistas/cleiej/v17n2/2a1111x.png" alt="T "  class="math" > of the rigid transformation from the previous iteration, each point of the first point set,      <img  src="/img/revistas/cleiej/v17n2/2a1112x.png" alt="p_i \in P "  class="math" >, is brought to an induced point <img  src="/img/revistas/cleiej/v17n2/2a1113x.png" alt="T(p_i) "  class="math" > in the coordinate system of the second point set. Then,      within one iterative step, the method searches the second model for the set of points, <img  src="/img/revistas/cleiej/v17n2/2a1114x.png" alt="m_i \in M "  class="math" >,      that minimizes the distance cost function between <img  src="/img/revistas/cleiej/v17n2/2a1115x.png" alt="T(p_i) "  class="math" > and <img  src="/img/revistas/cleiej/v17n2/2a1116x.png" alt="m_i "  class="math" > (Equation <a  href="#x1-5002r1">1<!--tex4ht:ref: icp_custo --></a>). 
</font>      <table  class="equation"><tr><td><font face="Verdana" size="2"><a   id="x1-5002r1"></a>          </font>      <center class="math-display" >      <font face="Verdana" size="2">      <img  src="/img/revistas/cleiej/v17n2/2a1117x.png" alt="\mathrm{rms\text{-}error}_{\mathrm{point\text{-}to\text{-}point}} = \sqrt{\frac{1}{L}\sum_{i=1}^{L}\left\| \vec{m}_i - (R\vec{p}_i + \vec{t}) \right\|^2}, " class="math-display" ></font></center></td><td class="equation-label">          <font face="Verdana" size="2">(1)</font></td></tr></table>      <!--l. 348-->    ]]></body>
<body><![CDATA[<p >      <font face="Verdana" size="2">      <!--l. 350--></font>    <p ><font face="Verdana" size="2">where <img  src="/img/revistas/cleiej/v17n2/2a1118x.png" alt="L "  class="math" > is the number of assigned correspondences, <img  src="/img/revistas/cleiej/v17n2/2a1119x.png" alt="p_i "  class="math" > and <img  src="/img/revistas/cleiej/v17n2/2a1120x.png" alt="m_i "  class="math" > are the corresponding pairs selected from the test      and model images, <img  src="/img/revistas/cleiej/v17n2/2a1121x.png" alt="R "  class="math" > is the rotation matrix, and <img  src="/img/revistas/cleiej/v17n2/2a1122x.png" alt="\vec{t} "  class="math" > is the translation vector, both obtained from the transformation <img  src="/img/revistas/cleiej/v17n2/2a1123x.png" alt="T "  class="math" >      estimated in the previous iteration. </font>      </li>      <li class="itemize"><font face="Verdana" size="2">One possible way to minimize the previous equation, by means of unit quaternions&#x00A0;<span class="cite"><a name="br22">[</a><a  href="#Xhorn:1987">22</a>]</span>, is to build a      symmetric matrix <img  src="/img/revistas/cleiej/v17n2/2a1124x.png" alt="Q(\Sigma_{pm}) "  class="math" >, of size <img  src="/img/revistas/cleiej/v17n2/2a1125x.png" alt="4 \times 4 "  class="math" >:      </font>      <table  class="equation"><tr><td><font face="Verdana" size="2"><a   id="x1-5003r2"></a>          </font>      <center class="math-display" >      <font face="Verdana" size="2">      <img  src="/img/revistas/cleiej/v17n2/2a1126x.png" alt="Q(\Sigma_{pm}) = \begin{bmatrix} \mathrm{tr}(\Sigma_{pm}) &amp; \Delta^T \\ \Delta &amp; \Sigma_{pm} + \Sigma_{pm}^T - \mathrm{tr}(\Sigma_{pm}) I_3 \end{bmatrix}, " class="math-display" ></font></center></td><td class="equation-label">          <font face="Verdana" size="2">(2)</font></td></tr></table>      <!--l. 363-->    <p >      <font face="Verdana" size="2">      <!--l. 
365--></font>    <p ><font face="Verdana" size="2">where <img  src="/img/revistas/cleiej/v17n2/2a1127x.png" alt="tr "  class="math" > is the <span  class="ptmri7t-">trace </span>function, <img  src="/img/revistas/cleiej/v17n2/2a1128x.png" alt="&#x0394;  = [A23A31A12]T  "  class="math" > is computed from the skew-symmetric matrix given by      <img  src="/img/revistas/cleiej/v17n2/2a1129x.png" alt="Aij = (&#x03A3;pm - &#x03A3;T )               pm  ij  "  class="math" >; <img  src="/img/revistas/cleiej/v17n2/2a1130x.png" alt="&#x0394;T  "  class="math" > is the transpose of <img  src="/img/revistas/cleiej/v17n2/2a1131x.png" alt="&#x0394; "  class="math" >; <img  src="/img/revistas/cleiej/v17n2/2a1132x.png" alt="I3  "  class="math" > is the identity matrix; and <img  src="/img/revistas/cleiej/v17n2/2a1133x.png" alt="&#x03A3;pm  "  class="math" > is the cross-variance matrix      of the point sets <img  src="/img/revistas/cleiej/v17n2/2a1134x.png" alt="p  i  "  class="math" > and <img  src="/img/revistas/cleiej/v17n2/2a1135x.png" alt="m   i  "  class="math" > given by: </font>      <table  class="equation"><tr><td><font face="Verdana" size="2"><a   id="x1-5004r3"></a>                                                                                                                                                                                              </font>                                                                                                                                                                                          <center class="math-display" >      <font face="Verdana" size="2">      <img  src="/img/revistas/cleiej/v17n2/2a1136x.png" alt="         Np       -1-&#x2211;  [-&#x2192;  -&#x2192; ]   -&#x2192; -&#x2192; &#x03A3;pm = Np i=1  pimi  - &#x03BC; p&#x03BC;m,      " class="math-display" ></font></center></td><td class="equation-label">          <font face="Verdana" size="2">(3)</font></td></tr></table>      <!--l. 
376-->    <p >      <font face="Verdana" size="2">      <!--l. 378--></font>    <p ><font face="Verdana" size="2">with: </font>      <table  class="equation"><tr><td><font face="Verdana" size="2"><a   id="x1-5005r4"></a>          </font>      <center class="math-display" >      <font face="Verdana" size="2">      <img  src="/img/revistas/cleiej/v17n2/2a1137x.png" alt="         Np -&#x2192;&#x03BC; p =-1-&#x2211;  -&#x2192;pi,      Np i=1      " class="math-display" ></font></center></td><td class="equation-label">          <font face="Verdana" size="2">(4)</font></td></tr></table>      <!--l. 383-->    <p >      <table  class="equation"><tr><td><font face="Verdana" size="2"><a   id="x1-5006r5"></a>          </font>      <center class="math-display" >      <font face="Verdana" size="2">      <img  src="/img/revistas/cleiej/v17n2/2a1138x.png" alt="-&#x2192;      1  N&#x2211;m -&#x2192;  &#x03BC;m = Nm-    mi,           i=1      " class="math-display" ></font></center></td><td class="equation-label">          <font face="Verdana" size="2">(5)</font></td></tr></table>      <!--l. 388-->    <p >      <font face="Verdana" size="2">      <!--l. 390--></font>    <p ><font face="Verdana" size="2">the centroids of the point sets <img  src="/img/revistas/cleiej/v17n2/2a1139x.png" alt="pi  "  class="math" > and <img  src="/img/revistas/cleiej/v17n2/2a1140x.png" alt="mi  "  class="math" >, respectively.      </font>      </li>      <li class="itemize"><font face="Verdana" size="2">The unit eigenvector <img  src="/img/revistas/cleiej/v17n2/2a1141x.png" alt="-&#x2192;     [       ]T  q R = q0q1q2q3  "  class="math" > correlated to the greatest eigenvalue of the matrix <img  src="/img/revistas/cleiej/v17n2/2a1142x.png" alt="Q "  class="math" > is chosen as the      new rotation expressed in terms of a quaternion. 
The rotation matrix <img  src="/img/revistas/cleiej/v17n2/2a1143x.png" alt="R "  class="math" > can, then, be retrieved and the      new translation vector <img  src="/img/revistas/cleiej/v17n2/2a1144x.png" alt="t "  class="math" > is easily computed by the difference vector between the centroids as shown      below: </font>      <table  class="equation"><tr><td><font face="Verdana" size="2"><a   id="x1-5007r6"></a>                                                                                                                                                                                              </font>                                                                                                                                                                                          <center class="math-display" >      <font face="Verdana" size="2">      <img  src="/img/revistas/cleiej/v17n2/2a1145x.png" alt="     &#x230A; 2   2   2   2                                &#x230B;      &#x2308;q0 + q1 - q2 - q3 22(q1q22 - q02q3)2  2(q1q3 + q0q2) &#x2309; R =    2(q1q2 + q0q3)   q0 - q1 + q2 - q3 22(q2q32 - q02q1)2 ,        2(q1q3 - q0q2)    2(q2q3 - q0q1)   q0 - q1 - q2 + q3      " class="math-display" ></font></center></td><td class="equation-label">          <font face="Verdana" size="2">(6)</font></td></tr></table>      <!--l. 408-->    <p >      <table  class="equation"><tr><td><font face="Verdana" size="2"><a   id="x1-5008r7"></a>          </font>      <center class="math-display" >      <font face="Verdana" size="2">      <img  src="/img/revistas/cleiej/v17n2/2a1146x.png" alt="    -&#x2192;      -&#x2192; t = &#x03BC;m - R &#x03BC; p.      " class="math-display" ></font></center></td><td class="equation-label">          <font face="Verdana" size="2">(7)</font></td></tr></table>      <!--l. 413-->    ]]></body>
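<body><![CDATA[<p ><font face="Verdana" size="2">As an illustration only, the minimization step of Equations 2 to 7 can be sketched in Python with NumPy. This is not the implementation evaluated in this work: the function name horn_step is hypothetical, the correspondences are assumed to be already paired row by row (so both sets have the same size), and the products in Equation 3 are assumed to be outer products.</font>
<pre>
```python
import numpy as np


def horn_step(p, m):
    """Estimate (R, t) mapping paired points p (N x 3) onto m (N x 3)."""
    mu_p = p.mean(axis=0)                                  # Eq. 4
    mu_m = m.mean(axis=0)                                  # Eq. 5
    sigma = p.T @ m / len(p) - np.outer(mu_p, mu_m)        # Eq. 3 (outer products assumed)
    a = sigma - sigma.T                                    # skew-symmetric part
    delta = np.array([a[1, 2], a[2, 0], a[0, 1]])
    q_mat = np.empty((4, 4))                               # Eq. 2
    q_mat[0, 0] = np.trace(sigma)
    q_mat[0, 1:] = delta
    q_mat[1:, 0] = delta
    q_mat[1:, 1:] = sigma + sigma.T - np.trace(sigma) * np.eye(3)
    _, vecs = np.linalg.eigh(q_mat)                        # eigenvalues in ascending order
    q0, q1, q2, q3 = vecs[:, -1]                           # eigenvector of the largest one
    rot = np.array([                                       # Eq. 6; entry (3,2) uses + q0*q1
        [q0*q0 + q1*q1 - q2*q2 - q3*q3, 2*(q1*q2 - q0*q3), 2*(q1*q3 + q0*q2)],
        [2*(q1*q2 + q0*q3), q0*q0 - q1*q1 + q2*q2 - q3*q3, 2*(q2*q3 - q0*q1)],
        [2*(q1*q3 - q0*q2), 2*(q2*q3 + q0*q1), q0*q0 - q1*q1 - q2*q2 + q3*q3],
    ])
    t = mu_m - rot @ mu_p                                  # Eq. 7
    return rot, t
```
</pre>
<p ><font face="Verdana" size="2">For noise-free pairs related by a rigid motion, the returned pair should reproduce that motion up to floating-point error. Within ICP, the step would be repeated after each new correspondence search.</font>
]]></body>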
<body><![CDATA[<p >      </li>      <li class="itemize"><font face="Verdana" size="2">The method iterates until the cost function is minimized, which in practice means until it falls below a cutoff value (<span  class="ptmri7t-">threshold</span>). </font>      </li>    </ul> <!--l. 421-->    <p >   <font face="Verdana" size="2">Once the iterative procedure has been completed, the last acquired alignment transformation (<img  src="/img/revistas/cleiej/v17n2/2a1147x.png" alt="T"  class="math" >) should give a correspondence map between the two different views (coordinate systems) of the aligned point surfaces. That is, a point <img  src="/img/revistas/cleiej/v17n2/2a1148x.png" alt="p_i \in P"  class="math" > is mapped to its best corresponding point <img  src="/img/revistas/cleiej/v17n2/2a1149x.png" alt="p'"  class="math" >, given by the alignment <img  src="/img/revistas/cleiej/v17n2/2a1150x.png" alt="T"  class="math" >, on the surface of <img  src="/img/revistas/cleiej/v17n2/2a1151x.png" alt="M"  class="math" >, as shown by Equation&#x00A0;<a  href="#x1-5009r8">8<!--tex4ht:ref: correspond --></a>: </font>    <table  class="equation"><tr><td><font face="Verdana" size="2"><a   id="x1-5009r8"></a>        </font>    <center class="math-display" > <font face="Verdana" size="2"> <img  src="/img/revistas/cleiej/v17n2/2a1152x.png" alt="\vec{p}\,' = T(\vec{p}_i) = R(\vec{p}_i) + \vec{t}." class="math-display" ></font></center></td><td class="equation-label">        <font face="Verdana" size="2">(8)</font></td></tr></table> <!--l. 429-->    <p > <font face="Verdana" size="2"> <!--l. 433--></font>    <p >   <font face="Verdana" size="2">As a fine alignment mechanism, the ICP assumes that a good initial rigid motion between the shapes has been provided. This makes it possible both to increase the convergence speed and to reach a global minimum of the cost function. 
Note that when a reasonable estimate is not provided, the procedure may converge to a local minimum and fail to retrieve a satisfactory solution. Coarse registration techniques&#x00A0;<span class="cite"><a name="br23">[</a><a  href="#Xpaulino:2011">23</a><a name="br24">,</a>&#x00A0;<a  href="#Xzhiyuan:2012">24</a>]</span> are frequently used to find rough estimates. In the proposed methodology, a rough estimate is simply taken from the translation vector between the two shape centroids. <!--l. 442--></font>    <p >   <font face="Verdana" size="2">An important work on the ICP procedure is due to Rusinkiewicz and Levoy <span class="cite"><a name="br12">[</a><a  href="#Xrusinkiewicz:2001">12</a>]</span>, who optimized the algorithm for efficiency. They divided the procedure into stages and analyzed many of the proposed variants to achieve the best convergence speed. 
From their set of variant definitions, this work applies: </font>     <dl class="enumerate">       <dd><font face="Verdana" size="2">i) </font></dd>       <dd  class="enumerate"><font face="Verdana" size="2">a classical select-match-minimize type of iteration;       </font>     </dd>       <dd><font face="Verdana" size="2">ii) </font></dd>       <dd  class="enumerate"><font face="Verdana" size="2">a kd-tree data structure, with point-normal compatibility, to find corresponding points in logarithmic time;       </font>     </dd>       <dd><font face="Verdana" size="2">iii) </font></dd>       <dd  class="enumerate"><font face="Verdana" size="2">threshold rejection of distant pairs of corresponding points;       </font>     </dd>       <dd><font face="Verdana" size="2">iv) </font></dd>       <dd  class="enumerate"><font face="Verdana" size="2">point-to-point minimization in the first steps to ensure stability;       </font>     </dd>       <dd><font face="Verdana" size="2">v) </font></dd>       <dd  class="enumerate"><font face="Verdana" size="2">point-to-plane iterations in the main loop body;       </font>     </dd>       <dd><font face="Verdana" size="2">vi) </font></dd>       <dd  class="enumerate"><font face="Verdana" size="2">uniform random subsampling of both point clouds.</font></dd></dl> <!--l. 456-->    <p >   <font face="Verdana" size="2">The overall complexity of this proposed ICP variant is on the order of: </font>    <table  class="equation"><tr><td><font face="Verdana" size="2"><a   id="x1-5016r9"></a>        </font>    <center class="math-display" > <font face="Verdana" size="2"> <img  src="/img/revistas/cleiej/v17n2/2a1153x.png" alt="O(KL(\log N_p + \log N_m))," class="math-display" ></font></center></td><td class="equation-label">        <font face="Verdana" size="2">(9)</font></td></tr></table> <!--l. 461-->    <p > <font face="Verdana" size="2"> <!--l. 
463--></font>    <p ><font face="Verdana" size="2">where <img  src="/img/revistas/cleiej/v17n2/2a1154x.png" alt="K"  class="math" > is the number of iterations performed, <img  src="/img/revistas/cleiej/v17n2/2a1155x.png" alt="L"  class="math" > is the number of corresponding pairs selected in each iteration, and <img  src="/img/revistas/cleiej/v17n2/2a1156x.png" alt="N_p"  class="math" > and <img  src="/img/revistas/cleiej/v17n2/2a1157x.png" alt="N_m"  class="math" > are the numbers of vertices in the test and model data, respectively. </font>        <p><font face="Verdana" size="2"><span class="titlemark">4    </span> <a   id="x1-60004"></a>ICP Enhancement for Sign Language Recognition</font></p> <!--l. 470-->    ]]></body>
<body><![CDATA[<p ><font face="Verdana" size="2">One simple observation about template-matching recognition is that there must be some mechanism to put the test and model data in correspondence&#x00A0;<span class="cite"><a name="br8">[</a><a  href="#Xzabulis:2009">8</a>]</span>. It is natural to expect that if both are aligned, i.e., they share the same global orientation, they should be easier to compare. This is a clear motivation for applying ICP registration, but there must still be a way to relate the aligned test and model information. In fact, this type of correspondence may be derived from the ICP processing itself. <!--l. 478--></font>    <p >   <font face="Verdana" size="2">Another important concern in the design of the recognition system is how long the ICP procedure takes to match the test data against the entire model database. That is, how far the ICP <span  class="ptmri7t-">efficiency</span> can be improved without compromising its <span  class="ptmri7t-">accuracy</span>, and how to handle efficiently the number of comparisons, which grows with the database size. 
In this case, a single recognition requires <img  src="/img/revistas/cleiej/v17n2/2a1158x.png" alt="M"  class="math" > instances of the ICP alignment (one per database model), each with the cost given in Equation <a  href="#x1-5016r9">9<!--tex4ht:ref: icpcomp --></a>, resulting in a total time complexity of: </font>    <table  class="equation"><tr><td><font face="Verdana" size="2"><a   id="x1-6001r10"></a>        </font>    <center class="math-display" > <font face="Verdana" size="2"> <img  src="/img/revistas/cleiej/v17n2/2a1159x.png" alt="O(MKL(\log N_p + \log N_m))." class="math-display" ></font></center></td><td class="equation-label">        <font face="Verdana" size="2">(10)</font></td></tr></table> <!--l. 488-->    <p > <font face="Verdana" size="2"> <!--l. 490--></font>    <p >   <font face="Verdana" size="2">This section introduces the main contributions of this work, starting with the ICP enhancements that quickly produce reliable metrics for sign language recognition (Section&#x00A0;<a  href="#x1-70004.1">4.1<!--tex4ht:ref: aliana --></a>), and then presenting an efficient design for classification techniques based on template matching (Section&#x00A0;<a  href="#x1-120004.2">4.2<!--tex4ht:ref: classi --></a>). <!--l. 496--></font>    <p >        <p><font face="Verdana" size="2"><span class="titlemark">4.1    </span> <a   id="x1-70004.1"></a>Alignment Analysis</font></p> <!--l. 499-->    <p ><font face="Verdana" size="2">The first contribution is the improvement of the ICP alignment processing. 
Such improvement can be achieved by manipulating the variable inputs (the iterative elements and procedure modifiers) and inspecting what may be measured (correspondence metrics) in a single instance of the registration. The hypothesis is that each of these selected properties may directly contribute to the accuracy or efficiency of the ICP shape matching. <!--l. 506--></font>    <p >        <p><font face="Verdana" size="2"><span class="titlemark">4.1.1    </span> <a   id="x1-80004.1.1"></a>Input Parameters</font></p> <!--l. 509-->    <p ><font face="Verdana" size="2"><span class="paragraphHead"><a   id="x1-90004.1.1"></a><span  class="ptmb7t-">Iterative Elements:</span></span>    As already mentioned, the cost of the ICP algorithm is a major obstacle to its use in real-time systems. This follows from the ICP complexity shown in Equation <a  href="#x1-5016r9">9<!--tex4ht:ref: icpcomp --></a>. For instance, with large <img  src="/img/revistas/cleiej/v17n2/2a1160x.png" alt="K"  class="math" > or <img  src="/img/revistas/cleiej/v17n2/2a1161x.png" alt="L"  class="math" > values, the ICP is not expected to finish quickly enough to support real-time applications. Two types of iterative elements in the proposed implementation therefore have a clear influence on ICP efficiency: <!--l. 518--></font>    ]]></body>
<body><![CDATA[<p >      <dl class="enumerate">        <dd><font face="Verdana" size="2">1. </font></dd>        <dd  class="enumerate"><font face="Verdana" size="2"><span  class="ptmb7t-">The maximum allowed number of main loop iterations</span>:      <!--l. 520--></font>    <p ><font face="Verdana" size="2">As previously presented, the main loop body of the implemented ICP procedure consists of point-to-plane steps of the select-match-minimize iteration. If the two point surfaces are already closely aligned, the ICP can find the best alignment in fewer iterations; </font>      </dd>        <dd><font face="Verdana" size="2">2. </font></dd>        <dd  class="enumerate"><font face="Verdana" size="2"><span  class="ptmb7t-">The maximum allowed number of correspondence points selected</span>:      <!--l. 525--></font>    <p ><font face="Verdana" size="2">Restricting the number of points selected in each iteration amounts to subsampling the original ICP problem. As already stated, a uniform random subsampling is implemented. The proposed implementation includes a parameter that sets the number of points selected for correspondence in each iteration, so only a reduced instance of the original alignment problem needs to be solved.</font></dd></dl> <!--l. 531-->    <p >   <font face="Verdana" size="2">For both input parameters, restricting their values is expected to speed up the recognition process without affecting the shape matching accuracy. <!--l. 534--></font>    <p ><font face="Verdana" size="2"><span class="paragraphHead"><a   id="x1-100004.1.1"></a><span  class="ptmb7t-">Modifiers:</span></span>    The basic ICP implementation retrieves only the rigid transformation (rotation and translation) between two shape geometries. 
In this sense, the procedure modifiers are implementation flags that decide whether or not to compute an approximate scale in the minimization stage&#x00A0;<span class="cite"><a name="br22">[</a><a  href="#Xhorn:1987">22</a>]</span>. Three types of transformation are investigated: </font>      <ul class="itemize1">      <li class="itemize"><font face="Verdana" size="2">rigid motion with no modifier;      </font>      </li>      <li class="itemize"><font face="Verdana" size="2">motion with a computed uniform scaling;      </font>      </li>      <li class="itemize"><font face="Verdana" size="2">motion with a computed non-uniform scaling.</font></li>    </ul> <!--l. 547-->    <p >   <font face="Verdana" size="2">The scale modifiers are expected to achieve slightly better accuracy, since they can handle different hand shapes and positions in the acquired images. <!--l. 550--></font>    <p >        <p><font face="Verdana" size="2"><span class="titlemark">4.1.2    </span> <a   id="x1-110004.1.2"></a>Correspondence Metrics</font></p> <!--l. 553-->    <p ><font face="Verdana" size="2">The definition of correspondence metrics emerges from the need to compare and match different alignments of the test data with the template models. In ICP-based sign language recognition, the choice of a reliable metric may help identify the right match. <!--l. 557--></font>    ]]></body>
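<body><![CDATA[<p ><font face="Verdana" size="2">As a concrete instance of a correspondence metric, the point-to-point RMS error of Equation 1 can be sketched in plain Python. The sketch is illustrative only: the function names are hypothetical, points are 3-element sequences, the rotation is a 3&#x00D7;3 nested list, and the pairs are assumed to be the correspondences kept after the final ICP iteration.</font>
<pre>
```python
import math


def apply_rigid(rot, t, p):
    """Return R p + t for a 3x3 nested-list rotation and 3-element vectors."""
    return [sum(rot[r][c] * p[c] for c in range(3)) + t[r] for r in range(3)]


def point_to_point_rms(pairs, rot, t):
    """Equation 1: RMS distance over the L corresponded (p_i, m_i) pairs."""
    total = 0.0
    for p, m in pairs:
        q = apply_rigid(rot, t, p)                       # transformed test point
        total += sum((m[k] - q[k]) ** 2 for k in range(3))
    return math.sqrt(total / len(pairs))
```
</pre>
<p ><font face="Verdana" size="2">A lower value indicates a tighter fit between the transformed test points and their model counterparts, which is how the metric would be read when ranking template models.</font>
]]></body>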
<body><![CDATA[<p >   <font face="Verdana" size="2">The first chosen metric (point-to-point RMS Error) is well known and, in most works, is the only metric used to implement <span  class="ptmri7t-">correspondence evaluations</span>. The second metric (point-to-plane RMS Error) is common in ICP works but is mainly related to alignment convergence and is not usually applied in recognition. <!--l. 563--></font>    <p >      <dl class="enumerate">        <dd><font face="Verdana" size="2">1. </font></dd>        <dd  class="enumerate"><font face="Verdana" size="2"><span  class="ptmb7t-">Root Mean Square </span><span  class="ptmbi7t-">Point-to-point </span><span  class="ptmb7t-">Error (Point-to-point RMS Error)</span>      <!--l. 565--></font>    <p ><font face="Verdana" size="2">This metric has been employed in a number of related works&#x00A0;<span class="cite"><a name="br11">[</a><a  href="#Xtrindade:2012">11</a><a name="br12">,</a>&#x00A0;<a  href="#Xrusinkiewicz:2001">12</a><a name="br20">,</a>&#x00A0;<a  href="#Xamor:2006">20</a><a name="br21">,</a>&#x00A0;<a  href="#Xping:2005">21</a>]</span>. The point-to-point RMS error is computed after the final iteration with the same function used in the minimization step (Equation <a  href="#x1-5002r1">1<!--tex4ht:ref: icp_custo --></a>); </font>      </dd>        <dd><font face="Verdana" size="2">2. </font></dd>        <dd  class="enumerate"><font face="Verdana" size="2"><span  class="ptmb7t-">Root Mean Square </span><span  class="ptmbi7t-">Point-to-plane </span><span  class="ptmb7t-">Error (Point-to-plane RMS Error)</span>      <!--l. 571--></font>    <p ><font face="Verdana" size="2">The <span  class="ptmri7t-">point-to-plane error </span>is usually recommended for aligning flat surfaces and frequently converges better than the point-to-point minimization&#x00A0;<span class="cite"><a name="br12">[</a><a  href="#Xrusinkiewicz:2001">12</a>]</span>. 
Despite its importance to convergence and efficiency, it is not usually found in works that evaluate the correspondence between two shapes.</font></dd></dl> <!--l. 577-->    <p >   <font face="Verdana" size="2">These commonly applied <span  class="ptmri7t-">root mean square </span>metrics are derived directly from the error value of the iterative steps in ICP processing, and may not be the most appropriate parameters to achieve high accuracy in ASL recognition. In this context, this work proposes another set of metrics for evaluating correspondences, as follows: <!--l. 583--></font>    <p >      <dl class="enumerate">        <dd><font face="Verdana" size="2">3. </font></dd>        <dd  class="enumerate"><font face="Verdana" size="2"><span  class="ptmb7t-">Alignment Matrix Norm</span>      <!--l. 586--></font>    <p ><font face="Verdana" size="2">The <span  class="ptmri7t-">alignment matrix norm </span>used in this text is the Frobenius norm, which is a natural extension of the Euclidean vector norm to matrices. This metric quantifies how far the last minimization step proceeded in the direction of a local minimum.      <!--l. 
591--></font>    <p ><font face="Verdana" size="2">Given the 4&#x00D7;4 transformation matrix <img  src="/img/revistas/cleiej/v17n2/2a1162x.png" alt="T_{aligned}"  class="math" > of the last step, and with <img  src="/img/revistas/cleiej/v17n2/2a1163x.png" alt="I_{4}"  class="math" > denoting the 4&#x00D7;4 identity matrix, the metric is computed as: </font>      <table  class="equation"><tr><td><font face="Verdana" size="2"><a   id="x1-11004r11"></a>          </font>      <center class="math-display" >      <font face="Verdana" size="2">      <img  src="/img/revistas/cleiej/v17n2/2a1164x.png" alt="\|T_{residual}\| = \sqrt{\sum_{i=1}^{4}\sum_{j=1}^{4} |(T_{aligned} - I_{4})_{ij}|^{2}};" class="math-display" ></font></center></td><td class="equation-label">          <font face="Verdana" size="2">(11)</font></td></tr></table>      <!--l. 597-->    <p >      </dd>        <dd><font face="Verdana" size="2">4. </font></dd>        <dd  class="enumerate"><font face="Verdana" size="2"><span  class="ptmb7t-">Maximum Distance Threshold (Max-dist Pairs)</span>      <!--l. 601--></font>    <p ><font face="Verdana" size="2">In the proposed ICP implementation, a <span  class="ptmri7t-">maximum distance threshold </span>is iteratively computed while performing the correspondence rejection. This metric starts from a rough estimate (the smaller of the bounding-box diagonal lengths of the two point sets) and is updated in each iteration by a factor of the median distance between the corresponding points.      <!--l. 605--></font>    ]]></body>
<body><![CDATA[<p ><font face="Verdana" size="2">In practical terms, as the depth images are acquired from a fixed system, the smaller the maximum distance estimate is, the better the match between the given shapes is expected to be;        </font>      </dd>        <dd><font face="Verdana" size="2">5. </font></dd>        <dd  class="enumerate"><font face="Verdana" size="2"><span  class="ptmb7t-">Minimum Inliers (Min Inliers)</span>      <!--l. 610--></font>    <p ><font face="Verdana" size="2">The <span  class="ptmri7t-">minimum inliers </span>metric is proposed as a similarity metric that estimates the spatially overlapping parts of the final matched surfaces. It may be computed efficiently through a suitable voxel indexing algorithm or, less efficiently, by using kd-trees. The search procedure marks a point as overlapping if the distance to its nearest neighbor on the other image is under the maximum distance threshold.      <!--l. 615--></font>    <p ><font face="Verdana" size="2">The meaning of the overlap metrics is straightforward: the more the two shapes overlap, the better the match is expected to be. The word &#8220;minimum&#8221; in this metric indicates that the overlap is computed both from the test data and from the model data, applying the acquired rigid transformation <img  src="/img/revistas/cleiej/v17n2/2a1165x.png" alt="T"  class="math" > (for test data) or its inverse <img  src="/img/revistas/cleiej/v17n2/2a1166x.png" alt="T_{inv}"  class="math" > (for model data), and keeping the direction that gives the smaller fraction of correlated features (a pessimistic view of the correspondence overlap); </font>      </dd>        <dd><font face="Verdana" size="2">6. </font></dd>        <dd  class="enumerate"><font face="Verdana" size="2"><span  class="ptmb7t-">Maximum Inliers (Max Inliers)</span>      <!--l. 
623--></font>    <p ><font face="Verdana" size="2">The <span  class="ptmri7t-">maximum inliers </span>metric is similar to the minimum inliers value, but it keeps the greater fraction of correlated features from the two directions of the overlap computation (an optimistic view of the correspondence overlap);        </font>      </dd>        <dd><font face="Verdana" size="2">7. </font></dd>        <dd  class="enumerate"><font face="Verdana" size="2"><span  class="ptmb7t-">Mean Inliers</span>      <!--l. 628--></font>    <p ><font face="Verdana" size="2">The <span  class="ptmri7t-">mean inliers </span>metric is simply the mean of the minimum and maximum inliers values, which is expected to give a more representative estimate of the overlap. </font>      </dd></dl> <!--l. 633-->    <p >   <font face="Verdana" size="2">The correspondence metrics are a good mechanism to estimate the 3D shape congruence between two depth images. The claim is that, by extracting such values, a sign language system based on ICP processing can accurately identify the 26 ASL alphabet letters given an enrolled set of template models. <!--l. 638--></font>    <p >        <p><font face="Verdana" size="2"><span class="titlemark">4.2    </span> <a   id="x1-120004.2"></a>Classifier</font></p> <!--l. 641-->    <p ><font face="Verdana" size="2">In a basic verification scenario, a test instance <img  src="/img/revistas/cleiej/v17n2/2a1167x.png" alt="P"  class="math" > is matched against a set of template models <img  src="/img/revistas/cleiej/v17n2/2a1168x.png" alt="\{M_i\}"  class="math" > uniformly distributed over the evaluated classes, so that a classifier can infer which of the ASL letters the test instance belongs to. To perform such an inference, a classification technique should give a consistent interpretation of the extracted correspondence metrics. <!--l. 
646--></font>    <p >   <font face="Verdana" size="2">This work proposes, as contributions, the <span  class="ptmri7t-">best-fit </span>(Section <a  href="#x1-130004.2.1">4.2.1<!--tex4ht:ref: bestfit --></a>) and the <span  class="ptmri7t-">Approximate</span> <img  src="/img/revistas/cleiej/v17n2/2a1169x.png" alt="K"  class="math" ><span  class="ptmri7t-">Bucket-fit </span>(Section <a  href="#x1-140004.2.2">4.2.2<!--tex4ht:ref: kbfit --></a>) techniques for the classification stage. The <span  class="ptmri7t-">best-fit </span>technique was primarily designed to verify the accuracy of the 3D shape matching. The <span  class="ptmri7t-">Approximate</span> <img  src="/img/revistas/cleiej/v17n2/2a1170x.png" alt="K"  class="math" ><span  class="ptmri7t-">Bucket-fit</span>, in contrast, is aimed at improving the efficiency of the methodology by reducing the space complexity of the database comparisons. <!--l. 652--></font>    ]]></body>
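<body><![CDATA[<p ><font face="Verdana" size="2">The proposed metrics of Section 4.1.2 that feed these classifiers can be sketched in plain Python, here the alignment matrix norm of Equation 11 and the three inlier metrics. The sketch is illustrative only: the function names are hypothetical, the nearest-neighbor search is brute force rather than voxel or kd-tree based, and both point sets are assumed to be already expressed in the same aligned frame, which for a rigid transformation gives the same distances as applying the transformation or its inverse.</font>
<pre>
```python
import math


def residual_norm(t_aligned):
    """Equation 11: Frobenius norm of (T_aligned - I4), T given as a 4x4 nested list."""
    return math.sqrt(sum((t_aligned[i][j] - (1.0 if i == j else 0.0)) ** 2
                         for i in range(4) for j in range(4)))


def inlier_fraction(src, dst, max_dist):
    """Fraction of src points whose nearest dst point is under max_dist."""
    hits = sum(1 for s in src
               if max_dist > min(math.dist(s, d) for d in dst))
    return hits / len(src)


def inlier_metrics(test_aligned, model, max_dist):
    """Return (min, max, mean) inliers over the two overlap directions."""
    a = inlier_fraction(test_aligned, model, max_dist)   # test onto model
    b = inlier_fraction(model, test_aligned, max_dist)   # model onto test
    return min(a, b), max(a, b), (a + b) / 2.0
```
</pre>
<p ><font face="Verdana" size="2">Larger inlier values and a smaller residual norm would both point to a better match. The brute-force search costs one distance evaluation per point pair, which is why an indexed search is preferable for full-size depth images.</font>
]]></body>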
<body><![CDATA[<p >        <p><font face="Verdana" size="2"><span class="titlemark">4.2.1    </span> <a   id="x1-130004.2.1"></a>Best-fit Classification</font></p> <!--l. 655-->    <p ><font face="Verdana" size="2">A first simple approach is to consider, in each instance, the complete set of database models and to run the ICP procedure against each enrolled image. The template model that provides the best fit value for a selected correspondence metric is chosen, and its class labels the test data. <!--l. 659--></font>    <p >   <font face="Verdana" size="2">This strategy could be related to a form of 1-NN classification&#x00A0;<span class="cite"><a name="br25">[</a><a  href="#Xmalassiotis:2002">25</a><a name="br9">,</a>&#x00A0;<a  href="#Xrau:2012">9</a>]</span>, where a test sample is recognized by its nearest neighbor given a set of features. However, in the context of the ICP, there is no proper feature vector for a single image, since the evaluation metrics are acquired in a pairwise manner. Hence, no nearest-neighbor index can be built in advance, and a complete search for the best similarity must be employed. From now on, this first procedure is referred to as the <span  class="ptmri7t-">best-fit</span> technique. <!--l. 666--></font>    <p >        <p><font face="Verdana" size="2"><span class="titlemark">4.2.2    </span> <a   id="x1-140004.2.2"></a>Approximate <img  src="/img/revistas/cleiej/v17n2/2a1171x.png" alt="K"  class="math" >Bucket-fit Classification</font></p> <!--l. 669-->    <p ><font face="Verdana" size="2">One obvious drawback of the <span  class="ptmri7t-">best-fit </span>technique is the slow process of comparing one test instance against all the template models. 
That is, before the ICP registration, there is no <span  class="ptmri7t-">a priori </span>knowledge of the correspondence metrics between the input instance and the enrolled database. To address the efficiency requirements, this work proposes the <span  class="ptmri7t-">approximate</span> <img  src="/img/revistas/cleiej/v17n2/2a1172x.png" alt="K"  class="math" ><span  class="ptmri7t-">Bucket-fit </span>(<img  src="/img/revistas/cleiej/v17n2/2a1173x.png" alt="K"  class="math" >B-fit) algorithm. <!--l. 673--></font>    <p >   <font face="Verdana" size="2">A sketch of the procedure is listed in Algorithm <a  href="#x1-14001r1">1<!--tex4ht:ref: alg:1 --></a>. In this technique, a subset of the database models is selected in a non-deterministic fashion so that <img  src="/img/revistas/cleiej/v17n2/2a1174x.png" alt="K"  class="math" > samples from each class are chosen (line&#x00A0;<a  href="#x1-14005r4">4<!--tex4ht:ref: kbfit:subset --></a>). Next, the processing involves <img  src="/img/revistas/cleiej/v17n2/2a1175x.png" alt="\|U\|"  class="math" > instances of the ICP alignment (lines&#x00A0;<a  href="#x1-14006r5">5<!--tex4ht:ref: kbfit:align1 --></a>-<a  href="#x1-14008r7">7<!--tex4ht:ref: kbfit:align3 --></a>), which retrieve the similarity values for a predetermined correspondence metric. The final computation obtains the mean similarity value of each class (lines&#x00A0;<a  href="#x1-14009r8">8<!--tex4ht:ref: kbfit:class1 --></a>-<a  href="#x1-14015r14">14<!--tex4ht:ref: kbfit:classend --></a>). The recognition process terminates, in line&#x00A0;<a  href="#x1-14016r15">15<!--tex4ht:ref: kbfit:return --></a>, when the class with the best mean value is found. </font>        <div class="algorithm">      <!--l. 
680-->    <p >   <font face="Verdana" size="2">   <a   id="x1-14001r1"></a></font><hr class="float">    ]]></body>
<body><![CDATA[<div class="float"  >                                                                                                                                                                                          <div class="caption"  ><font face="Verdana" size="2"><span class="id">Algorithm 1: </span><span   class="content">Approximate <img  src="/img/revistas/cleiej/v17n2/2a1176x.png" alt="K "  class="math" >Bucket-fit.</span></font></div><!--tex4ht:label?: x1-14001r1 -->     <div class="algorithmic"> <font face="Verdana" size="2"> <span class="ALCitem"><span  class="ptmb7t-">Require:</span></span><span style="width:5.0pt;">&nbsp;</span> <img  src="/img/revistas/cleiej/v17n2/2a1177x.png" alt="M  "  class="math" >: Full set of models;<br />    &#x00A0;&#x00A0;&#x00A0;              <img  src="/img/revistas/cleiej/v17n2/2a1178x.png" alt="P "  class="math" >: Test data instance;<br />    &#x00A0;&#x00A0;&#x00A0;              <img  src="/img/revistas/cleiej/v17n2/2a1179x.png" alt="K "  class="math" >: Class bucket size. <span class="ALCitem"><span  class="ptmb7t-">Ensure:</span></span><span style="width:5.0pt;">&nbsp;</span> <img  src="/img/revistas/cleiej/v17n2/2a1180x.png" alt="&#x2102;  "  class="math" >: Recognized class. 
<a   id="x1-14002r1"></a>  <span class="ALCitem"><span  class="ptmr7t-x-x-90">1.</span></span><span style="width:5.0pt;">&nbsp;</span> <img  src="/img/revistas/cleiej/v17n2/2a1181x.png" alt="U  &#x2190; &#x2205; "  class="math" >&#x00A0;&#x00A0;&#x00A0;&#x00A0;&#x00A0;                                                                     <span class="ALC-comment"><span  class="cmsy-10">{</span>Stores random samples from the 26 classes<span  class="cmsy-10">}</span></span><a   id="x1-14003r2"></a>  <br /><span class="ALCitem"><span  class="ptmr7t-x-x-90">2.</span></span><span style="width:5.0pt;">&nbsp;</span> <img  src="/img/revistas/cleiej/v17n2/2a1182x.png" alt="B [26] &#x2190; {&#x2205;} "  class="math" > &#x00A0;&#x00A0;&#x00A0;&#x00A0;&#x00A0;                                                            <span class="ALC-comment"><span  class="cmsy-10">{</span>Bucket  lists  with  <img  src="/img/revistas/cleiej/v17n2/2a1183x.png" alt="K "  class="math" > samples  of  metric  values  per    class<span  class="cmsy-10">}</span></span><a   id="x1-14004r3"></a>  <br /><span class="ALCitem"><span  class="ptmr7t-x-x-90">3.</span></span><span style="width:5.0pt;">&nbsp;</span> <img  src="/img/revistas/cleiej/v17n2/2a1184x.png" alt="M  [26] &#x2190; {0} "  class="math" >&#x00A0;&#x00A0;&#x00A0;&#x00A0;&#x00A0;                                                          <span class="ALC-comment"><span  class="cmsy-10">{</span>Mean metric value per class<span  class="cmsy-10">}</span></span><a   id="x1-14005r4"></a>  <br /><span class="ALCitem"><span  class="ptmr7t-x-x-90">4.</span></span><span style="width:5.0pt;">&nbsp;</span> <img  src="/img/revistas/cleiej/v17n2/2a1185x.png" alt="U  &#x2190; GenerateSubset  "  class="math" >(<img  src="/img/revistas/cleiej/v17n2/2a1186x.png" alt="M, K  "  class="math" >)  <a   id="x1-14006r5"></a>  <br /><span class="ALCitem"><span  class="ptmr7t-x-x-90">5.</span></span><span style="width:5.0pt;">&nbsp;</span> <span  class="ptmb7t-">for 
all</span>&#x00A0; model <img  src="/img/revistas/cleiej/v17n2/2a1187x.png" alt="u &#x2208; U "  class="math" > &#x00A0;<span  class="ptmb7t-">do</span><span class="for-body"> <a   id="x1-14007r6"></a>  <br /><span class="ALCitem"><span  class="ptmr7t-x-x-90">6.</span></span><span style="width:15.00002pt;">&nbsp;</span>   <img  src="/img/revistas/cleiej/v17n2/2a1188x.png" alt="B [u.class] &#x2190; B [u.class]&#x222A;EvaluateICPMetric (P,u) "  class="math" >    </span><a   id="x1-14008r7"></a>  <br /><span class="ALCitem"><span  class="ptmr7t-x-x-90">7.</span></span><span style="width:5.0pt;">&nbsp;</span> <span  class="ptmb7t-">end</span>&#x00A0;<span  class="ptmb7t-">for</span> <a   id="x1-14009r8"></a>  <br /><span class="ALCitem"><span  class="ptmr7t-x-x-90">8.</span></span><span style="width:5.0pt;">&nbsp;</span> <img  src="/img/revistas/cleiej/v17n2/2a1189x.png" alt="&#x2102; &#x2190;  &#x2205; "  class="math" >, <img  src="/img/revistas/cleiej/v17n2/2a1190x.png" alt="BestMean &#x2190; 0 "  class="math" > <a   id="x1-14010r9"></a>  <br /><span class="ALCitem"><span  class="ptmr7t-x-x-90">9.</span></span><span style="width:5.0pt;">&nbsp;</span> <span  class="ptmb7t-">for all</span>&#x00A0; class <img  src="/img/revistas/cleiej/v17n2/2a1191x.png" alt="c &#x2208; {A-Z} "  class="math" >&#x00A0;<span  class="ptmb7t-">do</span><span class="for-body"> <a   id="x1-14011r10"></a> <br /><span class="ALCitem"><span  class="ptmr7t-x-x-90">10.</span></span><span style="width:15.00002pt;">&nbsp;</span>   <img  src="/img/revistas/cleiej/v17n2/2a1192x.png" alt="M [c] &#x2190; ComputeMeanValue  (B [c]) "  class="math" >; <a   id="x1-14012r11"></a> <br /><span class="ALCitem"><span  class="ptmr7t-x-x-90">11.</span></span><span style="width:15.00002pt;">&nbsp;</span>   <span  class="ptmb7t-">if</span>&#x00A0;<img  
src="/img/revistas/cleiej/v17n2/2a1194x.png" alt="BestMean &#x003C; M [c] "  class="math" >&#x00A0;<span  class="ptmb7t-">then</span><span class="if-body"> <a   id="x1-14013r12"></a> <br /><span class="ALCitem"><span  class="ptmr7t-x-x-90">12.</span></span><span style="width:25.00003pt;">&nbsp;</span>     <img  src="/img/revistas/cleiej/v17n2/2a1195x.png" alt="&#x2102; &#x2190;  c "  class="math" >, <img  src="/img/revistas/cleiej/v17n2/2a1196x.png" alt="BestMean &#x2190; M [c] "  class="math" >      </span><a   id="x1-14014r13"></a> <br /><span class="ALCitem"><span  class="ptmr7t-x-x-90">13.</span></span><span style="width:15.00002pt;">&nbsp;</span>   <span  class="ptmb7t-">end</span>&#x00A0;<span  class="ptmb7t-">if</span>    </span><a   id="x1-14015r14"></a> <br /><span class="ALCitem"><span  class="ptmr7t-x-x-90">14.</span></span><span style="width:5.0pt;">&nbsp;</span> <span  class="ptmb7t-">end</span>&#x00A0;<span  class="ptmb7t-">for</span> <a   id="x1-14016r15"></a> <br /><span class="ALCitem"><span  class="ptmr7t-x-x-90">15.</span></span><span style="width:5.0pt;">&nbsp;</span> <span  class="ptmb7t-">return </span>&#x00A0;<img  src="/img/revistas/cleiej/v17n2/2a1197x.png" alt="&#x2102;; "  class="math" > </font> </div>    </div><hr class="endfloat" />    </div> <!--l. 718-->    <p >   <font face="Verdana" size="2">As the <img  src="/img/revistas/cleiej/v17n2/2a1198x.png" alt="K "  class="math" >B-fit technique relies on a randomized, statistical procedure, a set of experiments is conducted in Section <a  href="#x1-190006">6<!--tex4ht:ref: sec:three --></a> to analyze its accuracy empirically. </font>        <p><font face="Verdana" size="2"><span class="titlemark">5    </span> <a   id="x1-150005"></a>Proposed Methodology</font></p> <!--l. 
728-->    <p ><font face="Verdana" size="2">The paradigm applied throughout this work is a template matching architecture in which classification is performed by comparing a given &#8220;test case&#8221; against an enrolled set of depth image models. The complete diagram of the proposed methodology is shown in Figure <a  href="#x1-15001r4">4<!--tex4ht:ref: fig:methodology --></a>. Recognition is performed in three stages, each with its own requirements and each providing the output elements used by the subsequent stage. <!--l. 734--></font>    <p >   <hr class="figure">    <div class="figure"  >    <!--l. 736-->    <p >    <div class="caption"  ><font face="Verdana" size="2"><span class="id"><a   id="x1-15001r4" href="/img/revistas/cleiej/v17n2/2a11f4.jpg">Figure&#x00A0;4:</a> </span><span   class="content">Diagram of the applied recognition methodology.</span></font></div><!--tex4ht:label?: x1-15001r4 -->    <!--l. 739-->    ]]></body>
<body><![CDATA[<p >   </div><hr class="endfigure"> <!--l. 742-->    <p >   <font face="Verdana" size="2">An initial sequence of pre-processing steps is executed to prepare the raw depth images acquired from the sensor device (<img  src="/img/revistas/cleiej/v17n2/2a1199x.png" alt="1st   "  class="math" > Stage). Next, the ICP procedure aligns a given test image with a subset of the database models that is evenly distributed across the ASL alphabet classes (<img  src="/img/revistas/cleiej/v17n2/2a11100x.png" alt="2nd   "  class="math" > Stage). Then, the by-products of the ICP alignment are used as evaluation metrics in the proposed classification scheme (<img  src="/img/revistas/cleiej/v17n2/2a11101x.png" alt="3rd   "  class="math" >  Stage). The following sections describe each of these stages in detail. </font>        <p><font face="Verdana" size="2"><span class="titlemark">5.1    </span> <a   id="x1-160005.1"></a>Pre-processing Steps</font></p> <!--l. 751-->    <p ><font face="Verdana" size="2">Image acquisition is performed with an off-the-shelf Kinect sensor&#x00A0;<span class="cite"><a name="br26">[</a><a  href="#Xkinect:2013">26</a>]</span>, the OpenNI SDK&#x00A0;<span class="cite"><a name="br27">[</a><a  href="#Xopenni:2013">27</a>]</span> and the NITE middleware&#x00A0;<span class="cite"><a name="br28">[</a><a  href="#Xnite:2013">28</a>]</span>. The sensor is fixed at half body height and only the acquired depth channel is used by the proposed techniques. <!--l. 756--></font>    <p >   <font face="Verdana" size="2">Hand image segmentation has always been a crucial step in sign language recognition&#x00A0;<span class="cite"><a name="br2">[</a><a  href="#Xsuarez:2012">2</a>]</span>. By exploiting spatial localization, depth-based algorithms simplify this hard task and provide robust and accurate results. 
Since segmentation is not the main focus, this paper applies a common depth thresholding approach, which defines a fixed bounding box and labels every point inside it as part of the hand. The segmented images are acquired as a <img  src="/img/revistas/cleiej/v17n2/2a11102x.png" alt="128&#x00D7; 128 "  class="math" > pixel window where each pixel stores the depth of the hand, ranging from <img  src="/img/revistas/cleiej/v17n2/2a11103x.png" alt="70 "  class="math" >cm to <img  src="/img/revistas/cleiej/v17n2/2a11104x.png" alt="110 "  class="math" >cm from the sensor device. To normalize the acquisition and obtain a more uniform image representation, the user of the system is asked to perform the hand gesture so that it fills the entire box frame during segmentation. <!--l. 765--></font>    <p >   <font face="Verdana" size="2">The coordinates in the acquired <img  src="/img/revistas/cleiej/v17n2/2a11105x.png" alt="128 &#x00D7; 128 "  class="math" > window are given in pixels for the <img  src="/img/revistas/cleiej/v17n2/2a11106x.png" alt="x "  class="math" >- and <img  src="/img/revistas/cleiej/v17n2/2a11107x.png" alt="y "  class="math" >-axes and in millimeters for the depth information (<img  src="/img/revistas/cleiej/v17n2/2a11108x.png" alt="z "  class="math" >-axis). The <span  class="ptmri7t-">depth coordinate system </span>is the Kinect&#8217;s native data representation. Such a representation, however, is not appropriate for recognition purposes, because its axes are expressed in different units. Hence, a <span  class="ptmri7t-">world coordinate system </span>must be defined to approximate a true 3D Cartesian coordinate system in which distance metrics are compatible across the three spatial axes. In this world system, every point is specified by three coordinate values  <img  src="/img/revistas/cleiej/v17n2/2a11109x.png" alt="x "  class="math" >, <img  src="/img/revistas/cleiej/v17n2/2a11110x.png" alt="y "  class="math" > and <img  src="/img/revistas/cleiej/v17n2/2a11111x.png" alt="z "  class="math" >. 
The <img  src="/img/revistas/cleiej/v17n2/2a11112x.png" alt="x "  class="math" > and <img  src="/img/revistas/cleiej/v17n2/2a11113x.png" alt="y "  class="math" > axes run in the same direction and share the same origin as the <img  src="/img/revistas/cleiej/v17n2/2a11114x.png" alt="x "  class="math" >- and <img  src="/img/revistas/cleiej/v17n2/2a11115x.png" alt="y "  class="math" >-axes of the projected pixel image, but in a proper metric space. The <img  src="/img/revistas/cleiej/v17n2/2a11116x.png" alt="z "  class="math" > axis runs into the scene, perpendicular to both the <img  src="/img/revistas/cleiej/v17n2/2a11117x.png" alt="x "  class="math" > and <img  src="/img/revistas/cleiej/v17n2/2a11118x.png" alt="y "  class="math" > axes, with the same semantics as in the depth coordinate system. Although the OpenNI SDK&#x00A0;<span class="cite"><a name="br27">[</a><a  href="#Xopenni:2013">27</a>]</span> provides conversion procedures to precisely estimate the coordinate metrics from the camera point of view, its functions require expensive computation, which is impractical for real-time applications. For this reason, a simpler conversion function is proposed from the native <span  class="ptmri7t-">depth coordinate system </span>to an approximate <span  class="ptmri7t-">world coordinate</span> <span  class="ptmri7t-">system</span>. The conversion retrieves the maximum horizontal and vertical field of view extension (Figure <a  href="#x1-15001r4">4<!--tex4ht:ref: fig:methodology --></a>) from the camera and performs an image scaling, as defined in Equation&#x00A0;<a  href="#x1-16001r12">12<!--tex4ht:ref: eq:1 --></a>. With this conversion, the metric space is expected to become more consistent and the Euclidean properties to be preserved without compromising efficiency. 
</font>    <table  class="equation"><tr><td><font face="Verdana" size="2"><a   id="x1-16001r12"></a>        </font>    <center class="math-display" > <font face="Verdana" size="2"> <img  src="/img/revistas/cleiej/v17n2/2a11119x.png" alt="\left( \begin{array}{l} x_{world} = x_{depth} \times \frac{HorizontalFieldOfView}{HorizontalPixelResolution} \\ y_{world} = y_{depth} \times \frac{VerticalFieldOfView}{VerticalPixelResolution} \\ z_{world} = z_{depth} \end{array} \right)" class="math-display" ></font></center></td><td class="equation-label">        <font face="Verdana" size="2">(12)</font></td></tr></table> <!--l. 803-->    <p > <font face="Verdana" size="2"> <!--l. 805--></font>    <p >   <font face="Verdana" size="2">The output of the pre-processing stage is a <img  src="/img/revistas/cleiej/v17n2/2a11120x.png" alt="128&#x00D7; 128 "  class="math" > (<img  src="/img/revistas/cleiej/v17n2/2a11121x.png" alt="16 "  class="math" >K) 3D point image containing the essential hand information on which all subsequent recognition steps operate. No additional pre-processing is performed. <!--l. 809--></font>    <p >        <p><font face="Verdana" size="2"><span class="titlemark">5.2    </span> <a   id="x1-170005.2"></a>ICP Processing</font></p>    <!--l. 811-->    ]]></body>
<body><![CDATA[<p ><font face="Verdana" size="2">This stage applies the ICP alignment between the pre-processed test image and the template models in the database, retrieving the correspondence metrics required for the final classification. Most of the processing time of the methodology is spent in this stage; therefore, most of the efficiency issues are also dealt with here. <!--l. 817--></font>    <p >   <font face="Verdana" size="2">To instantiate the ICP processing, two choices are required: (i) a set of input parameters (Section&#x00A0;<a  href="#x1-80004.1.1">4.1.1<!--tex4ht:ref: inputpar --></a>); and (ii) an appropriate correspondence metric (Section&#x00A0;<a  href="#x1-110004.1.2">4.1.2<!--tex4ht:ref: cormetric --></a>) to keep track of the performed alignments. The main purpose of selecting the input parameters is to enhance the overall efficiency by limiting the complexity of the procedure. In contrast, the choice of the correspondence metric (analyzed in Section&#x00A0;<a  href="#x1-200006.1">6.1<!--tex4ht:ref: result:acc --></a>) mainly determines the accuracy results. <!--l. 824--></font>    <p >   <font face="Verdana" size="2">The template database was built from 20 samples of each of the 26 ASL alphabet letters (a total of 520 model samples), acquired from a single user under different ambient light conditions and including some of the possible variants of the same hand gesture. Even though the data was intentionally acquired under different light conditions, this did not influence the accuracy results, since all the processing relies on the depth channel (light has almost no influence on the depth data). The diversification of the database, in turn, was obtained by accepting posture variants that are minor rotations of each of the standard postures described in Figure&#x00A0;<a  href="#x1-1001r1">1<!--tex4ht:ref: fig:alphabet --></a>. <!--l. 
833--></font>    <p >   <font face="Verdana" size="2">Finally, the way the template database is used depends on the classification technique applied in the next stage. The <span  class="ptmri7t-">best-fit </span>classification (Section&#x00A0;<a  href="#x1-130004.2.1">4.2.1<!--tex4ht:ref: bestfit --></a>) aligns the test image against all the samples in the database, whereas the <span  class="ptmri7t-">KB-fit </span>technique (Section&#x00A0;<a  href="#x1-140004.2.2">4.2.2<!--tex4ht:ref: kbfit --></a>) aligns it only against a random subsample of the database, thus improving efficiency. <!--l. 840--></font>    <p >        <p><font face="Verdana" size="2"><span class="titlemark">5.3    </span> <a   id="x1-180005.3"></a>Classification</font></p> <!--l. 842-->    <p ><font face="Verdana" size="2">Classification is the last stage in the recognition methodology. It gathers the correspondence metrics from every test-model pair and indicates the class that best represents the test image. This stage is computed efficiently in the template matching architecture; it is even faster than the image acquisition of the pre-processing stage. It runs in <img  src="/img/revistas/cleiej/v17n2/2a11122x.png" alt="O (M ) "  class="math" >, where <img  src="/img/revistas/cleiej/v17n2/2a11123x.png" alt="M "  class="math" > is the number of pairwise alignments performed in the previous ICP processing stage. <!--l. 849--></font>    <p >   <font face="Verdana" size="2">In terms of accuracy, the <span  class="ptmri7t-">best-fit </span>technique (Section&#x00A0;<a  href="#x1-130004.2.1">4.2.1<!--tex4ht:ref: bestfit --></a>) presents the best results since it searches all the database samples. Its use is justified by the hypothesis that the closest template model should yield the best correspondence metric with the test data. 
Thus, the larger the number of samples in the database, the better the chance that the <span  class="ptmri7t-">best-fit </span>classification finds the right output class. <!--l. 855--></font>    <p >   <font face="Verdana" size="2">In cases where efficiency matters, as in real-time applications, the <span  class="ptmri7t-">KB-fit </span>technique (Section&#x00A0;<a  href="#x1-140004.2.2">4.2.2<!--tex4ht:ref: kbfit --></a>) can be employed. It reduces the number of database samples required for alignment in the previous stage while attempting to maintain the accuracy level. For this technique, a specific analysis is performed to identify the best tradeoff between accuracy and efficiency as the bucket size varies (Section&#x00A0;<a  href="#x1-190006">6<!--tex4ht:ref: sec:three --></a>). <!--l. 860--></font>    <p >   <font face="Verdana" size="2">The next section presents a full set of experimental scenarios and discusses their results in order to verify the stated hypotheses. <!--l. 865--></font>    ]]></body>
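<body><![CDATA[<p >   <font face="Verdana" size="2">The two classification schemes discussed in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the callable <span  class="ptmri7t-">evaluate_icp_metric</span> is a hypothetical stand-in for one ICP alignment returning a similarity value for which higher is better (as with the inlier-based metrics; for error-based metrics the comparison would have to be inverted), and models are assumed to be (label, data) pairs.</font>
<pre>
```python
import random
from collections import defaultdict
from statistics import mean


def best_fit(models, test, evaluate_icp_metric):
    """Exhaustive search: return the label of the single best-scoring model."""
    best = max(models, key=lambda m: evaluate_icp_metric(test, m[1]))
    return best[0]


def generate_subset(models, k, rng):
    """Draw up to k random models from each class (Algorithm 1, line 4)."""
    by_class = defaultdict(list)
    for label, data in models:
        by_class[label].append(data)
    subset = []
    for label, samples in by_class.items():
        for data in rng.sample(samples, min(k, len(samples))):
            subset.append((label, data))
    return subset


def kb_fit(models, test, k, evaluate_icp_metric, rng=random):
    """Approximate K Bucket-fit: return the class with the best mean metric."""
    # Lines 5-7: one ICP alignment per selected model, stored per class.
    buckets = defaultdict(list)
    for label, data in generate_subset(models, k, rng):
        buckets[label].append(evaluate_icp_metric(test, data))

    # Lines 8-14: keep the class whose bucket has the highest mean value.
    best_class, best_mean = None, None
    for label, scores in buckets.items():
        class_mean = mean(scores)
        if best_class is None or class_mean > best_mean:
            best_class, best_mean = label, class_mean
    return best_class  # line 15
```
</pre>
<p >   <font face="Verdana" size="2">With 26 classes of 20 models each, <span  class="ptmri7t-">best_fit</span> performs 520 alignments per test image, whereas <span  class="ptmri7t-">kb_fit</span> performs 26K, which is where the efficiency gain of the second scheme comes from.</font>    ]]></body>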
<body><![CDATA[<p >        <p><font face="Verdana" size="2"><span class="titlemark">6    </span> <a   id="x1-190006"></a>Experiments and Discussion</font></p> <!--l. 868-->    <p ><font face="Verdana" size="2">The proposed methodology is evaluated based on (<img  src="/img/revistas/cleiej/v17n2/2a11124x.png" alt="i "  class="math" >) <span  class="ptmri7t-">accuracy</span>: the percentage of correct matches for a given test scenario; and (<img  src="/img/revistas/cleiej/v17n2/2a11125x.png" alt="ii "  class="math" >) <span  class="ptmri7t-">efficiency</span>: average ICP runtime and average recognition frame rate. For this purpose, both <span  class="ptmri7t-">offline</span> simulations and <span  class="ptmri7t-">online </span>experiments have been conducted. Offline simulations were used to compute the average ICP runtime and accuracy, whereas online experiments were used to compute the average frame processing rate. <!--l. 878--></font>    <p >   <font face="Verdana" size="2">In order to restrict the scope of the analysis and avoid a combinatorial explosion in the number of possibilities, a baseline combination for the studied parameters has been defined. These values are shown in Table <a  href="#x1-19001r2">2<!--tex4ht:ref: tab:1 --></a> and the reasoning behind their choice is discussed in the following sections. </font>     <div class="table">    <!--l. 
883-->    <p >   <font face="Verdana" size="2">   <a   id="x1-19001r2"></a></font><hr class="float">    <div class="float"  >                                                                                                                                                                                          <div class="caption"  ><font face="Verdana" size="2"><span class="id">Table&#x00A0;2: </span><span   class="content">Baseline values for measuring the ICP performance.</span></font></div><!--tex4ht:label?: x1-19001r2 -->     <div class="pic-tabular"> <font face="Verdana" size="2"> <img  src="/img/revistas/cleiej/v17n2/2a11126x.png"></font></div>                                                                                              </div><hr class="endfloat" />    </div> <!--l. 906-->    <p >   <hr class="figure">    ]]></body>
<body><![CDATA[<div class="figure"  >                                                                       <!--l. 908-->    <p >    <div class="caption"  ><font face="Verdana" size="2"><span class="id"><a   id="x1-19002r5" href="/img/revistas/cleiej/v17n2/2a11f5.jpg">Figure&#x00A0;5:</a> </span><span   class="content">Successful ICP matching of the acquired test images (aligned on gray-scale) with the <img  src="/img/revistas/cleiej/v17n2/2a11127x.png" alt="26 "  class="math" > ASL template models (referenced in the RGB channel).</span></font></div><!--tex4ht:label?: x1-19002r5 -->                                                                                                                                                                                     <!--l. 912-->    <p >   </div><hr class="endfigure">        <p><font face="Verdana" size="2"><span class="titlemark">6.1    </span> <a   id="x1-200006.1"></a>Accuracy Verification</font></p> <!--l. 918-->    <p ><font face="Verdana" size="2">As can be observed in Figure <a  href="#x1-19002r5">5<!--tex4ht:ref: fig:alphabet:aligned --></a>, the proposed ICP procedure can effectively match any of the 26 shapes of the ASL hand alphabet. This result provides evidence that alignment can be properly established under the given capture and pre-processing circumstances. As discussed in Section&#x00A0;<a  href="#x1-50003">3<!--tex4ht:ref: sec:background --></a>, computation of coarse registration was not necessary prior to ICP application. The results in Figure&#x00A0;<a  href="#x1-19002r5">5<!--tex4ht:ref: fig:alphabet:aligned --></a> also contradict one of the statements in the work of Trindade <span  class="ptmri7t-">et al.</span>&#x00A0;<span class="cite"><a name="br11">[</a><a  href="#Xtrindade:2012">11</a>]</span> which says that ICP fails to match some of the ASL letters given the incomplete point surfaces acquired from the Kinect sensor. 
In fact, it can be verified empirically that the depth map usually lacks information where the observed extent of an object is parallel to the <img  src="/img/revistas/cleiej/v17n2/2a11128x.png" alt="z "  class="math" >-axis of the sensor coordinate system (e.g., a forefinger pointing at the camera). However, this lack of information does not degrade the ICP rigid alignment for the purposes of ASL alphabet recognition. This behavior can be seen in the presented alignments, where incomplete depth maps, shown in grayscale, did not cover the entire model, shown in the RGB channel, but were still correctly aligned in the proposed environment. <!--l. 932--></font>    <p >   <font face="Verdana" size="2">Although promising, these results do not show the actual rate of correct matches (accuracy). Thus, a more detailed verification is needed before drawing further conclusions. <!--l. 935--></font>    <p >        <p><font face="Verdana" size="2"><span class="titlemark">6.1.1    </span> <a   id="x1-210006.1.1"></a>Handling Ambiguities</font></p> <!--l. 937-->    <p ><font face="Verdana" size="2">In order to verify the input parameters and the correspondence metrics, a template matching scenario is established where each model is aligned with all the others, which has quadratic complexity in the number of models. Figure <a  href="#x1-21001r6">6<!--tex4ht:ref: fig:confusion --></a> presents the confusion matrix of the raw <span  class="ptmri7t-">mean-inliers </span>metric values under the baseline configuration in Table&#x00A0;<a  href="#x1-19001r2">2<!--tex4ht:ref: tab:1 --></a>. In this confusion matrix, the entries are grouped into <img  src="/img/revistas/cleiej/v17n2/2a11129x.png" alt="26 &#x00D7;26 "  class="math" > clusters, and each cluster holds the confusion values between a pair of letters. 
Each cluster decomposes into <img  src="/img/revistas/cleiej/v17n2/2a11130x.png" alt="20&#x00D7; 20 "  class="math" > elements, where each element is the comparison of a specific letter model (<img  src="/img/revistas/cleiej/v17n2/2a11131x.png" alt="L_1 "  class="math" >) with another letter model (<img  src="/img/revistas/cleiej/v17n2/2a11132x.png" alt="L_2 "  class="math" >). The pronounced dark cross-shaped bands correspond to the letters &#8216;P&#8217; and &#8216;Q&#8217;, whose poses were acquired with a significant portion of the forearm. The presence of the forearm in the input data makes the models of these letters particularly different from any other in the database. <!--l. 948--></font>    ]]></body>
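<body><![CDATA[<p >   <font face="Verdana" size="2">The cluster structure described above can be reproduced from a full pairwise similarity matrix with a short Python sketch. It assumes (the ordering is not stated in the text) that the models are stored contiguously by class, so that rows and columns 20i to 20i+19 belong to class i; the function name and its arguments are illustrative only. Self-comparisons are included in the averages unless they are masked out beforehand.</font>
<pre>
```python
from statistics import mean


def cluster_means(similarity, n_classes=26, per_class=20):
    """Average an (n_classes*per_class)-square matrix into per-class-pair means.

    Returns an n_classes x n_classes list of lists, where entry [i][j] is the
    mean of the per_class x per_class block comparing class i with class j.
    """
    n = n_classes * per_class
    if len(similarity) != n or any(len(row) != n for row in similarity):
        raise ValueError(f"expected a {n}x{n} matrix")

    means = []
    for i in range(n_classes):
        # All rows that hold models of class i.
        rows = similarity[i * per_class:(i + 1) * per_class]
        means.append([
            mean(v for row in rows
                 for v in row[j * per_class:(j + 1) * per_class])
            for j in range(n_classes)
        ])
    return means
```
</pre>
<p >   <font face="Verdana" size="2">For the database used here, the input would be the 520&#x00D7;520 matrix of raw metric values and the output a 26&#x00D7;26 matrix with one averaged value per cluster.</font>    ]]></body>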
<body><![CDATA[<p >   <hr class="figure">    <div class="figure"  >    <!--l. 950-->    <p >       <div class="caption"  ><font face="Verdana" size="2"><span class="id"><a   id="x1-21001r6" href="/img/revistas/cleiej/v17n2/2a11f6.jpg">Figure&#x00A0;6:</a> </span><span   class="content">Confusion matrix for a simulated scenario with the entire database.</span></font></div><!--tex4ht:label?: x1-21001r6 -->    <!--l. 953-->    <p >   </div><hr class="endfigure">        <div class="table">    <!--l. 958-->    <p >   <hr class="float">    <div class="float"  >    <div class="caption"  ><font face="Verdana" size="2"><span class="id"><a   id="x1-21002r3" href="/img/revistas/cleiej/v17n2/2a11133x.png">Table&#x00A0;3:</a> </span><span   class="content">Averaged similarity values (in %) for the sets of the most ambiguous clusters in Figure <a  href="#x1-21001r6">6<!--tex4ht:ref: fig:confusion --></a>.</span></font></div><!--tex4ht:label?: x1-21002r3 -->     </div><hr class="endfloat" />    </div> <!--l. 1016-->    <p >   <font face="Verdana" size="2">In the figure, the square-marked clusters on the anti-diagonal represent the <span  class="ptmri7t-">Mean Inliers </span>metric values of the elements within the same class. Accordingly, these clusters have the brightest colors in their respective rows. The remaining, sparse square-marked clusters in the figure illustrate the most ambiguous correspondence values for incorrect matches. 
These clusters are still bright, but less intense than those on the anti-diagonal. Table <a  href="#x1-21002r3">3<!--tex4ht:ref: tab:confusion --></a> shows the average similarity for these elements. With some small differences from the ambiguities suggested in <span class="cite"><a name="br19">[</a><a  href="#Xpugeault:2011">19</a>]</span>, the most ambiguous sets in terms of <span  class="ptmri7t-">Mean Inliers </span>similarities are <img  src="/img/revistas/cleiej/v17n2/2a11134x.png" alt="{A,M, N, S,T} "  class="math" >, <img  src="/img/revistas/cleiej/v17n2/2a11135x.png" alt="{E,O } "  class="math" >, <img  src="/img/revistas/cleiej/v17n2/2a11136x.png" alt="{G, H } "  class="math" > and <img  src="/img/revistas/cleiej/v17n2/2a11137x.png" alt="{R,U, V} "  class="math" > (see Table <a  href="#x1-21002r3">3<!--tex4ht:ref: tab:confusion --></a>). As can be verified, even in the worst case, the value computed for a correct match exceeds that of the closest incorrect match by at least <img  src="/img/revistas/cleiej/v17n2/2a11138x.png" alt="5 "  class="math" > percentage points. Such a difference allows the output classes to be recognized correctly despite the ambiguities. </font>        ]]></body>
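<body><![CDATA[<p >   <font face="Verdana" size="2">The separation argument above can also be checked mechanically. The following Python sketch, given a matrix of per-class mean similarities, returns the smallest gap between a class and its closest rival. It assumes that same-class values lie on the main diagonal (unlike the anti-diagonal layout of the figure) and that at least two classes are present; the function name is illustrative.</font>
<pre>
```python
def worst_case_margin(means, labels):
    """Return (margin, class, rival) for the smallest same-class advantage.

    means[i][j] is the mean similarity between class i and class j, and
    labels[i] names class i. The margin of a class is its same-class value
    minus the highest value it reaches against any other class.
    """
    worst = None
    for i, row in enumerate(means):
        # Strongest competitor of class i among all other classes.
        rival_value, rival = max((v, j) for j, v in enumerate(row) if j != i)
        margin = row[i] - rival_value
        if worst is None or worst[0] > margin:
            worst = (margin, labels[i], labels[rival])
    return worst
```
</pre>
<p >   <font face="Verdana" size="2">As a toy example with made-up numbers (not taken from Table 3), the matrix [[80.0, 70.0], [60.0, 90.0]] with labels "AB" yields a worst-case margin of 10.0 for class A against class B. A positive margin for every class is the condition under which the mean-based decision separates the classes.</font>    ]]></body>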
<body><![CDATA[<p><font face="Verdana" size="2"><span class="titlemark">6.1.2    </span> <a   id="x1-220006.1.2"></a>Methodology Evaluation</font></p> <!--l. 1042-->    <p ><font face="Verdana" size="2">The results presented so far have shown that ICP can produce correspondence metrics and represent them as similarities among the possible classes. These results are achieved during the <img  src="/img/revistas/cleiej/v17n2/2a11139x.png" alt="1st   "  class="math" >  and <img  src="/img/revistas/cleiej/v17n2/2a11140x.png" alt="2nd   "  class="math" >  stages of the proposed methodology (Figure&#x00A0;<a  href="#x1-15001r4">4<!--tex4ht:ref: fig:methodology --></a>). The role of the <img  src="/img/revistas/cleiej/v17n2/2a11141x.png" alt="3rd   "  class="math" > stage is to take advantage of the positive difference between correct and incorrect class similarities. This difference is what the classifier implicitly uses to assign a given test instance to its representative class. <!--l. 1052--></font>    <p >   <font face="Verdana" size="2">The <span  class="ptmri7t-">best-fit </span>technique is applied to count the number of correct classifications while varying the ICP input parameters and evaluation metrics. In Figure <a  href="#x1-22001r7">7<!--tex4ht:ref: fig:accuracy:iterative-elements --></a>, the iterative elements are evaluated with respect to the accuracies achieved under different evaluation metrics. A first observation is that, for the majority of the proposed metrics, the reported accuracy is only weakly correlated with both the maximum number of iterations and the iterative subsampling. The results also show that the <span  class="ptmri7t-">Alignment</span> <span  class="ptmri7t-">Matrix Norm </span>gives the worst and most unstable results under the studied elements. 
On the other hand, the <span  class="ptmri7t-">Min</span> <span  class="ptmri7t-">Inliers </span>and <span  class="ptmri7t-">Mean Inliers </span>attain an accuracy rate of <img  src="/img/revistas/cleiej/v17n2/2a11142x.png" alt="&#x2248; 100% "  class="math" >, with a slight advantage in favor of the <span  class="ptmri7t-">Mean</span> <span  class="ptmri7t-">Inliers</span>. <!--l. 1060--></font>    <p >   <hr class="figure">    <div class="figure"  >                                      <!--l. 1064-->    <p >     <div class="caption"  ><font face="Verdana" size="2"><span class="id"><a   id="x1-22001r7" href="/img/revistas/cleiej/v17n2/2a11f7.jpg">Figure&#x00A0;7:</a> </span><span   class="content">Accuracy performance through iterative elements.</span></font></div><!--tex4ht:label?: x1-22001r7 -->                                                                                                                                                                                     <!--l. 1068-->    <p >   </div><hr class="endfigure">        <div class="table">                                                                                                                                                                                     <!--l. 1071-->    <p >   <font face="Verdana" size="2">   <a   id="x1-22002r4"></a></font><hr class="float">    ]]></body>
<body><![CDATA[<div class="float"  >                                                                                                                                                                                          <div class="caption"  ><font face="Verdana" size="2"><span class="id">Table&#x00A0;4: </span><span   class="content">Overall accuracy relating metrics and modifiers.</span></font></div><!--tex4ht:label?: x1-22002r4 -->     <div class="pic-tabular"> <font face="Verdana" size="2"> <img  src="/img/revistas/cleiej/v17n2/2a11143x.png" alt="\-------------------|----------|-----------|--------------| | \ \\ \ \M odifiers   | no-modifier |uniform -scale |non-uniform scale | |Metrics-----\-\-\\---|----------|-----------|--------------| |MeanInliers          |  98.85%   |  98.85%    |   99.04%      | |MinInliers           |  97.88%   |  98.46%    |   98.46%      | |MaxInliers           |  89.81%   |  89.23%    |   89.81%      | |                   |          |           |              | |Point-to-pointRM SError |  86.73%   |  87.12%    |   86.35%      | |Point-to-planeRM SError |  80.96%   |  80.96%    |   80.96%      | |Max-distPairs         |  81.15%   |  82.50%    |   81.15%      | |AlignmentMatrix-Norm---|--28.08%------25.58%--------24.81%------- " ></font></div>                                                                                                                                                                                        </div><hr class="endfloat" />    </div> <!--l. 1093-->    <p >   <font face="Verdana" size="2">Such observations regarding the evaluation metrics are also confirmed when investigating the overall accuracy of the proposed ICP modifiers (Table <a  href="#x1-22002r4">4<!--tex4ht:ref: tab:2 --></a>). 
With <img  src="/img/revistas/cleiej/v17n2/2a11144x.png" alt="99.04% "  class="math" > of correct matches, the <span  class="ptmri7t-">Mean Inliers </span>metric successfully recognized <img  src="/img/revistas/cleiej/v17n2/2a11145x.png" alt="515 "  class="math" > cross-validation queries, mismatching only <img  src="/img/revistas/cleiej/v17n2/2a11146x.png" alt="5 "  class="math" > instances, all of them related to the alignment from letter models <img  src="/img/revistas/cleiej/v17n2/2a11147x.png" alt="&#8216;A &#x2032; "  class="math" > and <img  src="/img/revistas/cleiej/v17n2/2a11148x.png" alt="&#8216;T&#x2032; "  class="math" >. Furthermore, it is verified that, at least for the given sample space, the proposed modifiers contribute only marginally to accuracy. <!--l. 1099--></font>    <p >   <font face="Verdana" size="2">To correctly evaluate accuracy when applying the <img  src="/img/revistas/cleiej/v17n2/2a11149x.png" alt="K "  class="math" ><span  class="ptmri7t-">B-fit </span>technique, a different set of simulations was performed. To deal with the randomization factor of the non-deterministic selections, <img  src="/img/revistas/cleiej/v17n2/2a11150x.png" alt="100 "  class="math" > distinct experiments were conducted with <img  src="/img/revistas/cleiej/v17n2/2a11151x.png" alt="1..19 "  class="math" > possible <img  src="/img/revistas/cleiej/v17n2/2a11152x.png" alt="K "  class="math" > values for the bucket size. In each experiment, for a fixed instance of bucket samples (training data), the accuracy of the <img  src="/img/revistas/cleiej/v17n2/2a11153x.png" alt="K "  class="math" ><span  class="ptmri7t-">B-fit </span>algorithm was verified by classifying all the out-of-bucket database images (test data). Figure <a  href="#x1-22003r8">8<!--tex4ht:ref: fig:accuracy:bucket --></a> presents the average accuracy for all the performed experiments. 
The first bar on the <img  src="/img/revistas/cleiej/v17n2/2a11154x.png" alt="x "  class="math" > axis, labeled as <span  class="ptmri7t-">best-fit</span>, represents the comparable value of accuracy using the baseline parameters (Table&#x00A0;<a  href="#x1-19001r2">2<!--tex4ht:ref: tab:1 --></a>). </font> <hr class="figure">    <div class="figure"  >  <!--l. 1111-->    <p >      <div class="caption"  ><font face="Verdana" size="2"><span class="id"><a   id="x1-22003r8" href="/img/revistas/cleiej/v17n2/2a11f8.jpg">Figure&#x00A0;8:</a> </span><span   class="content">Average accuracy with varying <img  src="/img/revistas/cleiej/v17n2/2a11155x.png" alt="K "  class="math" ><span  class="ptmri7t-">B-fit </span>bucket sizes.</span></font></div><!--tex4ht:label?: x1-22003r8 -->                                                                                                                                                                                     <!--l. 1114-->    <p >   </div><hr class="endfigure"> <!--l. 1116-->    <p >   <font face="Verdana" size="2">The results show that, even for <img  src="/img/revistas/cleiej/v17n2/2a11156x.png" alt="K &#x2248; 1 "  class="math" >, the proposed <img  src="/img/revistas/cleiej/v17n2/2a11157x.png" alt="K "  class="math" ><span  class="ptmri7t-">B-fit </span>technique attains a remarkable performance for all of the proposed metrics. In this sense, the statistical analysis applied in <img  src="/img/revistas/cleiej/v17n2/2a11158x.png" alt="K "  class="math" ><span  class="ptmri7t-">B-fit</span>, through the mean value computation, allows it to build a representative metric value for each class prior to the recognition. This presents a significant improvement even for unstable metrics, such as <span  class="ptmri7t-">Alignment Matrix Norm</span>. <!--l. 1122--></font>    ]]></body>
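<body><![CDATA[<p ><font face="Verdana" size="2">The evaluation protocol described above (random buckets of K templates per class, classification of the out-of-bucket images, accuracy averaged over repeated experiments) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names <span  class="ptmri7t-">split_buckets</span>, <span  class="ptmri7t-">classify</span> and <span  class="ptmri7t-">average_accuracy</span> are hypothetical, the class score is assumed to be the mean similarity against the bucket templates, and a negative Euclidean distance on toy 2D points stands in for an ICP correspondence metric.</font></p>

```python
import math
import random
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Sequence[float]
Similarity = Callable[[Sample, Sample], float]


def split_buckets(
    database: Dict[str, List[Sample]], k: int, rng: random.Random
) -> Tuple[Dict[str, List[Sample]], List[Tuple[str, Sample]]]:
    """Randomly draw k templates per class (the bucket, used as training data);
    every remaining sample becomes labeled test data."""
    buckets: Dict[str, List[Sample]] = {}
    held_out: List[Tuple[str, Sample]] = []
    for label, samples in database.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        buckets[label] = shuffled[:k]
        held_out.extend((label, sample) for sample in shuffled[k:])
    return buckets, held_out


def classify(query: Sample, buckets: Dict[str, List[Sample]], similarity: Similarity) -> str:
    """Assumed decision rule: choose the class whose bucket yields the
    highest mean similarity to the query."""
    return max(
        buckets,
        key=lambda label: mean(similarity(query, t) for t in buckets[label]),
    )


def average_accuracy(
    database: Dict[str, List[Sample]],
    k: int,
    similarity: Similarity,
    runs: int = 100,
    seed: int = 0,
) -> float:
    """Repeat the random bucket selection `runs` times and average the accuracy."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(runs):
        buckets, held_out = split_buckets(database, k, rng)
        hits = sum(classify(q, buckets, similarity) == label for label, q in held_out)
        accuracies.append(hits / len(held_out))
    return mean(accuracies)


def toy_similarity(a: Sample, b: Sample) -> float:
    """Stand-in for an ICP metric (assumption): closer points are more similar."""
    return -math.dist(a, b)


# Toy database with two well-separated classes; illustrative values only.
toy_db = {
    "A": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "B": [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0)],
}

accuracy = average_accuracy(toy_db, k=1, similarity=toy_similarity, runs=20)
```

<p ><font face="Verdana" size="2">In this sketch each query is compared against K templates per class instead of the whole database, which is where the reduction in the number of alignments comes from. K must be at least 1 and smaller than the number of samples per class, so that both the bucket and the test set are non-empty.</font></p>]]></body>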
<body><![CDATA[<p >   <font face="Verdana" size="2">The accuracy results allow the studied correspondence metrics to be ranked as follows: (<img  src="/img/revistas/cleiej/v17n2/2a11159x.png" alt="1st   "  class="math" >) <span  class="ptmri7t-">Mean</span> <span  class="ptmri7t-">Inliers</span>, (<img  src="/img/revistas/cleiej/v17n2/2a11160x.png" alt="2nd   "  class="math" >) <span  class="ptmri7t-">Minimum Inliers</span>, (<img  src="/img/revistas/cleiej/v17n2/2a11161x.png" alt="3rd   "  class="math" >) <span  class="ptmri7t-">Maximum Inliers</span>, (<img  src="/img/revistas/cleiej/v17n2/2a11162x.png" alt="4th   "  class="math" >) <span  class="ptmri7t-">Root Mean Square Point-to-point Error</span>, (<img  src="/img/revistas/cleiej/v17n2/2a11163x.png" alt="5th   "  class="math" >) <span  class="ptmri7t-">Root Mean</span> <span  class="ptmri7t-">Square Point-to-plane Error</span>, (<img  src="/img/revistas/cleiej/v17n2/2a11164x.png" alt="6th   "  class="math" >) <span  class="ptmri7t-">Maximum Distance Threshold</span>, and (<img  src="/img/revistas/cleiej/v17n2/2a11165x.png" alt="7th   "  class="math" >) <span  class="ptmri7t-">Alignment Matrix Norm</span>. This order indicates that the proposed inlier-based metrics are more accurate than the RMS error-based metrics commonly applied when using ICP&#x00A0;<span class="cite"><a name="br11">[</a><a  href="#Xtrindade:2012">11</a><a name="br20">,</a>&#x00A0;<a  href="#Xamor:2006">20</a><a name="br21">,</a>&#x00A0;<a  href="#Xping:2005">21</a>]</span>. </font>        <p><font face="Verdana" size="2"><span class="titlemark">6.2    </span> <a   id="x1-230006.2"></a>Efficiency Performance</font></p> <!--l. 1131-->    <p ><font face="Verdana" size="2">From a dimensional point of view, the efficiency of the ICP procedure is dominated by changes in the iterative elements of its input parameters. 
This is justifiable since other ICP elements, like the proposed application of modifiers, are considered a natural extension of the processing, with their time complexities dominated by the overall procedure. Figure&#x00A0;<a  href="#x1-23001r9">9<!--tex4ht:ref: fig:processing-time --></a> presents the correlation between the iterative elements and the average processing time of an instance of the ICP alignment on a 2.4 GHz single-processor machine. It can be verified that the maximum allowed number of iterations is a stronger driving factor than the maximum allowed number of iteration pairs. Thus, increasing the number of iterations has the larger impact on the processing time. When both variable parameters are set to large values, the ICP running time can be up to <img  src="/img/revistas/cleiej/v17n2/2a11166x.png" alt="10 "  class="math" > times slower (<img  src="/img/revistas/cleiej/v17n2/2a11167x.png" alt="&#x2248; 320 "  class="math" >ms). Given the template matching time complexity in Equation&#x00A0;<a  href="#x1-6001r10">10<!--tex4ht:ref: reccomp --></a>, where <img  src="/img/revistas/cleiej/v17n2/2a11168x.png" alt="M "  class="math" > instances of the ICP alignment are required, this scenario would make the application of the proposed methodology in real-time contexts impractical. In contrast, since the accuracy results (Figure&#x00A0;<a  href="#x1-22001r7">7<!--tex4ht:ref: fig:accuracy:iterative-elements --></a>) have shown that no significant improvement is achieved by increasing the iterative elements, a minimal configuration, as in Table <a  href="#x1-19001r2">2<!--tex4ht:ref: tab:1 --></a>, decreases the processing time of the ICP alignment (<img  src="/img/revistas/cleiej/v17n2/2a11169x.png" alt="15 "  class="math" >ms) and efficiently handles the ASL recognition needs. <!--l. 
1150--></font>    <p >   <font face="Verdana" size="2">Another important analysis of the speed of the ASL recognition can be done by examining the frame rate in a practical online implementation of the proposed methodology. The results presented in Table <a  href="#x1-23002r5">5<!--tex4ht:ref: tab:3 --></a> show that the <img  src="/img/revistas/cleiej/v17n2/2a11170x.png" alt="K "  class="math" ><span  class="ptmri7t-">B-fit </span>technique can make recognition up to about 37 times faster than the <span  class="ptmri7t-">best-fit </span>approach (7.41 FPS for <span  class="ptmri7t-">1B-fit </span>against 0.20 FPS for <span  class="ptmri7t-">best-fit</span>). From the accuracy results, it can be verified that the <img  src="/img/revistas/cleiej/v17n2/2a11171x.png" alt="K "  class="math" ><span  class="ptmri7t-">B-fit </span>does not substantially degrade the accuracy, even for small bucket sizes. <!--l. 1158--></font>    <p >   <hr class="figure">    <div class="figure"  > <!--l. 1160-->    <p >     <div class="caption"  ><font face="Verdana" size="2"><span class="id"><a   id="x1-23001r9" href="/img/revistas/cleiej/v17n2/2a11f9.jpg">Figure&#x00A0;9:</a> </span><span   class="content">Average ICP processing time by varying the iterative elements. Measured on a 2.4 GHz single-processor machine.</span></font></div><!--tex4ht:label?: x1-23001r9 -->    <!--l. 1163-->    <p >   </div><hr class="endfigure">        <div class="table">    <!--l. 1166-->    ]]></body>
<body><![CDATA[<p >   <font face="Verdana" size="2">   <a   id="x1-23002r5"></a></font><hr class="float">    <div class="float"  >                                                                                                                                                                                         <div class="center"  > <!--l. 1167-->    <p > <font face="Verdana" size="2"> <br /> </font>     <div class="caption"  ><font face="Verdana" size="2"><span class="id">Table&#x00A0;5: </span><span   class="content">Average annotated frame per second (FPS) rates for different classification parameters. Performed in a 2.4 GHz single processor machine.</span></font></div><!--tex4ht:label?: x1-23002r5 -->     <div class="pic-tabular"> <font face="Verdana" size="2"> <img  src="/img/revistas/cleiej/v17n2/2a11172x.png" alt="|---------------| |---------------| |---------------| |---------------| |Parameter--Rate-| |Parameter--Rate-| |Parameter--R-ate-| |Parameter--R-ate-| |best-fit     0.20 | |15B-fit    0.41 | |10B-fit    0.97 | |5B-fit      3.70 | |19B-fit-----0.27-| |14B-fit----0.48-| |9B-fit-----1.33-| |4B-fit------4.53-| |---------------| |---------------| |---------------| |---------------| |18B-fit-----0.29-| |13B-fit----0.55-| |8B-fit-----1.68-| |3B-fit------5.33-| |17B-fit-----0.33-| |12B-fit----0.67-| |7B-fit-----2.20-| |2B-fit------6.29-| |16B-fit-----0.36-  |11B-fit----0.81-  |6B-fit-----2.91-  |1B-fit------7.41- " ></font></div> </div>                                                                                                                                                                                        </div><hr class="endfloat" />    </div> <!--l. 
1240-->    <p >   <font face="Verdana" size="2">From the results in Figure&#x00A0;<a  href="#x1-22003r8">8<!--tex4ht:ref: fig:accuracy:bucket --></a> and in Table&#x00A0;<a  href="#x1-23002r5">5<!--tex4ht:ref: tab:3 --></a>, the main correlation between accuracy and efficiency can be summarized as follows: </font>      <ul class="itemize1">      <li class="itemize"><font face="Verdana" size="2">Accuracy always achieves its best performance with the <span  class="ptmri7t-">Mean Inliers </span>metric, so it is the best choice for the ICP      correspondence metric on the proposed methodology; </font>      </li>      <li class="itemize"><font face="Verdana" size="2">If accuracy is the main focus, the <span  class="ptmri7t-">best-fit </span>classification will achieve the highest performance in exchange for a      slow recognition process (<img  src="/img/revistas/cleiej/v17n2/2a11173x.png" alt="0.20 "  class="math" >FPS): <img  src="/img/revistas/cleiej/v17n2/2a11174x.png" alt="99.04 "  class="math" >% of correct matches in the evaluated scenarios;      </font>      </li>      <li class="itemize"><font face="Verdana" size="2">If efficiency is the goal, the <span  class="ptmri7t-">1B-fit </span>classifier will outperform any of the proposed techniques for the classification      stage. 
As shown in the results, it will still maintain a reasonable recognition accuracy (<img  src="/img/revistas/cleiej/v17n2/2a11175x.png" alt="84.31 "  class="math" >%) while reaching      an average reported rate of <img  src="/img/revistas/cleiej/v17n2/2a11176x.png" alt="7.41 "  class="math" > FPS;      </font>      </li>      <li class="itemize"><font face="Verdana" size="2">A good balance of accuracy and efficiency is obtained with the <span  class="ptmri7t-">5B-fit</span>: it provides a high accuracy (<img  src="/img/revistas/cleiej/v17n2/2a11177x.png" alt="94.16 "  class="math" >%) with      an efficiency rate of <img  src="/img/revistas/cleiej/v17n2/2a11178x.png" alt="3.70 "  class="math" > FPS.</font></li>    </ul> <!--l. 1259-->    <p >   <font face="Verdana" size="2">In summary, the proposed ICP-based methodology has proven to be reliable for recognition, achieving high accuracy when compared to other state-of-the-art solutions (Table&#x00A0;<a  href="#x1-4001r1">1<!--tex4ht:ref: tab:0 --></a>). At the same time, it has a drawback that results from the template matching architecture: every query must be aligned against the database samples. The proposed <img  src="/img/revistas/cleiej/v17n2/2a11179x.png" alt="K "  class="math" >Bucket<span  class="ptmri7t-">-fit</span> classification limits this problem by reducing the space complexity of the database samples, making it possible to provide good accuracy results and support online applications. </font>        <p><font face="Verdana" size="2"><span class="titlemark">7    </span> <a   id="x1-240007"></a>Conclusion and Future Works</font></p> <!--l. 1273-->    ]]></body>
<body><![CDATA[<p ><font face="Verdana" size="2">The results presented show that the ICP algorithm can be used to produce accurate matches even with a set of very similar gesture poses. With a best achieved accuracy of <img  src="/img/revistas/cleiej/v17n2/2a11180x.png" alt="99.04 "  class="math" >%, the methodology has been shown to be accurate enough for sign language recognition. However, as ICP processing is always conditioned on the pairwise data alignments, the general template matching paradigm is still a bottleneck to its application in real-time contexts (<img  src="/img/revistas/cleiej/v17n2/2a11181x.png" alt="&#x2248; 15 "  class="math" > FPS). <!--l. 1280--></font>    <p >   <font face="Verdana" size="2">As future work, coding the ICP procedure to run on accelerated hardware, such as GPUs, is a plausible alternative for applying this technique in real time. Given the high reported accuracy, another possibility is to combine it with existing classification tools, like random decision forests, which would coarsely reduce the space of possible matches and let the proposed methodology resolve only the tricky cases. <!--l. 1286--></font>    <p >   <font face="Verdana" size="2">By proposing a diversified set of correspondence metrics, it was possible to build a comprehensive analysis of the applicability of the Kinect&#8217;s depth data to 3D shape recognition. Furthermore, the proposed <img  src="/img/revistas/cleiej/v17n2/2a11182x.png" alt="K "  class="math" >Bucket<span  class="ptmri7t-">-fit </span>algorithm significantly reduces the time consumption of the brute-force <span  class="ptmri7t-">best-fit </span>algorithm, without compromising accuracy. From this point of view, the <img  src="/img/revistas/cleiej/v17n2/2a11183x.png" alt="K "  class="math" >Bucket<span  class="ptmri7t-">-fit </span>may also contribute to many other interesting fields, such as biometrics. 
Besides that, the described methodology and techniques can be readily adapted to any other sign language hand alphabet with minor modifications, requiring only the replacement of the template models in the dataset. <!--l. 1296--></font>    <p >   <font face="Verdana" size="2">Finally, the ICP procedure has great potential for dynamic gesture recognition, since it can robustly track the alignment of two spatially and temporally proximal geometric shapes. This may complete the requirements for full machine recognition of sign language and bring this work closer to applications in human-computer interaction and robotics. <!--l. 2--></font>    <p >        <p><font face="Verdana" size="2"><a   id="x1-250007"></a>References</font></p> <!--l. 2-->    <p >         <div class="thebibliography">         <p ><font face="Verdana" size="2"><span class="biblabel">   [<a href="#br1">1</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xkinfu:2011:2"></a>S.&#x00A0;Izadi,   D.&#x00A0;Kim,   O.&#x00A0;Hilliges,   D.&#x00A0;Molyneaux,   R.&#x00A0;Newcombe,   P.&#x00A0;Kohli,   J.&#x00A0;Shotton,   S.&#x00A0;Hodges,     D.&#x00A0;Freeman, A.&#x00A0;Davison, and A.&#x00A0;Fitzgibbon, &#8220;KinectFusion: Real-time 3D reconstruction and interaction using     a moving depth camera,&#8221; in <span  class="ptmri7t-">Proceedings of the 24th annual ACM symposium on User interface software and</span>     <span  class="ptmri7t-">technology</span>, ser. UIST&#x00A0;&#8217;11.    New York, NY, USA: ACM, 2011, pp. 559&#8211;568.     
</font>                                                                                                                                                                                         </p>         <p ><font face="Verdana" size="2"><span class="biblabel">   [<a href="#br2">2</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xsuarez:2012"></a>J.&#x00A0;Suarez and R.&#x00A0;Murphy, &#8220;Hand gesture recognition with depth images: A review,&#8221; in <span  class="ptmri7t-">RO-MAN, 2012 IEEE</span>,     2012, pp. 411&#8211;417. </font>     </p>         ]]></body>
<body><![CDATA[<p ><font face="Verdana" size="2"><span class="biblabel">   [<a href="#br3">3</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xkinfu:2011"></a>R.&#x00A0;Newcombe,   S.&#x00A0;Izadi,   O.&#x00A0;Hilliges,   D.&#x00A0;Molyneaux,   D.&#x00A0;Kim,   A.&#x00A0;Davison,   P.&#x00A0;Kohli,   J.&#x00A0;Shotton,     S.&#x00A0;Hodges, and A.&#x00A0;Fitzgibbon, &#8220;KinectFusion: Real-time dense surface mapping and tracking,&#8221; in <span  class="ptmri7t-">Proceedings</span>     <span  class="ptmri7t-">of the 2011 10th IEEE International Symposium on Mixed and Augmented Reality</span>, ser. ISMAR &#8217;11.  Washington,     DC, USA: IEEE Computer Society, 2011, pp. 127&#8211;136. </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">   [<a href="#br4">4</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xoikonomidis:2012"></a>I.&#x00A0;Oikonomidis, N.&#x00A0;Kyriazis, and A.&#x00A0;Argyros, &#8220;Tracking the articulated motion of two strongly interacting     hands,&#8221; in <span  class="ptmri7t-">Computer Vision and Pattern Recognition</span>, Providence, Rhode Island, USA, 2012.     </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">   [<a href="#br5">5</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xmitra:2007"></a>S.&#x00A0;Mitra  and  T.&#x00A0;Acharya,  &#8220;Gesture  recognition:  A  survey,&#8221;  <span  class="ptmri7t-">Systems,  Man,  and  Cybernetics,  Part  C:</span>     <span  class="ptmri7t-">Applications and Reviews, IEEE Transactions on</span>, vol.&#x00A0;37, no.&#x00A0;3, pp. 311&#8211;324, 2007.     
</font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">   [<a href="#br6">6</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xalahdal:2012"></a>M.&#x00A0;Al-Ahdal and N.&#x00A0;Tahir, &#8220;Review in sign language recognition systems,&#8221; in <span  class="ptmri7t-">Computers Informatics (ISCI),</span>     <span  class="ptmri7t-">2012 IEEE Symposium on</span>, 2012, pp. 52&#8211;57. </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">   [<a href="#br7">7</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xwikipedia_alphabet"></a>Wikipedia.                 American                 manual                 alphabet.                 [Online].                 Available:     <a  href="http://en.wikipedia.org/wiki/American_manual_alphabet" class="url" >http://en.wikipedia.org/wiki/American_manual_alphabet</a>     </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">   [<a href="#br8">8</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xzabulis:2009"></a>X.&#x00A0;Zabulis, H.&#x00A0;Baltzakis, and A.&#x00A0;A. Argyros, <span  class="ptmri7t-">Vision-based hand gesture recognition for human computer</span>     <span  class="ptmri7t-">interaction</span>, ser. on Human Factors and Ergonomics.    Lawrence Erlbaum Associates, Inc. (LEA), 2009, ch.&#x00A0;34,     pp. 34.1&#8211;34.30. </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">   [<a href="#br9">9</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xrau:2012"></a>S.&#x00A0;Rautaray and A.&#x00A0;Agrawal, &#8220;Vision based hand gesture recognition for human computer interaction: A     survey,&#8221; <span  class="ptmri7t-">Artificial Intelligence Review</span>, pp. 1&#8211;54, 2012.     
</font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br10">10</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xbesl:1992"></a>P.&#x00A0;Besl  and  N.&#x00A0;D.  McKay,  &#8220;A  method  for  registration  of  3-D  shapes,&#8221;  <span  class="ptmri7t-">Pattern  Analysis  and  Machine</span>     <span  class="ptmri7t-">Intelligence, IEEE Transactions on</span>, vol.&#x00A0;14, no.&#x00A0;2, pp. 239&#8211;256, 1992.     </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br11">11</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xtrindade:2012"></a>P.&#x00A0;Trindade, J.&#x00A0;Lobo, and J.&#x00A0;Barreto, &#8220;Hand gesture recognition using color and depth images enhanced     with hand angular pose data,&#8221; in <span  class="ptmri7t-">Multisensor Fusion and Integration for Intelligent Systems (MFI), 2012 IEEE</span>     <span  class="ptmri7t-">Conference on</span>, 2012, pp. 71&#8211;76. </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br12">12</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xrusinkiewicz:2001"></a>S.&#x00A0;Rusinkiewicz  and  M.&#x00A0;Levoy,  &#8220;Efficient  variants  of  the  ICP  algorithm,&#8221;  in  <span  class="ptmri7t-">3-D  Digital  Imaging  and</span>     <span  class="ptmri7t-">Modeling, 2001. Proceedings. Third International Conference on</span>, 2001, pp. 145&#8211;152.     </font>                                                                                                                                                                                         </p>         ]]></body>
<body><![CDATA[<p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br13">13</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xfujimura:2006"></a>K.&#x00A0;Fujimura and X.&#x00A0;Liu, &#8220;Sign recognition using depth image streams,&#8221; in <span  class="ptmri7t-">Automatic Face and Gesture</span>     <span  class="ptmri7t-">Recognition, 2006. FGR 2006. 7th International Conference on</span>, 2006, pp. 381&#8211;386.     </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br14">14</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xkeskin:2011"></a>C.&#x00A0;Keskin, F.&#x00A0;Kirac, Y.&#x00A0;Kara, and L.&#x00A0;Akarun, &#8220;Real time hand pose estimation using depth sensors,&#8221; in     <span  class="ptmri7t-">Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on</span>, 2011, pp. 1228&#8211;1234.     </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br15">15</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xvan:2011"></a>M.&#x00A0;Van&#x00A0;den   Bergh,   D.&#x00A0;Carton,   R.&#x00A0;de&#x00A0;Nijs,   N.&#x00A0;Mitsou,   C.&#x00A0;Landsiedel,   K.&#x00A0;Kuehnlenz,   D.&#x00A0;Wollherr,     L.&#x00A0;Van&#x00A0;Gool, and M.&#x00A0;Buss, &#8220;Real-time 3D hand gesture interaction with a robot for understanding directions     from humans,&#8221; in <span  class="ptmri7t-">RO-MAN, 2011 IEEE</span>, 2011, pp. 357&#8211;362. 
</font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br16">16</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xuebersax:2011"></a>D.&#x00A0;Uebersax,  J.&#x00A0;Gall,  M.&#x00A0;Van&#x00A0;den  Bergh,  and  L.&#x00A0;Van&#x00A0;Gool,  &#8220;Real-time  sign  language  letter  and  word     recognition  from  depth  data,&#8221;  in  <span  class="ptmri7t-">Computer  Vision  Workshops  (ICCV  Workshops),  2011  IEEE  International</span>     <span  class="ptmri7t-">Conference on</span>, 2011, pp. 383&#8211;390. </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br17">17</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xliwicki:2009"></a>S.&#x00A0;Liwicki and M.&#x00A0;Everingham, &#8220;Automatic recognition of fingerspelled words in British Sign Language,&#8221;     <span  class="ptmri7t-">2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops</span>, vol.&#x00A0;0, pp.     50&#8211;57, 2009. </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br18">18</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xkonda:2012"></a>K.&#x00A0;R.  Konda,  A.&#x00A0;Königs,  H.&#x00A0;Schulz,  and  D.&#x00A0;Schulz,  &#8220;Real  time  interaction  with  mobile  robots  using     hand  gestures,&#8221;  in  <span  class="ptmri7t-">Proceedings  of  the  seventh  annual  ACM/IEEE  international  conference  on  Human-Robot</span>     <span  class="ptmri7t-">Interaction</span>, ser. HRI&#x00A0; &#8217;12.    New York, NY, USA: ACM, 2012, pp. 177&#8211;178.     
</font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br19">19</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xpugeault:2011"></a>N.&#x00A0;Pugeault  and  R.&#x00A0;Bowden,  &#8220;Spelling  it  out:  Real-time  ASL  fingerspelling  recognition,&#8221;  in  <span  class="ptmri7t-">Computer</span>     <span  class="ptmri7t-">Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on</span>, 2011, pp. 1114&#8211;1119.     </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br20">20</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xamor:2006"></a>B.&#x00A0;Amor,  M.&#x00A0;Ardabilian,  and  L.&#x00A0;Chen,  &#8220;New  experiments  on  ICP-based  3D  face  recognition  and     authentication,&#8221; in <span  class="ptmri7t-">Pattern Recognition, 2006. ICPR 2006. 18th International Conference on</span>, vol.&#x00A0;3, 2006, pp.     1195&#8211;1199. </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br21">21</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xping:2005"></a>P.&#x00A0;Yan and K.&#x00A0;Bowyer, &#8220;A fast algorithm for ICP-based 3D shape biometrics,&#8221; in <span  class="ptmri7t-">Automatic Identification</span>     <span  class="ptmri7t-">Advanced Technologies, 2005. Fourth IEEE Workshop on</span>, 2005, pp. 213&#8211;218.     </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br22">22</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xhorn:1987"></a>B.&#x00A0;K.&#x00A0;P. Horn, &#8220;Closed-form solution of absolute orientation using unit quaternions,&#8221; <span  class="ptmri7t-">Journal of the Optical</span>     <span  class="ptmri7t-">Society of America. A</span>, vol.&#x00A0;4, no.&#x00A0;4, pp. 629&#8211;642, Apr. 1987.     </font>     </p>         ]]></body>
<body><![CDATA[<p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br23">23</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xpaulino:2011"></a>J.&#x00A0;P.  da&#x00A0;Silva&#x00A0;Júnior,  D.&#x00A0;L.  Borges,  and  F.&#x00A0;de&#x00A0;Barros&#x00A0;Vidal,  &#8220;A  dynamic  approach  for  approximate     pairwise alignment based on 4-points congruence sets of 3D points,&#8221; in <span  class="ptmri7t-">Image Processing (ICIP), 2011 18th</span>     <span  class="ptmri7t-">IEEE International Conference on</span>, 2011, pp. 889&#8211;892.     </font>                                                                                                                                                                                         </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br24">24</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xzhiyuan:2012"></a>Z.&#x00A0;Zhang, S.&#x00A0;H. Ong, and K.&#x00A0;Foong, &#8220;Improved spin images for 3D surface matching using signed angles,&#8221;     in <span  class="ptmri7t-">Image Processing (ICIP), 2012 19th IEEE International Conference on</span>, 2012, pp. 537&#8211;540.     </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br25">25</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xmalassiotis:2002"></a>S.&#x00A0;Malassiotis,  N.&#x00A0;Aifanti,  and  M.&#x00A0;Strintzis,  &#8220;A  gesture  recognition  system  using  3D  data,&#8221;  in  <span  class="ptmri7t-">3D  Data</span>     <span  class="ptmri7t-">Processing  Visualization  and  Transmission,  2002.  Proceedings.  First  International  Symposium  on</span>,  2002,  pp.     190&#8211;193. </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br26">26</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xkinect:2013"></a>Microsoft Corp. 
Redmond, WA. Kinect for Xbox 360. </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br27">27</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xopenni:2013"></a>OpenNI SDK. [Online]. Available: <a  href="http://www.openni.org" class="url" >http://www.openni.org</a> </font>     </p>         <p ><font face="Verdana" size="2"><span class="biblabel">  [<a href="#br28">28</a>]<span class="bibsp">&#x00A0;&#x00A0;&#x00A0;</span></span><a   id="Xnite:2013"></a>PrimeSense. NITE Middleware. [Online]. Available: <a  href="http://www.primesense.com/solutions/nite-middleware" class="url" >http://www.primesense.com/solutions/nite-middleware</a>     </font> </p>     </div>           ]]></body><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Izadi]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Kim]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Hilliges]]></surname>
<given-names><![CDATA[O]]></given-names>
</name>
<name>
<surname><![CDATA[Molyneaux]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Newcombe]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
<name>
<surname><![CDATA[Kohli]]></surname>
<given-names><![CDATA[P]]></given-names>
</name>
<name>
<surname><![CDATA[Shotton]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Hodges]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Freeman]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Davison]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Fitzgibbon]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
</person-group>
<source><![CDATA[KinectFusion: Real-time 3D reconstruction and interaction using a moving depth camera]]></source>
<year></year>
<conf-name><![CDATA[ Proceedings of the 24th annual ACM symposium on User interface software and technology]]></conf-name>
<conf-date>2011</conf-date>
<conf-loc>New York NY</conf-loc>
</nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Suarez]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Murphy]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
</person-group>
<source><![CDATA[Hand gesture recognition with depth images: A review]]></source>
<year>2012</year>
<page-range>411-417</page-range></nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Newcombe]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
<name>
<surname><![CDATA[Izadi]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Hilliges]]></surname>
<given-names><![CDATA[O]]></given-names>
</name>
<name>
<surname><![CDATA[Molyneaux]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Kim]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Davison]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Kohli]]></surname>
<given-names><![CDATA[P]]></given-names>
</name>
<name>
<surname><![CDATA[Shotton]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Hodges]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Fitzgibbon]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
</person-group>
<source><![CDATA[KinectFusion: Real-time dense surface mapping and tracking]]></source>
<year></year>
<conf-name><![CDATA[ International Symposium on Mixed and Augmented Reality]]></conf-name>
<conf-date>2011</conf-date>
<conf-loc>Washington DC</conf-loc>
</nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Oikonomidis]]></surname>
<given-names><![CDATA[I]]></given-names>
</name>
<name>
<surname><![CDATA[Kyriazis]]></surname>
<given-names><![CDATA[N]]></given-names>
</name>
<name>
<surname><![CDATA[Argyros]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
</person-group>
<source><![CDATA[Tracking the articulated motion of two strongly interacting hands]]></source>
<year>2012</year>
</nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Mitra]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Acharya]]></surname>
<given-names><![CDATA[T]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Gesture recognition: A survey]]></article-title>
<source><![CDATA[IEEE Transactions]]></source>
<year>2007</year>
<volume>37</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>311-324</page-range></nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Al-Ahdal]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Tahir]]></surname>
<given-names><![CDATA[N]]></given-names>
</name>
</person-group>
<source><![CDATA[Review in sign language recognition systems]]></source>
<year></year>
<conf-name><![CDATA[ IEEE Symposium]]></conf-name>
<conf-date>2012</conf-date>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="">
<collab>Wikipedia</collab>
<source><![CDATA[American manual alphabet]]></source>
<year></year>
</nlm-citation>
</ref>
<ref id="B8">
<label>8</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zabulis]]></surname>
<given-names><![CDATA[X]]></given-names>
</name>
<name>
<surname><![CDATA[Baltzakis]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
<name>
<surname><![CDATA[Argyros]]></surname>
<given-names><![CDATA[A. A]]></given-names>
</name>
</person-group>
<source><![CDATA[Vision-based hand gesture recognition for human computer interaction, ser. on Human Factors and Ergonomics]]></source>
<year>2009</year>
<page-range>1-34</page-range><publisher-name><![CDATA[Lawrence Erlbaum Associates]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B9">
<label>9</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Rautaray]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Agrawal]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Vision based hand gesture recognition for human computer interaction: A survey]]></article-title>
<source><![CDATA[Artificial Intelligence Review]]></source>
<year>2012</year>
<page-range>1-54</page-range></nlm-citation>
</ref>
<ref id="B10">
<label>10</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Besl]]></surname>
<given-names><![CDATA[P]]></given-names>
</name>
<name>
<surname><![CDATA[McKay]]></surname>
<given-names><![CDATA[N. D]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[A method for registration of 3-D shapes]]></article-title>
<source><![CDATA[IEEE Transactions on Pattern Analysis and Machine Intelligence]]></source>
<year>1992</year>
<volume>14</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>239-256</page-range></nlm-citation>
</ref>
<ref id="B11">
<label>11</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Trindade]]></surname>
<given-names><![CDATA[P]]></given-names>
</name>
<name>
<surname><![CDATA[Lobo]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Barreto]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
</person-group>
<source><![CDATA[Hand gesture recognition using color and depth images enhanced with hand angular pose data]]></source>
<year></year>
<conf-name><![CDATA[ IEEE Conference]]></conf-name>
<conf-date>2012</conf-date>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B12">
<label>12</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Rusinkiewicz]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Levoy]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
</person-group>
<source><![CDATA[Efficient variants of the ICP algorithm]]></source>
<year></year>
<conf-name><![CDATA[ Proceedings. Third International Conference]]></conf-name>
<conf-date>2001</conf-date>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B13">
<label>13</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Fujimura]]></surname>
<given-names><![CDATA[K]]></given-names>
</name>
<name>
<surname><![CDATA[Liu]]></surname>
<given-names><![CDATA[X]]></given-names>
</name>
</person-group>
<source><![CDATA[Sign recognition using depth image streams]]></source>
<year></year>
<conf-name><![CDATA[ Automatic Face and Gesture Recognition, 2006. FGR 2006. 7th International Conference]]></conf-name>
<conf-date>2006</conf-date>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B14">
<label>14</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Keskin]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
<name>
<surname><![CDATA[Kirac]]></surname>
<given-names><![CDATA[F]]></given-names>
</name>
<name>
<surname><![CDATA[Kara]]></surname>
<given-names><![CDATA[Y]]></given-names>
</name>
<name>
<surname><![CDATA[Akarun]]></surname>
<given-names><![CDATA[L]]></given-names>
</name>
</person-group>
<source><![CDATA[Real time hand pose estimation using depth sensors]]></source>
<year></year>
<conf-name><![CDATA[ IEEE International Conference]]></conf-name>
<conf-date>2011</conf-date>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B15">
<label>15</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Van den Bergh]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Carton]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[de Nijs]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
<name>
<surname><![CDATA[Mitsou]]></surname>
<given-names><![CDATA[N]]></given-names>
</name>
<name>
<surname><![CDATA[Landsiedel]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
<name>
<surname><![CDATA[Kuehnlenz]]></surname>
<given-names><![CDATA[K]]></given-names>
</name>
<name>
<surname><![CDATA[Wollherr]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Van Gool]]></surname>
<given-names><![CDATA[L]]></given-names>
</name>
<name>
<surname><![CDATA[Buss]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
</person-group>
<source><![CDATA[Real-time 3D hand gesture interaction with a robot for understanding directions from humans]]></source>
<year></year>
<conf-name><![CDATA[ IEEE]]></conf-name>
<conf-date>2011</conf-date>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B16">
<label>16</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Uebersax]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Gall]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Van den Bergh]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Van Gool]]></surname>
<given-names><![CDATA[L]]></given-names>
</name>
</person-group>
<source><![CDATA[Real-time sign language letter and word recognition from depth data]]></source>
<year></year>
<conf-name><![CDATA[ IEEE International Conference]]></conf-name>
<conf-date>2011</conf-date>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B17">
<label>17</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Liwicki]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Everingham]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
</person-group>
<source><![CDATA[Automatic recognition of fingerspelled words in British sign language]]></source>
<year></year>
<conf-name><![CDATA[ IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops]]></conf-name>
<conf-date>2009</conf-date>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B18">
<label>18</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Konda]]></surname>
<given-names><![CDATA[K. R]]></given-names>
</name>
<name>
<surname><![CDATA[Konigs]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Schulz]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
<name>
<surname><![CDATA[Schulz]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
</person-group>
<source><![CDATA[Real time interaction with mobile robots using hand gestures]]></source>
<year></year>
<conf-name><![CDATA[ Proceedings of the seventh annual ACM/IEEE international conference on Human-Robot Interaction]]></conf-name>
<conf-date>2012</conf-date>
<conf-loc>New York </conf-loc>
</nlm-citation>
</ref>
<ref id="B19">
<label>19</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Pugeault]]></surname>
<given-names><![CDATA[N]]></given-names>
</name>
<name>
<surname><![CDATA[Bowden]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
</person-group>
<source><![CDATA[Spelling it out: Real-time ASL fingerspelling recognition]]></source>
<year></year>
<conf-name><![CDATA[Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on]]></conf-name>
<conf-date>2011</conf-date>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B20">
<label>20</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Amor]]></surname>
<given-names><![CDATA[B]]></given-names>
</name>
<name>
<surname><![CDATA[Ardabilian]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[L]]></given-names>
</name>
</person-group>
<source><![CDATA[New experiments on ICP-based 3D face recognition and authentication]]></source>
<year></year>
<conf-name><![CDATA[Pattern Recognition, 2006. ICPR 2006. 18th International Conference on]]></conf-name>
<conf-date>2006</conf-date>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B21">
<label>21</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Yan]]></surname>
<given-names><![CDATA[P]]></given-names>
</name>
<name>
<surname><![CDATA[Bowyer]]></surname>
<given-names><![CDATA[K]]></given-names>
</name>
</person-group>
<source><![CDATA[A fast algorithm for ICP-based 3D shape biometrics]]></source>
<year>2005</year>
<page-range>213-218</page-range></nlm-citation>
</ref>
<ref id="B22">
<label>22</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Horn]]></surname>
<given-names><![CDATA[B. K. P]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Closed-form solution of absolute orientation using unit quaternions]]></article-title>
<source><![CDATA[Journal of the Optical Society of America A]]></source>
<year>1987</year>
<month>Apr.</month>
<volume>4</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>629-642</page-range></nlm-citation>
</ref>
<ref id="B23">
<label>23</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[da Silva]]></surname>
<given-names><![CDATA[J. P]]></given-names>
</name>
<name>
<surname><![CDATA[Borges]]></surname>
<given-names><![CDATA[D. L]]></given-names>
</name>
<name>
<surname><![CDATA[de Barros Vidal]]></surname>
<given-names><![CDATA[F]]></given-names>
</name>
</person-group>
<source><![CDATA[A dynamic approach for approximate pairwise alignment based on 4-points congruence sets of 3D points]]></source>
<year></year>
<conf-name><![CDATA[Image Processing (ICIP), 2011 18th IEEE International Conference on]]></conf-name>
<conf-date>2011</conf-date>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B24">
<label>24</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[Z]]></given-names>
</name>
<name>
<surname><![CDATA[Ong]]></surname>
<given-names><![CDATA[S. H]]></given-names>
</name>
<name>
<surname><![CDATA[Foong]]></surname>
<given-names><![CDATA[K]]></given-names>
</name>
</person-group>
<source><![CDATA[Improved spin images for 3D surface matching using signed angles]]></source>
<year></year>
<conf-name><![CDATA[Image Processing (ICIP), 2012 19th IEEE International Conference on]]></conf-name>
<conf-date>2012</conf-date>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B25">
<label>25</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Malassiotis]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Aifanti]]></surname>
<given-names><![CDATA[N]]></given-names>
</name>
<name>
<surname><![CDATA[Strintzis]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
</person-group>
<source><![CDATA[A gesture recognition system using 3D data]]></source>
<year></year>
<conf-name><![CDATA[3D Data Processing Visualization and Transmission, 2002. Proceedings. First International Symposium on]]></conf-name>
<conf-date>2002</conf-date>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B26">
<label>26</label><nlm-citation citation-type="">
<source><![CDATA[Microsoft Corp.: Kinect for Xbox 360]]></source>
<year></year>
</nlm-citation>
</ref>
<ref id="B27">
<label>27</label><nlm-citation citation-type="">
<source><![CDATA[OpenNI SDK]]></source>
<year></year>
</nlm-citation>
</ref>
<ref id="B28">
<label>28</label><nlm-citation citation-type="">
<source><![CDATA[PrimeSense: NITE Middleware]]></source>
<year></year>
</nlm-citation>
</ref>
</ref-list>
</back>
</article>
