<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>0717-5000</journal-id>
<journal-title><![CDATA[CLEI Electronic Journal]]></journal-title>
<abbrev-journal-title><![CDATA[CLEIej]]></abbrev-journal-title>
<issn>0717-5000</issn>
<publisher>
<publisher-name><![CDATA[Centro Latinoamericano de Estudios en Informática]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S0717-50002016000200007</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[Semantic Mining based on graph theory and ontologies. Case Study: Cell Signaling Pathways]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Rangel]]></surname>
<given-names><![CDATA[Carlos R.]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Altamiranda]]></surname>
<given-names><![CDATA[Junior]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Aguilar]]></surname>
<given-names><![CDATA[Jose]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
</contrib-group>
<aff id="A01">
<institution><![CDATA[Universidad de Los Andes, Centro de Estudios en Microcomputación y Sistemas Distribuidos (CEMISID)]]></institution>
<addr-line><![CDATA[Mérida ]]></addr-line>
<country>Venezuela</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>08</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>08</month>
<year>2016</year>
</pub-date>
<volume>19</volume>
<numero>2</numero>
<fpage>7</fpage>
<lpage>7</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.edu.uy/scielo.php?script=sci_arttext&amp;pid=S0717-50002016000200007&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.edu.uy/scielo.php?script=sci_abstract&amp;pid=S0717-50002016000200007&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.edu.uy/scielo.php?script=sci_pdf&amp;pid=S0717-50002016000200007&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[In this paper we use concepts from graph theory and cellular biology, represented as ontologies, to carry out semantic mining tasks on signaling pathway networks. Specifically, the paper describes the semantic enrichment of signaling pathway networks. A cell signaling network describes the basic cellular activities and their interactions. The main contribution of this paper lies in the signaling pathway research area: it proposes a new technique to analyze and understand how changes in these networks may affect the transmission and flow of information, which can produce diseases such as cancer and diabetes. Our approach is based on three concepts from graph theory (modularity, clustering and centrality) that are frequently used in social network analysis. Our approach consists of two phases: the first uses the graph theory concepts to determine the cellular groups in the network, which we call communities; the second uses ontologies for the semantic enrichment of the cellular communities. The graph theory measures allow us to determine the set of cells that are close (for example, in a disease), and the main cells in each community. We analyze our approach in two cases: TGF-&#946; and Alzheimer's disease.]]></p></abstract>
<abstract abstract-type="short" xml:lang="es"><p><![CDATA[En este trabajo se utilizan los conceptos de la teoría de grafos y la biología celular representado como ontologías, para llevar a cabo tareas de minería semántica en las redes de vías de comunicación celular. En concreto, el documento describe el enriquecimiento semántico de las redes vía de comunicación celular. Una red de vía de comunicación celular describe las actividades celulares básicas y sus interacciones. La principal contribución de este trabajo es en el área de la investigación de vías de comunicación celular, se propone una nueva técnica para analizar y entender cómo los cambios en estas redes pueden afectar a la transmisión y circulación de la información, que producen enfermedades como el cáncer y la diabetes. Nuestro enfoque se basa en tres conceptos de la teoría de grafos (modularidad, clustering y de centralidad) que se utilizan con frecuencia en el análisis de redes sociales. Nuestro enfoque consiste en dos fases: la primera utiliza los conceptos de la teoría de grafos para determinar los grupos celulares en la red, lo que les llamaremos comunidades; la segunda utiliza ontologías para el enriquecimiento semántico de las comunidades celulares. Las medidas utilizadas en la teoría de grafos nos permiten determinar el conjunto de células que están cerca (por ejemplo, en una enfermedad), y las principales células en cada comunidad. Analizamos nuestro enfoque en dos casos: el TGF-ß y la Enfermedad de Alzheimer.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[Bioinformatics]]></kwd>
<kwd lng="en"><![CDATA[semantic mining]]></kwd>
<kwd lng="en"><![CDATA[clustering]]></kwd>
<kwd lng="en"><![CDATA[semantic enrichment]]></kwd>
<kwd lng="en"><![CDATA[TGF-&#946;]]></kwd>
<kwd lng="en"><![CDATA[Alzheimer disease]]></kwd>
<kwd lng="es"><![CDATA[Bioinformática]]></kwd>
<kwd lng="es"><![CDATA[Minería Semántica]]></kwd>
<kwd lng="es"><![CDATA[Clustering]]></kwd>
<kwd lng="es"><![CDATA[Enriquecimiento Semántico]]></kwd>
<kwd lng="es"><![CDATA[TGF-ß]]></kwd>
<kwd lng="es"><![CDATA[Enfermedad Alzheimer]]></kwd>
</kwd-group>
</article-meta>
</front><body><![CDATA[ <p lang="en-US" align="center" style="font-style: normal; orphans: 2; widows: 2"> <font face="Verdana, sans-serif"><font style="font-size: 14pt"><b>Semantic Mining based on graph theory and ontologies. Case Study: Cell Signaling Pathways</b></font></font></p>     <p lang="en-US" align="center" style="font-style: normal; orphans: 2; widows: 2">     <br>  </p>     <p lang="en-US" align="center" style="font-style: normal; orphans: 2; widows: 2"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><b>Carlos R. Rangel and Junior Altamiranda</b></font></font></p>     <p lang="en-US" align="center" style="margin-left: 3cm; text-indent: -3cm; orphans: 2; widows: 2"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span style="font-style: normal">Universidad de Los Andes, Centro de Estudios en Microcomputaci&oacute;n y Sistemas Distribuidos</span></font></font></p>     <p lang="en-US" align="center" style="margin-left: 3cm; text-indent: -3cm; orphans: 2; widows: 2"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span style="font-style: normal">(CEMISID), M&eacute;rida, Venezuela, 5101</span></font></font></p>     <p lang="en-US" align="center" style="orphans: 2; widows: 2"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><i>{<a href="mailto:carlosran@ula.ve">carlosran</a> | <a href="mailto:altamira@ula.ve">altamira</a>}@ula.ve</i></font></font></p>     <p lang="en-US" align="center" style="font-style: normal; orphans: 2; widows: 2">     <br>  </p>     <p lang="en-US" align="center" style="font-style: normal; orphans: 2; widows: 2"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">and</font></font></p>     ]]></body>
<body><![CDATA[<p lang="en-US" align="center" style="font-style: normal; orphans: 2; widows: 2">     <br>  </p>     <p lang="en-US" align="center" style="orphans: 2; widows: 2"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span style="font-style: normal"><b>Jose Aguilar</b></span></font></font></p>     <p lang="en-US" align="center" style="margin-left: 3cm; text-indent: -3cm; font-style: normal; orphans: 2; widows: 2"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">Universidad de Los Andes, Centro de Estudios en Microcomputaci&oacute;n y Sistemas Distribuidos</font></font></p>     <p lang="en-US" align="center" style="margin-left: 3cm; text-indent: -3cm; font-style: normal; orphans: 2; widows: 2"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">(CEMISID), M&eacute;rida, Venezuela, 5101</font></font></p>     <p lang="en-US" align="center" style="orphans: 0; widows: 0"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="es-EC"><span style="font-style: normal">Prometeo Researcher, Universidad T&eacute;cnica Particular Loja, Ecuador,</span></span></font></font></p>     <p lang="en-US" align="center" style="orphans: 2; widows: 2"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><i><a class="western" href="mailto:aguilar@ula.ve">aguilar@ula.ve</a>  </i></font></font> </p>      <p lang="en-US" align="left" style="margin-right: 1.59cm; margin-top: 0.42cm; margin-bottom: 0.21cm; page-break-inside: avoid; orphans: 0; widows: 0; page-break-after: avoid"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><b>Abstract</b></font></font></p>     <p lang="es-ES" class="western" align="justify" style="margin-right: 1.59cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">In this paper we use concepts from graph theory and cellular biology, represented as ontologies, to carry out semantic mining tasks on signaling pathway networks. 
Specifically, the paper describes the semantic enrichment of signaling pathway networks. A cell signaling network describes the basic cellular activities and their interactions. The main contribution of this paper lies in the signaling pathway research area: it proposes a new technique to analyze and understand how changes in these networks may affect the transmission and flow of information, which can produce diseases such as cancer and diabetes. Our approach is based on three concepts from graph theory (modularity, clustering and centrality) that are frequently used in social network analysis. Our approach consists of two phases: the first uses the graph theory concepts to determine the cellular groups in the network, which we call communities; the second uses ontologies for the semantic enrichment of the cellular communities. The graph theory measures allow us to determine the set of cells that are close (for example, in a disease), and the main cells in each community. We analyze our approach in two cases: TGF-&beta; and Alzheimer's disease. </span></font></font> </p>      <p lang="es-ES" class="western" align="justify" style="margin-right: 1.59cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><b>Abstract </b></span><span lang="es-UY"><b>in Spanish</b></span><span lang="en-US"> </span></font></font> </p>     ]]></body>
<body><![CDATA[<p lang="es-ES" class="western" align="justify" style="margin-right: 1.59cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">En este trabajo se utilizan los conceptos de la teor&iacute;a de grafos y la biolog&iacute;a celular representado como ontolog&iacute;as, para llevar a cabo tareas de miner&iacute;a sem&aacute;ntica en las redes de v&iacute;as de comunicaci&oacute;n celular. En concreto, el documento describe el enriquecimiento sem&aacute;ntico de las redes v&iacute;a de comunicaci&oacute;n celular. Una red de v&iacute;a de comunicaci&oacute;n celular describe las actividades celulares b&aacute;sicas y sus interacciones. La principal contribuci&oacute;n de este trabajo es en el &aacute;rea de la investigaci&oacute;n de v&iacute;as de comunicaci&oacute;n celular, se propone una nueva t&eacute;cnica para analizar y entender c&oacute;mo los cambios en estas redes pueden afectar a la transmisi&oacute;n y circulaci&oacute;n de la informaci&oacute;n, que producen enfermedades como el c&aacute;ncer y la diabetes. Nuestro enfoque se basa en tres conceptos de la teor&iacute;a de grafos (modularidad, clustering y de centralidad) que se utilizan con frecuencia en el an&aacute;lisis de redes sociales. Nuestro enfoque consiste en dos fases: la primera utiliza los conceptos de la teor&iacute;a de grafos para determinar los grupos celulares en la red, lo que les llamaremos comunidades; la segunda utiliza ontolog&iacute;as para el enriquecimiento sem&aacute;ntico de las comunidades celulares. Las medidas utilizadas en la teor&iacute;a de grafos nos permiten determinar el conjunto de c&eacute;lulas que est&aacute;n cerca (por ejemplo, en una enfermedad), y las principales c&eacute;lulas en cada comunidad. 
Analizamos nuestro enfoque en dos casos: el TGF-&szlig; y la Enfermedad de Alzheimer.</span></font></font></p>   <h2 lang="es-ES" class="western" align="justify" style="margin-left: 1.52cm; margin-right: 1.59cm; text-indent: -1.52cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Keywords: </span><span lang="en-US"><span style="font-weight: normal">Bioinformatics, semantic mining, clustering, semantic enrichment, TGF-&beta;, Alzheimer disease. </span></span></font></font> </h2>  <h2 lang="es-ES" class="western" align="justify" style="margin-left: 1.52cm; margin-right: 1.59cm; text-indent: -1.52cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><b>Keywords </b></span><span lang="es-UY"><b>in Spanish</b></span><span lang="en-US"><span style="font-weight: normal"> : Bioinform&aacute;tica, Miner&iacute;a Sem&aacute;ntica, Clustering, Enriquecimiento Sem&aacute;ntico, TGF-&szlig;, Enfermedad Alzheimer</span></span></font></font></h2>      <p lang="en-US" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="es-UY">R</span>eceived:  2015-10-31 <span lang="es-UY">R</span>evised 2016-06-30 <span lang="es-UY">A</span>ccepted 2016-07-28</font></font></p>      <p lang="en-US" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">DOI: <a class="western" href="http://dx.doi.org/10.19153/cleiej.19.2.6">http://dx.doi.org/10.19153/cleiej.19.<span lang="es-UY">2</span>.<span lang="es-UY">6</span></a></font></font></p>      <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span style="font-variant: normal"><span lang="es-UY"><b>1. 
</b></span></span><span style="font-variant: normal"><span lang="en-US"><b>Introduction</b></span></span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Nowadays, a large amount of biological knowledge is embodied in information technology in different forms, such as databases and ontologies. A major challenge is to gather the information scattered across these sources, in order to help biologists understand, for example, the behavior of the human body. An example of biological knowledge in information technology is the Gene Ontology (GO) <a id="br1">[</a><a href="#r1">1</a>], which is an ontological framework in biology that describes genes in terms of their molecular functions, associated biological processes, and cellular components, in an independent manner. </span></font></font> </p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">On the other hand, many biological networks are currently being studied extensively, such as protein-protein interaction networks <a id="br2">[</a><a href="#r2">2</a>], gene regulatory networks <a id="br3">[</a><a href="#r3">3</a>] and metabolic networks <a id="br4">[</a><a href="#r4">4</a>]. Recent studies show that biological networks are dynamic; they reconfigure (links appear or disappear) in response to different external signals. There are many examples in which the same list of genes interacts in different ways under different conditions, which leads to different meanings or biological functions. 
</span></font></font> </p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">A cell signaling network is a type of biological network that describes cellular activities and the coordination among them in response to their microenvironment. In particular, the nodes represent the genes and the arcs represent the interactions among them. Genes have specific interactions according to their temporary functions, and can change their functions according to their interactions with different neighbors <a id="br5">[</a><a href="#r5">5</a>]. This implies that a functional analysis of genes that ignores their interactions is not correct. Therefore, cell signaling networks describe genes by considering their molecular functions and their interactions at the same time [6, 7, 8]. </span></font></font> </p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Cell signaling networks are studied in the context of human diseases, because understanding them can help to treat those diseases effectively. A signaling pathway describes a group of molecules in a cell. When the first molecule on a pathway receives a signal, it activates another molecule. This process is repeated until the last molecule is activated, and the cell function is performed. The abnormal activation of signaling pathways can lead to diseases such as cancer.</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">In general, gene networks in healthy people have the same list of genes as in sick people, but the connections are different, and therefore the phenotypes differ. 
The functional analysis of these networks goes beyond the capabilities of current analysis tools, which consider only the genes individually, without studying the link information. Thus, there is a great need to develop new analysis methods for biological networks that fully exploit the topological information of the network. In such networks, it is crucial to discover the communities of genes (dense clusters) present at a given time. This problem arises in many applications, such as social networks <a id="br9">[</a><a href="#r9">9</a>]. Several techniques for determining dense clusters have been proposed in the literature <a name="br10">[</a><a href="#r10">10</a><a name="br11">,</a><a href="#r11">11</a><a name="br12">,</a><a href="#r12">12</a><a name="br13">,</a><a href="#r13">13</a>].</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">This paper proposes the detection of gene clusters, taking into account the topological structure of the network in the signaling pathway. The gene clusters are characterized using graph theory measures, such as centrality and modularity. Once the clusters have been characterized, our approach continues with a semantic enrichment using GO. The main contribution of this paper is the application of graph theory and ontology mining to traditional biology, which normally focuses on studying individual parts of cell signaling pathways.</span></font></font></p>     ]]></body>
<body><![CDATA[<p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Our proposal is tested on two signaling pathway networks: the TGF-&beta; pathway and the Alzheimer's disease pathway. TGF-&beta; is a protein that controls cell proliferation and differentiation, and is also substantially involved in immunity and cancer. The TGF-&beta; signaling pathway modulates processes such as cell invasion, immune regulation, and microenvironment modification that cancer cells may exploit to their advantage. Alzheimer's disease (AD) is a chronic disorder that slowly destroys brain cells and causes severe cognitive disabilities. The study of the AD signaling pathway makes it possible, among other things, to analyze how the disease affects the cell functions.</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">This article has six sections. The first section is the introduction; the second section shows related works; the third section presents the theoretical basis of our proposal; the fourth section explains our approach. The fifth section presents two case studies. Finally, the last section presents the conclusions. </span></font></font> </p>      <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span style="font-variant: normal"><span lang="es-UY"><b>2. </b></span></span><span style="font-variant: normal"><span lang="en-US"><b>Related Work</b></span></span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Some works related to the semantic enrichment of gene networks are described below. 
In recent years, experimental techniques have been designed to detect cellular molecules, such as microarrays, RNA-Seq and mass spectrometry. To interpret their results biologically, genes are commonly grouped based on their similarities <a id="br14">[</a><a href="#r14">14</a>]. In particular, one way to determine the shared functions (functional similarity) between genes is to incorporate biological knowledge, using knowledge bases such as the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) <a name="br1">[</a><a href="#r1">1</a><a name="br15">,</a><a href="#r15">15</a>]. In this way, we can determine the prevailing biological themes in a collection of genes, and compare biological themes among groups of genes. Basically, that is what is proposed in <a id="br14">[</a><a href="#r14">14</a>] with &quot;ClusterProfiler&quot;, a tool to compare and visualize the functional profiles of groups of genes.</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Moreover, <a id="br16">[</a><a href="#r16">16</a>] proposes a method for analyzing protein-protein interactions (PPI). The purpose of this method is to detect molecular interactions that might be common manifestations of Colorectal Cancer (CRC). The method described in <a id="br16">[</a><a href="#r16">16</a>] consists of constructing a network from a set of publicly available protein databases, using mining applications. The network is characterized by its centrality values, in order to determine the regions of interest containing the main similarities between proteins. 
They find similar regions in the CRC networks, which helps to understand the molecular mechanisms of the disease <a id="br17">[</a><a href="#r17">17</a>].</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">On the other hand, NOA (Network Ontology Analysis) has been proposed as a resource for the semantic enrichment of signaling pathway networks <a id="br18">[</a><a href="#r18">18</a>]. NOA is an ontology of biological links, which assigns functions to the interactions based on the annotation of known genes. NOA can capture changes in biological functions through changes in the links of protein interaction networks, something not possible with other analysis techniques <a id="br18">[</a><a href="#r18">18</a>].</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">CePa is an R package for finding important pathways through the network topology <a id="br19">[</a><a href="#r19">19</a>]. The package has several advantages. First, it takes the pathway node, rather than the individual gene, as the basic unit of a more complex system of genes. Second, multiple network centrality measures are applied simultaneously to calculate the importance of the nodes from different aspects, giving a complete view of the biological system <a id="br19">[</a><a href="#r19">19</a>].</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Cytoscape and PSICQUIC are two tools for analyzing the interactome (the protein-protein interactions taking place in a cell) <a id="br20">[</a><a href="#r20">20</a>]. 
These tools use multiple repositories of protein interactions at the same time, and find topological groups within them. In addition, these groups are semantically enriched using GO <a name="br1">[</a><a href="#r1">1</a><a name="br20">,</a><a href="#r20">20</a>].</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Compared with the previous works, our proposal differs in three respects: it is based on the structure of the graph generated from the signaling pathway; it uses techniques from Social Network Analysis (SNA), specifically graph theory techniques for detecting clusters (known as communities in SNA) and the most significant genes; and it uses ontological mining techniques for the semantic enrichment of the clusters. </span></font></font> </p>      <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span style="font-variant: normal"><span lang="es-UY"><b>3. </b></span></span><span style="font-variant: normal"><span lang="en-US"><b>Theory</b></span></span></font></font></p>     ]]></body>
<body><![CDATA[<p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span style="font-variant: normal"><span lang="es-UY"><b>3.1 </b></span></span><span lang="en-US"><b>Signaling Pathway Networks</b></span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Signaling networks are a type of biological network that describes the communication system defining the cellular activities and the interactions among them. Signaling pathway networks are complex systems, and may exhibit a number of emergent properties. Signaling networks normally integrate protein-protein interaction networks with the cellular functions.</span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="es-UY"><i>3.1.1 </i></span><span lang="en-US"><i>General characteristics</i></span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">The activation of a pathway is a response to an external stimulus. For example, a cell A can activate a cell surface receptor that is part of a channel leading to a cell B. The connection between A and B may contain other cells, and the stimulus that activates A triggers a chain of activations that reaches B. The activated receptor must first interact with other proteins inside the cell before the ultimate physiological effect on the cell's behavior is produced. Often, the behavior of a chain of several interacting cell proteins is altered following receptor activation. 
The entire set of cell changes induced by the receptor activation is called a signal transduction mechanism or pathway <a name="br21">[</a><a href="#r21">21</a><a name="br22">,</a><a href="#r22">22</a>].</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Cell signaling research involves studying the spatial and temporal dynamics of both receptors and the components of the signaling pathways that those receptors activate <a id="br21">[</a><a href="#r21">21</a>]. Cell signaling networks have been extensively studied in the context of human diseases. They help to understand the transmission and flow of cellular information. Errors in cellular information processing are responsible for diseases such as autoimmunity, diabetes and cancer.</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="margin-top: 0.42cm; margin-bottom: 0.42cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="es-UY"><i>3.1.2 </i></span><span lang="en-US"><i>Components</i></span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="margin-top: 0.42cm; margin-bottom: 0.42cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">The signaling components are the stimulus, the receptor and the response, as shown in <a href="#f1">Fig. 1.A</a>. In some cases, there are proteins between the receptor and the response, and the information is transmitted through protein&ndash;protein interactions (see <a href="#f1">Fig. 1.B</a>). In some protein&ndash;protein interactions, a variety of functional scaffolds hold together the individual components of the signaling pathways, in order to create macromolecular signaling complexes (see <a href="#f1">Fig. 1.C</a>) <a id="br23">[</a><a href="#r23">23</a>]. 
</span></font></font> </p>     <p lang="es-ES" class="western" align="center" style="margin-top: 0.42cm; margin-bottom: 0.42cm"> <a name="f1"> <img src="/img/revistas/cleiej/v19n2/2a07f1.jpg"> </a>     <br> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><b>Figure 1.</b></span><span lang="en-US"> Components of a Signaling Pathway (Image extracted from <a id="br23">[</a><a href="#r23">23</a>]). </span></font></font> </p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="es-UY"><b>3.2 </b></span><span lang="en-US"><b>Graph Theory </b></span></font></font> </p>     ]]></body>
<body><![CDATA[<p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">One of the major problems of modern biology is the discovery of knowledge in large databases. Pathway networks and metabolic networks can usually be modelled as directed graphs. In a Signaling Pathway Network, the nodes represent cells, and the arcs denote the interactions between them. This is a directed graph because, if cell A regulates cell B, then there is a natural direction in the arc between the corresponding nodes, starting at A and finishing at B <a id="br24">[</a><a href="#r24">24</a>]. Graph theory is of interest in this work because it allows graphs to be described from two points of view: what are the communities of nodes in the graph, and what are the basic characteristics of each community?</span></font></font></p>     <p lang="en-US" class="western" align="justify">   <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">In the following, we present the main concepts used in this work to extract information from the Signaling Pathway Networks. </font></font> </p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 1.25cm; margin-top: 0.42cm; margin-bottom: 0.42cm">  <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><i>3.2.1 Modularity</i></span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Modularity is a measure of the network structure. It is designed to measure the strength of the division of a network into modules (also called groups or communities). Networks with high modularity have dense connections between nodes within modules, but few connections between nodes in different modules. Modularity is used to detect the community structure in networks. 
The modularity measure used in our work is <a id="br25">[</a><a href="#r25">25</a>]:</span></font></font>     <br>  <img src="/img/revistas/cleiej/v19n2/2a07z1.jpg">  (1)      <br></p>      <p lang="es-ES" class="western" align="justify">   <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">where L is the number of links in the network, A</span><sub><span lang="en-US">ij</span></sub><span lang="en-US"> is the (i, j) entry of the adjacency matrix, k</span><sub><span lang="en-US">i</span></sub><span lang="en-US"> is the degree of node i, and &delta;(c</span><sub><span lang="en-US">i</span></sub><span lang="en-US">,c</span><sub><span lang="en-US">j</span></sub><span lang="en-US">) equals 1 if nodes i and j belong to the same community, and 0 otherwise. </span></font></font> </p>     <p lang="es-ES" class="western" align="justify">   <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">This measure is of interest in our work because the concept of modularity in cell signaling pathways assumes that cellular functionality can be seamlessly partitioned into a collection of modules. Each module is a discrete entity made up of several elementary components, and performs an identifiable task that is distinct from the functions of the other modules. Identifying this task is very important for the biologist. 
For example, detecting the modules or groups, and enriching them semantically, makes it possible to define the function of each group, for instance, in a disease <a id="br26">[</a><a href="#r26">26</a>].</span></font></font></p>      <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="es-UY"><i>3.2.2 </i></span><span lang="en-US"><i>Centrality</i></span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">In graph theory and SNA, centrality refers to a measure of a node in a graph that determines its relative importance within the graph <a id="br27">[</a><a href="#r27">27</a>]. The centrality of a node can help to determine, for example, the impact of a gene involved in a series of reactions in a signaling pathway network. Several centrality metrics are used in this work: </span></font></font> </p> <ul> 	    ]]></body>
<body><![CDATA[<li>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">The degree centrality: it is the simplest measure of centrality. It is the number of links of a node <a id="br28">[</a><a href="#r28">28</a>]. For directed graphs, it can be divided into the centrality of the input degree and the centrality of the output degree. For example, for a given directed graph G=(V,E), the centrality of the output degree of a node V</span><sub><span lang="en-US">i</span></sub><span lang="en-US"> is:</span></font></font></p>     </ul>      <p class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"> <img src="/img/revistas/cleiej/v19n2/2a07z2.jpg"> (2) </font></font></p>      <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt" style="margin-left: 0.64cm">where <i>n</i> is the number of nodes in the network. </font></font> </p>  <ul> 	    <li>     <p lang="en-US" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">The betweenness centrality: it is a measure that quantifies the number of times a node acts as a bridge along the shortest path between two other nodes <a id="br28">[</a><a href="#r28">28</a>]. It is important because it identifies the critical nodes in the spread of a disease, or of an opinion in SNA. 
The betweenness of a vertex c<sub>i</sub> in a graph G:=(V,E) is:</font></font></p> 	    <p class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"> <img src="/img/revistas/cleiej/v19n2/2a07z3.jpg"> (3)</font></font></p>     </ul>     <p lang="es-ES" class="western" align="justify" style="margin-left: 0.64cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">where g<sub>jk</sub> is the number of shortest paths connecting nodes j and k, and g<sub>jk</sub>(i) is the number of those shortest paths that pass through the i</span><sup><span lang="en-US">th</span></sup><span lang="en-US"> node.</span></font></font></p>  <ul> 	    ]]></body>
<body><![CDATA[<li>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">The closeness centrality: it is based on the idea that &ldquo;an important node is close to, and can communicate quickly with, the rest of the nodes in the graph&rdquo;. It is a distance-based metric, computed from the lengths of the shortest paths between the node studied and every other node. It is defined by <a id="br29">[</a><a href="#r29">29</a>]:</span></font></font></p>     </ul>     <p class="western" align="justify" style="margin-left: 0.64cm"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><img src="/img/revistas/cleiej/v19n2/2a07z4.jpg"> (4)</font></font></p>     <p lang="es-ES" class="western" align="justify">       <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="es-UY">w</span><span lang="en-US">here d(c</span><sub><span lang="en-US">i</span></sub><span lang="en-US">,c</span><sub><span lang="en-US">j</span></sub><span lang="en-US">) is the length of the shortest path between nodes c</span><sub><span lang="en-US">i</span></sub><span lang="en-US"> and c</span><sub><span lang="en-US">j</span></sub><span lang="en-US">. </span></font></font> </p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.64cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">We use these measures of centrality to filter the most important nodes in each module/group/cluster. Other measures of centrality, which we do not use in this work, include <a id="br28">[</a><a href="#r28">28</a>]: eigenvector centrality, Katz centrality and PageRank, among others. 
Each one can be used to determine specific aspects of the nodes in a graph.</span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="es-UY"><b>3.3 </b></span><span lang="en-US"><b>Mining Techniques</b></span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">In this work we use several types of mining techniques, which are presented in the next sections.</span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="es-UY"><i>3.3.1 </i></span><span lang="en-US"><i>Semantic Mining and Ontological Mining</i></span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">The semantic mining is responsible for extracting semantic knowledge from different semantic sources, such as web pages, annotated graphs, and ontologies, among others. The semantic mining is divided into three groups <a name="br30">[</a><a href="#r30">30</a><a name="br31">,</a><a href="#r31">31</a>]: semantic data mining, web mining and ontological mining. The latter is the most interesting for this work. The Ontological Mining (OM) allows extracting knowledge from a set of ontologies <a id="br37">[</a><a href="#r37">37</a>]. Some of the OM techniques that have been developed are:</span></font></font></p>  <ul> 	    ]]></body>
<body><![CDATA[<li>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">The alignment of ontologies analyses the correspondence between the concepts of two or more ontologies <a name="br30">[</a><a href="#r30">30</a><a name="br32">,</a><a href="#r32">32</a>]. The alignment process is defined by the tuple: </span></font></font> 	</p>     </ul>     <p lang="en-US" class="western" align="center" style="margin-top: 0.42cm; margin-bottom: 0.42cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">A = (O1, O2, p, f)</font></font></p>     <p lang="es-ES" class="western" align="justify" style="margin-left: 0.75cm; text-indent: 0.25cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">where O1 and O2 are the ontologies to be aligned, p is the set of requirements (the ontology language (e.g. OWL), the vocabulary of concepts, among others), and f is the alignment function (normally a similarity function used to correlate the concepts). The set A symbolizes all the semantic correspondences between O1 and O2.</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="margin-left: 0.75cm; text-indent: 0.25cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">There are several alignment algorithms. For example, two are proposed in <a id="br32">[</a><a href="#r32">32</a>]: one called LMO (Linguistic Matching Ontology), based on linguistic similarity; and another called GMO (Graph Matching Ontology), based on the similarity of graphs (graph matching). 
An algorithm that automatically selects the best alignment technique for a given set of ontologies to be aligned is proposed in <a id="br30">[</a><a href="#r30">30</a>].</span></font></font></p> <ol start="3"> 	    <li>     <p lang="en-GB" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">The merging of ontologies is the process in which multiple ontologies in the same domain are joined in order to standardize knowledge, extend the knowledge, or have the full knowledge available locally, among others. The merging of ontologies raises several problems, such as the handling of the same knowledge with different representations, and the partial representation of the knowledge, among others. These problems require the presence of experts during the merging process to make decisions <a id="br33">[</a><a href="#r33">33</a>].</font></font></p> 	    <li>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-GB">The linking of ontologies can be performed once the correspondences between concepts have been identified, in order to support the navigation between ontologies.</span></font></font></p>     ]]></body>
<body><![CDATA[</ol>      <p lang="es-ES" class="western" align="justify" style="margin-left: 0.5cm; text-indent: 0.25cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Another concept in the semantic ontology domain is folksonomy, used to describe a system where users apply public tags to describe items, particularly online items, which allows building a social or collaborative classification.</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="margin-top: 0.42cm; margin-bottom: 0.42cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><i>3.3.2 Hierarchical Clustering</i></span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Hierarchical clustering is a data mining method of cluster analysis, which seeks to build a hierarchy of groups. Hierarchical clustering strategies generally fall into two types <a id="br13">[</a><a href="#r13">13</a>]:</span></font></font></p> <ul> 	    <li>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><i>Agglomerative: 	</i></span><span lang="en-US">This is a bottom-up approach that starts with each element in its own group, and pairs of groups are merged as one moves up the hierarchy</span><span lang="en-US"><i>.</i></span></font></font></p> 	    <li>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><i>Divisive: 	</i></span><span lang="en-US">This is a top-down approach that begins with a single group, and divisions are performed as one moves down the hierarchy</span><span lang="en-US"><i>.</i></span></font></font></p>     </ul>      <p lang="es-ES" class="western" align="justify"><font color="#0070c0">   </font><font face="Verdana, 
sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">We use the agglomerative approach because of its quality <a name="br13">[</a><a href="#r13">13</a><a name="br28">,</a><a href="#r28">28</a>]. Since computing the similarity of an individual to the centroid of a cluster from the characteristics of the individual is really difficult, we keep to a simple way of generating the clusters, namely agglomerative hierarchical clustering. In order to calculate the similarity, we need to define the attributes used for the comparison, and the similarity measure to use for each attribute, for each individual <a id="br38">[</a><a href="#r38">38</a>]. If we simply use a hierarchical agglomerative algorithm, the time complexity is O(m</span><sup><span lang="en-US">2</span></sup><span lang="en-US">). In the agglomerative case, an identifier is initially assigned to each node, and these nodes are then successively grouped up to the desired point: either a number of nodes per cluster or a maximum number of clusters. The hierarchy of clusters obtained is usually represented as an inverted tree, called a dendrogram, with successive mergers of the groups into top-level groups (larger and less uniform, see <a href="#f2">Fig. 2</a>).</span></font></font></p>     ]]></body>
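<body><![CDATA[<p lang="en-US" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">The agglomerative procedure described above (one identifier per node, then successive grouping until a maximum number of clusters is reached) can be illustrated with a minimal Python sketch. It is not the algorithm used in this work: as merge criterion it assumes the modularity of Section 3.2.1 in its usual form for undirected, unweighted graphs, and the function names <i>modularity</i>, <i>merge</i> and <i>agglomerate</i> are hypothetical names chosen for the illustration.</font></font></p>
<pre>
```python
def modularity(edges, community):
    """Modularity Q of a partition of an undirected, unweighted graph.

    Assumed form: Q = sum over communities of (links inside / L)
    minus (total degree of the community / 2L) squared.
    """
    num_links = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Fraction of links whose two ends lie in the same community.
    inside = sum(1 for u, v in edges if community[u] == community[v]) / num_links
    # Fraction expected if links were placed at random, keeping the degrees.
    total_degree = {}
    for node, k in degree.items():
        c = community[node]
        total_degree[c] = total_degree.get(c, 0) + k
    expected = sum((d / (2 * num_links)) ** 2 for d in total_degree.values())
    return inside - expected


def merge(community, keep, drop):
    """Return a copy of the partition where cluster `drop` is absorbed by `keep`."""
    return {n: (keep if c == drop else c) for n, c in community.items()}


def agglomerate(edges, max_clusters):
    """Greedy bottom-up grouping that stops when `max_clusters` clusters remain."""
    nodes = sorted({n for edge in edges for n in edge})
    community = {n: n for n in nodes}  # initially one cluster per node
    history = []  # successive merges, i.e. the dendrogram read bottom-up
    while len(set(community.values())) > max_clusters:
        # Only clusters joined by at least one link are candidates for merging.
        linked = sorted({tuple(sorted((community[u], community[v])))
                         for u, v in edges if community[u] != community[v]})
        if not linked:
            break  # disconnected graph: nothing left to merge
        # Pick the merge that gives the partition with the highest modularity.
        keep, drop = max(linked,
                         key=lambda p: modularity(edges, merge(community, *p)))
        community = merge(community, keep, drop)
        history.append((keep, drop))
    return community, history
```
</pre>
<p lang="en-US" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">For instance, on a toy network made of two triangles (a, b, c) and (d, e, f) joined by the single link c-d, calling <i>agglomerate</i> with a maximum of two clusters groups the nodes of each triangle together. The sketch recomputes the modularity from scratch for every candidate merge, which keeps it short but makes it unsuitable for large networks.</font></font></p>
]]></body>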
<body><![CDATA[<p lang="es-ES" class="western" align="center" style="margin-top: 0.21cm"> <a name="f2"> <img src="/img/revistas/cleiej/v19n2/2a07f2.jpg"> </a>     <br> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><b>Figure 2</b></span><span lang="en-US">: Hierarchical Clustering </span></font></font> </p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.25cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">The grouping is given by a function that defines the distance between clusters <a id="br28">[</a><a href="#r28">28</a>]. The choice of metric influences the shape of the clusters. Unlike the classical distance metrics (Manhattan, Euclidean or maximum distance) used by hierarchical clustering strategies, our hierarchical clustering is based on the modularity measure. Modularity is a scalar value between -1 and 1 that measures the density of arcs inside communities relative to the arcs between communities. Optimizing this value results, in theory, in the best possible grouping of the nodes of a given network. However, going through all possible partitions of the nodes into groups is impractical, so heuristic algorithms are used. In our hierarchical clustering, the first small communities (that is, the initial grouping of nodes) are found by optimizing modularity locally on all nodes in the leaves (for example, among clusters 2 and 3, and not among clusters 1 and 2 or 1 and 3); then each small community is grouped into one node, and this step is repeated up to a desired point. 
</span> </font></font> </p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.25cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">The hierarchical clustering algorithm is presented in <a id="br39">[</a><a href="#r39">39</a>]; when the modularity measure is introduced, it finds high-modularity partitions of large networks in a short time. The algorithm is divided into two phases. The first phase starts by assigning a different community to each node of the network. Then, for each i</span><sup><span lang="en-US">th</span></sup><span lang="en-US"> node and each of its j</span><sup><span lang="en-US">th</span></sup><span lang="en-US"> neighbors, we evaluate the gain of modularity (see eq. 1) that would result from removing i from its community and placing it in the community of j. The i</span><sup><span lang="en-US">th</span></sup><span lang="en-US"> node is placed in the community that yields the largest gain. This process is applied repeatedly to all nodes until no further improvement can be achieved, which completes the first phase. The second phase of the algorithm consists in building a new network whose nodes are the communities found during the first phase. The weights of the links between the new nodes are given by the sum of the weights of the links between nodes in the corresponding two communities <a id="br40">[</a><a href="#r40">40</a>]. Once this second phase is completed, the first phase can be reapplied to the resulting weighted network, until there are no more changes and a maximum of modularity is attained, or until a maximum number of clusters or of nodes per cluster is reached (this is the desired point). 
The hierarchical clustering used in this study has the following macro-algorithm:</span></font></font></p>     <p lang="en-US" class="western" align="justify">    <br>  </p>  <a name="f3"> <img src="/img/revistas/cleiej/v19n2/2a07f3.jpg"> </a>     <br>       <p lang="en-US" class="western" align="justify">   <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">The macro-algorithm initially assigns a cluster to each node (step 1); it then groups the nodes that are closest, using the maximum distance metric (step 2.1). Then, a new partition (cluster) is formed (step 2.2). Steps 2.1 and 2.2 are repeated until the desired conditions are reached (maximum number of clusters or number of nodes per cluster).</font></font></p>       <p lang="en-US" class="western" align="justify" style="margin-top: 0.42cm; margin-bottom: 0.42cm"> 		<font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><b>3.4 Gene 		Ontology</b></font></font></p>      <p lang="es-VE" align="justify" style="margin-top: 0.49cm; margin-bottom: 0.49cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">The Gene Ontology (GO) project <a id="br41">[</a><a href="#r41">41</a>] provides structured, controlled vocabularies and classifications, which cover several domains of molecular and cellular biology. They are freely available for community use <a id="br1">[</a><a href="#r1">1</a>]. Many biological databases and genome annotation groups use the GO and contribute to the GO project. The GO database integrates the vocabularies and provides full access to this information in several formats. The GO Web resource also provides access to extensive documentation about the GO project, and links to applications that use GO data for functional analyses. The GO ontology is a directed acyclic graph, where each term has relationships to one or more terms in the same domain, and sometimes to terms in other domains. 
In this graph it can be found:</span></font></font></p> <ul> 	    ]]></body>
<body><![CDATA[<li>     <p lang="en-US" align="left" style="line-height: 0.39cm; page-break-after: avoid"> 	<font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">Cellular Component: it covers the parts of a cell or its extracellular environment. It describes a component of a cell that is part of a larger object, such as an anatomical structure.</font></font></p> 	    <li>     <p lang="en-US" align="left" style="line-height: 0.39cm; page-break-after: avoid"> 	<font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">Biological Process: it describes the operations or sets of molecular events, with a defined beginning and end, pertinent to the functioning of integrated living units: cells, tissues, organs, and organisms. An example of a biological process is &quot;signal transduction&quot;. A biological process is not equivalent to a pathway. </font></font> 	</p> 	    <li>     <p lang="en-US" align="left" style="line-height: 0.39cm; page-break-after: avoid"> 	<font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">Molecular Function: it is the elemental activity of a gene. Molecular functions generally correspond to activities that can be performed by individual genes, but some activities are performed by assembled complexes of genes.</font></font></p>     </ul>      <p lang="es-ES" class="western" align="justify" style="text-indent: 0.64cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">A variety of browsers that provide visualization and query capabilities for the GO are available <a id="br1">[</a><a href="#r1">1</a>]. For example, the AmiGO browser (developed by the GO software group at Berkeley; see <a href="http://www.godatabase.org/cgi-bin/go.cgi">http://www.godatabase.org/cgi-bin/go.cgi</a>) provides a web interface for searching and displaying the ontologies and the organism databases developed in the GO project. 
AmiGO allows users to easily browse and search for terms, using a variety of keys such as names, synonyms, definitions and numerical identifiers, among others. The summary view presents the list of genes associated with each term.</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.64cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">PANTHER (Protein ANalysis THrough Evolutionary Relationships) is a tool for extracting knowledge from GO <a name="br1">[</a><a href="#r1">1</a><a name="br35">,</a><a href="#r35">35</a>]. PANTHER receives a gene identifier, and returns the semantic content of the gene. It is a library of protein families and subfamilies, indexed by function <a id="br35">[</a><a href="#r35">35</a>]. In this way, PANTHER is a classification system of proteins designed to facilitate high-throughput analysis. PANTHER has a method for relating protein sequences to functions in a robust and accurate way. The proteins have been classified according to: </span></font></font> </p> <ul> 	    <li>     ]]></body>
<body><![CDATA[<p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Family 	and subfamily: groups of proteins that have the same function. </span></font></font> 	</p> 	    <li>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Molecular 	function.</span></font></font></p> 	    <li>     <p lang="en-US" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">Biological 	process. </font></font> 	</p> 	    <li>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Pathway. 	</span></font></font> 	</p>     </ul>      <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="es-UY"><b>4. </b></span><span lang="en-US"><b>Our Approach</b></span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">The main component of our approach is the next macro-algorithm, which detects the clusters within a signaling pathway network by using one of the most successful solutions for the communities&rsquo; detection problem <a id="br40">[</a><a href="#r40">40</a>], based on the modularity measure <a id="br39">[</a><a href="#r39">39</a>], and enriched it using the GO.</span></font></font></p>      ]]></body>
<body><![CDATA[<br> <a name="f4"> <img src="/img/revistas/cleiej/v19n2/2a07f4.jpg"> </a>     <br>      <p lang="es-ES" class="western" align="justify">   <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">The macro-algorithm works as follows. The first step is to bring the signaling pathway network to the desired format. Normally, it is received in OWL (Web Ontology Language) format, and must be transformed to a traditional network format in order to be analyzed by an SNA tool (in our case, Gephi <a id="br36">[</a><a href="#r36">36</a>], see step 2). The formats supported by the tool include NET, DOT and CSV. Then, the modularity of all nodes is calculated using equation 1 (step 3).</span><font color="#0070c0"><span lang="en-US"> </span></font></font></font> </p>     <p lang="es-ES" class="western" align="justify"><font color="#0070c0">   </font><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Each community is defined in step 4 by using our hierarchical clustering algorithm (see section 3.3.2 for more details). These first steps are carried out for a given signaling pathway network, such as the TGF-&beta; (see <a href="#f5">Figure 3</a>). The hypothetical result is shown in <a href="#f6">Figure 4</a>, where the clusters are represented by circles. 
They define the dendrogram, as shown in <a href="#f2">Figure 2</a>.</span></font></font></p>     <p lang="es-ES" class="western" align="center" style="margin-top: 0.21cm"> <a name="f5"> <img src="/img/revistas/cleiej/v19n2/2a07f5.jpg"> </a>     <br> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><b>Figure 3</b></span><span lang="en-US">: The signaling pathway network of TGF-&beta;</span></font></font></p>     <p lang="es-ES" class="western" align="center" style="margin-top: 0.21cm"> <a name="f6"> <img src="/img/revistas/cleiej/v19n2/2a07f6.jpg"> </a>     <br> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><b>Figure 4</b></span><span lang="en-US">: Clusters for the signaling pathway network of TGF-&beta;</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm; margin-top: 0.21cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Then, the centroids are extracted for each cluster (step 5). For this study, the centroids are taken as equivalent to the central nodes, as determined by the centrality measures of degree, closeness and betweenness. In this way, we merge the concepts of centroid of a cluster and central nodes of a community. A negative consequence is that we will not have a unique centroid. The positive consequence is that we do not need just one: having more than one node to represent a cluster is helpful, because it allows more knowledge to be extracted from the GO to enrich the cluster. Initially, the central nodes are defined by the centrality measures of degree and closeness, because these are the nodes that can be reached quickly from the rest of the nodes in the network. 
If these centralities are not enough to identify the central nodes in each cluster, the betweenness centrality is used as a second filter. This is done because, in some networks, it is not possible to detect a group of interesting nodes using the closeness centrality alone.</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Next, the central nodes are passed to a semantic enrichment step (step 6). This is done by using the PANTHER tool from the GO consortium. In this work, the nodes that are passed to PANTHER are the highly central nodes of each cluster, filtered using the degree, closeness and betweenness centralities. The semantic information returned is extrapolated to the cluster to which each central node belongs (step 7). The query using the PANTHER tool is an OM task (alignment and linking of the central nodes of the clusters with GO) for the semantic enrichment of the clusters.</span></font></font></p>     ]]></body>
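<body><![CDATA[<p lang="en-US" class="western" align="justify" style="text-indent: 0.5cm"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">The selection of central nodes (step 5) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from this work: the network is treated as undirected and unweighted, closeness is computed as the number of reachable nodes divided by the sum of their distances, betweenness is computed with Brandes' algorithm, and the way the filters are combined (highest degree, then highest closeness, then betweenness to break ties) is only one possible reading of the two filters. All function names are hypothetical. The query to PANTHER (step 6) is not part of the sketch.</font></font></p>
<pre>
```python
from collections import deque


def adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) links."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Number of links of each node."""
    return {u: len(adj[u]) for u in adj}


def closeness_centrality(adj):
    """Reachable nodes divided by the sum of shortest-path lengths to them."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(adj):
    """Brandes' algorithm for unweighted, undirected graphs."""
    score = {u: 0.0 for u in adj}
    for s in adj:
        order, preds = [], {u: [] for u in adj}
        sigma = {u: 0 for u in adj}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {u: 0.0 for u in adj}
        for v in reversed(order):  # accumulate dependencies, farthest first
            for u in preds[v]:
                delta[u] += sigma[u] / sigma[v] * (1 + delta[v])
            if v != s:
                score[v] += delta[v]
    # Every unordered pair of end nodes was counted twice.
    return {u: b / 2 for u, b in score.items()}


def central_nodes(adj, clusters):
    """Pick the central node(s) of each cluster (hypothetical filter rule)."""
    degree = degree_centrality(adj)
    closeness = closeness_centrality(adj)
    betweenness = betweenness_centrality(adj)
    selected = {}
    for name, members in clusters.items():
        # First filter: highest degree, ties resolved by highest closeness.
        best = max((degree[u], closeness[u]) for u in members)
        candidates = [u for u in members if (degree[u], closeness[u]) == best]
        # Second filter: betweenness, used only if several candidates remain.
        if len(candidates) > 1:
            top = max(betweenness[u] for u in candidates)
            candidates = [u for u in candidates if betweenness[u] == top]
        selected[name] = sorted(candidates)
    return selected
```
</pre>
<p lang="en-US" class="western" align="justify" style="text-indent: 0.5cm"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">On a toy network made of two triangles (a, b, c) and (d, e, f) joined by the single link c-d, with each triangle taken as a cluster, the sketch selects c and d, the two nodes that bridge the clusters. The centralities are computed over the whole network and only the ranking is done per cluster; a cluster may keep more than one central node when all three measures tie.</font></font></p>
]]></body>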
<body><![CDATA[<p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm; margin-top: 0.21cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Step 6 is based on the alignment of the Gene Ontology with the most central nodes inside each cluster, in order to further enrich each cluster within the pathway with the knowledge that can be incorporated from the Gene Ontology. Then, the Gene Ontology is merged with these central nodes of each cluster. For some authors, this can be understood as a folksonomy, because our system uses public tags to enrich its clusters.</span></font></font></p>      <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span style="font-variant: normal"><span lang="es-UY"><b>5. </b></span></span><span style="font-variant: normal"><span lang="en-US"><b>Case Studies</b></span></span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span style="font-variant: normal"><span lang="es-UY"><b>5.1 </b></span></span><span lang="en-US"><b>TGF-&beta;</b></span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">TGF-&beta; is a protein that controls cell proliferation and is implicated in cancer. This study allows biologists to detect biological functions specific to cancer cell proliferation. The network used is shown in Fig. 5, in a network format that Gephi supports. 
The network has 1534 nodes or genes, and 3029 reactions or relations between them.</span></font></font></p>     <p lang="es-ES" class="western" align="center" style="margin-top: 0.21cm"> <a name="f7"> <img src="/img/revistas/cleiej/v19n2/2a07f7.jpg"> </a>     <br> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><b>Figure 5</b></span><span lang="en-US">: Gephi Clusters </span></font></font> </p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Once the modularity was calculated, 16 communities were found, which for this work represent 16 clusters of genes. In the calculation of centrality for all genes in the network, the closeness centrality was of particular interest because, for this type of network, the critical nodes are the ones that may be causing a disease, in this case cancer. Nodes with high closeness centrality will spread a disease faster, or are responsible for triggering a series of reactions that lead to the disease. The central nodes are shown in Fig. 6, where they are drawn larger than the nodes with lower centrality (the size is proportional to the measure of centrality). 
Our approach calculates the degree and closeness centralities as a first filter, and the betweenness centrality when the other centralities give too many central nodes, in order to reduce the number of central nodes</span><font color="#0070c0"><span lang="en-US">.</span></font><span lang="en-US"> </span></font></font> </p>     <p lang="es-ES" class="western" align="center" style="margin-top: 0.21cm"> <a name="f8"> <img src="/img/revistas/cleiej/v19n2/2a07f8.jpg"> </a>     <br> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><b>Figure 6</b></span><span lang="en-US">: Network view with central nodes shown larger </span></font></font> </p>     <p lang="en-US" class="western" align="justify">    ]]></body>
<body><![CDATA[<br>  </p>     <p lang="en-US" class="western" align="justify" style="text-indent: 0.5cm; margin-bottom: 0.42cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">As described in the algorithm, the closeness centrality is calculated (<a href="#f9">Fig. 7</a> shows only the nodes of <a href="#f8">Fig. 6</a> with high closeness centrality). These nodes are potentially critical genes in the development of cancer, and identifying them gives biologists a better understanding of the network.</font></font></p>     <p lang="es-ES" class="western" align="center" style="margin-top: 0.21cm">  <a name="f9"> <img src="/img/revistas/cleiej/v19n2/2a07f9.jpg"> </a>     <br> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><b>Figure 7</b></span><span lang="en-US">: Network view showing only central nodes </span></font></font> </p>      <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Times New Roman, serif"><font size="3" style="font-size: 12pt"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><a href="#t1">Table 1</a> shows the output provided by Gephi, where </span></font></font><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><i>label</i></span></font></font><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"> is the identifier of the gene, </span></font></font><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><i>degree</i></span></font></font><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"> is the degree centrality of the node or gene in the network, </span></font></font><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><i>closeness centrality</i></span></font></font><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span 
lang="en-US"> is the closeness (proximity) value of the gene, and </span></font></font><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><i>id of the cluster</i></span></font></font><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"> is the number of the community to which each node belongs. In this case, the betweenness centrality is not needed, because the degree and closeness centralities already provide enough nodes from each cluster to enrich the clusters</span></font></font><font color="#0070c0"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">.</span></font></font></font></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Times New Roman, serif"><font size="3" style="font-size: 12pt"><font size="3" style="font-size: 12pt"><b><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">Table 1:</font></font></b></font><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"> Gephi TGF-&beta; output</font></font></font></font>     <br> <a name="t1"> <img src="/img/revistas/cleiej/v19n2/2a07t1.jpg">  </a>     <br> </p>      <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><a href="#t1">Table 1</a> is a reduced version of the full table, showing only 5 of the most central nodes, which belong to two different clusters. The genes _:A615, _:A617 and _:A1091 belong to cluster 7 and have the highest closeness centrality in the entire network. The genes _:A1092 and _:A664 belong to cluster 4 and are its most central nodes (by closeness centrality). We used the Silhouette Coefficient as the performance metric to measure the quality of the clustering process. 
The Silhouette Coefficient of a node </span><span lang="en-US"><i>i</i></span><span lang="en-US"> is <a id="br13">[</a><a href="#r13">13</a>]:</span></font></font></p>      <p class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><img src="/img/revistas/cleiej/v19n2/2a07z6.jpg">  (6)     ]]></body>
<body><![CDATA[<br></font></font></p>      <p lang="es-ES" class="western" align="justify">   <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">where a(i) is the mean distance between the i</span><sup><span lang="en-US">th</span></sup><span lang="en-US"> node and all other nodes in the same cluster, and b(i) is the mean distance between the i</span><sup><span lang="en-US">th</span></sup><span lang="en-US"> node and all nodes in any cluster not containing the node. We compute the average Silhouette Coefficient over all nodes as an overall measure of the quality of the clustering process. The result is 0.9214, which is very good (values closer to 1 are better). This serves as a check that our approach gives good results.</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">The next step is the semantic enrichment of the data in <a href="#t1">Table 1</a>, which is performed with the PANTHER tool. The list of gene identifiers is given as input to PANTHER, and the output of the tool can be seen in <a href="#f10">Fig. 8</a>, where every term refers to a GO concept.</span></font></font></p>     <p lang="es-ES" class="western" align="center" style="margin-top: 0.21cm"> <font face="Verdana, sans-serif"> <a name="f10"> <img src="/img/revistas/cleiej/v19n2/2a07f10.jpg"> </a>     <br> <font size="2" style="font-size: 10pt"><span lang="en-US"><b>Figure 8</b></span><span lang="en-US">: Semantic enrichment of the nodes, extracted from PANTHER</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.64cm; margin-top: 0.21cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">In <a href="#f10">Fig. 
8</a>, the genes of <a href="#t1">Table 1</a> are listed with the semantic content extracted from GO. Each GO term listed has a unique alphanumeric identifier, which leads to a definition with cited sources, and a namespace indicating the domain to which it belongs. The terms may also have synonyms, references to equivalent concepts in other databases, and comments about the meaning or usage of the term. Biologists can use this information to know exactly which genes, proteins or reactions are the critical nodes in a pathway. These nodes are essential in the proliferation of the disease.</span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="es-UY"><b>5.2 </b></span><span lang="en-US"><b>Alzheimer Disease</b></span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">This experiment analyzes the pathway of genes and other processes associated with Alzheimer's disease. An adaptation from KEGG 2011 (see <a id="br15">[</a><a href="#r15">15</a>] and <a href="http://www.genome.jp/kegg/pathway/hsa/hsa05010.html">http://www.genome.jp/kegg/pathway/hsa/hsa05010.html</a> for details) was used. This network has 2537 nodes or genes, and 5816 reactions or relationships between them.</span></font></font></p>     <p lang="es-ES" class="western" align="justify">   <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">As in the case of TGF-&beta;,</span><span lang="en-US"> the closeness centrality is calculated</span><span lang="en-US">. <a href="#f11">Fig. 
9</a> shows the network after applying a filter that keeps only the nodes with high closeness centrality; these nodes are the genes that can propagate a disease faster than other nodes</span><font color="#0070c0"><span lang="en-US">.</span></font></font></font></p>     <p lang="es-ES" class="western" align="center" style="margin-top: 0.21cm"> <a name="f11"> <img src="/img/revistas/cleiej/v19n2/2a07f11.jpg"> </a>     ]]></body>
<body><![CDATA[<br><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><b>Figure 9</b></span><span lang="en-US">: Alzheimer Network view, showing only central nodes </span></font></font> </p>      <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">A second filter keeps only the genes with a high in-degree and out-degree; the resulting network is illustrated in <a href="#f12">Fig. 10</a>. This yields the most representative nodes of each cluster: these genes can propagate the disease faster (closeness centrality) and can also spread it to a large number of nodes at the same time (degree centrality). For these two reasons, they are critical nodes in the network.</span></font></font></p>     <p lang="es-ES" class="western" align="center" style="margin-top: 0.21cm"> <a name="f12"> <img src="/img/revistas/cleiej/v19n2/2a07f12.jpg"> </a>     <br> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><b>Figure 10</b></span><span lang="en-US">: Alzheimer Network view, with the main central nodes</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm; margin-top: 0.21cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><a href="#t2">Table 2</a> shows the output provided by Gephi. <a href="#t2">Table 2</a>, like <a href="#t1">Table 1</a>, is a reduced version of the full data table, which contains thousands of genes. Only 5 of the central nodes are shown, belonging to three different clusters (0, 3 and 8). These genes have the highest closeness centrality in the graph. The Silhouette Coefficient value, using eq. 
6, is 0.9067 in this case, which indicates that the quality of the clustering process is good.</span></font></font></p>     <p lang="en-US" align="center" style="margin-bottom: 0.18cm; line-height: 0.39cm; page-break-after: avoid"> <font face="Times New Roman, serif"><font size="2" style="font-size: 9pt"><font size="3" style="font-size: 12pt"><b><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">Table 2:</font></font></b></font><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"> Gephi Alzheimer output</font></font></font></font>     <br> <a name="t2"> <img src="/img/revistas/cleiej/v19n2/2a07t2.jpg">  </a>     <br> </p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm; margin-top: 0.21cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">The next step is the semantic enrichment of the data in <a href="#t2">Table 2</a>. As in the previous case, it is performed with the PANTHER tool. The output of the tool can be seen in <a href="#f13">Fig. 11</a>, where every term refers to a GO concept. <a href="#f13">Fig. 11</a> lists the genes of <a href="#t2">Table 2</a> with the semantic content mapped from GO. In this case, PANTHER also gives links to the respective family of each gene in the ontology. This type of output, besides giving semantic content to the nodes (genes and proteins), adds macro-semantic content, namely the family and the kind of node. Biologists can use it to analyse the critical genes in a pathway (those essential in the spread of diseases or in other biological reactions), their family and their protein class.</span></font></font></p>     <p lang="es-ES" class="western" align="center" style="margin-top: 0.21cm; margin-bottom: 0.42cm"> <a name="f13"> <img src="/img/revistas/cleiej/v19n2/2a07f13.jpg"> </a>     ]]></body>
<body><![CDATA[<br> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><b>Figure 11</b></span><span lang="en-US">: Alzheimer nodes, with semantic content extracted from PANTHER</span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="es-UY"><b>5.3 </b></span><span lang="en-US"><b>Comparison with similar approaches</b></span></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">The criteria used for comparison with other approaches are: the theoretical basis of the study, the type of problem solved, and the output given to the user. </span></font></font> </p>     <p lang="en-US" align="center" style="margin-top: 0.42cm; margin-bottom: 0.18cm; line-height: 0.39cm; page-break-after: avoid"> <font face="Times New Roman, serif"><font size="2" style="font-size: 9pt"><font size="3" style="font-size: 12pt"><b><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">Table 3:</font></font></b></font><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"> Comparison with previous work</font></font></font></font>     <br> <a name="t3"> <img src="/img/revistas/cleiej/v19n2/2a07t3.jpg">  </a>     <br> </p>      <p lang="es-ES" class="western" align="justify" style="text-indent: 0.64cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">According to <a href="#t3">Table 3</a>, similar approaches usually perform the analysis using only graph theory or only ontologies. Our approach is the only one that combines both in a semantic enrichment process. Our proposal is based on the structure of the graph generated from the signaling pathway, and the graph is analyzed using SNA techniques to detect clusters (communities). 
Additionally, it identifies the most central nodes in each cluster, and it uses OM techniques (alignment and linking of the most central nodes of the clusters) for the semantic enrichment of the nodes (proteins) in the clusters. Our approach can use any gene ontology, not only GO, in the semantic enrichment process, and can be applied to any signaling pathway, as we have shown in two case studies.</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Times New Roman, serif"><font size="3" style="font-size: 12pt"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">Additionally, we use the silhouette measure to determine how similar an object is to its own cluster compared to other clusters (see eq. 6). The silhouette ranges from -1 to 1, where a high value indicates that the object is well matched to its own cluster and poorly matched to neighboring clusters. If most objects have a high value, the clustering configuration is appropriate; otherwise, the configuration may have too many or too few clusters. The Silhouette Coefficient is 0.9214 for the first case and 0.9067 for the second case; that is, the quality of the clustering process is very good with respect to previous works (see previous sections).</span></font></font></font></font></p>      <p lang="es-ES" class="western" align="justify"><font face="Times New Roman, serif"><font size="3" style="font-size: 12pt"><span style="font-variant: normal"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="es-UY"><b>6. 
</b></span></font></font></span><span style="font-variant: normal"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US"><b>Conclusions</b></span></font></font></span></font></font></p>     <p lang="en-US" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">This paper proposed the use of clustering techniques originally developed for SNA to detect communities or groups in signaling pathway networks. The main contribution, with respect to other signaling pathway analysis techniques, is that our approach does not use traditional clustering techniques but techniques from another area (SNA). It uses modularity to define the clusters, and then uses the centrality of the nodes to link them with semantic knowledge and to characterize the biological cluster to which each central node belongs. In particular, centrality identifies the central nodes within groups without studying the characteristics of the nodes, considering only their structure and connectivity. This is sufficient to determine the critical nodes in a community. </font></font> </p>     ]]></body>
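<body><![CDATA[<p lang="en-US" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">The purely structural nature of this step can be illustrated with a minimal Python sketch. It is not the implementation used in this work (the results reported in the previous sections were obtained with Gephi): the toy graph, the community labels and the function names are hypothetical, and closeness is taken here as the number of reachable nodes divided by the sum of shortest-path distances, which may differ from the normalization applied by Gephi.</font></font></p>
<pre>
```python
from collections import deque

# Hypothetical toy network (NOT data from the paper): undirected, unweighted,
# stored as {node: set of neighbours}.
TOY_GRAPH = {
    "g1": {"g2"},
    "g2": {"g1", "g3"},
    "g3": {"g2", "g4"},
    "g4": {"g3", "g5", "g6"},
    "g5": {"g4"},
    "g6": {"g4"},
}

# Hypothetical community labels, standing in for the output of a
# modularity-based community detection step.
TOY_COMMUNITIES = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1, "g6": 1}


def closeness_centrality(adjacency, node):
    """Return (reachable nodes) / (sum of shortest-path distances) for `node`.

    Only the connectivity of the graph is used; no node attribute is read.
    """
    distances = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search starting at `node`
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in distances:
                distances[neighbour] = distances[current] + 1
                queue.append(neighbour)
    total = sum(distances.values())
    if total == 0:  # isolated node: no other node is reachable
        return 0.0
    return (len(distances) - 1) / total


def most_central_per_community(adjacency, community_of):
    """Return {community id: node with the highest closeness in it}."""
    best = {}
    for node in sorted(adjacency):  # sorted for a deterministic tie-break
        score = closeness_centrality(adjacency, node)
        community = community_of[node]
        if community not in best or score > best[community][1]:
            best[community] = (node, score)
    return {community: node for community, (node, _) in best.items()}
```
</pre>
<p lang="en-US" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt">On this toy graph, calling most_central_per_community(TOY_GRAPH, TOY_COMMUNITIES) selects g3 for community 0 and g4 for community 1, that is, the two nodes that connect the communities. Only these selected nodes would then be passed to the semantic enrichment step.</font></font></p>     ]]></body>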
<body><![CDATA[<p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">In particular, using measures from graph theory makes it possible to extend the analysis of biological networks with other concepts, in order to understand them better. Future work must explore the relationship between these measures and the biology, and analyse the interest of using other metrics from graph theory, such as eigenvector centrality, Katz centrality and PageRank.</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">On the other hand, we have used GO for the semantic enrichment, through the PANTHER query engine, which semantically enriches the most central nodes in each group. This is valuable to biologists because it gives precise biological information about the nodes that are critical in the spread of a disease. Our approach does not depend on this ontology; other ontologies, or a mix of them, can be used.</span></font></font></p>     <p lang="es-ES" class="western" align="justify" style="text-indent: 0.5cm; margin-bottom: 0.21cm"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">One line of future work is an application that integrates all these tools. 
Also, future studies should analyse a process of semantic enrichment from multiple ontological sources (this task will require other OM tasks, such as ontology merging).</span></font></font></p>     <p lang="en-US" align="left" style="margin-right: 0.02cm; margin-top: 0.42cm; margin-bottom: 0.21cm; page-break-inside: avoid; orphans: 0; widows: 0; page-break-after: avoid"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><b>Acknowledgements</b></font></font></p>     <p lang="es-ES" class="western" align="justify"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><span lang="en-US">This work was financially supported by Project CDCHTA I - 1407-14 - 02 - B of the Universidad de Los Andes. Dr. Aguilar has been partially funded by the Prometeo Project from the Ministry of Higher Education, Science, Technology and Innovation of the Republic of Ecuador.  </span></font></font> </p>     <p lang="en-US" align="left" style="margin-right: 0.02cm; margin-top: 0.42cm; margin-bottom: 0.21cm; page-break-inside: avoid; orphans: 0; widows: 0; page-break-after: avoid"> <font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><b>References</b></font></font></p>      <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r1">[</a><a href="#br1">1</a>] M. Harris, J. Clark, A. Ireland, J. Lomax, M. Ashburner, R. Foulger, et al. &ldquo;The Gene Ontology (GO) database and informatics resource&rdquo;. Nucleic Acids Research. vol. 32, pp. D258-D261, 2004. <a href="http://geneontology.org/">http://geneontology.org/</a>.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r2">[</a><a href="#br2">2</a>] U. Stelzl, U. Worm, M. Lalowski, C. Haenig, F. Brembeck, H. Goehler, M. Stroedicke, et al. 
&ldquo;Human protein&ndash;protein interaction network: a resource for annotating the proteome&rdquo;. Cell, vol. 122, pp. 957&ndash;968, 2005.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r3">[</a><a href="#br3">3</a>] J. Hasty, D. McMillen, F. Isaacs, and J. Collins, &ldquo;Computational studies of gene regulatory networks: in numero molecular biology&rdquo;. Nat. Rev. Genet., vol. 2, pp. 268&ndash;279, 2001.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r4">[</a><a href="#br4">4</a>] E. Ravasz, A. Somera, D. Mongru, Z. Oltvai and A. Barabasi. &ldquo;Hierarchical organization of modularity in metabolic networks&rdquo;. Science, vol. 297, pp. 1551&ndash;1555, 2002.</font></font></p>     ]]></body>
<body><![CDATA[<p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r5">[</a><a href="#br5">5</a>] H. Kitano, &ldquo;Systems biology: a brief overview&rdquo;. Science, vol. 295, pp. 1662&ndash;1664, 2002.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r6">[</a><a href="#br6">6</a>] A. Barabasi and Z. Oltvai, &ldquo;Network biology: understanding the cell&rsquo;s functional organization&rdquo;. Nature Reviews Genetics, vol. 5, pp. 101&ndash;113, 2004.</font></font></p>     <!-- ref --><p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r7">[</a><a href="#br7">7</a>] L. Chen, R. Wang and S. Zhang, Biomolecular Networks: Methods and Applications in Systems Biology. John Wiley &amp; Sons, 2009.    </font></font></p>     <!-- ref --><p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r8">[</a><a href="#br8">8</a>] L. Chen, R. Wang and K. Aihara, Modeling Biomolecular Networks in Cells: Structures and Dynamics. London, Springer, 2010.    </font></font></p>     <!-- ref --><p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r9">[</a><a href="#br9">9</a>] C. Aggarwal, and H. Wang, Managing and Mining Graph Data. Advances in Database Systems, Springer, 2010.    </font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r10">[</a><a href="#br10">10</a>] R. Agrawal, and R. Srikant, &ldquo;Fast algorithms for mining association rules in large databases&rdquo;, in Proc. 20th International Conference on Very Large Data Bases, San Francisco, CA, USA, 1994, pp. 
487-499.</font></font></p>     <!-- ref --><p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r11">[</a><a href="#br11">11</a>] S. Agrawal, S. Chaudhuri and G. Das, &quot;DBXplorer: a system for keyword-based search over relational databases,&quot; in Proc. 18th International Conference on Data Engineering, San Jose, CA, USA, 2002, pp. 5-16.    </font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r12">[</a><a href="#br12">12</a>] S. Bhagat, G. Cormode, and I. Rozenbaum, &ldquo;Applying link-based classification to label blogs&rdquo;, in Proc. 2007 workshop on Web mining and social network analysis, New York, NY, USA, 2007, pp. 92-101.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r13">[</a><a href="#br13">13</a>] J. Aguilar, &ldquo;Resolution of the clustering problem using genetic algorithms&rdquo;, International Journal of Computers, vol. 1, pp. 237-244, 2007.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r14">[</a><a href="#br14">14</a>] G. Yu, L. Wang, Y. Han, and Q. He, &ldquo;ClusterProfiler: an R package for comparing biological themes among gene clusters&rdquo;. OMICS: Journal of Integrative Biology, vol. 16, pp. 284-287, 2012.</font></font></p>     <!-- ref --><p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r15">[</a><a href="#br15">15</a>] KEGG: Kyoto Encyclopedia of Genes and Genomes, <a href="http://www.genome.jp/kegg/">http://www.genome.jp/kegg/</a></font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r16">[</a><a href="#br16">16</a>] M. 
Bux, U. Leser, and T. Philippe, &ldquo;Comparing semantically enriched experimental protein networks in colorectal cancer&rdquo;. Humboldt Universit&auml;t zu Berlin, 2012.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r17">[</a><a href="#br17">17</a>] H. Chuang, E. Lee, T. Liu, D. Lee, and T. Ideker, &ldquo;Network-based classification of breast cancer metastasis&rdquo;. Molecular Systems Biology, vol. 3, pp. 140, 2007.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r18">[</a><a href="#br18">18</a>] J. Wang, Q. Huang, Z. Liu, Y. Wang, L. Wu, L. Chen, and X. Zhang. &ldquo;NOA: a novel Network Ontology Analysis method&rdquo;. Nucleic Acids Research, vol. 39, 2011.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r19">[</a><a href="#br19">19</a>] Z. Gu, and J. Wang. &ldquo;CePa: an R package for finding significant pathways weighted by multiple network centralities&rdquo;. Bioinformatics Applications Note, vol. 29, pp. 658&ndash;660, 2013. </font></font> </p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r20">[</a><a href="#br20">20</a>] P. Porras, &ldquo;Network generation and analysis through Cytoscape and PSICQUIC&rdquo;. EMBL-EBI, vol. 6. Cambridge, U.K. <a href="https://www.ebi.ac.uk/sites/ebi.ac.uk/files/content.ebi.ac.uk/materials/2013/130702_San_Michele/biolnetworksanalysis_tutorial.pdf">https://www.ebi.ac.uk/sites/ebi.ac.uk/files/content.ebi.ac.uk/materials/2013/130702_San_Michele/biolnetworksanalysis_tutorial.pdf</a></font></font></p>     ]]></body>
<body><![CDATA[<p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r21">[</a><a href="#br21">21</a>] C. Bettembourg, C. Diot, and O. Dameron, &ldquo;Semantic particularity measure for functional characterization of gene sets using gene ontology&rdquo;. PLoS One, vol. 9, 2014.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r22">[</a><a href="#br22">22</a>] A. Dinasarapu, B. Saunders, I. Ozerlat, K. Azam, and S. Subramaniam. &quot;Signaling gateway molecule pages, a data model perspective&quot;. Bioinformatics, vol. 27, pp. 1736&ndash;1738, 2011.</font></font></p>     <!-- ref --><p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r23">[</a><a href="#br23">23</a>] M. J. Berridge, Cell Signalling Biology. Portland Press Limited, 2014.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r24">[</a><a href="#br24">24</a>] O. Mason, and M. Verwoerd, &ldquo;Graph theory and networks in biology&rdquo;. IET Systems Biology, vol. 1, pp. 89-119, 2007.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r25">[</a><a href="#br25">25</a>] V. Blondel, J. Guillaume, R. Lambiotte, and E. Lefebvre, &ldquo;Fast unfolding of communities in large networks&rdquo;. Journal of Statistical Mechanics: Theory and Experiment, Vol. 2008, pp. 10008- 10020, 2008.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r26">[</a><a href="#br26">26</a>] E. Ravasz, A. Somera, D. Mongru, Z. Oltvai, and A. Barab&aacute;si, &ldquo;Hierarchical organization of modularity in metabolic networks&rdquo;. 
Science, vol. 297, pp. 1551-1555, 2002.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r27">[</a><a href="#br27">27</a>] S. Borgatti, &ldquo;Centrality and network flow&rdquo;. Social Networks, vol. 27, pp. 55&ndash;71, 2005.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r28">[</a><a href="#br28">28</a>] J. Sun and J. Tang, &ldquo;A survey of models and algorithms for social influence analysis&rdquo;. In Social network data analytics (C. Aggarwal Ed.), Nueva York, Springer, pp. 177&ndash;214, 2011.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r29">[</a><a href="#br29">29</a>] G. Sabidussi, &ldquo;The centrality index of a graph&rdquo;. Psychometrika, vol. 31, pp. 581-603, 1966.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r30">[</a><a href="#br30">30</a>] C. Rangel, J. Aguilar, M. Cerrada, J. Altamiranda. &ldquo;An approach for the emerging ontology alignment based on the bees colonies&rdquo;, in Proc. Int. Conf. Artificial Intelligence, Las Vegas, USA, 2015, pp. 536-541.</font></font></p>     ]]></body>
<body><![CDATA[<p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r31">[</a><a href="#br31">31</a>] J. Aguilar, J. Altamiranda, &ldquo;Miner&iacute;a de Datos en la Web usando Computaci&oacute;n Evolutiva&rdquo;, In Ingenier&iacute;a de Software en la D&eacute;cada del 2000 (N. Brisaboa Ed.), AECI, RISTOS2, pp. 153-168, 2003.</font></font></p>     <!-- ref --><p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r32">[</a><a href="#br32">32</a>] J. Euzenat and P. Shvaiko. Ontology Matching. Berlin, Springer-Verlag, 2007.     </font></font> </p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r33">[</a><a href="#br33">33</a>] B. Bouchou, C. Niang, and M. Lo. &ldquo;Towards tailored domain ontologies&rdquo;, in Proc 5th International Workshop on Ontology Matching, pp. 241-243, 2010.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r34">[</a><a href="#br34">34</a>] J. Altamiranda, J. Aguilar, and C. Delamarche, &ldquo;Similarity of Amyloid Protein Motif using an Hybrid Intelligent System&rdquo;, IEEE Latin America Transactions, vol. 9, pp. 700-710, 2011.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r35">[</a><a href="#br35">35</a>] P. Thomas, M. Campbell, A. Kejariwal, H. Mi, et al. &ldquo;PANTHER: a library of protein families and subfamilies indexed by function&rdquo;. Genome Research, vol. 13, pp. 2129-2141, 2003.</font></font></p>     <!-- ref --><p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r36">[</a><a href="#br36">36</a>] M. Bastian, S. Heymann and M. 
Jacomy, &ldquo;Gephi: An open source software for exploring and manipulating networks&rdquo;, International AAAI Conference on Weblogs and Social Media, 2009.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r37">[</a><a href="#br37">37</a>] Y. Ishikawa, J. Li, W. Wang, R. Zhang, and W. Zhang. &ldquo;Web Technologies and Applications&rdquo;. 15th Asia-Pacific Web Conference, APWeb 2013, Sydney, Australia, April 4-6, 2013, Proceedings (Vol. 7808). Springer.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r38">[</a><a href="#br38">38</a>] L. Rokach and O. Maimon. &ldquo;Clustering methods&rdquo;. In Data mining and knowledge discovery handbook, Springer US, pp. 321-352, 2005.</font></font></p>     ]]></body>
<body><![CDATA[<p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r39">[</a><a href="#br39">39</a>] V. D. Blondel, J-L. Guillaume, R. Lambiotte, E. Lefebvre. &ldquo;Fast unfolding of communities in large networks&rdquo;. Journal of Statistical Mechanics: Theory and Experiment 2008 (10), P10008.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r40">[</a><a href="#br40">40</a>] A. Arenas, J. Duch, A. Fern&aacute;ndez, and S. G&oacute;mez. &ldquo;Size reduction of complex networks preserving modularity&rdquo;. New Journal of Physics, 9(6), 176. 2007.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r41">[</a><a href="#br41">41</a>] M. Ashburner, C. A. Ball, J. A. Blake, D. Botstein, H. Butler, J. M. Cherry, and M. A. Harris. &ldquo;Gene Ontology: tool for the unification of biology&rdquo;. Nature genetics, 25(1), 25-29. 2000.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r42">[</a><a href="#br42">42</a>] I. Peters. &ldquo;Folksonomies: indexing and retrieval in Web 2.0 (Vol. 1)&rdquo;. Walter de Gruyter. 2009.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r43">[</a><a href="#br43">43</a>] K. Borne. &ldquo;Collaborative annotation for scientific data discovery and reuse&rdquo;. Bulletin of the American Society for Information Science and Technology, 39(4), 44-45. 2013.</font></font></p>     <p lang="es-ES" class="western" align="left"><font face="Verdana, sans-serif"><font size="2" style="font-size: 10pt"><a id="r44">[</a><a href="#br44">44</a>] G. Asmolov. 
&ldquo;Crowdsourcing and the folksonomy of emergency response: The construction of a mediated subject&rdquo;. Interactions: Studies in Communication &amp; Culture, 6(2), 155-178. 2015.</font></font></p>      ]]></body><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Harris]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Clark]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Ireland]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Lomax]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Ashburner]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Foulger]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[The Gene Ontology (GO) database and informatics resource]]></article-title>
<source><![CDATA[Nucleic Acids Research]]></source>
<year>2004</year>
<volume>32</volume>
<page-range>D258-D261</page-range></nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Stelzl]]></surname>
<given-names><![CDATA[U]]></given-names>
</name>
<name>
<surname><![CDATA[Worm]]></surname>
<given-names><![CDATA[U]]></given-names>
</name>
<name>
<surname><![CDATA[Lalowski]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Haenig]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
<name>
<surname><![CDATA[Brembeck]]></surname>
<given-names><![CDATA[F]]></given-names>
</name>
<name>
<surname><![CDATA[Goehler]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
<name>
<surname><![CDATA[Stroedicke]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Human protein-protein interaction network: a resource for annotating the proteome]]></article-title>
<source><![CDATA[Cell]]></source>
<year>2005</year>
<volume>122</volume>
<page-range>957-968</page-range></nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hasty]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[McMillen]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Isaacs]]></surname>
<given-names><![CDATA[F]]></given-names>
</name>
<name>
<surname><![CDATA[Collins]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Computational studies of gene regulatory networks: in numero molecular biology]]></article-title>
<source><![CDATA[Nat. Rev. Genet.]]></source>
<year>2001</year>
<volume>2</volume>
<page-range>268-279</page-range></nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ravasz]]></surname>
<given-names><![CDATA[E]]></given-names>
</name>
<name>
<surname><![CDATA[Somera]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Mongru]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Oltvai]]></surname>
<given-names><![CDATA[Z]]></given-names>
</name>
<name>
<surname><![CDATA[Barabási]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Hierarchical organization of modularity in metabolic networks]]></article-title>
<source><![CDATA[Science]]></source>
<year>2002</year>
<volume>297</volume>
<page-range>1551-1555</page-range></nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kitano]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Systems biology: a brief overview]]></article-title>
<source><![CDATA[Science]]></source>
<year>2002</year>
<volume>295</volume>
<page-range>1662-1664</page-range></nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Barabasi]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Oltvai]]></surname>
<given-names><![CDATA[Z]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Network biology: understanding the cell’s functional organization]]></article-title>
<source><![CDATA[Nature Reviews Genetics]]></source>
<year>2004</year>
<volume>5</volume>
<page-range>101-113</page-range></nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[L]]></given-names>
</name>
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
</person-group>
<source><![CDATA[Biomolecular Networks: Methods and Applications in Systems Biology]]></source>
<year>2009</year>
<publisher-name><![CDATA[John Wiley & Sons]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B8">
<label>8</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[L]]></given-names>
</name>
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
<name>
<surname><![CDATA[Aihara]]></surname>
<given-names><![CDATA[K]]></given-names>
</name>
</person-group>
<source><![CDATA[Modeling Biomolecular Networks in Cells: Structures and Dynamics]]></source>
<year>2010</year>
<publisher-loc><![CDATA[London ]]></publisher-loc>
<publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B9">
<label>9</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Aggarwal]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
</person-group>
<source><![CDATA[Managing and Mining Graph Data: Advances in Database Systems]]></source>
<year>2010</year>
<publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B10">
<label>10</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Agrawal]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
<name>
<surname><![CDATA[Srikant]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Fast algorithms for mining association rules in large databases]]></article-title>
<source><![CDATA[]]></source>
<year>1994</year>
<conf-name><![CDATA[Proc. 20th International Conference on Very Large Data Bases]]></conf-name>
<conf-loc> </conf-loc>
<page-range>487-499</page-range><publisher-loc><![CDATA[San Francisco, CA]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B11">
<label>11</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Agrawal]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Chaudhuri]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Das]]></surname>
<given-names><![CDATA[G]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[DBXplorer: a system for keyword-based search over relational databases]]></article-title>
<source><![CDATA[]]></source>
<year>2002</year>
<conf-name><![CDATA[Proc. 18th International Conference on Data Engineering]]></conf-name>
<conf-loc> </conf-loc>
<page-range>5-16</page-range><publisher-loc><![CDATA[San Jose, CA]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B12">
<label>12</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bhagat]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Cormode]]></surname>
<given-names><![CDATA[G]]></given-names>
</name>
<name>
<surname><![CDATA[Rozenbaum]]></surname>
<given-names><![CDATA[I]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Applying link-based classification to label blogs]]></article-title>
<source><![CDATA[Proc. 2007 workshop on Web mining and social network analysis]]></source>
<year>2007</year>
<page-range>92-101</page-range><publisher-loc><![CDATA[New York, NY]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B13">
<label>13</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Aguilar]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Resolution of the clustering problem using genetic algorithms]]></article-title>
<source><![CDATA[International Journal of computers]]></source>
<year>2007</year>
<volume>1</volume>
<page-range>237-244</page-range></nlm-citation>
</ref>
<ref id="B14">
<label>14</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Yu]]></surname>
<given-names><![CDATA[G]]></given-names>
</name>
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[L]]></given-names>
</name>
<name>
<surname><![CDATA[Han]]></surname>
<given-names><![CDATA[Y]]></given-names>
</name>
<name>
<surname><![CDATA[He]]></surname>
<given-names><![CDATA[Q]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[ClusterProfiler: an R package for comparing biological themes among gene clusters]]></article-title>
<source><![CDATA[Journal of Integrative Biology]]></source>
<year>2012</year>
<volume>16</volume>
<page-range>284-287</page-range></nlm-citation>
</ref>
<ref id="B15">
<label>15</label><nlm-citation citation-type="">
<source><![CDATA[KEGG: Kyoto Encyclopedia of Genes and Genomes]]></source>
<year></year>
</nlm-citation>
</ref>
<ref id="B16">
<label>16</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bux]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Leser]]></surname>
<given-names><![CDATA[U]]></given-names>
</name>
<name>
<surname><![CDATA[Philippe]]></surname>
<given-names><![CDATA[T]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Comparing semantically enriched experimental protein networks in colorectal cancer]]></article-title>
<source><![CDATA[Humboldt Universität zu Berlin]]></source>
<year>2012</year>
</nlm-citation>
</ref>
<ref id="B17">
<label>17</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chuang]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
<name>
<surname><![CDATA[Lee]]></surname>
<given-names><![CDATA[E]]></given-names>
</name>
<name>
<surname><![CDATA[Liu]]></surname>
<given-names><![CDATA[T]]></given-names>
</name>
<name>
<surname><![CDATA[Lee]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Ideker]]></surname>
<given-names><![CDATA[T]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Network-based classification of breast cancer metastasis]]></article-title>
<source><![CDATA[Molecular Systems Biology]]></source>
<year>2007</year>
<volume>3</volume>
<page-range>140</page-range></nlm-citation>
</ref>
<ref id="B18">
<label>18</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Huang]]></surname>
<given-names><![CDATA[Q]]></given-names>
</name>
<name>
<surname><![CDATA[Liu]]></surname>
<given-names><![CDATA[Z]]></given-names>
</name>
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[Y]]></given-names>
</name>
<name>
<surname><![CDATA[Wu]]></surname>
<given-names><![CDATA[L]]></given-names>
</name>
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[L]]></given-names>
</name>
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[X]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[NOA: a novel Network Ontology Analysis method]]></article-title>
<source><![CDATA[Nucleic Acids Research]]></source>
<year>2011</year>
<volume>39</volume>
</nlm-citation>
</ref>
<ref id="B19">
<label>19</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Gu]]></surname>
<given-names><![CDATA[Z]]></given-names>
</name>
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[CePa: an R package for finding significant pathways weighted by multiple network centralities]]></article-title>
<source><![CDATA[Bioinformatics Applications Note]]></source>
<year>2013</year>
<volume>29</volume>
<page-range>658-660</page-range></nlm-citation>
</ref>
<ref id="B20">
<label>20</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Porras]]></surname>
<given-names><![CDATA[P]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Network generation and analysis through Cytoscape and PSICQUIC]]></article-title>
<collab>EMBL-EBI</collab>
<source><![CDATA[]]></source>
<year></year>
<volume>6</volume>
<publisher-loc><![CDATA[Cambridge, U.K.]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B21">
<label>21</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bettembourg]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
<name>
<surname><![CDATA[Diot]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
<name>
<surname><![CDATA[Dameron]]></surname>
<given-names><![CDATA[O]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Semantic particularity measure for functional characterization of gene sets using gene ontology]]></article-title>
<source><![CDATA[PLoS One]]></source>
<year>2014</year>
<volume>9</volume>
</nlm-citation>
</ref>
<ref id="B22">
<label>22</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dinasarapu]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Saunders]]></surname>
<given-names><![CDATA[B]]></given-names>
</name>
<name>
<surname><![CDATA[Ozerlat]]></surname>
<given-names><![CDATA[I]]></given-names>
</name>
<name>
<surname><![CDATA[Azam]]></surname>
<given-names><![CDATA[K]]></given-names>
</name>
<name>
<surname><![CDATA[Subramaniam]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Signaling gateway molecule pages, a data model perspective]]></article-title>
<source><![CDATA[Bioinformatics]]></source>
<year>2011</year>
<volume>27</volume>
<page-range>1736-1738</page-range></nlm-citation>
</ref>
<ref id="B23">
<label>23</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Berridge]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
</person-group>
<source><![CDATA[Cell Signalling Biology]]></source>
<year>2014</year>
<publisher-name><![CDATA[Portland Press Limited]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B24">
<label>24</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Mason]]></surname>
<given-names><![CDATA[O]]></given-names>
</name>
<name>
<surname><![CDATA[Verwoerd]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Graph theory and networks in biology]]></article-title>
<source><![CDATA[IET Systems Biology]]></source>
<year>2007</year>
<volume>1</volume>
<page-range>89-119</page-range></nlm-citation>
</ref>
<ref id="B25">
<label>25</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Blondel]]></surname>
<given-names><![CDATA[V]]></given-names>
</name>
<name>
<surname><![CDATA[Guillaume]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Lambiotte]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
<name>
<surname><![CDATA[Lefebvre]]></surname>
<given-names><![CDATA[E]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Fast unfolding of communities in large networks]]></article-title>
<source><![CDATA[Journal of Statistical Mechanics]]></source>
<year>2008</year>
<page-range>10008-10020</page-range></nlm-citation>
</ref>
<ref id="B26">
<label>26</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ravasz]]></surname>
<given-names><![CDATA[E]]></given-names>
</name>
<name>
<surname><![CDATA[Somera]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Mongru]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Oltvai]]></surname>
<given-names><![CDATA[Z]]></given-names>
</name>
<name>
<surname><![CDATA[Barabási]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Hierarchical organization of modularity in metabolic networks]]></article-title>
<source><![CDATA[Science]]></source>
<year>2002</year>
<volume>297</volume>
<page-range>1551-1555</page-range></nlm-citation>
</ref>
<ref id="B27">
<label>27</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Borgatti]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Centrality and network flow]]></article-title>
<source><![CDATA[Social Networks]]></source>
<year>2005</year>
<volume>27</volume>
<page-range>55-71</page-range></nlm-citation>
</ref>
<ref id="B28">
<label>28</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sun]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Tang]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[A survey of models and algorithms for social influence analysis]]></article-title>
<source><![CDATA[Social network data analytics]]></source>
<year>2011</year>
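<person-group person-group-type="editor">
<name>
<surname><![CDATA[Aggarwal]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
</person-group>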
<page-range>177-214</page-range><publisher-loc><![CDATA[New York]]></publisher-loc>
<publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B29">
<label>29</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sabidussi]]></surname>
<given-names><![CDATA[G]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[The centrality index of a graph]]></article-title>
<source><![CDATA[Psychometrika]]></source>
<year>1966</year>
<volume>31</volume>
<page-range>581-603</page-range></nlm-citation>
</ref>
<ref id="B30">
<label>30</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Rangel]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
<name>
<surname><![CDATA[Aguilar]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Cerrada]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Altamiranda]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[An approach for the emerging ontology alignment based on the bees colonies]]></article-title>
<source><![CDATA[]]></source>
<year>2015</year>
<conf-name><![CDATA[Proc. Int. Conf. Artificial Intelligence]]></conf-name>
<conf-loc> </conf-loc>
<page-range>536-541</page-range><publisher-loc><![CDATA[Las Vegas, USA]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B31">
<label>31</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Aguilar]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Altamiranda]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
</person-group>
<article-title xml:lang="es"><![CDATA[Minería de Datos en la Web usando Computación Evolutiva]]></article-title>
<source><![CDATA[Ingeniería de Software en la Década del 2000]]></source>
<year>2003</year>
<page-range>153-168</page-range><publisher-name><![CDATA[AECI]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B32">
<label>32</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Euzenat]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Shvaiko]]></surname>
<given-names><![CDATA[P]]></given-names>
</name>
</person-group>
<source><![CDATA[Ontology Matching]]></source>
<year>2007</year>
<publisher-loc><![CDATA[Berlin ]]></publisher-loc>
<publisher-name><![CDATA[Springer-Verlag]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B33">
<label>33</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bouchou]]></surname>
<given-names><![CDATA[B]]></given-names>
</name>
<name>
<surname><![CDATA[Niang]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
<name>
<surname><![CDATA[Lo]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Towards tailored domain ontologies]]></article-title>
<source><![CDATA[]]></source>
<year>2010</year>
<conf-name><![CDATA[Proc. 5th International Workshop on Ontology Matching]]></conf-name>
<conf-loc> </conf-loc>
<page-range>241-243</page-range></nlm-citation>
</ref>
<ref id="B34">
<label>34</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Altamiranda]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Aguilar]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Delamarche]]></surname>
<given-names><![CDATA[C]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Similarity of Amyloid Protein Motif using an Hybrid Intelligent System]]></article-title>
<source><![CDATA[IEEE Latin America Transactions]]></source>
<year>2011</year>
<volume>9</volume>
<page-range>700-710</page-range></nlm-citation>
</ref>
<ref id="B35">
<label>35</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Thomas]]></surname>
<given-names><![CDATA[P]]></given-names>
</name>
<name>
<surname><![CDATA[Campbell]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Kejariwal]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Mi]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[PANTHER: a library of protein families and subfamilies indexed by function]]></article-title>
<source><![CDATA[Genome Research]]></source>
<year>2003</year>
<volume>13</volume>
<page-range>2129-2141</page-range></nlm-citation>
</ref>
<ref id="B36">
<label>36</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bastian]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Heymann]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Jacomy]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
</person-group>
<source><![CDATA[Gephi: An open source software for exploring and manipulating networks]]></source>
<year>2009</year>
<conf-name><![CDATA[International AAAI Conference on Weblogs and Social Media]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B37">
<label>37</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ishikawa]]></surname>
<given-names><![CDATA[Y]]></given-names>
</name>
<name>
<surname><![CDATA[Li]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[W]]></given-names>
</name>
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[W]]></given-names>
</name>
</person-group>
<source><![CDATA[Web Technologies and Applications]]></source>
<year>2013</year>
<month>04</month>
<day>04</day>
<conf-name><![CDATA[15th Asia-Pacific Web Conference]]></conf-name>
<conf-loc> </conf-loc>
<publisher-loc><![CDATA[Sydney ]]></publisher-loc>
<publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B38">
<label>38</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Rokach]]></surname>
<given-names><![CDATA[L]]></given-names>
</name>
<name>
<surname><![CDATA[Maimon]]></surname>
<given-names><![CDATA[O]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Clustering methods]]></article-title>
<source><![CDATA[Data mining and knowledge discovery handbook]]></source>
<year>2005</year>
<page-range>321-352</page-range><publisher-loc><![CDATA[US]]></publisher-loc>
<publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B39">
<label>39</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Blondel]]></surname>
<given-names><![CDATA[V. D]]></given-names>
</name>
<name>
<surname><![CDATA[Guillaume]]></surname>
<given-names><![CDATA[J-L]]></given-names>
</name>
<name>
<surname><![CDATA[Lambiotte]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
<name>
<surname><![CDATA[Lefebvre]]></surname>
<given-names><![CDATA[E]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Fast unfolding of communities in large networks]]></article-title>
<source><![CDATA[Journal of Statistical Mechanics]]></source>
<year>2008</year>
<numero>10</numero>
<issue>10</issue>
<page-range>P10008</page-range></nlm-citation>
</ref>
<ref id="B40">
<label>40</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Arenas]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Duch]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Fernández]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Gómez]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Size reduction of complex networks preserving modularity]]></article-title>
<source><![CDATA[New Journal of Physics]]></source>
<year>2007</year>
<volume>9</volume>
<numero>6</numero>
<issue>6</issue>
<page-range>176</page-range></nlm-citation>
</ref>
<ref id="B41">
<label>41</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ashburner]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Ball]]></surname>
<given-names><![CDATA[C. A]]></given-names>
</name>
<name>
<surname><![CDATA[Blake]]></surname>
<given-names><![CDATA[J. A]]></given-names>
</name>
<name>
<surname><![CDATA[Botstein]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Butler]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
<name>
<surname><![CDATA[Cherry]]></surname>
<given-names><![CDATA[J. M]]></given-names>
</name>
<name>
<surname><![CDATA[Harris]]></surname>
<given-names><![CDATA[M. A]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Gene Ontology: tool for the unification of biology]]></article-title>
<source><![CDATA[Nature genetics]]></source>
<year>2000</year>
<volume>25</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>25-29</page-range></nlm-citation>
</ref>
<ref id="B42">
<label>42</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Peters]]></surname>
<given-names><![CDATA[I]]></given-names>
</name>
</person-group>
<source><![CDATA[Folksonomies: indexing and retrieval in Web 2.0]]></source>
<year>2009</year>
<volume>1</volume>
<publisher-name><![CDATA[Walter de Gruyter]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B43">
<label>43</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Borne]]></surname>
<given-names><![CDATA[K]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Collaborative annotation for scientific data discovery and reuse]]></article-title>
<source><![CDATA[Bulletin of the American Society for Information Science and Technology]]></source>
<year>2013</year>
<volume>39</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>44-45</page-range></nlm-citation>
</ref>
<ref id="B44">
<label>44</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Asmolov]]></surname>
<given-names><![CDATA[G]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Crowdsourcing and the folksonomy of emergency response: The construction of a mediated subject]]></article-title>
<source><![CDATA[Interactions: Studies in Communication & Culture]]></source>
<year>2015</year>
<volume>6</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>155-178</page-range></nlm-citation>
</ref>
</ref-list>
</back>
</article>
