Identifying Enzymes Active Site with Genetic Algorithms
Sandro Carvalho Izidoro, Raquel C. de Melo-Minardi, Gisele Lobo Pappa
Abstract
Motivation: Given the number of proteins cataloged but with
unknown function, the development of computational methods
to perform function prediction efficiently and accurately is still a
challenge. This paper focuses on identifying new attributes to help
enzyme function prediction. The proposed method searches for an
arrangement of amino acids directly involved in the catalytic reaction,
called the catalytic or active site, which is responsible for molecular
recognition. Due to their importance, the active site amino acids are
more conserved during evolution than the sequence as a whole, and
can be successfully used for protein function prediction. The objective
of this work is to present a new technique, based on genetic
algorithms (GA), to find similar active sites. The method can perform
non-exact amino acid matches (taking conservative substitutions into
account), places no restriction on the number of amino acids in a site,
and can find active sites whose residues lie in different protein chains.
Results: The use of GA to search for similar active sites,
hitherto unpublished, proved promising on data sets with different
characteristics. In specific enzyme families, the GA found active sites
consistent with the CSA (Catalytic Site Atlas). Tests on the Serine
Protease enzyme family, comparing the proposed GA with other existing
software, showed that it is able to recover more active sites with
better accuracy. The implementation of a ranking to select the best
individuals (possible active sites) for each enzyme and the adaptation
of the mutation operator to deal with conservative mutations gave
more flexibility and robustness to the GA. Furthermore, the
possibility of finding residues of a site in more than one chain and
the absence of restriction on the size of the active site make the GA a
good tool for protein function prediction.
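
As an illustration of the mutation operator adapted to conservative mutations, the following is a minimal Python sketch and not the authors' implementation. It assumes an individual is a list of residue indices into a protein, one per template position, and that a gene may only be replaced by a residue whose type is identical or conservatively related to the template residue. The substitution groups, the names `conservative_match` and `mutate`, and the toy protein are all hypothetical.

```python
import random

# Assumed conservative-substitution groups, for illustration only;
# the grouping actually used by the method is not given in this abstract.
CONSERVATIVE_GROUPS = [
    {"ASP", "GLU"},                       # acidic
    {"LYS", "ARG", "HIS"},                # basic
    {"SER", "THR"},                       # hydroxyl
    {"ASN", "GLN"},                       # amide
    {"PHE", "TYR", "TRP"},                # aromatic
    {"ALA", "VAL", "LEU", "ILE", "MET"},  # aliphatic
]


def conservative_match(a, b):
    """True if residue types a and b are identical or share a group."""
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def mutate(individual, protein, template, rng):
    """Replace one randomly chosen gene with a compatible residue.

    individual: list of indices into `protein`, one per template position
    protein:    list of (chain, residue_number, residue_type) tuples
    template:   list of residue types of the known active site
    Returns a new list; the parent is left unchanged.
    """
    pos = rng.randrange(len(individual))
    wanted = template[pos]
    used = set(individual)
    # Candidates may come from any chain, so a site can span chains.
    options = [
        i for i, (_, _, res_type) in enumerate(protein)
        if i not in used and conservative_match(res_type, wanted)
    ]
    child = list(individual)
    if options:  # if no compatible residue exists, return an unchanged copy
        child[pos] = rng.choice(options)
    return child


# Toy data: a His/Asp/Ser template and a two-chain protein (hypothetical).
template = ["HIS", "ASP", "SER"]
protein = [
    ("A", 57, "HIS"), ("A", 102, "ASP"), ("B", 195, "SER"),
    ("A", 10, "GLU"), ("B", 20, "THR"), ("A", 5, "GLY"),
]
parent = [0, 1, 2]
child = mutate(parent, protein, template, random.Random(0))
```

Because the candidate pool is filtered by residue type rather than by chain, the sketch mirrors two properties described above: non-exact matches are allowed, and the residues of one site may come from more than one chain.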