The volume and diversity of biological data are growing rapidly. Vast amounts of data on protein sequences and structures, protein and genetic interactions, and phenotype studies have been produced. Most data generated by high-throughput devices are annotated automatically, since manual annotation at this scale is not feasible. Thus, efficient and precise automatic annotation methods are required to ensure the quality and reliability of both the biological data and the associated annotations.
We proposed ENZYMAP, a technique that uses a supervised learning approach to characterize and predict EC number changes based on annotations from UniProt/Swiss-Prot. We evaluated ENZYMAP experimentally and showed that it is possible to predict EC changes using selected types of annotation. Finally, we compared the predictions of ENZYMAP and DETECT and checked both against UniProt/Swiss-Prot annotations. ENZYMAP proved more accurate than DETECT, coming closer to the actual changes in UniProt/Swiss-Prot.
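The comparison against reference annotations described above can be illustrated with a minimal sketch. It is not the evaluation code used for ENZYMAP or DETECT: the function name `agreement`, the exact-match criterion, the protein identifiers, and all EC assignments below are hypothetical, chosen only to show how two sets of predictions might be scored against a curated reference.

```python
from typing import Dict


def agreement(predicted: Dict[str, str], reference: Dict[str, str]) -> float:
    """Return the fraction of reference proteins whose predicted EC number
    matches the reference EC number exactly.

    Proteins missing from `predicted` count as mismatches.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    hits = sum(1 for protein, ec in reference.items() if predicted.get(protein) == ec)
    return hits / len(reference)


# Made-up toy data (assumption): not taken from UniProt/Swiss-Prot or from the paper.
reference = {
    "protein_1": "1.1.1.1",
    "protein_2": "2.7.1.1",
    "protein_3": "3.4.21.4",
    "protein_4": "4.2.1.1",
}
method_a = {
    "protein_1": "1.1.1.1",
    "protein_2": "2.7.1.1",
    "protein_3": "3.4.21.4",
    "protein_4": "4.2.1.2",  # differs at the fourth EC level
}
method_b = {
    "protein_1": "1.1.1.1",
    "protein_2": "2.7.1.2",
    "protein_3": "3.4.21.5",
    # protein_4 has no prediction and is counted as a mismatch
}

score_a = agreement(method_a, reference)
score_b = agreement(method_b, reference)
print(f"method A: {score_a:.2f}, method B: {score_b:.2f}")
```

Exact matching on the full four-level EC number is the strictest possible criterion; a partial-match score (for example, agreement on the first three levels) would be a reasonable alternative, depending on what counts as a correct prediction.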