Data: 09/09/2021, às 16:00 horas (GMT-3, horário de Brasília).
Título: Graph theoretical approach for genome comparison with large-scale mutations
Palestrante: Marilia Braga (Universität Bielefeld)
Resumo: A classical problem in comparative genomics, proposed by Sankoff in
1992, is the one of computing the rearrangement distance, which is the
minimum number of large-scale mutations required to transform a genome
into another, in a setting in which the genes in both genomes are
classified into families. Assuming that each family occurs exactly
once in each genome, substantial results were developed with the help
of graph theoretical models and include the polynomial-time
computability of the inversion distance by Hannenhalli and Pevzner in
1995, the introduction of the double-cut and join (DCJ) operation by
Yancopoulos, Attie and Friedberg in 2005 and the linear computability
of the DCJ distance by Bergeron, Mixtacki and Stoye in 2006. These
results rely on the fact that the underlying breakpoint/adjacency
graphs are simple collections of paths and cycles.
In this talk I will present research results that I obtained with my
colleagues over the last 11 years. First, I will describe an extension
of the DCJ model for singular genomes, which are those in which each
gene family occurs at most once. Singular genomes may contain distinct
genes and, besides DCJ operations, are also subject to insertions and
deletions of DNA segments (jointly called indels). We have shown that
computing the singular DCJ-indel distance can be done in linear time
with the help of the relational diagram, which is also a collection of
paths and cycles (results from 2010, with Willing and Stoye). Then I
will describe two recent NP-hard extensions of this model: the
DCJ-indel distance of natural genomes (from 2020, with Bohnenkämper,
Doerr and Stoye) and the weighted DCJ-indel distance of family-free
genomes (from 2014/2020, with Martinez, Feijão and Rubert). Both can
be solved efficiently with ILP formulations that search for optimal
cycle decompositions of a (weighted) multirelational diagram.