Pedro Calais Guerra


Computer Science Department, Federal University of Minas Gerais, Belo Horizonte, MG, Brazil
email: pcalais AT dcc.ufmg.br











About me

I hold a doctorate from the Department of Computer Science of the Federal University of Minas Gerais (UFMG) in Brazil, where I worked with Prof. Wagner Meira Jr. From October 2012 to July 2013, I conducted my research as a visiting scholar at the Cornell University CS Department.

Currently, I am a software engineer and machine learning specialist at Loggi, a startup using technology to reinvent logistics in a continent-sized country such as Brazil.

My general academic and professional interests revolve around data science, machine learning and computational social science -- the intersection between computer science and the social sciences. My academic research has focused on polarization of opinions -- the social process whereby a social or political group is divided into two opposing sub-groups, with fewer and fewer members of the group remaining neutral or holding an intermediate position. I am working on unveiling which structural characteristics polarization induces in the social graphs we find on social media systems, especially in contexts such as politics, sports and highly debated topics. I am also interested in multi-polarized social graphs, which arise where more than two sides compete against each other. Such a scenario is found, for example, in elections with more than two candidates and in sports competitions (for example, in the Soccer World Cup, N=32 sides oppose one another).

As an academic side project, I am researching the feasibility of building recurrent neural networks, a type of deep neural network, to track mood variation in sentiment streams such as those available from Twitter.

My main interest outside computer science is studying Austrian economics, libertarian ethics, the associated non-aggression principle and the anarcho-capitalist societal structure that arises from it. Check my blog post "How have I become a libertarian?" (in Portuguese, sorry). Just as I like the intersection of social and computer sciences, the combination of economics with cryptocurrencies and associated technologies is another interdisciplinary field that attracts me a lot.

Education

See my CV Lattes and Google Scholar Citations.

Ph.D., Computer Science, Universidade Federal de Minas Gerais (2015) - Ph.D. Dissertation
M.Sc., Computer Science, Universidade Federal de Minas Gerais (2009) - Master's Thesis (in Portuguese). Chosen as one of the 11 best Master's theses in Computer Science in Brazil in 2010.
B.S., Computer Science, Universidade Federal de Minas Gerais (2006) (best student award - 4.7/5.0 GPA)

Media Coverage

  • Rivalidade toma conta da Web. Jornal o Tempo, October 2013. (in Portuguese)
  • Pesquisador desenvolve ferramenta para mapear sentimentos na Internet. Portal IG, August 2013. (in Portuguese)
  • Piora o vale-tudo na internet no 2º turno. Estado de Minas Newspaper, October 2010. (in Portuguese)

  • And also on my Spam Research:

    Research Interests

    My general research interests are in the area of data mining and machine learning. I am particularly interested in the following areas.

    Publications

    An Anatomy for Neural Search Engines [bibtex]
    Akio Nakamura, Pedro H. Calais, Davi Reis, Andre Paim
    Information Sciences, 2018.

    "Like Sheep Among Wolves": Characterizing Hateful Users on Twitter [bibtex]
    Manoel Ribeiro, Pedro H. Calais, Yuri Santos, Wagner Meira Jr. and Virgilio Almeida
    Misinformation and Misbehavior Mining Workshop (MIS2) @ WSDM 2018, Los Angeles, USA.

    "Everything I disagree with is #FakeNews": Correlating Political Polarization and Spread of Misinformation [bibtex]
    Manoel Ribeiro, Pedro H. Calais, Wagner Meira Jr. and Virgilio Almeida
    Data Science + Journalism Workshop @ KDD 2017, Halifax, Canada.

    Antagonism also Flows through Retweets: The Impact of Out-of-Context Quotes in Opinion Polarization Analysis (poster paper) [bibtex][extended paper]
    Pedro H. Calais, Roberto Nalon, Renato Assunção and Wagner Meira Jr.
    11th International AAAI Conference on Weblogs and Social Media (ICWSM 2017), Montreal, Canada.

    Sentiment Analysis on Evolving Social Streams: How Self-Report Imbalances Can Help [bibtex][slides]
    Pedro H. Calais Guerra, Wagner Meira Jr., Claire Cardie
    7th International ACM Conference on Web Search and Data Mining (WSDM 2014), New York City, USA.

    A Measure of Polarization on Social Media Networks based on Community Boundaries [bibtex][slides][blog post][video]
    Pedro H. Calais Guerra, Wagner Meira Jr., Claire Cardie, Robert Kleinberg.
    7th International AAAI Conference on Weblogs and Social Media (ICWSM 2013), Boston, USA.

    From Bias to Opinion: a Transfer-Learning Approach to Real-Time Sentiment Analysis [bibtex][slides]
    Pedro H. Calais Guerra, Adriano Veloso, Wagner Meira Jr., Virgilio Almeida.
    17th ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD 2011), San Diego, California.

    Spam Fighting and Characterization:

    I have also conducted research in the spam fighting field. During my master's course, I focused on characterizing and investigating the behavior and strategies adopted by spammers in order to understand how they disseminate and distribute their messages. The core of the research is the development of clustering algorithms to detect spam campaigns. Take a look at some patterns and regularities in spam construction techniques that we have been able to find:
    Mining Spam Campaigns and Spam Address Lists
    Spam Detection Using Web Page Content: a New Battleground [bibtex]
    Marco Túlio Ribeiro, Pedro H. Calais Guerra, Dorgival Guedes, Adriano Veloso, Wagner Meira Jr., Cristine Hoepers, Marcelo H. P. C. Chaves, Klaus Steding-Jessen.
    8th Collaboration, Electronic messaging, Anti-Abuse and Spam Conference (CEAS'11), Perth, Australia

    Exploring the Spam Arms Race to Characterize Spam Evolution [bibtex]
    Pedro H. Calais Guerra, Dorgival Guedes, Wagner Meira Jr., Cristine Hoepers, Marcelo H. P. C. Chaves, Klaus Steding-Jessen.
    7th Collaboration, Electronic messaging, Anti-Abuse and Spam Conference (CEAS'10), Redmond, WA, USA

    Spam Miner: A Platform for Detecting and Characterizing Spam Campaigns (demo paper) [bibtex]
    Pedro H. Calais Guerra, Douglas Pires, Marco Túlio Ribeiro, Dorgival Guedes, Wagner Meira Jr., Cristine Hoepers, Marcelo H. P. C. Chaves, Klaus Steding-Jessen.
    International Conference on Knowledge Discovery and Data Mining (KDD '09), 2009, Paris, France.

    Spamming Chains: A New Way of Understanding Spammer Behavior [bibtex]
    Pedro H. Calais Guerra, Dorgival Guedes, Wagner Meira Jr., Cristine Hoepers, Marcelo H. P. C. Chaves, Klaus Steding-Jessen.
    Sixth Conference on e-Mail and Anti-Spam (CEAS '09)

    A Campaign-based Characterization of Spamming Strategies [bibtex]
    Pedro H. Calais Guerra, Douglas Pires, Dorgival Guedes, Wagner Meira Jr., Cristine Hoepers, Klaus Steding-Jessen.
    Fifth Conference on e-Mail and Anti-Spam (CEAS '08)

    e-Commerce:

    A Seller's Perspective Characterization Methodology for Online Auctions
    Arlei Silva, Pedro H. Calais Guerra, Adriano Pereira, Fernando Mourao, Jussara Almeida, Wagner Meira Jr., Paulo Goes.
    International Conference on Electronic Commerce (ICEC), 2008.

    Broadband User Behavior Characterization:

    Characterizing Broadband User Behavior
    Humberto Marques, Leonardo Rocha, Pedro H. Calais Guerra, Jussara Almeida, Wagner Meira Jr., Virgílio Almeida.
    Handbook of Research in Global Diffusion of Broadband Data. 1 ed. Hershey, Pennsylvania, US: IGI Global, 2008

    Characterizing broadband user behavior and their e-business activities
    Humberto Marques, Leonardo Rocha, Pedro H. Calais Guerra, Jussara Almeida, Wagner Meira Jr., Virgílio Almeida.
    ACM SIGMETRICS Performance Evaluation Review, v. 32, p. 3-13, 2004

    Characterizing Broadband User Behavior
    Humberto Marques, Leonardo Rocha, Pedro H. Calais Guerra, Jussara Almeida, Wagner Meira Jr., Virgílio Almeida
    The first ACM Workshop on Next Generation Residential Broadband Challenges, New York, NY (NRBC '2004)

    Other Topics:

    Estimativa de Demanda Potencial de Matrículas em Ensino Superior usando Dados Públicos e Múltiplos Modelos de Regressão (in Portuguese)
    Pedro H. Calais Guerra, Rodrigo Mizobe, Eduardo Hruschka.
    II Symposium on Knowledge Discovery, Mining and Learning KDMILE, 2014, São Carlos, SP.

    AnthillSched: A Scheduling Strategy for Irregular and Iterative I/O-Intensive Parallel Jobs
    Luis Fabrício Góes, Pedro H. Calais Guerra, Bruno Coutinho, Leonardo Rocha, Wagner Meira Jr., Renato Ferreira, Dorgival Guedes, Walfredo Cirne.
    Workshop on Job Scheduling Strategies for Parallel Processing, 2005, Cambridge. (JSSPP '2005)


    Some of my other interests:




