Pedro Calais
Computer Scientist (Ph.D.) and Software Engineer








Belo Horizonte, MG, Brazil
email: pcalais@dcc.ufmg.br



I am a computer scientist (Ph.D.) motivated to do long-lasting, fundamental and technical work at the intersection of science and engineering.
This means you can find me at the same time building a large-scale enterprise distributed system, teaching a bootcamp on big data tools, and writing a paper on how social psychology theories can be embedded into AI algorithms.

In the roles of academic researcher, teacher, individual contributor and manager, I've learned that great things can be accomplished when we are constantly pursuing truth and excellence by sharpening our knowledge in theory and practice, breadth and depth, and soft and hard skills.

My central purpose is designing and building useful software in a pragmatic manner, which means finding the right combination of code communicability, simplicity and flexibility.
Academic teaching and research also motivate me because of the amazing positive cycle they produce. "In learning you will teach, and in teaching you will learn."

For more details on my career, check my LinkedIn profile and my publications on Google Scholar.

If you speak Portuguese, this is an interview I gave to UFMG reflecting on my career and the lessons learned during the journey.




Education:

I like studying Economics and I've just finished the Austrian School of Economics specialization course by Instituto Mises Brasil.
My academic background is in computer science:
  • Ph.D., Computer Science, Universidade Federal de Minas Gerais (2015) - Ph.D. Dissertation. I was a visiting scholar at the Cornell University CS Department.
  • M.Sc., Computer Science, Universidade Federal de Minas Gerais (2009) - Master's Thesis (in Portuguese). Chosen as one of the 11 best Master's theses in Computer Science in Brazil in 2010.
  • B.S., Computer Science, Universidade Federal de Minas Gerais (2006). I received the best student award - 4.7/5.0 GPA.


Research and Professional Interests:

  • software design principles that allow software to evolve and be stable by taming software complexity
  • data science and data engineering
  • machine learning
  • computational social sciences, i.e., connecting social theories with algorithms


Talks:

Using Spark in Scala (in Portuguese): link
Measuring Polarization in Social Media using Community Boundaries: link

Publications:

Software Engineering and Flow State:

Recently, the connection between software engineering and neuroscience has sparked my interest. While working at Stone Co, I noticed how test-driven development helps developers enter the so-called flow state.

Since then, I have published academic work detailing the mechanics of this connection, and have given talks at the Brazilian The Developer's Conference (TDC) and Google Developer Groups meetups.

Test-Driven Development Benefits Beyond Design Quality: Flow State and Developer Experience [bibtex] [slides]
Pedro Calais, Lissa Franzini
International Conference on Software Engineering - New Ideas and Emerging Results (ICSE NIER, 2023).

Politicization, Fake News and Hate Speech:

I have a special interest in connecting social science theories with computational methods. In this recent research, some colleagues at UFMG and I show how politicization can be observed as a genuine social process: a transition from a non-political to a political topic.

Topic Shifts as a Proxy for Assessing Politicization in Social Media
Marcelo Sartori, Pedro Calais, Joao Pedro Junho, Matheus Prado, Tomas Lacerda, Wagner Meira Jr., Virgilio Almeida
International AAAI Conference on Web and Social Media (ICWSM 2024).

"Like Sheep Among Wolves": Characterizing Hateful Users on Twitter [bibtex]
Manoel Ribeiro, Pedro Calais, Yuri Santos, Wagner Meira Jr. and Virgilio Almeida
Misinformation and Misbehavior Mining Workshop (MIS2) @ WSDM 2018, Los Angeles, USA.

"Everything I disagree with is #FakeNews": Correlating Political Polarization and Spread of Misinformation [bibtex]
Manoel Ribeiro, Pedro Calais, Wagner Meira Jr. and Virgilio Almeida
Data Science + Journalism Workshop @ KDD 2017, Halifax, Canada.

Polarization and Sentiment Analysis on Social Streams:

My Ph.D. research focused on connecting social psychology theories of how people express their opinions to sentiment analysis algorithms tailored to operate on rapidly evolving social streams with an underlying social graph.
We were able to demonstrate that simple and effective algorithms based on such theories can operate on highly dynamic topics such as a soccer match.

Antagonism also Flows through Retweets: The Impact of Out-of-Context Quotes in Opinion Polarization Analysis (poster paper) [bibtex][extended paper]
Pedro Calais, Roberto Nalon, Renato Assuncao and Wagner Meira Jr.
11th International AAAI Conference on Weblogs and Social Media (ICWSM 2017), Montreal, Canada.

Sentiment Analysis on Evolving Social Streams: How Self-Report Imbalances Can Help [bibtex][slides]
Pedro Calais, Wagner Meira Jr., Claire Cardie
7th International ACM Conference on Web Search and Data Mining (WSDM 2014), New York City, USA.

A Measure of Polarization on Social Media Networks based on Community Boundaries [bibtex][slides][blog post][video]
Pedro Calais, Wagner Meira Jr., Claire Cardie, Robert Kleinberg.
7th International AAAI Conference on Weblogs and Social Media (ICWSM 2013), Boston, USA.

From Bias to Opinion: a Transfer-Learning Approach to Real-Time Sentiment Analysis [bibtex][slides]
Pedro Calais, Adriano Veloso, Wagner Meira Jr., Virgilio Almeida.
17th ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD 2011), San Diego, California.

Spam Fighting and Characterization:

During my master's degree, I focused on characterizing and investigating the behavior and strategies adopted by spammers in order to understand how they disseminate and distribute their messages.
The focus of the research was the development of clustering algorithms to detect spam campaigns, and the use of association rule mining to uncover regularities and spam abuse patterns.
Take a look at some of the patterns and regularities in spam construction techniques that we were able to find:

Mining Spam Campaigns and Spam Address Lists

Spam Detection Using Web Page Content: a New Battleground [bibtex]
Marco Tulio Ribeiro, Pedro Calais, Dorgival Guedes, Adriano Veloso, Wagner Meira Jr., Cristine Hoepers, Marcelo H. P. C. Chaves, Klaus Steding-Jessen.
8th Collaboration, Electronic messaging, Anti-Abuse and Spam Conference (CEAS'11), Perth, Australia

Exploring the Spam Arms Race to Characterize Spam Evolution [bibtex]
Pedro Calais, Dorgival Guedes, Wagner Meira Jr., Cristine Hoepers, Marcelo H. P. C. Chaves, Klaus Steding-Jessen.
7th Collaboration, Electronic messaging, Anti-Abuse and Spam Conference (CEAS'10), Redmond, WA, USA

Spam Miner: A Platform for Detecting and Characterizing Spam Campaigns (demo paper) [bibtex]
Pedro Calais, Douglas Pires, Marco Túlio Ribeiro, Dorgival Guedes, Wagner Meira Jr., Cristine Hoepers, Marcelo H. P. C. Chaves, Klaus Steding-Jessen.
International Conference on Knowledge Discovery and Data Mining (KDD '09), 2009, Paris, France.

Spamming Chains: A New Way of Understanding Spammer Behavior [bibtex]
Pedro Calais, Dorgival Guedes, Wagner Meira Jr., Cristine Hoepers, Marcelo H. P. C. Chaves, Klaus Steding-Jessen.
Sixth Conference on e-Mail and Anti-Spam (CEAS '09)

A Campaign-based Characterization of Spamming Strategies [bibtex]
Pedro Calais, Douglas Pires, Dorgival Guedes, Wagner Meira Jr., Cristine Hoepers, Klaus Steding-Jessen.
Fifth Conference on e-Mail and Anti-Spam (CEAS '08)

Information Retrieval:

An Anatomy for Neural Search Engines [bibtex]
Akio Nakamura, Pedro Calais, Davi Reis, Andre Paim
Information Sciences, 2018.

e-Commerce:

A Seller's Perspective Characterization Methodology for Online Auctions
Arlei Silva, Pedro Calais, Adriano Pereira, Fernando Mourao, Jussara Almeida, Wagner Meira Jr., Paulo Goes.
International Conference on Electronic Commerce (ICEC), 2008.

Broadband User Behavior Characterization:

Characterizing Broadband User Behavior
Humberto Marques, Leonardo Rocha, Pedro Calais, Jussara Almeida, Wagner Meira Jr., Virgilio Almeida.
Handbook of Research in Global Diffusion of Broadband Data. 1 ed. Hershey, Pennsylvania, US: IGI Global, 2008

Characterizing broadband user behavior and their e-business activities
Humberto Marques, Leonardo Rocha, Pedro Calais, Jussara Almeida, Wagner Meira Jr., Virgilio Almeida.
ACM SIGMETRICS Performance Evaluation Review, v. 32, p. 3-13, 2004

Characterizing Broadband User Behavior
Humberto Marques, Leonardo Rocha, Pedro Calais, Jussara Almeida, Wagner Meira Jr., Virgilio Almeida
The first ACM Workshop on Next Generation Residential Broadband Challenges, New York, NY (NRBC '2004)

High-performance computing:

AnthillSched: A Scheduling Strategy for Irregular and Iterative I/O-Intensive Parallel Jobs
Luis Fabricio Goes, Pedro Calais, Bruno Coutinho, Leonardo Rocha, Wagner Meira Jr, Renato Ferreira, Dorgival Guedes, Walfredo Cirne.
Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP' 2005)



Some of my other interests:

  • Austrian School of Economics
  • Stoicism
  • Cars


