Decoding Programmers: How Emotions Can Change Code
So, do you know which human emotions cause developers to commit more errors, or generate more code?
A few weeks ago, I sat down at my desk in a horrible mood. I had taken the bus to work as usual, but on this ride I only had a £10 bill, and the driver refused to make change, so it turned into a very expensive bus ride. The incident left me fuming, and as if my computer felt the same, the first few lines of code that I typed were full of errors.
As I fixed my mistakes, I couldn’t help but wonder: Would my code have been better had I been in a more positive frame of mind? Was it coincidence, or does a happy coder create more effective code?
Others have looked into whether and how moods and emotions affect quality of work, yet very little of that research has focused on developers and how the emotions they bring to their coding affect the outcome. The studies that did focus on developers either collected data in real time or asked participants to self-assess their performance. While those studies created specific environments and rules, I looked at work as it had been done in a natural setting, matching developer messages with emotions and comparing the result with the quality of the code itself.
To get my answer, I turned to a resource at Semmle’s lgtm.com site that has assessed millions of lines of code in terms of quality and includes the messages developers create when they produce their code. I focused on the three programming languages most used on lgtm.com: Java, JavaScript and Python. To match emotions with code, I analyzed the language in the developer messages using a combination of sentimentr, a dictionary-based sentiment analysis tool, and the NRC Emotion Lexicon, which classifies English words as representing one or more of the eight basic emotions: anger, fear, joy, sadness, trust, disgust, surprise and anticipation.
For example, a message reading “Damn! Forgot to update the tests,” was categorized as angry, while a message reading “partial fixes, but still broken,” was classified as sad. Once the messages were categorized by emotion, I assigned them a score from 0 to 100 based on the number of mistakes in the code -- a quality rank, if you will.
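The dictionary-based matching described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the pipeline used in the study, which relied on sentimentr and the NRC Emotion Lexicon. The `TOY_LEXICON` below is a tiny hand-made stand-in rather than real NRC data, and `tag_emotions` and `dominant_emotion` are hypothetical helper names.

```python
import re
from collections import Counter

# Hypothetical stand-in for an emotion lexicon (NOT the real NRC data):
# each word maps to the set of emotions it is assumed to signal.
TOY_LEXICON = {
    "damn": {"anger"},
    "broken": {"sadness"},
    "great": {"joy"},
}


def tag_emotions(message):
    """Count how many lexicon hits each emotion gets in a message."""
    tokens = re.findall(r"[a-z']+", message.lower())
    counts = Counter()
    for token in tokens:
        # Words missing from the lexicon contribute nothing.
        counts.update(TOY_LEXICON.get(token, ()))
    return counts


def dominant_emotion(message):
    """Return the most frequent emotion, or None if no word matched."""
    counts = tag_emotions(message)
    return counts.most_common(1)[0][0] if counts else None
```

With this toy lexicon, `dominant_emotion("Damn! Forgot to update the tests")` yields `"anger"` and `dominant_emotion("partial fixes, but still broken")` yields `"sadness"`. A real lexicon is far larger and can assign several emotions to a single word.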
For this research, I analyzed hundreds of millions of lines of code and their associated messages, which had been collected over several years. This allowed me to work with a larger sample and fewer variables than other research. While my goal was specific, and this research should not be taken as the end of the discussion, it revealed some interesting trends that appear to be broadly applicable.
The above graph illustrates the average quality rank of messages associated with eight respective emotions (a score of zero equals an average quality rank of 50). The data suggests that negative emotions such as sadness and fear are correlated with higher quality code, while emotions such as anger are correlated with lower quality code.
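The centering used in the graph (a score of zero corresponds to an average quality rank of 50) can be sketched as follows. This is a minimal illustration under stated assumptions: `centered_average_rank` is a hypothetical helper, and the sample numbers are made up purely to show the arithmetic. They are not results from the study.

```python
from collections import defaultdict
from statistics import mean


def centered_average_rank(records, baseline=50.0):
    """Average quality rank per emotion, expressed relative to the baseline.

    records: iterable of (emotion, quality_rank) pairs, rank in 0..100.
    """
    by_emotion = defaultdict(list)
    for emotion, rank in records:
        by_emotion[emotion].append(rank)
    return {emotion: mean(ranks) - baseline
            for emotion, ranks in by_emotion.items()}


# Made-up sample data, for illustration only.
sample = [("sadness", 70), ("sadness", 60), ("anger", 40), ("anger", 44)]
scores = centered_average_rank(sample)
```

A positive value means the messages tagged with that emotion accompany code of above-average quality, and a negative value means below-average quality.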
Happiness leads to productivity and vice versa
The first and most uniform takeaway was that messages with positive sentiment correspond to programming actions that change a lot of code. On top of this, positive messages don’t just reflect changes to many lines of code; they mostly crop up when lines of code are added rather than deleted. In other words, positive messages seem to correspond to productivity and creation. When people are happy, they may be less critical of their own work, so they focus on writing more code rather than fine-tuning what they already have. This begins a positive cycle: writing more code leads to happy developers, which leads to more code.
Now, you may be asking: more code sounds good, but it still has to be correct code. Do positive messages also equal cleaner code? Here, the answer seems to be no.
Sadness leads to cleaner code
Remember my quality rankings? By looking at the average quality rank across all eight emotion categories, we can determine which emotions are associated with cleaner code. What I found held true across all three programming languages: negative sentiments in the messages, and sadness in particular, are related to cleaner code, with sadness showing the highest average quality rank of any emotion by a large margin. While this may sound surprising, it actually reflects a pattern that has been observed in many fields: sadness or lowered mood often coincides with focus, improved judgment and critical thinking.
So far, this study tells us that happiness leads to more code, and sadness leads to better code. So where do the mistakes come from?
Anger leads to mistakes
This was perhaps the least surprising result that I found. When measured by average quality rank, anger was the only negative emotion that led to more mistakes. In fact, other than surprise and anticipation, it was the only emotion to have an average quality rank under 50. (Surprise is quite rare, and is only associated with more mistakes when combined with anticipation. This suggests that the real problem emotion is anticipation, perhaps because developers get ahead of themselves in finishing a task before properly checking their code.) Given anger’s known blinding effect, this result made total sense.
The humans behind the machines
In the minds of many (and in no small part thanks to Hollywood movies), developers work effortlessly, hands flying across the keyboard as they write code with ruthless efficiency while munching salty snacks. Programming is a very abstract, even foreign, concept to many, but the work that developers do matters; it leads to stronger and faster applications, better software and a safer online world. Within this context, we must remember that the coders behind the machines are not machines themselves. Like anyone’s, a developer’s work is affected by mood and well-being, which can have serious ramifications.
Albert Ziegler is a Data Scientist working at security software company Semmle.