This is my second attempt at analyzing the emails that my 9 best friends share with one another. I feel that this time around, our top words truly illustrate the year we went through together. It was one filled with moves, new relationships, engagements, weddings and lots of other large life events.
Over the course of the year, my friends and I sent 96 emails to one another. We typically shoot for one a month from each of the 9 of us, but sometimes people get busy and miss a month. We structure those emails into “highs” – the high points in our lives and “lows” – the stuff that isn’t so great that month. Recently people have been including lots of “middles” or “meh’s” as part of their update, so for simplicity I counted all of those as highs. I ran my analysis on the overall content of the emails, as well as on the highs and lows separately.
Like the year prior, I took the top words overall, for each month and for each person. I also took a look at the words that defined each of us, as well as the sentiment analysis for our emails. For fun, I wanted to see how many times each of our significant others was mentioned, and I did some year-to-year comparisons for word frequencies and sentiment. Since it was a big year for weddings, my friends also wanted to know the rankings of how many times we each said “wedding.”
This is pretty basic: I just took the frequency of each word, excluding the most commonly used words from the English corpus. This was a little tricky because one of my friends is dating someone named Will, which happens to be a very common verb. The analysis said he was mentioned 7 times, which was a lowball answer, but with common words included, “will” showed up 111 times. The easiest solution? I asked my friends to stop dating verbs and nouns (*cough* Will and Miles *cough*). The frequencies came out to the following:
Work really tops a lot of our conversations, which isn’t a big surprise. We are continuing to use the word “really” a lot, so maybe we need to brush up on our vocabulary. Clearly we had a wedding in June and in October, as that word topped the charts those months. Compared with 2015, nothing was too astounding overall: we love our weekends, we love to talk about work and a lot of us feel strongly about projects.
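For anyone who wants to try this at home, here is a minimal sketch of that kind of word count in Python. It is not the exact script behind the numbers above: the stop-word list is a tiny made-up one, the sample emails are invented, and `top_words` is just a name I picked for illustration. It also shows the “Will” problem, since “will” sits in the stop-word list and gets thrown out whether it is a verb or a boyfriend.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); a real run would use a fuller one.
STOP_WORDS = {"the", "a", "and", "to", "of", "i", "we", "is", "will", "it", "in"}

def top_words(texts, n=10, stop_words=STOP_WORDS):
    """Count words across all texts, skip stop words, return the n most common."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in stop_words)
    return counts.most_common(n)

# Invented sample emails, only to show the shape of the input.
emails = [
    "Work was really busy and Will and I went to a wedding",
    "Work is really fun and we will travel in June",
]
print(top_words(emails, n=2))
```

Running the same function on one month's emails, or one person's, gives the per-month and per-person lists.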
My favorite part is seeing data take the shape of my friends. Just the words that get pulled tell the story of the year. I found these words by plotting how often each individual used a word against how often the whole group used it. I first pulled any outliers on the chart: those in the top left corner show high personal use but low group use. Some people didn’t have very strong defining words (maybe they used a certain word 9 times out of 19), so I then looked at their top 20 or so words and tried to pull out the ones with a relatively small gap between personal and group use, which meant that individual was responsible for most of the uses of that word. For example, Kayla said “Leo” 34 out of the 35 times he was mentioned. Makes sense, since they’re engaged. If that didn’t fill up the top 5, I then looked at words they used very frequently that no one else used.
I also took a look at the opposite, which is a bit more tricky: the top words from the group that the individual never used. This list is less telling.
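The chart-and-eyeball approach above can also be roughed out in code. This is only a sketch under my own assumptions: it ranks a person's words by the share of the group's total uses that came from them, with a minimum-use cutoff (`min_uses`) that I made up. Apart from the 34-out-of-35 “leo” figure, every count below is invented, and the function names are hypothetical.

```python
from collections import Counter

def defining_words(person_counts, group_counts, min_uses=5, top=5):
    """Rank a person's words by their share of the whole group's uses."""
    scored = []
    for word, personal in person_counts.items():
        total = group_counts.get(word, 0)  # group total includes this person
        if personal >= min_uses and total > 0:
            scored.append((word, personal / total, personal))
    # Highest share first; ties broken by raw personal count.
    scored.sort(key=lambda item: (item[1], item[2]), reverse=True)
    return scored[:top]

def unused_group_words(person_counts, group_counts, top=5):
    """The group's most common words that this person never used."""
    ranked = [word for word, _ in group_counts.most_common()]
    return [word for word in ranked if word not in person_counts][:top]

# Only the 34/35 "leo" figure is real; the other counts are invented.
kayla = Counter({"leo": 34, "work": 10})
group = Counter({"work": 120, "gmat": 40, "leo": 35})

print(defining_words(kayla, group))
print(unused_group_words(kayla, group))
```

A share close to 1 corresponds to a point in the top left corner of the chart: lots of personal use, hardly any from anyone else.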
UZ was prepping for the GMAT quite a bit, as well as traveling to China. Michelle was in the middle of planning her wedding. Fer had moved to San Francisco, Nancy picked up running this year, Pranitha got a break between her projects (?), Todd got married and had a lot to say about his job, I got into grad school, Spencer has been attempting a career change into design, and Kayla moved to Dallas and took a trip to Norway.
While I finish my natural language processing course, I used an online tool to help identify the sentiment of everyone’s emails. I figured that since we split our emails into highs and lows, most would turn out to be neutral overall, so I also ran the analysis on highs and lows separately and wanted to see who was having “the best year ever.” Since a lot of the words are specific to their context, I’m hoping to build my own dictionary and run this analysis again myself.
As expected, highs were usually positive and lows were usually negative. Todd tends to use more negative words to describe things and Michelle is clearly a robot. I think it’s Fer to say that Fer had a really great year.
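I haven’t built that dictionary yet, but here is a rough sketch of what a dictionary-based approach could look like. To be clear, this is not what the online tool does. The word list and scores in `LEXICON` are invented, the 0.25 cutoff is arbitrary, and the function names are hypothetical. Each email gets the average score of the lexicon words it contains, and anything with no hits counts as neutral.

```python
import re

# Hypothetical hand-built lexicon: word -> score. Words and values are invented.
LEXICON = {
    "great": 1, "love": 1, "fun": 1, "excited": 1,
    "stressful": -1, "tired": -1, "sick": -1,
}

def sentiment_score(text, lexicon=LEXICON):
    """Average score of the lexicon words found in the text (0.0 if none)."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [lexicon[w] for w in words if w in lexicon]
    return sum(hits) / len(hits) if hits else 0.0

def label(score, threshold=0.25):
    """Turn a score into positive / negative / neutral using an arbitrary cutoff."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"

# Invented sample sentences standing in for a high, a low and a plain update.
for text in ["The trip was great and so fun",
             "Work was stressful and I was tired",
             "I moved to a new city"]:
    print(label(sentiment_score(text)), "-", text)
```

The appeal of a custom dictionary is that context-specific words can be scored the way our group actually uses them.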
So how many times did we end up saying “wedding”? 85. Michelle topped the list, as expected, with 19 mentions. Not included in this count are “wedding(s)” and “wedingssssss.”
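For the curious, a per-person ranking like this can be sketched with a couple of regular expressions. This is an illustration under my own assumptions, not the script behind the 85: the sender names and email text are invented, and the looser pattern is just one guess at how the plural and the stretched-out misspelling could be folded in.

```python
import re
from collections import Counter

EXACT = re.compile(r"\bwedding\b")      # only the exact word
LOOSE = re.compile(r"\bwed+ings*\b")    # also "weddings", "wedingssssss"

def rank_mentions(emails, pattern):
    """emails: iterable of (sender, text). Returns [(sender, count)], highest first."""
    counts = Counter()
    for sender, text in emails:
        counts[sender] += len(pattern.findall(text.lower()))
    return counts.most_common()

# Invented senders and text, only to show the difference between the patterns.
sample = [
    ("friend_a", "Wedding planning! The wedding venue is booked."),
    ("friend_b", "So many weddings this year. Wedingssssss!"),
]

print(rank_mentions(sample, EXACT))
print(rank_mentions(sample, LOOSE))
```

With the exact pattern, the plural and the misspelling are skipped, which matches how the count above was taken.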
Here is the infographic in case you’re interested in seeing some more fun facts: