
What The Data Science Of Email Can Tell Marketers


by Aaron Beach
Senior Data Scientist


Article Highlights:

  • Data science sits at the intersection of social science, statistics, computer science, and design.
  • Predicting engagement rates is one way email data is being applied to real-world situations.
  • What fundamentally matters is whether sending an email to a particular recipient will generate value.

No doubt, we’re in the midst of a data explosion. Social media, digital photos, transaction records, and emails--these are all part of the big data puzzle. In fact, every digital clue, crumb, or record that we create every day is part of the big data landscape.

How to effectively collect, manage, extract insight from, and take action on this avalanche of data are questions that every organization now faces. 

Companies that want to analyze these vast quantities of stored data and convert them into meaningful insights have turned to a new discipline called data science. Data science sits at the intersection of social science, statistics, computer science, and design. With the help of new technologies and specialized talent, data science is able to identify patterns and regularities in data of all sorts to better understand, predict, and cater to customer behaviors.

So what are some of the early areas where companies can start putting their big data investments into action? Digital touch points such as email, which has become the backbone of nearly every Web and mobile application, seem like a good place to start. With 182 billion email messages sent each day, according to the Radicati Group, marketers and email service providers (ESPs) can leverage big data infrastructure and data science to assess customer behaviors and uncover science-backed strategies that ultimately make their campaigns more successful.

Through a deep dive into back-end email systems, data scientists can profile inbox behaviors to learn how to reduce spam complaints, increase delivery, and improve recipient engagement.

ESPs receive many different forms of data. Of all the information that is collected, the most telling signals are the events generated within the life cycle of an email--SMTP responses, opens, clicks, and spam reports. Data scientists collect hundreds of millions of these events daily, and when analyzed collectively, they paint a picture of user preferences, inbox habits, and spam folder policies.
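As a rough illustration of what analyzing these events collectively can mean, the sketch below tallies event types per recipient and derives a simple open rate. The record layout, the event names, and the sample addresses are all assumptions made for this example, not a description of any particular ESP's data.

```python
from collections import Counter, defaultdict

# Hypothetical event records: (recipient, event_type) pairs.
EVENTS = [
    ("ann@example.com", "delivered"),
    ("ann@example.com", "open"),
    ("ann@example.com", "click"),
    ("ann@example.com", "delivered"),
    ("bob@example.com", "delivered"),
    ("bob@example.com", "spam_report"),
]


def summarize_events(events):
    """Count how many times each event type occurred for each recipient."""
    summary = defaultdict(Counter)
    for recipient, event_type in events:
        summary[recipient][event_type] += 1
    return summary


def open_rate(counts):
    """Opens divided by deliveries; 0.0 when nothing was delivered."""
    delivered = counts["delivered"]
    return counts["open"] / delivered if delivered else 0.0


summary = summarize_events(EVENTS)
for recipient, counts in sorted(summary.items()):
    print(recipient, dict(counts), open_rate(counts))
```

Per-recipient counts like these are only a starting point; at the volumes described above, the same tally would normally run on distributed infrastructure rather than in a single process.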

This research tells data scientists a lot about what happens to an email after it is delivered. For instance, it may be used to infer the policies of a particular inbox provider such as Gmail or Yahoo, or even whether a particular recipient maintains a clean inbox.

Predicting engagement rates is another way email data is being applied to real-world situations. For instance, if a particular recipient is signed up to receive email alerts from a daily deal site, but hasn’t opened or clicked on any email for the past two weeks, we might infer that he is on vacation. But if the daily deal site doesn’t know that, it will continue to send messages and clutter the recipient’s inbox. On the other hand, if the recipient is actively engaged with his inbox but is ignoring a particular sender’s email, it sends a strong message that the recipient is not interested. In either case, whether by sending fewer email messages or by pausing until engagement resumes, email senders can proactively respond to these data signals from the email ecosystem.
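The two cases in this example can be sketched as a simple rule. This is a hand-written heuristic under stated assumptions, not a trained prediction model: the function name, the labels, and the 14-day window are hypothetical, and it presumes access to a recipient's last engagement across all senders, which an ESP may have but an individual sender usually does not.

```python
from datetime import date, timedelta


def classify_recipient(today, last_engaged_with_sender,
                       last_engaged_with_anyone, quiet_days=14):
    """Label a recipient from two 'last engagement' dates (None = never).

    quiet_days is an assumed threshold matching the two-week example.
    """
    cutoff = today - timedelta(days=quiet_days)
    if last_engaged_with_sender is not None and last_engaged_with_sender >= cutoff:
        return "engaged"
    if last_engaged_with_anyone is None or last_engaged_with_anyone < cutoff:
        # Quiet across every sender: possibly away, so pausing may make sense.
        return "away"
    # Active elsewhere but ignoring this sender: likely not interested.
    return "uninterested"


today = date(2024, 3, 15)  # illustrative date only
print(classify_recipient(today, date(2024, 2, 1), date(2024, 2, 20)))
print(classify_recipient(today, date(2024, 2, 1), date(2024, 3, 14)))
```

A sender might pause mail for "away" recipients until engagement resumes and reduce frequency for "uninterested" ones; a real system would likely replace the fixed threshold with a model fitted to historical engagement.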

This event data can be used to identify clusters of senders and recipients centered around particular topics. These topical clusters can be used to predict the kind of email a new sign-up might prefer or which campaigns should (and should not) be sent to a newsletter subscriber. Topic clusters can also be used to automatically split email lists into segments whose members prefer the same types of email. Together with engagement predictions, this gives marketers a clear expectation of what kind of response an email campaign may elicit.
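To make list segmentation concrete, here is a deliberately simplified stand-in for topic clustering: it assumes each click already carries a topic label and places each recipient in the segment of the topic they click most often. The data, names, and labels are hypothetical; in practice the topics would more likely be learned from the event data than assigned by hand.

```python
from collections import Counter, defaultdict

# Hypothetical click log: (recipient, topic of the clicked email).
CLICKS = [
    ("ann", "travel"),
    ("ann", "travel"),
    ("ann", "food"),
    ("bob", "food"),
    ("cy", "travel"),
]


def segment_by_top_topic(clicks):
    """Group recipients by the topic they clicked on most often."""
    per_recipient = defaultdict(Counter)
    for recipient, topic in clicks:
        per_recipient[recipient][topic] += 1

    segments = defaultdict(list)
    for recipient, counts in sorted(per_recipient.items()):
        # Highest count wins; ties are broken alphabetically by topic name.
        top_topic = min(counts, key=lambda t: (-counts[t], t))
        segments[top_topic].append(recipient)
    return dict(segments)


print(segment_by_top_topic(CLICKS))
```

Assigning each recipient to a single segment keeps the sketch short; a recipient with mixed interests could reasonably belong to several.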

Many marketers take a more-is-better approach to stats, preferring to slice and dice their email lists by numerous arbitrary measures. However, what fundamentally matters is whether sending an email to a particular recipient will generate value. Data scientists can design high-level insights, such as engagement prediction models and content clusters, that allow marketers to cut through the noise and design their campaigns around strong, predictive signals, rather than arbitrary statistics.

Ultimately, using smart data models to make sense of big data allows complex systems--including the global email ecosystem--to be optimized. As email-as-a-service becomes more of a commodity, this critical mass of data and the ability to make sense of it will be a true differentiator.

About Aaron Beach

Aaron Beach is a senior data scientist at email delivery service SendGrid. He has experience in email services, energy systems, privacy, social networks, mobile apps, natural language processing, recommendation systems, and big data. Beach has a Ph.D. in computer science.