A basic intro to data analysis for communicators: It’s math, not magic.



If you work in any communications field, literacy in data analytics is increasingly important. You’re undoubtedly aware of this, but you may not be certain what exactly the discipline consists of, how it can help you, or what you can do to become better versed in the world of data. And, frankly, the whole concept can be intimidating — media and industry embrace an image of data analysis as sorcery performed by nerd geniuses bathed in the blue glow of their monitors.

The reality is that the key concepts of data analysis are fairly simple, and you’re almost certainly using them already, whether you know it or not. Below is a look at basic terminology, categories of analysis, examples of how they’re used in communications, and recommendations for what to do next if you want to learn more.

A mess of terminology
One reason data analysis can seem overwhelming to the uninitiated is that the basic practices are obscured by a cloud of unwieldy terminology. Terms like big data, artificial intelligence, and machine learning have become buzzwords that are frequently thrown around with abandon. Ignore them for now, and focus on the foundations.

To begin with, what does “data analysis” actually mean? Data, according to Charles Wheelan, “is really just a fancy name for information” (2013, p.3). And analysis is studying something to look for meaning. So data analysis is simply studying information to look for meaning. Sounds pretty manageable, right?

There are a lot of ways to look for meaning in data, and the approach varies depending on the size of the data set, the desired outcome, and the tools and expertise that are available. The size of data sets can be relatively small or stunningly large (think Facebook, Amazon, etc.), and the tools used to analyze them range from basic statistics formulas you can work out by hand to complex algorithms run on powerful computers. But regardless of how your analysis is carried out, it will generally fall into one of two categories: descriptive or predictive analytics.

Descriptive analytics — What’s working best?
Descriptive analytics turn your raw data into descriptive statistics that summarize what happened so you can determine what works best. Descriptive statistics are based on fairly elementary math, including “counts, sums, averages, percentages” and basic arithmetic (Wu, n.d.). In the broader world, you regularly encounter descriptive statistics in the form of metrics such as grade point average and batting average — measures that are simple to calculate and easy to understand.

In communications, descriptive statistics are used to summarize how your work is performing, who’s in your audience, and how they behave. You’re probably already looking at this kind of information since digital platforms for social media, email, publishing — and pretty much anything else you do online — collect descriptive data in abundance and usually deliver it to you in an analytics interface. Stats such as pageviews, follower count, shares, likes, email opens, clicks, retweets, reach, and audience demographics are all just descriptive statistics. And if you’re paying attention to them in order to improve your work, you’re already using descriptive analysis.
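To make "counts, sums, averages, percentages" concrete, here is a minimal sketch in Python. The campaign names and numbers are made up for illustration, and the helper `percentage` is a hypothetical name, not part of any analytics platform.

```python
from statistics import mean

# Hypothetical email campaigns: (name, emails delivered, opens, clicks)
campaigns = [
    ("Spring appeal", 1000, 250, 40),
    ("Event invite", 800, 240, 60),
    ("Newsletter", 1200, 240, 24),
]

def percentage(part, whole):
    """Return part as a percentage of whole, rounded to one decimal."""
    return round(100 * part / whole, 1)

# Count and sum: how many campaigns, and how many emails in total
campaign_count = len(campaigns)
total_delivered = sum(delivered for _, delivered, _, _ in campaigns)

# Percentages: open rate and click rate for each campaign
open_rates = {name: percentage(opens, delivered)
              for name, delivered, opens, _ in campaigns}
click_rates = {name: percentage(clicks, delivered)
               for name, delivered, _, clicks in campaigns}

# Average: the mean open rate across campaigns
average_open_rate = round(mean(list(open_rates.values())), 1)

# "What's working best?": the campaign with the highest click rate
best_campaign = max(click_rates, key=click_rates.get)

print(campaign_count, total_delivered)
print(open_rates)
print(average_open_rate, best_campaign)
```

Nothing here goes beyond arithmetic, which is the point: the dashboards you already use are doing this same work for you.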

What you can do with descriptive statistics
The first thing to do with the descriptive statistics produced by your communications is simply to be sure you’re familiar with them. Know what kind of information is readily available from Facebook insights, Twitter analytics, Google Analytics, MailChimp, or whatever other digital platforms you’re using. Some trends are so obvious that just browsing your data without any special analysis will yield useful insights (e.g., your engagement stats doubled when you started using photos on your posts).

Beyond just gazing lovingly at your analytics dashboards, you can apply statistical methods to make analysis of your performance data much more powerful. The best starting point for understanding how this works in the field is A/B testing. This elementary statistical tool is so widely used and effective that it literally shapes the face of the digital world you interact with every day.

Here’s how it works: create two versions of your content and change a single element in the test version, such as the subject line of an email or the color of a button on a website. Show the different versions to different, randomly selected portions of your audience and see which performs better per a relevant metric like clickthroughs or open rate. This is where the math comes in. A basic statistical test compares the average performance of each version (a descriptive statistic) to tell you how likely it is that the difference was just random variation rather than one version genuinely getting better results. If there’s a clear winner, you know to roll it out to your entire audience.
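If you’re curious what that "basic statistical test" can look like, here is one common choice, sketched in Python: a two-proportion z-test. This is an assumption about which test to use, not a description of what any particular platform runs, and the open counts are invented for the example.

```python
from math import sqrt, erf

def two_proportion_z_test(successes_a, total_a, successes_b, total_b):
    """Two-sided z-test for the difference between two rates.

    Returns (z, p_value). A small p-value means a gap this large would be
    unlikely if only random variation were at work.
    """
    rate_a = successes_a / total_a
    rate_b = successes_b / total_b
    # Pooled rate: the overall rate if both versions performed the same
    pooled = (successes_a + successes_b) / (total_a + total_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_b - rate_a) / std_error
    # Standard normal CDF, computed with the error function
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value

# Hypothetical test: subject line A vs. subject line B, 5,000 recipients each
z, p_value = two_proportion_z_test(successes_a=1000, total_a=5000,
                                   successes_b=1100, total_b=5000)
significant = p_value < 0.05  # 0.05 is a common convention, not a law
print(f"z = {z:.2f}, p = {p_value:.4f}, significant: {significant}")
```

In this made-up case, a 20% open rate versus a 22% open rate across 5,000 recipients each clears the conventional 0.05 threshold, so version B would be a reasonable pick. With only a few hundred recipients per version, the same two-point gap could easily be noise.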

A/B testing is used to identify winning email subject lines, webpage layouts, calls to action, news headlines, photos, and advertisements, and to optimize almost any other element of the digital environment people interact with. Examples abound, but be sure to check out “The Science Behind Those Obama Campaign E-mails” (Green, 2012) for a brief account of how email testing gets results. It turned out that for the Obama campaign, testing could yield millions of extra dollars from a single email that might have been forgone otherwise.

There are certainly more complex statistical methods that are available to practitioners. Multivariate testing, for example, is similar to A/B testing in concept but tests multiple combinations of changes at one time. However, if you understand A/B testing, you’ve got the basic idea of how descriptive statistics are analyzed: you gather data on what happened, determine what’s working best, and adjust your efforts accordingly.
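To see why multivariate testing is a step up in complexity, it helps to count the versions involved. The sketch below is only an illustration; the subject lines, colors, and image types are hypothetical.

```python
from itertools import product

# Hypothetical elements to vary in a single email
subject_lines = ["Join us Saturday", "You're invited"]
button_colors = ["green", "orange"]
header_images = ["photo", "illustration"]

# Every combination of choices becomes its own version to test
versions = list(product(subject_lines, button_colors, header_images))

print(len(versions))
for version in versions[:3]:
    print(version)
```

Three elements with two options each already produce eight versions, and each version needs its own slice of your audience. That is one reason multivariate tests tend to suit larger audiences.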

Predictive Analytics — Where do we focus?
Predictive analytics use more complex math to extrapolate from existing data what’s likely to happen in the future. As Wheelan explains, “We can use data from the ‘known world’ to make inferences about the ‘unknown world’” (2013, p.6). In communications, predictive analysis is useful for focusing your time and resources when you have a big audience. You can identify who’s most likely to take an action you desire and put all your energy into giving those people a nudge.

To begin with, select the outcome you want to predict — for example, someone subscribing to your newsletter. You then randomly select two groups from your audience. Feed one group’s data to the computer so it can find a model that predicts who subscribed based on all the other data points it knows about them (e.g., how often they visit your site and how long they spend there). Next, you test the model on the second chunk of users that was held out from modeling in order to make sure it’s accurate. If it works well, you can apply it to your entire audience to get a good idea of who’s likely to subscribe in the future and focus your efforts on getting those people signed up.
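The build-then-test workflow above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the audience is simulated, the link between visits and subscribing is invented, and the "model" is a deliberately simple lookup of subscription rates by visit count, standing in for the far more sophisticated models used in practice.

```python
import random
from collections import defaultdict

random.seed(7)  # fixed seed so the sketch is repeatable

# Simulated audience of (visits_per_week, subscribed) pairs. The link between
# visits and subscribing is invented here purely to generate sample data.
def simulate_user():
    visits = random.randint(0, 9)
    chance = 0.05 + 0.10 * visits
    return visits, random.random() < chance

audience = [simulate_user() for _ in range(2000)]

# Randomly split the audience: one group to build the model, one held out
random.shuffle(audience)
training, holdout = audience[:1000], audience[1000:]

def fit(users):
    """Estimate the subscription rate for each visit count."""
    totals, subscribers = defaultdict(int), defaultdict(int)
    for visits, subscribed in users:
        totals[visits] += 1
        subscribers[visits] += subscribed
    return {visits: subscribers[visits] / totals[visits] for visits in totals}

def predict(model, visits):
    """Predict 'will subscribe' when the estimated rate is at least 50%."""
    return model.get(visits, 0.0) >= 0.5

model = fit(training)

# Check the model against the held-out group it has never seen
correct = sum(predict(model, visits) == subscribed
              for visits, subscribed in holdout)
holdout_accuracy = correct / len(holdout)
print(f"Holdout accuracy: {holdout_accuracy:.0%}")

# Apply the model: which visit counts mark the most likely subscribers?
likely_subscribers = sorted(model, key=model.get, reverse=True)[:3]
print("Visit counts to focus on:", likely_subscribers)
```

The key design choice is that accuracy is measured only on the held-out group. A model that merely memorized its training data would look great on that data and fall apart on everyone else.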

There are a few caveats to be aware of. The first is that predictive models generally need large samples to be accurate, so this approach is much more useful for organizations with massive audiences. The second is that even the best predictions are “probabilistic,” which means they tell you what’s likely to happen and aren’t a guaranteed picture of the future (Silver, 2012, p. 61).

Predictive analytics in the wild
If you’re wondering what predictive analytics look like in action, and how they might apply to your work, these examples from the realms of political communication and digital publishing help illustrate the practical uses of prediction.

Large-scale political campaigns face potential audiences of tens of millions of people, but not all of those people are going to turn out and vote for the desired candidate. So how do you avoid burning money, time, and volunteers’ energy communicating with the wrong people? In 2008, the Obama campaign pioneered a solution to this problem using predictive analytics. They “assigned every voter in the country a pair of scores based upon the probability of showing up to vote and supporting Obama. These scores were built using multiple data points, including past voting history, commercial data, and surveys, and they were used to create algorithms to predict behavior” (Price, 2017). With the results in hand, the campaign was able to target communications and get-out-the-vote efforts toward the people who were most likely to respond.

The same principles apply in digital media. Fighting for ad dollars against tech giants is a losing battle, so many publishers are turning their attention to winning more subscribers, and predictive analytics can provide a huge advantage in that effort. One recent example was documented by a couple of data scientists at the Schibsted Media Group, who created a model that “predicts the likelihood of an individual user purchasing a subscription, based on their behaviour on our websites and apps” (Cody-Kenny & Fiskerud, 2017). Before building the model, the company telemarketed to randomly selected registered users, asking them to subscribe. By using the model to target calls to likely subscribers, the company improved its conversion rate by a hefty 540%.
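A quick note on reading that number: taken literally, a 540% improvement means the new conversion rate is 6.4 times the old one (the original 100% plus another 540%), not 5.4 times. The sketch below works through the arithmetic. The 1% baseline is purely an assumption for illustration; the source reports the improvement, not the underlying rates.

```python
def apply_percent_increase(baseline_rate, percent_increase):
    """Return the new rate after a relative increase, e.g. 540 for +540%."""
    return baseline_rate * (1 + percent_increase / 100)

# Hypothetical baseline: 1 in 100 calls to random users ends in a subscription
baseline_rate = 0.01
new_rate = apply_percent_increase(baseline_rate, 540)

print(f"{baseline_rate:.1%} -> {new_rate:.1%}")
```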

Building predictive models does take significant technical expertise, but you don’t have to become a full-fledged statistician or computer scientist to get more comfortable with the practice of data analysis. Below are some suggestions for how to improve your data literacy, even if you’re a total novice.

Putting your data to work
Depending on where you work, you may have a whole team of data analysts at your disposal, or you may be trying to figure out the numbers on your own. Either way, getting more comfortable with data analysis will make you a better communicator. Here are a handful of recommendations, listed from easiest to most work-intensive, that you can use to build your skills.

1) Pay attention to descriptive statistics. This is a no-brainer. Look at the data dashboards your digital platforms offer. Get familiar with what kinds of stats they can show you, and think about which ones are most relevant to your specific goals. Be sure to examine historical data so you can identify longer-term trends and top-performing content.

2) Take advantage of automation. The excellent news for math-averse communicators is that some key statistical tools are automated on digital platforms. For example, many email platforms offer automated A/B testing, as does Facebook’s ad platform. Be sure to use these tools any time you have the opportunity; you’ll get valuable data with zero mathematical effort.

3) Get familiar with statistics. Data analysis is built on the mathematical discipline of statistics. Even a general familiarity with it can go a long way in helping you understand which numbers matter and which ones don’t. For a non-threatening intro, take a look at Charles Wheelan’s book, Naked Statistics. If you want to dig deeper, there are plenty of free stats courses online.

4) Go deep. If you really fall in love with this stuff, or want to stand out among comms practitioners, there’s no limit to what you can learn in online courses (often for free). For a starting point, check out David Venturi’s Data Science curriculum; he created his own master’s degree in data science using only online classes.

There you have it. Data analysis isn’t magic; it’s just math, and even basic arithmetic is enough to get you started. Analytics aren’t going to replace creative thinking or strategic planning in communications, but they are an input that should shape those undertakings so the results are more effective.

Cody-Kenny, C., & Fiskerud, E. (2017, September 20). Growing News Subscriptions with Data Analytics. Retrieved from http://bytes.schibsted.com/growing-news-subscriptions-data-analytics/

Green, J. (2012). The science behind those Obama campaign e-mails. Bloomberg Businessweek.

Price, M. (2017). Engagement organizing: The old art and new science of winning campaigns. UBC Press.

Silver, N. (2012). The signal and the noise: why so many predictions fail — but some don’t. Penguin.

Wheelan, C. (2013). Naked statistics: Stripping the dread from the data. WW Norton & Company.

Wu, M. (n.d.). Big Data Reduction 1: Descriptive Analytics. Retrieved from https://community.lithium.com/t5/Science-of-Social-Blog/Big-Data-Reduction-1-Descriptive-Analytics/ba-p/77766

Photo source: NBC Television 1973 https://commons.wikimedia.org/wiki/File:Bill_Bixby_The_Magician_1973.JPG

Brent Merritt