Ok, a list of Excel formulae is unlikely to grab you and make you want to keep reading. But it should.
I've seen the ability to play with data and prove an RoI (beyond the discredited AVE) win new business time and again. Excel capability isn't seen as sexy (certainly not by most people I know, at least) but it is essential in winning work, and even in creating stories for your clients. And yet as an industry we don't play enough or code enough.
While hardly scientific, a quick straw poll of friends shows several struggle with data analysis and ignore it as a result – with most of those asked listing Excel skills as a key reason why they aren't doing more evaluation to prove the effects of their work. That, and a fear that it won't show anything – but I've not yet seen PR that doesn't deliver a positive RoI as a whole.
I’ve therefore listed some of my favourite formulae below that will help you play around with a table and your clients’ report data. These have real examples that can be copied.
What should you measure?
Before I begin, one of the best ways to monitor the effectiveness of an awareness PR campaign is to monitor traffic. Your aim is to get people to the store (web or real-world) or at least to a place where they can find out information.
Measuring sales would be great (and you can certainly correlate against this IF your client gives you the information) but this is secondary. Did the sale take place because the sales assistant was amazing? Did it fail because the website was slow to load, or the design didn't take people to the right information easily? So: web hits. It's also a good idea to remove bounces (visiting one page then leaving) from web-traffic figures as these can skew results – for example if a mainstream media publication writes about you and thousands of people click a link, but 95% of them don't want or need a niche B2B product.
Google Trends data also gives you a good idea on salience, but you can’t dig in the same way.
1: How to measure the base rate for web traffic
To get the data you can either import it from the Analytics page, or link an Excel / Google Sheets file using a Google Analytics plug-in.
Once you have this data you need a base rate to show that your work has had an effect. For this you need an average. The mean is the sum of all days divided by the number of days. The median is the middle number – the 50th of 99 values, for example. On top of this there's the mode (the most common number) and the IQR (inter-quartile range).
The median will give you the best result as it won’t be affected by large-traffic-spike skews.
To do this simply type =MEDIAN([Range]) – see below.
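As a rough illustration of why the median is the safer base rate, here's a minimal Python sketch with invented traffic figures – the logic mirrors =MEDIAN and =AVERAGE in Sheets:

```python
from statistics import mean, median

# Hypothetical daily web-traffic figures, standing in for an Analytics export
daily_hits = [210, 195, 480, 205, 190, 60, 55]  # one week, made up

base_rate = median(daily_hits)  # robust to the 480 spike
avg = mean(daily_hits)          # dragged up by the spike

print(base_rate)         # 195
print(round(avg, 1))     # 199.3 – already skewed by a single big day
```

With only one spike the mean is already higher than the median; over a year of campaign spikes the gap grows, which is why the median makes the more honest base rate.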
2: How to measure the base rate for web traffic for a specific condition – eg weekday traffic
The problem is that – especially if you're a B2B business – you will introduce an error by including the weekend's traffic, which will be either significantly lower (B2B) or higher (B2C). To eliminate this you can add an IF condition to the median. Either remove the weekend data points, or type =MEDIAN(IF([If Range]=[Value], [Median Range])) – in older versions of Excel this must be entered as an array formula with Ctrl+Shift+Enter. Below this is done to calculate the median value for a Monday – with If Range being a column stating the day of the week, Value being Monday and Median Range being web traffic for all days.
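The same conditional median can be sketched in Python – the day names and traffic figures below are invented:

```python
from statistics import median

# (weekday, hits) pairs – invented figures standing in for two Analytics columns
rows = [("Mon", 210), ("Tue", 195), ("Mon", 250), ("Sat", 60), ("Mon", 230)]

# Equivalent of =MEDIAN(IF(day_range="Mon", traffic_range))
monday_median = median(hits for day, hits in rows if day == "Mon")

print(monday_median)  # 230
```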
3: How to understand how big these spikes are vs other traffic
For this you'll need either the rank, which allows you to give an absolute figure: "The press release correlated with the second-biggest-ever day's traffic, jumping from a median of 200 web hits to 500." The formula for this is =RANK([Number to test], [List of numbers], 0). The zero says it's a descending list (1 = highest number), and can be changed to 1 for an ascending list, with the lowest number ranked 1.
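A quick Python stand-in for =RANK with a descending list, using made-up daily totals:

```python
daily_hits = [210, 195, 500, 205, 480]  # invented daily traffic totals

def rank_desc(value, values):
    """Mimic =RANK(value, values, 0): 1 = highest number."""
    return sorted(values, reverse=True).index(value) + 1

print(rank_desc(480, daily_hits))  # 2 – the second-biggest day
print(rank_desc(500, daily_hits))  # 1
```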
Or, preferably, a Standard Deviation, which says just how unusual this jump was. The standard deviation lets you understand how much variation there is within a population, so you can see if it’s high or not. If it’s normally distributed (and web traffic typically is) then 95% will lie within 2 standard deviations.
The formula is the square root of the sum of the squared differences between each number and the mean, divided by the count (minus one for a sample, which is what =STDEV assumes). Thankfully the Excel / Google Sheets formula is just =STDEV([Range]):
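If you want to sanity-check the spreadsheet output, Python's statistics module does the same job. Note that =STDEV is the sample version (dividing by n-1), while =STDEVP is the population version (dividing by n). The figures below are invented:

```python
from statistics import pstdev, stdev

daily_hits = [200, 190, 210, 190, 210]  # invented daily totals

sample_sd = stdev(daily_hits)       # like =STDEV(range), divides by n-1
population_sd = pstdev(daily_hits)  # like =STDEVP(range), divides by n

print(sample_sd)                # 10.0
print(round(population_sd, 2))  # 8.94
```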
4: How to quantify the spike, just how non-normal is this data point?
Using an anonymised client's data as an example, we could see that two recent results increased traffic, with on-the-day totals that were the second and 11th highest ever recorded.
More interestingly, however, with a standard deviation of nearly 37, we were able to show that the first one (which was genuinely interesting news) shifted traffic to 4.9 standard deviations from the norm. Which is a very significant shift.
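That "standard deviations from the norm" calculation is just (spike minus base rate) divided by the standard deviation. A Python sketch with invented figures, chosen so the spike lands at roughly 4.9 standard deviations like the example above:

```python
from statistics import median, pstdev

# Invented series of ordinary days, plus a spike on announcement day
daily_hits = [200, 195, 205, 190, 210, 198, 202]
spike = 230

base = median(daily_hits)       # base-rate traffic
sd = pstdev(daily_hits)         # spread of an ordinary day
z = (spike - base) / sd         # how many SDs the spike sits from the norm

print(round(z, 1))  # 4.9
```

In a normal distribution anything beyond 2 standard deviations is already in the outer 5%, so a shift of nearly 5 is very hard to dismiss as chance.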
5: How to normalise Google Analytics’ daily output and Google Trends Weekly output
Google Trends outputs data at a granularity that depends on how long ago it was collected and how long a period you request. If it's less than a week old you can see it hourly; the company even stores and makes data available on a minute-by-minute basis for the first hour. But as data gets older, users can only access daily statistics (if less than 271 days old), weekly (if less than 5 years) or monthly thereafter.
While you shouldn't overlay Google Trends and Google Analytics data onto one another, the mismatch does make it harder for your client / CEO to interpret, as the X-axis scales are different.
Option 1 is to set Google Analytics to report weekly, but this causes a secondary issue of it not matching the same start date as Google Trends (which can't be controlled). Option 2 is to convert the Google Analytics data from daily (or hourly if you're so inclined) to weekly.
Here a SUMIFS will help, allowing you to quickly convert Google Analytics data into the same format as the Trends data: =SUMIFS([Users], [Date Range], ">"&[Start Date - 1], [Date Range], "<"&[End Date]). See the example below. Here column B is users, A is date, and H2 / H3 are specific dates taken from Google Analytics data.
NB – H2 - 1 is used because ">" rather than ">=" has been used.
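The same weekly roll-up can be sketched in Python with made-up dates and user counts – the start <= date < end test is the equivalent of the two SUMIFS criteria:

```python
from datetime import date, timedelta

# Invented daily Analytics export: (date, users)
daily = [(date(2024, 1, 1) + timedelta(days=i), 100 + i) for i in range(14)]

def sum_between(rows, start, end):
    """Mimic =SUMIFS(users, dates, ">"&(start-1), dates, "<"&end):
    sum users where start <= date < end."""
    return sum(users for d, users in rows if start <= d < end)

week1 = sum_between(daily, date(2024, 1, 1), date(2024, 1, 8))
print(week1)  # 721 – the first seven days' users summed
```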
6: How to cross reference data onto a web traffic graph?
Taking this one stage further, you can then plot this to show which activities (releases / bylined articles / other) have an effect on web traffic, and which don’t.
People tend to prefer visual information: give them a graph showing a clear and continuing increase in searches, traffic or share price and they'll be able to evaluate it. Take the below anonymised web data showing weekly web hits (black line), year-on-year data (light grey dotted line) and the most recent two announcements (grey columns) – NB, the data is from a new client, and the previous agency didn't hand over information on when older releases were distributed.
To pull in the release data – which weeks had a release go out – take a similar approach to the above formula, but use COUNTIFS, searching a column of dates on which releases were distributed. This time: =COUNTIFS([Range], ">"&[Start Date - 1], [Range], "<"&[End Date])
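A quick Python equivalent of that COUNTIFS step, using hypothetical release dates:

```python
from datetime import date

# Hypothetical release-distribution dates pulled from a coverage log
releases = [date(2024, 1, 2), date(2024, 1, 4), date(2024, 1, 10)]

def count_between(dates, start, end):
    """Mimic =COUNTIFS(range, ">"&(start-1), range, "<"&end)."""
    return sum(1 for d in dates if start <= d < end)

print(count_between(releases, date(2024, 1, 1), date(2024, 1, 8)))  # 2
```

Run once per week bucket, this gives the column of release counts you can plot alongside the weekly traffic.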
One of the best examples I had of this came from the power supply sector, where one company had placed a huge number of articles in a vast array of papers over a six-month period. Quickly we were able to see that some were having an effect and some weren't. And when we dug deeper, colour-coding each publication, we were able to see just three titles had an effect on web traffic: the major industry titles EE Times and Electronic Design, plus the niche vertical Power Systems Design. Instantly we were able to focus on fewer articles that drove greater engagement.
7: Cross referencing data
When we look at the effect of coverage on web traffic, it's worth highlighting that we have fought long and hard to get a link in said coverage to help drive SEO. And while not all coverage will have this in-bound link, for those that do, we can measure the number of people visiting the site by clicking these links. It's not perfect, but it is a good metric, and the easiest way to do this is using Coverage Book's excellent Inform The Client platform, which gives referrals from any URL very easily and for free, in a downloadable table.
Once you have this (or in the case below, share-count data from Buzz Sumo / SharedCount) you need to bring this information back into your coverage table. Here the VLOOKUP formula comes into its own: =VLOOKUP([cell to search for], [range to search in], [number of the column to return, counted from the left of that range], [TRUE if the first column is sorted, FALSE for an exact match]).
Note – the VLOOKUP function can be demanding if it's running on a large dataset or spreadsheet. Once the analysis is run, it's worth pasting as values to reduce memory consumption.
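For the curious, here's a small Python stand-in for an exact-match VLOOKUP, using invented URLs and share counts:

```python
# Invented share-count export: (url, shares) rows
shares = [
    ("example.com/article-1", 120),
    ("example.com/article-2", 45),
]

def vlookup(key, rows, col):
    """Rough =VLOOKUP(key, rows, col, FALSE) equivalent: find the row
    whose first cell matches key and return column `col` (1-indexed,
    as in Sheets). Returns None where Sheets would show #N/A."""
    for row in rows:
        if row[0] == key:
            return row[col - 1]
    return None

print(vlookup("example.com/article-2", shares, 2))  # 45
```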
8: Breaking down long lists of data
By bringing these together, you can start to look at which publications and which titles are the most influential, even if your client's data set doesn't include them.
Here we take a look at the most influential English-speaking titles to cover 5G technology topics – using the above formulas, plus the UNIQUE formula to list unique values for subsequent analysis: =UNIQUE([Range]).
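A minimal Python version of the UNIQUE step, with an invented publication column:

```python
# Publication column from a coverage table – invented entries
titles = ["EE Times", "Electronic Design", "EE Times",
          "Power Systems Design", "Electronic Design"]

def unique(values):
    """Mimic =UNIQUE(range): keep first-seen order, drop duplicates."""
    return list(dict.fromkeys(values))

print(unique(titles))  # ['EE Times', 'Electronic Design', 'Power Systems Design']
```

The de-duplicated list then becomes the row labels against which you count articles and sum shares.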
From here we can plot the data (removing the labels of ineffective titles to aid visibility) and instantly see which sites are the easiest targets (X axis – article count), and which do / don’t get shared (Y axis – share count) and from this where to focus our energies.
NB – Meltwater was used to identify relevant topics with data parsed to remove media wire coverage (which has very little effect).