The Art of Data Science: Understanding Data’s Role in Business
Summary: What Businesses Get Wrong When it Comes to Analytics
Many businesses look at the wealth of data available to them and think: “Finally, I’ll have all the answers I need.” Meanwhile, every great data scientist is shaking their head… Because they know that data science is about more than the data (or even the science).
One such master of the art of data science is Andy Hasselwander, Chief Analytics Officer at MarketBridge, who joins the show to share what businesses (and many data scientists) get wrong about analytics.
In this episode, we discuss:
- Common mistakes companies make with data analytics
- The measurement trap (and how to avoid it)
- Why data specialists need to better understand the businesses and industries they work in
Mark: So how would you describe to someone who doesn’t have a background in this (analytics) the difference between, or the relationship between data and analytics?
A lot of times we just talk about signal. If you think about the human brain and just walking around outside, data are all the things that are coming in through your eyes and your ears, the raw inputs. You know, color or shape, motion, etcetera, and analytics is what your brain does with it. Right? So it’s finding meaning. That’s probably the simplest explanation. We just had a new class of analysts start at MarketBridge and I looked up the etymology of the word analytics, and it comes from Greek and sort of means untying a knot or tearing apart a problem, which is another way to think about it.
Think about some impenetrable problem, and picking at the loose ends and picking at the strands of the rope and then figuring out how that thing all lays out. That’s sort of unstructured analytics, and then you can get into more predictive things as well. But what’s interesting about data is that people today generally talk about big data. Like a lot of things in our field, it’s overused, but really what is interesting about data today is simply scale. There’s just more data being created by machines today than there ever was. And those machines include, and a lot of people don’t think about this, marketing machines. Marketing machines are everything that’s out there: digital ad networks and so forth. We also spend a lot of time, by the way, with small data, such as survey data and things like that. We think those are interesting too.
There’s a huge amount of technology chasing the big data problem and probably quite a few executives who are not quite familiar enough to realize they may sometimes be barking up the wrong tree.
You know, there’s the scope and there’s the scale of an organization’s data. The scope is often much more challenging than the scale. So if you think about it, I might have 25 data sources, some of which might be look-up tables that only have a couple of hundred records, but I’m only really using five. I might be missing a tremendous amount of insight there.
Mark: I think a lot of people are used to thinking about measurement as kind of a be-all and end-all. But what is the limit of the value of data? Understanding you have to have it in order to power analytics, but absent analytics, how much value can you actually get from data?
You know, we talk a lot about the measurement trap. There’s an idea that “if you can’t measure it, it doesn’t exist.” There are channels that are inherently more measurable. Direct mail is a great example. Direct mail is extremely measurable. You know, we can predict, based on a mailing and a model, exactly when phone calls are going to come in or, if there’s a vanity URL, when those responses are going to come in, and how many leads those are going to create. So sometimes direct mail might, for that reason, be used more often than something that might be more difficult to measure.
The hardest thing to measure for brands generally is up-funnel advertising, brand advertising.
There are companies, particularly in the more finance-driven space, that are used to measuring everything and have decided to shy away from up-funnel brand advertising for that reason. That’s the measurement trap. And you can get yourself in trouble where you start circling the drain on optimizing these super measurable channels. So I think that’s one of the limits.
Just because you can’t measure it doesn’t mean it’s not worth doing.
Now a CFO might argue with you and say, “Show me the money.” But you might just have to dig a little bit harder to find your evidence, and there are a lot of ways to do that, but it requires a little more creativity and maybe even a little faith sometimes.
Mark: When you get into a conversation like this with a marketer who understands, in conversational terms, that marketing takes time to pay off but has never really thought about a mathematically calculated time lag, how do you typically address that?
The best way to convince a skeptic, or at least to lead somebody along a path towards econometric measurement, which is the fancy way to say it, is with case examples. And there are good ones out there. There’s a lot of academic work that’s been done, dating back to the seventies, on the utility of econometric models to understand up-funnel marketing’s contribution to enterprise value. There are also meta-studies that have been done. One of the best ones is a study called “The Long and Short of It,” which came out of the UK maybe seven or eight years ago. It looked at, I think, a hundred or so consumer brands and, over the long run, again with time lag, at a metric called extra share of voice, which is share of voice minus market share.
It’s important to get the difference between the two. If your share of voice is higher than your market share, you have a positive extra share of voice. If it’s lower, you have a negative one. A pretty simple metric. And the relationship between that and overall brand performance was positive. Then you get into the questions after somebody says, “Okay, I get that this can be done, and I get there’s a relationship, but how can I do it?” And that can be a challenge, particularly when someone doesn’t have a lot of track record. One of the hardest things is trying to get a company that has not done a lot of up-funnel measurement investment to start, because that’s where you do require the leap of faith.
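To make the extra share of voice arithmetic concrete, here is a minimal sketch in Python. Only the formula (share of voice minus market share) comes from the conversation; the function name, the brand names, and the percentages are made up for illustration.

```python
def extra_share_of_voice(share_of_voice: float, market_share: float) -> float:
    """Return extra share of voice (ESOV) in percentage points.

    ESOV = share of voice - market share, as described in the interview.
    Both inputs are percentages, e.g. 12.0 means 12%.
    """
    return share_of_voice - market_share


# Hypothetical brands: (share of voice %, market share %)
brands = {
    "Brand A": (12.0, 8.0),
    "Brand B": (5.0, 9.0),
}

for name, (sov, share) in brands.items():
    esov = extra_share_of_voice(sov, share)
    direction = "positive" if esov > 0 else "negative" if esov < 0 else "zero"
    print(f"{name}: ESOV = {esov:+.1f} points ({direction})")
```

In this made-up example, Brand A is advertising above its market share and Brand B below it. The sketch says nothing about how large the effect on brand performance would be; that is what the long-run studies estimate.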
You can say, “I’ll show you these case studies.” But ultimately, you have to do a test. And one of the challenges is that the conservative test is to do the bare minimum. “I’m going to spend a half-million dollars, you know, in four weeks.” Well, that’s not going to show anything. So, you know, you need to really dig in. That job requires a huge amount of guts.
Mark: If we look at analytics, historically it has continued to gain efficacy, impact, and credibility. And yet, there has been a lag in how analytics are actually operationalized. What do you see as the biggest challenges and where do you see it going?
You know, I think there are a lot of challenges. We could talk about the challenge of finding talented people to actually run these models or to collect these data. That certainly is out there today. And right now, in 2021, it seems to be as hard as I can ever remember it being to find really good people.
I think the challenge, though, from an executive perspective, the thing a lot of executives or aspiring executives sometimes miss, is how important it is to be systematic.
And what I mean by that is, when you think about data, for example, think of a log file for a website. You know, I have a date timestamp, I have some kind of a user ID, I’ve got all this; that’s big data. I might have millions and millions of these log files being generated per day. Okay, that’s fine. The challenge is the look-up tables that go against that. For example, if you’re in salesforce.com, it’s the pick-lists, right? The three audience dimensions that you choose, the age bands that you choose, the campaign objectives, those simple things, you know, audience definitions. And what you find in so many organizations is you might have different pick-lists between systems. So, for example, between a marketing planning system and a marketing measurement system like Adobe’s: different pick-lists.
You might have dimensions that change rapidly through time. Another one would be: it’s a new year, a new campaign planner, a totally new strategy, let’s change all those things. So what you get into is that it’s very difficult to start drawing inferences and using analytics when the data that describe your data, sometimes called metadata, aren’t stable.
One of the things executives can do is simplify the dimensions across which we analyze and speak about our business, keeping those simple and consistent.
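The pick-list mismatch described above can be checked mechanically. Below is a minimal sketch in Python, assuming each system's pick-list can be exported as a set of labels. The function name, the two system names, and the age bands are hypothetical and are not taken from any real Salesforce or Adobe configuration.

```python
from typing import Dict, Set


def compare_pick_lists(planning: Set[str], measurement: Set[str]) -> Dict[str, Set[str]]:
    """Compare the values of one dimension across two systems.

    Returns the values both systems share and the values that exist
    in only one of them (the ones that make joined reporting hard).
    """
    return {
        "shared": planning & measurement,
        "only_in_planning": planning - measurement,
        "only_in_measurement": measurement - planning,
    }


# Hypothetical age-band pick-lists from two systems
planning_age_bands = {"18-24", "25-34", "35-44", "45+"}
measurement_age_bands = {"18-24", "25-34", "35-54", "55+"}

diff = compare_pick_lists(planning_age_bands, measurement_age_bands)
for label in ("shared", "only_in_planning", "only_in_measurement"):
    print(f"{label}: {sorted(diff[label])}")
```

Any value that shows up in only one system is a place where a dashboard or model would need a manual mapping, which is the instability the conversation is warning about.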
Obviously, there will be changes, right? And we’re not suggesting that the business is set in concrete and never changes, but there’s huge value in consistency, even in testing, right? So when you think about how tests are run, it’s making sure that the way we run tests is consistent. You know, if we make a decision at the enterprise that 10% of our budget should be focused on testing and 90% on production, we should stay that way.
And we should use the same language when we talk about them, and it’ll be much easier for all 50 or a hundred people that are required to slice and dice, munch the data, do the analytics, build the reports. This is one of the biggest problems with dashboards. You know, it’s very hard to build dashboards if the dimensions are constantly changing.
Try to build a dashboard, which fundamentally is a chart with a bunch of pick-lists across the top, when the pick-lists change all the time. You can’t do it.
Mark: It kind of begs the question about whether we’re kind of starting at the wrong end of the stick. And I think that is one of the things a lot of people observed in 2020. If you compare 2019 to 2020, and into 2021, there is huge variability. Right? How do you account for the changing dependent variables? A dependent variable is like sales or something that you’re trying to make happen. Independent variables are what you’re doing or what someone else or the marketplace is doing. When those are changing as well, how do you navigate?
I’ll talk about the analytics jobs to be done. This is a tremendously oversimplified framework, but you need to know the “what,” “why,” “who,” and “how,” right? The “what” that happened in 2020, we know. Maybe 20 years ago it was difficult, but now any organization will be able to say, here’s a time series of my sales. Here’s a time series of my leads. I know what happened. We were above or below goal. The “why” thing gets pretty difficult sometimes. And it got very difficult last year. Well, the answer is everything changed. Why did it change?
The “why” job for an analyst or marketing analyst is maybe the most critical. Because if you don’t understand the drivers of why things change, it’s very tough to make a strategy.
That data detective role, by the way, is also one of the hardest, going back to the theme that it’s the hardest to hire for. The tool of econometric approaches, which is really organizing data, to your point, through time and across dimensions, will get you there. It’s the Sherlock Holmes of marketing. And what are those dimensions? Well, I can look at it by DMA or by city. Obviously, I’m going to look through time. I’ve got the pick-lists we talked about before. I’m going to go look at what my competitors did. One of the things we saw last year, for example, was that a couple of companies found it was really important to control for geographic mobility due to COVID. And that was hugely different by part of the country. So it’s bringing in those control variables and then saying, you know, it turns out that the reason we were down here was that we had real suppression of customer activity here and actually not here.
Mark: Where do you see all this going? If you were to prognosticate about some of these issues that we’ve just discussed and how they’re going to be resolved, not only socially and medically but in business, where is it going?
Was it Yogi Berra that said it’s hard to make predictions, especially about the future? You know, everyone makes predictions based on the past. And if you think back 20 years, I think there are two rough ways it could go. Way one is that marketing software really, finally reaches the promised land that executives have been told about at Salesforce and Adobe conferences year in, year out, and there’s a common operating system where it’s no longer this constant reinvention and things stabilize. That’s an option. Marketing becoming more deterministic and less fuzzy is another way to think about it. But the problem with that is that, unlike accounting, marketers are dealing with human beings who, to use the iceberg analogy, spend 99% of their life underwater. You only see 1% of them, and that’s never going to change. And certainly, things like third-party cookies going away, continuing privacy legislation, and antitrust are probably creeping up to make that harder.
I see marketing analytics becoming more transparent and more reproducible, in other words, more code-based, with the analytics being exposed and executives getting more comfortable with that. But at the same time, I also see possibilities for standardization of data structures. This goes back to the theme of pick-list dimensions; there’s really no reason that, as an industry, marketers and advertisers could not at least start coming up with some basic standards about how to talk about customers, audiences, etc.