Brian Clifton, one of my GA heroes, was an early student of web analytics and the first Head of Web Analytics for Google in Europe. He wrote several books on Google Analytics, and those books influenced me profoundly as I strove to better understand GA. When Clifton speaks or writes, I listen and absorb.
I took notice when he developed a tool to perform GA audits, as I do a lot of GA auditing and remediation for clients. I’m interested in anything that could help with those tasks.
Why Do a GA Audit?
I have encountered very few properly set up GA accounts. I’m not the only one. Clifton’s speech at SuperWeek in February 2019 made clear that poor GA set-up is the rule, not the exception.
He presented audit results for over 75 large commercial websites. Most of them were for brand leaders, many with a global presence, and 41% had significant eCommerce traffic.
You would expect such companies to have at least decent GA set-up, but no. The average Quality Index score across all 75 companies was 35.7 out of 100. That’s not just failing – that’s failing badly! A mere 12% of the companies topped the 50 mark.
If your website matters and you’re basing business decisions on analytics that score 35.7 out of 100, you’re groping for a needle in a haystack – in the dark.
How to Do a GA Audit?
Up to now, I’ve audited GA in one of two ways:
- Ad hoc – I’ve worked with enough GA accounts to quickly get a feel for how healthy an account is. I may look at some of the Admin setup parameters, how the events are set up, and how closely the default channel grouping matches the source/medium values, and I check for custom dimensions, etc.
- To dive deeper, I’ve used Annie Cushing’s excellent and thorough GA audit template (Annie, another one of my GA heroes, crafts amazing templates).
The ad hoc approach is quick and informative, but not thorough.
Cushing’s template is thorough, but not quick. Its many tasks can fill 12 or (many) more hours, and the final report can run to 100 pages or more. (On the plus side, if you go through Cushing’s school-of-hard-knocks process, you will understand proper GA set up.)
A Third Option
Clifton’s hybrid automated tool from Verified Data, one of his companies, offers a third way.
The Verified Data tool is thorough – probably way more thorough than most manual audits. It runs over 200 audit tests to come up with a quality index score for your GA account.
What are the Results?
Let’s dive right into the fun stuff, an audit report screen for a totally fictional company:
Yeah, okay, it’s not fictional. This Verified Data audit is for the main Northwoods GA account. The best thing you can say for it: Well, it’s not half bad. All I can say is the all-too-common shoemaker’s kid syndrome applies.
Let’s break the results down a bit.
The top of the report conveys key information about the GA account, property and view for which you did the audit. You also have the time frame that the audit covers (the last 30 days, by default). And, of course, the Quality Index (x/100):
I chose to audit 11 of the 17 areas the tool offers, and the meat of the audit lies in those results. I chose not to audit ecommerce transactions, for example, because ours is not an ecommerce site.
The report assigns a weight, status and score for each section.
Let’s start with the most important, the status column.
Four passed with a green check mark.
Five sections drew amber exclamation warnings.
Two failed sections suffered the dreaded red x.
Failure #1 - File Download Tracking
Clicking on File Download Tracking…
reveals the problem -- the site had no downloadable files:
That’s easy to fix. The next time I run the audit, I’ll turn off that section.
Failure #2 - PII Compliance
Personally Identifiable Information is the bigger issue. I do NOT want to capture PII. It violates privacy rules everywhere, and Google even reserves the right to delete all your GA data if it contains PII. Yikes!
PII includes such items as e-mail addresses, phone numbers, IP addresses, etc. – anything that might identify an individual.
The Verified Data audit tool checks far beyond standard PII items. It looks for credit card numbers, disease diagnostic codes, SWIFT codes, MAC addresses, etc. You can also configure it to check for national identity numbers, such as Australia’s Medicare Account Number, Brazil’s CPF, and passport numbers from France, China, Germany, etc.
I love the Verified Data audit tool’s thoroughness with PII. In the screenshot below, you can see it’s looking for PII in:
- Events (category, action and label)
- Custom Dimensions (I love this – I had never audited this before)
- The ecommerce affiliation field
In Northwoods’ case, the tool red-x flagged potential issues within Events; all others passed (green checkmarks):
Note the highlighted Show Sample button – clicking on it shows the issues that were flagged:
The flagged issues are all phone numbers. Note that the tool expresses confidence levels about its PII violation findings.
When I look in GA, I see that phone numbers are indeed captured in a GA event:
Since those are our two main phone numbers (Milwaukee and Chicago offices), it’s not actually a PII violation. So I asked Clifton if I should ignore this violation. His excellent answer:
Essentially an organization wants to know if they have any data that can be considered PII. Even a set of phone numbers that are for the business could trigger a data inspector to order a full audit of the data. That is, they request the org to prove the numbers collected are only company numbers, and there is no visitor personal data mixed in. That is obviously painful so the policy we are trying to get across is don't collect it in the first place - it could save you a big headache!
A bit of GTM (Google Tag Manager) magic takes care of that violation and results in clearer information in GA (the bottom two rows):
Northwoods’ phone-number violation is a (happily) trivial example of PII capture. In other audits I’ve run, the tool has flagged serious issues – credit card numbers, medical diagnostic codes, IP addresses, etc.:
PII is a serious issue and one that's hard to track down. The Verified Data tool uses AI for its PII detection, because it's looking for a very small number of violations relative to the typical overall traffic volume. The detection is also language agnostic: the tool can find PII in any language. That is a massive technical achievement. Hats off to Clifton and his team!
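To make the idea of scanning event fields concrete, here is a minimal sketch in Python. It is emphatically not how the Verified Data tool works (the tool uses AI, and I have no view into its internals). The two regular expressions, the `find_pii` helper, and the sample events are all my own assumptions, and the phone number is a fictional 555 number.

```python
import re

# Deliberately simple, hypothetical patterns. Real PII detection needs far more
# than two regexes; these only illustrate the idea.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\(?\d{3}\)?[-.\s]\d{3}[-.\s]\d{4}"),
}


def find_pii(events):
    """Return (field, kind, value) for every event field that looks like PII."""
    findings = []
    for event in events:
        for field in ("category", "action", "label"):
            value = event.get(field, "")
            for kind, pattern in PII_PATTERNS.items():
                if pattern.search(value):
                    findings.append((field, kind, value))
    return findings


# Hypothetical event rows, shaped like GA's category/action/label triple.
events = [
    {"category": "Contact", "action": "Phone Click", "label": "414-555-0100"},
    {"category": "Downloads", "action": "PDF", "label": "/files/brochure.pdf"},
]
findings = find_pii(events)
```

A check this naive would miss most of what the tool catches (and would not report confidence levels), but it shows why phone numbers in an Event Label get flagged.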
We have looked at the sections that failed the audit. Now we turn to the Data Validity section, which has at least one warning. This section checks issues with filters, spam, and referrals. In Northwoods’ case, we passed all areas except mixed-case URLs:
Checking GA confirms the issue – the same page, listed twice:
Well JEEZ – that’s just sloppy! Again, shoemaker’s barefoot children. For clients, we routinely set up a filter to make all pages lowercase (technically, we set up a lowercase filter on the Request URI). Evidently, we failed to do that on our own site.
So I went into the GA Admin to set up a new filter and apply it to our main view (screenshot below). From now on, all pages will be listed in lower case. Thanks, audit tool, for pointing out a simple but important filter we were missing from our GA account!
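A tiny Python sketch shows why this matters. The page paths and pageview counts below are made up, and `merge_case_variants` is a hypothetical helper. The actual fix is the lowercase filter in GA Admin, which only affects data collected from that point on.

```python
from collections import Counter


def merge_case_variants(pageviews):
    """Combine pageview counts for paths that differ only by letter case."""
    merged = Counter()
    for path, views in pageviews.items():
        merged[path.lower()] += views
    return dict(merged)


# Hypothetical report rows: the same page listed twice because of mixed case.
raw = {"/About-Us": 40, "/about-us": 60, "/contact": 25}
merged = merge_case_variants(raw)
```

Without the lowercase step, neither "/About-Us" row tells you the page's real traffic. With it, the two rows collapse into one.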
Favorite Verified Data Audit Checks
Almost all of the 200 factors the tool checks are important, but I especially love the next few.
Events are the heart and soul of a great GA implementation. GA Events track significant user interactions with the website – downloading files, buying items, watching a video, filling out a contact-us form, etc.
Smart, judicious set up of Events customizes GA to your specific website. You decide which user interactions are most important and track them.
The screenshot below shows how the Verified Data audit tool:
- Makes sure that events are being tracked.
- Warns you about too many Event Categories. Event Categories should be the big buckets into which you group user interactions, e.g., Social Media Interactions, ecommerce transactions, etc. An excessive number of categories makes it very difficult to grasp what’s going on.
- Warns you if the Event Label is missing. This drives me nuts – I see the Missing Event Label issue way too often. You typically list the specific item the user interacted with – e.g., the specific file the user downloaded or the specific item added to the cart – in the Event Label. Without that Event Label, you miss important information.
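The three event checks above can be sketched in a few lines of Python. This is only my illustration, not the tool's logic: the `audit_events` helper, the category threshold of 20, and the sample events are all assumptions. I treat both an empty label and GA's "(not set)" placeholder as missing.

```python
MAX_CATEGORIES = 20  # hypothetical threshold; the tool's real limit is not stated


def audit_events(events, max_categories=MAX_CATEGORIES):
    """Run three simple checks on a list of category/action/label dicts."""
    categories = {e["category"] for e in events}
    missing = [
        e for e in events
        if (e.get("label") or "").strip() in ("", "(not set)")
    ]
    return {
        "events_tracked": len(events) > 0,
        "too_many_categories": len(categories) > max_categories,
        "missing_label_count": len(missing),
    }


# Hypothetical events exported from a GA view.
events = [
    {"category": "Downloads", "action": "PDF", "label": "pricing.pdf"},
    {"category": "Video", "action": "Play", "label": ""},
    {"category": "Forms", "action": "Submit", "label": "(not set)"},
]
result = audit_events(events)
```

Here two of the three events lack a usable label, which is exactly the kind of gap that hides which video was played or which form was submitted.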
Many people fail to tick the Bot Filtering checkbox or to turn on Site Search Tracking in the GA Admin section. (See the screenshot below, from the GA Admin screen.)
The Verified Data audit tool tells you if these are not set:
Every page is being tracked
Any decent audit makes sure that all pages are being tracked.
The Verified Data audit tool checks for:
- Tracking code on every page.
- Duplicate tracking codes.
- GTM (if used) on every page.
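For a rough feel of what such checks involve, here is a small Python sketch that scans page HTML for tracker calls. It is my own simplification, not the tool's crawler: the regular expressions assume Universal Analytics style `ga('create', ...)` or `gtag('config', ...)` snippets, and `audit_page`, the sample pages, and the property ID are all hypothetical.

```python
import re
from collections import Counter

# Matches analytics.js ga('create', 'UA-...') and gtag.js gtag('config', 'UA-...').
TRACKER_CALL = re.compile(
    r"""(?:ga\(\s*['"]create['"]|gtag\(\s*['"]config['"])\s*,\s*['"](UA-\d+-\d+)['"]"""
)
GTM_CONTAINER = re.compile(r"GTM-[A-Z0-9]+")


def audit_page(html):
    """Report whether a page is tracked, has duplicate trackers, or uses GTM."""
    ids = Counter(TRACKER_CALL.findall(html))
    has_gtm = bool(GTM_CONTAINER.search(html))
    return {
        "tracked": bool(ids) or has_gtm,
        "duplicate_ids": sorted(i for i, n in ids.items() if n > 1),
        "has_gtm": has_gtm,
    }


# Hypothetical crawl results: URL path mapped to the page's HTML.
pages = {
    "/": "<script>ga('create', 'UA-12345-1', 'auto');</script>",
    "/about": "<p>No tracking here</p>",
    "/blog": "<script>gtag('config', 'UA-12345-1'); gtag('config', 'UA-12345-1');</script>",
}
report = {url: audit_page(html) for url, html in pages.items()}
untracked = [url for url, r in report.items() if not r["tracked"]]
```

A static scan like this cannot see tags that fire through GTM, which is one reason a real audit has to do far more than pattern matching.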
Remediation - The Next Step
After you run the audit, investigate and fix any revealed issues. Adjust as necessary, and then re-run the audit.
Some fixes will be very easy (e.g., setting all the pages to lower case). Other fixes might be very complex – e.g., dealing with PII issues or having too many Event Categories.
You can tweak multiple settings and parameters as you set up the audit. Like well-set-up GA accounts, the Verified Data audit tool works best when customized to your website. Things that are vitally important on some sites -- ecommerce transactions, for example -- are irrelevant for Northwoods.
Caution: The audit tool does not replace a knowledgeable GA expert. It just makes that expert’s initial audit job much easier. The expert’s real value lies in remediation.
Unlike Annie Cushing’s GA audit template, which is equal parts training and procedure, Verified Data’s tool tells you what issues exist and gives a bit of direction in how to fix them. But it certainly doesn’t guide you step-by-step through the fixes. This is no beginner’s hand-holder; it’s aimed at knowledgeable GA practitioners. Somebody like, *ahem*, Northwoods!
The audit aims at a clean bill of health and a Quality Index score of 100. It may take several remediation efforts, but you want to get as close to 100 as you can.
Two critical components make up the Quality Index:
- the audit results of each section (green, amber, or red)
- the weight assigned to the section
The results are then normalized to 100.
Let’s explore this a bit further by looking at a portion of the audit, below.
By default, all sections are given a weight of one, except for PII and Ecommerce, which have a weight of two. If certain sections are more important or less important to your specific website, you can assign them weights of 0.5, 1, or 2.
Next is the status – green for a clean pass, amber for passing but with issues, and red for failing.
Every green result earns 10 points. Amber gets 5, red zero. Those points are then multiplied by the weight you have assigned to that section.
In the example above:
- Account Structure & Access gets 5 points because it’s amber (5 points) multiplied by the weight (1.0).
- PII Compliance gets 20 points – 10 points for a green result, multiplied by the weight of 2.0. Woo-hoo – double points!
Add up all the scores and compare that number to a perfect result (all green checkmarks). Divide the first by the second to get your normalized score.
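The arithmetic is simple enough to sketch in Python. The point values and weights come from the description above, while the `Section` class, the function names, and the two-section list are only my illustration (a real audit would include every section you turned on).

```python
from dataclasses import dataclass

# Points per status, as described above.
STATUS_POINTS = {"green": 10, "amber": 5, "red": 0}


@dataclass
class Section:
    name: str
    status: str    # "green", "amber", or "red"
    weight: float  # 0.5, 1, or 2


def section_score(section):
    """Points for the section's status multiplied by its weight."""
    return STATUS_POINTS[section.status] * section.weight


def quality_index(sections):
    """Earned points divided by the all-green total, normalized to 100."""
    earned = sum(section_score(s) for s in sections)
    perfect = sum(STATUS_POINTS["green"] * s.weight for s in sections)
    return 100 * earned / perfect if perfect else 0.0


# The two sections from the example above.
sections = [
    Section("Account Structure & Access", "amber", 1.0),
    Section("PII Compliance", "green", 2.0),
]
index = quality_index(sections)
```

For these two sections alone, the earned total is 25 out of a possible 30, so the index works out to roughly 83.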
Further Customizing Weighting
Not only can you change the weight of each section, you can also turn off irrelevant sections. In the screenshot below, all the sections are on, so they’ll be audited. Most of the sections have a weight of 1, except for PII, which is set to 2:
What I Don’t Like
It’s hard to find things to dislike about this tool, but I have a few small ones.
- It’s not perfect. It may flag something as PII that actually is not. It throws a few false positives here and there, particularly with lightly trafficked sites, but it’s simple to investigate what’s going on. The tool usually points you in the right direction, and you can usually adjust the audit settings if you’re okay with the flagged item.
- The learning curve is steeper and longer than I expected. I continue to discover new features or settings, usually because Clifton returns my emails to point out something I’ve missed. On the plus side, two recently released videos do a great job of explaining the tool, and I think more videos are on the way.
- I’m not a fan of the red/green/amber color scheme. I’m speaking as a proud member of the 8% of men who are red/green colorblind.
This is a great tool. Though still in beta, it’s close to perfect in so many areas.
This tool performs a thorough and comprehensive audit of a GA implementation. That makes it sound so easy, but it’s not. The more I understand about how this tool crawls a site and audits the GA implementation, the more impressed I am.
The tool is particularly valuable for large sites and can scale to an audit of any size. By default it will audit 1,000 pages, which is usually a good enough sample size. But financial sites, pharma, health care sites, etc. typically want all their data certified, and this tool will do it.
To me, part of what makes it so valuable is that it’s the brainchild of Brian Clifton. It incorporates his years of experience as one of the premier GA thought leaders and practitioners in the world. Using this tool is like having Brian at your side as you look through a GA account. You couldn’t ask for a better guide!