The 8-step data playbook for AI in customer experience
A practical breakdown of the 8 data types that should underpin every successful AI program.
You’ll hear ‘data’ shouted from the rooftops when it comes to making AI work, but what exactly is this data? Where is it? What types of data do you need? How can you gather the data you need to make AI work for your business? What’s meant by having ‘good data’? Or ‘AI ready’ data in customer experience?
In this article I’ll break down the 8 types of data that are most useful in customer experience AI.
I think of it in levels as outlined below:
Level 1: Intent data
This is the data that underpins the foundations of your customer experience AI strategy. It is your customer needs. Typically, you’ll find this insight inside your customer interaction data: your call transcripts, chat logs, website search, emails, help tickets and wherever else there’s an exchange of information between customers and the business.
This data is used to firstly understand your real customer demand. Relying purely on agent-logged wrap codes isn’t accurate.
Benefits of intent data
Once you understand your real demand, you can do two things:
Standardise your customer intent map across teams and departments so that you can make sure every area is reporting on the same thing. This improves data integrity and your ability to measure outcomes.
Inform your AI strategy: what use cases are the biggest in terms of volume? Which channel has the most demand? Which intents do you already have highly performing self-service channels for? And which would benefit from AI automation?
This is the foundation. Stage 1 in your data audit. Ask yourself:
Do we have all our customer interaction data in one place?
And do we properly understand our demand?
Level 2: Use case data
Underneath your high-level intent data is your use case-specific data. These are the additional questions, sub-needs or sub-tasks your user needs to complete in order to meet their primary goal. You’ll also find this information in:
Customer interaction data
Customer journey maps
Staff experience (staff are sometimes aware of user needs that aren’t documented)
This data contains all of the underlying needs that exist underneath your global intent. This gives you your design scope and your critical success factors for your interaction. It’ll also give you the first indication of what content and knowledge you may need to support the delivery of the use case, as well as what system integrations you might need.
So for stage 2 in your data audit, ask yourself:
During the course of resolving their primary need, what questions do customers have?
What sub-tasks do they need to complete?
Level 3: Process and policy data
The next layer of data that sits underneath your customer intent and use case data is your process and policy data.
Process data is concerned with mapping the business process that sits underneath the customer intent and the user journey (or reviewing and updating the process maps you already have ((if you have them)). This will provide you with your workflow requirements for your automation.
Then, your policy documentation will provide the justification for the business process, as well as any other business rules or considerations you need to cater for.
Challenging process data to innovate
Now, that’s not to say that you should just crack-on and start automating what’s there. I’ve written before about the dangers of slapping lipstick on a pig. Ideally, you’d review these policies and processes with a view to redesigning them to make them more hospitable for AI automation. However, depending on your role and the business demands, you might not have this luxury.
If your intent and use case data are representations of customer needs, then your process and policy data are representations of your business needs. Any process you automate with AI needs to satisfy both sides of the coin.
Level 4: Legal and compliance data
The data that underpins business policies, processes and sub-processes is legal and regulatory compliance data. The boring stuff, I know.
All businesses have to follow things like GDPR regulations, and some industries, like financial services and utilities, have specific legal and regulatory requirements. The upholding of these regulations has to be in place in any live AI automated process.
For example, in insurance, regulation dictates that you cannot forcefully sell a user a specific policy. You can only uncover needs and present policy options that are in line with needs that you have uncovered. You are not allowed to use your intuitive sales patter to hard-sell a user a policy that you have no grounds for selling.
These regulations inform your design requirements, as you cannot properly implement a production-grade system until you understand and cater for them. Reviewing this data will make sure that you have the complete picture of your business requirements.
Level 5: Content and knowledge data
Use case data, combined with process and policy data, combined with legal and compliance data, typically shapes the type of questions that customers have, and the things that you must do. Therefore, it is through the creation (or the assembly) of this data that you create the knowledge, and the content, required to answer customer questions as they progress through their journey.
There may also be additional content needs that arise based on other factors. For example, let’s say you have an artificial intelligence solution that exists in your mobile app, you may get queries from users who are having trouble accessing certain features in the app. This will then require specific content of a troubleshooting nature that isn’t specifically related to your use case, process, policy or compliance requirements. Nevertheless, use cases cannot be realised unless you have sufficient content and knowledge to answer customer queries throughout their journey.
The importance of content format
Even if knowledge or content exists, it may not be in a format (or written in a language) that is sufficient to be used in a production-grade AI system.
For example, if you simply ingest policy documentation that was written for the purposes of documenting the policy, then this content isn’t written in a way that is easy to understand for an end user.
Simply throwing all of the data that you have into a knowledge base provides more work than necessary for large language models. Not only do models have to find the right content, they also have to reword things substantially to make it user friendly. This is where you introduce the risk not finding the right content or hallucination.
To get your content house in order, ask yourself:
Do we have complete content and knowledge for a given use case?
Do we have any gaps?
Have we covered use case, process, policy, legal, compliance and usability/troubleshooting knowledge?
And is our knowledge written for users or for documentation?
Level 6: Business system data and connectivity
Your content and knowledge data provides a foundational ability for your AI agents to discuss topics. But users never want to just discuss something. Behind every question, there’s a need. An action. To complete actions on behalf of users, you need connectivity to the business systems that are in place to uphold business processes and enforce business policies. This data comes in two forms.
Data retrieval
The first form is the retrieval of data from those business systems to make interactions more personalised and provide specific answers to specific user queries.
For example, if you wanted to ascertain whether your item you have just purchased is eligible for a return, you may need to provide your order number, which will then be used to interrogate a business system to determine an answer.
This also goes for your content and knowledge. Where is it? And how can you access it to ingest it?
Data writing
The second element of system connectivity is pushing data into systems to start, progress or finish the action of a business process. For example:
Starting: gathering customer data and pushing it into a CRM to kick off the process of a callback to book an appointment.
Progression: a user, who is in the middle of a loan application and was asked to provide bank statements. When returning, after uploading the bank statements, the business process can resume.
Finishing: customer provides a meter reading to their utility provider that is recorded in the line of business system as a completed as a completed, end-to-end process.
Then, you have to consider whether you have:
API connectivity to perform the functions you need in your AI agent
What’s returned from the API?
How fast is it?
How secure is it?
And is it fit for real-time automation?
Level 7: Foundational data
Foundational business data is data that can be used to generate either insights or understanding, or used as the basis of bespoke machine learning models.
Think about your call transcripts, customer data stored in your CRM, case management data, financial data, stock and logistics data, product and service data and so on.
This data on its own may yield insights, but when combined, you have the potential for some real innovation.
Foundational data for insights
For example, perhaps there is a trend between customer satisfaction in your call centre and customer churn. Maybe customers that most often churn are those that experience multiple poor experiences across multiple touch points. Being able to access and analyse this data to reveal those insights will enable you to make changes to your customer experience that could help you keep hold of more customers.
Foundational data for action
That same data could be used by to train a machine learning model to recognise the patterns that lead to churn. This could then be used in real time to appraise the churn risk of individual customers, and thus trigger preventative measures.
For this stage of your data audit, you should ask yourself:
What do you want to know about your customers?
What insight would enable you to make a huge impact for the business?
This is the starting point in beginning to track down the data sources needed to build this kind of capability.
Level 8: Performance data
Performance data is data you use to understand whether the services that you offer meet the expectations of the business, the user and any regulatory or compliance requirements.
This could be data such as average handle time, customer satisfaction, customer effort, first contact resolution rate, cost to serve, lifetime value, NPS, and any other data that you use either to assess the performance of an individual process or use case, or that you use to assess how the company is progressing towards its goals.
Without this performance data, you will struggle to understand the purpose of your AI program. What metrics are you trying to effect?
The importance of alignment
If your AI program isn’t in service of fundamental business goals, and if it can’t impact the things that you care about, it’s misguided. If you cannot measure progress against those business goals in your current operation, you have no idea where to focus your AI efforts. And if you cannot identify signals, markers and data points in your AI program that will demonstrate whether or not you are having the intended effects, there is a good chance that you are wasting time and money on trivial matters.
So consider how you quantify success today:
Do you measure the things that you care about?
Are those things relevant in an AI-first world?
Can you accurately understand your starting point?
And what metrics do you want to affect?
Post-live data
There are many other types of data that, when you go live with your AI solution, you will generate. Some of this is foundational data. Some of it is new business data and new use case contextual data. But you’ll also have new analytics data and solution-specific operational data.
There’s perhaps a part two of this piece where I jump into more detail about the data that your AI program will generate, and how you can use this data to further improve and innovate.
Your AI Ultimatum
There’s enough here, though, to give you a solid place to start in reviewing your maturity against each of these 8 types of data, in order to assess whether your AI strategy is starting or scaling from a solid place, and where you might need to spend some time improving or investigating.
This has always been one of our AI Design Principles; principal number two, to be data-driven. This article is essentially what it means to be data-driven, and what type of data is important to you as you start and scale your AI program.
Before you invest another pound in AI, are you certain your data foundation can carry the weight? And if not, which layer needs your attention first?



