A Balancing Act
Synthetic data has become a more viable option in industries ranging from healthcare to finance, and with that growth come both challenges and opportunities.
On the one hand, synthetic data might be a cost-effective and flexible way to scale up a research project. It might also offer benefits for data privacy and confidentiality, and it may prove useful for tasks like market segmentation and predictive analytics. On the other hand, some may call into question synthetic data’s quality and accuracy, and there may be acceptance or trust issues among those accustomed to real-world data.
As always, for synthetic data to prove beneficial, each market research professional must weigh the advantages and disadvantages of the process while keeping the focus on human insights, leveraging AI, and balancing the two approaches. Much may also depend on the goals of the research initiative, and there need to be checks and balances in place to assess the underlying data on which the synthetic data is based.
In All Things Insights’ recent “Future of Insights” report, Anu Sundaram, Vice President, Business Analytics, Rue Gilt Groupe, observes, “Like with any new tool, there are different advantages and disadvantages. Data quality is key as always. The biggest challenge with all the speed, efficiency and all the different ways to now generate insights is the problem with synthetic data—what is truly fact versus what’s being generated on the Internet?”
Sundaram adds, “That’s when the human element comes in. A seasoned market researcher will understand what is truly being generated from true customers versus not. You will always need a balance of your controlled environment compared to external sources. There must be checks and balances in terms of the weights you provide to structured research versus more unstructured research. Then there is the human element. Who’s the subject matter expert? Does this insight make sense for your business?”
Still, despite the drawbacks, synthetic data may be intriguing to some data science professionals, and to some industries where real-world data is more difficult to obtain.
Su-Feng Kuo, Sr. Director, Global Clients Insights & Analytics at Visa, notes, “An executive said they use AI in part to leverage their past data to create synthetic data. That’s something I’m very interested in learning more about. I’m sure once I bring those findings back to my team, they will ask how you validate that synthetic data and how do you compare with the current methodology?”
Synthetic data may have potential at a financial company, for example, since it is said to be useful in fields such as finance where some information is confidential.
“I can see how it could be useful, especially in certain industries,” agrees Kuo. “I will imagine, for example, in the medical and healthcare fields there are certain data that’s difficult to obtain. There are certain diseases that are rare and so it will be hard to find a patient to get the data. But then again, there’s also the question about how much we can trust the data. In this whole area, it creates a lot of opportunities but also brings a lot of uncertainty in how we communicate and make the best use of it.”
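One way to make the validation question concrete is to compare a synthetic sample against the real sample it was modeled on. The sketch below is only an illustration, not a method described by anyone quoted here. It uses made-up numbers as a stand-in for survey scores, and the helper names `ks_statistic` and `compare` are hypothetical. It reports the gap in means, the gap in spread, and the largest gap between the two distributions (the two-sample Kolmogorov-Smirnov statistic).

```python
import random
import statistics
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest gap between the two empirical CDFs (0 = identical, 1 = no overlap)."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    # The largest gap always occurs at one of the observed values.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def compare(real, synthetic):
    """Summarize how far a synthetic sample drifts from the real one."""
    return {
        "mean_gap": abs(statistics.mean(real) - statistics.mean(synthetic)),
        "stdev_gap": abs(statistics.stdev(real) - statistics.stdev(synthetic)),
        "ks": ks_statistic(real, synthetic),
    }


# Illustrative data only: random numbers standing in for real survey scores.
rng = random.Random(0)
real = [rng.gauss(50, 10) for _ in range(500)]
faithful = [rng.gauss(50, 10) for _ in range(500)]  # generator that matches the real spread
narrow = [rng.gauss(50, 3) for _ in range(500)]     # same average, far too little variety

print("faithful:", compare(real, faithful))
print("narrow:  ", compare(real, narrow))
```

In this toy setup the narrow sample has nearly the same average as the real one but a much larger distribution gap, which is why a check on averages alone would not settle the trust question. A real validation would also need to look at relationships between variables and at subgroups, which this sketch does not attempt.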
Focusing on the Human Element
Perhaps inevitably, the growth of synthetic data is linked to the rise of artificial intelligence. As more data is needed to power machine learning and AI capabilities, synthetic data is being used to support some of these advanced functions.
While the insights community is beginning to adapt to the use of AI, there is some consternation when it comes to synthetic data and what its various impacts could be.
Karen Kraft, Associate Director, Consumer Insights & Analytics, Johnsonville, notes, “Being able to analyze an online message board, or an online call and find themes across different question types and different activities that you’re asking consumers—it’s amazing that AI can do that. These tools have already been developed. AI is really going to help researchers. How can we take the data that we’re collecting and find tools to make that data even more accessible? Because the problem with big data is there’s a lot of it. AI is going to be able to help with organizing it.”
However, Kraft says, “I am skeptical of synthetic respondents, though. There are some companies in that area now. I think it’s really going to quickly hit a space of hitting stereotypes. It’s going to be more of a flash in the pan. I don’t think it’s going to be on the data generation side as much it is on the back end. That taps into data quality concerns. I just don’t want synthetic respondents that are based off stereotypes just as I don’t want fake respondents when I’m doing real data collection.”
In addition to any potential biases, synthetic data is viewed by some as less trustworthy than the “real thing.”
Aarti Bhaskaran, Global Head, Research & Insights, Snap Inc., cautions, “You have people using technology to build synthetic data. Does it have its uses? Potentially. But is it going to be highly misunderstood and misused? Any tech is going to bring with it positives and negatives, and I feel like the same applies to AI. We really need to understand why we want to use it. Because at the end of the day, what we want to be is the voice of the consumer, and that voice needs to be real. That’s why I’m concerned about things like synthetic data. We as an industry haven’t thought it through, in terms of what are the guardrails for using AI in these circumstances.”
There may also be a reputational aspect to the debate. Just what does the insights professional stand for if not for quality?
Idil Cakim, Senior Vice President, Research and Insights, Audacy, says, “AI allows us to harness a lot of information and unlock opportunities in ways that we couldn’t before. But it requires a lot of time to ingest that data, to train those systems, to make sure the trial-and-error period doesn’t leak into production. We have a lot of upfront work that needs to be done to make sure a system sings the way we want it to sing.”
Cakim adds, “There’s also the issue of synthetic data. As people who are responsible and who take pride in uncovering truth, are we going to be OK with synthetic data if it delivers statistically speaking the same sort of quality results? We’re going to have to grapple with those questions and we’re going to have to still stand for quality and truth—and be the ambassadors of those two qualities.”
Enhancing the Data
Just how synthetic data will play out in the marketplace is still to be determined. But many feel that once the genie is out of the bottle, it could be difficult or impossible to put it back in.
Perhaps, some suggest, a simple name change would suffice to at least give synthetic data a better image in the market.
“As for synthetic data, I would move to rebrand it. There is a lot of potential and some early success, but the thought of ‘synthetic consumers’ turns people off and gives them ideas about things that it is not. Think about it as enhanced or simulated data, one of the many data sources that need to be better connected and more accessible,” suggests Oksana Sobol, Vice President Insights, The Clorox Company.
Yogesh Chavda, Director, Center for Marketing Solutions, University of South Carolina, feels that perhaps the conversation is still worth having as there are various uses for synthetic data, if handled properly.
“Now comes this new space called synthetic data,” says Chavda. “To augment the existing foundational data that you have from a segmentation, you could basically create synthetic data and make that value go up. The sample size is big enough for you to find those nuances that you could not find before. That’s a skill set in terms of how do you create synthetic data? Is it designed the right way? Is the algorithm the right one being used? Are there any biases or other issues? Things that we never talked about in the insights space will become part of our conversation now. How do you then translate that and make it actionable?”
In All Things Insights’ recent “Future of Insights” report, Chavda noted the potential of simulated environments. Synthetic data could, perhaps, play a role in such dynamic environments.
He adds, “Another area of potential in the insights space is the ability to create simulations, such as market or customer behavior simulations. Think about it like gaming. You could take that same framework of simulation games, apply it to insights, and create simulation tools. If you’re launching a new product, you can gauge what might happen if behaviors change. That’s where synthetic data plays a big role.”
Another related area is called digital twins. Chavda notes, “These are ways on how to start building out new tools, new solutions, new ways of thinking about insights. Because imagine if you’re now creating insights in a simulated environment, based on a set of behaviors that don’t actually exist in the marketplace but could. If you’re able to simulate what you think could happen, does that increase the value? Does it increase the premium of what our services could look like? You might be able to guide leadership in ways that they could not have foreseen.”
More Synthetic Data Resources
Exploring Synthetic Data Applications
Synthetic data, a type of artificial data generated through algorithms rather than real-world interactions, is rapidly gaining traction in various industries, including market research and insights. This guide aims to provide a comprehensive overview of synthetic data, including its benefits, challenges, and potential applications in market research.
Creating New Frontiers with Synthetic Data Solutions
As artificial intelligence technology continues to develop at a rapid pace, the data pipeline has gotten wider and more complex. It’s all about the amount of data being fed to these dynamic machines. With this influx of data, there has been more focus on data accessibility, data governance, quality, privacy, and security, to name a few critical issues. Indeed, the idea of “responsible” data has become more prevalent in the marketplace. There has also been a growing influx of synthetic data. As AI becomes more pervasive, the need for more data to feed the machine becomes apparent. Synthetic data might be the next step in that process.
Video courtesy of IBM Technology
Contributor
Matthew Kramer is the Digital Editor for All Things Insights & All Things Innovation. He has over 20 years of experience working in publishing and media companies, on a variety of business-to-business publications, websites and trade shows.