Inside the Groundbreaking Effort to Model and Measure the Data Economy

Ever since big data became a big buzzword, it’s been called everything from “the new gold” to “more valuable than oil.” But in truth, few frameworks exist for determining the dollar value of data, whether customer databases, transaction records, online user behavior patterns, or other information.

Laura Veldkamp, the Leon G. Cooperman Professor of Finance and Economics at Columbia Business School, is actively seeking answers to key questions such as: What specific features make data more or less valuable? And how is that value defined? “As a theorist, my role is to figure out the mathematical representation of an idea — to take a concept and formalize and quantify it,” she says. “With that set of equations, we then have a basis for measuring things.”

Such equations already exist for traditional financial assets. But when it comes to data, economists are in uncharted territory. “Data is a new asset class,” explains Veldkamp. “As such, we need new pricing theory. We’ve got models to price stock options, equities, and bonds, and to value firms. Right now, we just don’t have any tools to price data.”

The ‘Commitment Conundrum’ and Other Challenges

One challenge with data valuation is that macroeconomic measurement systems are often rooted in industrial-era terms and metrics. “A lot of macroeconomic models take into account workers and capital and factories and consider those factors to produce output that only one person can use at a time. I don’t know anybody who works in that economy around here,” says Veldkamp, who holds a PhD in economic analysis and policy. “In New York City, we live in a knowledge economy, and people work with data to produce actionable recommendations that potentially lots of people can use.”

Another core challenge when it comes to assigning value to data is that it’s worth different amounts to different stakeholders. There are also nebulous and difficult-to-quantify consumer-facing factors at play, like the value of privacy. “If I’m going to buy a pair of socks on Amazon, should I be eligible for a discount because they’re going to get data that might affect the shoes I buy next time? We’re just really far away from knowing how to answer that,” Veldkamp says.

Data also has some quirky features. For instance, its value drops as it becomes accessible to more people. “It’s more valuable to have a hot insider tip that only a few people know than to have information that’s been broadcast online,” she says. “These features make selling data problematic because when I’m trying to decide how to value it, what I’d really like to know is how many people you’re going to sell it to.” This leads to what Veldkamp calls a “commitment” problem for firms looking to sell data sets: They often can’t credibly promise prospective buyers a fixed limit on how many entities will ultimately gain access.

Yet another complex issue is the lack of publicly available, disaggregated data (in other words, data that has been broken up into smaller and easier-to-parse sets). It’s challenging for academics to study the value of data when so much of it is proprietary.

Early Findings in an Emerging Field

Veldkamp is undaunted by such challenges. Her recent research takes inventive approaches toward quantifying the value of data and attempts to lay the groundwork for future models.

For instance, a paper Veldkamp co-authored, “The Changing Economics of Knowledge Production,” linked the salaries of data-focused professionals with the value firms place on data. “We looked at why it makes sense to hire this many people and pay them this much,” she explains. “By measuring firms’ hiring, we can infer the value of data without seeing the data itself.” Veldkamp recently spoke about this approach, along with other emerging methods for assigning value to data, in her keynote address to the National Bureau of Economic Research spring finance meeting in Chicago.

Digging into how firms handle the “commitment conundrum,” Veldkamp collaborated with the Brookings Institution on “Data Sales and Data Dilution,” finding that many are shifting to a subscription model when selling data. This method provides buyers with a degree of reassurance. “If the company oversells the data and it becomes less valuable, the buyer can just cancel their subscription,” explains Veldkamp.

Other areas of Veldkamp’s research focus on how “superstar” data firms — those with widespread ownership of and influence over data — impact the larger market landscape. “As data becomes more important, we should expect big data owners to make out like robber barons,” she predicts. “Eventually, all firms will be flooded with data, and there’ll probably be more widespread ownership of it, but I think the inequality will worsen before it gets better.”

The “robber baron” metaphor isn’t the only comparison Veldkamp draws between historical events and modern realities. “The Changing Economics of Knowledge Production” compares the impact of the Industrial Revolution with current technological leaps in big data and artificial intelligence (AI). Using a model that contrasts the change in capital intensity during the Industrial Revolution with the change in data intensity now under way among firms adopting AI, the research found that the AI revolution appears to be about half as large as the Industrial Revolution in overall scope and scale.

That’s still significant, Veldkamp notes. “The AI revolution might be half the size of the Industrial Revolution, but it’s still big,” she says, adding that “people who have the skills to work with this new technology are going to benefit.” In fact, one estimate from the paper shows that a worker in the financial sector with AI skills earns about $22,000 a year more than somebody without knowledge of AI.

“What this means practically for workers is that it’s worth learning to use things like AI plugins for Python or TensorFlow,” says Veldkamp. “These tools are going to have a ton of future value.”

‘Building the Alphabet’ Before Writing the Book

Veldkamp emphasizes that we’re in the early days of this type of research. Though MBA students may one day use data valuation models in their own business plans, we’re still in the foundational phase of such science.

“I’d love to put together an MBA class on the data economy, but we’re still at the point where we’re figuring out how to measure it. We have to build the alphabet before we can write the book,” she says.

Speaking of books, Veldkamp has her second in the works. In 2024, she plans to publish a textbook on the data economy, aimed at giving academics the resources to build a robust body of research on data valuation.

“We first have to build up the tools so that people can do this research. Once we can generate knowledge and facts, we can go report them to the wider world,” she says. “That’s where we are right now: building the tools we need to build the understanding.”
