(Bloomberg) -- Wu Haishan was at Princeton University studying how schools of fish swim together when the crowd behavior of a much bigger group grabbed his attention: 1.35 billion fellow Chinese.
It was Lunar New Year back home in 2014, and Baidu Inc., operator of the country’s biggest search engine, had created an animation of all the trips people in China make during the holiday -- the largest annual human migration. He soon joined the company as a data scientist in Beijing, where he’s tracking user location information to produce economic gauges such as which urban areas are ghost cities and how many people are buying cars.
Big-data gurus like Wu are bringing the nation’s colossal economy into sharper focus, and to more potent effect than in other major economies because, unlike those of most developed nations, China’s official stats are often suspect or incomplete and private gauges can disappear.
"We were running around pointing a flashlight at various things like labor or ports," said Jeffrey Towson, a professor of investment at Guanghua School of Management at Peking University. "This new information is supposed to improve existing information. That’s like turning on the lights and suddenly you see everything."
For now, the explosion of data sources gives global investors a fresh look inside the world’s largest trading nation. China UnionPay Co., the dominant card processor, can handle hundreds of millions of swipes per week. Alibaba Group Holding Ltd. reported 3.1 trillion yuan ($485 billion) of online shopping in the last fiscal year, nearly equal to Sweden’s gross domestic product.
Baidu serves 6 billion searches a day and dominates mobile mapping, which gives location data for its mobile users as well as those of apps built on its map data. That shows, for example, how many people visit Apple stores, and can signal interest in the next iPhone.
Wu used the search and map data to find so-called ghost cities, betrayed by buildings that show little mobile-phone activity. He and his team of 10 used the technology to make a suite of gauges for mall traffic, tourism visits, and industrial and high-tech employment.
"We didn’t know if there was any commercial value," Wu said in an interview at Baidu’s campus in northwest Beijing. Institutional investors did, and they quickly found Wu after his gauges were released in June.
Official data in China still lack key metrics, such as a regular survey-based unemployment rate. A private manufacturing indicator by Minxin was suspended indefinitely this year, and a preliminary factory gauge reading by Markit Economics and Caixin Media stopped last year.
Big data is allowing alternatives to spring up. Cheng Xin, a former McKinsey & Co. analyst now at Alibaba’s research arm, is developing a GDP-type gauge compiled from the company’s trade data. It will take readings from the Taobao e-commerce platform and other data such as transaction figures from Soufun.com, China’s biggest real estate web portal.
"The question is, will the government allow this type of thing to flourish?" said Andrew Polk, head of China research at Medley Global Advisors in Beijing. "If they start showing things starkly at odds with official data, that’ll be a real test of whether the regulatory environment is going to be supportive of these types of gauges."
Wang Zhanwei, a data analyst at Didi Chuxing, China’s answer to Uber, says the information that companies glean from users can benefit the government. His team plans to mine its ride-hailing data to gauge consumer spending by tracking how often people visit places like malls, cinemas and karaoke bars.
"We’re trying to use data to serve the public," Wang said. "Governments may plan cities better when they know more about how people commute."
Officials are paying attention. "We welcome and are open to big data," said Sheng Laiyun, a National Bureau of Statistics spokesman, adding that the agency includes some of the data in indicators such as retail sales, consumer inflation and home prices. But private providers should be more transparent with their methodologies to earn trust, he said.
Still, processing, sorting and making sense of all the new sources of data isn’t easy, and even the world’s biggest hedge funds can struggle to find a signal in the deep oceans of noise.
As the array of new gauges offers an increasingly complete alternative view of China’s economy, they are mostly verifying official statistics, according to a report by Bloomberg Intelligence economists Tom Orlik and Justin Jimenez, who compared NBS numbers with big-data counterparts. The China Satellite Manufacturing Index compiled by San Francisco-based SpaceKnow Inc. rose recently to a multi-year high, just like the official index and a private gauge.
But some of the new data contradict official statistics. China may have stored more oil than official estimates, according to an analysis of satellite imagery by Orbital Insight Inc. in Palo Alto, California.
"There’s a risk of using these methods in policy decisions before we have a complete understanding of their accuracy," said Joshua Blumenstock, an assistant professor at the School of Information at University of California at Berkeley. They won’t replace official statistics, "but they can supplement them, provide additional information and context, and sometimes quick and dirty measurements when official data don’t exist."
For Baidu’s former fish-tracker Wu, the billions of data points that flow daily into his company’s servers allow him to look for economic trends at a much more detailed level, through the personal decisions of users.
"We’re touching dimensions we couldn’t before,” Wu said. “It’s always interesting to see how people move and act as economic animals.”