A well-funded start-up out of Stanford University is attempting to revolutionize the way data is analyzed. The result, if successful, could be akin to the Gutenberg printing press or the electronic calculator, innovative technologies that put hitherto arcane skills in the hands of the masses.
In this case the technology is topological data analysis made available by Palo Alto-based Ayasdi and its Insights Discovery platform. The goal is to enable ordinary business users without specialized training to analyze and discover meaningful patterns in very large, complex data sets.
"Our mission is to build tools that everyone can use. We want to transform any person at any company into a data scientist," Ayasdi CEO and co-founder Gurjeet Singh tells Information Management.
Ayasdi was launched at Stanford in 2008 with NSF funding. Its premise was that with the onset of big data, businesses and institutions would need ever larger numbers of data scientists, far more than currently existed or were ever likely to exist. Ayasdi was founded to devise technology that would make data scientists more efficient and put analytics tools in the hands of people who are not data professionals.
Singh says today’s big data technology is essentially the same as small data technology: query processing systems based on the SQL standard. While the development of SQL nearly 40 years ago was an enormous innovation, data analytics hasn’t progressed much since then. It has become much less expensive to process data, and it can be done on a much larger scale now, but data analysts still aren’t doing anything really new with the data itself.
The Ayasdi approach reverses this. Instead of starting with a hypothesis and then validating it with data queries, topological data analysis parses the data and surfaces statistically valid correlations without the user having to pose a question first.
A topological map is drawn to represent the data correlations, which greatly simplifies the presentation of very complex data sets and can incorporate both structured and unstructured data. People then review the maps for “interesting shapes,” such as the ‘Y’ shape depicted in the accompanying example.
‘Y’ Some Cancer Patients Survive Longer
This is an Ayasdi data map extracted from data on cancer patient survival rates. Each colored node represents a group of patients with a similar genetic makeup.
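To make the idea of a data map with shapes more concrete, here is a toy sketch in the spirit of the Mapper technique from topological data analysis. It is not Ayasdi's software or algorithm. The function names (`connected_clusters`, `mapper_graph`), the parameters and the synthetic Y-shaped points are all assumptions made for illustration, and no patient data is involved. The sketch slices the points into overlapping bands, groups nearby points within each band into nodes, and links nodes that share points.

```python
from itertools import combinations
from math import dist


def connected_clusters(indices, points, eps):
    """Group points that are chained together by neighbours within eps."""
    unvisited = set(indices)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in unvisited if dist(points[i], points[j]) <= eps}
            unvisited -= near
            cluster |= near
            frontier.extend(near)
        clusters.append(frozenset(cluster))
    return clusters


def mapper_graph(points, lens, n_intervals=4, overlap=0.25, eps=0.8):
    """Hypothetical Mapper-style summary: nodes are clusters, edges are shared points."""
    values = [lens(p) for p in points]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals
    nodes = []
    for k in range(n_intervals):
        # Overlapping bands of the lens value, so neighbouring bands share points.
        start = lo + k * width - overlap * width
        end = lo + (k + 1) * width + overlap * width
        members = [i for i, v in enumerate(values) if start <= v <= end]
        nodes.extend(connected_clusters(members, points, eps))
    # Two nodes are linked when their clusters have at least one point in common.
    edges = {(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]}
    return nodes, edges


# Synthetic Y-shaped point cloud (an assumption, purely for illustration).
stem = [(0.0, 0.5 * k) for k in range(9)]
left = [(-0.5 * k, 4.0 + 0.5 * k) for k in range(1, 9)]
right = [(0.5 * k, 4.0 + 0.5 * k) for k in range(1, 9)]
points = stem + left + right

nodes, edges = mapper_graph(points, lens=lambda p: p[1])
degree = [sum(i in e for e in edges) for i in range(len(nodes))]
print(len(nodes), "nodes; degrees:", sorted(degree))
```

On this synthetic input the summary graph has one node with three neighbours, the fork of the ‘Y’, which is the kind of shape a reviewer would look for in a map. The choice of lens (here, the vertical coordinate), the number of bands and the distance threshold all affect the result, and a real analysis would need to choose them with care.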
Data scientists are still needed to devise the algorithms that sort and analyze the data. But Singh says it only takes about four hours for an ordinary business user to learn how to use the software and interpret the maps. Moreover, new algorithms can be added without users having to learn new analytical techniques.
Commercial and government organizations are paying attention. Since January, Ayasdi has signed General Electric, Citi Ventures, five of the top 20 global pharmaceutical companies, two oil and gas companies, the U.S. Food and Drug Administration and the Centers for Disease Control and Prevention, among others.
"Through powerful analytic models developed over a decade at Stanford, Ayasdi lets you find the needle in a haystack you didn't know was there, quickly," Jonathan Ballon, GE’s chief strategy officer, said in a statement. "For GE and our customers with vast amounts of industrial data, Ayasdi's technology will be a powerful tool for predictive analytic models that can drive billions of productivity and efficiency savings."