JUN 8, 2010 4:48am ET

Correlation vs Causality in BI


Two years ago, Wired editor-in-chief and Long Tail author Chris Anderson wrote a provocative article titled "The End of Theory: The Data Deluge Makes the Scientific Method Obsolete." Its message was that in the era of petabyte-scale data, the traditional scientific cycle of hypothesize, model, test is becoming obsolete, a victim of huge data volumes combined with the computing capacity to process them. "There is now a better way. Petabytes allow us to say: 'Correlation is enough.' We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot." Not content simply to take on the scientific establishment, Anderson went after mainstream statistics as well: "At the petabyte scale, information is not a matter of simple three- and four-dimensional taxonomy and order but of dimensionally agnostic statistics."
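One reason to be wary of "correlation is enough" is the multiple-comparisons trap: scan enough variable pairs and pure noise will produce impressive-looking correlations. The sketch below (entirely synthetic data, not from any real dataset) generates columns of random noise and counts how many pairs clear a naive correlation threshold anyway.

```python
import random
import statistics

# Sketch: with enough variables, pure noise yields "significant"
# correlations -- the multiple-comparisons trap lurking behind
# letting algorithms hunt patterns with no hypothesis in hand.

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

rng = random.Random(42)
n_rows, n_cols = 100, 60
cols = [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_cols)]

# Count variable pairs whose |r| clears a naive "interesting" cutoff.
strong = sum(
    1
    for i in range(n_cols)
    for j in range(i + 1, n_cols)
    if abs(pearson(cols[i], cols[j])) > 0.25
)
pairs = n_cols * (n_cols - 1) // 2
print(f"{strong} of {pairs} pure-noise pairs look 'correlated'")
```

Dozens of the 1,770 pairs typically exceed the cutoff, even though by construction no variable has any relationship to any other. A hypothesis-first analyst would treat such hits as candidates to test, not conclusions.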



Comments (3)
Steve,

I like your "best of both worlds" position. My concern is the "wall" between an organization's IT department and its business users. IT likes standardization; it might say, "We own the data, and if you want a report, tell me and I will write a program for you." In contrast, experienced business analysts rely on exploration. They require easy, flexible access to data and the ability to manipulate it. They want more than data mining, which can feel like searching for a diamond in a coal mine. They form a hypothesis and then continually test and adapt their models based on what they learn. They are more "confirmatory."

I guess my leaning is toward the "causal" world. Framing the problem before diving into analysis produces a better solution; it promotes testing a hunch or hypothesis. But today's massive compute power makes me wonder whether simply throwing the computer at the data might surface insights an analyst has not thought of.

It is a thorny problem.

Gary

Gary Cokins, SAS

http://blogs.sas.com/cokins

Posted by Gary C | Monday, June 14 2010 at 12:39PM ET