MAY 1, 2012 4:19am ET

Data Management is Based on Philosophy, Not Science


There's a joke running around on Twitter that the definition of a data scientist is “a data analyst who lives in California.” I'm sure the good-natured folks of the Golden State will not object to my bringing this up to make a point. The point is: Thinking purely in terms of marketing, which is the better title -- data scientist or data philosopher?


Comments (15)
Absolutely the nerdiest thing I will read today... and I loved it. What you lay out is not so much that data management is a science, but actually a tool of science. So what then is data analysis? Research?

I for one, love the term 'data scientist'. I think it lends credibility to a growing field of analysts.

Thank you for writing this article!

Posted by Jody C | Tuesday, May 01 2012 at 10:51AM ET
I have a couple of philosophy degrees, primarily focused on epistemology, logic, and math. I think you're dead on, and I probably do more straightforward philosophy in a week than my counterparts in academia. There are three areas where I use my education all the time:

1. Data modeling is classic foundational analysis. It's a mix of semantics and logic, and while it may look to outsiders like metaphysics, it's not metaphysical at all. See, e.g., Montague's project at UCLA to reduce language to logic.
2. Business Intelligence is all about justification - which is the branch of philosophy called Epistemology - or how we *know* that something is the case. Classically, "knowledge" is defined as "justified true belief," but there are many theories of how this works. I'm personally more of what's called a "reliabilist."
3. MDM is actually the problem of meaning, or how it is that you can discover the referent of two senses. The classic example is "'the morning star' denotes 'venus'" and "'the evening star' denotes 'venus'", but "morning star" and "evening star" are not the same thing, in much the same way one shouldn't use maiden name and married name, or mailing address and billing address, interchangeably.

There are a few of the gray-hairs with wing-backed chairs left, sadly fewer than there used to be. But for the most part the philosophers you're referring to are as likely to be influencing people at Xerox PARC, SRI or Microsoft as they are to be teaching introductory logic for the 30th time.

Posted by David G | Tuesday, May 01 2012 at 11:10AM ET
Although great, I think there still is imprecision in the column. Data Science might be closest to the Philosophical area, Epistemology (roughly, philosophy of knowing).

The tool/method-centric approach that is popular now is valuable, but going back to the roots and limits of those tools and methods is critical. One might check the somewhat obscure philosopher Wittgenstein and the more popular Popper for the limits of science/data regarding knowledge. (Although I've just given away my edge in the field.)

If I ever do the Ph.D., that would be my area. Interested in your and others' views on my comments.

Posted by Phil M | Tuesday, May 01 2012 at 11:16AM ET
Agile developers would love to refer to data analysts as philosophers to malign the discipline of data management as opinion while they deal in facts. It is true that, because data management occurs primarily in the planning, analysis, and design phases, a derived benefit in dollars is difficult to estimate. Data Management in fact is closer to geometry in that there are basic assumptions that must be made, but if followed they provide benefits, such as: one fact one place, management understanding the logical structure of their data to manage the enterprise, parallel the physical with the logical as much as reasonable, share data, implement/maintain data quality. If developers design your database, they tend to optimize it to support their development rather than the actual system operation. If designed by report analysts, one tends to find many copies of the same data in the data warehouse, sometimes one for each report, which is responsible for the volume of data multiplying by 2 every seven years. Data analysis is really common sense in harmony with the KISS principle. Principled data management is similar to principled action in any other walk of life: it can be called philosophy or opinion, but it is necessary to successfully maintain control of an enterprise's data, limit cost growth, and stem chaos.
Posted by Thomas B | Tuesday, May 01 2012 at 11:20AM ET
I'm so sad that I can't share this with anyone without them looking at me like I've grown another head.
Posted by Susan S | Tuesday, May 01 2012 at 11:26AM ET
I agree that you have done a great job of characterizing Data Management as more closely aligned to philosophy than science.

There are other aspects of Data Management that generally fall under the topic of data governance and stewardship. I would say these are closely aligned with Management. Borrowing from Wikipedia, management is "the act of getting people together to accomplish desired goals and objectives using available resources efficiently and effectively."

Data Science exists as another class of activity, more specialized than Data Management but reliant on it. Again borrowing from Wikipedia, science "is a systematic enterprise that builds and organizes knowledge in the form of testable explanations and predictions." Clearly there are many computational physicists, biologists, economists, and others who organize data as a representation of knowledge. They build computational models that produce testable explanations and predictions.

I do work in a university where we have programs that encourage students to apply, analyze, synthesize, and evaluate using data modeling and mathematics. (See or work follows the guidance of the Scientific Method. We produce Data Scientists who have been taught Data Management.

The boundary between data and science is a fruitful one that has produced significant impact. We ought to use the term when it is appropriate.

Posted by Tom K | Tuesday, May 01 2012 at 11:53AM ET
Unfortunately you are confusing the term data science. It is not about data management, it is about experimenting with data and discovering new information. Those discoveries come out of using the scientific method, statistical analysis, and hypothesis testing with data that has been captured, stored, and made available using data management principles - but the two are not the same.
Posted by michael e | Tuesday, May 01 2012 at 6:32PM ET
I love it. We still have some way to go in understanding and positioning Data Management, but this article signals an important component. To some degree we are about facilitating communication from human to human - with the added complication of one or more computers in between.
Posted by Chris M | Wednesday, May 02 2012 at 6:37AM ET
A wonderfully thought-provoking article. And this is the type of thinking that those of us who have been around a while need to provoke. So, on a similar off-mainstream theme, I offer my most recent blog - "Death by a Thousand Analytics" ( some basic assumptions about BI. Thanks Malcolm!
Posted by Barry D | Thursday, May 03 2012 at 4:03AM ET
Very nice analysis. Very useful distinctions. The obsessive focus on numbers and their organization into neat normalized data tables with insufficient attention paid to what the numbers represent is a constant source of less than optimal decision-making and frustration on the part of decision-makers when they try to interpret the avalanche of reports they receive daily. And the report developers are "too busy" developing new reports to explain what their production reports mean.

My only edit would be to change the title of the article to Data Management "ought to be" Based on Philosophy, Not "just Technology"

An "ought" is not an "is". Philosophy is science.

Data "scientists" are often technologists. Science involves more than technology.

thanks again -

Posted by Charles P | Saturday, May 05 2012 at 10:04AM ET
I have a Masters and BA in Philosophy and have been doing AI and DBMS work since 1985. Every time I interview for a position, especially with a person who wasn't even alive when I started professionally in computer analysis, I get a raised eyebrow and a smirk when they ask about my education. If I start describing how much of philosophical history is precursor and foundation for computer science, I know from experience I won't get the job. Even if I utter a simple comment like "You use Boolean searches in Google, right?"

Thank you for the article. There seems to be less and less foundational theory being taught in schools now in CS, or maybe it is just the study-only-the-current commercial-apps approach of two year specialty "colleges" and "institutes".

Maybe with the upsurge of the Semantic Web and ontologies this will change. Philosophers have been doing merology, logic and ontologies since Plato. We philosophy students have a 2500 year head start over ITT tech majors.

Gary D.

Posted by Gary D | Saturday, May 05 2012 at 10:34AM ET
Great article. Philosophy and the analysis of the world from a metaphysical perspective are the roots for effective data management, particularly scalable data management with interoperable capabilities.

Where I disagree with the article is in some of the old practices of data modeling that are not based on philosophy and the metaphysical world of conceptual patterns, but on physical world modeling of entities. This is where past practices have gone wrong for effective data management. We should have stayed true to the root concepts in philosophy, discovered the appropriate patterns, and not have bought into commercial tools and methods.

While a data scientist may claim that using more sophisticated algorithms and more sophisticated tools (a commercial usually follows at this point) delivers a new kind of information, the basic truth is that if you put garbage data into anything you only get garbage information out, no matter how sophisticated the algorithms used. Add disparate context and confused meanings to the data and the "information out" is likely misleading and dangerous.

Posted by James P | Saturday, May 05 2012 at 1:38PM ET
After I posted (Gary D, above) I realized a funny typo I made. Philosophers do "merEology"--the logic of parts and wholes, not "merology"--the study of bodily fluids!
Posted by Gary D | Saturday, May 05 2012 at 6:03PM ET
It appears to me that whether known as 'data scientist' or 'data philosopher', the realm in which both work falls under the area known as 'Information Theory', which grew out of work initially performed by Claude Shannon and many other great minds. And ultimately, at the apex of science, aren't all of the great thinkers also philosophers to some degree?
Posted by Gary B | Monday, May 07 2012 at 12:17PM ET
Loved It, made my day :)
Posted by Andreia S | Thursday, May 10 2012 at 12:12PM ET