Despite the hype and confusion surrounding data science, the need for people who can interpret data to help organizations make informed business decisions is very real. But what is required to be successful in this role, and how does someone get started on the path to becoming a data scientist?
In an interview with Information Management, data scientist Irmak Sirer discusses the realities and necessities of data science. Sirer is also instructing a 12-week data science boot camp offered through a partnership between test preparation organization Kaplan and Chicago-based data science and design firm Datascope. Boot camps, he explains, are one of the three main routes people can choose to begin a data science career.
What does it really mean to be a data scientist?
I like the word “scientist” in data scientist. My background is in science, too. I have a PhD, and basically I was looking at data sets in nature or complex systems. [In science] we have the scientific method to deal with a lot of data and extract information from it. In our day and age, businesses have started to gather a ton of data, and science already knew how to deal with [such large volumes]. Hence data science, where we take the tools developed from physics, applied mathematics and computer science and we apply them to business data. This way we can extract the most information in the most correct and [applicable] way. I think the incentive came from today’s technology gaining a lot of pace and businesses actually starting to create a lot of data that they can look at [and learn] from. I think data science is basically applying the scientific approach to business. It’s a pretty simple definition, but I think it exactly captures the whole idea behind it.
Do you think every data-driven organization needs a data scientist?
Clearly data science is very hot right now. Where I think data science contributes the most is basically converting numbers and data into consumable information for human brains; data science helps with decision-making. Beyond the hype and the application of the scientific method to business data, [data science] is a great way to convert the information hidden in the numbers and data into visually and conceptually understandable models that managers can make decisions upon.
Is the need for data scientists as great as some of the research has shown?
With any position like data science, there’s a lot of excitement as people realize they can benefit from it. People see the value and the industry is really excited, which I think is great, but also there is a lot of confusion. The confusion [will die] down as people are getting used to what data science does and how it can [help decision-making], and some of the demand may die down as the extra excitement and hype goes down. But because of that root problem of data containing information which needs to be rotated and converted for humans to understand and make decisions, data science will stay here for a very, very long time - as long as businesses are generating data, and I don’t think that’s going anywhere. So I think there will be some clarification and some dying down of the hype in the near future, but I think data science is not just a fad and it’s actually a fundamental part of data-driven businesses.
Will students who are studying data science be challenged in launching their careers by the hype?
Businesses know now that they need data scientists, but it’s difficult to [determine] how to find good data scientists. This is part of the reason that we were really happy to jump on the idea of a boot camp. Just like the businesses that are not sure where to find skilled data scientists, it’s also not very clear for students how to actually get into data science. And as things become clearer with more [understanding of] the necessary skills, how to gain those skills and how you find a job in data science, I think there will be a steady demand and supply for data scientists.
A boot camp is one way to begin a data science career. When students complete a boot camp, like the one you’re instructing in New York, can they consider themselves data scientists?
In data science, nobody knows everything. If you enrolled in data science school for four or five years you would still not know everything. I don’t know everything. None of the other data scientists that I know knows everything; that’s why data scientists work in teams. The most important property is to adapt quickly and learn quickly on the job. On a daily basis with my projects I see things that I have no idea how to tackle, but I have a strong enough foundation that I can quickly learn, adapt and take those projects head on. This is the most important skill to teach in a boot camp. [The boot camp] is held by instructors who are actually data scientists. We can guide and we can show what’s important on a day-to-day basis on a data science job, what’s less important, what you need to focus on. This is the experience that we are trying to provide.
The way we designed [the boot camp] is completely around projects. The structure is that there are interactive lectures on theory in the morning, and most of the time in the afternoon is spent working on the current project. It’s kind of like you were hired as a data scientist and these are the projects you have to finish for your job. At the end of a boot camp, it is almost like you already had three months of job experience. Also, because you are going through and completing actual data science projects, you build up your portfolio during the boot camp. When you graduate at the end of the 12 weeks, you basically have a whole bunch of evidence to show your skills.
How does someone know if they should become a data scientist? Are there skills or inherent characteristics that someone should possess?
In terms of background, data science definitely encompasses statistics and coding. So if you’re a data scientist, you know how to program. You don’t need to be a software engineer or [be able to] program like an app developer, but you need to know how to program. And you need to have knowledge of statistical analytics. People have different strengths and weaknesses, so when I say statistical analytics, that [encompasses] a lot. Some people have way more experience with machine learning; others have way more experience with regression analysis and correlation hypothesis testing, and again people have different levels of programming. Nobody knows everything.
Also, I would say character properties are important. I think grit, curiosity and creativity are all very important character traits for a data scientist. A lot of the job is about confidence, being able to tackle completely new problems and adapting to things that you don’t know yet. Curiosity is definitely needed to motivate you to continuously keep attacking a problem. And creativity [is important] because a lot of data science is also about the design of the problem. As I mentioned, data science is the rotating of information, taking all the information in that data and making it understandable by humans to make decisions. The most important part is framing questions to answer and figuring out exactly what questions and what type of answer will help. That requires a lot of iterative design, continuously thinking about looking at the problem from different angles; and that takes creativity. I don’t mean creativity as in an artist’s creativity or an author’s creativity, but in terms of being able to come up with new ways of looking at the data again and again.
Once people determine they want to pursue a career in data science, what then?
I think a very important question that a lot of potential data scientists have is about how to take it on. Where do I start? What do I do? There are three main routes: online courses, master’s programs and boot camps. All have different advantages and disadvantages (http://datascopeanalytics.com/what-we-think/2014/08/04/how-do-i-become-a-data-scientist-an-evaluation-of-3-alternatives). I obviously am a big fan of boot camps because I think that the balance of in-person lectures and real job experience with actual data scientists, having a job portfolio and having placement managers to help you find a job is the perfect storm. But for people who have time to spend in a more intensive master’s program, or other people who can’t commit three months of their time or the tuition for a boot camp, the massive open online courses are a good option. I [recommend that people] look into these three different options more closely and choose the one that fits them the best.