OpenBI recently concluded our 2013 college recruiting. It was a banner year and we’re excited to have five “newbies” joining the company over the first half of 2014. Initial interviews were conducted at our usual suspects, three Midwestern universities: one a big public school and the other two private. This year’s hires include two science majors, two engineers and an actuary/statistician. Noteworthy is the absence of computer science or information systems degrees.
Not that we’re uninterested in computer scientists. On the contrary, OpenBI’s business is quite technical and potentially a good fit for CS grads. But CS’ers often seek roles in software development, which isn’t exactly what we do, and might not be as passionate as we are about data.
The good news is that many non-CS students we meet now come with at least a rudimentary computation background. Some take college programming courses voluntarily, others teach themselves languages such as Python for class projects, still more learn computation in summer internships, and a fourth group picked up programming in high school or on their own.
In interviews, OpenBI looks for students passionate about an evidence-based world driven by data and analytics. The best candidates wish to parlay the scientific rigor they’ve experienced in their university curricula into the business world. And more and more, those curricula are starting to focus on computation and data.
The ideal student background includes demonstrated programming skills in a language like Java, C++ or Python; experience with data management, preferably including SQL; classes in statistics/machine learning that challenge with non-trivial statistical programming; and a capstone internship involving all of the above.
We don’t always meet ideal candidates, however, so in our musings OpenBI has encapsulated the priorities we see for prospective business intelligence/data science apprentices into a short university curriculum. At the undergraduate level, that curriculum wouldn’t be as comprehensive as either a major or a minor. Instead, a certificate designation that’s rapidly gaining popularity at top schools would appear to fill the bill nicely. If a minor is half a major, then a certificate is half a minor. Our program would supplement the student’s major, so Jane Doe would leave school with both a degree in Biology and a certificate in Computation/Data.
The OpenBI proposal is a four-course sequence consisting of classes that meet 3 hours/week at semester schools and 4 hours/week at those on the quarter system. The first course would be an Introduction to Computation and Programming, teaching Python for the first two-thirds and Java for the remainder. In addition to the basics of object-oriented and functional programming, this course would obsess over data wrangling, drawing on libraries already developed for the task. A central theme would be that others have already done much of your work. Take advantage.
The second class is Data and Exploration. The beginning weeks would investigate the logic of databases and data organization, to be followed by a healthy dose of hands-on SQL using open source MySQL or Postgres. The final week or so would introduce the power of exploratory visual analytics against SQL databases with a tool like Tableau. At the conclusion of this class, students would understand basic relational data organization, be comfortable with all facets of SQL, and appreciate the power of visual exploration.
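As a rough idea of the hands-on SQL such a class might assign, here is a minimal sketch in Python. It uses the standard library's sqlite3 module as a stand-in for MySQL or Postgres so that it runs without a database server. The students table, its columns, the sample rows and the average_score_by_major function are all invented for illustration.

```python
import sqlite3


def average_score_by_major(rows):
    """Load (name, major, score) rows into a table and average by major."""
    # An in-memory database keeps the example self-contained.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE students (name TEXT, major TEXT, score REAL)"
        )
        # Parameterized inserts avoid building SQL strings by hand.
        conn.executemany("INSERT INTO students VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT major, AVG(score) "
            "FROM students "
            "GROUP BY major "
            "ORDER BY major"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Hypothetical sample data, not drawn from any real class.
sample_rows = [
    ("Ann", "Biology", 90.0),
    ("Bob", "Biology", 80.0),
    ("Cy", "Physics", 70.0),
]
result = average_score_by_major(sample_rows)
for major, avg_score in result:
    print(major, avg_score)
```

The same GROUP BY query would run largely unchanged against MySQL or Postgres; only the connection code would differ.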
Up next would be Statistical Programming, which would focus on statistical methods with R and Octave (the open source counterpart to Matlab). Students would be tasked with developing substantial algorithms and programs with these tools in addition to simply applying pre-packaged methods. A major emphasis would be on implementing Monte Carlo simulation techniques.
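To give a flavor of a first Monte Carlo exercise, here is a minimal sketch that estimates pi by random sampling. It is written in Python rather than the R or Octave the course would use, purely to keep the examples here in one language. The estimate_pi function, the sample size and the seed are assumptions chosen for illustration.

```python
import math
import random


def estimate_pi(num_samples, seed=2014):
    """Estimate pi from the share of random points inside a quarter circle."""
    # A seeded generator makes the simulation reproducible.
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        # Draw a point uniformly from the unit square.
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle covers pi/4 of the square, so scale by 4.
    return 4.0 * inside / num_samples


estimate = estimate_pi(100_000)
print(f"estimate={estimate:.4f}, error={abs(estimate - math.pi):.4f}")
```

The error shrinks roughly with the square root of the number of samples, which is the kind of behavior students would be expected to verify for themselves.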
Our fourth course would be an approved elective. Included in the candidate list would be Introduction to Web Development with Java, Statistical Methods II, Numerical Methods, and Data/Computation Project. Or something else: Biology students, for example, might get approval for a Bioinformatics class.
In OpenBI’s thinking, a four-course Computation/Data certificate wouldn’t be onerous and should be accessible to many students, especially engineering, science and quantitative majors. No matter what their major, students completing this program would position themselves for an apprentice role in data-driven business. A similar graduate curriculum for science PhDs facing an unfriendly job market fueled by declining government funding makes a lot of sense too.