Not all data science job opportunities are equal. I’ve seen hundreds of job descriptions for data science positions that are vastly different from one another. This happens for myriad reasons, not the least of which is that many organizations don’t really have an understanding of what a data scientist does or where a successful application of data science can help the organization.
Some companies advertise for a data scientist when what they really need is a data analyst. Others believe that a data scientist is a data architect or data engineer. I had one business that wanted me to build a database data model! My reply: “These aren’t the data models you’re looking for” (props to Obi-Wan Kenobi). I am not a data architect.
So if you are in the position of interviewing, you will need to have a discerning eye for the real data science opportunities versus the openings where they tell you, “someone told us we need a data scientist so we threw this description together.” Here are four questions to keep in mind when you have your phone or in-person interviews.
Question 1: Why do you think you need a data scientist?
This question is probably not the most important but should be the first. Whether you are speaking to the CEO, CFO, CTO or hiring manager, you really need to understand why they think they need a data scientist. The chances are good that they need some of the skills that a data scientist will bring to the table, such as data wrangling and visualizations, but often they conflate or confound various data skills into one gigantic term: data scientist.
Data scientists typically do not build data warehouses, or design relational database applications. Nor are data scientists, in general terms, software developers. However, a data scientist will need to query data inside data warehouses or data repositories or data silos. Very often they will also need to stage data from those various silos so that they can analyze that data.
So if the hiring team needs someone to design and architect a data strategy, they probably don’t need a data scientist. At least not yet. If they are looking for someone to build a relational database application, they aren’t looking for a data scientist, although some may have that skill set. If the organization wants someone who can take various data pipes, stage that data, and build an associative data model for display and analysis within a business intelligence application, they aren’t looking for a data scientist. That skill set belongs to a business intelligence developer. And although many data scientists have this skill set, those are not the core skills and functions that a data scientist needs or will use to perform their work.
Question 2: What business challenges do you want the data scientist to tackle?
This question is key and is related to question one. Chances are, the organization knows its business challenges all too well but is not sure how a data scientist can help. Make sure that they can clearly articulate whether these challenges relate to data science. It may be that they only need some of the skills that a data scientist may possess, not necessarily the “whole package.” They may have been told by someone that they need a data scientist. Or maybe they read it in a magazine. But they should be able to speak to why they need a data scientist versus, say, a data analyst.
I’ve been hired to do visualizations, which in and of itself is not a bad thing—I can do visualizations. I can use R or Qlik Sense or Tableau or even Excel, but so can a data analyst or a business analyst or an entry-level college graduate. And they don't require the salary I require.
If the organization does not have its sights on someday performing inferential, predictive or prescriptive analytics, then they do not need a data scientist.
Question 3: Do you have a data warehouse?
Many organizations do not have a good handle on their data. And that data immaturity can stand in direct opposition to a data scientist. They may have a lot of data coming at them very quickly from a lot of different sources, all of which is in various data silos or hundreds or thousands of Excel spreadsheets and across too many MS Access databases to count. Some of the data may be in an on-premises relational database or in some cloud application. But just because a business has a lot of data, even if it’s big data, doesn’t mean it has sufficient data maturity to help a data scientist do their work.
If they do not have a data warehouse or at least some strategy whereby they can take all of this disparate data and organize it for consumption by the various organizational departments, then a data scientist will have a rough go.
Can you imagine telling a stakeholder that you can do the work but it will take you a year to gather and munge the data? Even before you can analyze it? No business would tolerate that, and yet it’s feasible that it may take that long. At some point, a data scientist is going to need to talk to a subject matter expert (SME), maybe several times.
I’ve been in a position in which I was precluded from interfacing with the SMEs. They were “too busy.” I couldn’t do my work. Having data is not the same as having the right data! I once waited for over a year for a view enhancement to a SQL query that was a 15-minute task. The SMEs were too busy and I was not allowed access to the view to make the changes myself. My work slowed to a crawl. You do not want to work in that sort of environment.
Question 4: Within what time frame are you expecting results?
This may be the most important question. When are they expecting results? Like so many things in life, data science takes time. Even if all the stars align, a data scientist will still need to do a fair amount of data wrangling, analysis, and visualization. Then comes a fair amount of validation with the stakeholders before going back to do more wrangling, visualization and analysis. This is a slow process, not a quick event. And it’s not glamorous. If you expect data science to be glamorous, do something else.
So if the organization that you want to work for is expecting results in 30, 60 or 90 days, that is very likely an unreasonable expectation and you may not be able to live up to it. In other words, you’ll be setting yourself up to fail.
So ask questions and get real answers. Take notes, give them the nickel tour of what you think a data science project looks like, and compare that with what they are telling you. Then, and only then, will you be able to find a culture that is the right fit for doing good data science work.