Is 'Data Scientist' the 'Sexiest Job of the 21st Century'? And How Do You Get One of Your Own?

Even if you're not versed in advanced analytics and data science, you can understand the thought process data scientists go through.

By Asha Saxena

Opinions expressed by Entrepreneur contributors are their own.

When you hear the term "data scientist," what does it mean to you?

Is it the "sexiest job" of the 21st century, as the Harvard Business Review suggests? Does it describe a really smart person with advanced degrees in computer science, applied math, statistics or economics? Someone who analyzes and extracts business value from big data?

A data scientist can be all of these things and more. This type of professional looks for patterns and trends in large sets of data, using a variety of tools, techniques and critical thinking to arrive at practical solutions to real-life data-centric problems.

According to Hugo Bowne-Anderson in HBR, "Data scientists use online experiments, among other methods, to achieve sustainable growth. They also clean, prepare, validate structured and unstructured data to build machine learning pipelines, and personalized data products to better understand their business and customers and to make better decisions."

Now, even if you never studied advanced analytics or data science, understanding the thought process data scientists go through might help your early-stage startup grasp exactly what these professionals do:

Data scientists ask good questions.

Any data science project will have a set of expectations on deliverables, goals, results, length of time, etc. And as James Le pointed out in the Medium article "How to Think Like a Data Scientist in 12 Steps," a helpful way to understand a person's expectations more fully is to ask good questions.

"Good questions are concrete in their assumptions, and good answers are a measurable success without too much cost," Le wrote. Improving your skill in asking good questions is valuable in any business situation. It may help your early-stage startup if you're on a journey to become more data-driven. Mark Schindler discussed on TowardsDataScience.com how "creating a question landscape" can be useful for creating a data strategy; he suggested placing your questions into three categories:

  • What questions could you answer right now?
  • What questions could you answer if you did a little digging with your current data?
  • What questions can't you answer because you don't have the data yet?

Schindler offered examples of questions for each category:

  • "How many downloads did you have in the past 30 days?" might fall into the first category.
  • "What are the age demographics of your most frequent users in the past 30 days?" might fall into the second.
  • And "What is the average session length of your top and bottom quartile of users?" might fall into the third.
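
The three categories can be sketched in a few lines of Python. This is only an illustration of the idea, not Schindler's method: the field names (`download_date`, `user_age`, `login_date`, `session_length`) and the split between "reported" and "raw" data are assumptions made up for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    ANSWERABLE_NOW = "can answer right now"
    NEEDS_DIGGING = "can answer with a little digging in current data"
    NEEDS_NEW_DATA = "cannot answer until new data is collected"


@dataclass
class Question:
    text: str
    required_fields: set  # hypothetical data fields the question depends on


def categorize(question, reported_fields, raw_fields):
    """Place a question into one of the three categories.

    reported_fields: fields already surfaced in reports or dashboards.
    raw_fields: fields that are stored somewhere but not yet analyzed.
    """
    if question.required_fields <= reported_fields:
        return Category.ANSWERABLE_NOW
    if question.required_fields <= (reported_fields | raw_fields):
        return Category.NEEDS_DIGGING
    return Category.NEEDS_NEW_DATA


# Assumed inventory of what the startup already tracks (illustrative only).
REPORTED_FIELDS = {"download_date"}
RAW_FIELDS = {"user_age", "login_date"}

QUESTIONS = [
    Question("How many downloads did you have in the past 30 days?",
             {"download_date"}),
    Question("What are the age demographics of your most frequent users?",
             {"user_age", "login_date"}),
    Question("What is the average session length of your top and bottom "
             "quartile of users?",
             {"session_length", "login_date"}),
]

landscape = {q.text: categorize(q, REPORTED_FIELDS, RAW_FIELDS) for q in QUESTIONS}

for text, category in landscape.items():
    print(f"{category.value}: {text}")
```

Questions that land in the third category show which data you would need to start collecting first.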

This exercise helps you figure out what you can already answer about your business and your data. It may also point you toward a data road map: new questions or hypothetical scenarios, not yet proven or known, that you would like to explore further.

Data scientists understand how to identify data sources and their value.

Bill Schmarzo, CTO of the big data practice at Dell EMC, created a 28-page white paper, the Think Like a Data Scientist workbook. In it, he delved into the data scientist thought process of using predictive and prescriptive analytics to find the right answers so that a business can achieve its objectives.

I particularly liked the section called "identify data sources," which explained that during the eight-step workbook exercise, a reader will find all kinds of new data sources that "might" provide value with respect to 1) the targeted business initiative (increasing sales, revenue, website traffic, conversions, etc.) and 2) the key business decisions he or she is looking to make. Likely data sources, the white paper said, include:

  • Historical operational and transaction systems data (ERP, financials, HR, supply chain, sales force automation and marketing), for which data is captured but is likely not available on readily accessible platforms.
  • Internal unstructured data sources like email conversations, consumer comments, clinical studies, research papers and notes from employee and customer interactions.
  • External data sources, including social media, newsfeeds, weather, traffic, economics, research papers, white papers (like the Think Like a Data Scientist workbook) and public domain data from government and college institutions.

In the workbook, Chipotle is a frequent example. Chipotle's data sources could consist of: point of sales transactions, market baskets, Product Master, store demographics and competitive stores sales, store manager notes, employee demographics, consumer comments, Yelp, Zillow/Realtor.com, Twitter/Facebook/Instagram and more.

Once you have identified a variety of data sources, the next step is to assess the business value that each source brings with respect to supporting certain key business decisions. You can set up a spreadsheet and plot the data sources as row headers in the first vertical column, then plot the key business decisions as horizontal column headers in the first row. In the example with Chipotle, some of the business decisions were:

  • Increase store traffic
  • Increase shopping bag revenue
  • Increase promotional effectiveness

You can do this exercise yourself by putting in business use case questions relevant to your industry and startup.
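
If you would rather prototype the grid in code than in a spreadsheet, the sketch below builds the same source-by-decision matrix in plain Python. The 0-to-4 scores are invented placeholders, not values from the workbook, and ranking sources by their total score is just one simple way to read the grid.

```python
# Key business decisions (the column headers of the grid).
DECISIONS = [
    "Increase store traffic",
    "Increase shopping bag revenue",
    "Increase promotional effectiveness",
]

# Data sources (the row headers) with one score per decision, in the order above.
# 0 = no expected value, 4 = high expected value.
# These numbers are made up for illustration only.
SCORES = {
    "Point of sale transactions": [3, 4, 4],
    "Market baskets": [1, 4, 3],
    "Store demographics": [4, 2, 3],
    "Consumer comments": [2, 1, 2],
}


def build_matrix(scores, decisions):
    """Return {source: {decision: score}} and check every row is complete."""
    matrix = {}
    for source, row in scores.items():
        if len(row) != len(decisions):
            raise ValueError(f"{source!r} needs one score per decision")
        matrix[source] = dict(zip(decisions, row))
    return matrix


def rank_sources(matrix):
    """Sort data sources by their total score across all decisions."""
    totals = {source: sum(row.values()) for source, row in matrix.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


matrix = build_matrix(SCORES, DECISIONS)
ranking = rank_sources(matrix)

for source, total in ranking:
    print(f"{source:<28} {total}")
```

Swap in your own data sources, decisions and scores; the totals are only a starting point for discussing which sources deserve attention first.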

Some helpful tips for hiring your first data scientist.

If you're a startup or business looking to expand into the world of big data and machine learning so as not to fall behind your competitors, it might be time to hire your first data scientist. Hiring for this role can be more complex than hiring, say, a software developer. Forbes contributor Shourjya Sanyal recently wrote, in a post titled "How To Hire Your First Data Scientist," that this task is more complex because:

  • It is difficult to write up a job description for a data scientist role.
  • A large number of data scientists may be willing to apply, yet few have the required experience.
  • Few industry standards and benchmarks are available.

Sanyal suggested questions that may help the interviewing process. If, for instance, you are building a data product or app, hiring a scientist directly from the academic world and allied research laboratories might not give you the "software engineering experience, as well as some management experience" needed to prioritize tasks and drive your business's value.

Of course, there is always the example of Uber, which hired a VP of data science at Twitter. Experience scraping and preparing data from different sources to build data pipelines may also be helpful for launching data-driven products. Sanyal also mentioned asking about a candidate's portfolio and, if it includes a team project, what specifically the candidate contributed.

Overall, you need to define your company's needs for your first hire. In "The art of hiring data scientists," Sara Vera, a data scientist at Insightly, wrote that, "If you're looking to build ad-targeting or recommendation engines or [to] do algorithmic training, then you are going to want to look for a candidate with really strong mathematics and computer science background."

Just remember that the field of data science is relatively new and is often categorized under an umbrella of different but overlapping skill sets such as data mining, data engineering, data prep, artificial intelligence, machine learning, analytics, big data, statistics or even data visualization.

For instance, if you need a data scientist who will be reporting to managers about how your product is doing, or how user growth has increased or dropped off, then finding one who is good at "data storytelling" is helpful. Forbes contributor Brent Dykes described data storytelling as "a structured approach for communicating data insights that involves a combination of three key elements: data, visuals and an overarching narrative of what is going on."

And, as Vera wrote, these types of data scientists may come from "social science academic backgrounds because in the medical field, for example, sociology, economics or geography makes them accustomed to doing this with their data already."

Once you do more research into exactly what your startup needs, you can make a more informed decision about who your first data scientist hire should be. Understanding this professional's strengths and weaknesses in engineering and programming languages can help you build a hiring road map toward growing an entire data science team down the road.

Asha Saxena

President and CEO of Aculyst Inc.

Asha Saxena is a Columbia University professor and founder of Aculyst Inc., a healthcare big data solutions and advanced analytics services firm.
