Data has become increasingly crucial for businesses around the world. With databases filling up with petabytes of data every week, it’s critical that businesses hire data scientists to analyze it all and extract useful insight.
Data scientists are in high demand. It’s not an easy job, however, and it takes a certain mindset to be successful in the role. We spoke with several experts to find out what it takes to become a data scientist and what to expect in a data scientist job interview.
What are the essential qualities of data scientists?
Koyuki Nakamori, the head of data science at Public.com (and recently at the Headspace meditation platform), told Dice that successful data scientists have a few key characteristics: “Analytical rigor and statistical skills, strong coding skills in SQL/Python, and storytelling/communication skills.”
Adam Sugano, executive director of data analytics at the University of California, Los Angeles (UCLA), adds: “Data science is a constantly evolving field, with new tools and technologies introduced every year that force workers in this field to constantly learn.”
Curiosity is a key quality in data scientists, says Sugano: “Not only do they enjoy the learning process and soak up the new knowledge gained, but they immediately turn around and start thinking about how this new tool, this new method, this new data domain, etc. can be applied to the scope of the problems they have been asked to solve.”
How can you show curiosity in your application materials? Sugano often looks for voluntary participation in data competitions or continued learning through platforms such as Datacamp. Featuring your personal data projects or a data-science blog can also help demonstrate your passion for the field.
“In addition, a data scientist must know how to think about a problem,” Sugano continues. “Oftentimes, I observe people on the ‘business’ side asking data science teams questions that are necessary but not sufficient. The best data scientists don’t just take orders; they partner with the person asking, working to understand their world so that they can help frame both the problem and the question in a way that leads to better results all around.” This skill is almost impossible to detect just by reading a resume, but it can be identified through insightful questions during the interview process.
What stands out on a data scientist’s resume?
Knowledge of statistical methodology is fundamental when it comes to preparing your application documents. “There are too many people who call themselves data scientists simply because they have completed a four-course sequence on Coursera or a 12-week Python bootcamp,” says Sugano. “Don’t get me wrong, these are good starting points, but just because someone lists a Kaggle project on their resume in which they used their favorite machine learning algorithm doesn’t mean they actually know what this algorithm is doing behind the scenes.”
In other words, the job involves much more than calling a predictive modeling function in R or Python; you have to know why you are doing something, as well as how to interpret the results. Knowing the limits of a tool or model is also essential. According to Sugano, “people trained in statistics can not only call up the functions that run the algorithms, but they also know how to properly prepare the data for the model being used, how to tune the model for even better performance, and can answer direct questions about the algorithm, how the predictions were generated, and/or what the predicted values mean.”
John Fordice, Analytics Lead at Bonsai, agrees: “The candidate should be able to express their passion for data science.”
Nakamori adds: “Candidates with multisectoral experience, interdisciplinary training (mathematics, statistics, computer science), and strong computing experience” are of particular interest to many organizations.
What questions can you expect in a data scientist interview?
Nakamori, along with Abhinav Unnam (Senior Data Scientist at Aviso AI) and Benn Stancil (Co-Founder and Chief Analytics Officer at Mode), suggests a few questions you should expect in a data scientist job interview:
- Python coding test, which typically uses concepts such as lists, dictionaries, etc.:
  - Find all combinations of strings in a specific URL consisting of strings that meet specific requirements.
  - Scheduling algorithms: compute the total time spent across a series of overlapping time intervals by taking the union of the intervals.
- Machine learning case interview:
  - Solve an end-to-end problem statement.
  - Define the problem statement, then find the solution.
  - Explain it in layman’s terms, including the metrics: why these, and how would you measure them?
- How would you help our sales leadership team decide if the sales team is the right size?
- How would you measure the impact of a billboard?
- How would you help an Airbnb host decide how many photos to post on their profile?
- What is the P-value in simple terms?
- Type 1 and Type 2 errors: Explain in simple words.
- How to convert a wide data frame to a long data frame and vice versa in SQL and Python.
- What is XGB and why is it effective?
- What is a random forest? How is the importance of features calculated?
- What is logistic regression? How is maximum likelihood used?
- Code a logistic regression model from scratch using OOP.
- Tell me about a project that you led from its inception to its impact on the company, step by step.
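To give a sense of what the overlapping-intervals question in the list above is asking for, here is a minimal Python sketch of one common approach: sort the intervals by start time, merge any that overlap or touch, then sum the lengths of what remains. The function names `merge_intervals` and `total_time` and the sample meeting times are hypothetical, chosen only for illustration; an interviewer may expect a different input format or edge-case handling.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[float, float]  # (start, end), with start <= end


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Return the union of the intervals as sorted, non-overlapping intervals."""
    merged: List[Interval] = []
    # Sorting by start time means each interval can only overlap the last merged one.
    for start, end in sorted(intervals):
        if end < start:
            raise ValueError(f"interval end {end} is before start {start}")
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it if needed.
            previous_start, previous_end = merged[-1]
            merged[-1] = (previous_start, max(previous_end, end))
        else:
            # No overlap: start a new interval.
            merged.append((start, end))
    return merged


def total_time(intervals: Iterable[Interval]) -> float:
    """Total time covered by the intervals, counting overlapping time once."""
    return sum(end - start for start, end in merge_intervals(intervals))


if __name__ == "__main__":
    # Hypothetical meeting times, expressed in hours of the day.
    meetings = [(9, 11), (10, 12), (13, 14), (13.5, 15)]
    print(merge_intervals(meetings))  # [(9, 12), (13, 15)]
    print(total_time(meetings))       # 5
```

The sort dominates the running time, so the sketch runs in O(n log n). In an interview, it is worth saying out loud how you treat intervals that merely touch, such as (1, 2) and (2, 3), which this version merges into one.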
Interviews can be especially difficult with some hiring managers and data scientists, especially if the job itself is ultra-specialized. “All of my questions are tailored to the individual through a combination of the specific nature and needs of the job and the specific skills and experiences that a candidate lists on their resume,” explains Sugano. “In addition, I find it beneficial to give candidates take-home assignments with real data to manipulate and analyze.”
This kind of process, he adds, “is a better reflection of the real world, where workers have Google Search, Stack Overflow, etc. available to them, instead of expecting them to know the answer to a limited set of programming, statistics or probability questions (if there are 100 bulbs in a row and…).”
Communicating your results is also extremely important; when you sit down with the recruiter and hiring manager, be prepared to explain the logic behind solving a problem in a certain way. Much of a data scientist’s job is presenting data and analysis to multiple stakeholders, including executives.
Are there any online tools that data scientists can use to prepare for an interview?
“Yes and no,” Stancil said. “There are many tools with sample technical questions and many online tutorials for learning technical languages. These tools are useful, and for a lot of interviews I think they help.”
But for highly specialized data scientist roles, such platforms may be less useful. “The best preparation is to try to solve a problem with the data,” adds Stancil. “It doesn’t have to be a big problem, but being able to talk about those experiences, the problems you encountered and how you tried to resolve them is far more useful and impressive to me than anyone who can list the predictive models with which they are familiar.”
Nakamori encourages data scientists to practice via “HackerRank, Leetcode, Interview.io, AlgoExpert” and the seemingly endless YouTube channels available.
Fordice adds: “Maintenancekickstart.com is a great resource with a six-week course for data scientists.”
Sugano notes that if you really want to pass the interview, researching your potential employer can yield great results: “Data scientists should do their research from the perspective of really trying to understand the business model of a company and anticipate the ways that business is already or should be leveraging data to improve business decisions. Asking questions about all of a company’s data assets and how they are used today, as well as coming up with potential new applications for their use, is a way for a data scientist to stand out by showing a strong interest in the company and solid business insight.”