Which skills and traits are recruiters currently looking for in a Data Science professional? Let’s find out.
Once upon a time people came into Data Science not by choice, but by chance. Back then, anyone with thorough mathematical skills and a fair amount of expertise in computers and/or coding could – and mostly would – steer towards the role of Data Scientist after spending some years as a mathematician or a programmer. They learnt the tricks of Data Science on the job, as it was not yet a full-fledged specialized domain. Data Science was all about application then.
Starting out as an eclectic mix of statistics, mathematics and computing in the mid-20th century, Data Science gradually evolved into an interdisciplinary field that leveraged scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data. Around the 1990s, the domain started to attract the business world as it displayed the potential to yield actionable insights of tremendous commercial value. From then on, Data Science began to grow by leaps and bounds, and talented students started to choose careers in it deliberately.
Admittedly, the basics remain the same – statistics, mathematics and computing still form the core of Data Science. However, as the backbone of digital business practices in the virtual era, new-age data professionals require a host of allied skills that are now crucial for both the interpretation and the manipulation of data. As we cruise towards another new year, let’s find out which skills and traits recruiters are currently looking for in a Data Science professional.
A good academic track record will never go out of fashion
As always, a PhD in Data Science, computer science, mathematics, or statistics is the greatest resource an emerging Data Scientist can have. But there are other avenues to pursue beyond traditional academia. Many new-age universities now offer dedicated Data Science courses at the undergraduate level. One can also pursue certifications, diplomas and postgraduate degrees in Data Science after completing a regular degree in an allied discipline, provided the candidate possesses the basic capabilities required. A host of very effective online courses on the topic are available too.
Understanding how the industry conducts business will give a definite edge
While specific domain knowledge provides the foundation, a strong understanding of the industry basics gives a candidate the insight to go the extra mile. A specific skill is only as good as the duration of the related project, and skills can – and must – always be learned, unlearned and re-learned. But knowing which business model the industry is veering towards will surely make you a winner, because it allows a candidate to remain future-ready. As Michelle Kiem, Director of Program Management and Data Governance at Bio-Rad Laboratories, commented recently at the DataScienceGO conference held in San Diego: “Problem definition and business acumen alongside maturity are important skills we look for. …We need our data scientists to be able to select and apply appropriate methodology, considering both technical and business constraints.”
Critical thinking and problem-solving should be second nature for a data scientist
Data is a meaningless mass unless you know how to find meaning in it. To do that, you need a keen problem-solving mind that can figure out what goes where and why. Critical thinking will let you connect the dots, understand where the problem lies, identify a pattern in the mass of data, and structure it effectively to arrive at a solution. Unless you know how the collected data fits into the big picture, you cannot make informed decisions based on facts. We don’t think critically by nature – it is an acquired skill that can be nurtured and sharpened. Grab all available opportunities to be part of stimulating projects, and never stop questioning.
Visualize what you analyse and engagingly narrate what you see
When data speaks in numbers, it speaks to the specialist. But when it speaks in visual images, it is open to all. Data visualization is a powerful communication tool that can take your analysis to its intended consumers. Representing data visually according to the business needs, and connecting the visualizations to weave a complete narrative, is the best way to present structured data. In this age of online content consumption and short attention spans, the ability to engage your audience with meaningful visuals that tell a story is an asset for any data professional. The story can take the form of an animated video, an interactive online dashboard, a PPT deck, a PDF or Word-based report, or even a well-designed Excel sheet. The medium does not matter as long as you can visualize your findings and present them well.
Doing your own programming is the way to go
Data Scientists need not necessarily be computer scientists – but let’s admit it, there’s no data science without programming. Coding is the only way you can instruct the computer to process your data as per specific structuring requirements. Having a spare programmer always at your disposal is neither practical nor cost-effective. To survive, all data scientists need to be comfortable writing code. Of course, they are not required to come up with entire software solutions, but thinking like a programmer and being able to write their own code – mostly in Python or SQL nowadays – is becoming a regular part of Data Science jobs.
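To give a flavour of the kind of everyday code this means, here is a minimal sketch that combines Python and SQL using only Python’s built-in sqlite3 module. The table name sales, its columns, and the function average_by_region are hypothetical, chosen purely for illustration; a real project would query an existing database rather than build one in memory.

```python
import sqlite3


def average_by_region(rows):
    """Return {region: average amount} for an iterable of (region, amount) pairs.

    The 'sales' table and its columns are illustrative assumptions,
    not part of any real schema.
    """
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        # Parameterised inserts keep the data out of the SQL string itself.
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, AVG(amount) FROM sales "
            "GROUP BY region ORDER BY region"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()
```

Calling average_by_region([("north", 10.0), ("north", 20.0), ("south", 5.0)]) would return one average per region. The point is not the query itself but the habit: loading data, asking a question of it in SQL, and getting the answer back into Python for further analysis.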
Can you deploy your own models?
Data Scientists work with data not for the fun of it, but because someone wants to extract some actionable insight which that data might throw up if structured the right way. The end-consumer of that insight would, therefore, require a model fashioned out of that data analysis. And however accurate your model might be, business users will not want it if they can’t use it. Deploying a model can be done by someone else – just like programming – but you might not be lucky enough to always get that support. Hence, deployment is fast evolving into a mainstream requirement for machine learning engineers. If nothing else, you should be able to create an API around your model and deploy the application, hosted on a cloud-based virtual machine.
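As a rough illustration of what “an API around your model” can look like, the sketch below wraps a stand-in model in a small HTTP endpoint using only Python’s standard library. Everything specific here is an assumption made for the example: the fixed WEIGHTS and BIAS stand in for a trained model, and the /predict path, the JSON field names and the default port are arbitrary. In practice, teams often use a dedicated web framework and load a saved model instead.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for a trained model: a fixed linear score.
WEIGHTS = [0.5, 2.0]
BIAS = 1.0


def predict(features):
    """Return the linear score for one list of numeric features."""
    if len(features) != len(WEIGHTS):
        raise ValueError(f"expected {len(WEIGHTS)} features")
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, features))


class PredictHandler(BaseHTTPRequestHandler):
    """Answers POST /predict with a JSON body like {"features": [1.0, 2.0]}."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404, "unknown endpoint")
            return
        try:
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length))
            body = {"prediction": predict(payload["features"])}
            status = 200
        except (ValueError, KeyError, TypeError) as exc:
            # Malformed JSON, missing field, or wrong feature types/count.
            body = {"error": str(exc)}
            status = 400
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def make_server(host="127.0.0.1", port=8000):
    """Build (but do not start) the HTTP server wrapping the model."""
    return HTTPServer((host, port), PredictHandler)

# To serve requests locally, call: make_server().serve_forever()
```

Once the server is running on a virtual machine, any business application that can send an HTTP request can use the model without knowing how it was built. Keeping predict separate from the request handling also means the model logic can be tested on its own before anything is deployed.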