Setting the Standards for Data Science

Moves to define AI roles and set ethical standards gather momentum

As data science becomes business as usual, both in research and development and in business applications, users and developers are converging on the need for ethical standards and for clearly defined roles and titles in data analytics positions. In other words, the data science industry, as it matures, needs rules of the game and a set of standards. It needs to define who can be called a data scientist, an analytics professional or someone who simply works with data. Right now, everything is in a state of flux, which of course is quite natural.

Currently, the role of data science and analytics is expanding very rapidly, whether in creating new business models or changing existing ones. Consequently, the demand for analytics professionals is growing at an increasing rate. But their roles are not well defined and often overlap. Almost every company in the industry has its own way of describing roles and assigning titles in positions related to data analytics, leading to a chaotic market that confuses employers, academic and training institutions, and candidates alike. A large number of unqualified candidates are even calling themselves “data scientist”, “data architect”, “data engineer” or “analytics professional”, and no one is sure who is what.

The Initiative for Analytics and Data Science Standards (IADSS) has been launched to address this issue, and it has kicked off a global research study. The study aims to gain insight into the analytics profession in the industry and to support the development of standards for analytics role definitions, required skills, and career advancement paths.

IADSS comprises representatives from academic institutions such as MIT and Imperial College, and from organizations such as LinkedIn and Google. It will rely on the expertise of academics and industry experts in the data world to ensure that the defined standards are academically sound and rooted in industry realities. IADSS will also work to create awareness of, and promote, data science and analytics standards globally.

The other key issue is the mounting debate over ethical standards to guide data scientists. This concerns which kinds of algorithms are beneficial or harmful for humanity, and how much autonomy a machine should have. For instance, should it be granted the power to override a human decision? Only last year, OpenAI voluntarily withheld the release of an algorithm that could create a full logical sentence, based on only one suggested word, with chilling levels of accuracy. It had the potential to create a realistic communiqué that could mimic any standard United Nations message, and so could cause a global catastrophe if manipulated by the wrong people. OpenAI rightly decided that it should not release this code.

Although there has been no lack of discussion and debate about ethics in recent years, there are still no formal standards to guide data scientists themselves. Now, industry bodies in the UK are hoping to change that: the British Computer Society (BCS), along with the Royal Statistical Society (RSS) and the Royal Academy of Engineering (RAEng), has kicked off work to establish industry-wide professional standards in data science. The objective is to uphold the ethical use of public data and, ultimately, to make data scientists trusted professionals, as trusted as doctors, lawyers, or architects.

The flood of data generated by the Covid-19 pandemic, covering metrics that range from health to mobility, has certainly helped propel the discussion forward. This, combined with the imminent departure of the UK from the EU, has made industry leaders realize that establishing a trustworthy data ecosystem should be a top priority.

Regardless of current geopolitics and global health crises, there is no denying that the role of data science is exploding. The World Economic Forum estimates that by 2025, 463 exabytes of data will be created globally every day, and that this data will be combined with algorithms to determine everything from credit scores to the allocation of welfare benefits.

In this context, the public needs to know where information is coming from, how it is being used, where it is going, and whether the technologies exploiting data are aligned with the public interest. Applying standards to industry operations would ensure that data scientists adhere to strict guidelines.

There is no dearth of evidence that data science could benefit from a fresh approach to standards. From an ethical perspective, artificial intelligence is a field that has repeatedly been in the spotlight for the technology’s failure to stick to principles of fairness and transparency, with some algorithms set to exacerbate pre-existing biases within society.

It is easy to see how embedding the principles of responsibility and dedication to the public interest in the fabric of data science could benefit the industry as a whole. This would be effective only if done from the earliest stages of a data professional’s development. Having standards in place to regulate the processes would improve the quality of the science in the long run. Correspondingly, it would provide immediate backing for data scientists’ work. Standards, indeed, could act as a badge of quality for the profession overall.


© 2023 Praxis. All rights reserved. | Privacy Policy