Women in Technology – Praxis Scholarships

Women in data science can reduce algorithmic bias; Praxis launches a scholarship program to bridge the gender divide

Delve into the history of computing and you will find that women played integral roles in the World War II war effort in the United States. As men were drafted to fight, many ‘computers’ (a role performed by humans who calculated manually) during WWII were women – frequently with degrees in mathematics. Women were involved in calculating ballistics tables during the war. In the sixties too, women engaged in computing work were called “computers”; but since then there has been a steady decline in the number of women in technology.

Studies have shown that only a third of the IT workforce is women, and the share is dwindling, especially in India, where women employees often drop out in their third year of employment to raise a family. The gender gap in data science is even sharper: in a class of 35 students, hardly six would be women.

Some international studies estimate that women hold as little as 10% of the industry’s executive leadership positions, compared to 23% across all industries. This is even though diversity, including gender diversity, is just plain good for business. In 2019, the top quartile of companies for gender-diverse executive teams were 25% more likely to have above-average profitability than their counterparts in the bottom quartile.

Women hold just 18% of data science jobs in the US, and the problem is worse in most lower-income countries, where women are less likely to have access to the science, technology, engineering, and mathematics (STEM) education that provides an entry point to a career in data science. In addition to increasing the risk of bias, gender imbalances in STEM and data science training make it harder for women to succeed in high-paying professions linked to the digital economy, further widening gender pay gaps.

Every study indicates that there is a compelling business case for women in leadership roles, and it is even more important to include women data scientists. The reason is simple: bias in artificial intelligence is a major challenge, and bringing more women into the teams is a sure way of substantially reducing that unfairness in data models.

As we navigate the lasting effects of the pandemic and social unrest, mitigating AI bias will only become more important. Here are several ways to get your own organization to focus on creating fairer AI:

  • Ensure that training samples include diversity to avoid racial, gender, ethnic, and age discrimination.
  • Whether labelling audio samples or generic data, it is critical to ensure that there are multiple and different human annotations per sample and that those annotators come from diverse backgrounds.
  • Measure accuracy levels separately for different demographic categories to see whether any group is being treated unfairly (a minimal sketch of such an audit follows this list).
  • Consider collecting more training data from sensitive groups that you are concerned may be at risk of bias – such as different gender variants, racial or ethnic groups, age categories, etc. – and apply de-biasing techniques to penalize errors.
  • Regularly audit (using both automatic and manual techniques) production models for accuracy and fairness, and regularly retrain/refresh those models using newly available data.
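
To make the auditing suggestion concrete, here is a minimal sketch of a per-group accuracy check; the dataset, the group labels, and the use of pandas are our own illustration, not a prescribed toolchain.

```python
# Minimal sketch of a per-group accuracy audit. The data, the demographic
# groups, and the numbers are invented for illustration.
import pandas as pd

df = pd.DataFrame({
    "group":     ["A", "A", "A", "B", "B", "B"],   # demographic category
    "actual":    [1, 0, 1, 1, 0, 0],                # ground-truth labels
    "predicted": [1, 0, 1, 0, 1, 0],                # model outputs
})

# Accuracy computed separately for each demographic category.
per_group = (df["actual"] == df["predicted"]).groupby(df["group"]).mean()
print(per_group)                           # A: 1.00, B: 0.33

# A large gap between the best- and worst-served groups flags potential
# bias and is worth tracking at every audit cycle.
print(per_group.max() - per_group.min())   # 0.67 disparity
```

In production, the same check would run on fresh predictions at every retraining cycle, alongside fairness metrics beyond raw accuracy.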

Gender bias creeps into algorithmic models when those models are designed predominantly by men. In an attempt to design a computer program to guide its hiring decisions, Amazon used resumes submitted over the previous decade as training data. Because most of these resumes came from men, the program taught itself that male candidates were preferable to women. While Amazon realized this tendency early on and never used the program to evaluate candidates, the example highlights how relying on biased data can reinforce inequality.

We at Praxis have decided to do something about bringing more diversity, especially women, into data science. We have launched the Praxis Women in Tech (WiT) Scholarships, an initiative to encourage and support women’s participation in tech, data and management careers. This is in line with our belief that gender diversity in the workforce brings immense value to the organization, the economy, and society. As part of the Praxis WiT Scholarship program, women candidates who get selected to the prestigious PG programs* at Praxis and fulfil the WiT eligibility* criteria will win a scholarship of Rs 1 lakh.

Click here for the details of the Praxis Women in Technology Scholarship program

Top data scientists to follow on Twitter

Data science plays a vital role in the world today and will play a huge role in solving problems that now seem unsolvable. As a student or anyone interested in data science, you need to stay updated on the latest news about data science, Artificial Intelligence, and the other technologies shaping the field. So, what better place to do that than Twitter?

Read our article on: Expert tips to land a data science job in 2021

Twitter, one of the leading social media platforms of the 21st century and one of the chattiest, is where people share their opinions and make memes go viral with the entire world. However, instead of letting Twitter distract you from deep work, you can turn it into a platform for learning and exploring new ideas. Here are some of the top data scientists to follow on Twitter.

Yann LeCun

Yann LeCun is a famous personality in the tech world. He is a French computer scientist who heads AI research at Facebook. He specialises in machine learning, computer vision, mobile robotics, and computational neuroscience. He is the founding director of the NYU Center for Data Science. Considered one of the pioneers of the field, he also received the A.M. Turing Award in 2018 for his innovations in deep learning.

Follow Yann LeCun on Twitter 

Read our article on: How is data science as a career?

Kirk Borne

Kirk Borne is an analytics veteran, educator, and space scientist with a Ph.D. in astrophysics, and more! He taught data science, statistics, data ethics, computational modeling, and more for 12 years at George Mason University, and worked with NASA for more than a decade. Currently, Kirk is Principal Data Scientist at Booz Allen Hamilton. He is an important personality in the world of data science, and if you’re serious about the field, you should follow him on Twitter.

Follow Kirk Borne on Twitter 

Read our article on: Do you need to know coding to learn data science?

Sebastian Thrun

Some of you might have heard his name in the tech world: Sebastian Thrun is the founder of Google X, a research lab intended to investigate far-off technologies and possibilities. Google X is the lab behind Google Glass, autonomous vehicles, and more. He is also the founder of Udacity, an educational platform. Currently, Thrun is a researcher at Stanford University. He is one of the top data scientists you should follow on Twitter.

Follow Sebastian Thrun on Twitter

Andrew Ng

Andrew Ng is one of the most prolific specialists in the world of AI and ML. He is the co-founder of Google Brain, the deep learning and AI research division of Google. He has worked as Vice President and Chief Scientist at Baidu, leading its AI group, and also co-founded the popular educational platform Coursera.

Follow Andrew Ng on Twitter 

Wes McKinney

If you’re a Python master, Wes McKinney is a data scientist you should know. He is the creator of pandas, the data analysis library for Python. He also created the Ibis project and is a co-creator of Apache Arrow. If you’re a coding geek, Wes McKinney is one of the top data scientists you must follow on Twitter.

Follow Wes McKinney on Twitter

The application of data science to business and IT keeps this data-driven world moving forward. So, if you’re curious and want to feed your brain with the latest news and information about data science, this list of top data scientists to follow on Twitter will help. But remember, Twitter is only as good as the people you follow.

As a premier business school in India, Praxis offers a 9-month full-time postgraduate program in Data Science. With our vast experience in business education, we offer students both the time to understand the complex theory and practice of data science concepts and the guidance from knowledgeable faculty who are available on campus for mentoring. We also have a well-structured campus placement program that ensures interview opportunities with the most significant companies in the field.

Expert tips to land a Data Science job in 2021

So, after spending numerous hours learning and coding for more than a year, and sacrificing weekends and holidays to sculpt yourself into a data science graduate, you’ve finally completed your data science program and earned your certificate. Now, a data science job at your favorite company with a hefty salary package would be the perfect closure, wouldn’t it? As great as this sounds, getting a data science job as a beginner in 2021 can be hard: data science is highly sought after by both job-seekers and hiring managers. So, if you’re a beginner looking for a data science job, here are a few expert tips to land one in 2021.

Read more on: Top trends in Data Science Jobs in 2021

Find your role

Whether it is data science or any other field, finding the right role where you’d fit in is the first step. Pursuing the right role leads to a better application response rate, a better interview experience, and a stronger candidacy. One thing to keep in mind is that the role of a data scientist can look very different depending on the size and stage of the company. The title ‘data scientist’ means different responsibilities at different companies, and sometimes even within the same company. So, analyze and find the right role for you.

Read more on: Do you need to know coding to learn data science?

Make a systematic action plan

Whether you like it or not, job hunting is itself a full-time job. You need to apply to multiple companies, network, prepare, practice, and never miss a recruiter’s call or email. Set clear daily or weekly targets for these activities; this way, you’ll make the most of your job search. Also, remember that the job search is a funnel: you might get multiple rejections, but these are part of the process, and you need to keep hunting until you succeed.

Read our article on: How to learn data science from scratch?

Craft your online presence

The COVID-19 crisis significantly changed how employers hire. Data science recruiters are turning their attention to virtual employer branding, outreach, and assessment tools, so it is crucial to craft your online presence. One of the most important platforms where recruiters assess you is LinkedIn: choose a professional image, create a summary section optimized for relevant keywords, and make sure you’ve listed all your skills and certificates. You should also brush up on your virtual interviewing skills; set up mock sessions to practice how you come across in an interview.

Negotiation

Now, this is the last and most important tip to land a data science job in 2021. Once you’ve aced the interview and the recruiter gets to the salary discussion, you have the leverage to seal the deal. Always assume that the salary is negotiable and never settle for the initial offer. Consider the company culture and what you bring to the table; this will help you negotiate a better package. Many start-ups offer equity and larger companies include stock options, so make sure you negotiate the full package, not just the base salary.

Companies across the world are actively searching for data science professionals. By taking advantage of this time to pivot into data science, you can succeed in a high-demand profession and take advantage of a job market that continues to thrive in the midst of a global pandemic. As a premier business school in India, Praxis offers a 9-month full-time postgraduate program in Data Science. With our vast experience in business education, we offer students both the time to understand the complex theory and practice of data science concepts and the guidance from knowledgeable faculty who are available on campus for mentoring. We also have a well-structured campus placement program that ensures interview opportunities with the most significant companies in the field.

The Year of the Swan

The surge of crisis innovation has only just begun, and Black Swan innovations will unleash remarkable discoveries in all spheres

From digital transformation and mRNA (messenger RNA) vaccines to drone-delivered medicines, the black swan event of the pandemic has unleashed unprecedented forces of innovation across industries from MedTech to EdTech. Some of these innovations are huge scientific leaps forward that will deliver remarkable benefits to humanity for years to come.

A black swan is an unpredictable event that is beyond what is normally expected of a situation and has potentially severe consequences. Such crises have always been a force for innovation because they create a sense of urgency, bring purpose to action, and break down silos to forge collaboration across disciplines, technologies, teams, and borders. This urgency helps everyone focus on the task at hand and channel resources into solving the burning challenge.

The development of the COVID-19 vaccines in the record time of one year is itself a remarkable example of global collaboration. The scientists of Wuhan, China, who were the first to face the onslaught of the virus, shared its genome structure with the rest of the world. Using this valuable information, pharma giants of the western world developed the vaccines, and Indian companies collaborated with them to manufacture millions of doses to meet global requirements.

This was the culmination of years of research in the field and of huge investments by pharma companies in using data science to crunch massive volumes of data into actionable intelligence. Teams with expertise in vaccine production shifted their focus almost exclusively to producing a COVID-19 vaccine at scale. Similarly, gene-sequencing labs that had been using their sophisticated equipment for research projects reallocated all of their skills, equipment, and lab resources to COVID-19 testing.

The Broad Institute in Cambridge, Massachusetts, and the Sanger Institute in Cambridge, England – both of which normally use cutting-edge DNA sequencing approaches for research and clinical trials for a wide range of diseases – took less than two weeks to turn all of their lab testing resources to the DNA sequencing of COVID-19 samples for patient testing. A new AI COVID-19 screening test, named CURIAL AI, which uses routinely collected clinical data for patients presenting to hospital, proved able to diagnose the disease accurately across a wide range of populations.

Scientists are now experimenting with COVID-19 vaccine technology as a way to treat terminal illnesses like cancer and HIV. That’s because the coronavirus pandemic pushed scientists to create a first-of-its-kind vaccine using mRNA – genetic instructions that direct cells to make a piece of the coronavirus particle’s spike protein – to create an immune system response that protects from infection. It’s an approach vaccine researchers have been studying for the past 25 years.

Traditional vaccines are made up of small or inactivated doses of the whole disease-causing organism, or the proteins that it produces, which are introduced into the body to provoke the immune system into mounting a response. mRNA vaccines, in contrast, trick the body into producing some of the viral proteins itself.

mRNA-based vaccines hit the headlines in 2020 after the quick development of two candidates to protect against SARS-CoV-2. This sudden breakthrough was built on the back of more than a decade of research into mRNA vaccines, both for infectious diseases and for oncology. Moderna’s and Pfizer/BioNTech’s jabs were the first mRNA-based vaccines to receive even emergency authorisation from key regulators, and real-world data from the global COVID-19 rollout will play an important role in validating their long-term efficacy and safety profiles.

The potential shift to telemedicine had been a somewhat contentious decision for almost a decade. Some health care organizations implemented telemedicine half-heartedly while others didn’t, and leadership teams argued about whether it should be a priority. When the COVID-19 crisis hit, telemedicine became an imperative that was no longer debated.

In a contactless world of remote everything in China, robots were designed to deliver medicines and meals and to collect bed sheets and rubbish in hospitals. The e-commerce giant JD developed a drone program to drop parcels and spray disinfectant. Smart helmets were made to identify anyone with a fever within a five-meter radius.

Created by Imeve, a San Francisco start-up, the AVATOUR remote presence platform offered AR/VR collaboration tools that allow multiple users to visit a real remote location in real time, providing a new and effective substitute for business travel. Shenzhen-based camera maker Insta360 builds the 360-degree cameras that make AVATOUR possible. Partners can walk through an entire factory to review progress, just as if they were there.

For other businesses, especially those where digital and automation technologies are not commonly used, the crisis led to drastic changes in their interactions with consumers. In education, teachers from elementary schools to universities transformed content and delivered it online or through phones. Retailers started to license Amazon’s Just Walk Out technology, which combines computer vision and AI to bill customers directly as they walk out of the store, with no checkout required. Many cultural industries – museums and galleries, cinemas, concert halls, independent musicians, and artists – found ways to create, perform, and connect with their audiences through online platforms, bringing much-appreciated comfort to people confined to their homes across the world.

While AI was proving invaluable for new drug discovery, it was also being used for drug repurposing, which offers rapid and cost-effective routes to therapeutic development. The pandemic proved to be a good opportunity for introducing advanced AI algorithms combined with network medicine for drug repurposing.

The advantage of drug repurposing is that the drug is already approved: it has already gone through the regulatory process to show that it is safe and effective for something, so if you can find additional uses for it, you already know there is a good safety profile. A classic way to repurpose drugs is through network medicine, which involves constructing medical knowledge graphs containing relationships between different kinds of medical entities (e.g., diseases, drugs, and proteins) and predicting new links between existing approved drugs and diseases (e.g., COVID-19). This link prediction is where AI is proving to be a fantastic tool; a toy sketch follows below.
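
To give a flavour of that link-prediction idea, the sketch below scores candidate drug-disease pairs by the protein neighbours they share in a tiny hand-built graph. The graph, the entity names and the Jaccard scoring are our invention, assuming only the open-source networkx library; real repurposing pipelines use curated biomedical graphs and learned models.

```python
# Toy knowledge-graph link prediction for drug repurposing. The graph,
# the entities, and the associations below are invented for illustration.
import networkx as nx

G = nx.Graph()
# Drugs and diseases are connected through the proteins they act on.
G.add_edges_from([
    ("drug:D1",       "protein:JAK1"),
    ("drug:D1",       "protein:JAK2"),
    ("disease:flu",   "protein:JAK1"),
    ("disease:covid", "protein:JAK1"),
    ("disease:covid", "protein:ACE2"),
    ("drug:D2",       "protein:ACE2"),
])

# Score unobserved drug-disease links by Jaccard similarity of neighbours:
# the more protein neighbours a pair shares, the stronger the candidate.
candidates = [("drug:D1", "disease:covid"), ("drug:D2", "disease:covid")]
for drug, disease, score in nx.jaccard_coefficient(G, candidates):
    print(f"{drug} -> {disease}: {score:.2f}")
```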

Black swan events are characterized by their extreme rarity, severe impact, and the widespread insistence that they were obvious in hindsight. The surge of crisis innovation has only just begun, and this year we will see the commercialization of a large number of technologies, from quantum computing to DNA storage of data. COVID-19 has forced everyone to transform digitally, and this is generating Big Data in unprecedented volumes; technologies to store this data will become critical. This volume of data will make AI even more accurate in its predictions than before. The introduction of 5G in more countries will eventually enable the Internet of Things ecosystem and remote surgery, and open new vistas in driverless cars. A contactless world will witness major innovation in robotic process automation. The use of augmented and virtual reality will become mainstream, and miniaturization will perhaps usher in nanobots that swim in our bloodstreams to deliver medicines precisely where they are required.

The Data (R)Evolution

The global landscape of Data and Analytics is evolving rapidly. Here’s what you need to know.

Of the plethora of revelations that the COVID-19 pandemic has brought to global businesses, one major aspect is this: for organisations using traditional analytics techniques based on large volumes of historical data, several models were found to be rather irrelevant, rendering much of the data useless. Therefore, one of the changes that several (forward-looking) analytics teams worldwide are undergoing involves pivoting from traditional ‘big’ data AI techniques to analytics that require smaller, or ‘small’ and more varied, datasets.

This is, of course, just one of several trends that consulting giant Gartner outlines as forerunners for Data and Analytics teams in the coming years. Overall, it has clubbed the trends under the following (main) headers: (i) accelerated change in analytics through more effective AI and data sources; (ii) operationalising business value using more effective XOps; and (iii) flexible storytelling for a wider audience using insights from data. According to Gartner, such trends can “help organizations and society deal with disruptive change, radical uncertainty and the opportunities they bring.”

I. Accelerating Change

Leveraging innovations in Artificial Intelligence technologies, the science of data analytics is set to become much more composable, agile, and efficient in integrating diverse data sources.

  • Smarter, more ethical AI: Smarter scalable AI will not only improve learning algorithms but also increase efficiency and reduce time-to-value. Small data techniques and adaptive machine learning will thus form the pillars of a more responsible and ethical AI.
  • Composable analytics on the Data fabric architecture: Components from multiple data, analytics and AI solutions will be used in conjunction for a much more flexible and user-friendly experience through the use of the smarter data fabric architecture. According to Gartner: “Data fabric reduces time for integration design by 30%, deployment by 30% and maintenance by 70% because the technology designs draw on the ability to use/reuse and combine different data integration styles. Plus, data fabrics can leverage existing skills and technologies from data hubs, data lakes and data warehouses while also introducing new approaches and tools for the future.”
  • Small and wide data: The increasing number of challenges posed by the complexities of AI and scarce data sources will be tackled using small and wide data. Small data will be used to develop newer models with less data but similarly useful insights, whilst “wide data — leveraging ‘X analytics’ techniques — enables the analysis and synergy of a variety of small and varied (wide), unstructured and structured data sources to enhance contextual awareness and decisions”.
II. Operationalising Business Value

Enabling improvements in decision-making and making the transformation of data into analytics an integral part of business processes is set to be key in operationalising business value for firms, through the use of more effective XOps.

  • XOps: The objective of the XOps landscape (spanning DataOps, MLOps, ModelOps and PlatformOps) is to reduce redundancy and achieve scalability using DevOps best practices. This will ensure reliability and reusability while enabling automation and reducing technology and process duplication in flexibly designed and governed decision-making systems.
  • Decision Intelligence: Decision intelligence engineering (including complex adaptive system applications, AI and conventional analytics) will work in congruence with data fabric to enable organisations to gain insights quickly and drive business processes more accurately and repeatably.
  • Data Analytics as a core business function: The significance of data and analytics in accelerating business initiatives will no longer remain a secondary focus, becoming a primary core function instead. Research has shown that when Chief Data Officers (CDOs) are involved in setting business strategy, business value can rise by a factor of almost 2.6x.
III. Distributed Everything

Digital storytelling through the flexible relating of data and insights in order to reach and empower an even wider audience will be central to the future of businesses.

  • Everyday Graphs: Graphs form the foundation of modern data and analytics, allowing for improved collaboration between business verticals and analytics teams, and businesses have started recognising that. In fact, almost 50% of Gartner inquiries surrounding AI involve a discussion around graph technology.
  • The Augmented Consumer: “Traditionally, business users were restricted to predefined dashboards and manual data exploration. Often, this meant data and analytics dashboards were restricted to data analysts or citizen data scientists exploring predefined questions.

However, Gartner believes that, moving forward, these dashboards will be replaced with automated, conversational, mobile and dynamically generated insights customized to a user’s needs and delivered to their point of consumption. This shifts the insight knowledge from a handful of data experts to anyone in the organization.”

  • The Edge: As more and more data technologies move outside traditional data centre and cloud environments, latency for data-centric solutions will be reduced, enabling greater real-time value. Moving analytics to the edge will allow data teams “to scale capabilities and extend impact into different parts of the business. It can also provide solutions for situations where data can’t be removed from specific geographies for legal or regulatory reasons.”

P for Productivity, P for Process Mining

Why an increasing number of firms worldwide are investing in Process Mining to streamline business models

Regarded informally as the ‘sexiest profession of the coming decade’, data science is set to be pivotal to the way companies are run and business decisions are made in the coming years. In fact, experts even expect digitally transformed enterprises to lead worldwide GDP growth over the next three years. At such a juncture, it is almost indispensable to be well versed in the major facets of this transformation – data analytics and process mining – to ensure competitiveness in a rapidly changing world.

Considered direct complements, process mining and data analytics are two distinct technologies that will be central to future business processes and data-driven decision-making. According to tech publication VentureBeat, “Process mining helps identify inefficiencies or opportunities for improving how companies do things, while analytics help businesses measure performance and identify opportunities. Together, they can deliver the best of both worlds. Better analytics and data prep workflows allow process mining tools to offer a glimpse inside various business processes. And process mining tools help executives understand and improve data science processes used by applications and improve overall reporting.”

Not just a back-burner issue

Historically, some of the most fundamental challenges around business process management have almost always been treated as a ‘back-burner issue’, not given the primacy they deserve. A major problem, in this regard, is that in most cases organisations are much more interested in the improved ‘to be’ process than in exploring the ‘as is’, or current, process. However, understanding the caveats of the current process is crucial to knowing whether the process needs to be reengineered in the first place, where performance problems may exist, and the degree of variation in the process across the organisation. This is why companies often skip current process analysis altogether, or pay consultants heavily to analyse it. This is where process mining comes in.

Process mining refers to a family of techniques in data science and process management that analyse operational processes based on event logs. The primary objective is, of course, to turn data into insights and actions as quickly and efficiently as possible. Process mining is currently optimised for processes chiefly within the realm of ERP or CRM applications, but requires hefty manual work to handle other applications. However, owing to the widespread digital transformation and cloud adoption being carried out around the world today, organisations must find a way to automate this time-consuming and cumbersome manual process to handle large swathes of data with the utmost efficiency. A minimal sketch of event-log-based discovery follows below.
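
Here is what that event-log starting point can look like in practice. This is a minimal sketch assuming the open-source pm4py library and a hypothetical XES log exported from an ERP system; neither is named in the article.

```python
# Minimal event-log-based process discovery sketch, assuming the
# open-source pm4py library. The log file name is hypothetical.
import pm4py

# Each event in the log records at least a case ID, an activity name
# and a timestamp.
log = pm4py.read_xes("purchase_to_pay.xes")

# Discover the directly-follows graph: which activity follows which, and
# how often. This is the 'as is' process mined from actual behaviour.
dfg, start_activities, end_activities = pm4py.discover_dfg(log)

# The most frequent transitions expose the dominant paths and, by
# contrast, the rare detours where bottlenecks and rework tend to hide.
for (src, dst), freq in sorted(dfg.items(), key=lambda kv: -kv[1])[:10]:
    print(f"{src} -> {dst}: {freq}x")
```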

According to VentureBeat, “…analytics tools like Alteryx help organize, prepare, and reformat data in a form suited for process analytics. This makes it easier to identify more dependencies or bottlenecks within a process. For example, improved visibility into a driver monitoring app may uncover a manual step that is causing bottlenecks for processing shipping manifest logs. This step can be automated using something like robotic process automation (RPA) technology.”

Bye-bye Bottlenecks

As many organisations have already found, scaling up analytics processes may itself cause several bottlenecks within enterprises. Process mining tools such as ABBYY Timeline, in this regard, make it much easier to ‘understand, streamline and automate’ analytics, as well as facilitate usage in downstream applications.

Consider this example from VentureBeat: “…an Alteryx customer ran a monthly process to calculate its fixed assets that took 40 hours and required a team of 10 contract workers to manage. Modelling this with process mining and creating a repeatable workflow allowed them to reduce that to 2.5 hours. Process mining automatically documented the process, which was useful for compliance and governance.”

Additionally, if one takes into account the challenges involved in coordinating with large dispersed teams across verticals, such as marketing and sales teams interacting with logistics for invoices, shipping or purchase orders, a unified analytics tool can prove to be of major help. By reformatting the data into the format required by the process mining tool, it becomes much easier to generate better data, visualise a larger variety of processes and therefore improve business performance.

The aforementioned, of course, works best when firms have repeatable workflows, such as analysing volume discounts, running fraud detection or analysing a complex asset mix. Furthermore, process mining comes in especially handy in cases involving a blend of data from multiple databases, applications, spreadsheets or documents maintained by multiple departments, thereby cutting time and improving efficiency.

Confidential Computing – An Emerging Concept

A new Cloud-based technology that isolates confidential data during computation in a secure CPU enclave, could be the future of data security

The SolarWinds hacking incident last year, which hit 18,000 customers worldwide, including US government organizations, sent a chilling message to cybersecurity experts about a new kind of threat, the ingenuity of which fooled them all. The hackers did a simple thing: they hid malicious code within good code that was trusted by customers. Users of Orion, one of SolarWinds’ software products, were blissfully unaware that they were using malware-infected software for nearly nine months before the breach was detected. According to some estimates, American businesses and government agencies could spend upward of $100 billion over many months to contain and fix the damage from the Russian hack.

The Backstory

SolarWinds is an Austin, Texas-based information technology firm. One of SolarWinds’ products is a software system called Orion, which is widely used by companies to manage their IT resources. According to SEC documents, SolarWinds has some 33,000 customers who use Orion. Hackers breached SolarWinds’ systems and inserted malicious code into the software build process. The breach of the CI/CD pipeline went undetected for many months and, as a result, numerous product updates carrying the inserted vulnerabilities were unwittingly shipped by SolarWinds to customers. The malicious code introduced a backdoor, allowing hackers to gain access to the software running on SolarWinds’ customers’ infrastructures. By injecting the code into the build pipeline, the hackers found a way to legitimize it: the pipeline was producing digitally signed and trusted software for over 18,000 customers worldwide. This made the issue for clients complex. Builds are digitally signed with the SolarWinds certificate trusted by the certificate authorities in various operating systems and browsers, so if clients were to revoke the digital certificate, they would be revoking both the good and the bad code.

A New Concept

The cleverness of the hack has made cybersecurity experts come up with a new concept to secure the most valuable data assets of an organization: Confidential Computing. Confidential Computing is a cloud computing technology that isolates sensitive data in a protected CPU enclave during processing. The contents of the enclave – the data being processed, and the techniques used to process it – are accessible only to authorized programming code, and invisible and unknowable to anything or anyone else, including the cloud provider.

To simplify the concept, one needs to understand the three states in which data resides: (a) at rest on a storage device; (b) in transit between two locations across a network; and (c) in use, as it is being processed by applications. It is in the third state that data is most vulnerable, and Confidential Computing is about protecting this state; the sketch below illustrates the gap.
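
A small Python sketch makes the gap concrete. It uses the ‘cryptography’ package (our choice, not mentioned in the article) to show that data protected at rest and in transit must still become plaintext in ordinary memory to be used.

```python
# The three states of data, sketched with symmetric encryption.
from cryptography.fernet import Fernet

key = Fernet.generate_key()
f = Fernet(key)

# At rest / in transit: ciphertext, unreadable if the disk or the
# network traffic is stolen.
token = f.encrypt(b"patient_id=123; diagnosis=...")

# In use: processing requires plaintext in ordinary memory, where a
# memory dump or a compromised root user can read it. Confidential
# Computing moves this step into a hardware-isolated enclave (a TEE).
plaintext = f.decrypt(token)
print(plaintext.decode())
```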

What is Confidential Computing?

Confidential Computing, then, is a Cloud-based system that isolates confidential data during computation in a secure CPU enclave. As described above, the material of the enclave – the data being processed and the methods used to process it – is available only to permitted code, and is inaccessible and unknown to everyone else, even the Cloud provider.

Before it can be processed by an application, data must be unencrypted in memory. This leaves the data vulnerable just before, during and just after processing to memory dumps, root user compromises and other malicious exploits. Confidential Computing solves this problem by leveraging a hardware-based trusted execution environment, or TEE, which is a secure enclave within a CPU. The TEE is secured using embedded encryption keys, and embedded attestation mechanisms that ensure the keys are accessible to authorized application code only. If malware or other unauthorized code attempts to access the keys – or if the authorized code is hacked or altered in any way – the TEE denies access to the keys and cancels the computation.
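
To show the shape of that attestation logic, here is a purely conceptual sketch; all names are invented, and real TEEs (such as Intel SGX or AMD SEV) enforce this in hardware and firmware, not in application code.

```python
# Conceptual sketch of the TEE key-release gate described above.
import hashlib

# The measurement (hash) of the code authorized at provisioning time.
AUTHORIZED_CODE = b"def process(data): return summarize(data)"
AUTHORIZED_HASH = hashlib.sha256(AUTHORIZED_CODE).hexdigest()

def release_key(requesting_code: bytes, sealed_key: bytes):
    """Release the key only if the requesting code's hash matches the
    authorized measurement; deny anything altered or unauthorized."""
    if hashlib.sha256(requesting_code).hexdigest() != AUTHORIZED_HASH:
        return None  # tampered code: access denied, computation cancelled
    return sealed_key

# The unaltered code gets the key; a tampered copy does not.
print(release_key(AUTHORIZED_CODE, b"secret") is not None)         # True
print(release_key(AUTHORIZED_CODE + b"#evil", b"secret") is None)  # True
```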

The Benefits

So, what are the derived benefits? A lot, but let us just dwell on the most crucial ones:

  • Protects sensitive data, even while in use
  • Extends Cloud computing benefits to sensitive workloads
  • Protects intellectual property, including proprietary business logic, analytics, algorithms, or entire applications
  • Enables secure collaboration with partners on Cloud platforms
  • Protects data processed over distributed Edge computing frameworks

The Consortium

In 2019, a group of CPU manufacturers, Cloud providers and software companies came together to form the Confidential Computing Consortium (CCC). The two prime goals of the CCC are to define industry-wide standards for Confidential Computing and to promote the development of open source Confidential Computing tools. The Consortium members currently include Alibaba, AMD, Baidu, Fortanix, Google, IBM/Red Hat, Intel, Microsoft, Oracle, Swisscom, Tencent and VMware – all big names for whom data security means a lot! The world will surely hear more on this.

Cassie Learns to Walk

Reinforcement learning techniques successfully enable robotic legs to learn walking the human way through trial and error

A team of scientists at the University of California, Berkeley has developed a pair of robotic legs that has been taught to walk using reinforcement learning, the same technique used to train AI systems to perform complex behaviours through repeated trial and error.

The two-legged robot is called Cassie and, as of now, it comprises …well… just two legs and nothing else! However, that pair of legs can now adroitly perform a wide range of locomotive movements from scratch, including walking in a crouched posture and while carrying an unexpected load.

Various movements of Cassie in the real world in different scenarios
Image courtesy: https://arxiv.org/pdf/2103.14295.pdf

The seven-member development team has released a paper titled “Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots” describing the complete innovation. Teaching Cassie to walk on its human-size pair of legs all by itself is a huge achievement in robotics. The robot promises to handle diverse kinds of surface terrain and to recover whenever it stumbles or misaligns itself. However, Zhongyu Li, the first-named author of the paper, told the press that “…we still have a long way to go to have humanoid robots reliably operate and live in human environments.”

Reinforcement learning has been used to train many bots to walk inside simulations, but transferring that ability to the real world is hard. Speaking to MIT Technology Review, Chelsea Finn, an AI and robotics researcher at Stanford University who was not involved in the work, said: “Many of the videos that you see of virtual agents are not at all realistic.” The challenges faced by the Berkeley team were manifold: minor differences between the simulated physical laws inside a virtual environment and the real physical environment outside can put a self-learning robot completely off-track when it tries to apply what it has learned.

Even a tiny difference in factors, such as how friction works between the robot’s feet and the walking surface, can cause a heavy two-legged robot such as Cassie to lose balance and fall. As the paper candidly admits in its abstract: “Developing robust walking controllers for bipedal robots is a challenging endeavor. Traditional model-based locomotion controllers require simplifying assumptions and careful modelling; any small errors can result in unstable control. To address these challenges for bipedal locomotion, we present a model-free reinforcement learning framework for training robust locomotion policies in simulation, which can then be transferred to a real bipedal Cassie robot.”
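
To give a flavour of what “model-free reinforcement learning” means, here is a generic REINFORCE-style sketch on a toy balance task. It is our own illustration of trial-and-error learning from rewards alone, not the Berkeley team’s framework, which trains a neural-network locomotion policy inside a physics simulator.

```python
# Generic REINFORCE sketch: improve a policy purely from rewards, with
# no model of the physics. The toy task and numbers are invented.
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)  # preference logits: action 0 = balance, 1 = flail

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for iteration in range(200):
    # Roll out one episode by trial and error.
    probs = softmax(theta)
    actions = rng.choice(2, size=20, p=probs)
    rewards = (actions == 0).astype(float)  # only balancing is rewarded
    episode_return = rewards.sum()

    # Policy-gradient update: reinforce the actions that were taken,
    # in proportion to the return the whole episode achieved.
    for a in actions:
        grad = -probs.copy()
        grad[a] += 1.0                       # grad of log pi(a | theta)
        theta += 0.005 * grad * episode_return

print(softmax(theta))  # probability mass shifts to the balancing action
```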

All said and done, training a large robot through trial and error in real-world situations can be a risky affair. To skirt these risks, the Berkeley team used two levels of virtual environment. In the first, a simulated version of Cassie learned to walk by drawing on a large existing database of robot movements. This simulation was then transferred to a second virtual environment called SimMechanics that mirrors real-world physics with a high degree of accuracy, but at a cost in running speed. Only when Cassie seemed to walk well virtually was the learned walking model loaded into the actual robot.

The results were stunning. As reported in MIT Technology Review, the real Cassie was able to walk using the model learned in simulation without any extra fine-tuning. It could navigate rough and slippery terrain, carry unexpected loads, and recover after being pushed without notice. Even when Cassie damaged two motors in its right leg during testing, it was able to adjust its movements to compensate. Edward Johns, who heads the Robot Learning Lab at Imperial College London, was frank in admitting that “[t]his is one of the most successful examples I have seen”.

The development team are eager to add more movements to Cassie. “To our knowledge, this paper is the first to develop a diverse and robust bipedal locomotion policy that can walk, turn and squat using parameterized reinforcement learning,” they wrote in their conclusion. “An exciting future direction is to explore how more dynamic and agile behaviors can be learned for Cassie, building on the approach presented in this work.”

Anyone interested can access the full paper at: https://arxiv.org/pdf/2103.14295.pdf

Data Science Vs Data Analytics: Know the difference

With the amount of data that consumers generate on a daily basis, companies are finding it difficult to deal with such huge volumes. They are actively looking for individuals who can find practical and scalable ways to handle it, which opens up lucrative opportunities. This is why data science and data analytics are so important in today’s digital world: they help companies identify actionable insights and make data-driven business decisions. However, many students get confused between the terms data science and data analytics. Though there are similarities, the two roles have some key differences.

So, what distinguishes a data scientist from a data analyst, and what are the similarities and differences between data science and data analytics?

Data Science Vs Data Analytics

What is it?

Data Science

Data science is a blend of various tools, algorithms, and machine learning principles used to discover hidden patterns in raw data. It is primarily used for better decision-making, predictive analytics, and pattern discovery.

Data Analytics

Data analytics is the process of analyzing raw data to find useful trends and insights that solve a specific problem. The main goal of data analytics is to optimize efficiency, improve performance and enable business growth in this increasingly competitive world. 

The key difference is that while data science focuses on finding meaningful information in a large dataset, data analytics uncovers the specifics of extracted insights to answer particular business questions. The contrast shows up clearly in code, as the sketch below illustrates.
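
A small, invented example makes the distinction tangible; the dataset and the tools (pandas and scikit-learn) are our own choices for illustration.

```python
# Toy sales data, invented for illustration.
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "ad_spend": [10, 20, 30, 40, 50],
    "region":   ["N", "S", "N", "S", "N"],
    "sales":    [12, 25, 33, 41, 55],
})

# Data analytics: answer a specific question about what already happened.
print(df.groupby("region")["sales"].mean())     # average sales by region

# Data science: build a model that predicts what will happen next.
model = LinearRegression().fit(df[["ad_spend"]], df["sales"])
print(model.predict(pd.DataFrame({"ad_spend": [60]})))  # forecast
```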

Job Description

Data Scientist

The job description is where data science and data analytics differ most. Technically, data scientists are problem solvers who try different approaches to solve a problem. A data scientist works on complex and specific problems to bring non-linear growth to the company. Some daily tasks of a data scientist include:

  • Looking for patterns or trends
  • Pulling, merging, and analyzing data
  • Developing predictive models
  • Building data visualizations

Data analyst

In simple terms, a data analyst makes sense of existing data. A data analyst gathers data, organizes it, and uses it to identify useful insights for the company. They develop new processes and systems for collecting data and reaching insightful conclusions to improve business. Some daily tasks of a data analyst include:

  • Examining patterns or trends
  • Delivering reports
  • Consolidating data
  • Collaborating with stakeholders

Role requirements for beginners

Data Science

  • Proven ability in math, statistics, or computer science.
  • In-depth quantitative knowledge and problem-solving skills.
  • Ability to code and build models.
  • Ability to process, filter, and present large quantities of data.

Data Analyst

  • Ability to use data analytics software, data visualization software, and data management programs.
  • Strong analytical skills and attention to detail.
  • Ability to deal with competing objectives in a fast-paced environment.
  • Effective communication skills, creativity, and intellectual curiosity.

Read more on the Requirements of a data scientist job

Required Technical Skills

Data science

  • Python or R
  • SQL
  • Jupyter Notebook
  • Algorithms/modelling

Data Analytics

  • SQL
  • Excel
  • Tableau

Read more on Top Trends in Data Science Jobs

Salary in respective fields

Data Science

According to Payscale (https://www.payscale.com/research/IN/Job=Data_Scientist/Salary), the average salary of a data scientist in India is around Rs 8 lakh per annum and can reach up to Rs 20 lakh per annum.

Data Analytics

According to Payscale (https://www.payscale.com/research/IN/Job=Data_Analyst/Salary), the average salary of a data analyst in India is around Rs 4.5 lakh per annum and can reach up to Rs 10 lakh per annum.

Read our Guide on How to Learn Data Science

Both data science and data analytics are competitive and rapidly-growing fields that are in demand right now. Once you have a firm understanding of the differences between data analytics and data science, you can start evaluating which path is the right fit for you. As a premier business school in India, Praxis offers a 9-month full-time postgraduate program in Data Science. With our vast experience in business education, we offer students both the time to understand the complex theory and practice of data science concepts and the guidance from knowledgeable faculty who are available on campus for mentoring. We also have a well-structured campus placement program that ensures interview opportunities with the most significant companies in the field.

The Skill of the Century?

Large swathes of data are pretty useless if you can’t draw relevant insights from them – which is why data literacy is so crucial

For an economy accelerating at breakneck speed towards complete digitisation, there is still much room for internal growth within the moving parts driving this nascent digital revolution: the workforce. Research has shown there is much scope for organisations to improve data literacy and data skills internally in order to remain competitive in the world market. This is especially relevant at a time when most industries are engaging head-on in the data race, trying to extract the best possible insights from data to aid their business decisions.

Data Literacy

While it is true that almost every firm in the world today collects data in some form to improve business performance, not all businesses are particularly adept at using this data and drawing relevant insights from it to actually make a difference. Additionally, many firms even believe that just by accruing data and the relevant technologies, their workforce will be able to employ them immediately. This, however, is hardly the case. This is why data literacy – the ability to leverage those data sources and put them to beneficial use – is so key. In fact, technologist Bernard Marr even suggests that “data literacy is as important to this century as reading/writing literacy was in the past century.”

To combat this data literacy issue, a high percentage of firms today employ external data experts to aid the process. Yet this is rather expensive and creates several bottlenecks when the analytics needs to funnel down to a workforce not equipped with appropriate data skills. It may therefore prove prudent to broaden the scope of data expertise throughout the organisation. According to Marr, firms must “create the culture, build trust and provide the tools to give everyone in the organization the ability to use data by themselves to inform decision-making.”

One may wonder why, if the solution is indeed this simple, it has not been applied sooner. Marr, in this respect, points out several other considerations that may be creating barriers to data literacy:

  • Company Culture: “Does your company culture support data literacy? If you have a command-and-control environment, you might as well save your investment in data literacy because it will fail before you even start. Cultures that embrace data literacy start from the top. The culture in your organization needs to allow people to use data, come to conclusions about the data and then make decisions with the data without needing to wait for approval from top leadership before they can act. Leaders need to delegate authority to employees so they can actually use the data and make informed decisions. Without this data-driven decision-making culture in place, it doesn’t matter how data literate your employees are, they won’t be able to actually make decisions and act from that literacy if you don’t allow it.” (Bernard Marr)
  • Data exploration technologies: While data is one of the most valuable assets for any organisation, it is crucial (i) to determine whether or not firms are collecting the right kind of data that can aid their strategic objectives; (ii) to ensure the data is trustworthy and that all employees using it trust its efficacy; and (iii) to ensure the presence of appropriate technologies to store and process the data effectively. The use of data exploration tools that allow users to appropriately map, visualise and dissect the data is imperative in this regard.
  • Data Skills: The primary path to data literacy is, of course, acquiring the right skills. Apart from using expensive external consultants, firms must invest in their own personnel to develop the right skills. Often, hybrid teams combining external data science teams and internal resources can help build better data skills and literacy within the organisation.

The creation of new positions – data translators, for example (as at energy giant Shell) – can also prove beneficial in this regard. Data translators sit between business functions and data science teams to help bridge skill gaps and facilitate conversations.