The Art of Defence

Efficiency and network security must go hand in hand ― here’s how Shallow-Deep Networks and DeepSloth attacks are helping redefine Deep Learning

The rapid proliferation of technology goes hand in hand with attempts to breach and dismantle its defences. So, while developing novel deep learning algorithms to increase efficiency and optimise performance is important, it is just as important (if not more so) to ensure that these models stand up robustly to external adversaries.

It is in this spirit that deep learning researchers from the University of Maryland have developed ‘DeepSloth’ – an attack designed to target ‘adaptive deep neural networks’ and slow them down, sometimes almost to a standstill. Understanding the nuances of DeepSloth and preparing defences against it will be an integral step in laying a solid foundation for network security – something that will prove invaluable to future industrial applications of deep learning technology.

Back in the Shallows now

Currently, the primary hurdle in the way of executing deep learning models is their huge requirement for (i) computational power and (ii) memory. As a result, full deep learning algorithms can typically only be executed on servers with “abundant resources”. Tech-news outlet VentureBeat thus opines: “this makes them unusable for applications that require all computations and data to remain on edge devices or need real-time inference and can’t afford the delay caused by sending their data to a cloud server.”

Image: Neural Networks: Shallow v Deep; Source: Mulbah Kallen, medium.com/analytics-vidhya

To combat this, researchers in machine learning have developed several strategies to reduce these hefty compute and memory requirements. One of these strategies entails the use of a ‘multi-exit architecture’, in which computation stops as soon as the network reaches an acceptable level of confidence in its result. VentureBeat writes: “Experiments show that for many inputs, you don’t need to go through every layer of the neural network to reach a conclusive decision. Multi-exit neural networks save computation resources and bypass the calculations of the remaining layers when they become confident about their results.” And thus was born a multi-exit architecture that has since become an object of widespread curiosity: the ‘Shallow-Deep Network’.
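The early-exit idea described above can be illustrated with a minimal Python sketch. It is not the researchers' implementation: the function names, the softmax-confidence rule and the 0.9 threshold are all assumptions made purely for illustration. Each "exit" is represented simply by the raw scores (logits) it would produce, ordered from shallowest to deepest.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_predict(exit_logits, threshold=0.9):
    """Return (predicted_class, exit_index) for one input.

    exit_logits: iterable of per-exit logits, shallowest exit first.
    threshold:   hypothetical confidence level at which we stop early.
    A generator can be passed so deeper exits are only computed on demand.
    """
    result = None
    for i, logits in enumerate(exit_logits):
        probs = softmax(logits)
        confidence = max(probs)
        result = (probs.index(confidence), i)
        if confidence >= threshold:
            return result  # confident enough: skip the remaining layers
    if result is None:
        raise ValueError("exit_logits must contain at least one exit")
    return result  # no exit was confident: fall back to the final exit

# An "easy" input leaves at the first exit; a "hard" one runs to the end.
easy = [[5.0, 0.0, 0.0], [6.0, 0.0, 0.0], [7.0, 0.0, 0.0]]
hard = [[0.1, 0.0, 0.0], [0.2, 0.1, 0.0], [0.0, 3.0, 0.0]]
print(early_exit_predict(easy))
print(early_exit_predict(hard))
```

The savings come entirely from the inputs that leave at a shallow exit, which is why the confidence rule matters so much in what follows.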

The roots of the Shallow-Deep Network architecture can be traced back to 2019, when doctoral researcher Yigitcan Kaya at the University of Maryland first developed a multi-exit architecture that could potentially reduce the average inference costs of deep neural networks by almost 50%. He primarily addressed the issue of ‘overthinking’, where “deep neural networks start to perform unneeded computations that result in wasteful energy consumption and degrade the model’s performance”. The architecture, however, had not yet been shown to be robust against adversarial attacks.

No Pandas, just Sloths

The researchers at the University of Maryland then sought to address this directly, and thus was born the DeepSloth attack. An excerpt from VentureBeat:

“Like adversarial attacks, DeepSloth relies on carefully crafted input that manipulates the behavior of machine learning systems. However, while classic adversarial examples force the target model to make wrong predictions, DeepSloth disrupts computations. The DeepSloth attack slows down shallow-deep networks by preventing them from making early exits and forcing them to carry out the full computations of all layers.”
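One way to picture “preventing early exits” is as an objective the attacker minimises. The article does not spell out the loss, so the sketch below rests on an assumption: that the attacker tries to push every exit's prediction towards a uniform distribution, so that no exit ever looks confident. The function names are hypothetical, and the sketch only evaluates the objective; crafting the actual perturbation would need gradients from a real model.

```python
import math

def log_softmax(logits):
    """Log-probabilities for a list of raw scores (numerically stable)."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def uniform_target_loss(exit_logits):
    """Sum over exits of the cross-entropy between a uniform target and
    the exit's predicted distribution (assumed attacker objective).

    The value is smallest, log(K) per exit for K classes, when every exit
    is maximally unsure, i.e. when no exit would trigger an early stop.
    """
    loss = 0.0
    for logits in exit_logits:
        log_probs = log_softmax(logits)
        loss += -sum(log_probs) / len(log_probs)
    return loss

confident_exits = [[5.0, 0.0, 0.0], [7.0, 0.0, 0.0]]
confused_exits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(uniform_target_loss(confident_exits))
print(uniform_target_loss(confused_exits))
```

Under this assumption, a lower value means a "slower" input: the confused exits score lower than the confident ones, and an input that keeps every exit confused is forced through all the layers.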

Recent research from the University of Maryland has found that slowdown attacks of the DeepSloth kind can wipe out the gains from the reduced inference times that multi-exit architectures offer, often by almost 90-100%. Essentially, this would cause a deep learning model to waste considerable memory and computing resources, drastically reducing its efficiency.
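To see what losing “90-100%” of the gains means in practice, here is a small back-of-the-envelope calculation in Python. Every number in it is made up for illustration and none comes from the paper: a hypothetical four-exit network whose exits cost 25%, 50%, 75% and 100% of a full forward pass, and two invented lists of the exits that a batch of inputs takes before and after an attack.

```python
def average_cost(exit_indices, cost_fractions):
    """Mean fraction of a full forward pass spent per input."""
    if not exit_indices:
        raise ValueError("need at least one input")
    return sum(cost_fractions[i] for i in exit_indices) / len(exit_indices)

def savings_lost(clean_exits, attacked_exits, cost_fractions):
    """Fraction of the early-exit savings that the attack removes."""
    clean_savings = 1.0 - average_cost(clean_exits, cost_fractions)
    attacked_savings = 1.0 - average_cost(attacked_exits, cost_fractions)
    if clean_savings <= 0:
        raise ValueError("the clean model saves nothing to begin with")
    return 1.0 - attacked_savings / clean_savings

# Hypothetical cumulative cost of reaching each of four exits.
cost_fractions = [0.25, 0.5, 0.75, 1.0]
clean_exits = [0, 0, 1, 3]      # most clean inputs leave early
attacked_exits = [3, 3, 3, 2]   # attacked inputs are pushed to deep exits

print(average_cost(clean_exits, cost_fractions))     # 0.5
print(average_cost(attacked_exits, cost_fractions))  # 0.9375
print(savings_lost(clean_exits, attacked_exits, cost_fractions))  # 0.875
```

In this toy setting the clean model needs half the compute of a full pass on average, while the attacked one needs almost all of it, so 87.5% of the savings disappear.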

In more serious cases, such as when a deep learning model is split between the edge (running on IoT devices) and the cloud (where more complex modelling is done on hosted servers), severe slowdown attacks may amplify latency by up to five times, rendering the model almost completely useless. “This could cause the edge device to miss critical deadlines, for instance in an elderly monitoring program that uses AI to quickly detect accidents and call for help if necessary”, according to Assistant Professor Tudor Dumitras.

Furthermore, the researchers built several other scenarios into their DeepSloth attack model, such as assuming that the attacker has full or partial knowledge of the victim model, which would allow them to tailor the attack to it. They also assumed that the attacker may use a surrogate model to craft the attack, and found that even this can cause 20-50% slowdowns in victim models.

Such slowdown attacks could become the norm in the years to come, and DeepSloth will perhaps become one of the benchmarks by which the future security of deep learning models is analysed. It is a major step towards total network security.

Reference: Find the original paper released by the researchers at the University of Maryland at https://arxiv.org/pdf/2010.02432.pdf