Monitoring strategies and metrics for AI accelerator activations using activation patterns


Context

The classical approach to the highest safety requirements (ASIL-D in automotive) aims to find all systematic system and software faults during the development phase. For certain classes of faults, careful work (processes, tools, etc.) can largely achieve this goal. For systems that include AI accelerators, however, sporadic effects always remain that cause malfunctions or short-term performance degradation. These can arise, for example, from colliding accesses to shared resources such as memory or I/O, from an unrecognized need for synchronization between processes, or from attacks and input distortions. Since explaining the black-box behavior of an AI accelerator is a young field of research, such effects cannot yet be explained with conventional, established methods. One approach to explaining the accelerator's classification results is to examine the activation patterns inside the AI accelerator, including the contribution of individual neuron activations to explainability. The aim of this work is to take a step toward explainability by investigating different strategies and patterns of activation monitoring.
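To illustrate what "monitoring activation patterns" can mean in practice, the following is a minimal sketch using PyTorch forward hooks to capture a layer's activations and compute a trivial pattern statistic (the fraction of active neurons). The toy model and layer sizes are illustrative assumptions, not part of the thesis specification.

```python
# Sketch: capturing per-layer activations with PyTorch forward hooks.
# The model below is a hypothetical stand-in for a network running on an accelerator.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))

activations = {}

def make_hook(name):
    def hook(module, inputs, output):
        # Store a detached copy of the layer's output for later analysis.
        activations[name] = output.detach()
    return hook

# Attach a hook to every ReLU layer so its output is recorded on each forward pass.
for name, module in model.named_modules():
    if isinstance(module, nn.ReLU):
        module.register_forward_hook(make_hook(name))

x = torch.randn(1, 8)
_ = model(x)

# A simple activation-pattern statistic: fraction of neurons active after ReLU.
for name, act in activations.items():
    active_fraction = (act > 0).float().mean().item()
    print(f"layer {name}: active fraction = {active_fraction:.2f}")
```

In a real setup, the recorded activations would come from the accelerator (e.g., via instrumentation of the hardware design) rather than a software hook, but the statistics computed on them would be analogous.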

Tasks

  • Literature research into AI explainability and accelerators
  • Evaluation and implementation of strategies to monitor activations and their impact on result validity
  • Analysis of activation patterns inside AI accelerators
  • Proposal of a metric for impact of neuron activation on result validity
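As a starting point for the metric task above, one common family of importance measures is ablation based: zero out a single neuron and measure how much the model's output distribution shifts. The sketch below, with an illustrative toy model, is one possible baseline and not a prescribed method.

```python
# Sketch: ablation-based metric for the impact of a hidden neuron on the output.
# Model and sizes are illustrative; the metric is the mean L1 shift in softmax outputs.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
model.eval()

x = torch.randn(32, 8)  # a batch of hypothetical inputs

with torch.no_grad():
    baseline = model(x).softmax(dim=1)

def neuron_impact(idx):
    """Zero out hidden neuron `idx` and return the mean L1 output shift."""
    def ablate(module, inputs, output):
        output = output.clone()
        output[:, idx] = 0.0
        return output  # a forward hook may return a modified output

    handle = model[1].register_forward_hook(ablate)
    with torch.no_grad():
        ablated = model(x).softmax(dim=1)
    handle.remove()
    return (baseline - ablated).abs().sum(dim=1).mean().item()

impacts = [neuron_impact(i) for i in range(16)]
```

Neurons with a large impact score would be natural candidates for dedicated runtime monitors, since perturbing them changes the result the most.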

The following tasks are required for a ‘very good’ grade

  • Proposal of a strategy for intrusion detection based on activation patterns
  • Automatic generation of monitors based on the introduced metric

Prerequisites

  • Interest in embedded systems, AI research and new design methods
  • Very good knowledge of Python (preferably including experience with deep-learning libraries such as PyTorch or TensorFlow)
  • Preferably knowledge of hardware description or HLS languages such as VHDL or Chisel
  • Ability to work independently

Before the actual work begins, an exposé must be written and approved by the supervisor.

(Pictures generated with DALL·E 2)