Thesis Data Science: Robustness analysis of a feedback-based online data selection for live use in the vehicle

  • Subject: Data Science/Big Data
  • Type: Master thesis
  • Date: from 02/2026
  • Tutor: M.Sc. Philipp Reis


Context

Modern vehicles generate very large amounts of data (up to 2.5 GB/s) while driving. Storing, transmitting and processing all of this data is costly and often does not scale. At the same time, the data is essential for data-driven methods such as machine learning. Because vehicle data typically follows a long-tail distribution, most data points are redundant, while the informative situations (corner cases, anomalies) occur only rarely. Feedback-driven data collection addresses this problem: a model built from the data collected so far rates the novelty of each new data point, so that diverse and informative data is stored and redundancy is avoided. The approach currently in use assumes that the input data stream is Gaussian distributed. Real driving data often violates this assumption, for example through temporal redundancy, non-stationary distributions or multi-modality. This thesis therefore systematically investigates and improves the robustness of the feedback mechanism.
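To make the idea concrete, the following is a minimal sketch of what a feedback-based selection under a Gaussian assumption could look like. It is not the framework used in the thesis: the class name GaussianNoveltySelector, the z-score criterion, the threshold of 3.0 and the warm-up length are all assumptions chosen for illustration, and the data is reduced to a single scalar feature for brevity.

```python
import math


class GaussianNoveltySelector:
    """Hypothetical selector: stores a scalar sample only if it looks novel
    under a Gaussian fitted to the samples stored so far."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # z-score above which a sample counts as novel (assumed value)
        self.warmup = warmup        # number of initial samples that are always stored
        self.n = 0                  # number of stored samples
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations from the mean

    def score(self, x):
        """Absolute z-score of x under the current Gaussian estimate."""
        var = self.m2 / (self.n - 1) if self.n > 1 else 0.0
        if var == 0.0:
            return 0.0 if x == self.mean else math.inf
        return abs(x - self.mean) / math.sqrt(var)

    def _update(self, x):
        # Welford's online update of the mean and the sum of squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def offer(self, x):
        """Return True and update the model if x is stored, else return False."""
        if self.n < self.warmup or self.score(x) > self.threshold:
            self._update(x)
            return True
        return False


# Illustrative stream: values near 1.0 with a single outlier at 9.0.
selector = GaussianNoveltySelector(threshold=3.0, warmup=5)
stream = [0.9, 1.1, 1.0, 0.8, 1.2, 1.0, 1.05, 9.0, 1.0]
stored = [x for x in stream if selector.offer(x)]
print(stored)  # [0.9, 1.1, 1.0, 0.8, 1.2, 9.0]
```

In this sketch the first five samples are stored unconditionally, the redundant values 1.0 and 1.05 are rejected, and the outlier 9.0 is stored. Storing the outlier also inflates the estimated variance considerably, which hints at how sensitive such a Gaussian model can be when the stream does not match its assumption.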

Tasks
  • Familiarization with feedback data collection for vehicle data

  • Research into robustness approaches and evaluation of their transferability to feedback data collection

  • Integration of a method into a feedback data collection framework

  • Comparison of methods for feedback data collection

Prerequisites
  • You work independently and in a structured manner, are motivated and committed.

  • Python knowledge

  • You have a very good command of written and spoken German and English

  • Knowledge of machine learning / statistics, ideally in streaming algorithms and anomaly detection / distribution estimation