Bachelor's thesis submission talk (Informatics), advised by Felix Dietrich.
Christian Kellinger: Data Preprocessing for Sign Language Detection with Machine Learning Models
SCCS Colloquium
Real-time sign language video-to-text translation is an important tool for a barrier-free and inclusive society. Applications built on supervised machine learning seem well suited to this task because such methods have proven powerful on classification problems. Effective tools for real-time hand shape and hand motion tracking already exist, and some, like MediaPipe Hands, even run on smartphones. They deliver 3D landmarks for multiple hands and their fingers, which in turn can be used for sign language detection.
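To illustrate how such 3D landmarks might be prepared before they reach a classifier, here is a minimal sketch of one common preprocessing step: making a hand's landmarks independent of its position and size in the frame. It is not taken from the thesis. It assumes the 21 landmarks per hand that MediaPipe Hands reports, with the wrist at index 0 and the base joint of the middle finger at index 9. The function name `normalize_hand` and the choice of palm length as the scale reference are hypothetical.

```python
import numpy as np

NUM_LANDMARKS = 21  # MediaPipe Hands reports 21 landmarks per hand
WRIST = 0           # index of the wrist landmark
MIDDLE_MCP = 9      # index of the middle finger's base joint


def normalize_hand(landmarks):
    """Translate landmarks so the wrist is the origin, then scale by palm length.

    `landmarks` is an array-like of shape (21, 3) holding x, y, z per landmark.
    This helper is an illustrative assumption, not the thesis's method.
    """
    pts = np.asarray(landmarks, dtype=float)
    if pts.shape != (NUM_LANDMARKS, 3):
        raise ValueError(f"expected shape (21, 3), got {pts.shape}")

    # Remove the dependence on where the hand appears in the image.
    centered = pts - pts[WRIST]

    # Remove the dependence on hand size and camera distance.
    palm_length = np.linalg.norm(centered[MIDDLE_MCP])
    if palm_length == 0.0:
        return centered  # degenerate detection: leave it unscaled
    return centered / palm_length
```

With this normalization, the same handshape recorded close to the camera or far away, and anywhere in the frame, maps to the same feature vector. Rotation is deliberately left untouched here, since hand orientation can carry meaning in sign language.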
In a machine learning approach, the quality of the learning dataset is critical to the quality of the application itself. This is reflected in the implementation effort: around 80% of the working time is spent on the learning dataset.
This thesis deals with data preprocessing for sign language detection with machine learning models. It surveys existing solutions for sign language detection and tries to identify the preprocessing they use. It investigates the structure and character of sign language to determine which parts of the human body need to be examined to recognize it. I then focus on MediaPipe, a programming framework for human pose estimation, and try to estimate the impact of different preprocessing approaches on the accuracy of the pose estimation. Finally, I generalize my findings to real-time sign language detection.