
Seminar: Automatic building and cataloging of audio databases from the web


Web scraping can automatically collect data and build the large datasets that are often needed to develop machine learning algorithms. The project aims to implement a convenient algorithm for automatically extracting and tagging audio data using web scraping, crowd-sourcing, and publicly available repositories.


Apply web-scraping-based algorithms to automatically extract and catalog audio data from publicly available repositories.

Learning objectives

  • Analyze audio data from open source repositories
  • Apply web-scraping methods to collect datasets from publicly available repositories
  • Use crowd-sourcing to catalog the audio data
  • Apply machine learning algorithms to automate the process
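
As a rough illustration of the web-scraping objective, the sketch below extracts links to audio files from an HTML page using only the Python standard library. It is not the seminar's prescribed method: the class name `AudioLinkParser`, the helper `extract_audio_links`, the extension list, and the `example.org` page are all hypothetical, and the HTML is an inline sample rather than a real repository.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed set of audio file extensions; adjust to the repository at hand.
AUDIO_EXTENSIONS = (".wav", ".mp3", ".flac", ".ogg")


class AudioLinkParser(HTMLParser):
    """Collect absolute URLs of audio files referenced in an HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.audio_urls = []

    def handle_starttag(self, tag, attrs):
        # Audio files typically appear in href (links) or src (audio/source tags).
        for name, value in attrs:
            if name in ("href", "src") and value:
                if value.lower().endswith(AUDIO_EXTENSIONS):
                    # Resolve relative paths against the page URL.
                    self.audio_urls.append(urljoin(self.base_url, value))


def extract_audio_links(html, base_url):
    """Return all audio URLs found in the given HTML string."""
    parser = AudioLinkParser(base_url)
    parser.feed(html)
    return parser.audio_urls


# Hypothetical sample page standing in for a downloaded repository listing.
SAMPLE_PAGE = """
<html><body>
  <a href="/sounds/dog_bark.wav">Dog bark</a>
  <a href="about.html">About</a>
  <audio><source src="clips/rain.MP3"></audio>
</body></html>
"""

links = extract_audio_links(SAMPLE_PAGE, "https://example.org/library/")
for url in links:
    print(url)
```

In practice the HTML would be downloaded from a repository whose terms of use permit scraping, and the collected URLs would then be passed on to the tagging step.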


Project type: Seminar (optional: Master thesis)
ECTS: 2.5, 5, or 7.5 (default: 5)
Language: English
Period: Winter term 2020/21
Presence time: Virtual seminar, working remotely
Useful knowledge: Python, machine learning
Work distribution: 100% algorithm development
Med. Eng. designation: Advanced Context Recognition (ACR)
StudOn link: Please join
First meeting: online-introduction-vorbesprechung-of-winter-term-2020-seminars, on 4th November 2020 at 16:15
Registration: Via StudOn, obligatory after the introduction


Up-to-date literature recommendations are provided during the lectures.


Final presentation and final report.


Annalisa Baronetto

  • Job title: Researcher
  • Address:
    Henkestraße 91, Haus 7, 1. OG
    91052 Erlangen
  • Phone number: +49 9131 85 23608
  • Email: