Seminar: Automatically building and cataloging audio databases from the web

Background

Web scraping can automate data collection and help build the large datasets that are often needed to develop machine learning algorithms. This project aims to implement a convenient algorithm for automatically extracting and tagging audio data using web scraping, crowd-sourcing, and publicly available repositories.

Aim

Apply web-scraping-based algorithms to automatically extract and catalog audio data from publicly available repositories.

Learning objectives

  • Analyze audio data from open source repositories
  • Apply web-scraping methods to collect datasets from publicly available repositories
  • Use crowd-sourcing to catalog the audio data
  • Apply machine learning algorithms to automate the process
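
As a rough illustration of the web-scraping objective, the following minimal sketch collects links to audio files from an HTML page using only the Python standard library. It is not the seminar's prescribed method: the class and function names, the list of file extensions, and the example.org URLs are assumptions made for illustration, and a real project would also need to fetch pages and respect each repository's terms of use.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed set of audio file extensions; extend as needed.
AUDIO_EXTENSIONS = (".wav", ".mp3", ".flac", ".ogg")


class AudioLinkParser(HTMLParser):
    """Collect absolute URLs of audio files referenced in an HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.audio_urls = []

    def handle_starttag(self, tag, attrs):
        # Audio files are usually referenced via <a href>, <audio src>, or <source src>.
        if tag not in ("a", "audio", "source"):
            return
        for name, value in attrs:
            if name in ("href", "src") and value:
                url = urljoin(self.base_url, value)
                # Check the path only, so query strings do not hide the extension.
                if urlparse(url).path.lower().endswith(AUDIO_EXTENSIONS):
                    self.audio_urls.append(url)


def extract_audio_links(html, base_url):
    """Return all audio URLs found in the given HTML string."""
    parser = AudioLinkParser(base_url)
    parser.feed(html)
    return parser.audio_urls


# Hypothetical page content, used instead of a real network request.
sample_html = """
<html><body>
  <a href="/sounds/cough_01.wav">Cough</a>
  <a href="about.html">About</a>
  <audio controls><source src="clips/rain.mp3?download=1"></audio>
</body></html>
"""
links = extract_audio_links(sample_html, "https://example.org/library/")
print(links)
```

The collected URLs could then be downloaded and passed on to the cataloging step, for example crowd-sourced tagging.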

Data

Project type: Seminar (optional: Master thesis)
ECTS: 2.5, 5, 7.5, default: 5
Language: English
Period: Winter term 2020/21
Attendance: Virtual seminar, remote work
Useful knowledge: Python, Machine Learning
Work distribution: 100% algorithm development
Med. Eng. designation: Advanced Context Recognition (ACR)
StudOn link: Please join
First meeting: Online introduction (Vorbesprechung) to the winter term 2020 seminars, on 4 November 2020 at 16:15
Registration: Via StudOn, obligatory after the introduction

Literature

Up-to-date literature recommendations are provided during the lectures.

Examination

Final presentation and final report.

Contact

Annalisa Baronetto

  • Job title: Researcher
  • Address:
    Henkestraße 91, Haus 7, 1. OG
    91052 Erlangen
    Germany
  • Phone number: +49 9131 85 23608
  • Email: annalisa.baronetto@fau.de

Friedrich-Alexander-Universität Erlangen-Nürnberg