Seminar: Automatic building and cataloging of audio databases from the web

July 23, 2020

Background

Web scraping can automatically collect data and build the large datasets that are often needed to develop machine learning algorithms. This project aims to implement a practical algorithm for automatically extracting and tagging audio data using web scraping, crowd-sourcing, and publicly available repositories.

Aim

Apply web scraping-based algorithms to automatically extract and catalog audio data from publicly available repositories.

Learning objectives

Analyze audio data from open source repositories
Apply web-scraping methods to collect datasets from publicly available repositories
Use crowd-sourcing to catalog the audio data
Apply machine learning algorithms to automate the process
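
As a rough illustration of the collection and cataloging steps listed above, the following minimal sketch extracts audio links from an HTML page and builds a simple catalog with empty tag lists that crowd-sourced labels could later fill. It is not the seminar's prescribed method: the names `AudioLinkParser` and `build_catalog`, the set of file extensions, the sample page, and the `example.org` URL are all hypothetical, and only the Python standard library is assumed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import posixpath

# Assumed set of audio formats; adjust for the repository being scraped.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg"}


class AudioLinkParser(HTMLParser):
    """Collect audio file URLs from <a href>, <audio src> and <source src>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_name = {"a": "href", "audio": "src", "source": "src"}.get(tag)
        if attr_name is None:
            return
        for name, value in attrs:
            if name == attr_name and value:
                # Resolve relative links against the page URL.
                url = urljoin(self.base_url, value)
                ext = posixpath.splitext(urlparse(url).path)[1].lower()
                if ext in AUDIO_EXTENSIONS and url not in self.links:
                    self.links.append(url)


def build_catalog(html, base_url):
    """Return one catalog entry per audio link found in the page."""
    parser = AudioLinkParser(base_url)
    parser.feed(html)
    catalog = []
    for url in parser.links:
        path = urlparse(url).path
        catalog.append({
            "url": url,
            "filename": posixpath.basename(path),
            "format": posixpath.splitext(path)[1].lower().lstrip("."),
            "tags": [],  # to be filled by crowd-sourcing or a classifier
        })
    return catalog


# Hypothetical page content standing in for a downloaded repository page.
SAMPLE_HTML = """
<html><body>
  <a href="/sounds/dog_bark.wav">Dog bark</a>
  <a href="about.html">About</a>
  <audio controls><source src="clips/rain.mp3"></audio>
</body></html>
"""

catalog = build_catalog(SAMPLE_HTML, "https://example.org/library/")
for entry in catalog:
    print(entry["filename"], entry["url"])
```

In practice the HTML would be downloaded from a repository whose terms of use permit scraping, and the empty `tags` field is where crowd-sourced or machine-predicted labels would go.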

Data

Project type	Seminar (optional: Master thesis)
ECTS	2.5, 5, 7.5, default: 5
Language	English
Period	Winter term 2020/21
Presence time	Virtual seminar, remote work
Useful knowledge	Python, Machine Learning
Work distribution	100% algorithm development
Med. Eng. designation	Advanced Context Recognition (ACR)
StudOn link	Please join
First meeting	Online introduction (preliminary meeting) for the winter term 2020 seminars, on 4 November 2020 at 16:15
Registration	Via StudOn; mandatory after the introduction

Literature

Up-to-date literature recommendations are provided during the lectures.

Examination

Final presentation and final report.

Contact

Annalisa Baronetto

Job title: Researcher
Address:

Henkestraße 91, Haus 7, 1. OG
91052 Erlangen
Germany
Phone number: +49 9131 85 23608
Email: annalisa.baronetto@fau.de