Seminar: Automatically building and cataloging audio databases from the web
Background
Web scraping can automatically collect data and build the large datasets that are often needed to develop machine learning algorithms. This project aims to implement an algorithm for the automatic extraction and tagging of audio data using web scraping, crowd-sourcing, and publicly available repositories.
Aim
Apply web scraping-based algorithms to automatically extract and catalog audio data from publicly available repositories.
Learning objectives
- Analyze audio data from open source repositories
- Apply web-scraping methods to collect datasets from publicly available repositories
- Use crowd-sourcing to catalog the audio data
- Apply machine learning algorithms to automate the process
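As a rough illustration of the web-scraping objective above, the following minimal sketch extracts links to audio files from an HTML page using only the Python standard library. The names `AudioLinkParser`, `extract_audio_links`, and `AUDIO_EXTENSIONS`, the list of file extensions, and the example.org URL are assumptions made for illustration; they are not part of the seminar material, and a real repository may need a different approach (for example an official API).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed set of audio file extensions; adjust for the target repository.
AUDIO_EXTENSIONS = (".wav", ".mp3", ".flac", ".ogg")


class AudioLinkParser(HTMLParser):
    """Collects absolute URLs of audio files referenced in an HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.audio_urls = []

    def handle_starttag(self, tag, attrs):
        # Audio files are commonly referenced via href or src attributes.
        for name, value in attrs:
            if name in ("href", "src") and value:
                url = urljoin(self.base_url, value)  # resolve relative links
                if urlparse(url).path.lower().endswith(AUDIO_EXTENSIONS):
                    self.audio_urls.append(url)


def extract_audio_links(html, base_url):
    """Return all audio file URLs found in the given HTML string."""
    parser = AudioLinkParser(base_url)
    parser.feed(html)
    return parser.audio_urls


if __name__ == "__main__":
    # Hypothetical page content, used instead of a live download.
    sample = (
        '<a href="/sounds/rain.wav">Rain</a>'
        '<a href="about.html">About</a>'
        '<audio src="clips/dog.MP3"></audio>'
    )
    for url in extract_audio_links(sample, "https://example.org/library/"):
        print(url)
```

In practice the HTML would be downloaded from a repository page first, and the repository's terms of use and robots.txt should be checked before any data is collected.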
Data
Project type | Seminar (optional: Master thesis) |
ECTS | 2.5, 5, 7.5, default: 5 |
Language | English |
Period | Winter term 2020/21 |
Presence time | Virtual seminar, remote work |
Useful knowledge | Python, Machine Learning |
Work distribution | 100% algorithm development |
Med. Eng. designation | Advanced Context Recognition (ACR) |
StudOn link | Please join |
First Meeting | Online introduction (Vorbesprechung) of the winter term 2020 seminars, on 4 November 2020 at 16:15 |
Registration | Via StudOn, obligatory after introduction |
Literature
Up-to-date literature recommendations are provided during the lectures.
Examination
Final presentation and final report.
Contact
Annalisa Baronetto
- Job title: Researcher
- Address:
Henkestraße 91, Haus 7, 1. OG
91052 Erlangen
Germany
- Phone number: +49 9131 85 23608
- Email: annalisa.baronetto@fau.de