Summary
Student in the Data Engineering and Analytics master's program at the Technical University of Munich, with a research focus on the medical field. Holds a bachelor's degree in Information Systems Engineering. Passionate about problem-solving, with six years of experience as a contestant, coach, problem-setter, and judge in the International Collegiate Programming Contest. DAAD scholar, volunteering as an academic mentor with the Syrian Youth Empowerment NGO to support Syrian students in pursuing postgraduate studies abroad.
Publications
- Ghandoura, Abdulkader, Farouk Hjabo, and Oumayma Al Dakkak. “Building and benchmarking an Arabic Speech Commands dataset for small-footprint keyword spotting.” Engineering Applications of Artificial Intelligence 102 (2021): 104267.
Datasets
- Arabic Speech Commands Dataset (CC BY 4.0): Our dataset is a list of pairs (x, y), where x is the input speech signal and y is the corresponding keyword. The final dataset consists of 12,000 such pairs covering 40 keywords. Each audio file is one second long and sampled at 16 kHz. Each of our 30 participants recorded 10 utterances of every keyword, so we have 300 audio files per keyword and 30 × 10 × 40 = 12,000 files in total, with a combined size of ~384 MB. The dataset also contains several background noise recordings that we obtained from various natural sources of noise. We saved these audio files in a separate folder named background_noise, with a total size of ~49 MB.
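
The file counts and the approximate size quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check: the speaker, utterance, keyword, sample-rate, and duration figures come from the description, while the 16-bit mono PCM encoding (`BYTES_PER_SAMPLE = 2`) is an assumption that the description does not state, and file headers are ignored.

```python
# Figures taken from the dataset description.
NUM_SPEAKERS = 30
UTTERANCES_PER_KEYWORD = 10  # per speaker
NUM_KEYWORDS = 40
SAMPLE_RATE_HZ = 16_000
CLIP_SECONDS = 1

# Assumption (not stated in the description): 16-bit mono PCM audio.
BYTES_PER_SAMPLE = 2


def dataset_stats():
    """Return (files per keyword, total files, approximate size in MB)."""
    files_per_keyword = NUM_SPEAKERS * UTTERANCES_PER_KEYWORD
    total_files = files_per_keyword * NUM_KEYWORDS
    # Raw audio payload of one clip, ignoring any container header.
    bytes_per_clip = SAMPLE_RATE_HZ * CLIP_SECONDS * BYTES_PER_SAMPLE
    total_mb = total_files * bytes_per_clip / 1_000_000
    return files_per_keyword, total_files, total_mb


if __name__ == "__main__":
    print(dataset_stats())  # (300, 12000, 384.0)
```

Under that assumption, each clip holds 32,000 bytes of audio, and 12,000 clips come to 384 MB, which is consistent with the ~384 MB stated above.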