In the last couple of weeks, you have learned about the classic and end-to-end paradigms of speech recognition. In this assignment, you will build an end-to-end ASR system for a subset of LibriSpeech, using the CTC framework that you learned in class. All the questions and related instructions can be found at this Google Colab link.
Instructions for submission can be found in the notebook. You will turn in the completed notebook on Gradescope along with the necessary model checkpoint files. Remember to:
1. Paste the viewable link in the box provided in the notebook.
2. Upload the complete notebook along with the necessary checkpoint files (using OneDrive with a public sharing link).
3. Select the pages for the respective questions on Gradescope under the writing part.
For this assignment, you can collaborate with other students in a team of up to 3 people. Beyond your teammates, you are allowed to discuss general concepts and ideas related to the assignment, but you must not discuss actual solutions. Each team must hand in one single group submission on Gradescope with all teammates added.
Enroll yourself in the Gradescope class (entry code provided on the website) and submit the PDF before 11:59 PM (EDT) on November 27, 2024 (Wednesday). For the late policy, please refer to the website. If you face any technical or other difficulties, please contact the TA/CAs on Piazza.