601.467/667 Introduction to Human Language Technology


Fall 2023

Coordinator: Philipp Koehn (phi@jhu.edu)
TA: Elina Baral (ebaral1@jhu.edu)
CAs: Nikhil Sharma (nsharm27@jhu.edu), Qianqi Huang (qhuang35@jhu.edu), Muskaan Muskaan (mmuskaa1@jhu.edu)
Class: Tuesday and Thursday 9:00-10:15am, Hodson 210
Office hours:
  Coordinator: on request
  TA: Elina Baral, Thursday 3-4pm, Malone 216
  CA: Nikhil Sharma, Friday 3-4pm, Malone 216
  CA: Qianqi Huang, Tuesday 1-2pm, Malone 216
  CA: Muskaan Muskaan, Monday 3:30-4:30pm, Malone 216
Gradescope (entry code: GP55NB)
Piazza (access code: qlnu01yhhel)
Lecture recordings

Assignments

Note: For Fall 2023, please *do not* start an assignment before it has been announced. Assignments are subject to change.
You can confirm whether a homework has been released by checking the date listed on it.

Late submissions: Each student has a total of 10 late days to use across all homeworks.
Late days are counted in whole days: if you submit a homework even a few minutes late, you use 1 day of your quota.
After you have used all 10 late days, each additional late day costs a 5% penalty.
A late team submission uses 1 late day from each teammate.
For each homework, you *are not allowed to submit* more than 20 days after the due date.
  1. N-gram language modeling, CYK parsing: Due on September 20 (Wednesday)
  2. RNNLMs, word2vec: Due on October 11 (Wednesday)
  3. Seq2seq for pronunciation prediction: Due on November 15 (Wednesday)
  4. Speech recognition with CTC: Due on December 6 (Wednesday)
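
The late-submission rules above involve some arithmetic, so here is a minimal sketch of how the accounting might work. It is an illustration under stated assumptions, not the course's grading code: the function names, the 11:59pm deadline time, and the reading that the 5% penalties add up per day beyond the quota are all assumptions.

```python
import math
from datetime import datetime, timedelta


def late_days(due: datetime, submitted: datetime) -> int:
    """Whole late days used; any fraction of a day counts as a full day."""
    seconds_late = (submitted - due).total_seconds()
    if seconds_late <= 0:
        return 0
    return math.ceil(seconds_late / 86400)


def apply_late_policy(days_late: int, quota_left: int,
                      penalty_per_day: float = 0.05, max_days: int = 20):
    """Return (accepted, quota_left_after, penalty_fraction).

    Assumption: days covered by the remaining quota are free, and each
    day beyond the quota adds penalty_per_day to the total penalty.
    """
    if days_late > max_days:
        # Past the hard cutoff: the submission is not accepted.
        return False, quota_left, 0.0
    free = min(days_late, quota_left)
    penalized = days_late - free
    return True, quota_left - free, penalized * penalty_per_day


# Hypothetical deadline time; the page only lists the due date.
due = datetime(2023, 9, 20, 23, 59)
print(late_days(due, due + timedelta(minutes=5)))  # 1
print(apply_late_policy(3, quota_left=10))         # (True, 7, 0.0)
print(apply_late_policy(4, quota_left=2))          # (True, 0, 0.1)
```

For a team submission, the same call would be made once per teammate, since each teammate's quota is charged separately.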

Exam

There will be two midterms and a final exam. You are allowed to bring 1 sheet of paper with notes to the exam.

The final exam takes place on Thursday, December 14 at 6pm in Maryland 110.

Lectures

Date | Topic | Instructor
Tu Aug 29 | Introduction | Koehn

Text
Th Aug 31 | Words and Language Models | Yarowsky
Tu Sep 5 | Morphology | Yarowsky
Th Sep 7 | Syntax | Post
Tu Sep 12 | Semantics | Lippincott
Th Sep 14 | Deep learning I | Murray
Tu Sep 19 | Deep learning II (Python notebook) | Murray
Th Sep 21 | Distributional Semantics and Large Language Models | Koehn
Tu Sep 26 | Machine Translation | Murray
Th Sep 28 | Information Extraction | Koehn
Tu Oct 3 | Information Retrieval | Duh
Th Oct 5 | First Midterm Exam | -

Speech
Tu Oct 10 | Classic speech recognition [1] (additional slides) | Khudanpur
Th Oct 12 | Speech basics | Moro-Velazquez
Tu Oct 17 | Speaker recognition | Villalba
Tu Oct 24 | End-to-end neural speech recognition | Wiesner
Th Oct 26 | Auditory system | Elhilali
Tu Oct 31 | Enhancement and Diarization | Maciejewski
Th Nov 2 | Hands on: Kaldi (K2, ESPnet) | Wiesner and Maciejewski
Tu Nov 7 | Hands on: Kaldi (Transducer-based ASR, CTC ASR from pretrained models) | -
Th Nov 9 | Second Midterm Exam | -

Applications
Tu Nov 14 | Question Answering | Duh
Th Nov 16 | NLP for Digital Humanities | Lippincott
Tu Nov 28 | Interpretable and Explainable NLP | Hanjie Chen
Th Nov 30 | Ethical Problems | Moro-Velazquez
Tu Dec 5 | Scaling Large Language Models | Daniel Khashabi
Th Dec 7 | Computational Social Science | Field

[1] These slides present an incomplete picture of what will be discussed in class. Attentive listening is recommended for gaining maximal benefit.