601.467/667 Introduction to Human Language Technology


Fall 2024

Coordinator: Philipp Koehn (phi@jhu.edu)
TA: Elina Baral (ebaral1@jhu.edu)
CAs: TBA
Class: Tuesday and Thursday 9:00-10:15am, Hodson 210
Office hours:
  Coordinator: on request
  TA: Elina Baral, Thursday 1:00-2:00, Malone 216
  CA: Muhan Gao, Monday 1:00-2:00, Malone 216
      Shabeg Singh Gill, Tuesday 5:00-6:00, Malone 216
      Hirtika Mirghani, Wednesday 11:30-1:00, Malone 216
      Huiqi Zou, Friday 1:00-2:00, Malone 216
Gradescope (entry code: 8KX4DX)
Piazza (access code: qlnu01yhhel)
Old lecture recordings

Assignments

Note: For Fall 2024, please *do not* start an assignment before it has been announced. Assignments are subject to change.
You can confirm whether a homework has been released by checking the date listed on it.

Late submissions: Each student has a total quota of 10 late days across all homeworks.
Lateness is counted in whole days: if you submit a homework even a few minutes late, you use 1 day of your quota.
After you have used up all 10 late days, each further late day costs a 5% penalty.
For team assignments, a late day uses 1 day from each teammate's quota.
For each homework, you *are not allowed to submit* more than 14 days after the due date.
  1. N-gram language modeling, CYK parsing: Due on September 11 (Wednesday)
  2. RNNLMs, word2vec: Due on September 25 (Wednesday)
  3. Seq2seq for pronunciation prediction: Due on October 30 (Wednesday)
  4. Speech recognition with CTC: Due on November 27 (Wednesday)
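
The late-submission rules above can be sketched in a few lines of Python. This is only an illustrative reading of the policy, not an official grading script: the function names, the 23:59 due time in the usage note, and the assumption that the 5% penalty accumulates once per late day not covered by the quota are all hypothetical.

```python
import math
from datetime import datetime

FREE_LATE_DAYS = 10     # total quota across all homeworks (from the policy)
PENALTY_PER_DAY = 0.05  # 5% per late day once the quota is used up
MAX_LATE_DAYS = 14      # submissions later than this are not accepted


def late_days(due: datetime, submitted: datetime) -> int:
    """Count late days; any fraction of a day counts as a full day."""
    seconds_late = (submitted - due).total_seconds()
    if seconds_late <= 0:
        return 0
    return math.ceil(seconds_late / 86400)


def apply_late_policy(days_late: int, quota_left: int):
    """Return (accepted, penalty_fraction, quota_left_afterwards).

    Assumption: late days are first covered by the remaining quota,
    and each uncovered day adds a 5% penalty.
    """
    if days_late > MAX_LATE_DAYS:
        return False, 0.0, quota_left
    covered = min(days_late, quota_left)
    penalized = days_late - covered
    return True, penalized * PENALTY_PER_DAY, quota_left - covered
```

For example, under these assumptions a homework due at a hypothetical 23:59 on September 11 and submitted at 00:05 on September 12 gives `late_days(...) == 1`, and a student with 2 quota days left who submits 3 days late would keep the submission with a 5% penalty and no quota remaining.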

Exam

There will be two midterms and a final exam. You are allowed to bring 1 sheet of paper with notes to the exam.

The final exam takes place on Tuesday, December 17, 6-9pm.

Lectures

Date | Topic | Instructor
Tu Aug 27 | Introduction | Koehn
Text
Th Aug 29 | Words and Language Models | Yarowsky
Tu Sep 3 | Morphology | Yarowsky
Th Sep 5 | Syntax | Post
Tu Sep 10 | Deep learning I | Murray
Th Sep 12 | Deep learning II (Python notebook) | Murray
Tu Sep 17 | Semantics | Lippincott
Th Sep 19 | Distributional Semantics and Large Language Models | Koehn
Tu Sep 24 | Machine Translation | Duh
Th Sep 26 | Information Extraction | Koehn
Tu Oct 1 | Information Retrieval | Yang
Th Oct 3 | First Midterm Exam | -
Speech
Tu Oct 8 | Speech basics | Moro-Velazquez
Th Oct 10 | Classic speech recognition [1] (additional slides, video 0:12-1:25) | Khudanpur
Tu Oct 15 | Speaker recognition | Villalba
Tu Oct 22 | End-to-end neural speech recognition | Wiesner
Th Oct 24 | Auditory system | Elhilali
Tu Oct 29 | Enhancement and Diarization | Maciejewski
Th Oct 31 | Hands on: Kaldi (K2, ESPnet) | Wiesner and Maciejewski
Tu Nov 5 | Hands on: Kaldi (Transducer-based ASR, CTC ASR from pretrained models) |
Th Nov 7 | Second Midterm Exam |
Applications
Tu Nov 12 | Question Answering | Duh
Th Nov 14 | NLP for Digital Humanities | Lippincott
Tu Nov 19 | Interpretable and Explainable NLP | Hanjie Chen
Th Nov 21 | Ethical Problems | Moro-Velazquez
Tu Dec 3 | Scaling Large Language Models | Daniel Khashabi
Th Dec 5 | Computational Social Science | Field

[1] These slides present an incomplete picture of what will be discussed in class. Attentive listening is recommended for gaining maximal benefit.