Machine Translation Course, Information, Yearly Results

Academic year 2022-2023

\[ T(f \rightarrow e) = \arg \max_{e} P(e)P(f|e) \]
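
The formula above is the noisy-channel view of translation: among candidate target sentences e, pick the one that maximizes the product of a language-model score P(e) and a translation-model score P(f|e). A minimal sketch of that arg max, assuming made-up toy probabilities (the candidate sentences, the `lm_prob` and `tm_prob` tables, and the `decode` helper are hypothetical and not part of the course materials):

```python
import math

# Toy language model P(e): hypothetical probabilities of candidate English sentences.
lm_prob = {
    "the house is small": 0.020,
    "small the house is": 0.001,
    "the home is little": 0.005,
}

# Toy translation model P(f | e) for one fixed source sentence f (hypothetical values).
tm_prob = {
    "the house is small": 0.30,
    "small the house is": 0.35,
    "the home is little": 0.20,
}


def decode(candidates):
    """Return the candidate e that maximizes log P(e) + log P(f | e)."""
    # Summing logs is equivalent to multiplying probabilities and avoids underflow.
    return max(candidates, key=lambda e: math.log(lm_prob[e]) + math.log(tm_prob[e]))


best = decode(list(lm_prob))
print(best)
```

In this toy setup the scrambled candidate has the highest translation-model score, but the language model penalizes its word order, so the fluent candidate wins the arg max.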

Schedule of MT classes 2022

| Week | Date | Topic | Materials | Presenters |
|------|------|-------|-----------|------------|
| 01 | 7. Oct. 2022 | Teaching | slides | |
| 02 | 14. Oct. 2022 | Teaching | reading | |
| 03 | 21. Oct. 2022 | Teaching | reading; MT metrics (video) | |
| 04 | 28. Oct. 2022 | Teaching | reading; *Seq2seq Models Tutorial | |
| 05 | 4. Nov. 2022 | Teaching | reading; *Step-by-step debug | |
| 06 | 11. Nov. 2022 | Human and Automatic Evaluation Metrics | 🤔main paper; additional readings; BERTScore, WMT2020 Metrics, Significance | Rebeca Oprea (engineer), Teodor Dumitrescu (author), Chiruț Veronica (reviewer) |
| 07 | 22. Nov. 2022 (moved from 18. Nov.; room 119, at 10:00) | Data Acquisition | 🤔main paper; additional important readings; training LASER; teacher-student; references | Ahmad Wali (engineer), Daniel Sava (author), Iordăchescu Anca (reviewer) |
| 08 | 25. Nov. 2022 | Language Models, Translation Models, Tokenizers | 🤔main paper; additional readings; BPE dropout; references; language models, tokenizers | Stan Flavius (author), Bazavan Cristian (engineer), Blăgescu Alex (reviewer), Stegarescu Ana (visionary) |
| 09 | 8. Dec. 2022 (moved from 2. Dec.; room 119, at 12:00) | Neural MT, Attention, Multilinguality | 🤔main paper; additional readings; Annotated Transformer, Illustrated Transformer, Lena Voita’s Tutorial | Ranete Cristian (reviewer), Nedelcu Mihai (visionary), Ilicea Anca (author), Mărilă Mircea (engineer) |
| 10 | 9. Dec. 2022 | Tokenizers, Transformers, Explainability | 🤔main paper; additional readings; Visualizing Attention, Quantifying Attention Flow in Transformers | Bleoţiu Eugen (visionary), Antal Mihaela (reviewer), Zăvelcă Miruna (engineer), Dăscălescu Dana (author) |
| 11 | 16. Dec. 2022 | Diffusion Models ChatGPT | 🤔main paper; additional readings; ChatGPT; blog; RLHF1, RLHF2; BLOOM📖; Galactica | Istrati Lucian📖 (engineer), Lazăr Dorian (author), Creanga Claudiu (reviewer), Aldea Gabriela (visionary) |
| 12 | 23. Dec. 2022 | Projects 🌲 | | |
| 13 | 13. Jan. 2023 | Projects | | |
| 14 | 20. Jan. 2023 | Projects | | |

Roles

Each person is assigned a role (almost randomly) and must prepare the reading listed in the Materials column of the row where their name appears. Materials will be announced shortly. Take these roles seriously, as they account for half of your grade. Since December 2 falls during a public holiday, we can postpone the presentations to Thursday, December 8.

Author

Pretend you are the main author of the papers, prepare a presentation and talk about:

Scientific reviewer

Make a critical evaluation of the paper, which need not be negative; read the guidelines and examples from NIPS.

Engineer

Implement something related to the paper either on the same dataset or on a new one; prepare to share the code and some empirical intuition behind the paper.

Visionary

Propose a follow-up research project or a new application; take into account the previous work and existing work being done; take into account ethics and the socio-economic impact:

Attendees

Everyone must ask a question at the end of the presentations to qualify as being present. Being present at all the presentations will account for 1 bonus point at the end.

Lab Projects

  1. gather some colleagues and form a team of at most 3 people
  2. choose an MT topic that you would like to research (see the project list on the website or propose your own)
  3. make sure your topic does not overlap with topics already chosen by your colleagues or already in progress
  4. email Sergiu to announce your team and your proposal, and to discuss how to approach it
  5. after you obtain the approval, mark it as being in progress on the kanban list and start working on it
  6. prepare the project, a presentation, and a report using this template
  7. place everything in a digital storage space somewhere: a git repo, a drive, some file on a server etc.; don’t send large files by email, send only URLs
  8. current deadlines are December 23, January 13, and January 20

MT Bibliography (expanding…)

* blog posts, tutorials, visual explanations

Prerequisites: general ML concepts, blogs, tutorials

Prerequisites: general ML concepts, books

Statistical Machine Translation & Language Models

Evaluation

Data Collection, Alignment

Neural MT with RNNs

Neural MT with CNNs

Tokenizers

*Transformers - Tutorials

Transformers - Essential Readings

Other Transformer Models

Transformers and Explainability

Machine Translation Frameworks

Extra Readings on Machine Translation

Recent / Interesting Research

Neural MT with Diffusion Models

Back to the future

Other Courses