Important Details

  • Location: Baker Hall A51
  • Time: Tuesdays and Thursdays 2 PM - 3:20 PM
  • Instructor email: llms-11-667 @ andrew.cmu.edu

Course Description

Large Language Models Methods and Applications (11-667) is a graduate-level course that aims to provide a holistic view of the current state of large language models. The first half of the course starts with the basics of language models, including network architectures, training, inference, and evaluation. It then discusses the interpretation (or attempts at it), alignment, and emergent capabilities of large language models, followed by their popular applications in language tasks and new uses beyond text. In the second half, the course first presents techniques for scaling up language model pretraining and recent approaches to making the pretraining and deployment of large models more efficient. It then discusses various concerns surrounding the deployment of large language models and wraps up with the challenges and frontiers of LLM development.

This course is designed to give graduate-level students an overview of the techniques behind LLMs and a thorough grounding in their fundamentals and cutting-edge developments, to prepare them for further research or applied work in this new AI era.

Learning Goals

Students who successfully complete this course will be able to:

  • Compare and contrast different models in the LLM ecosystem in order to determine the best model for any given task.
  • Implement and train a neural language model from scratch in PyTorch.
  • Utilize open-source libraries to finetune and do inference with popular pre-trained language models.
  • Understand how to apply LLMs to a variety of downstream applications, and how decisions made during pre-training affect suitability for these tasks.
  • Read and comprehend recent academic papers on LLMs and have knowledge of the common terms used in them (alignment, scaling laws, RLHF, prompt engineering, instruction tuning, etc.).
  • Design new methodologies to leverage existing large-scale language models in novel ways.

Prerequisites

Students should have a basic understanding of machine learning, equivalent to the material covered by 10-301/10-601, and be familiar with concepts in natural language processing, equivalent to those covered by 11-411/11-611.

Students are expected to be fluent in Python. Familiarity with deep learning frameworks such as PyTorch will also be helpful.

Class Format

Classes will be in person, every Tuesday and Thursday 2:00PM-3:20PM at Baker Hall A51.

Readings: There will be reading materials for each lecture, which students are required to read through before the class.

Interactive Activities: There will be ungraded, interactive activities interspersed through the lectures. These will be things like discussing a topic from the class with those sitting near you or answering questions via polling software.

Homework: There will be six homework assignments, to be completed individually.

Exams: There will be a midterm exam and a final exam.

Grading

  • 60%: Homeworks
    • Each homework is worth 10% of your grade.
  • 20%: Midterm exam
    • Date: 10/22/2024 (in class)
  • 20%: Final exam
    • Date TBD
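
As a rough illustration of how the weights above combine, here is a minimal sketch. It assumes every score is a percentage from 0 to 100 and that the course grade is a plain weighted sum; the syllabus does not say how letter grades are derived, and the function name `course_grade` is hypothetical.

```python
# Weights taken from the grading breakdown above.
NUM_HOMEWORKS = 6
HOMEWORK_WEIGHT = 0.10  # each homework is worth 10%
MIDTERM_WEIGHT = 0.20
FINAL_WEIGHT = 0.20


def course_grade(homework_scores, midterm, final):
    """Weighted course percentage (assumption: simple weighted sum, no curve)."""
    if len(homework_scores) != NUM_HOMEWORKS:
        raise ValueError(f"expected {NUM_HOMEWORKS} homework scores")
    homework_part = sum(HOMEWORK_WEIGHT * score for score in homework_scores)
    return homework_part + MIDTERM_WEIGHT * midterm + FINAL_WEIGHT * final


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(course_grade([90, 85, 100, 95, 80, 90], midterm=88, final=92))
```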

Late Policy

Each student has six free late days to use across the six homeworks. If you are out of late days, then you will not be able to get credit for subsequent late homeworks. One “day” is defined as anytime between 1 second and 24 hours after the homework deadline. The intent of the late day policy is to allow you to take extra time due to unforeseen circumstances like illnesses. To use your late days on a homework, you MUST fill out this form.
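
The definition of a late “day” can be read as rounding lateness up to the next full 24 hours. The sketch below shows that reading; it is an illustration, not an official calculator. The function name `late_days_used` and the example deadline are hypothetical, and treating a submission less than one second late as one late day is an assumption.

```python
import math
from datetime import datetime, timedelta

FREE_LATE_DAYS = 6  # six free late days across the six homeworks
SECONDS_PER_DAY = 24 * 60 * 60


def late_days_used(deadline, submitted):
    """Late days consumed by one submission (0 if on time).

    Any lateness up to and including 24 hours counts as one day,
    anything beyond that up to 48 hours as two, and so on.
    """
    seconds_late = (submitted - deadline).total_seconds()
    if seconds_late <= 0:
        return 0
    return math.ceil(seconds_late / SECONDS_PER_DAY)


if __name__ == "__main__":
    # Hypothetical deadline, for illustration only.
    deadline = datetime(2024, 9, 19, 23, 59, 59)
    submitted = deadline + timedelta(hours=25)
    used = late_days_used(deadline, submitted)
    print(used, "late days used,", FREE_LATE_DAYS - used, "remaining")
```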

In the event of a medical emergency, please make your personal health, physical and mental, your first priority. Seek help from medical and care providers such as University Health Services. Students can request medical extensions afterwards with a proof or note from providers. These will not count toward your six late days. For other emergencies and absences, students can request extensions with corresponding documentation on a case-by-case basis with instructors.

Accommodations

If you have a disability and require accommodations, please contact Catherine Getchell, Director of Disability Resources, 412-268-6121, getchell@cmu.edu. If you have an accommodations letter from the Disability Resources office, we encourage you to discuss your accommodations and needs with us as early in the semester as possible. We will work with you to ensure that accommodations are provided as appropriate.

Policy on Missing Class

We will try to record classes, but we cannot guarantee this. If you must miss class, you should arrange to get notes from a friend. However, we plan to make classes interactive, so please try to attend.

Policy on Missing Exams

The final exam cannot be missed. If you have a valid reason for not being able to make the midterm (for example, presenting a paper at a conference or a medical emergency), you should let us know as soon as you are aware of the conflict, and we will discuss accommodations.

Academic Integrity

Please take some time to read through CMU’s Academic Integrity Policy. Students who violate this policy will be subject to the disciplinary actions described in the Student Handbook.

Collaboration on Homeworks

The six homeworks should be completed individually. However, we encourage you to ask questions on Piazza and in office hours. While you may discuss strategies amongst yourselves, all experiments and analyses should be your own.

Use of Language Models

Using a language model to generate any part of a homework answer without attribution will be considered a violation of academic integrity. This means that if you use ChatGPT or Copilot to assist you on a homework, you must state so explicitly within your response. On each homework, you will be asked to attest to whether you used AI systems to assist on the homework, and if so, in what manner. If you have used AI systems to generate any part of your homework, you must submit the prompts/instructions/inputs you used to obtain the generated output. Your grade will be based on both the correctness of your homework response and the quality of your prompts/instructions. Errors in the generated outputs that appear in your homework response, and uninteresting prompts (e.g., merely pasting the homework questions into the language model), do not demonstrate intellectual effort and are unlikely to receive a good grade.