Large Language Models Methods and Applications

Table of contents

  1. Large Language Models Methods and Applications
    1. Important Details
    2. Course Description
    3. Learning Goals
    4. Prerequisites
    5. Class Format
    6. Grading
    7. Late Policy
    8. Accommodations
    9. Policy on Missing Class
    10. Academic Integrity
      1. Collaboration on Homeworks
      2. Use of Language Models

Important Details

  • Location: Baker Hall A51
  • Time: Tuesdays and Thursdays 2 PM - 3:20 PM
  • Instructor email: llms-11-667 @ andrew.cmu.edu

Course Description

Large Language Models Methods and Applications (11-667) is a graduate-level course that aims to provide a holistic view of the current state of large language models. The first half of the course starts with the basics of language models, including network architectures, training, inference, and evaluation. It then discusses the interpretation (or attempts at it), alignment, and emergent capabilities of large language models, followed by their popular applications in language tasks and new uses beyond text. The second half first presents techniques for scaling up language model pretraining and recent approaches to making the pretraining and deployment of large models more efficient. It then discusses various concerns surrounding the deployment of large language models and wraps up with the challenges and frontiers of LLM development.

This course is designed to give graduate-level students an overview of the techniques behind LLMs and a thorough grounding in the fundamentals and cutting-edge developments of LLMs, to prepare them for further research or applied endeavors in this new AI era.

Learning Goals

Students who successfully complete this course will be able to:

  • Compare and contrast different models in the LLM ecosystem in order to determine the best model for any given task.
  • Implement and train a neural language model from scratch in PyTorch.
  • Utilize open-source libraries to finetune and do inference with popular pre-trained language models.
  • Understand how to apply LLMs to a variety of downstream applications, and how decisions made during pre-training affect suitability for these tasks.
  • Read and comprehend recent academic papers on LLMs and have knowledge of the common terms used in them (alignment, scaling laws, RLHF, prompt engineering, instruction tuning, etc.).
  • Design new methodologies to leverage existing large-scale language models in novel ways.

Prerequisites

Students should have a basic understanding of machine learning, equivalent to the material covered by 10-301/10-601, and be familiar with concepts in natural language processing, equivalent to those covered by 11-411/11-611.

Students are expected to be fluent in Python. Familiarity with deep learning frameworks such as PyTorch will also be helpful.

Class Format

Classes will be in person, every Tuesday and Thursday 2:00PM-3:20PM at Baker Hall A51.

Readings: There will be reading materials for each lecture, which students are required to read through before the class.

Quizzes: Each class will start with an in-person quiz about the reading materials for the lecture or the material from previous lectures.

Interactive Activities: There will be ungraded, interactive activities interspersed through the lectures. These will be things like discussing a topic from the class with those sitting near you or answering questions via polling software.

Homework: There will be three homework assignments, to be completed individually.

Project: There will be a group project with several checkpoints throughout the semester. The project will be completed in groups of 3-5 people.

Exams: There will be a midterm exam on October 10. There will not be a final exam.

Grading

  • 30%: Homeworks
    • Each homework is worth 10% of your grade.
  • 45%: Course Project
    • Proposal: 10%
    • Midpoint report: 10%
    • Final report and presentation: 20%
    • Peer feedback: 5%
  • 5%: In-Class Quizzes
    • These will be given in the last 10 minutes of class.
    • Only your top 20 quiz grades will be considered. (This means you can miss 5 quizzes and still get full marks.)
  • 20%: Midterm
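
As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes every component is scored as a fraction between 0 and 1 and that the quiz component is the average of your top 20 quiz scores, with untaken quizzes counting as zero. The syllabus does not state how quiz scores are aggregated, and the function names are hypothetical; this is not an official grading tool.

```python
def quiz_component(quiz_scores, keep=20):
    """Average of the best `keep` quiz scores (each between 0 and 1).

    Assumption: if fewer than `keep` quizzes were taken, the missing
    ones count as zeros, so we always divide by `keep`.
    """
    top = sorted(quiz_scores, reverse=True)[:keep]
    return sum(top) / keep


def course_grade(homeworks, proposal, midpoint, final, peer, quiz_scores, midterm):
    """Weighted course grade out of 100, using the weights listed above."""
    return (
        sum(10 * h for h in homeworks)   # three homeworks, 10% each
        + 10 * proposal                  # project proposal
        + 10 * midpoint                  # project midpoint report
        + 20 * final                     # final report and presentation
        + 5 * peer                       # peer feedback
        + 5 * quiz_component(quiz_scores)
        + 20 * midterm
    )


# Example: perfect scores everywhere, but 5 of 25 quizzes were missed.
quizzes = [1.0] * 20 + [0.0] * 5
print(course_grade([1.0, 1.0, 1.0], 1.0, 1.0, 1.0, 1.0, quizzes, 1.0))
```

Under these assumptions, the five missed quizzes in the example do not lower the grade, because only the top 20 scores are used.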

Late Policy

Each student has five free late days to use on the three homeworks. If you are out of late days, then you will not be able to get credit for subsequent late homeworks. One “day” is defined as anytime between 1 second and 24 hours after the homework deadline. The intent of the late day policy is to allow you to take extra time due to unforeseen circumstances like illnesses. To use your late days on a homework, you MUST fill out this form.

In the event of a medical emergency, please make your personal health, physical and mental, your first priority. Seek help from medical and care providers such as University Health Services. Students can request medical extensions afterwards with proof/note from providers. These will not count toward your 5 days. For other emergencies and absences, students can request extensions with corresponding documentation on a case-by-case basis with instructors.

There will be no makeups for the quizzes. There will be no late days on the project milestones. If your team cannot complete a project milestone, we expect you to submit to us your incomplete work by the due date, and we will consider accepting subsequent revisions on a case-by-case basis.
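
To make the definition of a late “day” concrete, the sketch below counts late days by rounding the lateness up to the next whole 24-hour period. The function name and the example dates are hypothetical, and treating any lateness under one second as a full day is an assumption; this illustrates the rule and is not the official accounting.

```python
import math
from datetime import datetime, timedelta


def late_days_used(deadline, submitted):
    """Number of late days consumed by one submission.

    Any lateness from just past the deadline up to exactly 24 hours
    counts as one day, just past 24 hours up to 48 hours counts as
    two days, and so on. On-time submissions use zero days.
    """
    late = submitted - deadline
    if late <= timedelta(0):
        return 0
    return math.ceil(late / timedelta(days=1))


# Hypothetical deadline, for illustration only.
deadline = datetime(2024, 9, 20, 23, 59, 59)
print(late_days_used(deadline, deadline + timedelta(seconds=1)))            # 1
print(late_days_used(deadline, deadline + timedelta(hours=24)))             # 1
print(late_days_used(deadline, deadline + timedelta(hours=24, seconds=1)))  # 2
```

Summing this value over the three homeworks and comparing it with the five-day budget shows how many late days remain.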

Accommodations

If you have a disability and require accommodations, please contact Catherine Getchell, Director of Disability Resources, 412-268-6121, getchell@cmu.edu. If you have an accommodations letter from the Disability Resources office, we encourage you to discuss your accommodations and needs with us as early in the semester as possible. We will work with you to ensure that accommodations are provided as appropriate.

Policy on Missing Class

There will be no makeups for the quizzes if you miss class. The graded quizzes are intended to encourage class attendance and participation, as there will be a variety of ungraded interactive activities in class which are important to your learning experience. In keeping with the principles of universal design for learning, we intend to be flexible with the quizzes; you can miss up to 5 quizzes for any reason (religious observances, visa issues, illness, etc.) and still get full marks, as we will only grade your top 20.

In extenuating circumstances where students have no option but to miss class (such as visa issues or a medical emergency), we may provide video recordings to individual students. If you need to request this, please email us at llms-11-667 @ andrew.cmu.edu. We make no guarantees about the production quality or learning experience from these videos.

Academic Integrity

Please take some time to read through CMU’s Academic Integrity Policy. Students who violate this policy will be subject to the disciplinary actions described in the Student Handbook.

Collaboration on Homeworks

The three homeworks should be completed individually. However, we encourage you to ask questions on Piazza and in office hours. While you may discuss strategies amongst yourselves, all experiments and analyses should be your own.

Use of Language Models

Using a language model to generate any part of a homework answer without attribution will be considered a violation of academic integrity. This means that if you use ChatGPT or Copilot to assist you on a homework, you must state so explicitly within your response. On each homework, you will be asked to attest to whether you used AI systems to assist on the homework, and if so, in what manner. If you have used AI systems to generate any part of your homework, you must submit the prompts/instructions/inputs you used to obtain the generated output. Your grade will be based on both the correctness of your homework response and the quality of your prompts/instructions. Errors in the generated outputs that appear in your homework response, and uninteresting prompts (e.g., merely pasting the homework questions into the language model), do not reflect intellectual effort and are unlikely to receive a good grade.