Teaching
Fall 2023 · 3 credits · Year 3-4

Large Language Models

대규모언어모델 (Korean course title: "Large Language Models")

Architecture, application paradigms, and ethics of LLMs: hands-on training, paper analysis, and practical development.

Tags: LLM · Deep Learning · Ethics

Course Description

This advanced course provides a comprehensive understanding of Large Language Models (LLMs), focusing on their architecture, application paradigms, and ethical implications. Structured over 15 weeks, it is tailored to students with a background in machine learning and natural language processing. The course features hands-on training, in-depth analysis of scholarly papers, a midterm examination, and a final project centered on the development of a practical LLM application.

Learning Goals

Upon completing this course, students will be able to:

  1. Understand the architectural intricacies of leading LLMs like BERT, T5, and GPT-3.
  2. Utilize specialized techniques such as few-shot learning, prompt engineering, and in-context learning in LLMs.
  3. Investigate and address ethical concerns including bias and data privacy.
  4. Implement a real-world LLM application as part of the final project.
  5. Critically evaluate peer projects through a formal review process.

Grading

  • Participation: 10%
  • Midterm Exam: 25%
  • Peer Reviews of Final Project: 5%
  • Final Project: 60%
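As an illustration, the final grade is a weighted average of these four components. A minimal sketch (the component scores below are hypothetical):

```python
# Grade weights from the syllabus above (must sum to 1.0).
WEIGHTS = {
    "participation": 0.10,
    "midterm": 0.25,
    "peer_reviews": 0.05,
    "final_project": 0.60,
}

def final_grade(scores: dict) -> float:
    """Weighted average of component scores, each on a 0-100 scale."""
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

# Hypothetical student: strong final project, average midterm.
grade = final_grade({"participation": 90, "midterm": 80,
                     "peer_reviews": 85, "final_project": 95})
print(round(grade, 2))  # → 90.25
```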

Final Project

The final project requires students to build a real-world application on top of a large language model. It involves data pre-processing, model training/fine-tuning, evaluation, and documentation. Students also critically evaluate each other's work through a formal peer-review process. The deliverables are a functional LLM application and a research paper.

Course Outline

Week 1: Introduction to Large Language Models

  • Architectures: BERT, T5, GPT-3
  • Recommended Readings: BERT paper, T5 paper, GPT-3 paper

Week 2: Prompting Techniques

  • Few-Shot Learning, In-Context Learning
  • Readings: Making Pre-trained Language Models Better Few-shot Learners, How Many Data Points is a Prompt Worth?
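To make the Week 2 topic concrete: a few-shot prompt is simply a concatenation of labeled exemplars followed by the unlabeled query that the model completes. A minimal sketch (the sentiment task and exemplars are hypothetical illustrations):

```python
# Few-shot prompt builder: k labeled exemplars, then one query.
# Task, exemplars, and label names here are hypothetical.
EXEMPLARS = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I regret buying this laptop.", "negative"),
]

def build_prompt(query: str) -> str:
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in EXEMPLARS]
    blocks.append(f"Review: {query}\nSentiment:")  # the model fills in the label
    return "\n\n".join(blocks)

print(build_prompt("The soundtrack alone is worth the ticket."))
```

In-context learning refers to the model inferring the task from these exemplars at inference time, with no gradient updates.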

Week 3: Efficient Fine-Tuning

  • Parameter-Efficient Techniques
  • Readings: Prefix-Tuning, The Power of Scale for Parameter-Efficient Prompt Tuning
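The appeal of the Week 3 techniques is quantitative: the frozen backbone dwarfs the handful of parameters actually trained. A back-of-the-envelope sketch in the spirit of prefix-tuning (all sizes below are hypothetical, chosen only to illustrate the ratio):

```python
# Parameter-efficient tuning in a nutshell: freeze the base model,
# train only a small added module (here, a prefix of learned vectors).
# All sizes are hypothetical illustrations, not any specific model.
BASE_PARAMS = 350_000_000      # frozen backbone
PREFIX_LEN, HIDDEN = 20, 1024  # 20 trainable prefix vectors of width 1024

trainable = PREFIX_LEN * HIDDEN
fraction = trainable / (BASE_PARAMS + trainable)
print(f"trainable: {trainable} ({fraction:.6%} of all parameters)")
```

Training well under 0.1% of the parameters is what makes these methods cheap to store and serve per task.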

Week 4: Calibration and Reasoning

  • Calibration Methods, Eliciting Reasoning
  • Readings: Calibrate Before Use, Chain of Thought Prompting
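Chain-of-thought prompting, one of the Week 4 readings, extends a few-shot exemplar with intermediate reasoning steps before the answer, nudging the model to reason step by step on the new question. A hypothetical sketch (the word problems are invented for illustration):

```python
# Chain-of-thought exemplar: the demonstration shows worked reasoning,
# so the model tends to produce steps before its final answer.
# The arithmetic word problems are hypothetical illustrations.
COT_EXEMPLAR = (
    "Q: A library has 12 shelves with 8 books each. 16 books are on loan. "
    "How many books remain in the library?\n"
    "A: 12 shelves * 8 books = 96 books. 96 - 16 on loan = 80 books. "
    "The answer is 80."
)

def cot_prompt(question: str) -> str:
    # Append the new question after the worked exemplar; the trailing
    # "A:" invites the model to continue with its own reasoning.
    return f"{COT_EXEMPLAR}\n\nQ: {question}\nA:"

print(cot_prompt("A train travels 60 km/h for 2.5 hours. How far does it go?"))
```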

Week 5: Data in LLMs

  • Data Quality and Documentation
  • Readings: Documenting Large Webtext Corpora

Week 6: Industry Applications of LLMs

  • Real-world Use-cases and Challenges
  • Open Discussion and Guest Lecture

Week 7: Bias and Toxicity I

  • Evaluation of Bias and Toxicity
  • Readings: RealToxicityPrompts, OPT paper (Section 4)

Week 8: Midterm Exam

Week 9: Bias and Toxicity II

  • Mitigation Strategies
  • Readings: Self-Diagnosis and Self-Debiasing

Week 10: Scaling LLMs

  • Compute-Optimal Training
  • Readings: Training Compute-Optimal Large Language Models

Week 11: Privacy Concerns

  • Data Extraction Risks
  • Readings: Extracting Training Data from LLMs

Week 12: Alternative Architectures

  • Sparse Models, Retrieval-Based Models
  • Readings: Switch Transformers, Improving Language Models by Retrieving from Trillions of Tokens

Week 13: Human Feedback in LLMs

  • Training with Human Feedback
  • Readings: Training language models to follow instructions with human feedback

Week 14: Code in LLMs

  • Code-Based Language Models
  • Readings: Evaluating Large Language Models Trained on Code

Week 15: Final Project Presentations and Recap

  • Project Presentations and Peer Reviews
  • Course Recap and Concluding Remarks