Schedule
-
Event | Date | Description | Course Material
-
Lecture | Monday, January 26, 2026 | Introduction to Senior Project I. | [slides]
-
Lecture | Wednesday, January 28, 2026 | Overview of course expectations, deliverables, and grading criteria. | [slides]
-
Lecture | Monday, February 2, 2026 | Overview of research methods, how to read research papers, and an introduction to AI evaluation metrics (Precision, Recall, F-Measure). | [slides], Suggested Readings
-
Lecture | Wednesday, February 4, 2026 | Introduction to Single-Agent vs. Multi-Agent systems, GridWorld examples, Markov Decision Processes (MDP), and the curse of dimensionality in multi-agent environments. | [slides]
-
Lecture | Monday, February 9, 2026 | Overview of Generalizability in AI Agents versus simple memorization. The lecture covers Reward Modeling strategies, including the trade-offs between Dense (hackable) and Sparse (slow) rewards, as well as more advanced techniques such as Curriculum Learning and Inverse Reinforcement Learning (IRL). | [slides]
-
Lecture | Wednesday, February 11, 2026 | Overview of the hierarchy of decision-making models, from simple MDPs to Partially Observable Stochastic Games (POSG). The lecture also explores why AI agents need distributed processing, covering Data versus Model Parallelism and the differences between Synchronous and Asynchronous execution. | [slides]
-
Lecture | Wednesday, February 18, 2026 | Exploration of the physical limits on AI agents: latency, bandwidth, and throughput. The lecture contrasts Cloud and Edge architectures and discusses engineering solutions for system constraints, such as model compression, quantization, and activation offloading, particularly in latency-critical scenarios like autonomous braking. | [slides]
-
Lecture | Monday, February 23, 2026 | The Transformer architecture and its applications in Deep Reinforcement Learning (RL) and Multi-Agent Deep Reinforcement Learning (MARL). The lecture covers training and fine-tuning AI models, the Bellman Equation, Q-Learning, and Deep Q-Networks, and their roles in decision-making. | [slides]
-
Lecture | Wednesday, February 25, 2026 | Walkthroughs of DQL and Experience Buffers, and a comparison of Q-Learning with REINFORCE. | [slides]
-
Lecture | Monday, March 2, 2026 | Discussion of Sample Complexity, Generalizability, Independent Q-Learning (IQL), Actor-Critic methods, Multi-Agent Deep Deterministic Policy Gradient (MADDPG), and Imitation Learning methods (Behavioral Cloning, DAgger) in the context of Reinforcement Learning (RL) and Multi-Agent Deep Reinforcement Learning (MARL). | [slides]
-
Lecture | Wednesday, March 4, 2026 | Decision-Making Algorithms for AI Agents, including AlphaGo (RL + MCTS), TRPO, PPO, DPO, GRPO, StrateGo, and Intelligent AI Delegation. | [slides]
-
Lecture | Monday, March 9, 2026 | Discussion of Robustness, Consistency, and Risk Mitigation in AI Agents, Offline Reinforcement Learning, and Inverse Reinforcement Learning (IRL) in the context of Reinforcement Learning (RL) and Multi-Agent Deep Reinforcement Learning (MARL). | [slides]
-
Lecture | Wednesday, March 11, 2026 | Discussion of Model Predictive Control (MPC): real-time constraints, mistake correction, and daily control tasks by AI Agents, in the context of Reinforcement Learning (RL) and Multi-Agent Deep Reinforcement Learning (MARL). | [slides]
-
Lecture | Monday, March 16, 2026 | Discussion of Game Theory for AI Agents in the context of Reinforcement Learning (RL) and Multi-Agent Deep Reinforcement Learning (MARL). | [slides]

