Building and Deploying Large Language Model Applications Efficiently and Verifiably

This is a past event.

Thursday, March 21, 2024 4pm to 5pm

Science and Engineering Complex (SEC), SEC LL2.224

Applications of large language models (LLMs) are becoming increasingly complex and diverse, and they need efficient, reliable frameworks for building and deploying them. In this talk, I will begin with algorithms and systems for serving LLMs for everyone (FlexGen, S-LoRA, VTC), highlighting the growing trend toward personalized LLM services. My work addresses the need to run LLMs locally for isolated individual use. It also tackles efficiency and service fairness when resources must be shared among many users. Once deployment is efficient, a primary concern is the reliability of generation. The second part of the talk addresses this issue by exploring verifiable code generation: I adopt tools from formal verification to help LLMs generate correctness certificates alongside other artifacts (Clover). Finally, I will touch on future research avenues, such as integrating formal methods with LLMs and developing programming systems for generative AI.

Event Details