Hyung Won Chung


Hyung Won Chung (Korean: 정형원) is a South Korean artificial intelligence research scientist recognized for his contributions to the development and scaling of large language models (LLMs). He has held research positions at OpenAI and Google Brain, where he contributed to prominent models and frameworks such as PaLM, Flan-T5, T5X, and OpenAI's o1. [1] [2]

Early Life

Chung is originally from South Korea. He currently resides in Mountain View, California, a key hub for the technology industry. [1]

Education

Hyung Won Chung earned his PhD at the Massachusetts Institute of Technology (MIT). His doctoral training provided the foundation for his subsequent research career in machine learning and artificial intelligence. [2]

Career

Chung began his industry career as a research scientist at Google Brain, where his work centered on overcoming challenges related to the scaling of large AI models. He was a key contributor to T5X, a JAX-based framework designed to facilitate large-scale training of models, and was involved in training major models like the Pathways Language Model (PaLM). His research also significantly advanced the field of instruction fine-tuning, leading to the development of the Flan-PaLM and Flan-T5 model families, which improved the ability of LLMs to follow user instructions. [1]
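The core idea behind instruction fine-tuning, as developed in the Flan line of work, is to reformat existing supervised tasks as natural-language instructions so a model learns to follow requests rather than only complete text. The following sketch illustrates that reformatting; the template and task here are invented for illustration and are not drawn from the actual Flan Collection:

```python
# Illustrative sketch of Flan-style instruction formatting.
# The template and example below are invented, not actual Flan Collection prompts.

def to_instruction_example(premise: str, hypothesis: str, label: str) -> dict:
    """Wrap a raw NLI (premise, hypothesis, label) triple in a
    natural-language instruction suitable for instruction tuning."""
    prompt = (
        f"Premise: {premise}\n"
        f"Hypothesis: {hypothesis}\n"
        "Does the premise entail the hypothesis? Answer yes or no."
    )
    return {"input": prompt, "target": label}

example = to_instruction_example(
    "The cat sat on the mat.", "An animal is on the mat.", "yes"
)
print(example["input"])
print(example["target"])  # yes
```

Applying many such templates across hundreds of tasks yields an instruction-tuning mixture; scaling the number of tasks and model size was a central theme of the Flan papers discussed below.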

In February 2023, Chung transitioned to OpenAI. At OpenAI, his research focused on enhancing the reasoning capabilities of AI systems and developing autonomous agents. He was a foundational contributor to several of the organization's major initiatives, including the o1-preview (September 2024), the full o1 model (December 2024), and the Deep Research project (February 2025). During this time, he also led the training efforts for the Codex mini model, a smaller, specialized version of the code-generation model. [1] [2]

In July 2025, Chung departed OpenAI for a new position as an AI research scientist. He made the move alongside his colleague Jason Wei, with whom he had worked closely at both Google and OpenAI. [4] [5]

Major Works and Publications

Chung has co-authored numerous influential papers in the field of machine learning and natural language processing. His research has been published in top-tier journals and presented at major conferences.

  • Scaling Instruction-Finetuned Language Models (2022): This paper, published in the Journal of Machine Learning Research, systematically explored how scaling various aspects of model development—including model size, number of tasks, and chain-of-thought data—impacts performance. The research demonstrated significant improvements from instruction tuning and led to the release of the Flan-T5 models.
  • PaLM: Scaling Language Modeling with Pathways (2022): As a co-author, Chung contributed to the development of the 540-billion parameter Pathways Language Model (PaLM). The paper detailed how the model, trained on the Pathways system, achieved state-of-the-art few-shot performance across numerous language tasks, showcasing breakthroughs in reasoning, code generation, and translation.
  • Scaling Up Models and Data with t5x and seqio (2022): This work introduced T5X, a modular, JAX-based framework for high-performance training of large-scale Transformer models, and SeqIO, a task-based library for data preprocessing. Chung was a lead author on this paper, which provided the infrastructure for much of the large model research at Google.
  • OpenAI o1 System Card (2024): Chung was a contributor to the official system card for OpenAI's o1 model. This document provides a comprehensive overview of the model's capabilities, performance benchmarks, limitations, and the safety protocols implemented during its development.
  • GPT-4 Technical Report (2023): He was part of the team that produced the technical report for GPT-4. The report detailed the multimodal model's architecture, training process, and its substantially improved performance over previous generations on a wide array of professional and academic benchmarks.
  • Large Language Models Encode Clinical Knowledge (2023): Published in Nature, this research investigated the potential of LLMs in the medical domain. The study found that models like Flan-PaLM could achieve high accuracy on medical competency exams and provide coherent, long-form answers to clinical questions.
  • The Flan Collection: Designing Data and Methods for Effective Instruction Tuning (2023): Presented at the International Conference on Machine Learning (ICML), this paper described the creation and design of the "Flan Collection," a large dataset of tasks formatted as instructions. The work detailed the methods used to scale up instruction tuning and was foundational to the Flan-T5 models.
  • UniMax: Fairer and More Effective Language Sampling for Large-Scale Multilingual Pretraining (2023): This paper, presented at the International Conference on Learning Representations (ICLR), proposed a new data sampling method to improve the performance and fairness of multilingual language models by balancing data representation across different languages.

These publications highlight Chung's focus on model scaling, instruction tuning, and the practical application of large language models. [1]
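The UniMax sampling idea can be sketched in a few lines: spread a fixed training budget as uniformly as possible across languages, while capping each language at a maximum number of epochs over its corpus so that low-resource languages are not excessively repeated. The function below is a simplified reconstruction of that allocation scheme, not the paper's reference implementation:

```python
# Simplified reconstruction of UniMax-style budget allocation.
# Budget is split as uniformly as possible across languages, but each
# language is capped at `epoch_cap` passes over its own corpus.

def unimax_allocations(corpus_sizes, total_budget, epoch_cap):
    """Return per-language training budgets (same order as corpus_sizes)."""
    # Process smallest corpora first: they are the ones that hit the epoch cap.
    order = sorted(range(len(corpus_sizes)), key=lambda i: corpus_sizes[i])
    alloc = [0.0] * len(corpus_sizes)
    remaining_budget = float(total_budget)
    remaining_langs = len(corpus_sizes)
    for i in order:
        uniform_share = remaining_budget / remaining_langs
        # A language gets its uniform share unless that exceeds its epoch cap.
        alloc[i] = min(uniform_share, epoch_cap * corpus_sizes[i])
        remaining_budget -= alloc[i]
        remaining_langs -= 1
    return alloc

print(unimax_allocations([100, 1000, 10000], 3000, 2))
# [200.0, 1400.0, 1400.0]
```

In this toy run, the smallest corpus is capped at two epochs (200 units), and the leftover budget is redistributed evenly across the larger languages, which is the fairness property the paper targets.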

Public Speaking and Lectures

Chung frequently shares his research and insights with the broader academic and technical communities through invited lectures and seminars at universities. His presentations cover topics such as the evolution of large language models, the principles of instruction fine-tuning, Reinforcement Learning from Human Feedback (RLHF), and high-level perspectives on paradigm shifts in AI research. He has delivered talks at institutions including:

  • Stanford University (for the CS 25 course)
  • Massachusetts Institute of Technology (MIT Embodied Intelligence seminar)
  • Seoul National University
  • New York University (for the CSCI 2590 course)
  • Cornell University

These lectures are often made publicly available and serve as educational resources for students and researchers in the field. [1] [3]

REFERENCES
