Haozhu Wang's webpage

Haozhu Wang

I'm an AI researcher building superintelligence at Meta Superintelligence Labs. My research currently focuses on reinforcement learning, reasoning, and alignment. Before joining Meta, I spent three years at AWS Bedrock working on LLM research and building GenAI cloud services.

I earned my Ph.D. degree in Electrical and Computer Engineering (Machine Learning track) from the University of Michigan, Ann Arbor, where I pioneered reinforcement learning and foundation model research for inverse design problems, a key area in AI4Science. I also hold dual B.S. degrees in Electrical Engineering and Optics from a joint program between Tianjin University and Nankai University, China.

LinkedIn / Google Scholar

News

[May-2025] We presented our work Hierarchical Prompt Decision Transformer: Improving Few-Shot Policy Generalization with Global and Adaptive Guidance at WWW'25.

[Apr-2025] We released Llama 4! I'm a core contributor on safety and value alignment. [Launch Post].

[Nov-2024] We presented our work LaRS: Latent Reasoning Skills for Chain-of-Thought Reasoning at EMNLP Findings 2024.

[Jul-2024] Our work OptoGPT, a foundation model for optical inverse design, has been published as a cover article in Opto-Electronic Advances (IF: 14.1). The work has been reported in over 15 news outlets [News1] [News2] [News3].

[Dec-2023] Our work Graph Neural Prompting with Large Language Models was accepted to AAAI-24.

[Mar-2022] I joined AWS as a Research Scientist!

Selected Publications/Projects

The full list of my publications can be found on Google Scholar.

	The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation [LLMs, reinforcement learning] Llama Team 2025 Launch post Llama 4 family of models. I'm a core contributor to alignment post-training.
	Hierarchical Prompt Decision Transformer: Improving Few-Shot Policy Generalization with Global and Adaptive Guidance [reinforcement learning, foundation model] Zhe Wang, Haozhu Wang, Yanjun Qi WWW'25 paper We developed a hierarchical prompting method for transformer-based reinforcement learning models to enable efficient few-shot policy adaptation.
	LaRS: Latent Reasoning Skills for Chain-of-Thought Reasoning [LLMs, reasoning] Zifan Xu, Haozhu Wang, Dmitriy Bespalov, Xian Wu, Peter Stone, Yanjun Qi EMNLP Findings, 2024 paper We developed an unsupervised method for discovering latent skills to guide demonstration selection for in-context learning with large language models.
	Graph Neural Prompting with Large Language Models [LLMs] Yijun Tian, Huan Song, Zichen Wang, Haozhu Wang, Ziqing Hu, Fang Wang, Nitesh V. Chawla, Panpan Xu AAAI, 2024 arxiv A knowledge graph prompting method for large language models to improve their commonsense and biomedical reasoning performance.
	A Review of Reinforcement Learning for Natural Language Processing, and Applications in Healthcare [reinforcement learning, LLMs, ML for healthcare] Ying Liu, Haozhu Wang, Huixue Zhou, Mingchen Li, Yu Hou, Sicheng Zhou, Fang Wang, Rama Hoetzlein, Rui Zhang Journal of the American Medical Informatics Association, 2024 paper A comprehensive review of reinforcement learning applied to NLP and its healthcare applications.
	OptoGPT: A Foundation Model for Inverse Design in Optical Multilayer Thin Film Structures [AI for science, foundation model] Taigao Ma, Haozhu Wang, L. Jay Guo Opto-Electronic Advances, 2024 paper We developed OptoGPT, the first foundation model for optical thin film structure inverse design. After being trained on a large dataset of 10 million optical thin film designs, OptoGPT demonstrates remarkable capabilities including: 1) autonomous global design exploration, 2) efficient designs for various tasks, 3) the ability to output diverse designs, and 4) seamless integration of user-defined constraints. We believe OptoGPT is a major leap towards accelerating optical science with foundation models.
	Reinforcement Learning-Enabled Environmentally Friendly and Multi-functional Chrome-looking Plating [AI for science, reinforcement learning] Taigao Ma, Anwesha Saha, Haozhu Wang, L. Jay Guo NeurIPS AI for Science Workshop, 2023 [Oral, selection rate: 10/150=6.7%] OpenReview Using reinforcement learning, we designed and fabricated two multilayer thin film structures that mimic the visual appearance of decorative chrome plating, serving as an environmentally friendly and multi-functional replacement.
	Dynamic prediction of work status for workers with occupational injuries: assessing the value of longitudinal observations [ML for healthcare] Erkin Ötleş, Jon Seymour, Haozhu Wang, Brian T Denton Journal of the American Medical Informatics Association, 2022 paper We developed a forecasting model to predict return-to-work after occupational injuries based on longitudinal claim data. The model may allow case managers to better allocate medical resources and help speed up patients' recovery process.
	NEUTRON: Neural Particle Swarm Optimization for Material-Aware Inverse Design of Structural Color [AI for science] Haozhu Wang, L. Jay Guo iScience, 2022 paper/ code We propose a hybrid machine learning and optimization method that combines mixture density networks and particle swarm optimization for accurate and efficient structural color inverse design.
	Benchmarking Deep Learning-based Models on Nanophotonic Inverse Design Problems [AI for science] Taigao Ma, Mustafa Tobah, Haozhu Wang*, L. Jay Guo* Opto-Electronic Science, 2022 (*: correspondence) paper We provide extensive benchmarking results on the accuracy, diversity, and robustness of commonly used deep learning models in nanophotonic inverse design. The findings can help researchers select models that best suit their design problems.
	Automated Optical Multi-layer Design via Deep Reinforcement Learning [AI for science, reinforcement learning] Haozhu Wang, Zeyu Zheng, Chengang Ji, L. Jay Guo Machine Learning: Science and Technology, 2021 paper/ code/ abridged NeurIPS workshop version Training a novel sequence generation network with Proximal Policy Optimization for automatically discovering near-optimal optical designs.
	Learning Credible Models [Trustworthy AI] Jiaxuan Wang, Jeeheh Oh, Haozhu Wang, Jenna Wiens KDD, 2018 paper/ code Expert-yielded-estimates regularizer for incorporating expert knowledge into linear models.
	Learning to Share: Simultaneous Parameter Tying and Sparsification in Deep Learning [model compression] Dejiao Zhang, Haozhu Wang*, Mario A.T. Figueiredo, Laura Balzano ICLR, 2018 (*: equal contribution) paper/ code Group-ordered-weighted lasso (GrOWL) for deep model compression.

The source code of this website is from Jon Barron.

(last update: 07/2025)