Publications

# corresponding author; * equal contribution; + project lead / mentorship.

Conference Papers


Order Matters in Retrosynthesis: Structure-aware Generation via Reaction-Center-Guided Discrete Flow Matching

Authors: Chenguang Wang*, Zihan Zhou*, Lei Bai, Tianshu Yu#

Published in ICML 2026

This paper introduces a structure-aware framework for retrosynthesis that encodes chemical reactions’ two-stage nature as a positional inductive bias: placing reaction center atoms at the sequence head transforms implicit chemical knowledge into explicit patterns the model can learn. Combined with a graph diffusion transformer backbone and discrete flow matching, the approach achieves state-of-the-art performance on USPTO-50k (61.2%) and USPTO-Full (51.3%) with 6× faster training and 25× fewer sampling steps than prior diffusion methods. Notably, a 280K-parameter model with proper ordering matches a 65M-parameter model without it, demonstrating that well-designed inductive biases outperform brute-force scaling for efficient molecular design.

Incomplete Data, Complete Dynamics: A Diffusion Approach

Authors: Zihan Zhou, Chenguang Wang, Hongyi Ye, Yongtao Guan, Tianshu Yu#

Published in ICLR 2026

This paper tackles the challenge of learning physical dynamics from incomplete observational data, a fundamental constraint in real-world scientific applications where complete measurements are unavailable. The proposed diffusion-based framework learns directly from partial observations through strategic context-query partitioning that adapts to the underlying observation patterns. Theoretical analysis proves asymptotic convergence to the true complete data distribution, while empirical evaluations on synthetic PDEs and ERA5 climate data demonstrate substantial improvements over baselines, particularly in sparse observation regimes (1-20% coverage). This work provides a principled approach for imputing partially observed dynamics with strong theoretical guarantees and practical applicability.

TEMPO: Temporal Multi-scale Autoregressive Generation of Protein Conformational Ensembles

Authors: Yaoyao Xu, Di Wang, Zihan Zhou, Tianshu Yu#, Mingchen Chen#

Published in NeurIPS 2025

This paper addresses the challenge of generating temporally coherent and physically realistic protein conformational ensembles by explicitly modeling the multi-scale nature of protein dynamics. While existing diffusion-based approaches generate conformational states independently and fail to capture causal dependencies in protein motion, the proposed TEMPO framework introduces a hierarchical autoregressive architecture that models dynamics as a Markovian stochastic process. The method decomposes motions into two temporal scales: a low-resolution model capturing slow collective transitions, and a high-resolution model generating detailed local fluctuations conditioned on these large-scale movements. Comprehensive evaluations on mdCATH and ATLAS datasets demonstrate superior performance in structural accuracy, temporal coherence, and computational efficiency, highlighting its potential for efficient and physically grounded protein dynamics simulation.

Generating Physical Dynamics under Priors

Authors: Zihan Zhou, Xiaoxue Wang, Tianshu Yu#

Published in ICLR 2025

This paper addresses the challenge of generating physically feasible dynamics in a data-driven context by incorporating physical priors into diffusion-based generative models. While traditional generative approaches often fail to enforce fundamental physical laws, the proposed framework integrates two types of priors: distributional priors, such as roto-translational invariance, and physical feasibility priors, including conservation laws and PDE constraints. By embedding these priors into the generative process, the method efficiently produces realistic physical dynamics, such as trajectories and flows. Empirical evaluations demonstrate its effectiveness across various physical systems, highlighting its potential for advancing AI-driven scientific modeling.

Learning to Decouple Complex Systems

Authors: Zihan Zhou, Tianshu Yu#

Published in ICML 2023

This work addresses the challenge of learning from cluttered and irregularly sampled sequential data by proposing a novel decoupling-based approach. The method explicitly separates a complex system into multiple latent sub-systems and a meta-system that captures their interactions over time. To achieve this, the interactions are modeled using projected differential equations (ProjDEs) with neural-friendly projection operators inspired by Bregman divergence. Experimental results on both synthetic and real-world datasets demonstrate the effectiveness of this approach in handling complex and cluttered sequential data.

Preprints


AMix-2: Establishing Protein as a Native Modality in Large Language Models

Authors: Keyue Qiu*, Yixin Wu*, Zihan Zhou, Changze Lv, Lihao Wang, et al., Hao Zhou# (listed: model core, co-first, and corresponding authors)

Published in arXiv

We present AMix-2, a protein–text foundation model that establishes protein as a native modality in large language models, unifying protein understanding and conditional sequence design in a single model. Natural language and protein sequences share a token space for instruction-driven biological reasoning and design, while a block-wise diffusion language modeling backbone combines causal generation across blocks with bidirectional refinement within blocks—better aligned with non-local protein dependencies than strict left-to-right factorization. We also introduce ProteinArena, a time-aware and homology-aware benchmark spanning QA, EC/CATH classification, and function-conditioned design. On ProteinArena, AMix-2 outperforms frontier LLMs and is competitive with task-specific protein models; controlled studies further show the diffusion paradigm surpasses its autoregressive counterpart.

Observation-Aligned Mask Priors for Learning Physical Dynamics from Authentic Occlusions

Authors: Chiyuan Ma, Zihan Zhou+, Tianshu Yu#

Published in arXiv

Learning physical dynamics from incomplete observations is difficult when authentic occlusions are structured, sample-dependent, and often missing not at random, while existing context-query methods typically rely on heuristic masking that mismatches real sensing topologies. We propose Observation-Aligned Mask Priors: a Bayesian Flow Network is pretrained on binary observation masks to capture authentic occlusion patterns, then mask sampling is guided by a globally normalized cross-entropy objective to build sample-specific context-query partitions. Intersection-based partitioning assigns every valid observed dimension a strictly positive query probability, eliminating zero-query dead zones and local generative collapse. On three real-world oceanographic datasets with genuine satellite occlusions, at resolutions up to 256×256, our method consistently improves over strong diffusion baselines in MSE and PSNR without requiring fully observed training fields.

Physics-Mediated Diffusion: Leveraging Intermediate Field Representations for Sparse PDE Inversion

Authors: Zihan Zhou*, Chiyuan Ma*, Tianshu Yu#

Published in arXiv

Inferring physical parameters from sparse observations is a severely ill-posed inverse problem: multiple parameter configurations can fit the same sparse data (equifinality), and optimization-based methods often stall in complex loss landscapes. We propose Physics-Mediated Diffusion, which treats the complete spatiotemporal field as a learned intermediate representation. A conditional diffusion model imposes a strong structural prior on plausible fields, and parameters are extracted via a differentiable operator derived from the governing PDE algebra (e.g., diffusivity as the ratio of temporal to spatial derivatives), ensuring end-to-end consistency without a full forward solver. We prove that recovery error is governed by the spectral properties of the extraction operator, and experiments show substantial gains over baselines in parameter recovery, especially for well-conditioned systems under extreme sparsity.

On Diffusion Process in SE(3)-invariant Space

Authors: Zihan Zhou, Ruiying Liu, Jiachen Zheng, Xiaoxue Wang, Tianshu Yu#

Published in arXiv

This paper analyzes the diffusion process in SE(3)-invariant space using differential geometry and proposes projection-free diffusion SDEs and ODEs. These formulations improve sampling efficiency and accuracy, benefiting applications such as molecular conformation generation and human pose generation.

Molecule Conformation Generation via Shifting Scores

Authors: Zihan Zhou, Ruiying Liu, Chaolong Ying, Ruimao Zhang, Tianshu Yu#

Published in arXiv

This paper introduces SDDiff, a diffusion-based model for molecular conformation generation that operates on inter-atomic distances to ensure SE(3)-equivariance. Instead of assuming a Gaussian distribution for distance perturbations, SDDiff derives a shifting score function based on molecular thermodynamics, modeling how the distribution of inter-atomic distance changes shifts from Gaussian to Maxwell-Boltzmann as the noise level increases. This formulation provides a more physically grounded way to reverse the diffusion process and yields more feasible molecular geometries.
