
Tongyuan Bai
Ph.D. Student
ICL Group, School of Artificial Intelligence
Jilin University
I am a Ph.D. student working at the intersection of artificial intelligence and computer graphics. My current work focuses on generating 3D spatial layouts for indoor and outdoor scenes using diffusion models and large language models. My research vision is to automate the creation of diverse 3D scenes, particularly those that exist solely in the realm of human imagination, contributing to the development of richly immersive and limitless virtual worlds.
📧baity23@mails.jlu.edu.cn
📍Changchun, China
About Me
My background and expertise
Research Interests
Education
Bachelor
Automation
Tianjin University
Graduated in 2018
Master
Control Engineering
Dalian University of Technology
2020 - 2023
Ph.D. Candidate
Artificial Intelligence
Jilin University
2023 - Present
Work Experience
FPGA Radar Engineer
Worked on FPGA-based radar systems development and signal processing.
Leijiu Technology Co., Ltd.
2018 - 2019
Backend Engineer (Intern)
Interned with the Overseas Recommendation System team, responsible for backend development.
ByteDance-Data-AML
Jun 2022 - Sep 2022
Publications
My research contributions and academic publications

FreeScene: Mixed Graph Diffusion for 3D Scene Synthesis from Free Prompts
CVPR 2025
CCF-A · Conference · Accepted
Tongyuan Bai, Wangyuanfan Bai, Dong Chen, Tieru Wu, Manyi Li, Rui Ma
Controllability plays a crucial role in the practical applications of 3D indoor scene synthesis. Existing works either allow coarse language-based control, which is convenient but lacks fine-grained scene customization, or employ graph-based control, which offers better controllability but demands considerable knowledge for the cumbersome graph design process. To address these challenges, we present FreeScene, a user-friendly framework that enables both convenient and effective control for indoor scene synthesis. Specifically, FreeScene supports free-form user inputs including text descriptions and/or reference images, allowing users to express versatile design intentions. The user inputs are adequately analyzed and integrated into a graph representation by a VLM-based Graph Designer. We then propose MG-DiT, a Mixed Graph Diffusion Transformer, which performs graph-aware denoising to enhance scene generation. Our MG-DiT not only excels at preserving graph structure but also offers broad applicability to various tasks, including, but not limited to, text-to-scene, graph-to-scene, and rearrangement, all within a single model. Extensive experiments demonstrate that FreeScene provides an efficient and user-friendly solution that unifies text-based and graph-based scene synthesis, outperforming state-of-the-art methods in terms of both generation quality and controllability in a range of applications.
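To give a flavor of what "graph-aware denoising" means, here is a minimal, hypothetical sketch (not the released MG-DiT implementation): per-object layout tokens are denoised by a transformer block whose attention scores are biased by learned embeddings of the scene-graph edge types, so related objects attend to each other more strongly. All names and dimensions below are illustrative assumptions.

```python
# Hypothetical sketch of graph-aware denoising, in the spirit of MG-DiT.
# Assumption: each object in the scene is a token; edge_types[i, j] is an
# integer id for the relation between objects i and j (0 = "no relation").
import torch
import torch.nn as nn

class GraphAwareDenoiser(nn.Module):
    def __init__(self, d_model=64, n_heads=4, n_edge_types=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.edge_bias = nn.Embedding(n_edge_types, 1)   # per-edge-type attention bias
        self.t_embed = nn.Sequential(nn.Linear(1, d_model), nn.SiLU())
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))

    def forward(self, x, edge_types, t):
        # x: (B, N, d) noisy per-object tokens; edge_types: (B, N, N) ints; t: (B,)
        h = x + self.t_embed(t.float().view(-1, 1, 1))          # inject timestep
        bias = self.edge_bias(edge_types).squeeze(-1)           # (B, N, N)
        mask = bias.repeat_interleave(self.attn.num_heads, dim=0)  # (B*H, N, N)
        a, _ = self.attn(h, h, h, attn_mask=mask)               # graph-biased attention
        h = h + a
        return h + self.ff(h)                                   # predicted noise

# Usage: 2 scenes, 5 objects each, at diffusion step t = 10.
x_t = torch.randn(2, 5, 64)
edges = torch.randint(0, 8, (2, 5, 5))
eps_hat = GraphAwareDenoiser()(x_t, edges, torch.tensor([10, 10]))
```

Because the graph enters only as an additive attention bias, the same network can also run with an empty graph (all edges set to "no relation"), which is one plausible way a single model could cover text-to-scene, graph-to-scene, and rearrangement.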

SigStyle: Signature Style Transfer via Personalized Text-to-Image Models
AAAI 2025
CCF-A · Conference · Accepted
Ye Wang, Tongyuan Bai, Xuping Xie, Zili Yi, Yilin Wang, Rui Ma
Style transfer enables the seamless integration of artistic styles from a style image into a content image, resulting in visually striking and aesthetically enriched outputs. Despite numerous advances in this field, existing methods have not explicitly focused on the signature style, which represents the distinct and recognizable visual traits of an image, such as geometric and structural patterns, color palettes, and brush strokes. In this paper, we introduce SigStyle, a framework that leverages the semantic priors embedded in a personalized text-to-image diffusion model to capture the signature style representation. This style capture process is powered by a hypernetwork that efficiently fine-tunes the diffusion model for any given single style image. Style transfer is then conceptualized as the reconstruction of the content image through learned style tokens from the personalized diffusion model. Additionally, to ensure content consistency throughout the style transfer process, we introduce a time-aware attention swapping technique that incorporates content information from the original image into the early denoising steps of target image generation. Beyond enabling high-quality signature style transfer across a wide range of styles, SigStyle supports multiple interesting applications, such as local style transfer, texture transfer, style fusion, and style-guided text-to-image generation. Quantitative and qualitative evaluations demonstrate that our approach outperforms existing style transfer methods in recognizing and transferring signature styles.
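The scheduling logic behind "time-aware attention swapping" can be illustrated with a small, hypothetical sketch (a simplification, not the SigStyle implementation): during the early, high-noise denoising steps the self-attention probabilities cached from inverting the content image are swapped in, preserving structure; later steps use the style branch's own attention. The function names and the swap_until threshold below are illustrative assumptions.

```python
# Hypothetical sketch of time-aware attention swapping (simplified).
import torch

def attention(q, k, v, override_probs=None):
    probs = torch.softmax(q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5, dim=-1)
    if override_probs is not None:
        probs = override_probs            # swap in the content image's attention map
    return probs @ v, probs

def denoise_with_swap(steps, swap_until, content_attn, q, k, v):
    # content_attn[t]: attention probs cached while inverting the content image
    outputs = []
    for t in reversed(range(steps)):      # t = steps-1 ... 0 (high noise -> low)
        use_swap = t >= swap_until        # only the early, structure-defining steps
        out, _ = attention(q, k, v, content_attn[t] if use_swap else None)
        outputs.append(out)
    return outputs

# Usage: 50 denoising steps, swapping during the first 15 (t >= 35).
q = k = v = torch.randn(1, 16, 32)
cached = {t: torch.softmax(torch.randn(1, 16, 16), dim=-1) for t in range(50)}
outs = denoise_with_swap(steps=50, swap_until=35, content_attn=cached, q=q, k=k, v=v)
```

Restricting the swap to early steps is the key design point: coarse layout is decided when noise is high, while texture and style emerge in the later steps, which are left untouched.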

Feature Fusion Deep Reinforcement Learning Approach for Stock Trading
CCC 2022
EI · Conference · Accepted
Tongyuan Bai, Qi Lang, Shifan Song, Yan Fang, Xiaodong Liu
This paper presents a novel feature fusion deep reinforcement learning approach for stock trading. The proposed method combines multiple feature extraction techniques with deep reinforcement learning algorithms to improve trading decision-making in financial markets. By integrating various market indicators and technical features, the approach aims to capture complex market dynamics and enhance trading performance through intelligent automated trading strategies.
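Since the abstract only outlines the idea at a high level, here is a minimal, hypothetical sketch of one way such feature fusion could look (entirely an assumption, not the paper's architecture): raw price windows and technical indicators are encoded by separate branches and concatenated before a Q-value head over trading actions.

```python
# Hypothetical sketch of feature fusion for a DQN-style trading policy.
# Assumptions: 30-day price window, 10 technical indicators, 3 actions.
import torch
import torch.nn as nn

class FusionPolicy(nn.Module):
    def __init__(self, price_dim=30, indicator_dim=10, n_actions=3):
        super().__init__()
        self.price_enc = nn.Sequential(nn.Linear(price_dim, 64), nn.ReLU())
        self.ind_enc = nn.Sequential(nn.Linear(indicator_dim, 64), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(128, 64), nn.ReLU(),
                                  nn.Linear(64, n_actions))  # buy / hold / sell

    def forward(self, prices, indicators):
        # Encode each feature family separately, then fuse by concatenation.
        fused = torch.cat([self.price_enc(prices), self.ind_enc(indicators)], dim=-1)
        return self.head(fused)            # Q-values over trading actions

# Usage: greedy action for one state.
q_values = FusionPolicy()(torch.randn(1, 30), torch.randn(1, 10))
action = q_values.argmax(dim=-1)
```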
News
Latest academic updates and research progress
Our paper accepted to CVPR 2025
Our paper 'FreeScene: Mixed Graph Diffusion for 3D Scene Synthesis from Free Prompts' has been accepted to CVPR 2025!
Our paper accepted to AAAI 2025
Our paper 'SigStyle: Signature Style Transfer via Personalized Text-to-Image Models' has been accepted to AAAI 2025!