Heng Fan - Selected Publications

For the full publication list, please see my Google Scholar profile.
(*equal contribution, †equal advising and co-last authors)

2026 / in press

	Towards Long-Form Spatio-Temporal Video Grounding X. Gu, B. Fan, J. Yao, Z. Zhang, Y. Huang, C. Han, H. Fan†, and L. Zhang† European Conference on Computer Vision (ECCV), 2026. paper code (coming)
	Learning to Segment Liquids in Real-world Images J. Li, M. Li, L. Liu, X. Yuan, and H. Fan IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2026. paper code
	ProMCP: Profiling Token Flows and Latency Costs in Model Context Protocol–Based LLM Agents S. Anjum, W. Zheng, R. Kettimuthu, H. Fan, and Y. Feng Findings of the Annual Meeting of the Association for Computational Linguistics (ACL Findings), 2026. paper code
	DMTrack: Spatio-Temporal Multimodal Tracking via Dual-Adapter W. Li, S. Dong, H. Lu, Y. Zhang, H. Fan†, and L. Zhang† IEEE International Conference on Robotics and Automation (ICRA), 2026. paper code
	OmniSTVG: Toward Spatio-Temporal Omni-Object Video Grounding J. Yao, X. Gu, X. Deng, M. Dai, B. Fan, Z. Zhang, Y. Huang, H. Fan†, and L. Zhang† International Conference on Learning Representations (ICLR), 2026. paper code-data
	IRDFusion: Iterative Relation-Map Difference guided Feature Fusion for Multispectral Object Detection J. Shen, H. Zhan, X. Zuo, H. Fan, X. Yuan, J. Li, and W. Yang Pattern Recognition (PR), 176: 113189, 2026. paper code
	Harmful Factuality: LLMs Correcting What They Shouldn't M. Li, H. Zhang, H. Fan, J. Ding, and Y. Feng Findings of the European Chapter of the Association for Computational Linguistics (EACL Findings), 2026. paper code
	Structured Context Learning for Generic Event Boundary Detection X. Gu, C. Li, X. Wang, D. Hong, L. Zhang, T. Luo, L. Wen, and H. Fan IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2026. paper

2025

	PlanarTrack: A High-quality and Challenging Benchmark for Large-scale Planar Object Tracking Y. Jiao, X. Liu, X. Liu, X. Yuan, H. Fan, and L. Zhang Computer Vision and Image Understanding (CVIU), 306: 130848, 2025. paper data
	Robust Ego-Exo Correspondence with Long-Term Memory Y. Hu, B. Fan, X. Gu, H. Ren, D. Liu, H. Fan†, and L. Zhang† Advances in Neural Information Processing Systems (NeurIPS), 2025. paper code
	LoRATv2: Enabling Low-Cost Temporal Modeling in One-Stream Trackers L. Lin, H. Fan, Z. Zhang, Y. Huang, Y. Wang, Y. Xu, and H. Ling Advances in Neural Information Processing Systems (NeurIPS), 2025. (Spotlight) paper code
	All You Need is One: Capsule Prompt Tuning with a Single Vector Y. Liu, J. Liang, H. Fan, W. Yang, Y. Cui, X. Han, L. Huang, D. Liu, Q. Wang, and C. Han Advances in Neural Information Processing Systems (NeurIPS), 2025. paper
	DP-GTR: Differentially Private Prompt Protection via Group Text Rewriting M. Li, H. Fan, S. Fu, J. Ding, and Y. Feng Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP Findings), 2025. paper code
	PRVQL: Progressive Knowledge-guided Refinement for Robust Egocentric Visual Query Localization B. Fan, Y. Feng, Y. Tian, J. Liang, Y. Lin, Y. Huang, and H. Fan IEEE/CVF International Conference on Computer Vision (ICCV), 2025. paper code
	GSOT3D: Towards Generic 3D Single Object Tracking in the Wild Y. Jiao, Y. Li, J. Ding, Q. Yang, S. Fu, H. Fan†, and L. Zhang† IEEE/CVF International Conference on Computer Vision (ICCV), 2025. paper code-data
	Attention to Trajectory: Trajectory-Aware Open-Vocabulary Tracking Y. Li, Y. Jiao, D. Meng, H. Fan†, and L. Zhang† IEEE/CVF International Conference on Computer Vision (ICCV), 2025. paper code
	Edge-Aware Token Halting for Efficient and Accurate Medical Image Segmentation Y. Guo, B. Song, H. Fan, and E. Cheng International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2025. paper code
	Efficient and Accurate Low-Resolution Transformer Tracking S. Dong, Y. Feng, J. Liang, Q. Yang, Y. Lin, and H. Fan IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025. (Oral) paper code
	G3CN: Gaussian Topology Refinement Gated Graph Convolutional Network for Skeleton-Based Action Recognition H. Ren, Z. Luo, H. Fan, X. Yuan, G. Wang, and L. Zhang IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025. (Oral) paper code
	High-Fidelity Image Inpainting with Multimodal Guided GAN Inversion L. Zhang, Y. Yu, J. Yao, and H. Fan International Journal of Computer Vision (IJCV), 133: 5788-5805, 2025. paper
	DAM: Dynamic Attention Mask for Long-Context Large Language Model Inference Acceleration H. Zhang, H. Fan, K. Sha, Y. Huang, and Y. Feng Findings of the Annual Meeting of the Association for Computational Linguistics (ACL Findings), 2025. paper code
	CorrBEV: Multi-View 3D Object Detection by Correlation Learning with Multi-modal Prototypes Z. Xue, M. Guo, H. Fan, S. Zhang, and Z. Zhang IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025. paper
	LaMOT: Language-Guided Multi-Object Tracking Y. Li, X. Liu, L. Liu, H. Fan†, and L. Zhang† IEEE International Conference on Robotics and Automation (ICRA), 2025. paper code
	CGTrack: Cascade Gating Network with Hierarchical Feature Aggregation for UAV Tracking W. Li, X. Liu, H. Fan†, and L. Zhang† IEEE International Conference on Robotics and Automation (ICRA), 2025. paper code
	The Devil is in the Quality: Exploring Informative Samples for Semi-Supervised Monocular 3D Object Detection Z. Zhang, Z. Li, H. Wang, H. Yuan, K. Wang, and H. Fan IEEE International Conference on Robotics and Automation (ICRA), 2025. paper
	Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding X. Gu, Y. Shen, C. Luo, T. Luo, Y. Huang, Y. Lin, H. Fan†, and L. Zhang† International Conference on Learning Representations (ICLR), 2025. (Oral) paper slide poster code
	AttMOT: Improving Multiple-Object Tracking by Introducing Auxiliary Pedestrian Attributes Y. Li, Z. Xiao, L. Yang, D. Meng, X. Zhou, H. Fan, and L. Zhang IEEE Transactions on Neural Networks and Learning Systems (T-NNLS), 36(3): 5454-5468, 2025. paper code

2024

	VastTrack: Vast Category Visual Object Tracking L. Peng, J. Gao, X. Liu, W. Li, S. Dong, Z. Zhang, H. Fan†, and L. Zhang† Advances in Neural Information Processing Systems (NeurIPS*), 2024. paper poster code-benchmark
	Optical Flow as Spatial-Temporal Attention Learners Y. Lu, C. Han, Q. Wang, H. Fan, Z. Kong, D. Liu, and Y. Chen IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 46(12): 11491-11506, 2024. paper
	Cyclic Refiner: Object-Aware Temporal Representation Learning for Multi-View 3D Detection and Tracking M. Guo, Z. Zhang, L. Jing, Y. He, K. Wang, and H. Fan International Journal of Computer Vision (IJCV), 132: 6184-6206, 2024. paper
	Beyond MOT: Semantic Multi-Object Tracking Y. Li, Q. Li, H. Wang, X. Ma, J. Yao, S. Dong, H. Fan†, and L. Zhang† European Conference on Computer Vision (ECCV), 2024. paper code-data
	Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance L. Lin, H. Fan, Z. Zhang, Y. Wang, Y. Xu, and H. Ling European Conference on Computer Vision (ECCV), 2024. paper code
	Efficient Multimodal Semantic Segmentation via Dual-Prompt Learning S. Dong, Y. Feng, Q. Yang, Y. Huang, D. Liu, and H. Fan IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024. (Oral) paper code
	SiCP: Simultaneous Individual and Cooperative Perception for 3D Object Detection in Connected and Automated Vehicles D. Qu, Q. Chen, T. Bai, A. Qin, H. Lu, H. Fan, S. Fu, and Q. Yang IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024. (Oral) paper code
	Robust Domain Adaptive Object Detection with Unified Multi-Granularity Alignment L. Zhang, W. Zhou, H. Fan‡, T. Luo, and H. Ling (‡corresponding author) IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 46(12): 9161-9178, 2024. paper code
	Divert More Attention to Vision-Language Object Tracking M. Guo, Z. Zhang, L. Jing, H. Ling, and H. Fan IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 46(12): 8600-8618, 2024. paper code
	Context-Guided Spatio-Temporal Video Grounding X. Gu, H. Fan, Y. Huang, T. Luo, and L. Zhang IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. paper poster code
	ProMotion: Prototypes As Motion Learners Y. Lu, D. Liu, Q. Wang, C. Han, Y. Cui, Z. Cao, X. Zhang, Y. Chen, and H. Fan IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. paper
	Kernel Adaptive Convolution for Scene Text Detection via Distance Map Prediction J. Zheng, H. Fan, and L. Zhang IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. paper
	MaGIC: Multi-modality Guided Image Completion H. Wang, Y. Yu, T. Luo, H. Fan, and L. Zhang International Conference on Learning Representations (ICLR), 2024. paper project code
	Local Compressed Video Stream Learning for Generic Event Boundary Detection L. Zhang, X. Gu, C. Li, T. Luo, and H. Fan International Journal of Computer Vision (IJCV), 132: 1187-1204, 2024. paper code
	SSPNet: Scale and Spatial Priors Guided Generalizable and Interpretable Pedestrian Attribute Recognition J. Shen, T. Guo, X. Zuo, H. Fan, and W. Yang Pattern Recognition (PR), 148: 110194, 2024. paper
	ICAFusion: Iterative Cross-Attention Guided Feature Fusion for Multispectral Object Detection J. Shen, Y. Chen, Y. Liu, X. Zuo, H. Fan, and W. Yang Pattern Recognition (PR), 145: 109913, 2024. paper code

2023

	A Multi-granularity Decade-Long Geo-Tagged Twitter Dataset for Spatial Computing Y. Feng, Z. Meng, C. Clemmer, H. Fan, and Y. Huang ACM International Conference on Advances in Geographic Information Systems (SIGSPATIAL), 2023. paper project-data
	PIDray: A Large-scale X-ray Benchmark for Real-World Prohibited Item Detection L. Zhang, L. Jiang, R. Ji, and H. Fan International Journal of Computer Vision (IJCV), 131: 3170-3192, 2023. paper code-data
	Collaborative Three-Stream Transformers for Video Captioning H. Wang, L. Zhang, H. Fan, and T. Luo Computer Vision and Image Understanding (CVIU), 235: 103799, 2023. paper code
	Unsupervised Domain Adaptive Detection with Network Stability Analysis W. Zhou, H. Fan, T. Luo, and L. Zhang IEEE/CVF International Conference on Computer Vision (ICCV), 2023. paper code
	Two Birds, One Stone: A Unified Framework for Joint Learning of Image and Video Style Transfers B. Gu, H. Fan, and L. Zhang IEEE/CVF International Conference on Computer Vision (ICCV), 2023. paper code
	Accurate and Fast Compressed Video Captioning Y. Shen, X. Gu, K. Xu, H. Fan, L. Wen, and L. Zhang IEEE/CVF International Conference on Computer Vision (ICCV), 2023. paper code
	PlanarTrack: A Large-scale Challenging Benchmark for Planar Object Tracking X. Liu, X. Liu, Z. Yi, X. Zhou, T. Le, L. Zhang, Y. Huang, Q. Yang, and H. Fan IEEE/CVF International Conference on Computer Vision (ICCV), 2023. paper code-data
	AnimalTrack: A Benchmark for Multi-Animal Tracking in the Wild L. Zhang, J. Gao, Z. Xiao, and H. Fan International Journal of Computer Vision (IJCV), 131: 496-513, 2023. paper project with data

2022

	SwinTrack: A Simple and Strong Baseline for Transformer Tracking L. Lin, H. Fan, Z. Zhang, Y. Xu, and H. Ling Advances in Neural Information Processing Systems (NeurIPS), 2022. paper poster code
	Divert More Attention to Vision-Language Tracking M. Guo, Z. Zhang, H. Fan, and L. Jing Advances in Neural Information Processing Systems (NeurIPS), 2022. paper poster code
	High-Fidelity Image Inpainting with GAN Inversion Y. Yu, L. Zhang, H. Fan, and T. Luo European Conference on Computer Vision (ECCV), 2022. paper supplementary
	Towards Bridging the Distribution Gap: Instance to Prototype Earth Mover’s Distance for Distribution Alignment Q. Zhou, R. Wang, G. Zeng, H. Fan, and G. Zheng Medical Image Analysis (MedIA), 82: 102607, 2022. paper
	Detection and Tracking Meet Drones Challenge P. Zhu, L. Wen, D. Du, X. Bian, H. Fan, Q. Hu, and H. Ling IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 44(11): 7380-7399, 2022. paper code and project
	GL-GAN: Adaptive Global and Local Bilevel Optimization for Generative Adversarial Network Y. Liu, H. Fan, X. Yuan, and J. Xiang Pattern Recognition (PR), 123: 108375, 2022. paper code
	Learning Target-aware Representation for Visual Tracking via Informative Interactions M. Guo, Z. Zhang, H. Fan, L. Jing, Y. Lyu, B. Li, and W. Hu International Joint Conference on Artificial Intelligence (IJCAI), 2022. (Long Oral) paper code

2021

	Transparent Object Tracking Benchmark H. Fan, H. Miththanthaya, Harshit, S. Rajan, X. Liu, Z. Zou, Y. Lin, and H. Ling IEEE International Conference on Computer Vision (ICCV), 2021. paper code and project
	CRACT: Cascaded Regression-Align-Classification for Robust Visual Tracking H. Fan and H. Ling IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021. paper project
	LaSOT: A High-quality Large-scale Single Object Tracking Benchmark H. Fan, H. Bai, L. Lin, F. Yang, P. Chu, G. Deng, S. Yu, Harshit, M. Huang, J. Liu, Y. Xu, C. Liao, L. Yuan, and H. Ling International Journal of Computer Vision (IJCV), 129: 439-461, 2021. paper code and benchmark
	ClsGAN: Selective Attribute Editing Based On Classification Adversarial Network Y. Liu, H. Fan, F. Ni, and J. Xiang Neural Networks (NN), 133: 220-228, 2021. paper code
	TracKlinic: Diagnosis of Challenge Factors in Visual Tracking H. Fan, F. Yang, P. Chu, Y. Lin, L. Yuan, and H. Ling IEEE Winter Conference on Applications of Computer Vision (WACV), 2021. paper project
	MART: Motion-Aware Recurrent Neural Network for Robust Visual Tracking H. Fan and H. Ling IEEE Winter Conference on Applications of Computer Vision (WACV), 2021. paper project
	Robust and Efficient Graph Correspondence Transfer for Person Re-identification Q. Zhou, H. Fan, H. Yang, H. Su, S. Zheng, S. Wu, and H. Ling IEEE Transactions on Image Processing (T-IP), 30: 1623-1638, 2021. paper code

2020

	Weighted Bilinear Coding Over Salient Body Parts for Person Re-identification Z. Chang, Q. Zhou, H. Fan, H. Yang, H. Su, S. Zheng, and H. Ling Neurocomputing, 407: 454-464, 2020. paper
	Detection of Trabecular Landmarks for Osteoporosis Prescreening in Dental Panoramic Radiographs J. Ren, H. Fan, J. Yang, and H. Ling Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2020. paper

2019

	Clustered Object Detection in Aerial Images F. Yang, H. Fan, P. Chu, E. Blasch, and H. Ling IEEE International Conference on Computer Vision (ICCV), 2019. paper supplementary code
	Siamese Cascaded Region Proposal Networks for Real-Time Visual Tracking H. Fan and H. Ling IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019. paper code
	LaSOT: A High-quality Benchmark for Large-scale Single Object Tracking H. Fan, L. Lin, F. Yang, P. Chu, G. Deng, S. Yu, H. Bai, Y. Xu, C. Liao, and H. Ling IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019. paper code and benchmark
	Scene Parsing via Dense Recurrent Neural Networks with Attentional Selection H. Fan, P. Chu, L. Latecki, and H. Ling IEEE Winter Conference on Applications of Computer Vision (WACV), 2019. paper
	Online Multi-Object Tracking with Instance-Aware Tracker and Dynamic Model Refreshment P. Chu, H. Fan, C. Tan, and H. Ling IEEE Winter Conference on Applications of Computer Vision (WACV), 2019. paper
	Parallel Tracking and Verifying H. Fan and H. Ling IEEE Transactions on Image Processing (T-IP), 28(8): 4130-4144, 2019. paper code

2018

	Multi-level Contextual RNNs with Attention Model for Scene Labeling H. Fan, X. Mei, D. Prokhorov, and H. Ling IEEE Transactions on Intelligent Transportation Systems (T-ITS), 19(11): 3475-3485, 2018. paper
	Graph Correspondence Transfer for Person Re-identification Q. Zhou, H. Fan, S. Zheng, H. Su, X. Li, S. Wu, and H. Ling AAAI Conference on Artificial Intelligence (AAAI), 2018. (Oral) paper code

2017

	Parallel Tracking and Verifying: A Framework for Real-Time and High Accuracy Visual Tracking H. Fan and H. Ling IEEE International Conference on Computer Vision (ICCV), 2017. paper poster slide code
	SANet: Structure-Aware Network for Visual Tracking H. Fan and H. Ling IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshop, 2017. paper code
	Robust Visual Tracking via Local-Global Correlation Filter H. Fan and J. Xiang AAAI Conference on Artificial Intelligence (AAAI), 2017. (Oral) paper
	Robust Visual Tracking with Multitask Joint Dictionary Learning H. Fan and J. Xiang IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), 27(5): 1018-1030, 2017. paper project

2016

Cross Datasets Vegetation Detection with Spatial Prior and Local Context
H. Fan, X. Mei, D. Prokhorov, and H. Ling
IEEE Intelligent Vehicles Symposium (IV), 2016.
paper code and data (by request)

PhD Dissertation

Algorithms and Benchmarks for Robust Visual Object Tracking
H. Fan
Advisor: Prof. Haibin Ling
Committee: Professors Xianfeng Gu, Dimitris Samaras, Jie Yang
Department of Computer Science, State University of New York at Stony Brook, 2021.
pdf link slide

Copyright Notice: The papers listed above are made available to ensure timely dissemination of scholarly and technical work and are for personal or classroom use only. Copyright and all rights therein are retained by the authors and/or other copyright holders.