About Me
I am currently a postdoctoral researcher in the Cyber-Physical Systems (CPS) group at the University of Oxford, under the supervision of Profs. Niki Trigoni and Andrew Markham. My research centers on multi-modal sensing for localization and cross-domain generalization, with a focus on building robust perception in real-world settings.
I obtained my D.Phil. (PhD) in Computer Science at the University of Oxford, co-supervised by Profs. Niki Trigoni and Andrew Markham in the CPS group. My studies were generously supported by the Global Korea Scholarship (GKS) Program and the ACE-OPS grant. Prior to my D.Phil., I worked as a research associate supervised by Prof. Yong-Guk Kim at Sejong University, South Korea, where I completed my master's and undergraduate degrees in Computer Science.
Research Interests
My research aims to advance multi-modal and multi-view sensing with diverse physical sensors, including RGB-D cameras, LiDAR, mmWave radar, event cameras, etc. A key challenge in real-world settings is achieving reliable long-term localization of these sensors in a shared reference space, enabling accurate multi-view fusion while leveraging the unique capabilities of each modality.
Beyond localization, I am interested in how foundation models can develop reasoning abilities across modalities—understanding the complementary strengths of each sensor type and dynamically selecting and combining the most informative modalities for a given task. This supports robust, context-aware performance on downstream applications, including detection, tracking, re-identification, anomaly detection, etc.
Ultimately, my work seeks to bridge physical sensing with adaptive multi-modal learning, building perception systems that are accurate, resilient, and capable of operating reliably in complex real-world environments.
Research Platform
We have built a multi-modal sensing platform that integrates diverse physical sensors—including RGB-D cameras, LiDAR, mmWave radar, and event cameras—into a unified system. The goal is to enable robust, long-term perception in complex real-world environments by leveraging the complementary strengths of each modality.
By co-registering and fusing data from these heterogeneous sensors, the platform supports research in cross-domain generalization, multi-view localization, and adaptive modality fusion for a range of essential downstream ML tasks.
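As a concrete illustration of what co-registration involves, the sketch below (in Python/NumPy, with hypothetical calibration values rather than our platform's actual parameters) projects LiDAR points into an RGB camera's image plane using a calibrated extrinsic transform and the camera intrinsics:

```python
import numpy as np

def lidar_to_camera(points_lidar, T_cam_lidar, K):
    """Project LiDAR points (N, 3) into the pixel grid of an RGB camera.

    T_cam_lidar: (4, 4) rigid transform from the LiDAR frame to the camera
    frame (from extrinsic calibration). K: (3, 3) camera intrinsic matrix.
    """
    n = points_lidar.shape[0]
    pts_h = np.hstack([points_lidar, np.ones((n, 1))])  # homogeneous (N, 4)
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]          # camera frame (N, 3)
    in_front = pts_cam[:, 2] > 0                        # drop points behind the camera
    pts_cam = pts_cam[in_front]
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                         # perspective division
    return uv, pts_cam[:, 2]                            # pixel coords and depths

# Hypothetical calibration, for illustration only.
T_cam_lidar = np.eye(4)
T_cam_lidar[:3, 3] = [0.1, 0.0, -0.05]                  # small mounting offset
K = np.array([[720.0, 0.0, 640.0],
              [0.0, 720.0, 360.0],
              [0.0, 0.0, 1.0]])
points = np.random.rand(1000, 3) * 10.0                 # stand-in LiDAR sweep
uv, depth = lidar_to_camera(points, T_cam_lidar, K)
```

Once every modality's measurements live in one reference frame like this, multi-view fusion reduces to reasoning over aligned observations rather than raw sensor streams.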
To evaluate models for multi-modal research, we have recorded long-term, large-scale multi-view, multi-modal datasets from a wide range of domains, such as campus, industrial, and wildlife environments, capturing diverse real-world conditions that unimodal perception cannot address.
Ongoing Research
Multi-Modal Localization with Thermal, RGB, and mmWave Radar
Leveraging our multi-modal sensing platform (Frankenstein), this project explores robust localization by fusing thermal imaging, RGB cameras, and mmWave radar. By combining these complementary modalities, the system achieves reliable pose estimation across challenging conditions such as poor lighting, adverse weather, and visually degraded environments.
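As a toy sketch of the underlying fusion principle (not the project's actual estimator), independent position estimates from the thermal, RGB, and radar branches can be combined by inverse-covariance weighting, so the modality that is most reliable under the current conditions dominates the fused pose:

```python
import numpy as np

def fuse_positions(estimates, covariances):
    """Information-form fusion of independent 3D position estimates.

    estimates: list of (3,) vectors, one per modality.
    covariances: list of (3, 3) matrices encoding each modality's current
    reliability (e.g. RGB uncertainty grows in low light).
    """
    info = np.zeros((3, 3))
    info_vec = np.zeros(3)
    for x, P in zip(estimates, covariances):
        P_inv = np.linalg.inv(P)
        info += P_inv          # accumulate information matrices
        info_vec += P_inv @ x  # and information vectors
    fused_cov = np.linalg.inv(info)
    return fused_cov @ info_vec, fused_cov

# Hypothetical night-time scenario: RGB is noisy, radar is confident.
x_rgb,   P_rgb   = np.array([1.0, 2.0, 0.5]), np.eye(3) * 1.0
x_radar, P_radar = np.array([1.2, 2.1, 0.4]), np.eye(3) * 0.05
x_fused, _ = fuse_positions([x_rgb, x_radar], [P_rgb, P_radar])
# x_fused sits close to the radar estimate, reflecting its lower uncertainty.
```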
Extended Perception via Multi-Modal Fusion
In the real world, there are many circumstances where models trained on a single modality can fail significantly. For example, rain and water droplets on camera lenses severely degrade RGB-based perception, causing even state-of-the-art monocular depth estimation algorithms to fail; in safety-critical applications, this can have serious consequences. This project investigates how complementary modalities can compensate for vision degradation and overcome unimodal limitations, enabling reliable perception under all weather conditions.
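One simple way to frame this compensation (a minimal sketch under assumed per-pixel confidence maps, not our actual method) is confidence-gated fusion: each modality predicts depth together with a confidence map, and degraded sensors contribute near-zero weight on the affected pixels:

```python
import numpy as np

def confidence_weighted_depth(depth_maps, confidence_maps, eps=1e-6):
    """Fuse per-modality depth maps via per-pixel confidence weighting.

    depth_maps: list of (H, W) depth predictions (e.g. RGB, radar, thermal).
    confidence_maps: list of (H, W) non-negative confidences; a rain-degraded
    RGB branch would carry near-zero weight on the affected pixels.
    """
    depths = np.stack(depth_maps)            # (M, H, W)
    weights = np.stack(confidence_maps)      # (M, H, W)
    weights = weights / (weights.sum(axis=0, keepdims=True) + eps)
    return (weights * depths).sum(axis=0)    # (H, W) fused depth

# Hypothetical 2x2 example: RGB fails on the right half of the image.
d_rgb   = np.array([[2.0, 9.9], [2.1, 9.9]])   # wrong where degraded
d_radar = np.array([[2.2, 3.0], [2.0, 3.1]])
c_rgb   = np.array([[1.0, 0.0], [1.0, 0.0]])   # zero confidence when wet
c_radar = np.ones((2, 2)) * 0.5
fused = confidence_weighted_depth([d_rgb, d_radar], [c_rgb, c_radar])
```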
Unimodal Enhancement
While RGB-based perception has been extensively studied, many other modalities can address its limitations. For example, mmWave radar and event cameras remain less explored despite offering essential complementary information: radar provides robust sensing in adverse weather and lighting conditions, while event cameras capture high-temporal-resolution motion with minimal latency. Enhancing the individual capabilities of these underexplored modalities extends the perception capability of all downstream intelligent systems.
News
- [Mar. 2026] Our paper WildDepth on a large-scale dataset with calibrated multi-modal sensors for 3D wildlife perception and depth estimation is now available on arXiv!
- [Nov. 2025] Our survey paper Revisiting U-Net on U-Net as a foundational backbone for modern generative AI got published in Artificial Intelligence Review (Springer)!
- [Nov. 2025] Our paper Thermal-to-RGB on enhancing low-resolution thermal imagery with diffusion models for wildlife monitoring got published at ACM International Workshop on Thermal Sensing and Computing 2025!
- [Jun. 2025] Our paper DiffRefine on generative cross-domain detection in 3D got accepted into ICCV 2025 for spotlight presentation!
- [Mar. 2025] Our paper WildPose on multi-modal sensing dataset for deformable animals got accepted into Journal of Experimental Biology and selected as the cover of the issue!
- [Jan. 2025] Completed my thesis corrections and obtained D.Phil. (PhD) status!
- [Jan. 2025] Our paper SoundLoc3D on sound source localization got accepted into WACV 2025 for oral presentation!
- [Oct. 2024] Started my role as a postdoctoral researcher in CPS group.
- [Oct. 2024] Successfully defended my D.Phil. viva! (Internal Examiner: Prof. Christian Rupprecht at VGG; External Examiner: Prof. Dimitrios Kanoulas at UCL).
- [Sep. 2024] Our paper GroupExp-DA on Domain-Adaptive 3D Detection got accepted into NeurIPS 2024!
- [May. 2024] Our paper on stereo depth estimation with visual foundation models got accepted into ICRA 2024!
- [Mar. 2024] Our paper Spherical Mask on 3D instance segmentation got accepted into CVPR 2024!
- [Feb. 2024] Defended my Confirmation viva (Examiners: Profs. Ronald Clark and Alessandro Abate).
- [Jan. 2024] Our paper Sound3DVDet on sound source localization got accepted into WACV 2024!
- [Jul. 2023] Our paper on view synthesis with NeRF got accepted into NeurIPS 2023!
- [Jan. 2023] Our paper Sample, Crop, Track on self-supervised 3D object detection got accepted into ICRA 2023!
- [Aug. 2022] Our paper on monocular night-time depth estimation got accepted into CoRL 2022!
- [Mar. 2022] Our paper on monocular SLAM on UAV got accepted into IROS 2022!
- [Jan. 2022] Defended my Transfer of Status viva (Examiners: Profs. Alex Rogers and Alessandro Abate).
- [Oct. 2020] Started my D.Phil. (PhD) at the University of Oxford.
- [Aug. 2020] Concluded my research at Sejong University, South Korea. Please see my exciting research at Sejong on reinforcement learning and sensing, with videos, here!
Selected Publications
Contact
Email: sangyun.shin@cs.ox.ac.uk
I'm open to opportunities, collaborations and research discussions. Feel free to reach out!