Publications

(2024). Learning Temporal Cues by Predicting Objects Move for Multi-camera 3D Object Detection. In arXiv.

(2024). Bridging the Domain Gap by Clustering-based Image-Text Graph Matching. In arXiv.

(2024). Robust Sound-guided Image Manipulation. In Neural Networks.

(2024). Higher-order Relational Reasoning for Pedestrian Trajectory Prediction. In CVPR 2024.

(2024). EGTR: Extracting Graph from Transformer for Scene Graph Generation. In CVPR 2024.

(2024). Mitigating the Linguistic Gap with Phonemic Representations for Robust Multilingual Language Understanding. In arXiv.

(2024). CMDA: Cross-Modal and Domain Adversarial Adaptation for LiDAR-based 3D Object Detection. In AAAI 2024.

(2024). InstructBooth: Instruction-following Personalized Text-to-Image Generation. In arXiv.

(2024). Localization and Manipulation of Immoral Visual Cues for Safe Text-to-Image Generation. In WACV 2024.

(2024). BEVMap: Map-Aware BEV Modeling for 3D Perception. In WACV 2024.

(2023). Cream: Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models. In EMNLP 2023.

(2023). Distillation for High-Quality Knowledge Extraction via Explainable Oracle Approach. In BMVC 2023.

(2023). The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion. In ICCV 2023.

(2023). The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion. In ICMLW 2023.

(2023). Localization and Manipulation of Immoral Visual Cues for Safe Text-to-Image Generation. In ICMLW 2023.

(2023). Bridging the Domain Gap by Clustering-based Image-Text Graph Matching. In ICMLW 2023.

(2023). RUFI: Reducing Uncertainty in Behavior Prediction with Future Information. In CVPRW 2023.

(2023). CloudNet: A LiDAR-Based Face Anti-Spoofing Model That Is Robust Against Light Variation. In IEEE Access.

(2023). An Embedding-Dynamic Approach to Self-supervised Learning. In WACV 2023.

(2022). Zero-shot Visual Commonsense Immorality Prediction. In BMVC 2022.

(2022). ORA3D: Overlap Region Aware Multi-view 3D Object Detection. In BMVC 2022.

(2022). Sound-guided Semantic Video Generation. In ECCV 2022.

(2022). Grounding Visual Representations with Texts for Domain Generalization. In ECCV 2022.

(2022). Bridging the Domain Gap towards Generalization in Automatic Colorization. In ECCV 2022.

(2022). Zero-shot Visual Commonsense Immorality Prediction (Abstracted Version). In CVPRW 2022.

(2022). Sound-Guided Semantic Image Manipulation. In CVPR 2022.

(2022). StopNet: Scalable Trajectory and Occupancy Prediction for Urban Autonomous Driving. In ICRA.

(2022). Occupancy Flow Fields for Motion Forecasting in Autonomous Driving. In RA-L/ICRA.

(2021). Towards Explainable and Advisable Model for Self-driving Cars. In Applied AI Letters.

(2021). Sound-guided Semantic Image Manipulation. In NeurIPS Workshop.

(2021). Audio-Semantic Image Synthesis for Artistic Paintings. In NeurIPS Workshop.

(2021). A Scenario-Based Platform for Testing Autonomous Vehicle Behavior Prediction Models in Simulation. In NeurIPS Workshop 2021.

(2021). Inter-domain Curriculum Learning for Domain Generalization. In ICT Express.

(2021). SelfReg: Self-supervised Contrastive Regularization for Domain Generalization. In ICCV.
