RUFI: Reducing Uncertainty in Behavior Prediction with Future Information

Autonomous driving has shown significant progress in recent years, but accurately predicting the movements of surrounding traffic agents remains a challenge for ensuring safety. Previous studies have focused on behavior prediction using large-scale …

CloudNet: A LiDAR-Based Face Anti-Spoofing Model That Is Robust Against Light Variation

Face anti-spoofing (FAS) is a technology that protects face recognition systems from presentation attacks. The current challenge faced by FAS studies is the difficulty in creating a generalized light variation model. This is because face data are …

An Embedding-Dynamic Approach to Self-supervised Learning

A number of recent self-supervised learning methods have shown impressive performance on image classification and other tasks. A somewhat bewildering variety of techniques have been used, not always with a clear understanding of the reasons for their …

Resolving Class Imbalance Problem for LiDAR-based Object Detector by Balanced Gradients and Contextual Ground Truth Sampling

An autonomous driving system requires a 3D object detector, which must perceive all present road agents reliably to navigate an environment safely. However, real world driving datasets often suffer from the problem of data imbalance, which causes …

ORA3D: Overlap Region Aware Multi-view 3D Object Detection

Current multi-view 3D object detection methods often fail to detect objects in the overlap region properly, and the networks' understanding of the scene is often limited to that of a monocular detection network. Moreover, objects in the overlap …

Zero-shot Visual Commonsense Immorality Prediction

Artificial intelligence is currently powering diverse real-world applications. These applications have shown promising performance, but raise complicated ethical issues, i.e. how to embed ethics to make AI applications behave morally. One way toward …

Bridging the Domain Gap towards Generalization in Automatic Colorization

We propose a novel automatic colorization technique that learns domain-invariance across multiple source domains and is able to leverage such invariance to colorize grayscale images in unseen target domains. This would be particularly useful for …

Grounding Visual Representations with Texts for Domain Generalization

Reducing the representational discrepancy between source and target domains is key to maximizing model generalization. In this work, we advocate for leveraging natural language supervision for the domain generalization task. We …

Sound-guided Semantic Video Generation

The recent success of StyleGAN demonstrates that the pre-trained StyleGAN latent space is useful for realistic video generation. However, the generated motion in the video is usually not semantically meaningful due to the difficulty of determining the …

Zero-shot Visual Commonsense Immorality Prediction (Abstracted Version)

Artificial intelligence is currently powering diverse real-world applications. These applications have shown promising performance, but raise complicated ethical issues, i.e. how to embed ethics to make AI applications behave morally. One way toward …