Research
I'm interested in probability, optimization, and deep learning. My current focus is on topics at the intersection of information theory, sampling, and deep learning.
My long-term research goal is to develop reliable AI.
|
|
Exact Byte-Level Probabilities from Tokenized Language Models for FIM-Tasks and Model Ensembles
Buu Phan, Brandon Amos, Itai Gat,
Marton Havasi,
Matthew Muckley,
Karen Ullrich
Preprint, preliminary version accepted at the ICML 2024 Workshop on Theoretical Foundations of Foundation Models, 2024
We provide an algorithm that converts any tokenized LLM into a token-free (byte-level) LLM. This naturally resolves the odd generations that occur when a prompt ends mid-token or with whitespace. We also demonstrate an application to model ensembles in which each LLM uses a distinct tokenization.
|
|
Importance Matching Lemma for Lossy Compression with Side Information
Buu Phan*,
Ashish Khisti*,
Christos Louizos
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Code
We provide a channel simulation method, based on the Poisson Matching Lemma (Li and Anantharam 2021), that uses a fixed number of proposals. We also demonstrate its application to distributed compression and vertical federated learning, focusing on the scenario where correlated side information is available only at the decoder.
|
|
On the Choice of Perception Loss Function for Learned Video Compression
Sadaf Salehkalaibar*, Buu Phan*,
Jun Chen,
Wei Yu,
Ashish Khisti
Conference on Neural Information Processing Systems, 2023
We provide an information-theoretic analysis of perceptual video compression. Surprisingly, our results show that preserving temporal consistency comes at the cost of higher distortion!
|
|
Rate-Distortion-Perception Tradeoff Based on the Conditional Perception Measure
Sadaf Salehkalaibar, Buu Phan,
Ashish Khisti,
Wei Yu
Biennial Symposium on Communications (BSC), 2023
We provide an information-theoretic analysis of the conditional perceptual metric for image compression.
|
|
Adversarial imaging pipelines
Buu Phan,
Felix Heide,
Fahim Mannan
CVPR, 2021
project page /
Paper /
video /
Contact me to reproduce the results on your camera.
We propose a new kind of adversarial attack that targets the ISP and camera optics. Our method produces real-world adversarial scenes without relying on image augmentation.
|
|
Seeing Around Street Corners: Non-Line-of-Sight Detection and Tracking In-the-Wild Using Doppler Radar
Nicolas Scheiner, Florian Kraus, Fangyin Wei, Buu Phan,
Fahim Mannan,
Nils Appenrodt, Werner Ritter, Jürgen Dickmann, Klaus Dietmayer,
Felix Heide
CVPR, 2020
project page /
Paper /
video /
code /
We show how to scale up NLOS imaging techniques to real-world settings using radar. The method can reliably detect and track NLOS pedestrians and cyclists.
|
|
Detecting Out-of-Distribution Inputs in Deep Neural Networks Using an Early-Layer Output
Vahdat Abdelzad,
Krzysztof Czarnecki,
Rick Salay,
Taylor Denouden, Sachin Vernekar, Buu Phan
arXiv, 2019
Paper /
code /
We show that OOD images can be detected by applying a one-class SVM to early-layer embeddings. The results outperform several baseline approaches on MNIST, CIFAR, and Tiny-ImageNet.
|
|
Bayesian uncertainty quantification with synthetic data
Buu Phan, Samin Khan,
Rick Salay,
Krzysztof Czarnecki
(Best paper) Workshop on Artificial Intelligence Safety Engineering - SAFECOMP, 2019
Paper
We empirically show that Bayesian deep networks capture some notions of uncertainty that are similar to the human view of uncertainty. Contrary to expectation, we also find that the estimated aleatoric uncertainty from Bayesian deep models can be reduced with more training data.
|
|
ProcSy: Procedural Synthetic Dataset Generation Towards Influence Factor Studies Of Semantic Segmentation Networks
Samin Khan, Buu Phan,
Rick Salay,
Krzysztof Czarnecki
CVPR Workshop, 2019
Paper/
Dataset
We provide a tool for generating synthetic self-driving datasets. Our tool can simulate adverse environmental conditions such as rain and clouds. We also provide some analysis of distribution shift on this dataset.
|
|
Analysis of confident-classifiers for out-of-distribution detection
Sachin Vernekar, Ashish Gaurav, Taylor Denouden, Buu Phan, Vahdat Abdelzad,
Rick Salay,
Krzysztof Czarnecki
Safe ML Workshop, ICLR, 2019
Paper/
Code
We show some theoretical disadvantages of using a classifier for out-of-distribution detection. We also show that training a classifier with an "unseen" class can alleviate the issue.
|
|
Improving reconstruction autoencoder out-of-distribution detection with mahalanobis distance
Taylor Denouden,
Rick Salay,
Krzysztof Czarnecki,
Vahdat Abdelzad, Buu Phan, Sachin Vernekar.
arXiv, 2018
Paper/
Reconstruction autoencoders were one of the go-to approaches for out-of-distribution detection. The idea is to use the reconstruction error as the OOD score. Nevertheless, in many cases, we observe that OOD images are perfectly reconstructed. This paper demystifies this phenomenon from the viewpoint of PCA.
|
|
Calibrating Uncertainties in Object Localization Task
Buu Phan,
Rick Salay,
Krzysztof Czarnecki,
Vahdat Abdelzad, Taylor Denouden, Sachin Vernekar.
NeurIPS Bayesian Deep Learning Workshop, 2018
Paper/
We observe that Bayesian Deep Networks do not provide well-calibrated regression uncertainty. In this paper, we address this problem for the single-object localization task by adapting an existing technique for calibrating regression models.
|
|
An automated vehicle safety concept based on runtime restriction of the operational design domain
Ian Colwell,
Buu Phan,
Shahwar Saleem,
Rick Salay,
Krzysztof Czarnecki.
IEEE Intelligent Vehicles Symposium, 2018
Paper/
We propose an approach and architectural design for achieving maximum functionality in the case of system failures.
|
|