Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Portfolio

Publications

Paradox of AlphaZero: Strategic vs. Optimal Plays

Published in International Performance Computing and Communications Conference, 2020

Abstract. This article analyzes AlphaZero-type algorithms quantitatively from the viewpoint of local and global optimal sequences of play on a 7×7 board. Through targeted evaluation of the AI agent, the authors reveal the strategic, that is, winrate-dominated, nature of such algorithms, and thereby expose certain inherent obstacles to optimal play. Possible remedies are then explored, leading to techniques that may aid further quantitative analysis of those algorithms and the search for optimal solutions, on 7×7 as well as larger boards.

Recommended citation: Ze-Li Dou, Liran Ma, Khiem Nguyen, and Kien X. Nguyen. "Paradox of AlphaZero: Strategic vs. Optimal Plays." In The 39th IEEE International Performance Computing and Communications Conference, 2020.
Download Paper | Download Bibtex

CoConv: Learning Dynamic Cooperative Convolution for Image Recognition

Published in International Conference on Multimedia & Expo (Oral), 2021

Abstract. In this paper, we present a conceptually simple, yet powerful method for image recognition. The method, called Cooperative Dynamic Convolution (CoConv), introduces cooperative learning of dynamic convolution from multiple convolutional experts. CoConv can be used as a substitute for the traditional static convolution, and can be seamlessly integrated into various visual models. Moreover, CoConv is easy to train with only a minimal computational overhead introduced in the inference phase. CoConv is trained by using multiple convolutional experts simultaneously, and the convolutional weights are merged by a weighted summation before convolutional operations for efficiency during inference. Results from extensive experiments show that CoConv leads to consistent improvement for image classification on various datasets, independent of the choice of the base convolutional network. Remarkably, CoConv improves the top-1 classification accuracy of ResNet18 by 3.06% on ImageNet.

Recommended citation: Kien X. Nguyen, Tiffany Ryu, Jocelyn Zhang, Xu Ma, Qing Yang, Song Fu, Paparao Palacharla, Nannan Wang, and Xi Wang. "CoConv: Learning Dynamic Cooperative Convolution for Image Recognition." In IEEE International Conference on Multimedia & Expo, 2021.
Download Paper | Download Bibtex
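The weight merging mentioned in the CoConv abstract can be illustrated with a small sketch. This is not the authors' implementation: the single-channel setup, the helper names `conv2d_valid` and `merge_experts`, the random expert kernels, and the softmax mixing weights are all assumptions made for illustration. It only shows why merging kernels by a weighted sum before one convolution gives the same result as convolving with each expert separately and summing the outputs.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation of a single-channel image."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def merge_experts(expert_kernels, mixing_weights):
    """Weighted sum over the expert axis: (K, kh, kw) and (K,) -> (kh, kw)."""
    return np.tensordot(mixing_weights, expert_kernels, axes=1)

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
expert_kernels = rng.standard_normal((4, 3, 3))  # K = 4 hypothetical experts
logits = rng.standard_normal(4)                  # stand-in for learned, input-dependent scores
mixing_weights = np.exp(logits) / np.exp(logits).sum()  # softmax mixing (an assumption)

# One convolution with the merged kernel ...
merged_out = conv2d_valid(image, merge_experts(expert_kernels, mixing_weights))
# ... versus K convolutions followed by a weighted sum of the outputs.
separate_out = sum(w * conv2d_valid(image, k)
                   for w, k in zip(mixing_weights, expert_kernels))

print(np.allclose(merged_out, separate_out))
```

Because convolution is linear in the kernel, the two paths agree, which is why a single merged kernel keeps the per-input cost close to that of a static convolution.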

Deep Learning-based Estimation of Whole-body Kinematics from Multi-view Images

Published in Computer Vision and Image Understanding, 2023

Abstract. It is necessary to analyze the whole-body kinematics (including joint locations and joint angles) to assess risks of fatal and musculoskeletal injuries in occupational tasks. Human pose estimation has gotten more attention in recent years as a method to minimize the errors in determining joint locations. However, the joint angles are not often estimated, nor is the quality of joint angle estimation assessed. In this paper, we presented an end-to-end approach to direct joint angle estimation from multi-view images. Our method leveraged the volumetric pose representation and mapped the rotation representation to a continuous space where each rotation was uniquely represented. We also presented a new kinematic dataset in the domain of residential roofing with a data processing pipeline to generate necessary annotations for the supervised training procedure on direct joint angle estimation. We achieved a mean angle error of 7.19° on the new Roofing dataset and 8.41° on the Human3.6M dataset, paving the way for employment of on-site kinematic analysis using multi-view images.

Recommended citation: Kien X. Nguyen, Liying Zheng, Ashley L. Hawke, Robert E. Carey, Scott P. Breloff, Kang Li, and Xi Peng. "Deep Learning-based Estimation of Whole-body Kinematics from Multi-view Images." In Computer Vision and Image Understanding, Volume 235, October 2023.
Download Paper | Download Bibtex
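The abstract above mentions mapping rotations to a continuous space but does not name the representation. As an illustration only, the sketch below uses one widely known continuous choice, the 6D representation orthonormalized by Gram-Schmidt, together with a geodesic angle between two rotation matrices. Both are assumptions: the paper may use a different representation and may compute its mean angle error differently. The function names `rotation_from_6d` and `angle_error_deg` are hypothetical.

```python
import numpy as np

def rotation_from_6d(x):
    """Map a 6-vector to a rotation matrix via Gram-Schmidt (assumed representation)."""
    a1 = np.asarray(x[:3], dtype=float)
    a2 = np.asarray(x[3:], dtype=float)
    b1 = a1 / np.linalg.norm(a1)
    a2_orth = a2 - np.dot(b1, a2) * b1       # remove the component along b1
    b2 = a2_orth / np.linalg.norm(a2_orth)
    b3 = np.cross(b1, b2)                    # completes a right-handed frame
    return np.stack([b1, b2, b3], axis=1)    # columns are the basis vectors

def angle_error_deg(r_pred, r_true):
    """Geodesic angle in degrees between two rotation matrices."""
    cos_theta = (np.trace(r_pred.T @ r_true) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

identity = rotation_from_6d([1, 0, 0, 0, 1, 0])
quarter_turn_z = rotation_from_6d([0, 1, 0, -1, 0, 0])  # 90 degrees about the z-axis

print(angle_error_deg(identity, quarter_turn_z))
```

A network that regresses six unconstrained numbers and passes them through `rotation_from_6d` always yields a valid rotation, which is the practical appeal of representations of this kind.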

Adaptive Cascading Network for Continual Test-time Adaptation

Published in ACM International Conference on Information and Knowledge Management, 2024

Abstract. We study the problem of continual test-time adaptation where the goal is to adapt a source pre-trained model to a sequence of unlabelled target domains at test time. Existing methods on test-time training suffer from several limitations: (1) Mismatch between the feature extractor and classifier; (2) Interference between the main and self-supervised tasks; (3) Lack of the ability to quickly adapt to the current distribution. In light of these challenges, we propose a cascading paradigm that simultaneously updates the feature extractor and classifier at test time, mitigating the mismatch between them and enabling long-term model adaptation. The pre-training of our model is structured within a meta-learning framework, thereby minimizing the interference between the main and self-supervised tasks and encouraging fast adaptation in the presence of limited unlabelled data. Additionally, we introduce innovative evaluation metrics, average accuracy and forward transfer, to effectively measure the model’s adaptation capabilities in dynamic, real-world scenarios. Extensive experiments and ablation studies demonstrate the superiority of our approach in a range of tasks including image classification, text classification, and speech recognition.

Recommended citation: Kien X. Nguyen, Fengchun Qiao, and Xi Peng. "Adaptive Cascading Network for Continual Test-time Adaptation." In the 33rd ACM International Conference on Information and Knowledge Management, 2024.
Download Paper | Download Bibtex

SeafloorAI: A Large-scale Vision-Language Dataset for Seafloor Geological Survey

Published in Annual Conference on Neural Information Processing Systems, 2024

Abstract. A major obstacle to the advancement of machine learning models in marine science, particularly in sonar imagery analysis, is the scarcity of AI-ready datasets. While there have been efforts to make AI-ready sonar image datasets publicly available, they suffer from limitations in terms of environment setting and scale. To bridge this gap, we introduce SeafloorAI, the first extensive AI-ready dataset for seafloor mapping across 5 geological layers, curated in collaboration with marine scientists. We further extend the dataset to SeafloorGenAI by incorporating the language component in order to facilitate the development of both vision- and language-capable machine learning models for sonar imagery. The dataset consists of 62 geo-distributed data surveys spanning 17,300 square kilometers, with 696K sonar images, 827K annotated segmentation masks, 696K detailed language descriptions and approximately 7M question-answer pairs. By making our data processing source code publicly available, we aim to engage the marine science community to enrich the data pool and inspire the machine learning community to develop more robust models. This collaborative approach will enhance the capabilities and applications of our datasets within both fields.

Recommended citation: Kien X. Nguyen, Fengchun Qiao, Arthur Trembanis, and Xi Peng. "SeafloorAI: A Large-scale Vision-Language Dataset for Seafloor Geological Survey." In Proceedings of the Annual Conference on Neural Information Processing Systems, 2024.
Download Paper | Download Bibtex

Interpretable Failure Detection with Human-Level Concepts

Published in The AAAI Conference on Artificial Intelligence (Oral), 2025

Abstract. Reliable failure detection holds paramount importance in safety-critical applications. Yet, neural networks are known to produce overconfident predictions for misclassified samples. As a result, it remains a problematic matter as existing confidence score functions rely on category-level signals, the logits, to detect failures. This research introduces an innovative strategy, leveraging human-level concepts for a dual purpose: to reliably detect when a model fails and to transparently interpret why. By integrating a nuanced array of signals for each category, our method enables a finer-grained assessment of the model’s confidence. We present a simple yet highly effective approach based on the ordinal ranking of concept activation to the input image. Without bells and whistles, our method significantly reduces the false positive rate across diverse real-world image classification benchmarks, specifically by 3.7% on ImageNet and 9.0% on EuroSAT.

Recommended citation: Kien X. Nguyen, Tang Li, and Xi Peng. "Interpretable Failure Detection with Human-Level Concepts." In Proceedings of the AAAI Conference on Artificial Intelligence, 2025.
Download Paper | Download Bibtex

Beyond Accuracy: On the Effects of Fine-tuning Towards Vision-Language Model’s Prediction Rationality

Published in The AAAI Conference on Artificial Intelligence, 2025

Abstract. Vision-Language Models (VLMs), such as CLIP, have already seen widespread applications. Researchers actively engage in further fine-tuning VLMs in safety-critical domains. In these domains, prediction rationality is crucial: the prediction should be correct and based on valid evidence. Yet, for VLMs, the impact of fine-tuning on prediction rationality is seldom investigated. To study this problem, we proposed two new metrics called Prediction Trustworthiness and Inference Reliability. We conducted extensive experiments on various settings and observed some interesting phenomena. On the one hand, we found that the well-adopted fine-tuning methods led to more correct predictions based on invalid evidence. This potentially undermines the trustworthiness of correct predictions from fine-tuned VLMs. On the other hand, having identified valid evidence of target objects, fine-tuned VLMs were more likely to make correct predictions. Moreover, the findings are also consistent under distributional shifts and across various experimental settings. We hope our research offers fresh insights into VLM fine-tuning.

Recommended citation: Qitong Wang, Tang Li, Kien X. Nguyen, and Xi Peng. "Beyond Accuracy: On the Effects of Fine-tuning Towards Vision-Language Model's Prediction Rationality." In Proceedings of the AAAI Conference on Artificial Intelligence, 2025.
Download Paper | Download Bibtex

Cross-Problem Parameter Transferability in Quantum Approximate Optimization Algorithm: A Machine Learning Approach

Published in arXiv, 2025

Abstract. The Quantum Approximate Optimization Algorithm (QAOA) is one of the most promising candidates to achieve quantum advantage in solving combinatorial optimization problems. The process of finding a good set of variational parameters in the QAOA circuit has proven to be challenging due to multiple factors, such as barren plateaus. As a result, there is growing interest in exploiting parameter transferability, where parameter sets optimized for one problem instance are transferred to another, possibly more complex, instance, either to estimate the solution or to serve as a warm start for further optimization. But can we transfer parameters from one class of problems to another? Leveraging parameter sets learned from a well-studied class of problems could help navigate a less studied one, reducing optimization overhead and mitigating performance pitfalls. In this paper, we study whether pretrained QAOA parameters of MaxCut can be used as is or to warm start the Maximum Independent Set (MIS) circuits. Specifically, we design machine learning models to find good donor candidates optimized on MaxCut and apply their parameters to MIS acceptors. Our experimental results show that such parameter transfer can significantly reduce the number of optimization iterations required while achieving comparable approximation ratios.

Recommended citation: Kien X. Nguyen, Bao Bach, and Ilya Safro. "Cross-Problem Parameter Transferability in Quantum Approximate Optimization Algorithm: A Machine Learning Approach." arXiv, 2025.
Download Paper | Download Bibtex

Talks

Teaching

Intro to Programming II

Undergraduate course, University of Delaware, Department of Computer and Information Sciences, 2022

I taught Java and OOP concepts.

Intro to Machine Learning

Graduate course, University of Delaware, Department of Computer and Information Sciences, 2023

I taught machine learning fundamentals, deep learning (DNN, CNN, RNN), and some applications.