Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Page Not Found

Page not found. Your pixels are in another canvas.

academicpages is a ready-to-fork GitHub Pages template for academic personal websites

About me

Welcome to Wei Huang's Homepage

Jupyter notebook markdown generator

Posts

Future Blog Post

Published: January 01, 2199

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

Published: August 14, 2015

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

Published: August 14, 2014

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

Published: August 14, 2013

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

Published: August 14, 2012

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Publications

Adaptive multi-GPU exchange Monte Carlo for the 3D random field Ising model

Published in Computer Physics Communications, 2016

This work presents an adaptive multi-GPU Exchange Monte Carlo approach for the simulation of the 3D Random Field Ising Model (RFIM).

Critical percolation clusters in seven dimensions and on a complete graph

Published in Physical Review E, 2018

We study critical bond percolation on a seven-dimensional (7D) hypercubic lattice with periodic boundary conditions and on the complete graph (CG) of finite volume (number of vertices).

Mean field theory for deep dropout networks: digging up gradient backpropagation deeply

Published in ECAI, 2020

We perform theoretical computation on linear dropout networks and a series of experiments on dropout networks with different activation functions.

On the neural tangent kernel of deep networks with orthogonal initialization

Published in IJCAI, 2021

In this work, we study the dynamics of ultra-wide networks across a range of architectures, including Fully Connected Networks (FCNs) and Convolutional Neural Networks (CNNs) with orthogonal initialization via neural tangent kernel (NTK).

Gaussian process latent variable model factorization for context-aware recommender systems

Published in Pattern Recognition Letters, 2021

In order to address such shortcomings, we propose a Gaussian Process Latent Variable Model Factorization (GPLVMF) method, where we apply an appropriate prior to the original GP model.

On the Equivalence between Neural Network and Support Vector Machine

Published in NeurIPS, 2021

We establish the equivalence between NN and SVM, specifically between the infinitely wide NN trained with the soft-margin loss and the standard soft-margin SVM with NTK trained by subgradient descent.

Towards Deepening Graph Neural Networks: A GNTK-based Optimization Perspective

Published in ICLR, 2022

This work exploits the Graph Neural Tangent Kernel (GNTK), which governs the optimization trajectory under gradient descent for wide GCNs. We formulate the asymptotic behavior of the GNTK at large depth, which reveals that the trainability of wide and deep GCNs drops at an exponential rate during optimization.

Auto-scaling Vision Transformers without Training

Published in ICLR, 2022

This work targets automated designing and scaling of Vision Transformers (ViTs). We propose As-ViT, an auto-scaling framework for ViTs without training, which automatically discovers and scales up ViTs in an efficient and principled manner.

Pruning graph neural networks by evaluating edge properties.

Published in Knowledge-Based Systems, 2022

We formulate the performance of GNNs mathematically with respect to the properties of their edges, elucidating how the performance drop can be avoided by pruning negative edges and nonbridges. This leads to our simple but effective two-step method for GNN pruning, leveraging the saliency metrics for the network pruning while sparsifying the graph with preservation of the loss performance.

Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis.

Published in NeurIPS, 2022

We theoretically characterize the impact of connectivity patterns on the convergence of DNNs under gradient descent training in fine granularity. By analyzing a wide network’s Neural Network Gaussian Process (NNGP), we are able to depict how the spectrum of an NNGP kernel propagates through a particular connectivity pattern, and how that affects the bound of convergence rates.

Interpreting Operation Selection in Differentiable Architecture Search: A Perspective from Influence-Directed Explanations.

Published in NeurIPS, 2022

In this work, we leverage influence functions, the functional derivatives of the loss function, to theoretically analyze operation selection in DARTS and estimate the importance of each candidate operation by approximating its influence on the supernet with Taylor expansions. We show that the operation strength is related not only to the magnitude but also to second-order information, leading to a fundamentally new criterion for operation selection in DARTS, named Influential Magnitude.

Deep Active Learning by Leveraging Training Dynamics.

Published in NeurIPS, 2022

In this paper, by exploring the connection between the generalization performance and the training dynamics, we propose a theory-driven deep active learning method (dynamicAL) which selects samples to maximize training dynamics. In particular, we prove that the convergence speed of training and the generalization performance are positively correlated under the ultra-wide condition and show that maximizing the training dynamics leads to better generalization performance.

Weighted Mutual Learning with Diversity-Driven Model Compression.

Published in NeurIPS, 2022

This paper, for the first time, leverages a bi-level formulation to estimate the relative importance of peers in closed form, further boosting the effectiveness of the peers' distillation from each other. Extensive experiments show that the proposed framework generalizes well and outperforms existing online distillation methods on a variety of deep neural networks.

Analyzing Deep PAC-Bayesian Learning with Neural Tangent Kernel: Convergence, Analytic Generalization Bound, and Efficient Hyperparameter Selection.

Published in TMLR, 2023

This paper proposes a theoretical convergence and generalization analysis for Deep PAC-Bayesian learning. For a deep and wide probabilistic neural network, our analysis shows that PAC-Bayesian learning corresponds to solving a kernel ridge regression when the probabilistic neural tangent kernel (PNTK) is used as the kernel.
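
To make the kernel ridge regression correspondence concrete, here is a minimal NumPy sketch of kernel ridge regression itself. It is not the paper's method: the PNTK is not reproduced here, so an RBF kernel stands in as a placeholder, and the names `rbf_kernel`, `krr_fit`, `krr_predict`, `gamma`, and `lam` are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Placeholder kernel (assumption): RBF, standing in for the PNTK."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-gamma * sq_dist)

def krr_fit(K, y, lam):
    """Solve (K + lam * I) alpha = y for the dual coefficients alpha."""
    n = K.shape[0]
    return np.linalg.solve(K + lam * np.eye(n), y)

def krr_predict(K_test_train, alpha):
    """Predict with f(x) = sum_i alpha_i * k(x, x_i)."""
    return K_test_train @ alpha

# Toy data: a noiseless sine curve on eight training points.
X_train = np.linspace(0.0, 1.0, 8)[:, None]
y_train = np.sin(2 * np.pi * X_train[:, 0])
lam = 1e-3

K = rbf_kernel(X_train, X_train, gamma=50.0)
alpha = krr_fit(K, y_train, lam)

X_test = np.array([[0.25], [0.75]])
y_pred = krr_predict(rbf_kernel(X_test, X_train, gamma=50.0), alpha)
```

Swapping `rbf_kernel` for any other positive semi-definite kernel leaves `krr_fit` and `krr_predict` unchanged, which is why the choice of kernel carries the modeling content.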

No Free Lunch in Neural Architectures? A Joint Analysis of Expressivity, Convergence, and Generalization.

Published in Auto-ML, 2023

Talks

An Introduction to the Neural Tangent Kernel.

Published: July 01, 2020

Recently, researchers have found that the dynamics of infinitely wide neural networks under gradient descent training are captured by the neural tangent kernel. With the help of the neural tangent kernel, researchers can prove that over-parameterized neural networks can find a global minimum, which is a milestone in the area of deep learning theory. This talk will present the basic properties of the neural tangent kernel and my research output regarding it.

Understanding deep neural networks through over-parameterization.

Published: June 01, 2021

The learning dynamics of neural networks trained by gradient descent are captured by the so-called neural tangent kernel (NTK) in the infinite-width limit. The NTK has been a powerful tool for researchers to understand the optimization and generalization of over-parameterized networks. In this talk, the foundation of the NTK in addition to its application to orthogonally-initialized networks and ultra-wide graph networks will be introduced.
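
As a rough illustration of the object this talk is about, the sketch below computes the empirical NTK Gram matrix of a two-layer ReLU network at finite width. The architecture f(x) = a · relu(Wx) / sqrt(m), the width 2048, and the function name `empirical_ntk` are assumptions chosen for the example, not details taken from the talk.

```python
import numpy as np

def empirical_ntk(X, W, a):
    """Empirical NTK of f(x) = a . relu(W x) / sqrt(m).

    Entry (i, j) is the inner product of the parameter gradients of f
    at inputs X[i] and X[j], taken over both a and W.
    """
    m = W.shape[0]
    pre = X @ W.T                     # (n, m) pre-activations
    act = np.maximum(pre, 0.0)        # relu(W x)
    gate = (pre > 0).astype(float)    # derivative of relu
    # Gradients w.r.t. the output weights a.
    k_a = act @ act.T / m
    # Gradients w.r.t. the hidden weights W.
    k_w = ((gate * a**2) @ gate.T) * (X @ X.T) / m
    return k_a + k_w

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))               # five inputs in three dimensions
W = rng.normal(size=(2048, 3))            # hidden weights at initialization
a = rng.choice([-1.0, 1.0], size=2048)    # output weights at initialization
K = empirical_ntk(X, W, a)
```

The result `K` is a symmetric positive semi-definite 5 x 5 matrix. As the width grows, this random matrix concentrates around a deterministic kernel, which is the infinite-width limit the talk refers to.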

Closing the Gap between Theory and Applications in Deep Learning.

Published: December 01, 2021

Deep learning has been responsible for a step-change in performance across machine learning, setting new benchmarks in a large number of applications. During my Ph.D. studies, I have sought to understand the theoretical properties of deep neural networks and to close the gap between theory and application. This presentation will introduce three concrete works on the neural tangent kernel (NTK), one of the recent seminal advances in deep learning theory.

Towards Deepening Graph Neural Networks: A GNTK-based Optimization Perspective.

Published: June 12, 2022

We formulate the asymptotic behavior of the GNTK at large depth, which reveals that the trainability of wide and deep GCNs drops at an exponential rate during optimization. Additionally, we extend our theoretical framework to analyze residual connection-based techniques, which are found to mitigate the exponential decay of trainability only mildly. Inspired by our theoretical insights on trainability, we propose Critical DropEdge, a connectivity-aware and graph-adaptive sampling method, to alleviate the exponential decay problem more fundamentally.

Understanding Deep Learning through Over-parameterization: from Kernel Regime to Feature Learning.

Published: February 15, 2023

Understanding the learning dynamics of neural networks trained with (stochastic) gradient descent is a long-term goal of deep learning theory research. This talk introduces the shift from the neural tangent kernel (NTK) regime to feature learning dynamics. The NTK has been a powerful tool for researchers to understand the optimization and generalization of over-parameterized networks. We first introduce the foundation of the NTK and its application to neural architecture search and active learning. More recent work has found that neural networks perform feature learning during gradient descent training. We then introduce how feature learning emerges and how it helps explain the role of graph convolution in graph neural networks.

Graph neural networks provably benefit from structural information: a feature learning perspective.

Published: July 28, 2023

Teaching

Thermodynamics and Statistical Physics

Undergraduate course, University of Science and Technology of China, 2015

Teaching assistant of the Thermodynamics and Statistical Physics course for undergraduate students.

Statistical Physics

Undergraduate course, University of Science and Technology of China, 2016

Teaching assistant for the Statistical Physics course for undergraduate students. Received the outstanding teaching assistant award.