Date: 2021-06-12

Four papers from the Gaoling School of Artificial Intelligence, Renmin University of China (GSAI) were accepted by the International Conference on Machine Learning (ICML), according to the results released on May 8th. ICML, hosted by the International Machine Learning Society (IMLS), is one of the top international conferences in the field of machine learning; this year's acceptance rate was 21.48%.

Since January 2021, GSAI has published (or had accepted) 47 papers in CCF A-category international journals and conferences and 5 papers in CCF B-category journals and conferences. Among them, 47 papers list GSAI students or faculty members as first or corresponding authors.

**Paper Introduction**

**Paper Title: Sharper Generalization Bounds for Clustering**

**Authors: Shaojie Li, Yong Liu**

**Corresponding author: Yong Liu**

**Paper Overview:** Existing generalization analysis of clustering mainly focuses on specific instantiations, such as (kernel) k-means, and a unified framework for studying clustering performance is still lacking. Moreover, the existing excess clustering risk bounds are mostly of order O(K/√n), provided that the underlying distribution has bounded support, where n is the sample size and K is the number of clusters, or of order O(K²/n) under strong assumptions on the underlying distribution that are hard to verify in general. In this paper, we propose a unified clustering learning framework and investigate its excess risk bounds, obtaining state-of-the-art upper bounds under mild assumptions. Specifically, we derive sharper bounds of order O(K²/n) under mild, easily verified assumptions on the covering number of the hypothesis spaces. Moreover, for hard clustering schemes such as (kernel) k-means, assuming only that the hypothesis functions are bounded, we improve the upper bounds from order O(K/√n) to O(√K/√n). Furthermore, state-of-the-art bounds of the faster order O(K/n) are obtained under the covering number assumptions.
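In symbols, the improvement described above can be summarized as follows (an illustrative comparison in the overview's notation, not the paper's exact theorem statements):

```latex
% Excess clustering risk bounds: n samples, K clusters
\[
\text{prior: } \mathcal{O}\!\left(\frac{K}{\sqrt{n}}\right)
\quad\longrightarrow\quad
\text{this paper: }
\mathcal{O}\!\left(\frac{\sqrt{K}}{\sqrt{n}}\right)\ \text{(bounded hypotheses only)},\;
\mathcal{O}\!\left(\frac{K}{n}\right)\ \text{(covering-number assumptions)}
\]
```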

**Paper Introduction**

**Paper Title: Distributed Nyström Kernel Learning with Communications**

**Authors: Rong Yin, Yong Liu, Weiping Wang, Dan Meng**

**Paper Overview:** We study the statistical performance of distributed kernel ridge regression with Nyström approximation (DKRR-NY), and with Nyström approximation plus iterative solvers (DKRR-NY-PCG), and derive optimal learning rates that improve the admissible range of the number of local processors $p$ over existing state-of-the-art bounds. More precisely, our theoretical analysis shows that, in expectation, DKRR-NY and DKRR-NY-PCG achieve the same learning rates as exact KRR while requiring essentially $\mathcal{O}(|D|^{1.5})$ time and $\mathcal{O}(|D|)$ memory, with a relaxed restriction on $p$, where $|D|$ is the number of data points; results in expectation reflect the average performance over multiple trials. Furthermore, to characterize the generalization performance of a single trial, we derive learning rates for DKRR-NY and DKRR-NY-PCG in probability. Finally, we propose a novel algorithm, DKRR-NY-CM, which builds on DKRR-NY and employs a communication strategy to further improve learning performance; the effectiveness of the communications is validated both theoretically and experimentally.
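To illustrate the building block involved, here is a minimal single-process sketch of divide-and-conquer kernel ridge regression with a shared Nyström approximation, in the spirit of DKRR-NY. This is an illustrative reconstruction, not the authors' implementation: the helper names (`rbf`, `nystrom_krr_fit`, `dkrr_ny_predict`) and all parameter choices are made up for the example.

```python
import numpy as np

def rbf(X, Z, gamma):
    """Gaussian (RBF) kernel matrix between rows of X and rows of Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_krr_fit(X, y, landmarks, lam, gamma):
    """Nystrom KRR on one local shard: coefficients over m landmarks.
    Normal equations: (K_nm^T K_nm + lam * n * K_mm) a = K_nm^T y."""
    Knm = rbf(X, landmarks, gamma)
    Kmm = rbf(landmarks, landmarks, gamma)
    n, m = Knm.shape
    A = Knm.T @ Knm + lam * n * Kmm + 1e-10 * np.eye(m)  # jitter for stability
    return np.linalg.solve(A, Knm.T @ y)

def dkrr_ny_predict(Xtrain, ytrain, Xtest, p=4, m=30, lam=1e-5, gamma=20.0, seed=0):
    """Split the data across p 'workers', fit a local Nystrom KRR on each shard
    with SHARED landmarks, and average the local predictions."""
    rng = np.random.default_rng(seed)
    landmarks = Xtrain[rng.choice(len(Xtrain), m, replace=False)]
    shards = np.array_split(rng.permutation(len(Xtrain)), p)
    Ktest = rbf(Xtest, landmarks, gamma)
    preds = [Ktest @ nystrom_krr_fit(Xtrain[i], ytrain[i], landmarks, lam, gamma)
             for i in shards]
    return np.mean(preds, axis=0)
```

Averaging $p$ local Nyström estimators is what keeps per-worker time and memory low; the paper's analysis concerns how large $p$ can be while this average still matches the exact-KRR learning rate.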

**Paper Introduction**

**Paper Title: Estimating α-Rank from a Few Entries with Low-Rank Matrix Completion**

**Authors: Yali Du, Xue Yan, Xu Chen, Haifeng Zhang, Jun Wang**

**Paper Overview:** Multi-agent evaluation aims to assess an agent's strategy on the basis of its interactions with others. Existing methods such as α-Rank and its approximations still require exhaustively comparing all pairs of joint strategies for an accurate ranking, which is computationally expensive in practice. In this paper, we aim to reduce the number of pairwise comparisons needed to recover a satisfactory ranking for n players in two-player meta-games, by exploiting the fact that agents with similar skills tend to achieve similar payoffs against others. Two situations are considered: one in which we can obtain the true payoffs (noise-free evaluation), and one in which we can only access noisy payoff observations (noisy evaluation). Based on these formulations, we design two novel algorithms, for noise-free and noisy evaluations respectively, that leverage low-rank matrix completion. For both settings, we show theoretically that O(nr log n) comparisons (where n is the number of agents and r is the rank of the payoff matrix) suffice to achieve sufficiently accurate evaluation. Empirical results on evaluating players in three synthetic games and twelve real-world games demonstrate that payoff evaluation from a few entries can match the performance of algorithms that know the complete payoff matrix.
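To illustrate the core idea of filling in a low-rank payoff matrix from a few observed match-ups, here is a minimal hard-impute sketch using iterative truncated SVD. This is an illustrative stand-in, not the paper's algorithms (which come with guarantees and handle noise); `complete_low_rank` is a hypothetical name.

```python
import numpy as np

def complete_low_rank(M_obs, mask, rank, iters=1000):
    """Estimate a full payoff matrix from partially observed entries.

    Alternates between (1) projecting the current estimate onto the set of
    rank-r matrices via truncated SVD and (2) restoring the observed entries.
    M_obs: payoff matrix (arbitrary values where mask is False).
    mask:  boolean array, True where the payoff was actually measured.
    """
    X = np.where(mask, M_obs, 0.0)          # start with zeros in the gaps
    X_low = X
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X_low = (U[:, :rank] * s[:rank]) @ Vt[:rank]  # best rank-r approximation
        X = np.where(mask, M_obs, X_low)              # keep observed payoffs
    return X_low
```

Once the payoff matrix is completed, α-Rank (or any other ranking rule) can be run on the estimate as if all pairwise evaluations had been carried out.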

**Paper Introduction**

**Paper Title: Graph Neural Networks Inspired by Classical Iterative Algorithms**

**Authors: Yongyi Yang, Tang Liu, Yangkun Wang, Jinjing Zhou, Quan Gan, Zhewei Wei, Zheng Zhang, Zengfeng Huang, David Wipf**

**Paper Overview:** Despite the recent success of graph neural networks (GNN), common architectures often exhibit significant limitations, including oversmoothing, difficulty capturing long-range dependencies, and sensitivity to spurious edges. To at least partially address these issues within a simple, transparent framework, we consider a new family of GNN layers designed to mimic and integrate the update rules of two classical iterative algorithms, namely, proximal gradient descent and iterative reweighted least squares (IRLS). The former defines an extensible base GNN architecture that is immune to oversmoothing while nonetheless capturing long-range dependencies by allowing arbitrary propagation steps. In contrast, the latter produces a novel attention mechanism that is explicitly anchored to an underlying end-to-end energy function, contributing stability with respect to edge uncertainty. When combined, we obtain an extremely simple yet robust model that we evaluate across disparate scenarios, including standardized benchmarks, adversarially perturbed graphs, graphs with heterophily, and graphs involving long-range dependencies. We compare against SOTA GNN approaches explicitly designed for each respective task, achieving competitive or superior node classification accuracy.
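As a toy illustration of the proximal-gradient view, each propagation layer can be read as one gradient step on an energy that trades feature fidelity against graph smoothness. This is a simplified sketch, not the paper's architecture: learned weights, nonlinearities, and the IRLS-based attention are omitted, and `prox_gnn_propagate` is a hypothetical name.

```python
import numpy as np

def prox_gnn_propagate(X, A, lam=1.0, lr=0.1, steps=200):
    """Run `steps` propagation layers, each one gradient step on
    E(Y) = ||Y - X||_F^2 + lam * tr(Y^T L Y), with L the graph Laplacian.

    The fixed point Y* = (I + lam * L)^{-1} X retains a skip connection to
    the input features X, so stacking arbitrarily many layers can capture
    long-range structure without collapsing to a constant (no oversmoothing).
    """
    L = np.diag(A.sum(axis=1)) - A      # unnormalized graph Laplacian
    Y = X.copy()
    for _ in range(steps):
        grad = 2.0 * (Y - X) + 2.0 * lam * (L @ Y)   # exact gradient of E(Y)
        Y = Y - lr * grad
    return Y
```

Because the iterations descend a fixed energy, depth here plays the role of optimization steps rather than extra smoothing: more layers simply bring `Y` closer to the energy's minimizer.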