
讲座信息
时间:2025年11月27日(周四)10:00 – 11:30
地点:中国人民大学立德楼1826报告厅
报告人:Mikkel Thorup 丹麦哥本哈根大学教授、丹麦皇家科学院院士
报告题目:
Hash functions: bridging the gap from theory to practice
报告摘要:
Hash functions are used everywhere in computing, e.g., hash tables, sketching, dimensionality reduction, sampling, and estimation. Abstractly, we like to think of hashing as fully-random hashing, assigning independent hash values to every possible key, but essentially this requires us to store the hash values for all keys, which is unrealistic for most key universes, e.g., 64-bit keys. In practice, we have to settle for implementable hash functions, and often practitioners settle for implementations that are too simple in that the algorithms ends up working only for sufficiently random input. However, the real world is full of structured/non-random input. For simplistic hash functions that often work very well in tests with random input, error events that should never happen in practice happen with way too high probability.
Over the last decade there has been major developments in simple-to-implement tabulation-based hash functions offering strong theoretical guarantees, so as to support fundamental properties such as Chernoff bounds, Sparse Johnson-Lindenstrauss transforms, and fully-random hashing on a given set w.h.p. etc. I will discuss some of the principles of these developments and offer insights on how far we can bridge from theory (assuming fully random hash functions) to practice (needing something that can actually implemented efficiently).
报告人:Mikkel Thorup 丹麦哥本哈根大学教授、丹麦皇家科学院院士
报告人简介:
Mikkel Thorup has a D.Phil. from Oxford University from 1993. From 1993 to 1998 he was at the University of Copenhagen. From 1998 to 2013 he was at AT&T Labs-Research. Since 2013 he has been back as Professor at the University of Copenhagen. He is currently a VILLUM Investigator heading Center for Basic Algorithms Research Copenhagen (BARC). Mikkel is a Fellow of the ACM and of AT&T, and a Member of the Royal Danish Academy of Sciences and Letters. He is co-winner of the 2011 MAA Robbins Award in mathematics and winner of the 2015 Villum Kann Rasmussen Award for Technical and Scientific Research, which is Denmark's biggest individual prize for research. More recently he was co-winner of the 2021 AMS-MOS Fulkerson Prize and an ACM STOC 20-year test of time award. Mikkel's main work is in algorithms and data structures, where he has worked on both upper and lower bounds.
报告摘要:
哈希函数在哈希表、摘要、降维、采样、近似估计等问题中被频繁使用,其在理论分析中通常被假设为是完全随机的(fully random),即其可以将各个输入数据独立均匀地映射到输出空间中,但这种哈希函数结构复杂,需要指数级别的空间才可以构建,这在实际场景中很难实现。在实际使用时人们普遍采用结构简单的哈希函数,但这些哈希函数大多因为结构过于简单而仅在输入数据足够随机时才具有良好的理论性质。但实际场景中的输入数据普遍存在内部依赖关系而不具有很高的随机性,这导致这些结构简单的哈希函数大多在随机性较高的测试数据上表现良好,但在实际场景中的表现却差强人意。哈希函数的理论研究与实际使用之间存在背离之处。
过往十余年内,既往研究工作针对上述问题取得了诸多进展,表格哈希(tabulation hashing)是其中的代表方法,其实现简单且具有良好的理论性质,满足Chernoff bounds、稀疏Johnson-Lindenstrauss变换等,在给定集合上的数学性质与理想的完全随机哈希高概率趋同。本次报告将对相关技术作以介绍,探讨调和哈希函数理论研究与实际应用间背离之处的方法。
讲者简介:
Mikkel Thorup于1993年博士毕业于英国牛津大学,1993-1998年在丹麦哥本哈根大学工作,1998-2013年在美国AT&T研究院工作。2013年起至今在丹麦哥本哈根大学任教授,同时担任BARC基础算法研究中心主任。Mikkel是ACM Fellow、AT&T Fellow,是丹麦皇家科学院院士。他于2011年获MAA Robbins Award in mathematics,于2015年获Villum Kann Rasmussen Award(丹麦最大的个人科研奖励),于2021年获AMS-MOS Fulkerson Prize和STOC 20年时间检验奖(ACM STOC 20-year test of time award)。Mikkel的研究方向是算法与数据结构,关注复杂度上、下界的提升。
检测到您当前使用浏览器版本过于老旧,会导致无法正常浏览网站;请您使用电脑里的其他浏览器如:360、QQ、搜狗浏览器的速模式浏览,或者使用谷歌、火狐等浏览器。
下载Firefox