Gaoling School of Artificial Intelligence, Academic Frontier Lecture Series, No. 42 overall (No. 23 of 2022)
Intelligent Social Governance Interdisciplinary Platform, AI+X Lecture, No. 6 of 2022
Title: Connecting Visual Representation and Robot Manipulation
Time: Wednesday, November 16, 2022, 14:00-15:30
Venue: Offline: 文化大厦 2101; Online: Tencent Meeting ID 779-366-651
Abstract
Visual pre-training with large-scale real-world data has made great progress in recent years. However, effective recipes for visual pre-training tailored to robot manipulation have yet to be established. In this talk, I present two works contributing to this topic. First, I present iBOT, a self-supervised visual representation learning method. iBOT performs masked image modeling via self-distillation, achieving state-of-the-art results on most downstream tasks related to semantic reasoning. Second, I present a visual pre-training scheme for robot manipulation (Vi-PRoM). In Vi-PRoM, we investigate the effects of visual pre-training strategies on robot manipulation tasks from three fundamental perspectives: datasets, model architectures, and training methods. Vi-PRoM employs contrastive learning, visual semantics learning, and temporal dynamics learning to facilitate robot manipulation tasks in the real world.
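To make the phrase "masked image modeling via self-distillation" concrete, here is a minimal NumPy sketch of the general idea: a student network sees an image with some patch tokens masked out, a momentum (EMA) teacher sees the full image, and the student is trained to match the teacher's token distributions at the masked positions. This is an illustrative toy, not the actual iBOT implementation; all dimensions, the linear "encoders", and the variable names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Toy setup (illustrative sizes): 16 patch tokens, 8-dim embeddings,
# 32-way token distributions; "encoders" are single linear maps.
num_patches, dim, out_dim = 16, 8, 32
W_student = rng.normal(size=(dim, out_dim))
W_teacher = W_student.copy()           # teacher starts as a copy of the student
mask_token = np.zeros(dim)             # stand-in for a learnable [MASK] embedding

patches = rng.normal(size=(num_patches, dim))
mask = rng.random(num_patches) < 0.4   # randomly mask ~40% of patches
mask[0] = True                         # ensure at least one masked position

# Student sees the masked input; teacher sees the full image.
student_in = np.where(mask[:, None], mask_token, patches)
student_probs = softmax(student_in @ W_student)
teacher_probs = softmax(patches @ W_teacher)   # soft targets ("online tokenizer")

# Cross-entropy at masked positions only: the student predicts the
# teacher's token distribution for each masked patch (self-distillation).
loss = -(teacher_probs[mask] * np.log(student_probs[mask] + 1e-9)).sum(axis=-1).mean()

# The teacher is updated as an exponential moving average of the student;
# no gradients flow through the teacher.
momentum = 0.996
W_teacher = momentum * W_teacher + (1 - momentum) * W_student
print(float(loss))
```

In a real system the linear maps would be Vision Transformers and the loss would be minimized by gradient descent on the student only, but the masking, soft-target cross-entropy, and EMA teacher update above are the core ingredients the abstract refers to.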
Speaker
Tao Kong (孔涛) is a Senior Researcher at ByteDance AI Lab. He received his Ph.D. degree from Tsinghua University, advised by Fuchun Sun, and was a visiting scholar at the University of Pennsylvania, working with Jianbo Shi. His research mission is to develop robotic techniques and systems for intelligent perception and interaction in the real world. Dr. Kong has published over 30 papers at top-tier AI and robotics conferences and journals, which have received over 6,000 citations to date. He is the recipient of the CAAI Excellent Doctoral Dissertation Nomination Award (2020), the IROS Robotic Grasping and Manipulation Competition Winner Award (2016), and the Habitat ObjectNav Challenge Winner Award (2022).