
Academic Seminars

BDAI Key Laboratory Graduate Salon, Session 19: COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval
Date: 2022-03-09


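The graduate salon of the Beijing Key Laboratory of Big Data Management and Analysis Methods (BDAI) is held regularly and is organized by faculty and students of the Gaoling School of Artificial Intelligence, Renmin University of China. In this week's seminar, Haoyu Lu (卢浩宇), a PhD student advised by Prof. Zhiwu Lu (卢志武), and Juexiao Zhou (周觉晓), a student advised by Chair Professor Xin Gao (高欣), will each present their research. All students are welcome to join the discussion!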


Talk title: COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval

Speaker: Haoyu Lu (卢浩宇), first-year PhD student

Advisor: Zhiwu Lu (卢志武)

Research interests: multimodal pre-training models, multimodal retrieval

Abstract: Large-scale single-stream pre-training has shown dramatic performance in image-text retrieval. Regrettably, it faces low inference efficiency due to heavy attention layers. Recently, two-stream methods like CLIP and ALIGN with high inference efficiency have also shown promising performance; however, they only consider instance-level alignment between the two streams (thus there is still room for improvement). To overcome these limitations, we propose a novel COllaborative Two-Stream vision-language pre-training model termed COTS for image-text retrieval by enhancing cross-modal interaction. In addition to instance-level alignment via momentum contrastive learning, we leverage two extra levels of cross-modal interactions in our COTS: (1) Token-level interaction: a masked vision-language modeling (MVLM) learning objective is devised without using a cross-stream network module, where a variational autoencoder is imposed on the visual encoder to generate visual tokens for each image. (2) Task-level interaction: a KL-alignment learning objective is devised between text-to-image and image-to-text retrieval tasks, where the probability distribution per task is computed with the negative queues in momentum contrastive learning. Under a fair comparison setting, our COTS achieves the highest performance among all two-stream methods and comparable performance (but 10,800x faster in inference) w.r.t. the latest single-stream methods. Importantly, our COTS is also applicable to text-to-video retrieval, yielding a new state of the art on the widely used MSR-VTT dataset.
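As a rough illustration of the instance-level and task-level objectives mentioned in the abstract, the following minimal sketch computes a contrastive (InfoNCE-style) loss and a symmetric KL term between the image-to-text and text-to-image retrieval distributions for one image-text pair. It is not the authors' implementation: the function names, the temperature value, the use of a symmetric KL, and the convention that index 0 holds the positive pair (followed by negatives from a queue) are all assumptions made for this example.

```python
import math


def softmax(scores, temperature=0.07):
    """Turn similarity scores into a probability distribution (temperature is an assumed value)."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def info_nce(scores, temperature=0.07):
    """Instance-level contrastive loss; by assumption, index 0 is the positive pair."""
    return -math.log(softmax(scores, temperature)[0])


def task_alignment_loss(i2t_scores, t2i_scores, temperature=0.07):
    """Symmetric KL between the two retrieval distributions (an assumed form of KL alignment)."""
    p = softmax(i2t_scores, temperature)
    q = softmax(t2i_scores, temperature)
    return kl_divergence(p, q) + kl_divergence(q, p)


# Hypothetical similarity scores: positive pair first, then queued negatives.
i2t = [0.9, 0.2, 0.1, 0.3]
t2i = [0.8, 0.1, 0.4, 0.2]
total_loss = info_nce(i2t) + info_nce(t2i) + task_alignment_loss(i2t, t2i)
```

The alignment term is zero when both retrieval directions rank the candidates identically and grows as the two distributions diverge.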

Talk title: PPML-Omics: a Privacy-Preserving federated Machine Learning system protects patients’ privacy from omic data

Speaker: Juexiao Zhou

Advisor: Prof. Xin Gao

Research interests: applications of deep learning in bioinformatics and medical imaging, together with privacy issues in intelligent healthcare.

Abstract: Individual privacy in biology and biomedicine is emerging as a big concern with the development of biomedical data science in recent years. A deluge of genetic data from millions of individuals has been generated by massive research projects over the past few decades, such as the Cancer Genome Atlas (TCGA), the 100,000 genome project and the Earth BioGenome Project (EBP), using high-throughput sequencing platforms. Those datasets may lead to leakage of genetic information and raise privacy concerns about ethical problems such as genetic discrimination. Modern machine learning models for various omic data analysis tasks give rise to threats of privacy leakage for the patients involved in those datasets. Despite the advances in different privacy technologies, existing methods tend to introduce too much noise, which hampers model accuracy and usefulness. Thus, we built a secure and privacy-preserving machine learning (PPML) system by combining federated learning (FL), differential privacy (DP) and a shuffling mechanism. We applied this system to analyze data from three sequencing technologies, and addressed the privacy concern in three major tasks of omic data, namely cancer classification with bulk RNA-seq, clustering with single-cell RNA-seq, and the integration of spatial gene expression and tumour morphology with spatial transcriptomics, under three representative deep learning models. We also examined privacy breaches in depth through privacy attack experiments and demonstrated that our PPML-Omics system could protect patients’ privacy. In each of these applications, our system was able to outperform state-of-the-art systems under the same level of privacy guarantee, demonstrating the versatility of the system in simultaneously balancing the privacy-preserving capability and utility in omic data analysis. Furthermore, we gave the theoretical proof of the privacy-preserving capability of our system, suggesting the first mathematically guaranteed model with robust and generalizable empirical performance.
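To make the combination of federated learning, differential privacy and shuffling more concrete, here is a minimal sketch of one aggregation round. It is not the PPML-Omics implementation: the function names, the clipping bound, and the noise scale are hypothetical, and the noise is not calibrated to any particular privacy budget. Each client clips its gradient and adds Gaussian noise locally, the updates are shuffled so the server cannot link an update to a client by position, and the server averages them.

```python
import math
import random


def clip_l2(vec, max_norm):
    """Scale a gradient so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in vec]


def privatize(vec, max_norm, noise_std, rng):
    """Client-side step: clip, then add Gaussian noise (noise_std is illustrative only)."""
    clipped = clip_l2(vec, max_norm)
    return [v + rng.gauss(0.0, noise_std) for v in clipped]


def shuffle_updates(updates, rng):
    """Shuffler step: return the updates in random order, hiding which client sent which."""
    shuffled = list(updates)
    rng.shuffle(shuffled)
    return shuffled


def federated_round(client_grads, max_norm=1.0, noise_std=0.1, seed=0):
    """Server-side step: average the noisy, shuffled client updates."""
    rng = random.Random(seed)
    noisy = [privatize(g, max_norm, noise_std, rng) for g in client_grads]
    anonymous = shuffle_updates(noisy, rng)
    count = len(anonymous)
    dim = len(anonymous[0])
    return [sum(update[i] for update in anonymous) / count for i in range(dim)]


# Hypothetical gradients from three clients for a two-parameter model.
averaged = federated_round([[3.0, 4.0], [0.5, -0.2], [-1.0, 2.0]])
```

Because averaging does not depend on order, shuffling leaves the aggregate unchanged while removing the link between an update and its sender.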

All students are welcome to attend!
