
BDAI Key Laboratory Graduate Student Salon No. 30: JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding
Date: 2022-10-12

The graduate student salon of the Beijing Key Laboratory of Big Data Management and Analysis Methods (BDAI) is held regularly, organized by faculty and students of the Gaoling School of Artificial Intelligence, Renmin University of China. At the October 12 session, Kun Zhou, a PhD student advised by Prof. Ji-Rong Wen and Prof. Xin Zhao, and Haonan Chen, a Master's student advised by Prof. Zhicheng Dou, will each present their own research. Students are warmly welcome to participate!


Title: JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding

Speaker: Kun Zhou, third-year PhD student; Advisors: Ji-Rong Wen, Xin Zhao

Research interests: pre-trained language models, information retrieval

Abstract: We aim to advance the mathematical intelligence of machines by presenting the first Chinese mathematical pre-trained language model (PLM) for effectively understanding and representing mathematical problems. Unlike the texts in standard NLP tasks, mathematical texts are difficult to understand, since they involve mathematical terminology, symbols, and formulas in the problem statement, and solving mathematical problems typically requires complex mathematical logic and background knowledge. Considering the complex nature of mathematical texts, we design a novel curriculum pre-training approach for improving the learning of mathematical PLMs, consisting of both basic and advanced courses. Specifically, we first perform token-level pre-training based on a position-biased masking strategy, and then design logic-based pre-training tasks that aim to recover shuffled sentences and shuffled formulas, respectively. Finally, we introduce a more difficult pre-training task that requires the PLM to detect and correct errors in its own generated solutions. We conduct extensive experiments on offline evaluation (covering nine math-related tasks) and an online A/B test. Experimental results demonstrate the effectiveness of our approach compared with a number of competitive baselines.
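To make the first (basic) course concrete, here is a minimal Python sketch of a position-biased masking strategy: tokens later in the text are masked with higher probability than earlier ones, on the intuition that later steps depend on earlier reasoning. The linear schedule, the base_rate parameter, and the function name are illustrative assumptions, not the exact formulation used in JiuZhang.

```python
import random

def position_biased_mask(tokens, base_rate=0.15, mask_token="[MASK]"):
    """Mask each token with a probability that grows linearly with its
    position, averaging `base_rate` over the sequence. Illustrative
    sketch only; the actual JiuZhang schedule may differ."""
    n = len(tokens)
    masked, labels = [], []
    for i, tok in enumerate(tokens):
        # Position weight ranges over (0, 2) and averages 1, so the
        # expected masking rate over the whole sequence stays base_rate.
        p = base_rate * 2.0 * (i + 1) / (n + 1)
        if random.random() < p:
            masked.append(mask_token)   # hide the token
            labels.append(tok)          # the model must recover it
        else:
            masked.append(tok)
            labels.append(None)         # not predicted
    return masked, labels

# Example on a tokenized math problem statement:
tokens = "设 x 满足 2 x + 3 = 7 , 求 x 的 值".split()
print(position_biased_mask(tokens))
```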

Title: Enhancing User Behavior Sequence Modeling by Generative Tasks for Session Search

Speaker: Haonan Chen, second-year Master's student; Advisor: Zhicheng Dou

Research interests: information retrieval, session search

Abstract: Users' search tasks have become increasingly complicated, requiring multiple queries and interactions with the results. Recent studies have demonstrated that modeling the historical user behaviors in a session can help understand the current search intent. Existing context-aware ranking models primarily encode the current session sequence (from the first behavior to the current query) and compute the ranking score using the high-level representations. However, the current session sequence usually contains some noise (behaviors that are useless for inferring the search intent), which may degrade the quality of the encoded representations. To improve the encoding of the current user behavior sequence, we propose to exploit a decoder together with information from future sequences and a supplemental query. Specifically, we design three generative tasks that can help the encoder infer the actual search intent: (1) predicting future queries, (2) predicting future clicked documents, and (3) predicting a supplemental query. We jointly learn the ranking task with these generative tasks using an encoder-decoder architecture. Extensive experiments on two public search logs demonstrate that our model outperforms all existing baselines and that the designed generative tasks indeed help the ranking task. Additional experiments also show that our approach can be easily applied to various Transformer-based encoder-decoder models and improve their performance.
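As a rough sketch of how a ranking task can be learned jointly with such generative tasks, the Python code below uses an encoder-decoder Transformer: the encoder representation of the session scores a candidate document, while the decoder predicts a target sequence (e.g., a future query), and the two losses are summed. All module names, the loss weight alpha, and the architectural details are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointSessionModel(nn.Module):
    """Hypothetical encoder-decoder model: the encoder output drives the
    ranking score; the decoder is supervised by a generative task."""
    def __init__(self, vocab_size=32000, d_model=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encdec = nn.Transformer(d_model=d_model, batch_first=True)
        self.rank_head = nn.Linear(d_model, 1)         # relevance score
        self.lm_head = nn.Linear(d_model, vocab_size)  # generation logits

    def forward(self, session_ids, target_ids):
        src = self.embed(session_ids)             # current session + candidate
        tgt = self.embed(target_ids)              # e.g. future query tokens
        memory = self.encdec.encoder(src)         # encoder hidden states
        causal = nn.Transformer.generate_square_subsequent_mask(
            tgt.size(1)).to(tgt.device)           # left-to-right decoding
        dec = self.encdec.decoder(tgt, memory, tgt_mask=causal)
        score = self.rank_head(memory[:, 0])      # score from first position
        return score, self.lm_head(dec)

def joint_loss(score, label, gen_logits, gen_targets, alpha=0.5):
    # Ranking loss (binary relevance) plus generative loss; alpha is an
    # assumed weighting, not a value reported by the authors.
    rank_loss = F.binary_cross_entropy_with_logits(score.squeeze(-1),
                                                   label.float())
    gen_loss = F.cross_entropy(gen_logits.reshape(-1, gen_logits.size(-1)),
                               gen_targets.reshape(-1))
    return rank_loss + alpha * gen_loss
```

Note that under this design only the encoder and the ranking head are needed at inference time, so the decoder and its generative tasks improve the encoded representations without adding online serving cost.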
