Speaker: Manling Li (李曼玲)
Assistant Professor, Department of Computer Science, Northwestern University
Postdoctoral Researcher, Stanford University Natural Language Processing Group
Time:
October 16 (Monday), 10:00-11:30
Location:
Tencent Meeting: 880-330-228
Talk title:
Toward Factuality in Information Access: Multimodal Factual Knowledge Acquisition
Abstract:
Recent years have witnessed great success in multimodal foundation models. However, although such models achieve decent scores on various benchmarks, they still understand images as bags of words: they use object understanding as a shortcut but lack the ability to capture abstract semantics such as verbs. To learn physical world knowledge, we first categorize it by its temporal dynamics (static -> dynamic) and by its horizon (short/fast thinking -> long/slow thinking). My research aims to bring this deep factual knowledge view to the multimodal world. Such a transformation poses significant challenges: (1) understanding multimodal semantic structures that are abstract (such as events and semantic roles of objects): I will present our solution of zero-shot cross-modal transfer, an effective way to inject event-level knowledge into vision-language foundation models; (2) understanding long-horizon temporal dynamics: I will introduce typical ways to handle long-horizon reasoning, which empower machines to capture complex temporal patterns; (3) ensuring factuality: I will briefly analyze the causes of hallucinations and potential ways to ensure factuality via knowledge-driven methods, with example applications such as meeting summarization, timeline generation, and question answering. I will then lay out how I plan to promote factuality and truthfulness in multimodal information access through a structured knowledge view that is easily explainable, highly compositional, and capable of long-horizon reasoning.
Speaker bio:
Manling Li is an Assistant Professor at Northwestern University (starting full-time in the fall) and a postdoc at Stanford University. She obtained her PhD in Computer Science from the University of Illinois Urbana-Champaign. Her research interests lie in natural language processing, especially its interaction with multiple modalities including images, videos, speech, and robotics. Her work on multimodal knowledge extraction won the ACL'20 Best Demo Paper Award, and her work on scientific information extraction from COVID literature won the NAACL'21 Best Demo Paper Award. She received a Microsoft Research PhD Fellowship in 2021 and was named an EECS Rising Star in 2022, among other honors. She led 19 students in developing the UIUC information extraction system, which ranked 1st in the NIST SM-KBP evaluation in 2019 and 2020. She serves as an Area Chair for ACL and EMNLP and has delivered tutorials on event-centric multimodal knowledge at ACL'21, AAAI'21, NAACL'22, CVPR'23, and other venues. Additional information is available at https://limanling.github.io/.