SMP2020 First "Top Conference" Forum

 

Forum time: September 5, 2020, 13:30-15:00 and September 6, 2020, 10:30-12:30

Forum overview: The first "Top Conference" Forum will be held on September 5-6, 2020. Nine researchers who published papers at ICML, KDD, WWW, SIGIR, or IJCAI in 2020 have been invited to present their work and to share their insights and experience in producing strong research. They range from a researcher who finished a PhD one year ago to senior PhD students, junior PhD students, and master's students, so they will share their research and experience from different stages of training.

 

Forum Chairs


Chair: Tieyun Qian (钱铁云)
Bio: Tieyun Qian is a professor and PhD supervisor at the School of Computer Science, Wuhan University. Her main research areas are Web mining and natural language processing. She has published more than 70 papers in top international conferences and journals such as ACL, EMNLP, AAAI, SIGIR, CIKM, TOIS, and TKDD. She serves as a reviewer for journals including TKDE, TWEB, TKDD, TASL, and INS; as a program committee member for international conferences such as IJCAI, AAAI, WWW, CIKM, ICDM, APWEB, and WAIM; and as short-paper, publication, organization, or forum chair for several major conferences on databases, knowledge graphs, information retrieval, and social media processing.

 

 

 

 

Chair: Jing Zhang (张静)

Bio: Jing Zhang is an associate professor in the Department of Computer Science, School of Information, Renmin University of China. Her research focuses on data mining, in particular network data mining grounded in knowledge graphs and social networks. She has published 45 papers, including more than ten in top international conferences and journals such as KDD, TKDE, TOIS, IJCAI, and AAAI, with over 3,000 Google Scholar citations. In recent years she has served on the program committees of top conferences such as SIGKDD'20, WWW'20, and SIGKDD'19, as a reviewer for journals including TKDE, TOIS, TKDD, and Science China, and as an Associate Editor of AI Open.

 


 

Session 1: September 5, 2020, 13:30-15:00

 

  • Guest 1: Tong Chen
  • Bio: Tong Chen received his PhD in Computer Science from The University of Queensland in 2020, under the supervision of Dr. Hongzhi Yin and Prof. Xue Li. He is currently a postdoctoral research fellow at The University of Queensland. His research interests include data mining, machine learning, recommender systems, and predictive analytics, and his work has been published in top venues such as SIGIR, SIGKDD, ICDE, WWW, ICDM, IJCAI, AAAI, CIKM, TOIS, and TKDE.


  • Talk title:

Try This Instead: Personalized and Interpretable Substitute Recommendation

Accepted at SIGIR 2020

Abstract:

As a fundamental yet significant process in personalized recommendation, candidate generation and suggestion effectively help users spot the most suitable items for them. Consequently, identifying substitutable items that are interchangeable opens up new opportunities to refine the quality of generated candidates. When a user is browsing a specific type of product (e.g., a laptop) to buy, the accurate recommendation of substitutes (e.g., better equipped laptops) can provide the user with more suitable options to choose from, thus substantially increasing the chance of a successful purchase. However, in the emerging research on substitute recommendation, existing methods merely treat this problem as mining pairwise item relationships, without considering users' personal preferences. Moreover, the substitutable relationships are implicitly identified through the learned latent representations of items, which leads to uninterpretable recommendation results.

In this paper, we propose attribute-aware collaborative filtering (A2CF) to perform substitute recommendation by addressing issues from both personalization and interpretability perspectives. In A2CF, instead of directly modelling user-item interactions, we extract explicit and polarized item attributes from user reviews with sentiment analysis, whereafter the representations of attributes, users, and items are simultaneously learned. Then, by treating attributes as the bridge between users and items, we can thoroughly model the user-item preferences (i.e., personalization) and item-item relationships (i.e., substitution) for recommendation. In addition, A2CF is capable of generating intuitive interpretations by analyzing which attributes a user currently cares the most and comparing the recommended substitutes with her/his currently browsed items at an attribute level. The recommendation effectiveness and interpretation quality of A2CF are further demonstrated via extensive experiments on three real-life datasets.
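The attribute-as-bridge idea can be illustrated with a toy scoring rule. This is my own simplification of A2CF, with a made-up helper `substitute_score` and hand-crafted attribute vectors: a substitute is weighted by how much the user cares about each attribute on which it beats the currently browsed item, and the same per-attribute gains double as the explanation.

```python
import numpy as np

def substitute_score(user_attr_pref, item_attrs, browsed_attrs):
    """Score a candidate substitute against the currently browsed item.

    user_attr_pref: (k,) how much the user cares about each attribute
    item_attrs / browsed_attrs: (k,) attribute quality of the two items
    Returns the score and the per-attribute gains used to explain it.
    """
    gains = item_attrs - browsed_attrs          # attribute-level comparison
    score = float(user_attr_pref @ gains)       # weight gains by user's cares
    order = np.argsort(-user_attr_pref)         # attributes the user cares most about
    explanation = [(int(a), float(gains[a])) for a in order[:3]]
    return score, explanation
```

A candidate that improves on the attributes the user cares most about gets a positive score, and the explanation lists those attributes explicitly, mirroring the abstract's attribute-level interpretation.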

 

  • Guest 2: Yukuo Cen (岑宇阔)
  • Bio: Yukuo Cen is a first-year PhD student in the Department of Computer Science, Tsinghua University, advised by Prof. Jie Tang. His research focuses on network representation learning and recommender systems; he has published three first-author papers at KDD and in TKDE.

Talk title: A Controllable Multi-Interest Framework for Recommendation

Accepted at KDD 2020

Abstract: Neural network models are now widely used in recommender systems. These neural recommendation algorithms usually learn a single user representation vector from the user's behavior sequence, but such a unified representation often cannot capture the multiple distinct interests a user has over a period of time. We propose a controllable multi-interest framework to address this. A multi-interest extraction module captures the user's different interests from the click sequence, and each interest is then used to retrieve matching items. An aggregation module merges the items retrieved by the different interests into the final candidate set of recommendations for downstream tasks.
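The aggregation step described above can be sketched as a greedy trade-off between relevance and diversity. This is a toy illustration under my own assumptions (the hypothetical `aggregate_candidates` helper and item categories as the diversity signal), not the paper's actual module:

```python
def aggregate_candidates(interest_scores, item_cate, k, lam=0.5):
    """Greedily merge items retrieved by several interests into k candidates.

    interest_scores: item -> best relevance score over all interests
    item_cate: item -> category, used here as a crude diversity signal
    lam: controllability knob; 0 = pure relevance, larger values favour
         categories not yet represented in the result.
    """
    chosen, seen_cates = [], set()
    candidates = dict(interest_scores)
    for _ in range(min(k, len(candidates))):
        # Bonus for items whose category has not been picked yet.
        best = max(candidates,
                   key=lambda i: candidates[i] + lam * (item_cate[i] not in seen_cates))
        chosen.append(best)
        seen_cates.add(item_cate[best])
        del candidates[best]
    return chosen
```

With `lam=0` this degenerates to plain top-k by relevance; raising `lam` pulls in items from interests not yet represented, which is the "controllable" knob the framework's name refers to.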

 

 

 

  • Guest 3: Gaole He (何高乐)
  • Bio: Gaole He is a master's student at the School of Information, Renmin University of China, advised by Prof. Xin Zhao (赵鑫) and Prof. Ji-Rong Wen (文继荣). His research focuses on knowledge graphs and recommender systems.

Talk title:

Mining latent entity preferences from user-item interaction data via adversarial learning to improve knowledge graph completion

Accepted at WWW 2020

Abstract:

Knowledge graph completion aims to automatically infer the missing facts in a knowledge graph. We observe that entities in a knowledge graph can often be aligned with actual items on large-scale application platforms (e.g., e-commerce sites). Motivated by this, we propose to leverage rich user-item interaction data to improve knowledge graph completion. Because knowledge graphs and user-item interaction data differ substantially in nature, naively mixing the two kinds of data can hurt model performance. To address this challenge, we propose an adversarial-learning-based method that uses user-item interaction data to improve knowledge graph completion.

 

 

  • Guest 4: Yang Zhang (张洋)

Bio: Yang Zhang is a master's student (entering class of 2019) at the University of Science and Technology of China, advised by Prof. Xiangnan He.

Talk title: How to retrain a recommender system? A retraining method based on meta-learning

Abstract: Recommender systems typically need periodic retraining to keep the model up to date. To maintain performance, the model is usually retrained on historical data combined with newly collected data, since this captures both long-term and short-term user preferences; however, it also incurs large storage and computation costs. In this work, we study the retraining mechanism of recommender systems, a topic of high practical value that the research community has rarely examined. Our belief is that retraining on historical data should be unnecessary, because the model itself already encodes the information in that data. However, since the new data is small in scale and contains little long-term preference information, conventional training on the new data alone can easily lead to overfitting and forgetting. To tackle this, we draw on the idea of meta-learning and propose a new training method, SML, which aims to remove the use of historical data during retraining by directly transferring the knowledge obtained from past training. Specifically, we design a "transfer" component that combines the old model with the knowledge in the new data to produce a new model suited to future recommendation. To learn the transfer component well, we optimize "future performance", i.e., the recommendation accuracy evaluated on the next time period. We implement SML on matrix factorization (MF) and validate its effectiveness through experiments on two real-world datasets.
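A drastically simplified sketch of the idea: the hypothetical `transfer` function below stands in for SML's learned transfer component, using a plain linear model and a single mixing weight chosen on next-period data (the "future performance" objective). The real SML learns a neural transfer network over MF parameters; everything here is an illustrative assumption.

```python
import numpy as np

def transfer(w_old, X_new, y_new, X_future, y_future):
    """Combine the old model with a model fit only on new data, picking the
    mixing weight that minimizes squared loss on the next time period."""
    w_new = np.linalg.lstsq(X_new, y_new, rcond=None)[0]  # fit on new data only
    best_g, best_loss = 0.0, np.inf
    for g in np.linspace(0.0, 1.0, 21):   # "meta-optimize" future performance
        w = (1 - g) * w_old + g * w_new
        loss = np.mean((X_future @ w - y_future) ** 2)
        if loss < best_loss:
            best_g, best_loss = g, loss
    return (1 - best_g) * w_old + best_g * w_new, best_g
```

Note that no historical data is touched: only the old parameters, the new data, and the next-period signal are used, which is exactly the storage saving the abstract argues for.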

 

 

Session 2: September 6, 2020, 10:30-12:30

 

  • Guest 1: Ting Chen (陈挺)
  • Bio: Ting Chen (陈挺) is a research scientist on the Google Brain team. He joined Google after obtaining his PhD from the University of California, Los Angeles. His main research interest is representation learning.
  • Talk title: SimCLR: Closing the Gap Between Supervised and Self-Supervised Learning

Accepted at ICML 2020

Abstract: SimCLR is a simple framework for contrastive learning of visual representations. It simplifies recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. In order to understand what enables the contrastive prediction tasks to learn useful representations, we systematically study the major components of our framework. We show that (1) composition of data augmentations plays a critical role in defining effective predictive tasks, (2) introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and (3) contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning. By combining these findings, we are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet. A linear classifier trained on self-supervised representations learned by SimCLR achieves 76.5% top-1 accuracy, which is a 7% relative improvement over previous state-of-the-art, matching the performance of a supervised ResNet-50. When fine-tuned on only 1% of the labels, we achieve 85.8% top-5 accuracy, outperforming AlexNet with 100X fewer labels.
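The contrastive objective at the heart of SimCLR, the NT-Xent loss, can be sketched in a few lines of NumPy. This is a minimal illustration rather than the official implementation, and `nt_xent_loss` is my own helper name:

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss over 2N embeddings: two augmented views per image.

    For each embedding, the positive is the other view of the same image;
    the remaining 2N - 2 embeddings in the batch act as negatives.
    """
    z = np.concatenate([z1, z2], axis=0)                       # (2N, d)
    z /= np.linalg.norm(z, axis=1, keepdims=True)              # unit norm -> cosine sim
    sim = z @ z.T / temperature                                # scaled similarities
    n = z1.shape[0]
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])  # index of each positive
    np.fill_diagonal(sim, -np.inf)                             # exclude self-similarity
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()
```

The augmentation pipeline and the learnable nonlinear projection head that the abstract highlights sit in front of this loss; only the loss itself is shown here.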

 

  • Guest 2: Ziniu Hu

Bio: Ziniu Hu is a third-year CS PhD student at UCLA, advised by Prof. Yizhou Sun. His research focuses on developing machine learning methods that can efficiently and effectively handle graph-structured data, especially large-scale, multi-relational graphs.

Talk title: GPT-GNN: Generative Pre-Training of Graph Neural Networks

Accepted at KDD 2020

Abstract: Graph neural networks (GNNs) have been demonstrated to be successful in modeling graph-structured data. However, training GNNs requires abundant task-specific labeled data, which is often arduously expensive to obtain. One effective way to reduce the labeling effort is to pre-train an expressive GNN model on unlabeled data with self-supervision and then transfer the learned knowledge to downstream models.

In this work, we present the GPT-GNN framework, which initializes GNNs by generative pre-training. We introduce a self-supervised attributed-graph generation task to pre-train a GNN, factorizing the likelihood of graph generation into two components: 1) attribute generation and 2) edge generation. By modeling both components, GPT-GNN captures the inherent dependency between node attributes and graph structure during the generative process. Comprehensive experiments on a billion-scale academic graph and Amazon recommendation data demonstrate that GPT-GNN significantly outperforms state-of-the-art base GNN models without pre-training by up to 9.1% across different downstream tasks, and also outperforms other existing pre-training methods.
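To make the two-part factorization concrete, here is a toy scoring function. It is entirely my own construction (a least-squares linear map stands in for the attribute decoder, and a dot-product sigmoid for the edge scorer), not the paper's code; it only shows how attribute generation and edge generation contribute separate terms for a node:

```python
import numpy as np

def generative_pretrain_loss(h, X, A, v):
    """Score attribute generation and edge generation for node v.

    h: (n, d) node embeddings from some GNN encoder
    X: (n, f) node attributes; A: (n, n) binary adjacency matrix
    """
    # Attribute generation: decode v's attributes from its embedding.
    W = np.linalg.lstsq(h, X, rcond=None)[0]       # stand-in linear decoder
    attr_loss = np.mean((h[v] @ W - X[v]) ** 2)
    # Edge generation: v's embedding should score its true neighbours higher.
    probs = 1.0 / (1.0 + np.exp(-(h @ h[v])))
    eps = 1e-9
    edge_loss = -np.mean(A[:, v] * np.log(probs + eps)
                         + (1 - A[:, v]) * np.log(1 - probs + eps))
    return attr_loss + edge_loss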

 

  • Guest 3: Jiezhong Qiu (裘捷中)

Bio: Jiezhong Qiu is a fifth-year PhD student in the Department of Computer Science and Technology, Tsinghua University, advised by Prof. Jie Tang. His research interests include algorithm design and representation learning for graph data. His work on graph representation learning is the most-cited paper of WSDM'18.

Talk title:

GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training

Accepted at KDD 2020

Abstract:

Graph representation learning has attracted wide attention, but most existing methods learn and model graphs from one specific domain, and the resulting graph neural networks are hard to transfer. Recently, pre-training has achieved great success in several fields, significantly improving model performance on a range of downstream tasks. Inspired by work such as BERT, MoCo, and CPC, we study the pre-training of graph neural networks, aiming to learn universal graph topological features. We propose Graph Contrastive Coding (GCC), a pre-training framework for graph neural networks that uses contrastive learning to capture intrinsic, transferable graph structural information. This work, "GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training", has been accepted to the KDD 2020 research track.

 

  • Guest 4: Hanzhi Wang (王涵之)

Bio: Hanzhi Wang is a PhD student at Renmin University of China, advised by Prof. Zhewei Wei; she received her bachelor's degree from the School of Information, Renmin University of China in 2019. Her research interests are graph algorithms, mainly the efficient computation of node similarity and proximity measures. Her SIGMOD 2020 paper "Exact Single-Source SimRank Computation on Large Graphs" proposed ExactSim, the first algorithm to support exact single-source SimRank computation on large graphs; her KDD 2020 paper "Personalized PageRank to a Target Node, Revisited" proposed RBS, a single-target PPR algorithm with near-optimal computational complexity. Her research results have led to three Chinese invention patent applications.

 

Talk title: A single-target PPR algorithm with near-optimal time complexity

Accepted at KDD 2020

Abstract: Personalized PageRank (PPR) is a measure of node proximity in graphs that is widely used in graph mining and network analysis. This paper studies the single-target PPR computation problem and proposes RBS, an efficient single-target PPR algorithm that improves the time complexity of single-target PPR computation. Under a relative-error guarantee, RBS is the first to reduce the complexity of single-target PPR computation to the theoretical lower bound, i.e., it achieves near-optimal complexity. Moreover, because single-target PPR is so widely used, RBS can in turn speed up its applications, such as heavy-hitters PPR queries, single-source SimRank computation, and the PPR-matrix computation used in graph embedding and graph neural networks.
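For context, the classic deterministic baseline that RBS refines is backward (reverse) push, which estimates the PPR value to a fixed target from every source node. A minimal sketch follows; the graph encoding, threshold handling, and `backward_push` name are my own choices:

```python
import numpy as np

def backward_push(in_nbrs, out_deg, target, alpha=0.15, r_max=1e-6):
    """Estimate pi[s] ~= PPR(s -> target) for every source node s.

    in_nbrs: node -> list of in-neighbours; out_deg: out-degree per node.
    Invariant: pi(s, t) = reserve[s] + sum_v residual[v] * pi(s, v),
    so once all residuals are below r_max, reserve is the estimate.
    """
    n = len(out_deg)
    reserve = np.zeros(n)        # settled probability mass
    residual = np.zeros(n)       # mass still to be pushed backwards
    residual[target] = 1.0
    frontier = [target]
    while frontier:
        v = frontier.pop()
        if residual[v] <= r_max:            # stale frontier entry
            continue
        rv, residual[v] = residual[v], 0.0
        reserve[v] += alpha * rv            # walk terminates at v w.p. alpha
        for u in in_nbrs[v]:                # push the rest to in-neighbours
            old = residual[u]
            residual[u] += (1 - alpha) * rv / out_deg[u]
            if old <= r_max < residual[u]:  # u just crossed the threshold
                frontier.append(u)
    return reserve
```

Roughly speaking, RBS's improvement is to randomize these pushes so that small residuals are propagated only with the appropriate probability rather than deterministically, which is what brings the cost down to the theoretical lower bound under relative error.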

 

  • Guest 5: Jun Xu (徐俊)

Bio: Jun Xu is a PhD student at the Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology. His doctoral research focuses on knowledge-driven proactive open-domain dialogue.

Talk title: Enhancing Dialog Coherence with Event Graph Grounded Content Planning

Accepted at IJCAI 2020

Abstract: How to generate coherent and informative open-domain multi-turn dialogue is an important research problem. Prior work focuses on exploiting knowledge to enrich response generation but pays little attention to multi-turn coherence. In this paper, to improve multi-turn coherence, we propose using event chains to help plan the skeleton of a multi-turn dialogue. We first extract event chains from narrative text and connect multiple chains into a graph through their shared nodes. We then propose a general event-graph-grounded reinforcement learning framework for dialogue policy: it plans the high-level dialogue content (events) by walking over the event graph, and the planned content then guides response generation. In particular, we design a general multi-policy decision mechanism to produce multi-turn dialogues whose high-level content is well ordered and whose local transitions are smooth. Experimental results show that the proposed framework improves the multi-turn coherence and informativeness of dialogues.

 

Contact us

Yang Yang, Zhejiang University, yangya@zju.edu.cn

Yuxiao Dong, Microsoft Research Redmond, yuxdong@microsoft.com