2018-02-07

Abstract

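This project targets the rich and complex multimedia data of the Internet era. We developed a series of deep learning techniques with a sound theoretical basis and clear application prospects to mine hidden, useful information from massive, dynamic social media data. The research proposed in the project application proceeded smoothly and yielded satisfactory results. First, multimodal feature learning and representation based on deep neural networks is our core research topic: we exploited the aggregation effect of multimodal data to build a unified machine learning framework for social media, and we used dissipative structure theory to model the implicit learning mechanism of humans. On this basis, we developed more than ten algorithms tailored to the characteristics of different media and successfully applied them to incomplete image recognition on social media, automatic text summarization, audio affective computing, and multimedia data fusion; this work produced 15 papers in international journals and conferences. Second, for learning and representing latent user features, we developed new user behavior modeling algorithms that mine users' usage patterns and preferences, and we used brain wave analysis to improve the accuracy of user experience prediction. This work produced 3 papers in international journals and conferences. Third, for the study of correlations between multimedia features and user features, we proposed three new algorithms that model the multilevel relationships between user experience and video features, audio features, and multimodal features; this work produced 3 papers in international journals and conferences. Fourth, for dynamic media content analysis, we studied dynamic saliency in spatio-temporal data and applied the correlation of users' attention to promotion on social media. This work produced 2 papers in international journals. Beyond the originally planned research, we also applied our deep learning results to problems at the intersection of social media and computer security. We used the massive data on social media to strengthen data hiding, and we developed new deep learning algorithms to improve the detection of hidden data. This work produced 2 papers at international conferences. The project also provided a good environment for training and developing talent: four of the original participating members have earned doctoral degrees. The project is of scientific significance for China's interdisciplinary research across multimedia, artificial intelligence, and brain science. It also gave rise to the first cognitive computing laboratory in Hong Kong and has become a showcase platform for results funded by the National Natural Science Foundation of China.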

This project explores a set of deep learning techniques with solid theoretical support and strong application potential for discovering hidden information in large volumes of social media data. The proposed research tasks have been completed with satisfactory outputs. The first task, multimodal feature learning and representation based on deep neural networks, is the core of this project. By integrating the collective prior into the deep architecture, we constructed a unified framework for latent feature learning on social media. We also modeled human implicit learning based on dissipative structure theory. Under this framework and model, we designed a series of algorithms and successfully applied them to incomplete image recognition, automatic text summarization, music emotion analysis, and multimodal feature learning. Fifteen research papers have been published in international journals and conferences. The second task is the learning and representation of the user's latent features. We developed a novel user behavior model to mine users' characteristics and preferences, and we improved the accuracy of affective computing by analyzing users' brain waves. Three papers have been published in international journals and conferences. The third task is the study of the correlation between multimedia data and user data. We proposed three new algorithms that model the multilevel relations between user experience and visual features, audio features, and multimodal features, respectively. Three papers have been published in international journals and conferences. The fourth task is dynamic media data analysis. We studied saliency detection via spatio-temporal attention analysis and promoted advertising videos through Cross-Network Association. Two papers have been published in international journals. Beyond the proposed tasks, we applied the proposed deep learning techniques to interdisciplinary research on social media and computer security. We used the big data available on social media to increase the difficulty of steganalysis and, at the same time, proposed novel deep learning techniques to detect images that contain hidden messages. Two papers have been published in international conferences. This project provided a good environment for nurturing talent, and four group members have so far been awarded doctoral degrees. The findings of this project directly benefit the long-term development of research in multimedia computing, artificial intelligence, and brain science. This project also contributed to the establishment of the first cognitive computing lab in Hong Kong, and it promotes research funded by the National Natural Science Foundation of China to the public.