TOP-K Phi Correlation Computation

Published: 2014-12-30

【Speaker】Hui Xiong (熊辉)

【Topic】TOP-K Phi Correlation Computation

【Time】2007-5-29, 10:30 AM

【Venue】Room 418A, Shunde Building, School of Economics and Management, Tsinghua University

【Language】Chinese/English

【Abstract】

The problem of association pattern mining is to develop techniques for finding groups of highly correlated objects in massive data. This problem is important for various application domains, such as homeland security, market basket analysis, and biomedical data analysis. A large body of association mining work was motivated by the difficulty of efficiently identifying highly correlated objects using traditional statistical correlation measures. This has led to the use of alternative interest measures, such as support and confidence, despite the lack of a precise relationship between these new interest measures and statistical correlation measures. However, this approach tends to generate too many spurious patterns involving objects that are poorly correlated. In this talk, we provide a precise relationship between the Phi correlation coefficient and the support measure. We also identify a 2-D monotone property of an upper bound of the Phi correlation coefficient and develop an efficient algorithm, called TOP-COP, that exploits this property to prune many pairs without even computing their correlation coefficients. Our experimental results show that TOP-COP can be an order of magnitude faster than alternative approaches for mining the top-k strongly correlated pairs. Finally, we show that the performance of the TOP-COP algorithm is closely tied to the degree of data dispersion: the higher the degree of data dispersion, the larger the computational savings achieved by TOP-COP.
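The link between the Phi coefficient and support can be made concrete for binary (present/absent) items. The following is a minimal sketch under stated assumptions, not the TOP-COP algorithm presented in the talk: it writes Phi in terms of supports, derives an upper bound from the fact that supp(A, B) can never exceed the smaller single-item support, and uses that bound to skip pairs in a plain top-k scan. The function names (`phi`, `phi_upper_bound`, `top_k_pairs`) and the list-of-sets transaction format are hypothetical choices made for illustration.

```python
import heapq
from itertools import combinations
from math import sqrt


def phi(supp_a, supp_b, supp_ab):
    """Phi coefficient of two binary items, written in terms of supports."""
    denom = sqrt(supp_a * supp_b * (1 - supp_a) * (1 - supp_b))
    return (supp_ab - supp_a * supp_b) / denom


def phi_upper_bound(supp_a, supp_b):
    """Upper bound on phi that needs only the two single-item supports."""
    hi, lo = max(supp_a, supp_b), min(supp_a, supp_b)
    # supp(A, B) <= lo, so substituting lo for supp_ab in phi() gives:
    return sqrt(lo / hi) * sqrt((1 - hi) / (1 - lo))


def top_k_pairs(transactions, k):
    """Return the k most correlated item pairs as (phi, a, b), best first.

    Illustrative bound-based pruning only (assumed scheme, not TOP-COP).
    """
    n = len(transactions)
    tids = {}  # item -> set of ids of the transactions containing it
    for tid, items in enumerate(transactions):
        for item in items:
            tids.setdefault(item, set()).add(tid)
    # Items present in every transaction have support 1 (phi undefined): skip.
    supp = {i: len(t) / n for i, t in tids.items() if len(t) < n}

    heap = []  # min-heap holding the best k pairs seen so far
    for a, b in combinations(sorted(supp), 2):
        if len(heap) == k and phi_upper_bound(supp[a], supp[b]) <= heap[0][0]:
            continue  # pruned: the pair support is never computed
        supp_ab = len(tids[a] & tids[b]) / n
        value = phi(supp[a], supp[b], supp_ab)
        if len(heap) < k:
            heapq.heappush(heap, (value, a, b))
        elif value > heap[0][0]:
            heapq.heapreplace(heap, (value, a, b))
    return sorted(heap, reverse=True)
```

The bound is cheap because it depends only on single-item supports, so a pair whose bound cannot beat the current k-th best value is discarded before its joint support is counted. How often that happens depends on how widely the item supports differ, which is consistent with the abstract's remark about data dispersion.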

【Speaker Biography】

Hui Xiong is currently an Assistant Professor in the Management Science and Information Systems Department at Rutgers, the State University of New Jersey, USA. He received the Ph.D. degree in Computer Science from the University of Minnesota, USA, in 2005, the B.E. degree in Automation from the University of Science and Technology of China, and the M.S. degree in Computer Science from the National University of Singapore. His research interests include data mining, spatial databases, statistical computing, and Geographic Information Systems (GIS), with applications in business, database security, self-managing systems, and biomedical informatics. He has published over 30 papers in refereed journals and conference proceedings, such as IEEE Transactions on Knowledge and Data Engineering, VLDB Journal, Data Mining and Knowledge Discovery Journal, ACM SIGKDD, SIAM SDM, IEEE ICDM, ACM CIKM, ACM GIS, and PSB. He is the co-editor of the book "Clustering and Information Retrieval", the author of the monograph "Hyperclique pattern discovery: Algorithms and applications", and the co-Editor-in-Chief of the Encyclopedia of Geographical Information Science. He has also served on the organizing committees and program committees of a number of conferences, such as ACM SIGKDD, SIAM SDM, IEEE ICDM, IEEE ICTAI, ACM CIKM, and IEEE ICDE. Dr. Xiong is a member of the IEEE Computer Society, the ACM, and Sigma Xi.
