Academic talk: Incremental Linear Discriminant Analysis (LDA) for Data Dimensionality Reduction

Title: Incremental Linear Discriminant Analysis (LDA) for Data Dimensionality Reduction

Speaker: Prof. Delin Chu (National University of Singapore)

Time: Thursday, June 6, 2013, 3:30-4:30 PM

Venue: Room 409, Main Building

Abstract: Developing fast and efficient incremental linear discriminant analysis (LDA) algorithms remains a challenging problem, although several incremental LDA algorithms have been proposed in the past. For this purpose, we conduct a new study on LDA in this paper and develop a new and efficient incremental LDA algorithm. We first propose a new batch LDA algorithm called LDA/QR, which depends only on the data matrix and the sizes of the data classes. LDA/QR is obtained by computing the economy-size QR factorization of the data matrix and then solving a lower triangular linear system. Hence, LDA/QR is a simple and fast LDA algorithm. The relationship between LDA/QR and Uncorrelated LDA (ULDA) is also revealed. Based on LDA/QR, we develop a new incremental LDA algorithm called ILDA/QR, which is the exact incremental version of LDA/QR. The main features of ILDA/QR are: (i) it easily handles both the insertion of a single new sample and the addition of a chunk of new samples; (ii) it has favorable computational and space complexity; and (iii) it is very fast and always achieves classification accuracy comparable to that of the ULDA algorithm and existing incremental LDA algorithms. Numerical experiments on real-world data demonstrate that ILDA/QR is very efficient and competitive with state-of-the-art incremental LDA algorithms in terms of classification accuracy, computational complexity and space complexity.
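The two numerical steps named in the abstract, an economy-size QR factorization followed by a lower triangular solve, can be illustrated with a small sketch. This is not the speaker's LDA/QR algorithm, whose details are not given in this notice. It assumes a data matrix `A` whose columns are samples and are linearly independent, and it computes a transformation `G` that maps every training sample onto the indicator vector of its class. The names `forward_substitution`, `qr_class_projection`, `A`, `labels`, `E` and `G` are all hypothetical.

```python
import numpy as np


def forward_substitution(L, B):
    """Solve L @ Y = B for Y, where L is a square lower triangular matrix."""
    n = L.shape[0]
    Y = np.zeros((n, B.shape[1]))
    for i in range(n):
        # Row i depends only on the rows of Y that are already computed.
        Y[i] = (B[i] - L[i, :i] @ Y[:i]) / L[i, i]
    return Y


def qr_class_projection(A, labels):
    """Illustrative only: find G such that A.T @ G equals the class indicator matrix.

    A      : d x n data matrix, one sample per column, assumed full column rank.
    labels : length-n integer array of class labels.
    """
    classes = np.unique(labels)
    # E[j, c] is 1.0 when sample j belongs to class c, else 0.0 (n x k).
    E = (labels[:, None] == classes[None, :]).astype(float)
    # Economy-size QR: Q is d x n with orthonormal columns, R is n x n upper triangular.
    Q, R = np.linalg.qr(A, mode="reduced")
    # A.T @ G = R.T @ (Q.T @ G) = E, so solve the lower triangular system R.T @ Y = E.
    Y = forward_substitution(R.T, E)
    return Q @ Y, E


rng = np.random.default_rng(0)
A = rng.standard_normal((50, 6))        # 6 samples in 50 dimensions
labels = np.array([0, 0, 1, 1, 2, 2])   # 3 classes of 2 samples each
G, E = qr_class_projection(A, labels)
```

Under these assumptions, `A.T @ G` reproduces `E` up to rounding error, so samples of the same class coincide after projection. The cost is dominated by one QR factorization and one triangular solve, which is the kind of structure the abstract credits for the speed of LDA/QR.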

Lecture notice: Web spam detection using machine learning techniques

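Prof. TSOI Ah-Chung, Dean of the Faculty of Information Technology at Macau University of Science and Technology, will give an academic talk at Beijing University of Posts and Telecommunications (BUPT) on June 3. Interested students and faculty are welcome to attend.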

Title: Web spam detection using machine learning techniques

Speaker: Prof. TSOI Ah-Chung (Macau University of Science and Technology)

Time: Monday, June 3, 2013, 14:30-16:30

Venue: Conference Room 501, New Office Building

Abstract: Web spam detection is a challenging problem, primarily because of the large number of features involved, the scarcity of validated spam sites, and the fact that the data are related to one another through the web topology. In this talk, we study the problem using machine learning techniques, e.g., self-organising maps and multilayer perceptrons, both extended to handle graph data inputs, together with balancing of imbalanced data and feature reduction. We applied our techniques to two well-known publicly available web spam detection datasets, namely the UK 2006 dataset and the UK 2007 dataset. On both datasets, our methods achieve the best generalisation results published so far. Moreover, our method can be applied to other graph-based datasets without much change, e.g., the mutagenesis dataset, on which we also achieve the best results so far. Hence, our techniques form a general methodology that can handle long-term dependency in deep learning architectures.
