Biclustering, block clustering, co-clustering, or two-mode clustering is a data mining technique which allows simultaneous clustering of the rows and columns of a matrix. The term was first introduced by Mirkin, although the technique was originally introduced much earlier (i.e., by J. A. Hartigan). Given a set of ''m'' rows in ''n'' columns (i.e., an ''m'' × ''n'' matrix), a biclustering algorithm generates biclusters – subsets of rows which exhibit similar behavior across a subset of columns, or vice versa.

== Development ==

Biclustering was originally introduced by J. A. Hartigan in 1972.〔Hartigan, J. A. (1972). Direct clustering of a data matrix. Journal of the American Statistical Association, 67(337): 123–129.〕 The term ''biclustering'' was later used by Mirkin. The technique was not generalized until 2000, when Y. Cheng and G. M. Church proposed a biclustering algorithm based on variance and applied it to biological gene expression data.〔Cheng, Y.; Church, G. M. (2000). Biclustering of expression data. Proceedings of ISMB 2000, 8: 93–103. https://www.cs.princeton.edu/courses/archive/fall03/cs597F/Articles/biclustering_of_expression_data.pdf〕 Their paper remains the most influential work in the gene expression biclustering field.

In 2001 and 2003, I. S. Dhillon put forward two algorithms applying biclustering to documents and words. One version was based on bipartite spectral graph partitioning.〔Dhillon, I. S. (2001). Co-clustering documents and words using bipartite spectral graph partitioning. Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining. ACM: 269–274.〕 The other was based on information theory. Dhillon assumed that the loss of mutual information during biclustering equals the Kullback–Leibler (KL) divergence between P and Q, where P is the distribution of documents and feature words before biclustering and Q is the distribution after biclustering. The KL divergence measures the difference between two probability distributions: it is zero when the two distributions are identical and increases as they become more dissimilar.〔Dhillon, I. S.; Mallela, S.; Modha, D. S. (2003). Information-theoretic co-clustering. Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM: 89–98.〕 Thus, the aim of the algorithm was to find the biclustering that minimizes the KL divergence between P and Q. In 2004, A. Banerjee used a weighted Bregman distance instead of the KL divergence to design a biclustering algorithm suitable for any kind of matrix, unlike the KL-divergence-based algorithm.〔Banerjee, A.; Dhillon, I.; Ghosh, J.; et al. (2004). A generalized maximum entropy approach to Bregman co-clustering and matrix approximation. Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM: 509–514.〕 To cluster more than two types of objects, Bekkerman in 2005 extended the mutual information in Dhillon's theorem from a single pair to multiple pairs.
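As a sketch of the information-theoretic objective described above (the notation here is assumed for illustration rather than taken verbatim from the cited papers), let p(X, Y) denote the joint distribution of documents and words and q(X, Y) the approximation induced by a given co-clustering into row clusters X̂ and column clusters Ŷ. The loss in mutual information then equals the KL divergence between the two distributions, and the algorithm seeks the co-clustering that minimizes it:

:<math>
\Delta I = I(X;Y) - I(\hat{X};\hat{Y}) = D_{\mathrm{KL}}\!\left(p(X,Y)\,\|\,q(X,Y)\right),
\qquad
D_{\mathrm{KL}}(p\,\|\,q) = \sum_{x,y} p(x,y)\,\log\frac{p(x,y)}{q(x,y)}.
</math>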