A Frequent Itemset-Nearest Neighbor Based Approach for Clustering Gene Expression Data

Rosy Das, D.K. Bhattacharyya and J.K. Kalita



Microarray technology has enabled the monitoring of expression levels of thousands of genes across different experimental conditions. Identifying groups of genes that exhibit similar expression patterns in such large volumes of data is crucial in the analysis of gene expression time series. In this study, we present an integrated analysis of microarray data using association mining and clustering that discovers intrinsic groupings based on co-occurrence patterns in the data. A shared nearest neighbor approach is then used to cluster the results of association mining to obtain the final clustering of the dataset. The method was applied to real-life datasets and was found to perform satisfactorily.

Index Terms: Gene expression, microarray, coherent pattern, association mining, clustering.
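
The abstract names a shared nearest neighbor approach without giving its details, so the following is only a minimal sketch of the general idea, written in the style of Jarvis-Patrick clustering rather than as the authors' method. It assumes items are plain numeric profiles compared by Euclidean distance, whereas the paper clusters the output of association mining. The function names, the parameters `k` and `min_shared`, and the toy `profiles` data are all hypothetical.

```python
from math import dist


def k_nearest(points, k):
    """For each point index, return the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        neighbors.append(set(others[:k]))
    return neighbors


def snn_similarity(neighbors, i, j):
    """Shared nearest neighbor similarity: how many neighbors i and j have in common."""
    return len(neighbors[i] & neighbors[j])


def snn_clusters(points, k, min_shared):
    """Link i and j when each is in the other's k-NN set and they share
    at least min_shared neighbors; clusters are the connected components."""
    nbrs = k_nearest(points, k)
    parent = list(range(len(points)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(len(points)):
        for j in nbrs[i]:
            if i in nbrs[j] and snn_similarity(nbrs, i, j) >= min_shared:
                parent[find(i)] = find(j)

    # Each label is the representative index of the point's component.
    return [find(i) for i in range(len(points))]


# Made-up toy profiles (not from the paper): two well-separated groups.
profiles = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
labels = snn_clusters(profiles, k=3, min_shared=1)
```

In this sketch two items end up in the same cluster only through chains of mutual neighbors with enough overlap, so the grouping depends on local neighborhood structure rather than on a global distance threshold.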
