-----------------------------------------------------------------------
BIOINFORMATICS COLLOQUIUM
College of Science
George Mason University
-----------------------------------------------------------------------

A Conditional Entropy Based Approach to Co-clustering for the Analysis of Gene Expression Data

Jeff Solka
George Mason University

Abstract:

Few methods for high-level data exploration are robust to the noise found in microarray data. Solutions that use all of the original features to derive cluster structure can be misleading, while those that rely on trivial feature selection can miss important characteristics. We present a method adapted from previous work in the field of geography (Guo et al., 2003) that relies on the conditional entropy between pairs of dimensions to uncover the underlying, native cluster structure within a dataset. Applied to an artificially clustered data set, this method performed well, although it showed some sensitivity to multiplicative noise. When applied to gene expression data, the method produced a clear representation of the underlying data structure.
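
For readers unfamiliar with the quantity the abstract refers to, the following is a minimal sketch of how a conditional entropy between two dimensions might be estimated. It is not the speaker's implementation and not the procedure of Guo et al.; it assumes a plain equal-width histogram discretization, and the function names, the bins parameter, and the synthetic data are all hypothetical choices made for illustration. It uses the identity H(Y|X) = H(X,Y) - H(X).

```python
import numpy as np


def conditional_entropy(x, y, bins=10):
    """Estimate H(Y | X) in bits from an equal-width 2D histogram.

    Assumption: equal-width binning; the talk's discretization may differ.
    """
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    joint = counts / counts.sum()          # joint probabilities p(x, y)
    px = joint.sum(axis=1)                 # marginal probabilities p(x)

    # Skip empty cells so that 0 * log(0) is treated as 0.
    nz_joint = joint > 0
    nz_x = px > 0
    h_xy = -np.sum(joint[nz_joint] * np.log2(joint[nz_joint]))
    h_x = -np.sum(px[nz_x] * np.log2(px[nz_x]))
    return h_xy - h_x                      # H(Y|X) = H(X,Y) - H(X)


def pairwise_conditional_entropy(data, bins=10):
    """Return a matrix whose (i, j) entry is H(dim j | dim i).

    `data` has one row per observation and one column per dimension.
    """
    n_dims = data.shape[1]
    result = np.zeros((n_dims, n_dims))
    for i in range(n_dims):
        for j in range(n_dims):
            if i != j:
                result[i, j] = conditional_entropy(data[:, i], data[:, j], bins)
    return result


# Synthetic illustration (made-up data, not results from the talk):
# columns 0 and 1 share two clusters, column 2 is unstructured noise.
rng = np.random.default_rng(0)
a = np.concatenate([rng.normal(0, 0.1, 500), rng.normal(5, 0.1, 500)])
b = np.concatenate([rng.normal(0, 0.1, 500), rng.normal(5, 0.1, 500)])
c = rng.uniform(0, 5, 1000)
matrix = pairwise_conditional_entropy(np.column_stack([a, b, c]))
```

In this sketch a low entry suggests that knowing one dimension leaves little uncertainty about the other, which is the kind of pairwise signal the abstract describes using to expose cluster structure. The estimate depends on the number of bins and on sample size, so the values are best compared with each other rather than read as absolute.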