Multi-Domain Data Mining
Dr Jason Kinser (School of Systems Biology, GMU)

Modern problems draw on data from multiple sources or domains, and a single source may itself span several domains. In bioinformatics, for example, data domains include DNA sequences, microscopic images, gene function, databases, and textual descriptions. Domains in text mining include word frequency, syntax, authors, keywords, citations, and images. Unfortunately, current data mining techniques tend to treat domains separately. A BLAST search, for instance, usually uses only sequence information without regard to the other domains, so the search is performed in a single domain. The proposed solution is to create a multi-domain search space that contains different types of data. The complication is that the known connections between the data define a higher-order space that is not amenable to linear techniques such as PCA. Methods are being developed to map the landscape of this search space and to incorporate search techniques.
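
As a rough illustration of what a search across domains could look like, the sketch below stores items from several domains as nodes of one graph and treats the known connections as edges. This is only an assumed representation, not the method described in the talk: the graph structure, the breadth-first search, the helper names (build_graph, related), and every record in it are hypothetical.

```python
from collections import deque

# Hypothetical records: each node is (domain, identifier).
# Every name below is invented for illustration only.
edges = [
    (("sequence", "seq_A"), ("function", "kinase activity")),
    (("sequence", "seq_B"), ("function", "kinase activity")),
    (("sequence", "seq_B"), ("image", "img_17")),
    (("text", "abstract_3"), ("image", "img_17")),
]

def build_graph(edges):
    """Build an undirected adjacency map from known cross-domain connections."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def related(graph, start, max_hops):
    """Return {node: hops} for every node within max_hops of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    del seen[start]  # the query item itself is not a hit
    return seen

graph = build_graph(edges)
hits = related(graph, ("sequence", "seq_A"), max_hops=3)
for (domain, name), hops in sorted(hits.items(), key=lambda kv: kv[1]):
    print(hops, domain, name)
```

Starting from a sequence, the query reaches a function annotation, a second sequence, and an image by following connections rather than sequence similarity alone, which is the kind of result a single-domain search cannot return.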