BINF739 Text Data Mining in Bioinformatics
Spring 2009 - Jeffrey L. Solka
This course will provide an overview of the application of text data mining to bioinformatics. Topics to be discussed include statistical methods for document encoding, natural language processing methods for document encoding, visualization of document collections, clustering document collections, methods for semi-automated query refinement, and methods for literature-based discovery.
Instructor
Jeff Solka, 540-653-1982 (D), 540-371-3961 (N), 540-809-9799 (C), jlsolka@gmail.com
Place and Time
4:30pm -- 7:10pm Wednesdays, Occoquan Building, Room 327
Textbook
Sophia Ananiadou and John McNaught (Eds) - Text Mining for Biology and Biomedicine, Artech House, 2006 (required)
Soumya Raychaudhuri - Computational Text Analysis for Functional Genomics and Bioinformatics, Oxford, 2006 (optional)
Roger Bilisoly, Practical Text Mining With PERL, Wiley, 2008 (optional)
Grading
Grades will be based on a student presentation of a current paper from the literature and on a student project. Students will
Notional Schedule
Course and Topical Overview
Read Ananiadou and McNaught Chapters 1 and 2, Raychaudhuri Chapter 1, and Bilisoly Chapter 1 and Appendices A and B
Corpora and Their Annotation
Read Ananiadou and McNaught Chapter 8, Raychaudhuri Chapter 2, and Bilisoly Chapter 2
Resources for Biological Text Data Mining
Read Ananiadou and McNaught Chapter 3, Raychaudhuri Chapter 3, and Bilisoly Chapter 3
Terminology Management and Abbreviations
Read Ananiadou and McNaught Chapters 4 and 5, Raychaudhuri Chapter 4, and Bilisoly Chapter 4
Bag of Words Based Approaches and Natural Language Processing
Read Ananiadou and McNaught Chapter 2, Raychaudhuri Chapter 5, and Bilisoly Chapter 5
Named Entity Recognition and Information Extraction
and a Special Topic Lecture on Streaming Text Data Mining - Dr. Elizabeth Hohman
Read Ananiadou and McNaught Chapters 6 and 7, Raychaudhuri Chapter 6, and Bilisoly Chapter 6
Evaluation of Text Mining in Biology
Read Ananiadou and McNaught Chapter 9, Raychaudhuri Chapter 7, and Bilisoly Chapter 7
No Class - Spring Break
Clustering
and a Special Topic Lecture on Text Data Mining the Wikipedia - Dr. David Marchette
Read Raychaudhuri Chapter 8 and Bilisoly Chapter 8
Dimensionality Reduction and Visualization
and a Special Topic Lecture on Iterative Denoising: An Iterative Scheme for the Revelation of Cluster Structure on Document Collections - Dr. Kendall Giles
Read Raychaudhuri Chapter 9 and Bilisoly Chapter 9
Literature-based Discovery - I
and a Special Topic Lecture on Literature-based Discovery and SARS: A New Connection
Read Raychaudhuri Chapter 9
Literature-based Discovery - II
Read Raychaudhuri Chapter 10
Integrating Text Mining With Data Mining
Read Ananiadou and McNaught Chapter 10
Student Paper Presentations
Student Project Presentations
Final Exam (All Assignments Due)