BINF 739-002 Spring 2007

Network Based Models in Bioinformatics and Biocomputing

 

GMU Instructor - Dr. Jennifer Weller

Co-Instructor - Dr. Jeff Solka

Meeting Time - PLEASE NOTE NEW TIME: 12:30 pm - 3:00 pm Tuesdays, Spring Semester 2007

Meeting place: Occoquan Bldg room 327 (note change from catalog)


Dr. Weller Contact Information:
Office: Occoquan Building (OB), room 328E
Contact: (703)-993-8329
email: jweller@gmu.edu
Office Hours for Dr. Weller: 1-3 Mondays or by appointment. Drop-ins are not encouraged; email is a good way to get specific questions answered.

Course Description: The student will learn concepts underlying deterministic and stochastic network-based models. These models will be studied within a biological context. Topics include deterministic graph models (graphs, subgraphs, graph automorphism, trees), random graphs (different graph types based on edge connection probability distribution), measures of centrality, robustness, and techniques for graph visualization. Applications of these models to protein networks, regulatory networks, metabolic networks, and gene expression networks will be studied. Students will be exposed to both concepts and practical examples of the application of network models to bioinformatics/biocomputing data. General analysis tools will include R, BIOCONDUCTOR, and associated network modeling libraries. Homework problems will be assigned to reinforce concepts and develop skills needed for applying these methodologies to real-world problems.
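As a small illustration of two topics named in the description (random graphs defined by an edge connection probability, and a measure of centrality), here is a minimal sketch. It is written in plain Python rather than the R/BIOCONDUCTOR tools used in the course, and the function names `random_graph` and `degree_centrality` are hypothetical names chosen for this example, not part of any course software.

```python
import random
from itertools import combinations


def random_graph(n, p, seed=None):
    """Build a G(n, p) random graph as an adjacency dict.

    Each of the n*(n-1)/2 possible undirected edges is included
    independently with probability p.
    """
    rng = random.Random(seed)
    adjacency = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


def degree_centrality(adjacency):
    """Degree centrality: a vertex's degree divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    if n < 2:
        return {v: 0.0 for v in adjacency}
    return {v: len(neighbors) / (n - 1) for v, neighbors in adjacency.items()}


# Example: a 10-vertex graph with edge probability 0.3 (seed fixed for repeatability).
g = random_graph(10, 0.3, seed=1)
centrality = degree_centrality(g)
hub = max(centrality, key=centrality.get)  # the vertex with the most neighbors
```

Degree centrality is only the simplest of the centrality measures; others (such as betweenness or closeness) need shortest-path computations and are not shown here.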

Prerequisites: Experience using R, and enough biology to understand gene expression, regulatory pathways, and the different levels of control in cells.

Required texts:

"The Regulatory Genome: Gene Regulatory Networks In Development And Evolution" (Hardcover) by Eric H. Davidson, Academic Press; 1st edition (May 30, 2006)

"Graph Theory and Its Applications, Second Edition (Discrete Mathematics and Its Applications)" (Hardcover) by Jonathan L. Gross, Jay Yellen, Chapman & Hall/CRC; 2nd edition (September 22, 2005).

It has been pointed out that one edition of the text is available online; please follow the link: Link

Class Policies: Students are responsible for all assigned material (homework and readings) and must be prepared to communicate ideas in class.

Homework policies: Students are encouraged to discuss problems and assignments with one another. Homework assignments and tests must reflect the individual efforts of students. Also note that 70% or more of an assignment must reflect the words and synthesis of a student: a mass of material ‘glued together’ from Web resources does not constitute original work even if properly cited, and will be graded accordingly.

Citation Policy: Students are encouraged to read and cite the literature to support their solutions to homework problems. When part of a solution for a homework assignment or the project has been found in a reference, whether Web, journal or book, the student must properly cite that source. Failure to properly cite the work of another constitutes plagiarism; both cheating and plagiarism will be grounds for referral to the GMU Honor Council.

Important Class Dates


Paper presentation: Apr 24th, 2007 during class time
Project Presentation: May 1st, 2007 during class time
Project report may be turned in any time up to Monday May 9th at 5pm, as hard copy.

Expected Coursework

There will be five quizzes, six homework assignments, an in-class presentation of an article from the literature and an in-class presentation of a project with an accompanying written report.

Project: Each student is expected to do one of the following: (a) take a method developed during the course, expand or alter it, apply it to a dataset presented in one of the papers, and compare the results of the two methods; or (b) apply one of the methods learned in class to a new dataset and discuss how the outcome compares to the previous pathway or to expectations based on other types of knowledge. The student must present the project to the rest of the class, and a formal project report is also expected. For late report submissions, the grade will decrease by 10% per day, and submissions more than three days late will not be accepted. No delays on presentations will be allowed; a student who is aware of a conflict has the option of an early presentation. Plan ahead!
Points awarded: 20%, 10% oral/10% written.
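The late-report rule stated above can be sketched as a small calculation. This is only an illustration under stated assumptions: it assumes the 10% per day is taken off the score the report would otherwise have earned and that lateness is counted in whole days; the syllabus does not spell out either detail. The function name `late_report_score` is hypothetical.

```python
def late_report_score(raw_score, days_late):
    """Apply the late-report penalty described in the syllabus.

    Assumptions (not stated in the syllabus): the penalty is 10% of the
    earned score per whole day late. Returns None when the report is more
    than three days late, since it would not be accepted.
    """
    if days_late < 0:
        raise ValueError("days_late cannot be negative")
    if days_late > 3:
        return None  # not accepted
    # Integer percent arithmetic avoids floating-point surprises.
    return raw_score * (100 - 10 * days_late) / 100


on_time = late_report_score(90, 0)    # 90.0
two_days = late_report_score(90, 2)   # 72.0
too_late = late_report_score(90, 4)   # None
```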


No extra credit will be given in this class.

Evaluation: final grade will be based on homework (50%, best 5 of 6 grades), quizzes (20%, best 4 of 5 grades), paper presentation (10%) and final project (20%).
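As a worked illustration of the weighting above, the following sketch computes a final percentage from a set of made-up scores. It assumes every score is on a 0-100 scale and that the retained homework and quiz grades are averaged with equal weight; the function names and the sample scores are hypothetical and are not taken from the course.

```python
def best_mean(scores, keep):
    """Mean of the `keep` highest scores (each on a 0-100 scale)."""
    top = sorted(scores, reverse=True)[:keep]
    return sum(top) / keep


def final_grade(homework, quizzes, paper, project):
    """Weighted final percentage: homework 50% (best 5 of 6),
    quizzes 20% (best 4 of 5), paper presentation 10%, project 20%."""
    return (0.50 * best_mean(homework, 5)
            + 0.20 * best_mean(quizzes, 4)
            + 0.10 * paper
            + 0.20 * project)


# Made-up scores: the lowest homework (70) and lowest quiz (60) are dropped.
example = final_grade(
    homework=[90, 80, 100, 70, 85, 95],  # best five average 90
    quizzes=[80, 60, 90, 100, 70],       # best four average 85
    paper=88,
    project=92,
)
# 0.5*90 + 0.2*85 + 0.1*88 + 0.2*92 = 45 + 17 + 8.8 + 18.4, about 89.2
```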

Academic integrity: Students are expected to adhere to the standards of academic integrity set forth by George Mason University and the Bioinformatics program. Policies are detailed in the GMU Student Conduct Code.

Dr. Solka also maintains a class Web page, where he posts additional papers and sample scripts relevant to his lectures. We will endeavor to keep them synchronized, but the wise student will check both.
URL for Dr. Solka's page:

http://binf.gmu.edu/~jsolka/spring2007/binf739/binf739_s2007_rev1.html

Data, Meetings, Tutorials, Webcasts, Software etc.

Data

Here is a link to the Drosophila data set that we have mentioned in class. It is a zip file, so you will have to download and unzip it. Dataset Link

Here is the title of the paper describing the original experiment: "Gene Expression During the Life Cycle of Drosophila melanogaster" by Arbeitman et al. (2002) Science 297: 2270-2275.

Meetings

In class on Jan 29th, Dr. Solka mentioned a meeting he is chairing on campus next week that students might be interested in attending (there is a fee and you must register in advance).
The title is "The second annual conference for Quantitative Methods in Defense and National Security", or QMDNS2007. Information can be found at this URL: http://www.galaxy.gmu.edu/QMDNS2007/ and a brochure is posted here: Brochure

Also on Jan 29th, Dr. Solka mentioned graphing software from Dr. Marchetti that is to be used for some of the homework. The tar-gzipped file is available here:
Download software

Here is a set of slides that constitute a tutorial on Graph Data Management from Dr. Solken of LBNL: Download Solken slides

Papers

  1. Barabasi A-L, Oltvai ZN: "Network biology: understanding the cell’s functional organization." Nat Rev Genet 2004, 5:101-113.
  2. Bickel, D. R. (2004b) "Gene networks and probabilities of spurious connections: Application to expression time-series," under review, preprint available via http://www.davidbickel.com
  3. D’haeseleer, P., Liang, S. & Somogyi, R., "Genetic network inference from co-expression clustering to reverse engineering" (2000) Bioinformatics 16, 707-726.
  4. Hidde De Jong, "Modeling and Simulation of Genetic Regulatory Systems: A Literature Review," J. of Comp. Biol. Volume 9, Number 1, 2002 Mary Ann Liebert, Inc. Pp. 67-103
  5. Gardner, T.S., di Bernardo, D., Lorenz, D., and Collins, J.J. (2003). "Inferring Genetic Networks and Identifying Compound Mode of Action via Expression Profiling", Science 301, 102-105.
  6. Guthke R, Moller U, Hoffmann M, Thies F, Topfer S., "Dynamic network reconstruction from gene expression data applied to immune response during bacterial infection", Bioinformatics. 2005 Apr 15;21(8):1626-34. Epub 2004 Dec 21.
  7. Peter M. Haverty, Ulla Hansen, and Zhiping Weng, "Dynamic network reconstruction from gene expression data applied to immune response during bacterial infection", Bioinformatics, Vol. 21 no. 8 2005, pages 1626-1634
  8. M. E. J. Newman, "The Structure and Function of Complex Networks", SIAM Review, Vol. 45, No. 2, pg. 167-256.
  9. John Jeremy Rice, Yuhai Tu and Gustavo Stolovitzky, "Reconstructing biological networks using conditional correlation analysis," Bioinformatics, Vol. 21 no. 6 2005, pages 765-773.
  10. Juliane Schäfer and Korbinian Strimmer, "An empirical Bayes approach to inferring large-scale gene association networks," Bioinformatics, Vol. 21 no. 6 2005, pages 754-764.
  11. Saeed Tavazoie, Jason D. Hughes, Michael J. Campbell, Raymond J. Cho & George M. Church, "Systematic determination of genetic network architecture," Nature Genetics, 22, 1999.
  12. M. K. Stephen Yeung, Jesper Tegnér, and James J. Collins, "Reverse engineering gene networks using singular value decomposition and robust regression," PNAS, April 30, 2002, vol. 99, no. 9, 6163-6168.
    In 2005 PNAS had a special edition on gene regulatory networks with the following papers
  13. "Gene Regulatory Networks Special Feature: Gene regulatory networks" by Eric Davidson and Michael Levine, PNAS 2005 102: 4935.
  14. "Gene Regulatory Networks Special Feature: Xenopus as a model system to study transcriptional regulatory networks" by Tetsuya Koide, Tadayoshi Hayata, and Ken W. Y. Cho, PNAS 2005 102: 4943-4948.
  15. "Gene Regulatory Networks Special Feature: Contingent gene regulatory networks and B cell fate specification" by Harinder Singh, Kay L. Medina, and Jagan M. R. Pongubala, PNAS 2005 102: 4949-4953.
  16. "Gene Regulatory Networks Special Feature: The role of binding site cluster strength in Bicoid-dependent patterning in Drosophila" by Amanda Ochoa-Espinosa, Gozde Yucel, Leah Kaplan, Adam Pare, Noel Pura, Adam Oberstein, Dmitri Papatsenko, and Stephen Small, PNAS 2005 102: 4960-4965.
  17. "Gene Regulatory Networks Special Feature: Quantitative analysis of binding motifs mediating diverse spatial readouts of the Dorsal gradient in the Drosophila embryo" by Dmitri Papatsenko and Michael Levine, PNAS 2005 102: 4966-4971.
  18. "Gene Regulatory Networks Special Feature: Transcriptional network underlying Caenorhabditis elegans vulval development" by Takao Inoue, Minqin Wang, Ted O. Ririe, Jolene S. Fernandes, and Paul W. Sternberg, PNAS 2005 102: 4972-4977.
    Lee Hood's group at the ISB also studies computational approaches to gene regulatory networks. One paper of interest is
  19. "The Inferelator: an algorithm for learning parsimonious regulatory networks from systems-biology data sets de novo" by Bonneau et al., Genome Biology 7:R36 (2006).

    Papers to be presented by class members:
  20. Chris Overall: "Clustering gene expression data using a graph-theoretic approach: an application of minimum spanning trees" by Y Xu, V Olman, D Xu - Bioinformatics, 2002 18 no. 4 Pages 536-545
  21. Saeed Khoshnevis: "Gene Regulatory Networks Special Feature: The role of binding site cluster strength in Bicoid-dependent patterning in Drosophila" by Amanda Ochoa-Espinosa, Gozde Yucel, Leah Kaplan, Adam Pare, Noel Pura, Adam Oberstein, Dmitri Papatsenko, and Stephen Small, PNAS 2005 102: 4960-4965.
  22. Nuttachat Wisittipanit:"Gene Regulatory Networks Special Feature: Contingent gene regulatory networks and B cell fate specification" by Harinder Singh, Kay L. Medina, and Jagan M. R. Pongubala, PNAS 2005 102: 4949-4953.
  23. Collin Sherrill: "An empirical Bayes approach to inferring large-scale gene association networks" by Juliane Schäfer and Korbinian Strimmer, Bioinformatics, Vol. 21 no. 6 2005, pages 754-764.
  24. Farroukh Alemi: personal research

    In lecture 5 I discussed techniques for finding kinetic rates. Two reviews presented in the lecture are given below
  25. Robert Beynon "Technique Review - The Dynamics of the Proteome: Strategies for measuring protein turnover on a proteome-wide scale" in Briefings in Functional Genomics and Proteomics, vol 3(4) 382-390 (2005)
  26. Roy Parker and Haiwei Song "Review - The enzymes and control of eukaryotic mRNA turnover" in Nature Structural and Molecular Biology, vol 11(2) 121-128 (2004).
    Chapter 2 in Davidson has several figures that are somewhat difficult to interpret based on the legend alone. The following papers contain helpful additional information
  27. "A Genomic Regulatory Network for Development" by Davidson et al., Science 295:1669-1678 (2002)
  28. "New Computational Approaches for Analysis of cis-Regulatory Networks" by Brown et al., Developmental Biology 246: 86-102 (2002)

Web sites of interest (please alert us as you find others)


The Davidson Lab
http://sugp.caltech.edu/endomes/
http://www.bio.davidson.edu/courses/GENOMICS/method/UrchDev.htm

The Hood lab
http://www.systemsbiology.org/Scientists_and_Research/Faculty_Groups/Hood_Group

The Levine lab
http://mcb.berkeley.edu/faculty/GEN/levinem.html

A database on Hox genes and their transcription factors:
http://www.iephb.nw.ru/labs/lab38/spirov/hox_pro/hoxdb.html

A very complete Web resource on C. elegans biology, development and the tools used to study it:
http://www.wormbook.org/chapters/

Network software
ARACNE (Algorithm for the Reconstruction of Accurate Cellular Networks), http://amdec-bioinfo.cu-genome.org/html/ARACNE.htm
CLR (Context Likelihood of Relatedness) Gardner lab http://gardnerlab.bu.edu/software&tools.html
BANJO (Bayesian Network Inference with Java Objects) Hartemink lab http://www.cs.duke.edu/~amink/software/banjo/download/

Databases of transcription factors and/or tools for identifying their binding sites
TRANSFAC http://www.gene-regulation.com/cgi-bin/pub/databases/transfac/search.cgi?
JASPAR http://jaspar.cgb.ki.se/TEMPLATES/help.htm
TRANSCompel http://www.gene-regulation.com/pub/databases/transcompel/compel.html
TESS: Transcription element search site http://www.cbil.upenn.edu/cgi-bin/tess/tess

Lecture notes will be posted below by the Friday following class lectures. This is a courtesy; please do not harass professors about posting lectures prior to class.

Lecture 1 notes from Dr. Weller's half Weller Lecture 1 link

Lecture 1 notes from Dr. Solka's half Solka Lecture 1 link

Lecture 2 notes Dr. Solka: using djm to graph in R Solka Lecture 2 link

Lecture 2 notes from Dr. Solka: Structure and Representation Solka Lecture 2 link

Lecture 3 notes from Dr. Weller Weller Lecture 3 link

Lecture 4 notes from Dr. Solka Solka Lecture 4 link

Lecture 5 notes from Dr. Weller Weller Lecture 5 link

Lecture 6 notes from Dr. Solka Solka Lecture 6 link


Note that the paper discussed in class is listed above.

Lecture 7 notes from Dr. Solka Solka Lecture 7 link


The reference for this lecture is Newman, M.E.J. "Who is the best connected scientist? A study of scientific coauthorship networks" Phys. Rev. E 64 (2001).

Lecture 8 notes from Dr. Weller Weller Lecture 8 link

Lecture 9 notes from Dr. Weller Weller Lecture 9 link

Lecture 10 notes from Dr. Weller Weller Lecture 10 link

Lecture 11 notes from Dr. Weller Weller Lecture 11 link


HW assignments will be posted here

Homework 1 notes from Dr. Weller HW1-weller

Homework 1 notes from Dr. Solka HW1-solka

Homework 2 Combined Weller/Solka questions HW2 Description
Note: Because of problems with the graphing package, another week has been allowed to complete this assignment. Please see the igraph library as a possible resource: http://cneurocvs.rmki.kfki.hu/igraph/#intro

Homework 3 notes from Dr. Weller HW3-weller


Note that the homework requires that you read a paper by Arbeitman et al. and use the data posted above. It is suggested that you use SEBINI (reference posted below) and CABIN, although you may use your own methods instead, or see ARACNE.

"Gene Expression During the Life Cycle of Drosophila melanogaster" by Arbeitman et al., Science 297: 2270-2275 (2002)

"SEBINI: Software Environment for Biological Network Inference" by Taylor et al., Bioinformatics (2006) SEBINI link

I have a link to the CABIN documentation here, although I could not find a paper describing it. There is a handout from a meeting, which I have scanned and you can look at here: CABIN Brochure page 1 link and CABIN brochure page 2, and the software documentation itself: CABIN documentation link


Homework 4 from Dr. Solka HW4-solka

The following table will be used to announce lecture topics, reading and homework assignments, etc. It is subject to change, so please check back frequently.


BINF739 Section 2 Spring 2007
Date | Lecturer | Topic | Assignments | Due date | HW and Quiz Alert
Week I: Jan 22 | Solka & Weller | ‘Omics and genetics data measurements and representations, Introduction to graph models | Chapter 1 in The Regulatory Genome, Chapter 1 in Graph Theory | Jan 29 | HW 1 assigned
Week II: Jan 29 | Solka | Using graphs to represent structured data | Chapter 2 in Graph Theory | Feb 6 | Quiz 1
Week III: Feb 6 | Weller | Regulatory logic of the genome, structure-function relationships and control functions | Chapter 2 in The Regulatory Genome | Feb 13 | HW 1 due, HW 2 assigned
Week IV: Feb 13 | Solka | The characterization of trees, common tree structures and paths, cycles and edge cuts | Chapter 3, 4 in Graph Theory | Feb 20 | Quiz 2
Week V: Feb 20 | Weller | Developmental processes, the regulatory state and gene regulatory network circuitry | Chapter 2 second half in The Regulatory Genome | Feb 27 | HW 2: 1 wk extension
Week VI: Feb 27 | Solka | Constructing networks, vertices and edges | Chapter 5 in Graph Theory | Mar 6 | Quiz 3 on Solka material; HW 3 assigned
Week VII: Mar 6 | Solka | Case study of document-author networks | Newman, MEJ PhyRev E64(2001) | Mar 20 | Paper selection due to Professors
Week VIII: Spring Break | NA | NA | NA | NA | NA
Week IX: Mar 20 | Weller | Class cancelled | NA | NA | NA
Week X: Mar 27 | Weller | Gene structure, gene regulatory networks and predicting evolutionary processes | Chapter 3 in The Regulatory Genome | Apr 3 | Quiz 4; HW 3 due, HW 4 assigned
Week XI: Apr 3 | Solka & Weller | Visualization of graphs and maps, topology and higher order surfaces | Chapter 8 in Graph Theory; Davidson Chapter 4 pt 1 | Apr 10 | Quiz 5; HW 4 due, HW 5 assigned
Week XII: Apr 10 | Solka & Weller | Visualization of data: coloring of graph components | Chapter 9 in Graph Theory; Davidson Chapter 4 pt 2 | Apr 17 | HW 6 assigned
Week XIII: Apr 17 | Weller | Gene expression networks and evolution | Davidson Chapter 5 | Apr 24 | HW 5 due
Week XIV: Apr 24 | Solka | Student paper presentations | NA | NA | HW 6 due; Paper presentation slides due
Week XV: May 1 | Solka & Weller | Student project presentations | NA | NA | Project slides due
Finals Week XVI: May 9th | Solka & Weller | NA | NA | NA | Project Report due (hard copy)