2006 Summer Internship Reading and Software Materials
The following papers cover the general topics of Delaunay
tessellation, computational mutagenesis, and machine learning. Pay special
attention to the definitions of balanced error rate (BER) and Matthews correlation
coefficient (MCC) given in #4 (Bao and Cui), because these are new concepts that
we will be applying. The paper in #3 (Masso, Lu, and Vaisman) was recently
published and covers one aspect of our work, using the residual scores of the
mutants with known, experimentally determined activity.
1. Singh R.K., Tropsha A., Vaisman I.I. Delaunay Tessellation
of Proteins: Four Body Nearest Neighbor Propensities of Amino Acid
Residues, J. Comput. Biol. 3 (1996) 213-221.
2. Masso M., Vaisman I. Comprehensive Mutagenesis of HIV-1
Protease: A Computational Geometry Approach, Biochem. Biophys. Res.
Comm. 305 (2003) 322-326.
3. Masso M., Lu Z., Vaisman I. Computational Mutagenesis
Studies of Protein Structure-Function Correlations, Proteins 64 (2006)
234-245.
4. Bao L., Cui Y. Prediction of the Phenotypic Effects of
Non-Synonymous Single Nucleotide Polymorphisms Using Structural and
Evolutionary Information, Bioinformatics 21 (2005) 2185-2190.
5. Krishnan V.G., Westhead D.R. A Comparative Study of
Machine-Learning Methods to Predict the Effects of Single Nucleotide
Polymorphisms on Protein Function, Bioinformatics 19 (2003) 2199-2209.
6. Karchin R., Kelly L., Sali A. Improving Functional Annotation
of Non-Synonomous SNPs with Information Theory, PSB 2005.
7. Fawcett T. ROC graphs: notes and practical considerations for
researchers (2004).
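As a quick reference for the two measures highlighted above, here is a minimal Python sketch of BER and MCC for a two-class problem. It uses the standard textbook formulas in terms of true/false positives and negatives. Check them against the definitions in #4 before relying on them, since the paper's notation may differ. The function names and the example counts are illustrative assumptions, not part of our software.

```python
import math


def balanced_error_rate(tp, tn, fp, fn):
    """Average of the error rates on the positive and negative classes."""
    # Fraction of actual positives that were misclassified.
    pos_error = fn / (tp + fn) if (tp + fn) else 0.0
    # Fraction of actual negatives that were misclassified.
    neg_error = fp / (tn + fp) if (tn + fp) else 0.0
    return 0.5 * (pos_error + neg_error)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient; ranges from -1 to +1."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: report 0 when any row or column of the table is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    tp, tn, fp, fn = 40, 30, 10, 20
    print(f"BER = {balanced_error_rate(tp, tn, fp, fn):.4f}")  # BER = 0.2917
    print(f"MCC = {matthews_corrcoef(tp, tn, fp, fn):.4f}")    # MCC = 0.4082
```

Unlike plain accuracy, both measures stay informative when one activity class is much larger than the other, which is why they matter for our datasets.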
Barnase Papers
- Paper 1 (Axe et al. 1) contains all the single point
mutants of barnase that we are examining (Fig. 2). The underlined
positions in Fig. 2 are those that interact directly with RNA.
- Paper 2 (Buckle et al. 1) is the reference used by
Paper 1 to identify the underlined positions. In particular, the positions
were taken from Fig. 5 of Paper 2.
- Paper 3 (Buckle et al. 2) discusses the interaction
of barnase and barstar. Barstar binds to and inhibits barnase, and the
residue positions in barnase that are at the barnase-barstar interface are
listed in Table 2, column 2 (notice that many of them are also positions
that interact with RNA).
- Paper 4 (Fersht) is a review article about barnase
and folding. It provides additional biological background on barnase,
which would be useful, for example, for the introduction of a paper that
you would write about our results.
- Paper 5 (Axe et al. 2) explores multiple mutants in
barnase. Although we will not be looking at these mutants for the
project, the paper contains a good deal of information for learning
more about barnase.
IL-3 Papers
- Paper 1 (Olins et al.) contains almost all the
single point mutants of IL-3 that we are examining (Table II); see Paper
3 below for the remaining mutants that we are using. Paper 1 defines
3 activity classes: full is > 20% of wt activity, moderate is
5-19% of wt activity, and low is < 5% of wt activity. However,
Table III of the same paper lists 16 mutants (all in the full class in
Table II) whose activity is actually more than 5-fold greater than wt
(> 500% of wt activity). In addition, 12 of the extra mutants that we
included from Paper 3 (Tables I, II, and III) have biological activity
greater than 100%. Some of the mutants in the tables of Paper 3 also
appear in Paper 1; whenever there was a discrepancy, we used the activity
class given in Paper 1. In fact, one of these 12 mutants (E75K) is listed
only as “moderate” in Paper 1, so we called it moderate. That leaves a
total of 16+11=27 mutants from both papers with activity greater than
100% that are labeled “full” in our dataset.
Before you run the automute.pl program, take these 27 mutants
out of the full class in our dataset and place them in a separate activity
class (call it "increased"), which ranks even higher than full. Your
Delaunay directory has 2 files, 1jli_activity_mutants.xls and
1jli_activity_mutants.txt, which contain all the mutants and their
activity (last summer you typed all of it in Excel and then also saved it
as a .txt file so that the automute.pl program can read it). Download
the files to your PC, change these 27 mutants from full to
increased in both files, and upload the files back to the Delaunay
directory (all using the SSH Secure File Transfer). Alternatively, if you
remember how to use the vi editor, you can make the changes directly from
the command prompt in the SSH Secure Shell.
- Paper 2 (Feng et al.) is the primary citation for
the 1JLI structure in PDB. There is a lot of good biology background about
the protein in this paper, and you can find information for annotating
amino acids into groups. For example, on page 528 they identify all the
amino acids in IL-3 that form a buried hydrophobic core.
- Paper 3 (Bagley et al.) uses molecular modeling to identify
the amino acids in IL-3 that are at the interface between it and the
IL-3 receptor. This paper also includes good biological
background about IL-3, as well as a summary of previous mutagenesis
studies on IL-3 and the importance of certain positions for activity. As
mentioned above, look at Tables I, II, and III in this paper for the extra
mutants that we included in our dataset.
- Paper 4 (Klein et al. 1) also concerns the
identification of the receptor binding site in IL-3. Compare the results
in this paper to those in Paper 3.
- Paper 5 (Klein et al. 2), Paper 6 (Lopez et al.), and
Paper 7 (Klein et al. 3) may also contain useful information, but we
will not be working with the multiple mutants of IL-3.
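The relabeling step described under Paper 1 (moving 27 mutants from full to increased) can also be scripted rather than done by hand. The Python sketch below rests on assumptions that nothing above confirms: that 1jli_activity_mutants.txt is tab-delimited, with the mutant name (e.g. E75K) in the first column and the activity class in the last column. The function names are hypothetical, and the set of 27 mutants is deliberately left empty here; it must be filled in from Table III of Paper 1 and Tables I, II, and III of Paper 3.

```python
import csv

# ASSUMPTION: fill this in by hand with the 27 mutant names taken from
# Table III of Paper 1 and Tables I-III of Paper 3. Left empty on purpose.
INCREASED_MUTANTS = set()


def relabel_rows(rows, increased_mutants, old_class="full", new_class="increased"):
    """Return (updated_rows, number_changed) without modifying the input rows.

    ASSUMED row layout: mutant name in the first field, activity class in the last.
    Only mutants currently in old_class are moved, so a mutant such as E75K
    that is recorded as moderate stays moderate even if listed by mistake.
    """
    updated = []
    changed = 0
    for row in rows:
        row = list(row)  # copy so the caller's data is left untouched
        if row and row[0] in increased_mutants and row[-1].strip().lower() == old_class:
            row[-1] = new_class
            changed += 1
        updated.append(row)
    return updated, changed


def relabel_file(in_path, out_path, increased_mutants):
    """Read a tab-delimited file, relabel it, write the result to out_path."""
    with open(in_path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    updated, changed = relabel_rows(rows, increased_mutants)
    with open(out_path, "w", newline="") as handle:
        csv.writer(handle, delimiter="\t", lineterminator="\n").writerows(updated)
    return changed
```

For example, relabel_file("1jli_activity_mutants.txt", "1jli_activity_mutants_increased.txt", INCREASED_MUTANTS) should return 27 once the set is complete (the output file name is only a suggestion). Any other count means a name was mistyped or a listed mutant was not in the full class. Writing to a new file leaves the original untouched until the result has been checked.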