Homology Modeling and Simulations of Nuclease Structures
As discussed in other chapters in this book (see especially Chapters 6, 7, 9, 23, and 24), many nucleases were first identified as such by their sequence identity to known nucleases. Several of these were isolated because they exhibited a seemingly unrelated activity (protein kinase, angiogenesis, eosinophil activation, tumor-cell growth inhibition). Multiple-sequence alignments of many proteins in a family can reveal a consistently occurring pattern that is not easily detected by pairwise alignments (1,2). Even proteins with extremely low overall sequence identity to the pancreatic RNase (ribonuclease) sequences, such as human angiogenin, can be recognized by a pattern of highly conserved amino acids in a multiple-sequence alignment (3). The sequences of six RNases (RNase A and two mutants thereof, bovine seminal RNase, angiogenin, and onconase), taken from those whose structures are available in the Protein Data Bank (PDB) (4), are aligned in Fig. 1. The “RNase mask” (asterisks in Fig. 1) for this sequence family consists of fewer than 20 absolutely conserved residues, most of which are involved in disulfide linkages or catalysis. The largest block of identity in the series is CKXXNTF. Because insertions and deletions alter the distance between other conserved residues throughout the alignment, the mask, although useful for classifying members of the family, cannot be used to search sequences for new RNase homologs in a straightforward way.
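The idea of a conservation mask, and the reason its spacing does not carry over to unaligned sequences, can be illustrated with a minimal Python sketch. The three aligned strings below are invented toy sequences, not the RNase alignment of Fig. 1, and the function names (`conserved_mask`, `longest_conserved_block`, `ungapped_positions`) are hypothetical helpers introduced only for this illustration. The sketch marks a column with an asterisk when every sequence carries the same residue, which is a stricter criterion than a pattern with wildcard positions such as CKXXNTF.

```python
# Toy alignment (invented for illustration; NOT the sequences of Fig. 1).
# '-' marks a gap introduced by the alignment.
TOY_ALIGNMENT = [
    "ACK-LNTFHG",
    "SCKPMNTFH-",
    "ACR-VNTFYG",
]


def conserved_mask(aligned):
    """Return '*' for columns where all sequences share one residue, ' ' otherwise."""
    if not aligned:
        raise ValueError("alignment is empty")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    mask = []
    for column in zip(*aligned):
        first = column[0]
        # A column of gaps is never counted as conserved.
        if first != "-" and all(residue == first for residue in column):
            mask.append("*")
        else:
            mask.append(" ")
    return "".join(mask)


def longest_conserved_block(aligned):
    """Return the longest run of consecutive absolutely conserved columns."""
    mask = conserved_mask(aligned)
    best_start, best_len = 0, 0
    start = None
    # A trailing space guarantees that a run ending at the last column is closed.
    for i, flag in enumerate(mask + " "):
        if flag == "*":
            if start is None:
                start = i
        elif start is not None:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = None
    return aligned[0][best_start:best_start + best_len]


def ungapped_positions(aligned_seq, mask):
    """Map conserved columns to 0-based positions in the gap-free sequence."""
    positions = []
    index = 0
    for residue, flag in zip(aligned_seq, mask):
        if residue == "-":
            continue  # gaps do not exist in the raw sequence
        if flag == "*":
            positions.append(index)
        index += 1
    return positions


if __name__ == "__main__":
    mask = conserved_mask(TOY_ALIGNMENT)
    for seq in TOY_ALIGNMENT:
        print(seq, ungapped_positions(seq, mask))
    print(mask)
    print(longest_conserved_block(TOY_ALIGNMENT))
```

In this toy case the conserved residues fall at different positions in the gap-free sequences, because the second sequence carries an insertion between two conserved columns. That is the difficulty described above: a mask read off the alignment with fixed spacing would fail to match a homolog whose loops differ in length, so a practical search would need a gap-tolerant representation such as a variable-spacing pattern or a profile.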