Gromiha | Protein Bioinformatics | E-Book | sack.de
E-book, English, 339 pages

Gromiha Protein Bioinformatics

From Sequence to Function
1st edition, 2011
ISBN: 978-0-12-388424-4
Publisher: Elsevier Science & Techn.
Format: EPUB
Copy protection: 6 - ePub Watermark




One of the most pressing tasks in biotechnology today is to unlock the function of each of the thousands of new genes identified every day. Scientists do this by analyzing and interpreting proteins, which are considered the task force of a gene. This single-source reference covers all aspects of proteins, explaining fundamentals, synthesizing the latest literature, and demonstrating the most important bioinformatics tools available today for protein analysis, interpretation, and prediction. Students and researchers in biotechnology, bioinformatics, proteomics, protein engineering, biophysics, computational biology, molecular modeling, and drug design will find this a ready reference for staying current and productive in this fast-evolving interdisciplinary field.
- Explains all aspects of proteins, including sequence and structure analysis, prediction of protein structures, protein folding, protein stability, and protein interactions
- Presents a cohesive and accessible overview of the field, using illustrations to explain key concepts and detailed exercises for students

Michael is a frequent invited speaker at local conferences and universities in India and at international conferences focused on bioinformatics, computational biology, and molecular biology. He maintains close connections with research and teaching colleagues in India and contributes to international publications including handbooks, encyclopedias, and journals. He began his research on Computational Molecular Biophysics in 1989, earning a PhD in Biophysics from Bharathidasan University, India. He gained his first postdoctoral experience, on DNA bending and protein-DNA interactions, at the International Center for Genetic Engineering and Biotechnology (ICGEB), Italy. He developed databases for proteins and computer simulations of protein-DNA interactions during his subsequent postdoc at The Institute of Physical and Chemical Research (RIKEN), Japan. At AIST he continues to focus on various aspects of protein bioinformatics.


Further information and material


Front cover
Protein Bioinformatics: From Sequence to Function
Copyright page
Contents
Foreword
Preface
Acknowledgments
Chapter 1: Proteins
  1.1 Building blocks
  1.2 Hierarchical representation of proteins
  1.3 Structural classification of proteins
  1.4 Databases for protein sequences
  1.5 Protein structure databases
  1.6 Literature databases
  1.7 Exercises
  References
Chapter 2: Protein Sequence Analysis
  2.1 Sequence alignment
  2.2 Programs for aligning sequences
  2.3 Amino acid properties
  2.4 Amphipathic character of α-helices and β-strands
  2.5 Amino acid properties for sequence analysis
  2.6 Exercises
  References
Chapter 3: Protein Structure Analysis
  3.1 Assignment of secondary structures
  3.2 Computation of solvent accessibility
  3.3 Representation of solvent accessibility
  3.4 Residue–residue contacts
  3.5 Amino acid clusters in protein structures
  3.6 Contact potentials
  3.7 Cation-π interactions in protein structures
  3.8 Noncanonical interactions
  3.9 Free energy calculations
  3.10 Amino acid properties derived from protein structural data
  3.11 Parameters for proteins
  3.12 Protein structure comparison
  3.13 Exercises
  References
Chapter 4: Protein Folding Kinetics
  4.1 Φ-value analysis
  4.2 Folding nuclei and Φ-values
  4.3 Relationship between amino acid properties and Φ-values
  4.4 Φ-value analysis with hydrophobic clusters and long-range contact networks
  4.5 Kinetic database for proteins
  4.6 Prediction of protein folding rates
  4.7 Relationship between Φ-values and folding rates
  4.8 Exercises
  References
Chapter 5: Protein Structure Prediction
  5.1 Protein structural class
  5.2 Secondary structure content
  5.3 Secondary structural regions
  5.4 Discrimination of transmembrane helical proteins and predicting their membrane-spanning segments
  5.5 Discrimination of transmembrane strand proteins
  5.6 Identification of membrane-spanning β-strand segments
  5.7 Discrimination of disordered proteins and domains
  5.8 Solvent accessibility
  5.9 Inter-residue contact prediction
  5.10 Protein tertiary structure prediction
  5.11 Exercises
  References
Chapter 6: Protein Stability
  6.1 Determination of protein stability
  6.2 Thermodynamic database for proteins and mutants
  6.3 Relative contribution of noncovalent interactions to protein stability
  6.4 Stability of thermophilic proteins
  6.5 Analysis and prediction of protein mutant stability
  6.6 Exercises
  References
Chapter 7: Protein Interactions
  7.1 Protein–protein interactions
  7.2 Protein–DNA interactions
  7.3 Protein–RNA interactions
  7.4 Protein–ligand interactions
  7.5 Quantitative structure activity relationship in protein–ligand interactions
  7.6 Exercises
  References
Appendix A
  List of protein databases
  List of protein Web servers
Index


Chapter 2 Protein Sequence Analysis
Publisher Summary
This chapter discusses protein sequence analysis. Analyzing protein sequences reveals the preferences of amino acid residues and their distribution along the sequence, which helps in understanding the secondary and tertiary structures of proteins and their functions. Identifying similar motifs in protein sequences helps to predict structurally or functionally important regions. Profiles computed along a sequence from single amino acid properties reveal clusters of amino acids with similar properties. Amino acid sequences carry a great deal of hidden information, which can be used for developing sequence-based prediction methods. Comparing different amino acid sequences using alignment methods shows whether similar sequences are available, and such sequences can be used as templates for protein three-dimensional structure prediction. Two proteins are mainly compared by aligning their sequences or structures, which sets up a one-to-one correspondence between the residues of the two proteins. The simplest case is the global alignment of two sequences, in which the correspondence is maintained over the entire length of both proteins. An alternative is local alignment, in which only the most similar parts of the proteins are aligned.
2.1 Sequence alignment
The comparison of two proteins is mainly carried out by aligning their sequences or structures. In this method, a one-to-one correspondence is set up between the residues of the two proteins. The simplest case is the global alignment of two sequences, in which the correspondence is maintained over the entire length of both proteins. An alternative is local alignment, in which only the most similar parts of the proteins are aligned.
An alignment of two sequences A and B must obey the following conditions: (i) all residues are used in the alignment, and all appear in their original order, (ii) a residue from A is aligned with a residue from B, (iii) a residue can be aligned with a blank (-), and (iv) two blanks cannot be aligned.
Different ways of aligning two sequences, VEITGEIST and PRETERIT, are shown in Figure 2.1. From these alignments, one can compute the score for each aligned position and hence the total score. The scoring scheme is as follows: (i) score = 1 if the residues at the same position of sequences A and B are the same (e.g., in Alignment 1 [Figure 2.1], both sequences have E at position 3, and hence the position scores 1), (ii) score = 0 if the residues are different (e.g., at position 1 the residues are V and P in sequences A and B, respectively), and (iii) score = -1 if there is a gap in the alignment (e.g., positions 2 and 4 in Alignment 1). The sum of the scores over all aligned positions gives the net score for the aligned sequences. In alignments, the pairing of residues with similar properties (e.g., Val and Ile are hydrophobic, Glu and Asp are negatively charged, etc.) is also used to find similar sequences (Eidhammer et al. 2004).
Figure 2.1 Sequence alignment and scoring schemes for two typical sequences: score = 1 for same residue (shown in boxes); score = 0 for different residues and score = -1 for gap.
2.2 Programs for aligning sequences
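As a minimal sketch of the scoring scheme described in Section 2.1 (score 1 for identical residues, 0 for different residues, -1 for a gap), the Python code below scores a given gapped alignment and rejects columns in which two blanks are aligned. The function names are hypothetical. The gapped rows in the usage lines are an illustrative alignment that is consistent with the positions mentioned in the text; they are not claimed to reproduce Figure 2.1. The second function finds a best-scoring global alignment by Needleman-Wunsch dynamic programming, a standard technique that is not described in this excerpt and is included here as an assumption about how such a score could be optimized.

```python
GAP = "-"


def score_alignment(row_a, row_b, match=1, mismatch=0, gap=-1):
    """Sum the per-column scores of a gapped pairwise alignment."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for x, y in zip(row_a, row_b):
        if x == GAP and y == GAP:
            raise ValueError("two blanks cannot be aligned")
        if x == GAP or y == GAP:
            total += gap          # residue aligned with a blank
        elif x == y:
            total += match        # identical residues
        else:
            total += mismatch     # different residues
    return total


def global_align(a, b, match=1, mismatch=0, gap=-1):
    """Needleman-Wunsch sketch: return (best score, gapped a, gapped b)."""
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(dp[i - 1][j - 1] + sub,
                           dp[i - 1][j] + gap,
                           dp[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if dp[i][j] == dp[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append(GAP)
            i -= 1
        else:
            out_a.append(GAP)
            out_b.append(b[j - 1])
            j -= 1
    return dp[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


# Illustrative alignment: V/P differ, gaps at positions 2 and 4, E/E at 3.
print(score_alignment("V-EITGEIST", "PRE-TERI-T"))  # 1

best, row_a, row_b = global_align("VEITGEIST", "PRETERIT")
print(best)
print(row_a)
print(row_b)
```

Because every column is scored independently, the net score is simply the sum over columns, which is what makes the dynamic-programming recurrence applicable.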
Several computer programs have been developed for estimating the similarity score of two sequences and for finding similar sequences in available databases using pairwise and multiple alignments.
2.2.1 Basic Local Alignment Search Tool (BLAST)
Altschul et al. (1990) developed the basic local alignment search tool (BLAST), an approach for rapid sequence comparison that directly approximates alignments optimizing a measure of local similarity, the maximal segment pair score. This algorithm has been applied in a variety of contexts, including straightforward DNA and protein sequence database searches, motif searches, gene identification searches, and the analysis of multiple regions of similarity in long DNA sequences. In this method, the query protein sequence can be searched against several databases, including the nonredundant structures available in PDB, protein sequences in SWISS-PROT, etc. Furthermore, BLAST has several features, such as (i) identifying protein sequences similar to the query, (ii) finding members of a protein family or building a custom position-specific scoring matrix, (iii) finding proteins similar to the query around a given pattern, (iv) finding conserved domains in the query, and (v) searching for peptide motifs. BLAST is available at http://www.ncbi.nlm.nih.gov/BLAST/. An example of identifying protein sequences similar to the query is shown in Figure 2.2.
Figure 2.2 Retrieval of similar sequences using BLAST: (a) the input page showing the query sequence and other options, (b) the sequences showing high sequence identity with the query sequence, and (c) the sequence alignment of the two homologous sequences.
BLAST has several options for querying a sequence:
(i) It accepts the sequence as an accession number, as a gi, or in FASTA format. The input data can be given by copying and pasting the details directly into the Web page or by uploading a file from a local computer. The accession number is the identifier allotted in UniProt to each sequence (e.g., P61626); gi is an NCBI sequence identifier, written in bar-separated form (e.g., gi|48428995). A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than (“>”) symbol at the beginning. An example sequence in FASTA format is given below:
>gi|48428995|sp|P61626.1|LYSC_HUMAN RecName: Full=Lysozyme C
MKALIVLGLVLLSVTVQGKVFERCELARTLKRLGMDGYRGISLANWMCLAKWESGYNTRATNYNAGDRSTDYGIFQINSRYWCNDGKTPGAVNACHLSCSALLQDNIADAVACAKRVVRDPQGIRAWVAWRNRCQNRDVRQYVQGCGV
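The FASTA layout described above (one description line starting with ">", then one or more lines of sequence data) can be read with a few lines of Python. The sketch below is a minimal parser with hypothetical function names. It assumes that the bar-separated fields of the description line appear in the same order as in the lysozyme example, and it wraps the sequence at 60 residues per line purely for illustration. The `subrange` helper mimics selecting a fragment of the query with 1-based, inclusive positions, which is an assumption about how such a sub-range is counted.

```python
def parse_fasta(text):
    """Return a list of (description, sequence) tuples from FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)   # sequence data may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def subrange(sequence, start, end):
    """Return residues start..end, counted from 1 and inclusive."""
    return sequence[start - 1:end]


EXAMPLE = """>gi|48428995|sp|P61626.1|LYSC_HUMAN RecName: Full=Lysozyme C
MKALIVLGLVLLSVTVQGKVFERCELARTLKRLGMDGYRGISLANWMCLAKWESGYNTRA
TNYNAGDRSTDYGIFQINSRYWCNDGKTPGAVNACHLSCSALLQDNIADAVACAKRVVRD
PQGIRAWVAWRNRCQNRDVRQYVQGCGV
"""

records = parse_fasta(EXAMPLE)
description, sequence = records[0]

# Field positions are assumed from this single example header.
fields = description.split("|")
gi_number, accession = fields[1], fields[3]

print(gi_number, accession, len(sequence))  # 48428995 P61626.1 148
print(subrange(sequence, 1, 10))            # MKALIVLGLV
```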
The complete amino acid sequence in FASTA format is shown in Figure 2.2a. It is also possible to search with a fragment of the sequence by providing a sub-range of the query sequence.
(ii) A database can be selected to search against the input sequence. The nonredundant protein sequences (nr) database has been selected in Figure 2.2a.
(iii) The algorithm of the program can be selected; for finding similar protein sequences, BLASTP is used.
(iv) Several parameters can be adjusted: the maximum number of aligned sequences to display, the expect threshold, and the word size. The expect threshold (E-value) is the expected number of chance matches in a random model, and it is set at 10 as the default value. The word size is the length of the seed that initiates an alignment. In addition, scoring parameters can be selected for the matrix, gap costs, and compositional adjustments. The substitution matrix, which assigns a score for aligning any possible pair of residues, is a key element in evaluating the quality of a pairwise sequence alignment. Generally, BLOSUM62 is used as the substitution matrix; it is a 20 × 20 matrix covering all possible substitutions among the 20 amino acid residues (Table 2.1). Its scores are log-odds values estimated from how often each pairwise substitution is observed in aligned blocks of related proteins, so they reflect the biochemical character of amino acid residues (aliphatic, aromatic, positive charged, negative charged, polar, sulfur containing, etc., see Figure 1.2); the development of BLOSUM62 has been described in Eddy (2004). The gap cost is the cost of creating and extending a gap in an alignment. Furthermore, options are available to filter low-complexity regions and to mask the query and lowercase letters in the sequence.
Table 2.1 Blosum62...
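To make the word size, substitution matrix, and gap cost parameters more concrete, the following Python sketch lists the overlapping words of a query, scores an ungapped pair of segments with a small excerpt of BLOSUM62, and computes an affine gap cost. All function names are hypothetical. The excerpt covers only four residues (V, I, E, D), with values taken from the standard BLOSUM62 matrix, and is not a substitute for Table 2.1. The word size of 3 and the gap costs of 11 (existence) and 1 (extension) are assumptions chosen for illustration, as is the convention that a gap of length k costs existence + k × extension.

```python
def words(query, word_size=3):
    """List every overlapping word (seed candidate) of the given size."""
    return [query[i:i + word_size]
            for i in range(len(query) - word_size + 1)]


# Excerpt of BLOSUM62 for four residues only (illustration, not Table 2.1).
BLOSUM62_EXCERPT = {
    ("V", "V"): 4, ("I", "I"): 4, ("E", "E"): 5, ("D", "D"): 6,
    ("V", "I"): 3, ("E", "D"): 2,
    ("V", "E"): -2, ("V", "D"): -3, ("I", "E"): -3, ("I", "D"): -3,
}


def pair_score(x, y, matrix=BLOSUM62_EXCERPT):
    """Look up a substitution score; the matrix is symmetric."""
    if (x, y) in matrix:
        return matrix[(x, y)]
    return matrix[(y, x)]


def segment_score(seg_a, seg_b, matrix=BLOSUM62_EXCERPT):
    """Score two equal-length segments aligned without gaps."""
    if len(seg_a) != len(seg_b):
        raise ValueError("segments must have the same length")
    return sum(pair_score(x, y, matrix) for x, y in zip(seg_a, seg_b))


def gap_cost(length, existence=11, extension=1):
    """Affine gap cost: one charge to create the gap, one per position."""
    return existence + extension * length


print(words("MKALIV"))                # ['MKA', 'KAL', 'ALI', 'LIV']
print(segment_score("VEID", "IDVE"))  # 3 + 2 + 3 + 2 = 10
print(segment_score("VE", "EV"))      # -2 + -2 = -4
print(gap_cost(3))                    # 11 + 3 * 1 = 14
```

The contrast between the two segment scores shows the point made in the text: conservative substitutions (Val with Ile, Glu with Asp) score positively, while pairing a hydrophobic residue with a charged one is penalized.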


