BLAST Scoring and Statistics (2024)

Alignment scoring

Position-independent scoring

Traditional BLAST uses position-independent scoring: the same substitution gets the same score at any position in the alignment.

Nucleotide Scoring

Nucleotide alignments use an identity scoring system: a simple match/mismatch scheme with a positive score for a match, a negative score for a mismatch, and penalties for opening and extending a gap. The image below shows how BLAST scores and represents a nucleotide alignment.

[Image: how BLAST scores and represents a nucleotide alignment]
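The match/mismatch and gap scoring described above can be sketched in a few lines of Python. This is only an illustration, not BLAST's own code: the function name and the parameter values (match +1, mismatch -2, gap open 5, gap extend 2) are assumptions chosen for the example, and a gap of length k is assumed to cost the open penalty plus k times the extend penalty.

```python
def score_nucleotide_alignment(query, subject, match=1, mismatch=-2,
                               gap_open=5, gap_extend=2):
    """Score two aligned rows of equal length; '-' marks a gap.

    Assumed convention: a gap of length k costs gap_open + k * gap_extend.
    All parameter values are illustrative, not program defaults.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")

    score = 0
    prev_gap_row = None  # which row held a gap in the previous column
    for q, s in zip(query, subject):
        if q == "-" and s == "-":
            raise ValueError("a column cannot be a gap in both rows")
        gap_row = "query" if q == "-" else "subject" if s == "-" else None
        if gap_row is None:
            # Identity scoring: only match versus mismatch matters.
            score += match if q.upper() == s.upper() else mismatch
        else:
            if gap_row != prev_gap_row:
                score -= gap_open    # charged once per gap
            score -= gap_extend      # charged for every gapped column
        prev_gap_row = gap_row
    return score


# 7 matches, 1 mismatch, one gap of length 1: 7 - 2 - (5 + 2) = -2
print(score_nucleotide_alignment("ACGTTACGT", "ACGT-ACCT"))
```

Changing the match and mismatch values changes how much divergence an alignment can tolerate before its score turns negative, which is one reason different nucleotide programs can report different alignments and scores for the same pair of sequences.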

You can use BLAST 2 Sequences to see a megablast alignment between a human insulin transcript (NM_000207.3) and a predicted insulin transcript (XM_043971863.1) from the colocolo opossum.

The above alignment was produced by the megablast program, which is less sensitive (but faster) than blastn.

Do you see differences in the alignment and score when using the more sensitive program?

Protein Scoring

Protein alignments use a scoring system based on frequencies of amino acid substitutions in related proteins. The default scoring matrix is BLOSUM62, shown below. The BLOSUM series is derived from observed substitution frequencies in ungapped alignment blocks of related proteins. BLOSUM62 includes information up to 62% identity. Experiments have shown that this is the best general-purpose scoring system. Other matrices available for protein BLAST include several from the BLOSUM series tuned to different evolutionary distances and several from the PAM series.

The numbers in BLOSUM62 are log-odds ratios of the observed substitution frequency to the background frequency. Substitutions that occur more often than expected by chance have positive scores, those that occur less often than expected by chance have negative scores, and those that occur at the background frequency get a score of zero.

[Image: the BLOSUM62 scoring matrix]
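The log-odds relationship described above can be made concrete with a short Python sketch. The frequencies used here are hypothetical numbers picked so the arithmetic is easy to follow, not values from the BLOSUM data, and the half-bit scaling (2 times log base 2, rounded to an integer) is stated as an assumption about how BLOSUM62 scores are commonly expressed.

```python
import math


def log_odds_score(observed_pair_freq, background_a, background_b, scale=2.0):
    """Rounded, scaled log-odds score for one substitution.

    observed_pair_freq: how often the pair is seen aligned in related proteins
    background_a, background_b: how often each residue occurs overall
    scale=2.0 gives half-bit units (an assumption for this sketch).
    """
    expected_by_chance = background_a * background_b
    return round(scale * math.log2(observed_pair_freq / expected_by_chance))


# Hypothetical background frequencies of 0.125 for both residues,
# so the pair is expected by chance at 0.125 * 0.125 = 0.015625.
print(log_odds_score(0.0625, 0.125, 0.125))      # 4x more often than chance
print(log_odds_score(0.015625, 0.125, 0.125))    # exactly at chance
print(log_odds_score(0.00390625, 0.125, 0.125))  # 4x less often than chance
```

The three calls give a positive score, zero, and a negative score, matching the three cases in the text.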

It's easy to understand the BLOSUM62 scores in terms of amino acid chemistry and protein structure. Substitutions between amino acids whose side chains have similar size and chemistry have positive scores (e.g., aspartate (D)/glutamate (E)). Those involving dissimilar side chains have negative scores (e.g., phenylalanine (F)/glutamine (Q)). Self-substitution scores lie along the diagonal and, in part, reflect the abundance of the amino acids. Rare amino acids such as tryptophan (W) have relatively high scores. Common amino acids such as valine (V), leucine (L), and isoleucine (I) have lower scores. The relatively high self-scores for proline (P) and glycine (G) may be because these amino acids often have special roles in determining protein structure. Keep in mind, though, that the substitution scores in the BLOSUM matrices are based on observed frequencies, not on any predictions from amino acid properties.

The image below shows how BLAST scores and represents a protein alignment.

[Image: how BLAST scores and represents a protein alignment]
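As a rough sketch of matrix-based scoring, the Python below scores a toy protein alignment with a small hand-copied subset of BLOSUM62. The sequences and function names are made up for illustration, only the pairs needed for the example are included, and the gap costs (existence 11, extension 1) are assumed values rather than something stated in this text.

```python
# A few BLOSUM62 entries, keyed by the alphabetically sorted residue pair.
BLOSUM62_SUBSET = {
    ("W", "W"): 11, ("P", "P"): 7, ("G", "G"): 6,
    ("V", "V"): 4, ("L", "L"): 4, ("I", "I"): 4,
    ("D", "E"): 2, ("I", "V"): 3, ("F", "Q"): -3,
}


def pair_score(a, b):
    """Look up a substitution score; the matrix is symmetric."""
    return BLOSUM62_SUBSET[tuple(sorted((a, b)))]


def score_protein_alignment(query, subject, gap_open=11, gap_extend=1):
    """Score two aligned rows of equal length; '-' marks a gap.

    Assumed convention: a gap of length k costs gap_open + k * gap_extend.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")

    score = 0
    prev_gap_row = None
    for q, s in zip(query, subject):
        gap_row = "query" if q == "-" else "subject" if s == "-" else None
        if gap_row is None:
            score += pair_score(q, s)
        else:
            if gap_row != prev_gap_row:
                score -= gap_open
            score -= gap_extend
        prev_gap_row = gap_row
    return score


# W/W 11 + D/E 2 + V/I 3 + F/Q -3 + P/P 7 + G/G 6 = 26
print(score_protein_alignment("WDVFPG", "WEIQPG"))
# Same pairs plus one gap of length 1: 26 - (11 + 1) = 14
print(score_protein_alignment("WDVAFPG", "WEI-QPG"))
```

The lookup also shows the pattern on the diagonal: the rare tryptophan scores 11 against itself, while the common valine, leucine, and isoleucine each score 4.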

You can use BLAST 2 Sequences to see a blastp alignment between the human creatine kinase M protein sequence (NP_001815.2) and a bacterial arginine kinase protein (MCP4285491.1).

Position-dependent scoring

The position-independent scoring systems make the unrealistic assumption that every position in a protein or nucleotide sequence is equally likely to change. The position-specific scoring strategies described next do a better job of modeling real biological sequences and increase sensitivity.

Specialized BLAST protein programs such as PSI-BLAST and the Conserved Domain Database (CDD) Search (RPS-BLAST) either generate Position-Specific Scoring Matrices (PSSMs) or search a database of them. In a PSSM, the score for a particular substitution depends on the position in the alignment. This is a better model of proteins because it captures the fact that some amino acids are less likely to change than others: those directly involved in catalysis or in binding substrates, cofactors, or partner molecules, and those required for critical structural elements. PSSMs are generated from multiple sequence alignments, which are either built on the fly from a BLAST search (PSI-BLAST) or taken from a curated database of conserved domains (CDD Search). PSSMs are better than ordinary BLAST at detecting distant protein relationships and can have a more direct relationship to protein structure and function.
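To show what "the score depends on the position" means in practice, here is a deliberately simplified Python sketch that builds a PSSM from a toy multiple sequence alignment. It is not the PSI-BLAST procedure: the uniform background frequency, the fixed pseudocount of 1, the half-bit scaling, and the function names are all assumptions made for this example, and the four aligned sequences are invented.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Return one {residue: score} dict per alignment column.

    Scores are rounded half-bit log-odds of the column frequency
    (with a simple pseudocount) over a uniform 1/20 background.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment
                         if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        column_scores = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            column_scores[aa] = round(2 * math.log2(freq / background))
        pssm.append(column_scores)
    return pssm


def score_with_pssm(pssm, sequence):
    """Sum the column-specific scores for an ungapped sequence."""
    return sum(column[aa] for column, aa in zip(pssm, sequence))


# Column 0 is fully conserved (H); column 2 varies freely.
toy_alignment = ["HGA", "HGS", "HAT", "HGV"]
pssm = build_pssm(toy_alignment)
print(pssm[0]["H"], pssm[2]["A"])    # conserved H scores higher than variable A
print(score_with_pssm(pssm, "HGA"))
```

In this toy matrix, matching the conserved histidine in the first column is rewarded more than matching any residue in the variable third column, and a mismatch at the conserved position is penalized. A position-independent matrix cannot express that difference.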

You'll use PSI-BLAST for one example in this workshop. CDD search runs by default in all of our protein examples and will show any conserved domains in your protein queries.
