Codon usage bias (also known as codon bias) is the preferential use of particular nucleotide triplets (codons) to encode amino acids in the protein-coding genes of a species. Each amino acid in a sequence can be encoded by between one codon (in the case of methionine and tryptophan) and six different codons. The frequencies with which synonymous codons are used to encode an amino acid vary between organisms and sometimes even within the same organism.
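The one-to-six range of synonymous codons can be checked directly against the standard genetic code. The following is a minimal Python sketch, not part of the study described here; the names CODON_TABLE and DEGENERACY are chosen for illustration, and the table is the standard nuclear code written in T, C, A, G order.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons per amino acid (stop codons excluded).
DEGENERACY = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

if __name__ == "__main__":
    for aa, n in sorted(DEGENERACY.items(), key=lambda item: (item[1], item[0])):
        print(aa, n)
```

Methionine (M) and tryptophan (W) each have a single codon, while leucine (L), serine (S) and arginine (R) have six.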
Codon usage patterns reflect the lineage and genome composition of a species. Mazumdar et al. investigated the differences in codon usage between grass and non-grass monocots.
The authors first reviewed previous studies of codon usage in monocots. They then extended the available information on codon usage and codon-pair context bias using four completely sequenced non-grass monocot genomes (Musa acuminata, Musa balbisiana, Phoenix dactylifera and Spirodela polyrhiza) for which comparable transcriptome datasets are available. They measured relative synonymous codon usage (RSCU), the effective number of codons, derived optimal codons and GC content, and then examined the relationships among these measures to infer the underlying evolutionary forces.
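Relative synonymous codon usage is conventionally defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below illustrates that definition on a toy sequence; it is an assumption-laden example, not the pipeline used by Mazumdar et al., and the function names count_codons and rscu are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codons(cds):
    """Count in-frame codons in a coding sequence, skipping ambiguous triplets."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    triplets = (cds[i:i + 3] for i in range(0, usable, 3))
    return Counter(t for t in triplets if t in CODON_TABLE)


def rscu(cds):
    """Return {codon: RSCU} for every amino acid observed in the sequence.

    RSCU = observed count / (family total / number of synonymous codons).
    A value of 1.0 means the codon is used exactly as often as expected
    under equal usage; stop codons are excluded.
    """
    counts = count_codons(cds)
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the sequence: RSCU undefined
        expected = total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


if __name__ == "__main__":
    # Toy input: four alanine codons, three GCT and one GCC.
    for codon, value in sorted(rscu("GCTGCTGCTGCC").items()):
        print(codon, round(value, 2))
```

In the toy input, alanine has four synonymous codons and four occurrences, so the expected count per codon is 1; GCT therefore has an RSCU of 3.0, GCC 1.0, and GCA and GCG 0.0. In practice the counts would be pooled over many coding sequences rather than taken from a single short one.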
Mazumdar et al. identified optimal codons, rare codons and preferred codon-pair contexts in the non-grass monocot species studied. In contrast to the bimodal distribution of GC3 (GC content at the third codon position) in grasses, non-grass monocots showed a unimodal distribution. The analysis also detected disproportionate use of G and C (and of A and T) in two- and four-codon amino acids, which rules out the mutational bias hypothesis as an explanation of genomic variation in GC content.
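GC3 is computed per gene, and the shape of its distribution across genes (unimodal or bimodal) is then read from a histogram. The following sketch shows one simple way to do this in Python. It is illustrative only: the functions gc3 and histogram are hypothetical names, all codons are included (some studies exclude methionine, tryptophan and stop codons), and the example sequences are made up.

```python
def gc3(cds):
    """Fraction of G or C at the third position of in-frame codons."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    third_bases = cds[2:usable:3]
    if not third_bases:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


def histogram(values, bins=10):
    """Bin values in [0, 1] into equal-width bins and return the counts."""
    counts = [0] * bins
    for v in values:
        index = min(int(v * bins), bins - 1)  # put 1.0 in the last bin
        counts[index] += 1
    return counts


if __name__ == "__main__":
    # Made-up coding sequences, for illustration only.
    genes = ["ATGGCCAAA", "ATGGCGCTGTAA", "ATGAAATTTTAA"]
    values = [gc3(g) for g in genes]
    print(values)
    print(histogram(values, bins=5))
```

With a real gene set, a single peak in the histogram would correspond to the unimodal pattern reported for non-grass monocots, and two separated peaks to the bimodal pattern reported for grasses.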
Optimal codons in these non-grass monocots show a preference for G or C at the third codon position. These results support the view that codon usage and nucleotide composition in non-grass monocots are mainly driven by GC-biased gene conversion.