Assessing and Improving the Accuracy of Detecting Molecular Adaptation with the TreeSAAP Analytical Software

David McClellan and David Ellison



The TreeSAAP software has been used successfully in a variety of protein studies to identify and characterize adaptation in terms of shifts in the physicochemical properties of amino acid replacements. It differentiates adaptive replacements from those that may have resulted from random mutation. The accuracy of TreeSAAP was tested with simulated protein-coding DNA data that were randomly generated using a bifurcating phylogeny to reflect a random pattern of mutation constrained only by the structure of the genetic code. A sample of 1402 simulated amino acid replacements yielded a default accuracy of 80.6%. More than 50% of the false-positive results were traced to just 11 of the possible single-step amino acid exchanges, each of which exhibited less than 50% accuracy. When these 11 exchanges were eliminated from the subsequent analysis, the accuracy of TreeSAAP increased to nearly 90%. Further testing of this modified approach for adverse implications with empirical data is warranted.
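The filtering step described above can be illustrated with a minimal sketch. It is not TreeSAAP code and does not reproduce the study's data: the exchange labels, the counts, and the function names (`accuracy`, `per_exchange_accuracy`, `filter_low_accuracy`) are all hypothetical, and accuracy is assumed here to mean the fraction of replacements that were classified correctly. The sketch computes overall accuracy, finds the exchange types whose own accuracy falls below 50%, removes them, and recomputes.

```python
from collections import defaultdict


def accuracy(calls):
    """Fraction of calls marked correct; calls is a list of (exchange, is_correct)."""
    return sum(ok for _, ok in calls) / len(calls)


def per_exchange_accuracy(calls):
    """Accuracy computed separately for each exchange type."""
    totals = defaultdict(lambda: [0, 0])  # exchange -> [correct, total]
    for exchange, ok in calls:
        totals[exchange][0] += ok
        totals[exchange][1] += 1
    return {e: correct / n for e, (correct, n) in totals.items()}


def filter_low_accuracy(calls, threshold=0.5):
    """Drop every exchange type whose accuracy is below the threshold."""
    acc = per_exchange_accuracy(calls)
    excluded = {e for e, a in acc.items() if a < threshold}
    kept = [(e, ok) for e, ok in calls if e not in excluded]
    return kept, excluded


# Toy data (hypothetical, NOT from the study): three made-up exchange types.
calls = (
    [("exch_A", True)] * 9 + [("exch_A", False)] * 1
    + [("exch_B", True)] * 8 + [("exch_B", False)] * 2
    + [("exch_C", True)] * 1 + [("exch_C", False)] * 4
)

kept, excluded = filter_low_accuracy(calls)
print(f"accuracy before filtering: {accuracy(calls):.2f}")
print(f"excluded exchanges: {sorted(excluded)}")
print(f"accuracy after filtering: {accuracy(kept):.2f}")
```

In this toy case a single poorly performing exchange type accounts for most of the errors, so removing it raises the overall figure. The same trade-off applies as in the abstract: the excluded exchanges are no longer evaluated at all.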

Index Terms: simulated protein-coding DNA sequences, TreeSAAP, analytical results, accuracy.
