doublet is a modified version of the Smith-Waterman algorithm that incorporates patterns of dipeptide covariation to align protein sequences.
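For orientation, the standard Smith-Waterman recurrence that doublet modifies can be sketched as below. This is only the unmodified baseline, not doublet's code: the function name `smith_waterman_score`, the match/mismatch scores, and the linear gap penalty are all illustrative assumptions, and the dipeptide substitution scoring described in the manuscript is not reproduced here.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Baseline Smith-Waterman with a linear gap penalty. The scoring
    parameters are illustrative assumptions, not doublet's values.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)  # current row; column 0 stays 0
        for j in range(1, len(b) + 1):
            # Single-residue substitution score (assumed simple scheme).
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                # start a new local alignment
                prev[j - 1] + s,  # align a[i-1] with b[j-1]
                prev[j] + gap,    # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best
```

A real protein aligner would replace the match/mismatch rule with a substitution matrix lookup, which is the step where a scheme based on residue pairs would differ from this single-residue sketch.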

The doublet manuscript (PDF, 0.4Mb) describes the doublet alignment algorithm, the derivation of dipeptide substitution matrices, and the results of testing doublet against standard Smith-Waterman in remote homolog detection.

The doublet code (105Mb) is distributed under the MIT Open Source License.

doublet was found to be statistically indistinguishable from Smith-Waterman for detecting remote homologs, using the Pairwise Sequence Comparison Evaluation tools described in Price et al., 2005, "Statistical evaluation of pairwise protein sequence comparison with the Bayesian bootstrap". The complete output of the tests is available here (293Mb).