bibliography in progress...
Whole-genome alignments with reversed sequences as negative controls showed that E-value filtering is not enough to remove spurious alignments of tandem repeats, which therefore need to be masked (Frith, Hamada and Horton, 2010).
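The negative-control idea can be illustrated with a minimal sketch. It assumes that "reversed" means reversing the order of the letters without complementing them, so that the reversed sequence keeps the same composition and repeat structure but should have no true homology to the other genome. The function names `reverse_sequence` and `reversed_fasta` are hypothetical helpers written for this illustration; they are not part of LAST.

```python
def reverse_sequence(seq: str) -> str:
    """Reverse the letter order (no complementing), preserving case."""
    return seq[::-1]


def reversed_fasta(lines):
    """Yield FASTA lines in which every record's sequence is reversed.

    `lines` is any iterable of text lines (for example an open file).
    Each record is emitted as a header line followed by one sequence line.
    """
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new record starts: flush the previous one, if any.
            if header is not None:
                yield header
                yield reverse_sequence("".join(chunks))
            header, chunks = line, []
        elif line:
            chunks.append(line)
    # Flush the last record.
    if header is not None:
        yield header
        yield reverse_sequence("".join(chunks))
```

Aligning a genome against such a reversed copy of another genome should, under this assumption, produce only spurious alignments, which makes it possible to see what a given score or E-value threshold lets through.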
lastdb can use various seeding schemes to build its index. Frith and Noé (2014) discuss some of them. The RY seeds are made of non-overlapping words using the two-letter alphabet R = A|G, Y = C|T, to increase speed with a good tradeoff in sensitivity (Frith, Noé and Kucherov, 2020).
last-postmask (Frith, 2011): discards alignments that contain a significant amount of lower-case-masked sequence.
last-split (Frith and Kawaguchi, 2015): heuristic algorithm inspired by the “repeated matches algorithm” of Durbin et al. (1998). It searches for an optimal set of local alignments (as opposed to a set of optimal local alignments). Its output is also used by the third-party tool NanoSV (Cretu Stancu et al., 2017).
last-train (Hamada, Ono, Asai and Frith, 2017): estimation of alignment parameters.
local-rearrangements (Frith and Khan, 2018): detection and display of rearrangements supported by multiple long reads and by the ancestrality of the reference sequence.
tandem-genotypes (Mitsuhashi et al., 2019): detection of expansions of tandem repeats, after alignment with last-split.
LAST can align DNA sequences to protein databases using a 64 x 21 substitution matrix (Yao and Frith, 2021).
JRA (Joint Read Alignment) uses LAST (Shrestha et al., 2018).
A tutorial for the use of dnarrange is published in Frith and Mitsuhashi (2022).
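The two-letter alphabet behind the RY seeds mentioned above (R = A|G, Y = C|T) can be illustrated with a short sketch. It only shows the alphabet reduction, not how lastdb selects or indexes words. The names `RY_CLASS`, `to_ry` and `ry_words_match` are hypothetical, and mapping any other letter to `N` is an assumption made for this example.

```python
# Purines (A, G) collapse to R; pyrimidines (C, T) collapse to Y.
RY_CLASS = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def to_ry(seq: str) -> str:
    """Translate a DNA string into the two-letter R/Y alphabet.

    Letters other than A, C, G, T are mapped to 'N' (an assumption
    of this sketch, not something taken from LAST).
    """
    return "".join(RY_CLASS.get(base, "N") for base in seq.upper())


def ry_words_match(a: str, b: str) -> bool:
    """Return True if two words are identical in the R/Y alphabet."""
    return to_ry(a) == to_ry(b)
```

In this reduced alphabet, two words that differ only by transitions (A<->G or C<->T) still match, whereas a transversion breaks the match. For example, `ACGT` and `GTAC` both reduce to `RYRY`.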
Frith MC, Mitsuhashi S.
bioRxiv. Posted August 15, 2022. doi:10.1101/2022.05.30.494079
Finding rearrangements in nanopore DNA reads with last and dnarrange.
Frith MC, Hamada M, Horton P.
BMC Bioinformatics. 2010 Feb 9;11:80. doi:10.1186/1471-2105-11-80
Parameters for accurate genome alignment.
Yao Y, Frith MC.
In: Martín-Vide C., Vega-Rodríguez M.A., Wheeler T. (eds) Algorithms for Computational Biology. AlCoB 2021. Lecture Notes in Computer Science, vol 12715. Springer, Cham. doi:10.1007/978-3-030-74432-8_11
Improved DNA-versus-Protein Homology Search for Protein Fossils.
Frith MC, Noé L, Kucherov G.
Bioinformatics. 2020 Dec 21;36(22-23):5344–50. doi:10.1093/bioinformatics/btaa1054.
Minimally-overlapping words for sequence similarity search.
Frith MC, Noé L.
Nucleic Acids Res. 2014 Apr;42(7):e59. doi:10.1093/nar/gku104
Improved search heuristics find 20,000 new alignments between human and mouse genomes.
Shrestha AMS, Frith MC, Asai K, Richard H.
Nucleic Acids Res. 2018 Feb 16;46(3):e18. doi:10.1093/nar/gkx1175
Jointly aligning a group of DNA reads improves accuracy of identifying large deletions.
Cretu Stancu M, van Roosmalen MJ, Renkens I, Nieboer MM, Middelkamp S, de Ligt J, Pregno G, Giachino D, Mandrile G, Espejo Valle-Inclan J, Korzelius J, de Bruijn E, Cuppen E, Talkowski ME, Marschall T, de Ridder J, Kloosterman WP.
Nat Commun. 2017 Nov 6;8(1):1326. doi:10.1038/s41467-017-01343-4
Mapping and phasing of structural variation in patient genomes using nanopore sequencing.
Mitsuhashi S, Frith MC, Mizuguchi T, Miyatake S, Toyota T, Adachi H, Oma Y, Kino Y, Mitsuhashi H, Matsumoto N.
Genome Biol. 2019 Mar 19;20(1):58. doi:10.1186/s13059-019-1667-6
Tandem-genotypes: robust detection of tandem repeat expansions from long DNA reads.
Frith MC, Khan S.
Nucleic Acids Res. 2018 Feb 28;46(4):1661-1673. doi:10.1093/nar/gkx1266
A survey of localized sequence rearrangements in human DNA.
Frith MC, Kawaguchi R.
Genome Biol. 2015 May 21;16:106. doi:10.1186/s13059-015-0670-9
Split-alignment of genomes finds orthologies more accurately.
Hamada M, Ono Y, Asai K, Frith MC.
Bioinformatics. 2017 Mar 15;33(6):926-928. doi:10.1093/bioinformatics/btw742
Training alignment parameters for arbitrary sequencers with LAST-TRAIN.
Frith MC.
PLoS One. 2011;6(12):e28819. doi:10.1371/journal.pone.0028819
Gentle masking of low-complexity sequences improves homology search.