Hasin Abrar, Paul Medvedev
Given a sorted list of k-mers S, the rank curve of S is the function mapping a k-mer from the k-mer universe to the location in S where it either first appears or would be inserted. An exciting recent development is the observation that, fo...
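The rank curve definition in this abstract can be illustrated with a binary search. The following is a minimal sketch, not the authors' method: the function name `rank_curve` and the toy list `S` are assumptions made for illustration, and k-mers are assumed to be plain strings compared lexicographically.

```python
from bisect import bisect_left


def rank_curve(sorted_kmers, query):
    """Return the index in sorted_kmers where query first appears,
    or where it would be inserted to keep the list sorted."""
    # bisect_left returns the leftmost valid insertion point, which is
    # the index of the first occurrence when query is already present.
    return bisect_left(sorted_kmers, query)


# Toy sorted list of 3-mers (hypothetical data, duplicates allowed).
S = ["AAC", "ACG", "ACG", "GTA"]

present = rank_curve(S, "ACG")  # k-mer that occurs in S
absent = rank_curve(S, "CCC")   # k-mer from the universe that is not in S
print(present, absent)
```

Evaluating the function over every k-mer in the universe, in sorted order, yields a non-decreasing sequence of positions, which is the curve the abstract refers to.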
Applying the Safe-And-Complete Framework to Practical Genome Assembly
Sebastian Schmidt, Santeri Toivonen, Paul Medvedev et al.
Despite the long history of genome assembly research, there remains a large gap between the theoretical and practical work. There is practical software with little theoretical underpinning of accuracy on one hand and theoretical algorithms ...
Xiaofei Carl Zang, Xiang Li, Kyle Metcalfe et al.
Modern sequencing technologies allow for the addition of short-sequence tags, known as anchors, to both ends of a captured molecule. Anchors are useful in assembling the full-length sequence of a captured molecule as they can be used to acc...
RLBWT Tricks
Nathaniel K Brown, Travis Gagie, Massimiliano Rossi
Until recently, most experts would probably have agreed we cannot backwards-step in constant time with a run-length compressed Burrows-Wheeler Transform (RLBWT), since doing so relies on rank queries on sparse bitvectors and those inherit l...
Taxonomic classification with maximal exact matches in KATKA kernels and minimizer digests
Dominika Draesslerová, Omar Ahmed, Travis Gagie et al.
For taxonomic classification, we are asked to index the genomes in a phylogenetic tree such that later, given a DNA read, we can quickly choose a small subtree likely to contain the genome from which that read was drawn. Although popular cl...
Tizian Schulz, Paul Medvedev
Given a sequencing read, the broad goal of read mapping is to find the location(s) in the reference genome that have a "similar sequence". Traditionally, "similar sequence" was defined as having a high alignment score and read mappers were ...
Amatur Rahman, Yoann Dufresne, Paul Medvedev
A colored de Bruijn graph (also called a set of k-mer sets) is a set of k-mers with every k-mer assigned a set of colors. Colored de Bruijn graphs are used in a variety of applications, including variant calling, genome assembly, and datab...
Fast and Space-Efficient Construction of AVL Grammars from the LZ77 Parsing
Dominik Kempa, Ben Langmead
Grammar compression is, next to Lempel-Ziv (LZ77) and run-length Burrows-Wheeler transform (RLBWT), one of the most flexible approaches to representing and processing highly compressible strings. The main idea is to represent a text as a co...
An efficient linear mixed model framework for meta-analytic association studies across multiple contexts
Brandon Jew, Jiajin Li, Sriram Sankararaman et al.
Linear mixed models (LMMs) can be applied in the meta-analyses of responses from individuals across multiple contexts, increasing power to detect associations while accounting for confounding effects arising from within-individual variation...
Sampling Conditions for Conforming Voronoi Meshing by the VoroCrust Algorithm
Ahmed Abdelkader, Chandrajit L Bajaj, Mohamed S Ebeida et al.
We study the problem of decomposing a volume with a smooth boundary into a collection of Voronoi cells. Unlike the dual problem of conforming Delaunay meshing, a principled solution to this problem for generic smooth surfaces remained elusi...