PySWRD
Cython bindings and Python interface to SWORD (Smith Waterman On Reduced Database), a method for fast database search.
Overview
Searching for a sequence in a database of target sequences involves aligning the query to every target to find the highest-scoring ones, which has a high computational cost. Several methods proposed over the years use a pre-filter to select only the most promising targets for alignment. In BLAST, k-mers are extracted from the query, and only targets containing high-scoring k-mers, with respect to the scoring matrix, are actually aligned.
SWORD proposes a pre-filter built on perfect hashing of short mismatching k-mers. The k-mers generated from the query sequence also include k-mers with mismatches to improve sensitivity. When a k-mer is found in a target sequence, SWORD computes the diagonal where it is located, similarly to FASTA. Target sequences are then selected based on the number of hits they have on the same diagonal. The pairwise alignment is then handled by the platform-accelerated Opal library.
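The diagonal-counting idea can be illustrated with a small pure-Python sketch. This is not SWORD's implementation: it uses a plain dictionary instead of perfect hashing, allows at most one substitution per k-mer, and ranks targets by their best single diagonal. The function names (`query_kmers`, `best_diagonal_hits`, `select_targets`) and the parameters `k` and `max_candidates` are hypothetical and chosen only for illustration.

```python
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def query_kmers(query, k=3):
    """Map every query k-mer, and every variant with one substitution,
    to the set of query positions it was generated from."""
    index = {}
    for i in range(len(query) - k + 1):
        kmer = query[i:i + k]
        index.setdefault(kmer, set()).add(i)
        for pos in range(k):
            for letter in ALPHABET:
                variant = kmer[:pos] + letter + kmer[pos + 1:]
                index.setdefault(variant, set()).add(i)
    return index


def best_diagonal_hits(index, target, k=3):
    """Count k-mer hits per diagonal (target position minus query position)
    and return the hit count of the most populated diagonal."""
    diagonals = Counter()
    for j in range(len(target) - k + 1):
        for i in index.get(target[j:j + k], ()):
            diagonals[j - i] += 1
    return max(diagonals.values(), default=0)


def select_targets(query, targets, k=3, max_candidates=2):
    """Return the indices of the targets with the most hits on one diagonal."""
    index = query_kmers(query, k)
    scores = [best_diagonal_hits(index, t, k) for t in targets]
    ranked = sorted(range(len(targets)), key=lambda n: scores[n], reverse=True)
    return [n for n in ranked[:max_candidates] if scores[n] > 0]


if __name__ == "__main__":
    targets = ["MKVLAAGIW", "WWWWWWWW", "MKVLCAGIW"]
    print(select_targets("MKVLAAGIW", targets))
```

Only the selected indices would then be passed on to full pairwise alignment, which is where the cost saving comes from.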
PySWRD is a Python module that provides bindings to the heuristic filter part of SWORD using Cython. It implements a user-friendly, Pythonic interface to build a heuristic filter, process a database in chunks, and produce the indices of targets passing the filter. The resulting indices can be used to filter a PyOpal database, using Opal for pairwise alignment like the original C++ implementation.
Setup
Run pip install pyswrd in a shell to download the latest release from PyPI, or have a look at the Installation page to find other ways to install pyswrd.
Library
License
This library is provided under the GNU General Public License 3.0 or later. SWORD was developed by Robert Vaser and Dario Pavlovic under the terms of the GNU General Public License 3.0 or later as well. See the Copyright Notice section for the full license.
This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original SWORD authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.