PySWRD Stars

Cython bindings and Python interface to Opal, a SIMD-accelerated pairwise aligner.

Actions Coverage PyPI Bioconda AUR Wheel Versions Implementations License Source Mirror Issues Docs Changelog Downloads

Overview

Searching a sequence inside a database of target sequences involves aligning the sequence to all the targets to find the highest scoring ones, which has a high computational cost. Several methods have been proposed over the years that use a pre-filter to select. In BLAST, k-mers are extracted from the query, and only targets containing high-scoring k-mers, with respect to the scoring matrix, are actually aligned.

SWORD proposes a pre-filter built on perfect hashing of short mismatching k-mers. The k-mers generated from the query sequence also include k-mers with mismatches to improve sensitivity. When a k-mer is found in a target sequence, SWORD computes the diagonal where it is located, similarly to FASTA. Target sequences are then selected based on the number of hits they have on the same diagonal. The pairwise alignment is then handled by the platform-accelerated Opal library.

PySWRD is a Python module that provides bindings to the heuristic filter part of SWORD using Cython. It implements a user-friendly, Pythonic interface to build a heuristic filter, process a database in chunks, and produce the indices of targets passing the filter. The resulting indices can be used to filter a PyOpal database, using Opal for pairwise alignment like the original C++ implementation.

Setup

Run pip install pyswrd in a shell to download the latest release from PyPi, or have a look at the Installation page to find other ways to install pyswrd.

Library

License

This library is provided under the MIT License. Opal was developed by Martin Šošić and is distributed under the terms of the MIT License as well.

This project is in no way not affiliated, sponsored, or otherwise endorsed by the original Opal authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.