API Reference
Functions
- pyswrd.search(queries, targets, *, gap_open=10, gap_extend=1, scorer_name='BLOSUM62', kmer_length=3, max_candidates=30000, score_threshold=13, max_alignments=10, max_evalue=10.0, algorithm='sw', threads=0)
Run a many-to-many search of query sequences against target sequences.
This function is a high-level wrapper around the different classes of the pyswrd library to support fast searches when all sequences are in memory.
- Parameters:
  - queries (Sequences or iterable of str) – The sequences to query the target sequences with.
  - targets (Sequences or iterable of str) – The sequences to be queried with the query sequences.
  - gap_open (int) – The penalty for opening a gap in each alignment.
  - gap_extend (int) – The penalty for extending a gap in each alignment.
  - scorer_name (str) – The name of the scoring matrix to use for scoring each alignment. See Scorer for the list of supported names.
  - kmer_length (int) – The length of the k-mers to use in the SWORD heuristic filter.
  - max_candidates (int) – The maximum number of candidates to retain in the heuristic filter.
  - max_evalue (float) – The E-value threshold above which to discard sequences before alignment.
  - algorithm (str) – The algorithm to use to perform pairwise alignment. See pyopal.Aligner.align for more information.
  - threads (int) – The number of threads to use to run the pre-filter and alignments. If zero is given, uses the number of CPUs reported by os.cpu_count.
  - pool (ThreadPool) – A running pool instance to use for parallelization. Useful for reusing the same pool across several calls of search. If None is given, spawns a new pool based on the threads argument.
- Yields:
  Hit – Hit objects for each hit passing the threshold parameters. Hits are grouped by query index and sorted by E-value.
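Because hits arrive grouped by query index and sorted by E-value, they can be consumed one query at a time with `itertools.groupby`. The sketch below is only an illustration: it uses a stand-in `FakeHit` tuple with made-up values instead of real `Hit` objects, so it runs without the library. The `query_index` attribute name is an assumption inferred from the description above; `target_index`, `score` and `evalue` are the attributes used in the example on this page.

```python
from collections import namedtuple
from itertools import groupby

# Stand-in for hit objects; `query_index` is an assumed attribute name.
FakeHit = namedtuple("FakeHit", ["query_index", "target_index", "score", "evalue"])


def best_hit_per_query(hits):
    """Return a dict mapping each query index to the first hit of its group.

    Relies on hits being grouped by query and sorted by E-value within
    each group, so the first hit of a group has the lowest E-value.
    """
    best = {}
    for query_index, group in groupby(hits, key=lambda h: h.query_index):
        best[query_index] = next(group)
    return best


# Made-up values, for illustration only.
hits = [
    FakeHit(0, 2, 268, 1e-33),
    FakeHit(0, 1, 40, 0.5),
    FakeHit(1, 3, 120, 1e-10),
]
best = best_hit_per_query(hits)
for query_index, hit in best.items():
    print(f"query={query_index} target={hit.target_index} evalue={hit.evalue:.1g}")
```

Note that `groupby` only merges adjacent items, so this approach depends on the grouping guarantee stated above; an unordered stream would need sorting first.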
Example
  >>> import pyswrd
  >>> queries = ["MAGFLKVVQLLAKYGSKAVQWAWANKGKILDWLNAGQAIDWVVSKIKQILGIK"]
  >>> targets = ([
  ...     "MESILDLQELETSEEESALMAASTVSNNC",
  ...     "MKKAVIVENKGCATCSIGAACLVDGPIPDFEIAGATGLFGLWG",
  ...     "MAGFLKVVQILAKYGSKAVQWAWANKGKILDWINAGQAIDWVVEKIKQILGIK",
  ...     "MTQIKVPTALIASVHGEGQHLFEPMAARCTCTTIISSSSTF",
  ... ])
  >>> for hit in pyswrd.search(queries, targets):
  ...     cigar = hit.result.cigar()
  ...     print(f"target={hit.target_index} score={hit.score} evalue={hit.evalue:.1g} cigar={cigar}")
  target=2 score=268 evalue=1e-33 cigar=53M
Classes
- A generator of k-mers with optional substitutions.
- A class storing the scoring matrix and gap parameters for alignments.
- A class for calculating E-values from alignment scores.
- A list of sequences.
- The score of the heuristic filter for a single target.
- The result of the heuristic filter.
- The SWORD heuristic filter for selecting alignment candidates.
- A single hit of a database search.
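To give a feel for what the `kmer_length` parameter controls, here is a minimal pure-Python sketch of k-mer extraction and exact k-mer sharing between two sequences. It is not the library's implementation: the real heuristic filter also considers substituted k-mers and scores candidates, neither of which is modelled here. The helper names `kmers` and `shared_kmers` are hypothetical.

```python
def kmers(sequence, k=3):
    """Yield (position, k-mer) pairs for every window of length k."""
    for i in range(len(sequence) - k + 1):
        yield i, sequence[i:i + k]


def shared_kmers(query, target, k=3):
    """Return the set of exact k-mers that occur in both sequences."""
    query_kmers = {kmer for _, kmer in kmers(query, k)}
    target_kmers = {kmer for _, kmer in kmers(target, k)}
    return query_kmers & target_kmers


query = "MAGFLKVV"
unrelated = "MESILDLQ"
similar = "MAGFLKVVQILAKYG"

# A related sequence shares many k-mers with the query; an unrelated one few or none.
print(len(shared_kmers(query, unrelated)))
print(len(shared_kmers(query, similar)))
```

A pre-filter built on this idea can discard targets with too few shared k-mers before running the more expensive pairwise alignment.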