HeuristicFilter

class pyswrd.HeuristicFilter

The SWORD heuristic filter for selecting alignment candidates.

__init__(queries, *, kmer_length=3, max_candidates=30000, score_threshold=13, scorer=None, threads=0, pool=None)

Create a new heuristic filter.

Parameters:

queries (Sequences) – The query sequences for which to filter the target database.

Keyword Arguments:
  • kmer_length (int) – The length of the k-mers to generate in the heuristic filter.

  • max_candidates (int) – The maximum number of candidate target sequences to keep per query sequence.

  • score_threshold (int) – The minimum score for generated k-mers.

  • scorer (Scorer) – The scorer to use for scoring alignments.

  • threads (int) – The number of threads to use for scoring k-mers in parallel. Set to one to disable multi-threading. Set to zero to use the number of CPUs reported by os.cpu_count.

  • pool (multiprocessing.pool.ThreadPool) – A running ThreadPool instance to use for scoring database chunks in parallel. If None is given, a new one is created.
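
The threads and pool conventions listed above can be illustrated with a standard-library sketch. The helper names resolve_threads and get_pool are hypothetical and are not part of pyswrd; the code only mirrors the documented behavior (zero means "use os.cpu_count", and a pool is created when none is given) and makes no claim about how the library implements it.

```python
import os
from multiprocessing.pool import ThreadPool
from typing import Optional, Tuple


def resolve_threads(threads: int) -> int:
    """Hypothetical helper: map the documented `threads` value to a worker count."""
    if threads < 0:
        raise ValueError("threads must be non-negative")
    if threads == 0:
        # os.cpu_count() may return None, so fall back to a single worker.
        return os.cpu_count() or 1
    return threads


def get_pool(threads: int = 0, pool: Optional[ThreadPool] = None) -> Tuple[ThreadPool, bool]:
    """Hypothetical helper: reuse a running pool, or create one if None is given.

    Returns the pool and a flag telling the caller whether it owns the pool
    (and is therefore responsible for closing it).
    """
    if pool is not None:
        return pool, False
    return ThreadPool(resolve_threads(threads)), True
```

Returning an ownership flag is a design choice of this sketch: a pool passed in by the caller is left running, while a pool created on demand should be closed by whoever created it.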

finish()

Finish scoring the database.

score(database)

Score a chunk of the database.

This method updates the internal counter of database sequences. The target indices correspond to the order in which the filter has seen the sequences. Passing the same sequence twice will cause the heuristic filter to treat the two occurrences as independent sequences.

Parameters:

database (Sequences) – The sequences of the database chunk.
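
The indexing rule described for this method can be pictured with a toy counter. ChunkIndexer is a hypothetical stand-in written for illustration, not pyswrd code; it only shows how indices keep increasing across chunks and how a repeated sequence receives a fresh index.

```python
class ChunkIndexer:
    """Toy stand-in (not pyswrd code) for a running counter of database sequences."""

    def __init__(self):
        self.seen = 0  # number of sequences received so far

    def score(self, chunk):
        # Every sequence gets the next free index, whether or not an
        # identical sequence was passed in an earlier chunk.
        indices = list(range(self.seen, self.seen + len(chunk)))
        self.seen += len(chunk)
        return indices


indexer = ChunkIndexer()
first = indexer.score(["MKV", "MLS"])  # indices 0 and 1
second = indexer.score(["MKV"])        # same residues as index 0, but index 2
```

Under this model, deduplicating the database before scoring is the caller's job if repeated targets are not wanted.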

scorer

The scorer used for generating the k-mers.

Type:

pyswrd.Scorer
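
As a rough illustration of how a scorer, a k-mer length and a score threshold can interact, the sketch below enumerates every k-mer whose score against a query k-mer reaches a threshold. It is a generic neighborhood-word example under stated assumptions: the four-letter alphabet and the toy_score function (+5 match, -1 mismatch) are hypothetical stand-ins for a real substitution matrix, and nothing here is taken from pyswrd's implementation.

```python
from itertools import product
from typing import List

ALPHABET = "ACDE"  # tiny illustrative alphabet, not the full amino-acid set


def toy_score(a: str, b: str) -> int:
    """Toy substitution score (assumption): +5 for a match, -1 for a mismatch."""
    return 5 if a == b else -1


def neighborhood(kmer: str, threshold: int) -> List[str]:
    """Return every k-mer over ALPHABET scoring at least `threshold` against `kmer`."""
    kept = []
    for candidate in product(ALPHABET, repeat=len(kmer)):
        score = sum(toy_score(x, y) for x, y in zip(kmer, candidate))
        if score >= threshold:
            kept.append("".join(candidate))
    return kept


exact_only = neighborhood("ACD", 15)   # only a perfect match reaches 15
one_mismatch = neighborhood("ACD", 9)  # also admits k-mers with one mismatch
```

Raising the threshold shrinks the set of generated k-mers, which is the trade-off a minimum k-mer score controls in this toy model.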