skbio.sequence.distance.kmer_distance
- skbio.sequence.distance.kmer_distance(seq1, seq2, k, overlap=True)
Compute the kmer distance between a pair of sequences.
State: Experimental as of 0.5.0.
The kmer distance between two sequences is the fraction of the distinct kmers found in either sequence that occur in only one of the two sequences.
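The definition above can be sketched in plain Python using sets. This is only an illustration on ordinary strings, not the scikit-bio implementation: the names `kmer_set` and `kmer_distance_sketch` are hypothetical, and the handling of `overlap=False` (stepping through the sequence `k` positions at a time and dropping any trailing partial kmer) is an assumption.

```python
import math


def kmer_set(seq, k, overlap=True):
    """Return the set of distinct kmers of length k in a plain string."""
    # Assumption: non-overlapping kmers are taken every k positions.
    step = 1 if overlap else k
    return {seq[i:i + k] for i in range(0, len(seq) - k + 1, step)}


def kmer_distance_sketch(seq1, seq2, k, overlap=True):
    """Fraction of distinct kmers that occur in only one of the sequences."""
    if k < 1:
        raise ValueError("k must be greater than or equal to 1.")
    kmers1 = kmer_set(seq1, k, overlap)
    kmers2 = kmer_set(seq2, k, overlap)
    all_kmers = kmers1 | kmers2
    if not all_kmers:
        # No kmers are defined for either sequence.
        return math.nan
    # Symmetric difference: kmers present in exactly one sequence.
    return len(kmers1 ^ kmers2) / len(all_kmers)


result = kmer_distance_sketch("ATCGGCGAT", "GCAGATGTG", 3)
print(result)
```

For these two strings there are 13 distinct 3-mers in total and only `GAT` is shared, so the sketch yields 12/13, roughly 0.923.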
- Parameters
- Returns
kmer distance between seq1 and seq2.
- Return type
float
- Raises
ValueError – If k is less than 1.
TypeError – If seq1 and seq2 are not Sequence instances.
TypeError – If seq1 and seq2 are not the same type.
Notes
kmer counts are not incorporated in this distance metric.
np.nan will be returned if there are no kmers defined for the sequences.
Examples
>>> from skbio import Sequence
>>> from skbio.sequence.distance import kmer_distance
>>> seq1 = Sequence('ATCGGCGAT')
>>> seq2 = Sequence('GCAGATGTG')
>>> kmer_distance(seq1, seq2, 3)
0.9230769230...
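The two points in the Notes can be illustrated with a small set-based sketch on plain strings. It does not call scikit-bio; `distinct_kmers` and `set_distance` are hypothetical helper names, and overlapping kmers are assumed.

```python
import math


def distinct_kmers(seq, k):
    """Set of distinct overlapping kmers in a plain string."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def set_distance(a, b, k):
    """Presence/absence kmer distance; nan when no kmers exist."""
    s1, s2 = distinct_kmers(a, k), distinct_kmers(b, k)
    union = s1 | s2
    return len(s1 ^ s2) / len(union) if union else math.nan


# 'AAAA' contains the kmer 'AA' three times and 'AA' contains it once.
# Counts are ignored, so the two strings are at distance zero.
repeated = set_distance("AAAA", "AA", 2)

# Both strings are shorter than k, so no kmers are defined.
undefined = set_distance("A", "C", 2)

print(repeated, undefined)
```

Because only presence or absence of each kmer matters, two sequences with very different kmer abundances can still have a distance of zero.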