Module msa

Multiple sequence alignment (MSA) algorithms.

Functions

central_star(sequences, algorithm='NW70', matrix='BLOSUM62', gap_open_penalty=1.0, gap_extend_penalty=1.0, end_gap_open_penalty=0.0, end_gap_extend_penalty=0.0)
Align multiple protein sequences to one reference by fusing multiple pairwise alignments.
Returns: list of str, numpy rank-2 int array

msa_general(sequences, residue_numbers=None, msa_algorithm='Central Star', pairwise_algorithm='NW70', matrix='BLOSUM62', gap_open_penalty=1.0, gap_extend_penalty=1.0, end_gap_open_penalty=0.0, end_gap_extend_penalty=0.0)
General interface for multiple sequence alignments (MSA).
Returns: list of str, numpy rank-2 int array

msa_residue_numbers(sequences, residue_numbers=None)
Align multiple sequences based on the residue numbering.
Returns: list of str, numpy rank-2 int array

msa_residue_skipping(strings=None, gaps=None)
Create the residue skipping data structure.
Returns: list of lists of int

Variables
  __package__ = 'lib.sequence_alignment'

Imports: float64, int16, zeros, sys, RelaxError, align_pairwise


Function Details

central_star(sequences, algorithm='NW70', matrix='BLOSUM62', gap_open_penalty=1.0, gap_extend_penalty=1.0, end_gap_open_penalty=0.0, end_gap_extend_penalty=0.0)

Align multiple protein sequences to one reference by fusing multiple pairwise alignments.

Parameters:
  • sequences (list of str) - The list of residue sequences as one letter codes.
  • algorithm (str) - The pairwise sequence alignment algorithm to use.
  • matrix (str) - The substitution matrix to use.
  • gap_open_penalty (float) - The penalty for introducing gaps, as a positive number.
  • gap_extend_penalty (float) - The penalty for extending a gap, as a positive number.
  • end_gap_open_penalty (float) - The optional penalty for opening a gap at the end of a sequence.
  • end_gap_extend_penalty (float) - The optional penalty for extending a gap at the end of a sequence.
Returns: list of str, numpy rank-2 int array
The list of alignment strings and the gap matrix.
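
The fusing step can be illustrated with a minimal sketch, assuming the pairwise alignments of each sequence against the reference have already been computed. The name fuse_pairwise and its input format are hypothetical and not part of this module, and the actual central_star() implementation may differ.

```python
def fuse_pairwise(reference, pairwise):
    """Fuse pairwise alignments against one reference into a single MSA.

    Hypothetical helper: 'pairwise' is a list of (ref_aligned, seq_aligned)
    string pairs, each an alignment of one sequence against 'reference',
    with '-' marking gaps.
    """
    n = len(reference)
    inserts = []   # inserts[k][i]: residues of sequence k inserted before reference position i
    matched = []   # matched[k][i]: residue (or '-') of sequence k aligned to reference position i
    for ref_aln, seq_aln in pairwise:
        if len(ref_aln) != len(seq_aln) or ref_aln.replace('-', '') != reference:
            raise ValueError("Each pairwise alignment must be against the same reference.")
        ins = [''] * (n + 1)
        match = []
        i = 0
        for r, s in zip(ref_aln, seq_aln):
            if r == '-':
                # Gap in the reference: the other sequence has an insertion here.
                ins[i] += s
            else:
                match.append(s)
                i += 1
        inserts.append(ins)
        matched.append(match)

    # The widest insertion before each reference position sets the column count.
    width = [max((len(ins[i]) for ins in inserts), default=0) for i in range(n + 1)]

    # The reference row receives gaps wherever any sequence has an insertion.
    rows = [''.join('-' * width[i] + reference[i] for i in range(n)) + '-' * width[n]]

    # Every other row pads its own insertions up to the shared width.
    for ins, match in zip(inserts, matched):
        body = ''.join(ins[i].ljust(width[i], '-') + match[i] for i in range(n))
        rows.append(body + ins[n].ljust(width[n], '-'))
    return rows


msa = fuse_pairwise('MKVL', [('MK-VL', 'MKAVL'), ('MKVL', 'M-VL')])
print(msa)   # ['MK-VL', 'MKAVL', 'M--VL']
```

In this sketch the insertion in the first pairwise alignment forces a gap column into the reference and into every other row, which is the sense in which the pairwise alignments are fused.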

msa_general(sequences, residue_numbers=None, msa_algorithm='Central Star', pairwise_algorithm='NW70', matrix='BLOSUM62', gap_open_penalty=1.0, gap_extend_penalty=1.0, end_gap_open_penalty=0.0, end_gap_extend_penalty=0.0)

General interface for multiple sequence alignments (MSA).

This can be used to select between the following MSA algorithms:

  • 'Central Star', to use the central_star() function.
  • 'residue number', to use the msa_residue_numbers() function.
Parameters:
  • sequences (list of str) - The list of residue sequences as one letter codes.
  • residue_numbers (list of list of int) - The list of residue numbers for each sequence.
  • msa_algorithm (str) - The multiple sequence alignment (MSA) algorithm to use.
  • pairwise_algorithm (str) - The pairwise sequence alignment algorithm to use.
  • matrix (str) - The substitution matrix to use.
  • gap_open_penalty (float) - The penalty for introducing gaps, as a positive number.
  • gap_extend_penalty (float) - The penalty for extending a gap, as a positive number.
  • end_gap_open_penalty (float) - The optional penalty for opening a gap at the end of a sequence.
  • end_gap_extend_penalty (float) - The optional penalty for extending a gap at the end of a sequence.
Returns: list of str, numpy rank-2 int array
The list of alignment strings and the gap matrix.
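
The selection between the two algorithms can be pictured as a simple dispatch on the msa_algorithm string. The sketch below is an assumption about the general shape only: msa_dispatch, the backends mapping and the stub functions are hypothetical, and ValueError stands in for whatever error the module actually raises. Judging from the signatures above, the pairwise options are passed only to the central star method and the residue numbers only to the residue number method.

```python
def msa_dispatch(sequences, residue_numbers=None, msa_algorithm='Central Star',
                 backends=None, **pairwise_options):
    """Hypothetical dispatcher mirroring the two msa_algorithm choices."""
    if msa_algorithm == 'Central Star':
        # Pairwise alignment options only matter for the central star method.
        return backends['Central Star'](sequences, **pairwise_options)
    if msa_algorithm == 'residue number':
        if residue_numbers is None:
            raise ValueError("The 'residue number' method requires residue_numbers.")
        return backends['residue number'](sequences, residue_numbers=residue_numbers)
    raise ValueError("Unknown MSA algorithm '%s'." % msa_algorithm)


# Stub backends standing in for central_star() and msa_residue_numbers().
def stub_central_star(sequences, **options):
    return ('central star', sorted(options))


def stub_residue_numbers(sequences, residue_numbers=None):
    return ('residue number', residue_numbers)


stubs = {'Central Star': stub_central_star, 'residue number': stub_residue_numbers}
result = msa_dispatch(['AKV', 'KVL'], residue_numbers=[[1, 2, 3], [2, 3, 4]],
                      msa_algorithm='residue number', backends=stubs)
```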

msa_residue_numbers(sequences, residue_numbers=None)

Align multiple sequences based on the residue numbering.

Parameters:
  • sequences (list of str) - The list of residue sequences as one letter codes.
  • residue_numbers (list of list of int) - The list of residue numbers for each sequence.
Returns: list of str, numpy rank-2 int array
The list of alignment strings and the gap matrix.
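
As an illustration of aligning by residue number, the following minimal sketch places residues with the same number in the same column. The function name align_by_residue_numbers is hypothetical, a nested list stands in for the numpy rank-2 int array, and the convention that 1 marks a gap in the gap matrix is an assumption not stated on this page.

```python
def align_by_residue_numbers(sequences, residue_numbers):
    """Sketch: build alignment strings and a gap matrix from residue numbers."""
    # One column per residue number seen in any sequence (assumed column layout).
    columns = sorted(set(num for numbers in residue_numbers for num in numbers))

    strings = []
    gaps = []
    for sequence, numbers in zip(sequences, residue_numbers):
        # Map each residue number of this sequence to its one letter code.
        lookup = dict(zip(numbers, sequence))
        row = [lookup.get(num, '-') for num in columns]
        strings.append(''.join(row))
        # Assumed convention: 1 marks a gap, 0 marks a residue.
        gaps.append([1 if res == '-' else 0 for res in row])
    return strings, gaps


strings, gaps = align_by_residue_numbers(['AKV', 'KVL'], [[1, 2, 3], [2, 3, 4]])
print(strings)   # ['AKV-', '-KVL']
print(gaps)      # [[0, 0, 0, 1], [1, 0, 0, 0]]
```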

msa_residue_skipping(strings=None, gaps=None)

Create the residue skipping data structure.

Parameters:
  • strings (list of str) - The list of alignment strings.
  • gaps (numpy rank-2 int array) - The gap matrix.
Returns: list of lists of int
The residue skipping data structure. The first dimension is the molecule and the second is the residue. A value of one means the residue should be skipped, whereas a value of zero means it should be kept.
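
One plausible reading of this data structure is sketched below: a residue is flagged for skipping when any molecule has a gap in its alignment column. This rule, the function name residue_skipping_sketch, the use of nested lists in place of a numpy array, and the convention that 1 marks a gap are all assumptions for illustration rather than the documented behaviour.

```python
def residue_skipping_sketch(strings, gaps):
    """Sketch: flag residues that sit in alignment columns containing a gap."""
    num_mols = len(strings)
    skip = []
    for mol in range(num_mols):
        skip.append([])
        for col in range(len(strings[mol])):
            # A gap in this molecule is not a residue, so it gets no entry.
            if gaps[mol][col]:
                continue
            # Assumed rule: skip the residue if any molecule has a gap in this column.
            has_gap = any(gaps[other][col] for other in range(num_mols))
            skip[mol].append(1 if has_gap else 0)
    return skip


skip = residue_skipping_sketch(['AKV-', '-KVL'], [[0, 0, 0, 1], [1, 0, 0, 0]])
print(skip)   # [[1, 0, 0], [0, 0, 1]]
```

Each inner list has one entry per residue of that molecule, not per alignment column, which matches the description of the second dimension as the residue.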