Module needleman_wunsch

Functions for implementing the Needleman-Wunsch sequence alignment algorithm.

Functions
float, str, str, numpy rank-2 int array
needleman_wunsch_align(sequence1, sequence2, sub_matrix=None, sub_seq=None, gap_open_penalty=1, gap_extend_penalty=1.0, end_gap_open_penalty=0.0, end_gap_extend_penalty=0.0)
Align two sequences with the Needleman-Wunsch algorithm, using the EMBOSS logic for gap extensions.
numpy rank-2 float32 array, numpy rank-2 int16 array
needleman_wunsch_matrix(sequence1, sequence2, sub_matrix=None, sub_seq=None, gap_open_penalty=1, gap_extend_penalty=1.0, end_gap_open_penalty=0.0, end_gap_extend_penalty=0.0, epsilon=1e-07)
Construct the Needleman-Wunsch matrix for the given two sequences using the EMBOSS logic.
Variables
  SCORE_MATCH = 1
  SCORE_MISMATCH = -1
  SCORE_GAP_PENALTY = 1
  SCORES = array([ 0., 0., 0.], dtype=float32)
  TRACEBACK_DIAG = 0
  TRACEBACK_TOP = 1
  TRACEBACK_LEFT = 2
  __package__ = 'lib.sequence_alignment'

Imports: float32, int16, zeros, RelaxError, RelaxFault


Function Details

needleman_wunsch_align(sequence1, sequence2, sub_matrix=None, sub_seq=None, gap_open_penalty=1, gap_extend_penalty=1.0, end_gap_open_penalty=0.0, end_gap_extend_penalty=0.0)

Align two sequences with the Needleman-Wunsch algorithm, using the EMBOSS logic for gap extensions.

The implementation follows the Wikipedia article on the Needleman-Wunsch algorithm. It has been modified to match the EMBOSS behaviour, so that separate gap opening and gap extension penalties, as well as end gap penalties, are supported.

Parameters:
  • sequence1 (str) - The first sequence.
  • sequence2 (str) - The second sequence.
  • sub_matrix (numpy rank-2 int array) - The substitution matrix to use to determine the penalties.
  • sub_seq (str) - The one letter code sequence corresponding to the substitution matrix indices.
  • gap_open_penalty (float) - The penalty for introducing gaps, as a positive number.
  • gap_extend_penalty (float) - The penalty for extending a gap, as a positive number.
  • end_gap_open_penalty (float) - The optional penalty for opening a gap at the end of a sequence.
  • end_gap_extend_penalty (float) - The optional penalty for extending a gap at the end of a sequence.
Returns: float, str, str, numpy rank-2 int array
The alignment score, two alignment strings and the gap matrix.
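
The relationship between sub_matrix and sub_seq can be illustrated with a minimal sketch. It assumes that the position of a residue's one letter code in sub_seq gives its row or column index in sub_matrix. The alphabet, the score values, and the helper name substitution_score() are hypothetical and not part of this module.

```python
import numpy as np

# Hypothetical 4-letter alphabet and scores, for illustration only.
sub_seq = "ACGT"
sub_matrix = np.array([
    [ 5, -4, -4, -4],
    [-4,  5, -4, -4],
    [-4, -4,  5, -4],
    [-4, -4, -4,  5],
], dtype=np.int16)


def substitution_score(res1, res2, sub_matrix, sub_seq):
    """Look up the score for aligning residue res1 with residue res2.

    Hypothetical helper:  the index of each one letter code in sub_seq
    is assumed to select the row and column of sub_matrix.
    """
    i = sub_seq.index(res1)    # Row index from the one letter code.
    j = sub_seq.index(res2)    # Column index from the one letter code.
    return int(sub_matrix[i, j])


print(substitution_score("A", "A", sub_matrix, sub_seq))    # Identical residues.
print(substitution_score("A", "G", sub_matrix, sub_seq))    # Different residues.
```

A residue that is absent from sub_seq makes str.index() raise a ValueError in this sketch. How the module itself handles that case is not stated here.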

needleman_wunsch_matrix(sequence1, sequence2, sub_matrix=None, sub_seq=None, gap_open_penalty=1, gap_extend_penalty=1.0, end_gap_open_penalty=0.0, end_gap_extend_penalty=0.0, epsilon=1e-07)

Construct the Needleman-Wunsch matrix for the given two sequences using the EMBOSS logic.

The algorithm has been modified to match the EMBOSS behaviour, so that separate gap opening and gap extension penalties, as well as end gap penalties, are supported.

Parameters:
  • sequence1 (str) - The first sequence.
  • sequence2 (str) - The second sequence.
  • sub_matrix (numpy rank-2 int16 array) - The substitution matrix to use to determine the penalties.
  • sub_seq (str) - The one letter code sequence corresponding to the substitution matrix indices.
  • gap_open_penalty (float) - The penalty for introducing gaps, as a positive number.
  • gap_extend_penalty (float) - The penalty for extending a gap, as a positive number.
  • end_gap_open_penalty (float) - The optional penalty for opening a gap at the end of a sequence.
  • end_gap_extend_penalty (float) - The optional penalty for extending a gap at the end of a sequence.
  • epsilon (float) - A value close to zero to determine if two numbers are the same, within this precision.
Returns: numpy rank-2 float32 array, numpy rank-2 int16 array
The Needleman-Wunsch matrix and traceback matrix.
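
As a rough illustration of what the two returned matrices hold, the following sketch fills a score matrix and a traceback matrix for the textbook form of the algorithm and then walks the traceback to build the alignment strings. It is not the module's implementation. It assumes a single linear gap penalty, so it has no EMBOSS-style gap opening, gap extension, or end gap penalties, no substitution matrix, and no epsilon comparison. The constant values are taken from the Variables listing, whereas the function names nw_matrix_sketch() and nw_align_sketch() and the meaning assigned to each traceback direction are assumptions.

```python
import numpy as np

# Values from the Variables listing of this module.
SCORE_MATCH = 1
SCORE_MISMATCH = -1
SCORE_GAP_PENALTY = 1
TRACEBACK_DIAG = 0
TRACEBACK_TOP = 1
TRACEBACK_LEFT = 2


def nw_matrix_sketch(sequence1, sequence2):
    """Fill the score and traceback matrices using a linear gap penalty."""
    M, N = len(sequence1), len(sequence2)
    matrix = np.zeros((M + 1, N + 1), dtype=np.float32)
    traceback = np.zeros((M + 1, N + 1), dtype=np.int16)

    # First column and first row:  only gaps are possible.
    for i in range(1, M + 1):
        matrix[i, 0] = -SCORE_GAP_PENALTY * i
        traceback[i, 0] = TRACEBACK_TOP
    for j in range(1, N + 1):
        matrix[0, j] = -SCORE_GAP_PENALTY * j
        traceback[0, j] = TRACEBACK_LEFT

    for i in range(1, M + 1):
        for j in range(1, N + 1):
            same = sequence1[i - 1] == sequence2[j - 1]
            sub = SCORE_MATCH if same else SCORE_MISMATCH
            # Ordered so that the index equals the traceback constant.
            scores = (
                matrix[i - 1, j - 1] + sub,             # TRACEBACK_DIAG
                matrix[i - 1, j] - SCORE_GAP_PENALTY,   # TRACEBACK_TOP
                matrix[i, j - 1] - SCORE_GAP_PENALTY,   # TRACEBACK_LEFT
            )
            best = int(np.argmax(scores))   # Ties resolve to the first maximum.
            matrix[i, j] = scores[best]
            traceback[i, j] = best

    return matrix, traceback


def nw_align_sketch(sequence1, sequence2):
    """Return the score and the two gapped alignment strings."""
    matrix, traceback = nw_matrix_sketch(sequence1, sequence2)
    i, j = len(sequence1), len(sequence2)
    align1, align2 = [], []

    # Walk back from the bottom right corner to the origin.
    while i > 0 or j > 0:
        move = traceback[i, j]
        if move == TRACEBACK_DIAG:
            align1.append(sequence1[i - 1])
            align2.append(sequence2[j - 1])
            i -= 1
            j -= 1
        elif move == TRACEBACK_TOP:
            align1.append(sequence1[i - 1])
            align2.append("-")
            i -= 1
        else:
            align1.append("-")
            align2.append(sequence2[j - 1])
            j -= 1

    score = float(matrix[-1, -1])
    return score, "".join(reversed(align1)), "".join(reversed(align2))


score, align1, align2 = nw_align_sketch("GATTACA", "GATACA")
print(score)
print(align1)
print(align2)
```

Ordering the candidate scores to match the traceback constants lets the index of the best score be stored directly in the traceback matrix. When several alignments share the best score, the one reported depends on the tie-breaking order, so the gap position printed above is only one of the equally scoring choices.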