org.biojava.bio.alignment
Class NeedlemanWunsch

java.lang.Object
  extended by org.biojava.bio.alignment.SequenceAlignment
      extended by org.biojava.bio.alignment.NeedlemanWunsch
Direct Known Subclasses:
SmithWaterman

public class NeedlemanWunsch
extends SequenceAlignment

Needleman and Wunsch defined the problem of global sequence alignment, in which two sequences are aligned from the first to the last symbol. This class performs such global sequence comparisons efficiently by dynamic programming. If inserts and deletes are equally expensive and cost the same as the extension of a gap, the alignment method of this class does not use affine gap penalties. Otherwise it does. Affine gap penalties need four times as much memory, which significantly affects the run time if the computer needs to swap.
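
The dynamic programming idea described above can be illustrated with a minimal, self-contained sketch. It works on plain strings and uses a single gap cost (the non-affine case), and it is not the BioJava implementation: the class name GlobalAlignmentSketch, the method distance and the fixed match/replace costs standing in for a substitution matrix are all assumptions made for illustration.

```java
// Hypothetical sketch, not part of BioJava: global alignment cost with linear gap costs.
final class GlobalAlignmentSketch {

    /** Returns the minimal total cost of aligning all of query against all of target. */
    static double distance(String query, String target,
                           double match, double replace, double gap) {
        int n = query.length();
        int m = target.length();
        double[][] cost = new double[n + 1][m + 1];

        // Aligning a prefix against the empty sequence costs one gap per symbol.
        for (int i = 1; i <= n; i++) cost[i][0] = cost[i - 1][0] + gap;
        for (int j = 1; j <= m; j++) cost[0][j] = cost[0][j - 1] + gap;

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                // Fixed costs stand in for a substitution matrix lookup (assumption).
                double substitution =
                        query.charAt(i - 1) == target.charAt(j - 1) ? match : replace;
                double diagonal = cost[i - 1][j - 1] + substitution; // match or replace
                double up = cost[i - 1][j] + gap;                    // query symbol against a gap
                double left = cost[i][j - 1] + gap;                  // target symbol against a gap
                cost[i][j] = Math.min(diagonal, Math.min(up, left));
            }
        }
        // The last cell holds the cost of the best global alignment.
        return cost[n][m];
    }

    public static void main(String[] args) {
        System.out.println(distance("GATTACA", "GCATGCT", 0, 1, 1));
    }
}
```

With match = 0 and replace = gap = 1 this sketch computes the classic unit-cost edit distance. It keeps one matrix of (n + 1) x (m + 1) cells, which is the memory baseline that the affine variant exceeds.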

Since:
1.5
Author:
Andreas Dräger, Gero Greiner

Field Summary
protected  String alignment
           
protected  double[][] CostMatrix
           
protected  Alignment pairalign
           
protected  SubstitutionMatrix subMatrix
           
 
Constructor Summary
NeedlemanWunsch(double match, double replace, double insert, double delete, double gapExtend, SubstitutionMatrix subMat)
          Constructs a new object with the given parameters based on the Needleman-Wunsch algorithm. The alphabet of the sequences to be aligned is taken from the given substitution matrix.
 
Method Summary
 List alignAll(SequenceIterator source, SequenceDB subjectDB)
           
 Alignment getAlignment(Sequence query, Sequence target)
          Use this method to reuse the alignment calculated by this class in another BioJava class.
 String getAlignmentString()
           
 double getDelete()
          Returns the current cost of a single delete operation.
 double getEditDistance()
          Returns the edit distance according to the parameters of this object.
 double getGapExt()
          Returns the current cost of extending a gap.
 double getInsert()
          Returns the current cost of a single insert operation.
 double getMatch()
          Returns the current cost of a single match operation.
 double getReplace()
          Returns the current cost of a single replace operation.
protected static double min(double x, double y, double z)
          This just computes the minimum of three double values.
 double pairwiseAlignment(Sequence query, Sequence subject)
          Global pairwise sequence alignment of two BioJava Sequence objects according to the Needleman-Wunsch algorithm.
static void printAlignment(String align)
          Prints the alignment String on the screen (standard output).
static String printCostMatrix(double[][] CostMatrix, char[] queryChar, char[] targetChar)
          Prints a String representation of the CostMatrix for the given Alignment on the screen.
 void setDelete(double del)
          Sets the penalty for a delete operation to the specified value.
 void setGapExt(double ge)
          Sets the penalty for an extension of any gap (insert or delete) to the specified value.
 void setInsert(double ins)
          Sets the penalty for an insert operation to the specified value.
 void setMatch(double ma)
          Sets the penalty for a match operation to the specified value.
 void setReplace(double rep)
          Sets the penalty for a replace operation to the specified value.
 void setSubstitutionMatrix(SubstitutionMatrix matrix)
          Sets the substitution matrix to be used to the specified one.
 
Methods inherited from class org.biojava.bio.alignment.SequenceAlignment
formatOutput
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CostMatrix

protected double[][] CostMatrix

subMatrix

protected SubstitutionMatrix subMatrix

pairalign

protected Alignment pairalign

alignment

protected String alignment
Constructor Detail

NeedlemanWunsch

public NeedlemanWunsch(double match,
                       double replace,
                       double insert,
                       double delete,
                       double gapExtend,
                       SubstitutionMatrix subMat)
Constructs a new object with the given parameters based on the Needleman-Wunsch algorithm. The alphabet of the sequences to be aligned is taken from the given substitution matrix.

Parameters:
match - The cost of a match operation. It is only used as a default value if the substitution matrix has no entry for a given pair of matching symbols.
replace - The cost of a replace operation. Like match, it is only used as a default value if the substitution matrix has no entry for the pair of symbols.
insert - The cost of a single insert operation.
delete - The cost of a single delete operation.
gapExtend - The cost of extending an existing gap (that is, a previous insert or delete). If the costs for insert and delete are equal and also equal to gapExtend, no affine gap penalties are used, which saves a significant amount of memory.
subMat - The substitution matrix object which gives the costs for matches and replaces.
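
The rule for when affine gap penalties come into play can be written down as a small sketch. The helper names below are hypothetical and not part of this class. The sketch also assumes that the first symbol of a gap pays the insert or delete cost and every further symbol pays gapExtend, which matches the parameter descriptions above but is not confirmed by them.

```java
// Hypothetical helpers, not part of BioJava: when does the gap model become affine?
final class GapModelSketch {

    /** True unless insert, delete and gapExtend are all equal (the non-affine case). */
    static boolean needsAffineGaps(double insert, double delete, double gapExtend) {
        return !(insert == delete && insert == gapExtend);
    }

    /**
     * Cost of one gap of the given length, assuming the first symbol pays
     * the opening cost and each further symbol pays the extension cost.
     */
    static double gapCost(int length, double open, double extend) {
        if (length <= 0) {
            return 0.0;
        }
        return open + (length - 1) * extend;
    }
}
```

Under this assumption, equal parameters make gapCost grow linearly with the gap length, so a single cost matrix is enough. With differing parameters the algorithm has to remember whether a gap is already open.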
Method Detail

setSubstitutionMatrix

public void setSubstitutionMatrix(SubstitutionMatrix matrix)
Sets the substitution matrix to be used to the specified one. Afterwards it is only possible to align sequences of the alphabet of this substitution matrix.

Parameters:
matrix - an instance of a substitution matrix.

setInsert

public void setInsert(double ins)
Sets the penalty for an insert operation to the specified value.

Parameters:
ins - costs for a single insert operation

setDelete

public void setDelete(double del)
Sets the penalty for a delete operation to the specified value.

Parameters:
del - costs for a single deletion operation

setGapExt

public void setGapExt(double ge)
Sets the penalty for an extension of any gap (insert or delete) to the specified value.

Parameters:
ge - costs for any gap extension

setMatch

public void setMatch(double ma)
Sets the penalty for a match operation to the specified value.

Parameters:
ma - costs for a single match operation

setReplace

public void setReplace(double rep)
Sets the penalty for a replace operation to the specified value.

Parameters:
rep - costs for a single replace operation

getInsert

public double getInsert()
Returns the current cost of a single insert operation.

Returns:
insert

getDelete

public double getDelete()
Returns the current cost of a single delete operation.

Returns:
delete

getGapExt

public double getGapExt()
Returns the current cost of extending a gap.

Returns:
gapExt

getMatch

public double getMatch()
Returns the current cost of a single match operation.

Returns:
match

getReplace

public double getReplace()
Returns the current cost of a single replace operation.

Returns:
replace

printCostMatrix

public static String printCostMatrix(double[][] CostMatrix,
                                     char[] queryChar,
                                     char[] targetChar)
Prints a String representation of the CostMatrix for the given alignment on the screen. Its only purpose is to give a better understanding of the algorithm. This method also works for all extensions of this class with all kinds of matrices.

Parameters:
queryChar - a character representation of the query sequence (mySequence.seqString().toCharArray()).
targetChar - a character representation of the target sequence.
Returns:
a String representation of the matrix.
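
As an illustration of what such a String representation might look like, here is a minimal sketch. It is not the BioJava code: the class name, the column width and the layout are assumptions, and so is the convention that rows belong to the query, columns to the target, and that row 0 and column 0 stand for the empty prefix.

```java
import java.util.Locale;

// Hypothetical formatter, not part of BioJava: renders a cost matrix as text.
final class CostMatrixFormatSketch {

    static String format(double[][] costMatrix, char[] queryChar, char[] targetChar) {
        StringBuilder sb = new StringBuilder();

        // Header: empty corner, empty label for column 0, then the target symbols.
        sb.append(String.format(Locale.ROOT, "%8s%8s", "", ""));
        for (char c : targetChar) {
            sb.append(String.format(Locale.ROOT, "%8s", c));
        }
        sb.append('\n');

        for (int i = 0; i < costMatrix.length; i++) {
            // Row 0 belongs to the empty prefix of the query and gets no symbol.
            String label = (i == 0) ? "" : String.valueOf(queryChar[i - 1]);
            sb.append(String.format(Locale.ROOT, "%8s", label));
            for (double value : costMatrix[i]) {
                sb.append(String.format(Locale.ROOT, "%8.1f", value));
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}
```

The character arrays would typically come from mySequence.seqString().toCharArray(), as noted for the queryChar parameter.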

printAlignment

public static void printAlignment(String align)
Prints the alignment String on the screen (standard output).

Parameters:
align - The parameter is typically given by the getAlignmentString() method.

getAlignment

public Alignment getAlignment(Sequence query,
                              Sequence target)
                       throws Exception
Use this method to reuse the alignment calculated by this class in another BioJava class. It performs pairwiseAlignment and returns an Alignment instance containing the two aligned sequences.

Specified by:
getAlignment in class SequenceAlignment
Returns:
Alignment object containing the two gapped sequences constructed from query and target.
Throws:
Exception
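
How two gapped sequences can be recovered from a filled cost matrix is sketched below on plain strings. This is a generic traceback under stated assumptions, not the BioJava implementation: the class and method names are hypothetical, gaps are written as '-', a single gap cost is used, and fixed match/replace costs stand in for the substitution matrix.

```java
// Hypothetical sketch, not part of BioJava: fill the cost matrix, then trace back.
final class TracebackSketch {

    /** Returns { gappedQuery, gappedTarget } for one optimal global alignment. */
    static String[] align(String query, String target,
                          double match, double replace, double gap) {
        int n = query.length();
        int m = target.length();
        double[][] cost = new double[n + 1][m + 1];
        for (int i = 1; i <= n; i++) cost[i][0] = cost[i - 1][0] + gap;
        for (int j = 1; j <= m; j++) cost[0][j] = cost[0][j - 1] + gap;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double sub = query.charAt(i - 1) == target.charAt(j - 1) ? match : replace;
                cost[i][j] = Math.min(cost[i - 1][j - 1] + sub,
                        Math.min(cost[i - 1][j] + gap, cost[i][j - 1] + gap));
            }
        }

        // Walk back from the last cell, always following a move that explains the cell value.
        StringBuilder q = new StringBuilder();
        StringBuilder t = new StringBuilder();
        int i = n;
        int j = m;
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0) {
                double sub = query.charAt(i - 1) == target.charAt(j - 1) ? match : replace;
                if (cost[i][j] == cost[i - 1][j - 1] + sub) {
                    q.append(query.charAt(--i));
                    t.append(target.charAt(--j));
                    continue;
                }
            }
            if (i > 0 && cost[i][j] == cost[i - 1][j] + gap) {
                q.append(query.charAt(--i)); // query symbol aligned to a gap
                t.append('-');
            } else {
                q.append('-');               // target symbol aligned to a gap
                t.append(target.charAt(--j));
            }
        }
        return new String[] { q.reverse().toString(), t.reverse().toString() };
    }

    public static void main(String[] args) {
        String[] gapped = align("ACGT", "AGT", 0, 1, 1);
        System.out.println(gapped[0]);
        System.out.println(gapped[1]);
    }
}
```

Both returned strings always have the same length, and removing the gap characters yields the original inputs again.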

getEditDistance

public double getEditDistance()
Returns the edit distance according to the parameters of this object. It simply returns the last element of the internal cost matrix (left side down). So if you extend this class, you can do the following: double myDistanceValue = foo; this.CostMatrix = new double[1][1]; this.CostMatrix[0][0] = myDistanceValue;

Returns:
the edit distance computed with the given parameters.
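
The subclassing hint above can be made concrete with a self-contained sketch. The two classes below are hypothetical stand-ins and do not extend the real NeedlemanWunsch. The sketch reads "last element" as the last column of the last row, which is an assumption. For the 1 x 1 matrix used in the hint the distinction does not matter.

```java
// Hypothetical stand-in for a class that keeps its result in a protected cost matrix.
class CostMatrixHolderSketch {
    protected double[][] CostMatrix;

    /** Returns the last element of the cost matrix (assumed: last row, last column). */
    double getEditDistance() {
        double[] lastRow = CostMatrix[CostMatrix.length - 1];
        return lastRow[lastRow.length - 1];
    }
}

// A subclass that computes its distance elsewhere and only publishes the value.
class FixedDistanceSketch extends CostMatrixHolderSketch {
    FixedDistanceSketch(double myDistanceValue) {
        this.CostMatrix = new double[1][1];
        this.CostMatrix[0][0] = myDistanceValue;
    }
}
```

A subclass written this way does not need to override getEditDistance, because the inherited method still finds the value in the only cell of the matrix.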

min

protected static double min(double x,
                            double y,
                            double z)
This just computes the minimum of three double values.

Parameters:
x - the first value
y - the second value
z - the third value
Returns:
Gives the minimum of three doubles
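
A three-way minimum of this kind is what a dynamic programming recurrence needs to pick the cheapest of the diagonal, vertical and horizontal moves. A minimal sketch, using a hypothetical class name and not taken from the BioJava source, could look like this:

```java
// Hypothetical sketch, not part of BioJava: minimum of three double values.
final class MinSketch {

    static double min(double x, double y, double z) {
        // Math.min handles two values, so nest one call inside the other.
        return Math.min(x, Math.min(y, z));
    }
}
```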

getAlignmentString

public String getAlignmentString()
                          throws BioException
Specified by:
getAlignmentString in class SequenceAlignment
Returns:
a string representation of the alignment
Throws:
BioException

alignAll

public List alignAll(SequenceIterator source,
                     SequenceDB subjectDB)
              throws NoSuchElementException,
                     BioException
Specified by:
alignAll in class SequenceAlignment
Parameters:
source - a SequenceIterator containing a set of sequences to be aligned with the sequences in subjectDB.
subjectDB - the SequenceDB containing another set of sequences.
Returns:
a list containing the results of all single alignments performed by this method.
Throws:
NoSuchElementException
BioException

pairwiseAlignment

public double pairwiseAlignment(Sequence query,
                                Sequence subject)
                         throws BioRuntimeException
Global pairwise sequence alignment of two BioJava Sequence objects according to the Needleman-Wunsch algorithm.

Specified by:
pairwiseAlignment in class SequenceAlignment
Returns:
score of the alignment or the distance.
Throws:
BioRuntimeException
See Also:
SequenceAlignment.pairwiseAlignment(org.biojava.bio.seq.Sequence, org.biojava.bio.seq.Sequence)