
CHISPAs Guide

Quick tour for CHISPAs

 

 

CHISPAs is intended to help researchers filter out sequences that have been incorrectly assigned to a phenotype. In the most common scenario, researchers work with two phenotypes (mutant and wild-type) and use DNA amplification and sequencing to generate the DNA sequences assigned to each phenotype. Errors in these assignments can be expected to arise from:

 

  • DNA amplification and sequencing
  • Phenotype assignments

 

Therefore, many DNA sequences presumed to be wild-type (or mutant) in phenotype or DNA sequence actually are not. We refer to these as Incorrect Sequence-Phenotype Assignments, or simply ISPAs.

 

To identify ISPAs, this software needs the following inputs:

 

  1. File with DNA sequences assigned to a wild-type phenotype.
  2. File with DNA sequences assigned to a mutant phenotype.
  3. DNA regions useful to identify the true sequences from each file.
  4. Observed rate of experimental errors (sequencing and/or phenotype assignments) and expected variation.
  5. SQL data needed to work with your data locally in MySQL.
  6. Name of the file in which to save the results.

 

Download CHISPAs GUI to your machine (you need a MySQL engine installed and a copy of a MySQL driver for Java) and run it by typing the following in your console/terminal:

java -jar CHISPAsGUI.jar

NOTE: You need to specify the path to your copy of the MySQL connector, either by setting the CLASSPATH variable in your local environment or by using the "-classpath" flag.

Once you start CHISPAs, a window like this will appear on your computer:

 

You need to provide 13 inputs for the program to run.

First, you need to specify the date in the text field labelled "Date":


Second, specify the name of your local file with DNA sequences assigned to the wild-type phenotype:

 

Third, specify the name of your local file with DNA sequences assigned to the mutant phenotype:

 


Fourth, type in the letters (IUPAC format; see a summary of this format here) that all sequences with the wild-type phenotype should have at the 5' end, in the text field labeled "5' DNA sequence in wild-type sequences":

 

Fifth, type in the letters (IUPAC format; see a summary of this format here) that all sequences with the wild-type phenotype should have at the 3' end, in the text field labeled "3' DNA sequence in wild-type sequences":

 

Sixth, type in the letters (IUPAC format; see a summary of this format here) covering the mutated region that every sequence with the wild-type phenotype should have, in the text field labeled "Mutated region in wild-type sequences":

 

Seventh, type in the letters (IUPAC format; see a summary of this format here) that all sequences with the mutant phenotype should have at the 5' end, in the text field labeled "5' DNA sequence in mutant sequences":

 

Eighth, type in the letters (IUPAC format; see a summary of this format here) that all sequences with the mutant phenotype should have at the 3' end, in the text field labeled "3' DNA sequence in mutant sequences":

 

Ninth, type in the letters (IUPAC format; see a summary of this format here) covering the mutated region that every sequence with the mutant phenotype should have, in the text field labeled "Mutated region in mutant sequences":

 

Tenth, type in a value for the observed experimental and sequencing error (E). This number should be smaller than 1.0:


Eleventh, type in a value for the expected variation in the error estimate (G). This number should be smaller than 1.0:


Twelfth, specify the name of your local file containing data to operate MySQL with your data (click on the icon with the "Open SQL File..." legend):


Thirteenth, type in the name of the file in which to save the results, in the text field labeled "Output filename":


Finally, click on the large button labeled "CHISPAs!" and follow the progress in the large text area:


The output file should list the sequences found to be ISPAs, either true or false:

A true ISPA is a sequence found in both the wild-type and mutant phenotypes with a frequency greater than that expected from experimental errors (E+G values). A false ISPA is a sequence with a frequency smaller than or equal to that expected from experimental errors.
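
The true/false distinction can be illustrated with a short Python sketch. This is not the CHISPAs implementation: the function name classify_ispas is hypothetical, and the way the frequency is computed (the share of all reads, pooled from both files, that a shared sequence accounts for) is an assumption, since this guide does not define it.

```python
from collections import Counter


def classify_ispas(wild_type, mutant, e, g):
    """Label sequences present in both phenotype sets as true or false ISPAs.

    Assumption: 'frequency' is the fraction of all reads (both files pooled)
    accounted for by a shared sequence. CHISPAs may compute it differently.
    """
    if not (0.0 <= e < 1.0 and 0.0 <= g < 1.0):
        raise ValueError("E and G must each be smaller than 1.0")

    threshold = e + g  # frequency expected from experimental errors
    wt_counts = Counter(wild_type)
    mut_counts = Counter(mutant)
    total = len(wild_type) + len(mutant)

    result = {}
    # Only sequences seen under both phenotypes are candidate ISPAs.
    for seq in wt_counts.keys() & mut_counts.keys():
        frequency = (wt_counts[seq] + mut_counts[seq]) / total
        result[seq] = "true ISPA" if frequency > threshold else "false ISPA"
    return result


wild_type_reads = ["ACGT"] * 8 + ["TTTT"] + ["GGGG"]
mutant_reads = ["ACGT"] * 2 + ["CCCC"] * 7 + ["TTTT"]
labels = classify_ispas(wild_type_reads, mutant_reads, e=0.1, g=0.05)
```

In this toy data set only ACGT and TTTT appear under both phenotypes, so only those two are labeled; sequences seen under a single phenotype are left out.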

During execution, and for as long as CHISPAsGUI keeps running, two temporary files are created (with the same file names specified in the second and third steps, but with the extension "_matched"); they are erased when the application is closed. These files contain the sequences that matched the criteria specified to identify the wild-type (see the fourth, fifth and sixth steps) and mutant (see the seventh, eighth and ninth steps) sequences. If you want to keep them, just rename them before closing the application.
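
To get a feel for what "matching the criteria" could mean for the "_matched" files, here is a minimal Python sketch that translates IUPAC letters into a regular expression and tests a sequence against a 5' pattern, a mutated-region pattern and a 3' pattern. The function names are hypothetical, and the assumption that the mutated region lies somewhere between the two ends is ours; the actual matching rules used by CHISPAs are not described in this guide.

```python
import re

# Standard IUPAC nucleotide codes mapped to the DNA bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def iupac_to_regex(pattern):
    """Translate an IUPAC string into a regular-expression fragment."""
    return "".join(
        IUPAC[c] if len(IUPAC[c]) == 1 else "[" + IUPAC[c] + "]"
        for c in pattern.upper()
    )


def matches_criteria(sequence, five_prime, mutated_region, three_prime):
    """Return True if the sequence starts with the 5' pattern, ends with the
    3' pattern, and contains the mutated-region pattern in between.

    Assumption: the mutated region may sit anywhere between the two ends.
    """
    regex = re.compile(
        "^" + iupac_to_regex(five_prime)
        + ".*" + iupac_to_regex(mutated_region) + ".*"
        + iupac_to_regex(three_prime) + "$"
    )
    return regex.match(sequence.upper()) is not None


# R stands for A or G, so "ACR" accepts a sequence that begins with ACG.
kept = matches_criteria("ACGTTTGACCA", "ACR", "TTG", "CCA")
dropped = matches_criteria("ACTTTTGACCA", "ACR", "TTG", "CCA")
```

Here the first sequence satisfies all three patterns, while the second fails at the 5' end because T is not covered by R.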

You may want to download the example files to try CHISPAs, including:

 

For any further questions, please contact gdelrio@ifc.unam.mx. 
