# Nov 17, 2011
# Authors: Vlad Makarov, Chris Yoon
# Language: Python
# OS: UNIX/Linux, MAC OSX
# Copyright (c) 2011, The Mount Sinai School of Medicine
# Available under BSD licence
#
# Redistribution and use in source and binary forms, with or without modification,
# are permitted provided that the following conditions are met:
#
# Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
#
# Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation and/or
# other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
# INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
# BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
# NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
# EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

AnnTools, Version 1.1

PROJECT NAME: ANNTOOLS
PROJECT HOME PAGE: http://anntools.sourceforge.net
OPERATING SYSTEM: Linux, Unix, Mac OS X
PROGRAMMING LANGUAGE: Python, SQL, shell script
OTHER REQUIREMENTS: Python 2.6 or higher, MySQL 5.1 or higher, MySQLdb (Python-MySQL driver)
LICENSE: BSD
ANY RESTRICTIONS TO USE BY NON-ACADEMICS: None

Source code, supporting files and user manual are freely available for download and use under the BSD License.
The system is under active and continuous development. We provide monthly updates for the aggregate database used for annotation.

RELEASE

Beta Release, Version 1.0

PREREQUISITES

AnnTools is written in Python and uses MySQL for data storage.

  Python 2.6 or later (2.7 recommended)
  MySQL 5.0 or later (5.5 recommended)
  MySQLdb (Python-MySQL driver)

INSTALLATION

The easiest way to install the software is to run the setup script.

Download the "setup.sh.gz" file from the left navigation tab to any directory on your computer where you have full privileges and sufficient space (at least 100GB) available. The shell file is compressed because some systems may prevent downloading executable files.

Unzip the shell file

    $ gunzip setup.sh.gz

and give it execute permission

    $ chmod 755 setup.sh

Open the shell file with any text editor and modify the MySQL user name and password as appropriate. The account must have DROP, CREATE, INSERT, SELECT privileges on the MySQL server in order to install the data tables.

Run the installation

    $ ./setup.sh

As some tables are very large, please allow several hours for this operation to complete.

Perform the one-time configuration as described below.

MANUAL INSTALLATION IS ALSO POSSIBLE

Download the AnnTools source code "anntools.tar.gz" from the left navigation bar to any directory on your computer.

Unpack the application code

    $ tar xzvf anntools.tar.gz

Download the MySQL dump file "annotator.sql.gz" from the left navigation bar to any directory on your computer.

Unzip the MySQL dump

    $ gunzip annotator.sql.gz

Create a new MySQL database

    $ mysql --user=USER --password=PASSWORD < annotator.sql

Please note that you need an account with DROP, CREATE, INSERT, SELECT privileges to perform this operation.

Perform the one-time configuration as described below.

ONE TIME CONFIGURATION BEFORE YOU START

cd to the "anntools" directory and change permissions to 755 for all *.py and *.sh files.

Edit the "config.txt" file and specify host, user and password.
Database name is already specified; please do not change it. Please note that while a MySQL account with SELECT privileges is adequate for most of the functionality, you need to provide an account with DROP, CREATE, INSERT, SELECT privileges if you wish to use custom annotation, as it involves the creation of new tables.

TEST RUN

You will find three shell scripts called run_snp.sh, run_indel.sh and run_cnv.sh. We recommend running them to annotate the SNP, indel and CNV calls located in the "example" subdirectory to make sure that the installation went correctly. Each example file has about 1000 variants and completes in about two minutes (on our machine). To annotate your own variants, simply substitute your VCF file names for ours.

ENVIRONMENT

Standalone machine or HPCC. Possible run modes:

  As a standalone program, by running the shell scripts provided (run_snp.sh, run_indel.sh, run_cnv.sh). In this mode one VCF file at a time is annotated.

  Submitted to a High Performance Computer Cluster. A submission shell script will be required, based on your HPCC specifications. Due to the multiplicity of possibilities (SGE, PBS, etc.), writing the submission script is left to the local developer.

  As an API, integrated into your own Python code to analyse multiple BAM files. Use the "annotate.py" Python script as a starting point to include AnnTools in your Python application.

INPUT

VCF (Variant Call Format), SamTools pileup, tabular files (plain text). The latter should be manually converted to VCF using the tab2vcf.pl tool.

OUTPUT

Output is generated in VCF (Variant Call Format). Annotated call sets for single nucleotide substitutions (SNP/SNV), short insertions/deletions (Indel) and structural variants (SV/CNV) are available for download from the left navigation bar. Examples for each type of variant are available in tabular format below. Information was parsed for better viewing.
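
Because the output is plain VCF, the annotations can be read back with a few lines of Python. The following is a minimal sketch, not part of AnnTools: it assumes only the standard VCF layout (eight tab-separated fixed columns, with INFO as semicolon-separated key=value pairs or bare flags). The function names parse_info and read_vcf_records are hypothetical, and the keys GENE and FUNC in the sample record are made up for illustration; the keys that AnnTools actually writes may differ.

```python
def parse_info(info_field):
    """Split a VCF INFO column into a dict; bare flags (no '=') map to True."""
    annotations = {}
    if info_field in ("", "."):
        return annotations  # "." means the INFO column is empty
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        annotations[key] = value if sep else True
    return annotations


def read_vcf_records(lines):
    """Yield (chrom, pos, ref, alt, info_dict) for each VCF data line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip meta-information and column header lines
        fields = line.split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        chrom, pos, _id, ref, alt, _qual, _filter, info = fields[:8]
        yield chrom, int(pos), ref, alt, parse_info(info)


# Hypothetical sample input; GENE and FUNC are illustrative keys only.
example = [
    "##fileformat=VCFv4.0\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t12345\t.\tA\tG\t50\tPASS\tDP=20;GENE=ABC1;FUNC=missense;DB\n",
]
records = list(read_vcf_records(example))
```

To read a real file, pass an open file handle instead of the list, for example read_vcf_records(open("annotated.vcf")), where the file name is a placeholder.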