Bioinformatics Software

From CsWiki
Revision as of 14:57, 18 June 2020 by Yaronw (Talk | contribs)

Name Description
Bamtools BamTools provides both a programmer's API and an end-user's toolkit for handling BAM files
Bedops BEDOPS is an open-source command-line toolkit that performs highly efficient and scalable Boolean and other set operations, statistical calculations, archiving, conversion and other management of genomic data of arbitrary scale. Tasks can be easily split by chromosome for distributing whole-genome analyses across a computational cluster.
Bedtools A Swiss-army knife of tools for a wide range of genomics analysis tasks. The most widely used tools enable genome arithmetic: that is, set theory on the genome. For example, bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely used genomic file formats such as BAM, BED, GFF/GTF, and VCF.
Biscuit A utility for analyzing sodium bisulfite conversion-based DNA methylation/modification data
Bismark A program to map bisulfite treated sequencing reads to a genome of interest and perform methylation calls in a single step. The output can be easily imported into a genome viewer, such as SeqMonk, and enables a researcher to analyse the methylation levels of their samples straight away
Boost & Boost-cpp Boost provides free peer-reviewed portable C++ source libraries
Bsmap BSMAP is short-read mapping software for bisulfite sequencing reads. Bisulfite treatment converts unmethylated cytosines into uracils (sequenced as thymine) and leaves methylated cytosines unchanged, and hence provides a way to study DNA cytosine methylation at single-nucleotide resolution. BSMAP aligns the Ts in the reads to both Cs and Ts in the reference
Cellranger Cell Ranger is a set of analysis pipelines that process Chromium single-cell RNA-seq output to align reads, generate feature-barcode matrices and perform clustering and gene expression analysis. Cell Ranger includes four pipelines relevant to single-cell gene expression experiments
Circos Circos is a software package for visualizing data and information. It visualizes data in a circular layout — this makes Circos ideal for exploring relationships between objects or positions
Darts Deep-learning Augmented RNA-seq analysis of Transcript Splicing
Eigensoft The EIGENSTRAT method uses principal components analysis to explicitly model ancestry differences between cases and controls along continuous axes of variation; the resulting correction is specific to a candidate marker’s variation in frequency across ancestral populations, minimizing spurious associations while maximizing power to detect true associations. The EIGENSOFT package has a built-in plotting script and supports multiple file formats and quantitative phenotypes.
Emase Expectation-Maximization algorithm for Allele Specific Expression
Finestruc fineSTRUCTURE is a fast and powerful algorithm for identifying population structure using dense sequencing data
Gatk The GATK is the industry standard for identifying SNPs and indels in germline DNA and RNAseq data
Gatk4 The GATK is the industry standard for identifying SNPs and indels in germline DNA and RNAseq data
Gsl The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers
Homer Software for motif discovery and next generation sequencing analysis
Igblast A tool for analyzing immunoglobulin (IG) and T cell receptor (TR) sequences
Igv Integrative Genomics Viewer. Fast, efficient, scalable visualization tool for genomics data and annotations
Igvtools Command-line tools for IGV
Macs2 Model Based Analysis for ChIP-Seq data
Mafft Multiple alignment program for amino acid or nucleotide sequences
Meme Motif based sequence Analysis tools
Mirdeep2 A completely overhauled tool which discovers microRNA genes by analyzing sequenced RNAs
Peakachu Peak calling tool for CLIP-seq data
Plink Whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner
r-ichorcna Estimating tumor fraction in cell-free DNA from ultra-low-pass whole genome sequencing
Randfold
Rmats MATS is a computational tool to detect differential alternative splicing events from RNA-Seq data
Salmon Highly-accurate & wicked fast transcript-level quantification from RNA-seq reads using selective alignment
Tetoolkit Tools for estimating differential enrichment of Transposable Elements and other highly repetitive regions
Tobias TOBIAS - Transcription factor Occupancy prediction By Investigation of ATAC-seq Signal
Viennarna Vienna RNA package -- RNA secondary structure prediction and comparison
Weblogo Web based application designed to make the generation of sequence logos as easy and painless as possible

Software installed on the Bioinformatics cluster

PEAKachu

Peak calling tool for CLIP-seq data

Github

To use, run this command:

module load peakachu
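
The same pattern applies to every module on this page. The following is a minimal sketch, assuming the cluster provides the standard `module` command (Environment Modules or Lmod) with its usual `avail`, `load`, and `list` subcommands; the `tool` and `status` variables are only illustrative.

```shell
# Name of the module as listed on this page.
tool="peakachu"

if command -v module >/dev/null 2>&1; then
    # `module avail` may print to stderr, so merge the streams before filtering.
    module avail 2>&1 | grep -i "$tool" || true

    # Load the module and record whether that worked.
    if module load "$tool"; then
        status="loaded"
    else
        status="failed"
    fi

    # Show the modules currently loaded in this shell.
    module list
else
    echo "The 'module' command is not available in this shell." >&2
    status="unavailable"
fi
```

Checking for the `module` command first keeps the snippet harmless on machines outside the cluster.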

FastUniq

An ultrafast de novo tool for removal of duplicates in paired short DNA sequence reads in FASTQ format

SourceForge

cmpfastq

A simple Perl program that allows the user to compare QC filtered fastq files

Home page


STAR

Ultrafast universal RNA-seq aligner

Github

Installed version: 2.7.3a


Bedtools

Bedtools is a fast, flexible toolset for genome arithmetic.

Webpage

Installed version: 2.29.0

Usage:

module load bedtools
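
As an illustration of genome arithmetic, the sketch below intersects two toy BED files after loading the module. The file names and intervals are hypothetical and are created on the fly. The snippet assumes the `module` command is available on the cluster, and it skips the bedtools call if the tool is not on the PATH.

```shell
# Create two toy interval files (tab-separated: chrom, start, end).
workdir=$(mktemp -d)
printf 'chr1\t100\t200\nchr1\t300\t400\n' > "$workdir/a.bed"
printf 'chr1\t150\t250\n' > "$workdir/b.bed"

# Load the module only if the `module` command exists in this shell.
if command -v module >/dev/null 2>&1; then
    module load bedtools
fi

# Report the portions of intervals in a.bed that overlap intervals in b.bed.
if command -v bedtools >/dev/null 2>&1; then
    bedtools intersect -a "$workdir/a.bed" -b "$workdir/b.bed" > "$workdir/overlap.bed"
    cat "$workdir/overlap.bed"
else
    echo "bedtools not found; run 'module load bedtools' first" >&2
fi
```

With these inputs, the only overlap is between chr1:100-200 and chr1:150-250, so the result should be the single interval chr1 150 200.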

GATK

Genome Analysis Toolkit. It is a collection of command-line tools for analyzing high-throughput sequencing data with a primary focus on variant discovery.

The tools can be used individually or chained together into complete workflows.

Webpage

Installed version: 4.1.3.0

Usage:

module load gatk
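
A minimal sketch of running a single GATK4 tool after loading the module. The file names (`reference.fasta`, `sample.bam`, `sample.vcf.gz`) are hypothetical placeholders, and the snippet does nothing unless both `gatk` and the input files are present.

```shell
# Hypothetical input and output names; replace with real paths.
ref="reference.fasta"   # reference genome (GATK also expects its .fai and .dict files)
bam="sample.bam"        # aligned, sorted, indexed reads
out="sample.vcf.gz"     # variant calls to be written

# Load the module only if the `module` command exists in this shell.
if command -v module >/dev/null 2>&1; then
    module load gatk
fi

# Call germline SNPs and indels with HaplotypeCaller when everything is in place.
if command -v gatk >/dev/null 2>&1 && [ -f "$ref" ] && [ -f "$bam" ]; then
    gatk HaplotypeCaller -R "$ref" -I "$bam" -O "$out"
else
    echo "Skipping: gatk or the input files are not available." >&2
fi
```

Running `gatk --list` after loading the module prints the tools bundled with the installed version.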

SourceTracker2

SourceTracker is a Bayesian approach to estimate the proportion of contaminants in a given community that come from possible source environments.

Webpage

Usage:

module load conda-st2