Tools for next-generation sequencing analysis
A more recent version of these tools is available as a Java port: ngsutilsj. This newer Java version is more up-to-date, significantly faster, and easier to install.
For more information, or to download a copy, see: http://compgen.io/ngsutilsj.
NGSUtils is a suite of software tools for working with next-generation sequencing datasets. In 2009, we (Liu Lab @ Indiana University School of Medicine) started working with next-generation sequencing data. We initially wrote custom code for each project in a one-off manner. It quickly became apparent that this was an inefficient way to work, so we started assembling smaller utilities that could be adapted into larger, more complicated workflows. We have used them for Illumina, SOLiD, 454, Ion Torrent, and PacBio sequencing data. We have used them for DNA and RNA resequencing, ChIP-Seq, CLIP-Seq, and targeted resequencing (Agilent exome capture and PCR targeting). These tools are also used heavily in our in-house DNA and RNA mapping pipelines.
These tools have been of great use within our lab group, and so we are happy to make them available to the greater community.
NGSUtils is made up of 50+ programs, mainly written in Python. These are separated into modules based on the type of file that is to be analyzed. There are four modules:
Each of these modules contains many commands for manipulating, filtering, converting, or analyzing these types of files. Check out the documentation for each module for more information about some of the commands available.