iNquiry bioinformatics applications on the Princeton Genomics Grid

This page describes all the applications provided via the iNquiry software; general help for iNquiry is available here. The majority of these applications are part of the EMBOSS software suite. A mapping between the EMBOSS tools and their GCG equivalents is available from the EMBnet Norway EMBOSS-GCG comparison page. The descriptions of many of the programs below were taken from two excellent sources of documentation: the EMBOSS page and the Pasteur Institute.

Applications here are listed in alphabetical order.

Tool Description Source
abiview Reads an ABI file and displays the trace. EMBOSS
antigenic Finds antigenic regions of a protein sequence. EMBOSS
backtranseq Takes a protein sequence and estimates the likely nucleic acid sequence using a codon frequency table. EMBOSS
banana Predicts bending of a DNA sequence. EMBOSS
biosed Simple sequence editing tool that allows search and replace of a subsequence. EMBOSS
bl2seq BLASTs one sequence against another to generate a BLAST alignment. Altschul et al.
blast2cours blast2cours from the NCBI ToolKit iNquiry
blastall Optimized version of the blastall binary from the NCBI ToolKit. Only the blastn algorithm has been optimized for the PowerPC; otherwise it is the same as the standard blastall binary from the NCBI ToolKit. iNquiry
btblastall Wrapper that parallelizes blastall on the cluster. iNquiry
btwisted Predicts twisting and stacking energy of a DNA sequence. EMBOSS
build_icm Creates and outputs an interpolated Markov Model S. Salzberg
cai Calculates the Codon Adaptation Index. EMBOSS
chaos Creates a chaos games plot for a sequence. EMBOSS
charge Reads a protein sequence and outputs the charges of the amino acids within a window of specified length along the sequence. EMBOSS
checktrans Reads a protein sequence with stops and outputs ORFs without stops that meet a minimum size. EMBOSS
chips Reads a sequence and calculates the Nc statistic for the effective number of codons used. EMBOSS
cirdna Draws a circular DNA map given names and positions of markers. EMBOSS
clique Compatibility program for unrooted two-state characters, obtains the largest cliques of characters and the trees that they suggest J. Felsenstein
clustalw Generates multi-sequence alignments for DNA or protein sequences. Des Higgins
codcmp Reads in two codon usage tables and outputs the differences and usage. EMBOSS
coderet Extracts coding regions, mRNA, and protein from feature tables. EMBOSS
compseq Counts the composition of subsequences (e.g., CG, TTC) in a sequence. EMBOSS
cons Generates a consensus sequence from a multiple alignment. EMBOSS
consense Reads in a file of trees and generates a consensus tree. J. Felsenstein
cpgplot Plots CpG rich regions in a sequence. EMBOSS
cpgreport Scans a sequence for CpG rich regions; less specific but more sensitive (will find smaller CpG islands). EMBOSS
cusp Reads one or more CDSs and generates a codon usage table. EMBOSS
cutseq Sequence editor that allows you to cut a region from the input sequence. EMBOSS
dan Calculates DNA/DNA and RNA/DNA melting temperature. EMBOSS
degapseq Removes gap/non-alphabetic characters from sequence. EMBOSS
descseq Replaces the name/description line of a sequence. EMBOSS
diffseq Reports differences between mostly identical sequences (useful when looking for SNPs, etc) EMBOSS
digest Finds cleavage positions in a protein. EMBOSS
distmat Calculates distance matrix from multiple alignments. EMBOSS
dnadist Calculates distance matrix from nucleotide sequences. J. Felsenstein
dnaml Implements the maximum likelihood method for DNA sequences. J. Felsenstein
dnapars Calculates unrooted parsimony for DNA sequences. J. Felsenstein
dollop Performs Dollo and polymorphism parsimony methods. J. Felsenstein
dotmatcher Graphs the regions of similarity between two sequences. EMBOSS
dotpath Graphs a non-overlapping wordmatch dotplot. EMBOSS
dottup Graphs a wordmatch dotplot. EMBOSS
drawgram Plots a rooted tree diagram. J. Felsenstein
drawtree Plots an unrooted tree diagram. J. Felsenstein
dreg Regular expression/pattern match search of DNA sequence. EMBOSS
einverted Finds inverted repeats in DNA sequence. EMBOSS
emma Interface to ClustalW. EMBOSS
emowse Searches protein database for matches with mass spec data. EMBOSS
entret Reads a sequence from a database or a file and writes the complete sequence entry to a text file. Alan Bleasby
eprimer3 Interface to primer3 program; picks primers for PCR and other hybridization oligos. EMBOSS
equicktandem Finds tandem repeats in a sequence. EMBOSS
est2genome Aligns ESTs and genomic DNA to predict gene calls. EMBOSS
etandem Finds tandem repeats in a sequence, identifies repeat sizes and calculates consensus. EMBOSS
extract Takes a FASTA format sequence file and a file with a list of start/stop positions in that file and extracts and outputs the specified sequences. S. Salzberg
extractfeat Extracts features from a sequence file. EMBOSS
extractseq Extracts regions from a sequence (e.g., can extract exons to generate a CDS). EMBOSS
findkm Determines Km and Vmax based on input of substrate vs. reaction velocity. EMBOSS
fitch Performs Fitch-Margoliash, Least Squares, and other methods. J. Felsenstein
fmtseq Converts sequence files from one format to another. Knight
freak Calculates frequency of bases or residues in a window as it moves along a sequence. EMBOSS
fuzznuc Pattern matching for short patterns in nucleic acid sequences. EMBOSS
fuzzpro Pattern matching for short patterns in protein sequences. EMBOSS
fuzztran Pattern matching for short patterns in translated sequences. EMBOSS
garnier Predicts secondary structure of a protein. Rodrigo Lopez
geecee Calculates the fraction of GC bases of a sequence. EMBOSS
getorf Finds ORFs in a sequence, can define minimum length. EMBOSS
glimmer Finds genes in microbial DNA. S. Salzberg
graphics R Graphics demo. R Development Core Team
helixturnhelix Finds helix-turn-helix DNA binding motifs in a protein sequence. EMBOSS
hmmalign Aligns sequences to an HMM profile. Eddy
hmmbuild Builds an HMM profile from a sequence alignment. Eddy
hmmcalibrate Reads an HMM file and generates statistics. Eddy
hmmconvert Converts files from one HMM file format to another. Eddy
hmmemit Reads an HMM file and generates sequences from it. Eddy
hmmfetch Retrieves an HMM file from an HMM database. Eddy
hmmpfam Compares sequences to the HMM profiles in an HMM database. Eddy
hmmsearch Reads an HMM file and searches a sequence database for similar sequences. Eddy
hmoment Calculates the hydrophobic moment of a peptide. EMBOSS
html4blast Formats text BLAST results in HTML. Joly
iep Calculates the isoelectric point of a protein sequence. EMBOSS
infoalign Utility to list properties of sequences in an alignment. EMBOSS
infoseq Utility that lists basic information about a sequence (accession number, length, etc) EMBOSS
isochore Plots isochores in DNA sequences. EMBOSS
kitsch Performs the Fitch-Margoliash and Least Squares methods, assuming that all tip species are contemporaneous and that there is an evolutionary clock. J. Felsenstein
lindna Draws a linear DNA map given names and positions of markers. EMBOSS
listor Reads two sets of sequences and outputs the union of them. EMBOSS
lmgene Data transformation and identification of differentially expressed genes in expression arrays. David M. Rocke
loadseq Concatenates multiple sequences. Pasteur
long_orfs Takes a sequence file (in FASTA format) and outputs a list of all long "potential genes" in it that do not overlap by too much. S. Salzberg
marscan Finds matrix/scaffold attachment regions (MAR/SAR sites) in nucleic acid sequences. EMBOSS
maskfeat Use to mask off features in a sequence. EMBOSS
maskseq Masks off regions of a sequence (e.g., low-complexity regions). EMBOSS
matcher Compares sequences to find best local alignments between them. EMBOSS
megamerger Merges two overlapping DNA sequences; uses less memory than merger, but is less accurate for divergent sequences. EMBOSS
merger Merges two overlapping DNA sequences; uses more memory than megamerger, but is more accurate for divergent sequences. EMBOSS
mix Applies parsimony with mixed methods. J. Felsenstein
msbar Mutates sequences, emulating different forms of mutation. EMBOSS
mview_blast Converts results of a BLAST, FASTA, etc. into an alignment of multiple hits against a query; note that it is not in itself an alignment program. Brown
mwfilter Filters noise based on molecular weight from mass spec data. EMBOSS
needle Global alignment of two sequences using Needleman-Wunsch algorithm. EMBOSS
neighbor Neighbor joining and UPGMA cluster methods. J. Felsenstein
newcpgreport Reports CpG rich areas in a sequence. EMBOSS
newcpgseek Reports CpG rich areas in a sequence. EMBOSS
newseq Creates a new sequence file for short sequences. EMBOSS
notseq Excludes a subset of sequences from a file of multiple sequences. EMBOSS
nthseq Extracts one sequence from a set of them. EMBOSS
octanol Calculates protein hydropathy. EMBOSS
oddcomp Finds regions of protein sequences with a biased composition. EMBOSS
palindrome Looks for palindromes in nucleic acid sequences. EMBOSS
pasteseq Editing tool that allows you to insert a sequence into another at a specified position. EMBOSS
patmatdb Takes a protein motif as input and compares it to protein sequence. EMBOSS
patmatmotifs Takes a protein sequence as input and searches PROSITE motif database. EMBOSS
pepcoil Calculates the probability of a coiled-coil structure in a protein sequence. EMBOSS
pepinfo Displays amino acid properties of a protein sequence, can plot hydrophobicity, etc. EMBOSS
pepnet Generates helical net for a protein. EMBOSS
pepstats Calculates protein statistics, e.g., molecular weight, charge, etc. EMBOSS
pepwheel Generates helical wheel of protein sequences. EMBOSS
pepwindow Generates Kyte and Doolittle hydropathy plot of protein. EMBOSS
pepwindowall Generates superimposed Kyte and Doolittle hydropathy plots of a set of aligned proteins. EMBOSS
phiblast Searches proteins combining pattern matching with local alignment around the pattern match. EMBOSS
plotcon Generates representation of the quality conservation along a set of aligned sequences. EMBOSS
plotorf Plots predicted ORFs in a sequence. EMBOSS
polydot Generates all-against-all dotplots of a sequence set. EMBOSS
preg Regular expression search of a protein sequence. EMBOSS
prettyplot Displays aligned sequences, with coloring and boxing. EMBOSS
prettyseq Output sequence with translated ranges. EMBOSS
primersearch Searches DNA sequences for matches with primer pairs. EMBOSS
profit Scan a sequence or database with a matrix or profile. EMBOSS
prophecy Creates matrices/profiles from multiple alignments. EMBOSS
prophet Gapped alignment for profiles. EMBOSS
protdist Computes distance matrix from protein sequences. J. Felsenstein
protpars Parsimony method for protein sequences. J. Felsenstein
pscan Locates fingerprints (multiple motif features) in a protein sequence. EMBOSS
psiblast Iterative protein similarity search; uses position-specific scoring matrices constructed during the search. Altschul et al.
readseq Converts DNA or protein sequence files to a specified format. Gilbert
recoder Find and remove restriction sites but maintain the same translation. EMBOSS
redata Isoschizomers, references and Suppliers for Restriction Enzymes EMBOSS
remap Display a sequence with restriction cut sites, translation etc. EMBOSS
restover Finds restriction enzymes that produce a specific overhang. EMBOSS
restrict Finds restriction enzyme cleavage sites. EMBOSS
revseq Reverse and complement a sequence. EMBOSS
seqmatchall Does an all-against-all comparison of a set of sequences. EMBOSS
showdb Displays information about the currently available local sequence databases. EMBOSS
showfeat Show features of a sequence. EMBOSS
showorf Pretty output of DNA translations. EMBOSS
showseq Display a sequence with features, translation etc. EMBOSS
shuffleseq Shuffles a set of sequences maintaining composition EMBOSS
sigcleave Predicts signal peptide cleavage sites EMBOSS
sigscan Uses protein signature file generated by siggen to find other proteins with that signature EMBOSS
silent Silent mutation restriction enzyme scan EMBOSS
splitter Split a sequence into (overlapping) smaller sequences. EMBOSS
stretcher Global alignment of two sequences. EMBOSS
stssearch Searches a DNA database for matches with a set of STS primers EMBOSS
supermatcher Finds a match of a large sequence against one or more sequences EMBOSS
syco Synonymous codon usage Gribskov statistic plot EMBOSS
tfscan Scans DNA sequences for transcription factors. EMBOSS
tmap Predict transmembrane proteins EMBOSS
transeq Translates nucleic acid sequences EMBOSS
trimest Trim poly-A tails off EST sequences EMBOSS
trimseq Trim ambiguous bits off the ends of sequences EMBOSS
vectorstrip Strips out DNA between a pair of vector sequences EMBOSS
water Smith-Waterman local alignment. EMBOSS
whichdb Search all databases for an entry EMBOSS
wise2 Compares a protein sequence to genomic DNA, allowing for introns and frameshifts. Birney
wobble Wobble base plot EMBOSS
wordcount Counts words of a specified size in a DNA sequence. EMBOSS
wordmatch Finds all exact matches of a given size between 2 sequences EMBOSS
xblast Reads blast results and generates a query with the hits masked. Claverie
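
To give a feel for what two of the simpler tools in the table compute, here is a minimal Python sketch of the ideas behind geecee (fraction of GC bases) and revseq (reverse and complement). It is an illustration only, not the EMBOSS implementation: the function names gc_fraction and reverse_complement are hypothetical, only unambiguous A/C/G/T bases are complemented, and the GC fraction is taken over all characters in the input string.

```python
# Illustrative sketch only; not the EMBOSS code for geecee or revseq.
# Function names are hypothetical and chosen for this example.

# Translation table for complementing unambiguous DNA bases (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def gc_fraction(seq: str) -> float:
    """Return the fraction of characters in seq that are G or C.

    An empty sequence yields 0.0 rather than raising an error.
    """
    if not seq:
        return 0.0
    upper = seq.upper()
    return sum(base in "GC" for base in upper) / len(upper)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence.

    Characters other than A/C/G/T (in either case) are left unchanged.
    """
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    example = "ATGCGC"
    print(gc_fraction(example))
    print(reverse_complement(example))
```

For real analyses, use the tools themselves through iNquiry; they handle sequence formats, ambiguity codes, and feature annotations that this sketch ignores.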

