iNquiry bioinformatics applications on the Princeton Genomics Grid
This page describes all the applications provided via the iNquiry software; general help for iNquiry is available here. The majority of these applications are part of the EMBOSS software suite. A mapping between the EMBOSS tools and their GCG equivalents is available from the EMBnet Norway EMBOSS-GCG comparison page. The descriptions of many of the programs below were taken from two excellent sources of documentation: the EMBOSS page and the Pasteur Institute.
Applications here are listed in alphabetical order.
| Tool | Description | Source |
|---|---|---|
| abiview | Reads ABI file and displays the trace. | EMBOSS |
| antigenic | Finds antigenic regions of a protein sequence. | EMBOSS |
| backtranseq | Takes a protein sequence and estimates the most likely nucleic acid sequence using a codon frequency table. | EMBOSS |
| banana | Predicts bending of a DNA sequence. | EMBOSS |
| biosed | Simple sequence editing tool that allows search and replace of a subsequence. | EMBOSS |
| bl2seq | BLASTs one sequence against another to generate a BLAST alignment. | Altschul et al. |
| blast2cours | blast2cours from the NCBI Toolkit. | iNquiry |
| blastall | Optimized version of the blastall binary from the NCBI Toolkit. Only the blastn algorithm has been optimized for the PowerPC; otherwise it is the same as the standard blastall binary. | iNquiry |
| btblastall | Wrapper that parallelizes blastall on the cluster. | iNquiry |
| btwisted | Predicts twisting and stacking energy of a DNA sequence. | EMBOSS |
| build_icm | Creates and outputs an interpolated Markov model. | S. Salzberg |
| cai | Calculates the Codon Adaptation Index. | EMBOSS |
| chaos | Creates a chaos game representation plot for a sequence. | EMBOSS |
| charge | Reads a protein sequence and outputs the charges of the amino acids within a window of specified length along the sequence. | EMBOSS |
| checktrans | Reads a protein sequence with stops and outputs ORFs without stops that meet a minimum size. | EMBOSS |
| chips | Reads a sequence and calculates the Nc statistic for the effective number of codons used. | EMBOSS |
| cirdna | Draws a circular DNA map given names and positions of markers. | EMBOSS |
| clique | Compatibility program for unrooted two-state characters; finds the largest cliques of characters and the trees that they suggest. | J. Felsenstein |
| clustalw | Generates multi-sequence alignments for DNA or protein sequences. | Des Higgins |
| codcmp | Reads in two codon usage tables and outputs the differences and usage. | EMBOSS |
| coderet | Extracts coding regions, mRNA, and protein from feature tables. | EMBOSS |
| compseq | Counts the composition of subsequences (e.g. CG, TTC) in a sequence. | EMBOSS |
| cons | Generates a consensus sequence from a multiple alignment. | EMBOSS |
| consense | Reads in a file of trees and generates a consensus tree. | J. Felsenstein |
| cpgplot | Plots CpG rich regions in a sequence. | EMBOSS |
| cpgreport | Scans a sequence for CpG rich regions; less specific but more sensitive (will find smaller CpG islands). | EMBOSS |
| cusp | Reads one or more CDSs and generates a codon usage table. | EMBOSS |
| cutseq | Sequence editor that allows you to cut a region from the input sequence. | EMBOSS |
| dan | Calculates DNA/DNA and RNA/DNA melting temperature. | EMBOSS |
| degapseq | Removes gap/non-alphabetic characters from sequence. | EMBOSS |
| descseq | Replaces the name/description line of a sequence. | EMBOSS |
| diffseq | Reports differences between mostly identical sequences (useful when looking for SNPs, etc.). | EMBOSS |
| digest | Finds cleavage positions in a protein. | EMBOSS |
| distmat | Calculates distance matrix from multiple alignments. | EMBOSS |
| dnadist | Calculates distance matrix from nucleotide sequences. | J. Felsenstein |
| dnaml | Implements the maximum likelihood method for DNA sequences. | J. Felsenstein |
| dnapars | Calculates unrooted parsimony for DNA sequences. | J. Felsenstein |
| dollop | Performs Dollo and polymorphism parsimony methods. | J. Felsenstein |
| dotmatcher | Graphs the regions of similarity between two sequences. | EMBOSS |
| dotpath | Graphs a non-overlapping wordmatch dotplot. | EMBOSS |
| dottup | Graphs a wordmatch dotplot. | EMBOSS |
| drawgram | Plots a rooted tree diagram. | J. Felsenstein |
| drawtree | Plots an unrooted tree diagram. | J. Felsenstein |
| dreg | Regular expression/pattern match search of DNA sequence. | EMBOSS |
| einverted | Finds inverted repeats in DNA sequence. | EMBOSS |
| emma | Interface to ClustalW. | EMBOSS |
| emowse | Searches protein database for matches with mass spec data. | EMBOSS |
| entret | Reads a sequence from a database or a file and writes the complete sequence entry to a text file. | Alan Bleasby |
| eprimer3 | Interface to primer3 program; picks primers for PCR and other hybridization oligos. | EMBOSS |
| equicktandem | Finds tandem repeats in a sequence. | EMBOSS |
| est2genome | Aligns ESTs and genomic DNA to predict gene calls. | EMBOSS |
| etandem | Finds tandem repeats in a sequence, identifies repeat sizes and calculates consensus. | EMBOSS |
| extract | Takes a FASTA format sequence file and a file with a list of start/stop positions in that file and extracts and outputs the specified sequences. | S. Salzberg |
| extractfeat | Extracts features from a sequence file. | EMBOSS |
| extractseq | Extracts regions from a sequence (e.g. can extract exons to generate a CDS). | EMBOSS |
| findkm | Determines Km and Vmax based on input of substrate vs. reaction velocity. | EMBOSS |
| fitch | Performs Fitch-Margoliash, Least Squares, and other methods. | J. Felsenstein |
| fmtseq | Converts sequence files from one format to another. | Knight |
| freak | Calculates frequency of bases or residues in a window as it moves along a sequence. | EMBOSS |
| fuzznuc | Pattern matching for short patterns in nucleic acid sequences. | EMBOSS |
| fuzzpro | Pattern matching for short patterns in protein sequences. | EMBOSS |
| fuzztran | Pattern matching for short patterns in translated sequences. | EMBOSS |
| garnier | Predicts secondary structure of a protein. | Rodrigo Lopez |
| geecee | Calculates the fraction of GC bases of a sequence. | EMBOSS |
| getorf | Finds ORFs in a sequence, can define minimum length. | EMBOSS |
| glimmer | Finds genes in microbial DNA. | S. Salzberg |
| graphics | R Graphics demo. | R Development Core Team |
| helixturnhelix | Finds helix-turn-helix DNA binding motifs in a protein sequence. | EMBOSS |
| hmmalign | Aligns sequences to an HMM profile. | Eddy |
| hmmbuild | Builds an HMM profile from a sequence alignment. | Eddy |
| hmmcalibrate | Reads an HMM file and generates statistics. | Eddy |
| hmmconvert | Converts files from one HMM file format to another. | Eddy |
| hmmemit | Reads an HMM file and generates sequences from it. | Eddy |
| hmmfetch | Retrieves an HMM file from an HMM database. | Eddy |
| hmmpfam | Compares sequences to the HMM profiles in an HMM database. | Eddy |
| hmmsearch | Reads an HMM file and searches a sequence database for similar sequences. | Eddy |
| hmoment | Calculates hydrophobic moment of a peptide. | EMBOSS |
| html4blast | Formats text BLAST results in HTML. | Joly |
| iep | Calculates the isoelectric point of a protein sequence. | EMBOSS |
| infoalign | Utility to list properties of sequences in an alignment. | EMBOSS |
| infoseq | Utility that lists basic information about a sequence (accession number, length, etc.). | EMBOSS |
| isochore | Plots isochores in DNA sequences. | EMBOSS |
| kitsch | Performs the Fitch-Margoliash and least-squares methods, assuming an evolutionary clock and that all tip species are contemporaneous. | J. Felsenstein |
| lindna | Draws a linear DNA map given names and positions of markers. | EMBOSS |
| listor | Reads two sets of sequences and outputs the union of them. | EMBOSS |
| lmgene | Data transformation and identification of differentially expressed genes in expression arrays. | David M. Rocke |
| loadseq | Concatenates multiple sequences. | Pasteur |
| long_orfs | Takes a sequence file (in FASTA format) and outputs a list of all long "potential genes" in it that do not overlap by too much. | S. Salzberg |
| marscan | Finds matrix/scaffold attachment regions (MAR/SAR sites) in nucleic acid sequences. | EMBOSS |
| maskfeat | Masks off features in a sequence. | EMBOSS |
| maskseq | Masks off regions of a sequence (e.g. low-complexity regions). | EMBOSS |
| matcher | Compares sequences to find best local alignments between them. | EMBOSS |
| megamerger | Merges two overlapping DNA sequences. Uses less memory than merger, but merger is more accurate for divergent sequences. | EMBOSS |
| merger | Merges two overlapping DNA sequences. Uses more memory than megamerger but is more accurate for divergent sequences. | EMBOSS |
| mix | Applies parsimony with mixed methods. | J. Felsenstein |
| msbar | Mutates sequences, emulating different forms of mutation. | EMBOSS |
| mview_blast | Converts results of a BLAST, FASTA, etc. into an alignment of multiple hits against a query; note that it is not in itself an alignment program. | Brown |
| mwfilter | Filters noise from mass spec data based on molecular weight. | EMBOSS |
| needle | Global alignment of two sequences using Needleman-Wunsch algorithm. | EMBOSS |
| neighbor | Neighbor joining and UPGMA cluster methods. | J. Felsenstein |
| newcpgreport | Reports CpG rich areas in a sequence. | EMBOSS |
| newcpgseek | Reports CpG rich areas in a sequence. | EMBOSS |
| newseq | Creates a new sequence file for short sequences. | EMBOSS |
| notseq | Excludes a subset of sequences from a file of multiple sequences. | EMBOSS |
| nthseq | Extracts one sequence from a set of them. | EMBOSS |
| octanol | Calculates protein hydropathy. | EMBOSS |
| oddcomp | Finds regions of protein sequences with a biased composition. | EMBOSS |
| palindrome | Looks for palindromes in nucleic acid sequences. | EMBOSS |
| pasteseq | Editing tool that allows you to insert a sequence into another at a specified position. | EMBOSS |
| patmatdb | Takes a protein motif as input and compares it to protein sequence. | EMBOSS |
| patmatmotifs | Takes a protein sequence as input and searches PROSITE motif database. | EMBOSS |
| pepcoil | Calculates the probability of a coiled-coil structure in a protein sequence. | EMBOSS |
| pepinfo | Displays amino acid properties of a protein sequence, can plot hydrophobicity, etc. | EMBOSS |
| pepnet | Generates helical net for a protein. | EMBOSS |
| pepstats | Generates calculated protein statistics, e.g. molecular weight, charge, etc. | EMBOSS |
| pepwheel | Generates helical wheel of protein sequences. | EMBOSS |
| pepwindow | Generates Kyte and Doolittle hydropathy plot of protein. | EMBOSS |
| pepwindowall | Generates superimposed Kyte and Doolittle hydropathy plots of a set of aligned proteins. | EMBOSS |
| phiblast | Searches proteins combining pattern matching with local alignment around the pattern match. | EMBOSS |
| plotcon | Generates representation of the quality conservation along a set of aligned sequences. | EMBOSS |
| plotorf | Plots predicted ORFs in a sequence. | EMBOSS |
| polydot | Generates all-against-all dotplots of a sequence set. | EMBOSS |
| preg | Regular expression search of a protein sequence. | EMBOSS |
| prettyplot | Displays aligned sequences, with coloring and boxing. | EMBOSS |
| prettyseq | Output sequence with translated ranges. | EMBOSS |
| primersearch | Searches DNA sequences for matches with primer pairs. | EMBOSS |
| profit | Scan a sequence or database with a matrix or profile. | EMBOSS |
| prophecy | Creates matrices/profiles from multiple alignments. | EMBOSS |
| prophet | Gapped alignment for profiles. | EMBOSS |
| protdist | Computes distance matrix from protein sequences. | J. Felsenstein |
| protpars | Parsimony method for protein sequences. | J. Felsenstein |
| pscan | Locates fingerprints (multiple motif features) in a protein sequence. | EMBOSS |
| psiblast | Iterative protein similarity search; uses position-specific scoring matrices constructed during the search. | Altschul et al. |
| readseq | Converts DNA or protein sequence files to a specified format. | Gilbert |
| recoder | Find and remove restriction sites but maintain the same translation. | EMBOSS |
| redata | Reports isoschizomers, references, and suppliers for restriction enzymes. | EMBOSS |
| remap | Display a sequence with restriction cut sites, translation etc. | EMBOSS |
| restover | Finds restriction enzymes that produce a specific overhang. | EMBOSS |
| restrict | Finds restriction enzyme cleavage sites. | EMBOSS |
| revseq | Reverse and complement a sequence. | EMBOSS |
| seqmatchall | Does an all-against-all comparison of a set of sequences. | EMBOSS |
| showdb | Displays information about the currently available local sequence databases. | EMBOSS |
| showfeat | Show features of a sequence. | EMBOSS |
| showorf | Pretty output of DNA translations. | EMBOSS |
| showseq | Display a sequence with features, translation etc. | EMBOSS |
| shuffleseq | Shuffles a set of sequences, maintaining composition. | EMBOSS |
| sigcleave | Predicts signal peptide cleavage sites. | EMBOSS |
| sigscan | Uses a protein signature file generated by siggen to find other proteins with that signature. | EMBOSS |
| silent | Scans for restriction enzyme sites that can be introduced by silent mutations. | EMBOSS |
| splitter | Split a sequence into (overlapping) smaller sequences. | EMBOSS |
| stretcher | Global alignment of two sequences. | EMBOSS |
| stssearch | Searches a DNA database for matches with a set of STS primers. | EMBOSS |
| supermatcher | Finds a match of a large sequence against one or more sequences. | EMBOSS |
| syco | Plots the Gribskov statistic for synonymous codon usage. | EMBOSS |
| tfscan | Scans DNA sequences for transcription factors. | EMBOSS |
| tmap | Predicts transmembrane regions in proteins. | EMBOSS |
| transeq | Translates nucleic acid sequences. | EMBOSS |
| trimest | Trims poly-A tails off EST sequences. | EMBOSS |
| trimseq | Trims ambiguous residues off the ends of sequences. | EMBOSS |
| vectorstrip | Strips out DNA between a pair of vector sequences. | EMBOSS |
| water | Smith-Waterman local alignment. | EMBOSS |
| whichdb | Searches all databases for an entry. | EMBOSS |
| wise2 | Compares a protein sequence to genomic DNA, allowing for introns and frameshifts. | Birney |
| wobble | Plots variability at the wobble (third codon) base position. | EMBOSS |
| wordcount | Counts words of a specified size in a DNA sequence. | EMBOSS |
| wordmatch | Finds all exact matches of a given size between two sequences. | EMBOSS |
| xblast | Reads blast results and generates a query with the hits masked. | Claverie |
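
To give a feel for what two of the simpler utilities in the table compute, the following is a minimal Python sketch of a GC-fraction calculation (the kind of value geecee reports) and a reverse complement (as revseq produces). It is an illustration only, not the EMBOSS implementation: the function names `gc_fraction` and `reverse_complement` are hypothetical, and the sketch assumes plain unambiguous DNA letters (A, C, G, T) with no gaps or IUPAC ambiguity codes.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in seq (case-insensitive).

    Assumption: every character counts toward the length, so gaps or
    ambiguity codes would lower the reported fraction.
    """
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Translation table for unambiguous DNA bases only (an assumption of this sketch).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the resulting string."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    example = "ATGCGC"
    print(gc_fraction(example))         # 4 of 6 bases are G or C
    print(reverse_complement(example))  # GCGCAT
```

The real tools read and write sequence files in many formats and handle ambiguity codes; this sketch works only on in-memory strings.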