Sequerome

Sequerome is a web-based sequence profiling tool developed by the Bioinformatics and Computational Biosciences Unit (BCBU) at Georgetown University. It links an entire BLAST report to a panel of servers that perform advanced sequence manipulations, all within a tabbed browsing environment.

For many researchers, free access to information is less of an obstacle than the ability to analyze and organize data through simple, user-friendly interfaces. This is especially true for experimental molecular biologists, who typically carry out a series of sequence manipulations at several different sites before using the processed data in their experiments. Biologists often perform sequence homology studies using the Basic Local Alignment Search Tool (BLAST). The BLAST results interface provides very limited access to sequence analysis tools, such as restriction enzyme maps for DNA sequences and secondary structure prediction for protein sequences. The problem is compounded by the difficulty of navigating smoothly between different sequence alignment records. Such needs are better met by a single interface that lets a user link each sequence in an alignment report directly to the different domains and servers offering analysis and manipulation options. Sequerome, a web-based Java tool, was developed to meet this need by acting as a front end to BLAST queries and providing simplified access to web-distributed resources for protein and nucleic acid analysis. Although its present scope is limited, the application is likely to spur the development of related software that expands the scope and coverage of molecular biology tools.

Since its inception in 2005, the tool has been featured in the journal Science and linked from many bioinformatics portals around the world.

Salient features

 * Profiling of sequence alignment reports from BLAST by linking the results page to a panel of third-party services.
 * Tabbed browsing that lets the user return to earlier operations and visit third-party services to perform customized sequence manipulations.
 * One-box, any-format sequence input, with alternative input options that include visiting third-party sites.
 * Cached storage and retrieval of input sequences.
 * A three-pane browsing environment that allows simultaneous input and analysis of multiple sequences.
 * Archival options, shown on top of each icon, for the results from each pane.

The application can be accessed directly as a web server at its homepage. The homepage shows three panes: the Query pane, the Results pane and the Search History pane. The user may resize the panes and work in any of them in parallel. In a single browser window, one could therefore run parallel BLAST searches on different sequences, analyze them, or view the restriction digests for each record of a BLAST result.

Query Pane
Each browser session is designed to start without asking the user too many questions. The user simply pastes a sequence into the Query pane and can BLAST it right away under standard parameters. Experienced users can perform more specialized operations under the Advanced options. These include selecting specific databases to BLAST against, uploading FASTA files stored on the user's own computer, retrieving sequences by NCBI ID, and visiting any user-defined URL to drag and drop sequences. Alternatively, the user can perform a variety of other actions, including sequence manipulation, analysis and alignment, using popular existing tools available on the web. The "One-box any-sequence" field takes input in any format (FASTA, with or without spaces or numbers, and so on). Alerts warn the user when the selected sequence type (DNA/RNA/Protein) does not match the input. Results obtained from sequence manipulation, e.g. translation, can be carried forward into a further BLAST analysis while preserving the history of the earlier search.
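The any-format input and the type alerts described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not Sequerome's own Java code; the function names (clean_sequence, guess_type, check_selection) and the alphabet-based heuristic are assumptions made for illustration only.

```python
import re
from typing import Optional

# Alphabets used by the heuristic (an assumption, not Sequerome's actual rule).
DNA_LETTERS = set("ACGTN")
RNA_LETTERS = set("ACGUN")


def clean_sequence(raw: str) -> str:
    """Drop FASTA header lines, then strip digits, spaces and other non-letters."""
    lines = [ln for ln in raw.splitlines() if not ln.lstrip().startswith(">")]
    return re.sub(r"[^A-Za-z]", "", "".join(lines)).upper()


def guess_type(seq: str) -> str:
    """Guess the sequence type from the set of letters it contains."""
    letters = set(seq)
    if not letters:
        return "empty"
    if letters <= DNA_LETTERS:
        return "DNA"
    if letters <= RNA_LETTERS:
        return "RNA"
    return "protein"


def check_selection(raw: str, selected: str) -> Optional[str]:
    """Return a warning if the user's selected type disagrees with the guess."""
    guessed = guess_type(clean_sequence(raw))
    if guessed.lower() != selected.lower():
        return f"Input looks like {guessed}, but {selected} was selected."
    return None


if __name__ == "__main__":
    pasted = ">seq1 example\n  1 atgc ctga\n 11 ggta\n"
    print(clean_sequence(pasted))
    print(check_selection(pasted, "protein"))
```

The heuristic is deliberately crude: a short peptide made up only of A, C, G, T and N would be classified as DNA, so a real tool would need a more careful rule or an explicit user override.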

Results Pane
Sequerome queries the input sequence directly against a variety of databases and tools (‘popular public domains’ and ‘privately hosted services’), including BLAST, PDB, REBASE and others, and generates output that is intended to be intuitive and easy to understand. Instant access to various analysis tools, including viewing a 3D structure from a PDB ID, is provided through separate command buttons, so that every record in a BLAST report can be analyzed before a final selection is made. This is in contrast to servers that provide plain links to the various resources, leaving the user to feed in the sequence again for each analysis. In future this could evolve into a choice of user-defined tools for each generic analysis, e.g. Cn3D, DeepView or RasMol as structure viewers. For results from a protein BLAST, PDB IDs are displayed prominently next to the BLAST record where applicable, so that the structure of the matching molecule can be viewed directly (with a previously downloaded molecular structure viewer, e.g. Cn3D or RasMol). Once the BLAST report is displayed in the Results pane, the user can run a quick analysis on any of the BLAST hits using a series of command buttons linked to the respective servers or sites. Most results from third-party servers, e.g. ORF prediction or ProtParam, can be viewed directly in the Results pane without opening additional browser windows.
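How a panel of command buttons might be chosen for one BLAST hit can be sketched as follows. This is an illustration in Python, not the tool's implementation: the ACTIONS table, the function names, and the assumption that hit titles carry an identifier of the form pdb|1ABC|A (as in older NCBI FASTA deflines) are all hypothetical.

```python
import re
from typing import List, Optional

# Hypothetical mapping from sequence type to the analyses offered as buttons.
ACTIONS = {
    "DNA": ["restriction map", "ORF prediction", "translation"],
    "protein": ["ProtParam", "secondary structure prediction"],
}

# Assumed identifier form: "pdb|<4-character ID>|<chain>".
PDB_ID = re.compile(r"\bpdb\|([0-9][A-Za-z0-9]{3})\|", re.IGNORECASE)


def pdb_id(hit_title: str) -> Optional[str]:
    """Return the PDB ID embedded in a hit title, or None if there is none."""
    match = PDB_ID.search(hit_title)
    return match.group(1).upper() if match else None


def buttons_for_hit(hit_title: str, seq_type: str) -> List[str]:
    """List the command buttons to show next to one BLAST record."""
    buttons = list(ACTIONS.get(seq_type, []))
    structure = pdb_id(hit_title)
    if seq_type == "protein" and structure is not None:
        buttons.append(f"view structure {structure}")
    return buttons


if __name__ == "__main__":
    print(buttons_for_hit("pdb|1TUP|A Chain A, Tumor Suppressor P53", "protein"))
```

Keeping the mapping in a table rather than in code is one way the "user-defined analysis tools" mentioned above could later be supported, since a user's preferred tool would only change a table entry.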

Search History Pane
A key part of profiling input sequence data is the ability to store, retrieve, combine and re-use older inputs. This is more useful still when the result of each operation performed can also be retrieved. The bottom right pane of the browser provides this, and also stores all the input sequences entered earlier. The browser thus offers a tabbed browsing environment, which should appeal to anyone tired of running multiple browser sessions at once. For each icon linking to stored results, the user can choose among archiving options, including print, save and mail. These appear as small colored pictures on top of each icon. This area is less stable than the rest of the tool, and users may encounter glitches in the archival options.
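The idea of a search history that keeps earlier inputs and remembers which result was derived from which can be sketched with a small in-memory structure. The class and field names below are hypothetical, and the sketch says nothing about how Sequerome actually stores its history.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HistoryEntry:
    label: str                    # what was done, e.g. "translation"
    sequence: str                 # the sequence that operation worked on
    parent: Optional[int] = None  # index of the entry this one was derived from


@dataclass
class SearchHistory:
    entries: List[HistoryEntry] = field(default_factory=list)

    def add(self, label: str, sequence: str, parent: Optional[int] = None) -> int:
        """Store an operation and return its index for later retrieval."""
        self.entries.append(HistoryEntry(label, sequence, parent))
        return len(self.entries) - 1

    def get(self, index: int) -> HistoryEntry:
        """Retrieve a stored entry by index."""
        return self.entries[index]

    def lineage(self, index: int) -> List[str]:
        """Labels of all steps leading to an entry, oldest first."""
        chain = []
        current: Optional[int] = index
        while current is not None:
            entry = self.entries[current]
            chain.append(entry.label)
            current = entry.parent
        return list(reversed(chain))


if __name__ == "__main__":
    history = SearchHistory()
    first = history.add("BLAST: query DNA", "ATGGCC")
    second = history.add("translation", "MA", parent=first)
    third = history.add("BLAST: translated protein", "MA", parent=second)
    print(history.lineage(third))
```

Recording a parent index for each entry is what allows a translated sequence to be sent on to a new BLAST search while the earlier search remains reachable.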

Implementation
Sequerome has a three-tiered architecture that uses Java Servlet and JavaServer Pages (JSP) technologies with Java Database Connectivity (JDBC), making it both server- and platform-independent. It is compatible with essentially all Java-enabled graphical browsers, although it works best in Internet Explorer, and it can be run on most operating systems equipped with a Java Virtual Machine (JVM) and a Jakarta Tomcat server. End users have to download plugins for viewing the structures of molecules from PDB, e.g. Cn3D, RasMol or SwissPDB.

Further directions
The "post-genomics" era has given rise to a range of web-based tools and software that compile, organize and deliver large amounts of primary sequence information, as well as protein structures, gene annotations, sequence alignments and the results of other common bioinformatics tasks. A simple web search returns any number of such services and software tools. Tools like these are likely to dominate the changing face of bioinformatics and computational biology, as researchers and investigators begin to upload their raw information from microarray chips and proteomic blots that profile large amounts of genetic data. Future tools are therefore likely to have more advanced features, such as the ability to upload image and sound data or to provide 'smart' summaries of very large datasets using 'intelligent text summarization' tools.

New tools are also likely to offer advanced user-defined search templates that perform automated searches on very large datasets to aid drug discovery, biomedical imaging, microarray data analysis and the simulation of complex metabolic pathways. Sequerome is a prototype that aims to address this by incorporating the above features and adding others, including personalized search trees, downloadable browsers and toolbars for customized searches, and user-defined selection, from the command options, of the tool used for a particular generic analysis, e.g. secondary structure prediction. The possibilities are many, and the web of biological resources is only beginning to be interconnected. As the volume of information continues to grow and reorganize, new generations of single-interface systems that address the specific needs of individual research groups are likely to emerge and expand.

Portals citing Sequerome

 * Canadian Bioinformatics Help Desk - Canada
 * University of Edinburgh - UK
 * ExPASy - Swiss Institute of Bioinformatics
 * International ImMunoGeneTics Information System (IMGT) - France
 * Children's Medical Research Center - Illinois, USA
 * UCLA library E-resources - California, USA
 * Grunwald labs - Oregon, USA