Protein subcellular localization prediction

Protein subcellular localization prediction is the computational prediction of where a protein resides in a cell. It is an important component of bioinformatics-based prediction of protein function and genome annotation, and it can aid the identification of drug targets.

Most eukaryotic proteins are encoded in the nuclear genome and synthesized in the cytosol, but many need to be further sorted before they reach their final destination. In prokaryotes, proteins are synthesized in the cytoplasm, and some must be targeted to other locations, such as a cell membrane or the extracellular environment. Proteins must be localized to the appropriate subcellular compartment to perform their function.

Experimentally determining the subcellular localization of a protein is a laborious and time-consuming task. Through the development of new approaches in computer science, coupled with a growing dataset of proteins of known localization, computational tools can now provide fast and accurate localization predictions for many organisms. As a result, subcellular localization prediction has become one of the problems that bioinformatics is successfully helping to address. Many protein subcellular localization prediction methods now exceed the accuracy of some high-throughput laboratory methods for identifying protein subcellular localization.

Methods
Several computational tools for predicting the subcellular localization of a protein are publicly available, a few of which are listed below. The number of different subcellular localizations predicted varies between methods, as does their accuracy, so the most suitable method depends on what is to be predicted and on how sensitive or specific the analysis needs to be. Methods for predicting bacterial protein localization, and their accuracy, have been recently reviewed. See also the PSORT.org portal for a more extensive list of localization predictors for both bacteria and eukaryotes.


 * PSORT : The first widely used method for protein subcellular localization prediction, developed under the leadership of Kenta Nakai. Researchers are now encouraged to also use other PSORT programs, such as WoLF PSORT and PSORTb, when making predictions for certain types of organisms (see below). The prediction performance of PSORT is lower than that of more recently developed predictors.


 * LOCtree : Prediction based on mimicking the cellular sorting mechanism using a hierarchical implementation of support vector machines. LOCtree is a comprehensive predictor incorporating predictions based on PROSITE/PFAM signatures as well as SwissProt keywords.


 * BaCelLo : Prediction of eukaryotic protein subcellular localization. Unlike other methods, its predictions are balanced among the different classes, and all predicted localizations are treated as equiprobable, to avoid mispredictions.


 * TargetP : Prediction of N-terminal sorting signals.


 * SecretomeP : Prediction of eukaryotic proteins that are secreted via a non-traditional secretory mechanism.


 * PredictNLS : Prediction of nuclear localization signals.


 * WoLF PSORT (http://wolfpsort.org/) : An updated version of PSORT/PSORT II for predicting the localization of eukaryotic proteins.


 * PSORTb : Prediction of bacterial protein localization.


 * Proteome Analyst : Prediction of protein localization for both prokaryotes and eukaryotes using a text mining approach.


 * CELLO : Uses a two-level support vector machine system to assign localizations to both prokaryotic and eukaryotic proteins.
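
Because the methods above differ in how sensitive and how specific they are, it can help to measure both on a set of proteins whose localization is already known before choosing a tool. The following is a minimal sketch in Python, assuming the standard definitions of sensitivity (true positives over all proteins truly in the class) and specificity (true negatives over all proteins truly outside it). The function name `per_class_metrics` and the toy labels are hypothetical and are not taken from any of the listed tools.

```python
def per_class_metrics(known, predicted, location):
    """Return (sensitivity, specificity) for one localization class.

    known and predicted are equal-length sequences of location labels,
    one entry per protein.
    """
    if len(known) != len(predicted):
        raise ValueError("known and predicted must have the same length")
    tp = fp = tn = fn = 0
    for k, p in zip(known, predicted):
        if k == location and p == location:
            tp += 1  # correctly assigned to this location
        elif k != location and p == location:
            fp += 1  # wrongly assigned to this location
        elif k == location and p != location:
            fn += 1  # missed a protein that belongs here
        else:
            tn += 1  # correctly kept out of this location
    # Report NaN when a class has no positives or no negatives.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data: known localizations and one predictor's output.
known = ["nucleus", "cytoplasm", "secreted", "nucleus", "cytoplasm", "secreted"]
predicted = ["nucleus", "nucleus", "secreted", "cytoplasm", "cytoplasm", "secreted"]

for loc in sorted(set(known)):
    sens, spec = per_class_metrics(known, predicted, loc)
    print(f"{loc}: sensitivity={sens:.2f} specificity={spec:.2f}")
```

Running the same comparison on the output of each candidate predictor, using the same set of proteins, gives a like-for-like basis for deciding which method fits a given analysis.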

Relevance
Determining subcellular localization is important for understanding protein function and is a critical step in genome annotation.

Knowledge of the subcellular localization of a protein can significantly improve target identification during the drug discovery process. For example, secreted proteins and plasma membrane proteins are easily accessible to drug molecules because they are located in the extracellular space or on the cell surface.

Bacterial cell surface and secreted proteins are also of interest for their potential as vaccine candidates or as diagnostic targets.

Aberrant subcellular localization of proteins has been observed in cells affected by several diseases, such as cancer and Alzheimer’s disease.

Secreted proteins from some archaea that can survive in unusual environments have industrially important applications.