Version 1.1
What is SubMito

    SubMito is the first computational system for predicting the submitochondrial location of a protein from its primary sequence. SubMito is designed and implemented in Java. This site is a web front end for the SubMito system. Users can run SubMito on the server by uploading a FASTA file or entering a sequence below. The prediction result is saved on the server, and a link for downloading the result file is provided.

    Because several sessions may be running on the server at the same time, the server-side SubMito may respond very slowly. Another way to use SubMito is to download a local version of the software. Links to the SubMito distribution packages for different operating systems, and to a dataset for quickly testing SubMito, are listed in the Download section. If you run into any problem while using SubMito, please contact


History of SubMito

    The initial idea for SubMito came from a discussion with Dr. Jun Cai on 28th Feb. 2006.
    The SubMito project was then started on 7th Mar. 2006, guided by Prof. Yanda Li.
    The first releasable version of SubMito (V1.0) was published online on 2nd May 2006.
    A bug report was received on 31st Aug. 2006.
    The bug was fixed and an updated version of SubMito (V1.1) was published online on 7th Sep. 2006.
    The SubMito web site was updated with an improved design on 7th Sep. 2006.


Online service usage

    The online service accepts two forms of input: a single sequence or an uploaded FASTA file. To predict the submitochondrial location of a single sequence, paste the sequence into the text box below. Optionally, you can write some remarks in the text field labeled "Enter your remark". Then press the [Submit] button to run the prediction. The result is saved in a FASTA-format file, which you can download at any time within the next 48 hours.

    If you decide to upload a FASTA file, make sure that the file is no larger than 2MB; the server currently cannot accept larger files. Once you have selected your FASTA file, press the [Submit] button to upload it and run the prediction. The result is saved in a FASTA file, which you can download at any time within the next 48 hours. You can specify the result file name before uploading by entering a string in the text field labeled "Enter result file name", but this is not required. If you leave the field blank, the server will automatically generate a file name for you.
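
    The two upload constraints above (file size and result file name) can be checked locally before submitting. The following is only a minimal sketch: the class UploadCheck and its methods are hypothetical names, not part of SubMito, and it assumes that "2MB" means 2 * 1024 * 1024 bytes, which the site does not state.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical pre-upload check; not part of the SubMito distribution.
final class UploadCheck {
    // Assumption: the 2MB limit is interpreted as 2 * 1024 * 1024 bytes.
    static final long MAX_BYTES = 2L * 1024 * 1024;

    /** True if a file of the given size is within the assumed upload limit. */
    static boolean sizeOk(long bytes) {
        return bytes <= MAX_BYTES;
    }

    /** True if the result file name is blank or uses only a-z, A-Z, 0-9. */
    static boolean nameOk(String name) {
        return name.isEmpty() || name.matches("[A-Za-z0-9]+");
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("usage: java UploadCheck.java <fasta file> [result name]");
            return;
        }
        Path fasta = Paths.get(args[0]);
        String resultName = args.length > 1 ? args[1] : "";
        System.out.println("size ok: " + sizeOk(Files.size(fasta)));
        System.out.println("name ok: " + nameOk(resultName));
    }
}
```

    A blank name is accepted here because the server generates a file name when the field is left empty.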

    In the result file, you may see a message like "SubMito core:... ". This usually means that your sequence contains symbols such as "X", "B", "Z", or any other character that is not a standard amino acid symbol, including spaces and line breaks. The SubMito core automatically filters these symbols out and generates the message to tell you that one or more symbols have been removed from your original sequence. A result based on the modified sequence may not be reliable. If you submit a single sequence, a detailed report of such events is shown in your browser.
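
    To see in advance whether a sequence would trigger this message, the filtering can be imitated locally. The sketch below is not the SubMito core: SequenceFilter is a hypothetical class, and treating lower-case letters as their upper-case equivalents is an assumption that the site does not confirm.

```java
// Hypothetical helper that imitates the filtering step; not the SubMito core.
final class SequenceFilter {
    // One-letter codes of the 20 standard amino acids.
    static final String STANDARD = "ACDEFGHIKLMNPQRSTVWY";

    /** Returns the sequence with every non-standard symbol removed. */
    static String clean(String sequence) {
        StringBuilder kept = new StringBuilder();
        for (char c : sequence.toCharArray()) {
            // Assumption: lower-case input is accepted and upper-cased.
            char upper = Character.toUpperCase(c);
            if (STANDARD.indexOf(upper) >= 0) {
                kept.append(upper);
            }
        }
        return kept.toString();
    }

    /** Number of symbols that clean() drops from the sequence. */
    static int removedCount(String sequence) {
        return sequence.length() - clean(sequence).length();
    }

    public static void main(String[] args) {
        String raw = "MKXLB VZA\n";
        System.out.println("cleaned: " + clean(raw));
        System.out.println("removed: " + removedCount(raw));
    }
}
```

    A non-zero removed count means that the prediction would be made on a modified sequence.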


Online Service

    Here you can enter your sequence or upload a FASTA-format file. SubMito will predict the submitochondrial location of each sequence.

Predict single sequence
Enter your remark (optional, only a-z,A-Z,0-9):
Enter your sequence:
Upload FASTA file
Enter FASTA file name:
Enter result file name: (No extension, only a-z,A-Z,0-9)


Local predictor usage

    The local predictor is developed in Java. If you do not have a JRE installed on your system, you need to obtain the newest version of the JRE for your system before you start using the predictor. The user interface of the local predictor is almost the same as that of the online service. You can use it to predict a single sequence or a FASTA file containing many sequences. If you are using Windows or Linux, download the pre-configured version. If you are using another OS, download the unconfigured version and modify SubMito.conf yourself. The only change required is to set the path separator to "\" or "/".

    NOTICE: If you download and set up the local predictor, FASTA files are handled differently from the online service. Each sequence in a FASTA file for the local predictor MUST be written on ONE LINE. A file with multi-line sequences will cause the predictor to fail. The online service has no such restriction. Although the online service may be a little slow, we still recommend it over the local predictor.
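
    A multi-line FASTA file can be converted to the one-line-per-sequence layout before it is given to the local predictor. The following is a minimal sketch under stated assumptions: FastaUnwrapper is a hypothetical helper that is not shipped with SubMito, header lines are assumed to start with ">", and blank lines are dropped.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical converter; not part of the SubMito distribution.
final class FastaUnwrapper {
    /** Joins the sequence lines under each ">" header into a single line. */
    static List<String> unwrap(List<String> lines) {
        List<String> out = new ArrayList<>();
        StringBuilder sequence = new StringBuilder();
        for (String line : lines) {
            String text = line.trim();
            if (text.isEmpty()) {
                continue; // skip blank lines
            }
            if (text.startsWith(">")) {
                // A new record starts: flush the previous sequence first.
                if (sequence.length() > 0) {
                    out.add(sequence.toString());
                    sequence.setLength(0);
                }
                out.add(text);
            } else {
                sequence.append(text);
            }
        }
        if (sequence.length() > 0) {
            out.add(sequence.toString());
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: java FastaUnwrapper.java <input.fasta> <output.fasta>");
            return;
        }
        List<String> input = Files.readAllLines(Paths.get(args[0]));
        Files.write(Paths.get(args[1]), unwrap(input));
    }
}
```

    The output file contains each header followed by exactly one sequence line.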



Download

    Here we provide three versions of SubMito: for Windows, for Linux, and for other OSes. The packages for Linux and Windows are pre-configured; the package for other OSes is not. You can modify its configuration file using the one for Windows or Linux as a model.

    SubMito un-configured version for all platforms
    SubMito pre-configured version for Linux
    SubMito pre-configured version for Windows

    The following datasets can be fed directly into SubMito to get a quick test result. They are provided only for the purpose of testing SubMito. The data and the results SubMito produces from them have no scientific meaning.

    SubMito Sample dataset


Author's declaration

    The SubMito system was designed and implemented by Pufeng Du. The algorithm used in SubMito is based on an SVM classifier. The features that the classifier uses to classify proteins are extracted from segmented protein sequences, an extended version of Chou's pseudo-amino-acid composition.
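
    To give a rough idea of what features from segmented sequences can look like, the sketch below splits a sequence into nearly equal segments and concatenates the plain amino acid composition of each segment. It is an illustration only, not SubMito's actual feature extraction: SegmentedComposition is a hypothetical class, the segment count and the splitting rule are assumptions, and the sequence-order terms of pseudo-amino-acid composition are left out.

```java
// Illustrative only; not SubMito's actual feature extraction.
final class SegmentedComposition {
    // One-letter codes of the 20 standard amino acids.
    static final String STANDARD = "ACDEFGHIKLMNPQRSTVWY";

    /** Relative frequency of each standard amino acid within one segment. */
    static double[] composition(String segment) {
        double[] freq = new double[STANDARD.length()];
        int counted = 0;
        for (char c : segment.toCharArray()) {
            int index = STANDARD.indexOf(c);
            if (index >= 0) {
                freq[index]++;
                counted++;
            }
        }
        if (counted > 0) {
            for (int i = 0; i < freq.length; i++) {
                freq[i] /= counted;
            }
        }
        return freq;
    }

    /** Concatenates the compositions of `segments` nearly equal parts. */
    static double[] features(String sequence, int segments) {
        if (segments <= 0) {
            throw new IllegalArgumentException("segments must be positive");
        }
        int width = STANDARD.length();
        double[] out = new double[segments * width];
        int n = sequence.length();
        for (int s = 0; s < segments; s++) {
            // Assumed splitting rule: boundaries at s * n / segments.
            int start = (int) ((long) s * n / segments);
            int end = (int) ((long) (s + 1) * n / segments);
            double[] part = composition(sequence.substring(start, end));
            System.arraycopy(part, 0, out, s * width, width);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] vector = features("AACCDDEE", 2);
        System.out.println("feature vector length: " + vector.length);
    }
}
```

    A vector of this kind, one per protein, is the sort of input an SVM classifier is trained on.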

    SubMito is free for any academic use. For any other purpose, permission must be obtained first.

    If you want to have a copy of SubMito's source code, please contact



    We thank Dr. Jun Cai for his helpful discussion. We also thank the anonymous reviewers for their helpful comments, good suggestions, and careful testing of SubMito.

    This project is partially supported by NSFC 60234020 and NSFC 60572086 of China.


Bioinformatics Division, TNLIST and Department of Automation, Tsinghua University