MoDPepInt Server
SH2PepInt - Help

Introduction

SH2PepInt has been developed to predict binding partners of 51 human SH2 domains. Peptides are restricted to a length of 7 amino acids, i.e. positions -2 to +4 around the pTyr position. Depending on user requirements, it can use the PhosphoSitePlus and Gene Ontology databases to predict highly reliable SH2-peptide interactions.
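The 7-residue window can be illustrated with a short Python sketch. It is not the server's code: the function name `tyrosine_windows` is hypothetical, and skipping tyrosines that lie too close to either end of the sequence is an assumption, since this page does not say how such positions are handled.

```python
def tyrosine_windows(sequence, before=2, after=4):
    """Yield (1-based Tyr position, 7-mer) for each Tyr with a full -2..+4 window."""
    for i, residue in enumerate(sequence):
        if residue != "Y":
            continue
        start, end = i - before, i + after + 1
        # Assumption: windows that would run past either terminus are skipped.
        if start < 0 or end > len(sequence):
            continue
        yield i + 1, sequence[start:end]


if __name__ == "__main__":
    for position, peptide in tyrosine_windows("GNMYYENSYALA"):
        print(position, peptide)
```

Each reported peptide has the tyrosine at its third position, which corresponds to position 0 in the -2 to +4 numbering.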

When using SH2PepInt, please cite:

Results are computed with SH2PepInt version 1.0.0

Overview

The following parameters are used to control the execution of SH2PepInt

Furthermore, additional information is available

Input Parameters

?  Protein/Peptide FASTA

SH2PepInt accepts input in the form of a multiple FASTA file. An example looks like this:
	
>ErbB1
MRPSGTAGAALLALLAALCPASRALEEKKVCQGTSNKLTQLGTFEDHFLSLQRMFNNCEV
VLGNLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVERIPLENLQIIRGNMYYENSYALA

>ErbB2
MELAALCRWGLLLALLPPGAASTQVCTGTDMKLRLPASPETHLDMLRHLYQGCQVVQGNL	
ELTYLPTNASLSFLQDIQEVQGYVLIAHNQVRQVPLQRLRIVRGTQLFEDNYALAVLDNG

>ErbB3
MRANDALQVLGLLFSLARGSEVGNSQAVCPGTLNGLSVTGDAENQYQTLYKLYERCEVVM
GNLEIVLTGHNADLSFLQWIREVTGYVLVAMNEFSTLPLPNLRVVRGTQVYDGKFAIFVM

>ErbB4
MKPATGLWVWVSLLVAAGTVQPSDSQSVCAGTENKLSSLSDLEQQYRALRKYYENCEVVM
GNLEITSIEHNRDLSFLRSVREVTGYVLVALNQFRYLPLENLRIIRGTKLYEDRYALAIF

Input can be given either as direct text input or by uploading a file.
Note: Input size is limited to restrict computation time and memory requirements.
The parameter constraints are: The input has to be in valid FASTA format. The number of sequences has to be at least 0 and at most 500. Sequence lengths have to be in the range 1-5000. The allowed sequence alphabet is 'GPAVLIMCFYWHKRQNEDST'. Either FASTA input or a UniProt ID list has to be provided. If an enabled filter requires UniProt IDs, each FASTA sequence header/name has to be either just a valid UniProt ID or a valid UniProt FASTA header.
Defaults to ()
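
The FASTA constraints above can be checked locally before submitting. The following Python sketch is only an illustration under stated assumptions: the function names `parse_fasta` and `check_input` are hypothetical, and treating the alphabet as upper-case only is an assumption, since this page does not say whether lower-case residues are accepted.

```python
ALLOWED = set("GPAVLIMCFYWHKRQNEDST")  # alphabet quoted in the constraints


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from multi-FASTA text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines between records are ignored
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first FASTA header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_input(text, max_sequences=500, min_len=1, max_len=5000):
    """Raise ValueError if the input violates the documented limits."""
    records = parse_fasta(text)
    if len(records) > max_sequences:
        raise ValueError(f"{len(records)} sequences given, at most {max_sequences} allowed")
    for header, sequence in records:
        if not min_len <= len(sequence) <= max_len:
            raise ValueError(f"{header}: length {len(sequence)} outside {min_len}-{max_len}")
        # Assumption: only upper-case residue letters count as valid.
        bad = sorted(set(sequence) - ALLOWED)
        if bad:
            raise ValueError(f"{header}: characters not allowed: {''.join(bad)}")
    return records
```

Calling `check_input` on the text of the example above would return one `(header, sequence)` pair per record, with the wrapped sequence lines joined.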

?  Protein UniProt IDs

Instead of providing protein sequences directly, you can provide the UniProt IDs of the target proteins as input. In this case, the protein sequences are downloaded automatically from the UniProt database. Multiple UniProt IDs, separated by any of the separators listed below, are also accepted. An example looks like this:
 P00533, P04626, P21860
The parameter constraints are: Has to be a list of 0-500 UniProt IDs that are separated by ([,\.;: \t\n]|\r\n). Access to the UniProt database is needed. The value has to match against the regular expression '^[^;'"]*$'.
Defaults to ()
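
As an illustration of these constraints, the Python sketch below splits an ID list using the two regular expressions quoted above. The function name `split_uniprot_ids` is hypothetical, and dropping empty tokens (for example between a comma and the following space) is an assumption. Note that, taken literally, the second expression rejects any value containing `;`, even though `;` is listed as a separator, so commas or whitespace are the safer choice.

```python
import re

# Both patterns are copied from the parameter constraints above.
SEPARATOR = re.compile(r"[,\.;: \t\n]|\r\n")
SAFE_VALUE = re.compile(r"""^[^;'"]*$""")


def split_uniprot_ids(value, max_ids=500):
    """Split a raw ID list into tokens and enforce the documented limits."""
    if not SAFE_VALUE.match(value):
        raise ValueError("value must not contain ';', single or double quotes")
    # Assumption: empty tokens produced by adjacent separators are ignored.
    ids = [token for token in SEPARATOR.split(value) if token]
    if len(ids) > max_ids:
        raise ValueError(f"{len(ids)} IDs given, at most {max_ids} allowed")
    return ids


if __name__ == "__main__":
    print(split_uniprot_ids(" P00533, P04626, P21860"))
```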

?  SH2 Domains

List of all SH2 protein domains available for an interaction screening.
The parameter constraints are: The value has to match against the regular expression '^[^;'"]*$'. Only protein domains from the following list are allowed: ABL1, ABL2, APS, BCAR3, BLK, BMX, BRDG1, BTK, CRKL, CRK, CTEN, E105251, E109111, E185634, EAT2, FER, FES, FGR, FRK, GRAP2, GRB10, GRB14, GRB2, HCK, INPPL1, ITK, LCK, LCP2, LYN, MATK, MIST, NCK1, NCK2, PTK6, SH2B, SH2D1A, SH2D2A, SH2D3C, SHC1, SHC3, SOCS2, SOCS5, SRC, TEC, TENC1, TENS1, TNS, TXK, VAV1, VAV2, YES1.
Defaults to ()
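
A domain selection can be checked against this list with a few lines of Python. This is a sketch only: the function name `check_domains` is hypothetical, and matching names exactly (case-sensitively) is an assumption.

```python
# The 51 domain names listed in the parameter constraints above.
SH2_DOMAINS = frozenset("""
ABL1 ABL2 APS BCAR3 BLK BMX BRDG1 BTK CRKL CRK CTEN E105251 E109111 E185634
EAT2 FER FES FGR FRK GRAP2 GRB10 GRB14 GRB2 HCK INPPL1 ITK LCK LCP2 LYN MATK
MIST NCK1 NCK2 PTK6 SH2B SH2D1A SH2D2A SH2D3C SHC1 SHC3 SOCS2 SOCS5 SRC TEC
TENC1 TENS1 TNS TXK VAV1 VAV2 YES1
""".split())


def check_domains(selected):
    """Return the selection unchanged, or raise ValueError on unknown names."""
    # Assumption: names must match the listed spelling exactly.
    unknown = [name for name in selected if name not in SH2_DOMAINS]
    if unknown:
        raise ValueError("unknown SH2 domain(s): " + ", ".join(unknown))
    return list(selected)
```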

Filters

?  Phosphotyrosine (pY) (needs UniProt IDs)

This filter has been implemented using the annotated information in the PhosphoSitePlus database; in this way, only those phosphotyrosine peptides whose phosphorylation has been experimentally verified are selected. At the time of the analysis (January 2013), the PhosphoSitePlus database contained 30,228 phosphorylation sites from 10,688 human proteins. Peptides that were not present in the UniProtKB/Swiss-Prot database were ignored, leaving 27,481 phosphorylation peptides from 9,621 proteins. This filter needs the UniProt IDs of the query proteins: within FASTA input, either provide ONLY the UniProt ID in the FASTA header OR use the header encoding from the UniProt database; alternatively, just provide the UniProt IDs for an automated download of the query sequences.
The parameter constraints are: Input value has to be parsable as Boolean.
Defaults to (false)
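
The two accepted header styles can be told apart with a small Python sketch. It is an illustration, not the server's parser: the function name `uniprot_id_from_header` is hypothetical, the accession pattern is the one published in the UniProt help pages, and the `sp|ACCESSION|ENTRY_NAME` layout (with `tr` for TrEMBL entries) is the standard UniProt FASTA header encoding.

```python
import re

# Accession pattern as documented by UniProt.
ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def uniprot_id_from_header(header):
    """Return the UniProt accession encoded in a FASTA header, or None."""
    header = header.lstrip(">").strip()
    if ACCESSION.fullmatch(header):
        return header  # header consists of the ID only
    parts = header.split("|")
    # UniProt-style header: db|ACCESSION|ENTRY_NAME description ...
    if len(parts) >= 3 and parts[0] in ("sp", "tr") and ACCESSION.fullmatch(parts[1]):
        return parts[1]
    return None


if __name__ == "__main__":
    for h in (">P00533", ">sp|P00533|EGFR_HUMAN Epidermal growth factor receptor", ">ErbB1"):
        print(h, "->", uniprot_id_from_header(h))
```

A header such as `>ErbB1` carries no UniProt ID, so a filter that needs UniProt IDs could not be applied to that sequence.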

?  Cellular localization (needs UniProt IDs)

This filter was implemented using the terms of the sub-cellular localization hierarchy in the controlled vocabulary of the Gene Ontology database (January 2013). In case of multiple cellular locations (e.g. the GRB2 protein can be found in the nucleus, cytoplasm, endosome and Golgi apparatus), a peptide is considered viable for interaction if it shares at least one of the terms with the domain. This filter is not applicable to domains without annotated sub-cellular localizations (such as SHD/E105251). UniProt IDs of the query proteins are mandatory for this filter: within FASTA input, either provide ONLY the UniProt ID in the FASTA header OR use the header encoding from the UniProt database; alternatively, just provide the UniProt IDs for an automated download of the query sequences.
The parameter constraints are: Input value has to be parsable as Boolean.
Defaults to (false)
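
The "at least one shared term" rule amounts to a set intersection, as the Python sketch below shows. The function name `shares_location` is hypothetical, the locations assigned to the example protein are made up for illustration, and returning `None` for domains without annotation is an assumption about how "not applicable" might be represented.

```python
def shares_location(protein_terms, domain_terms):
    """Return True if both term sets overlap, None if the domain has no terms."""
    if not domain_terms:
        return None  # assumption: filter not applicable, no decision is made
    return bool(set(protein_terms) & set(domain_terms))


if __name__ == "__main__":
    # GRB2 locations as named in the description above.
    grb2 = {"nucleus", "cytoplasm", "endosome", "Golgi apparatus"}
    # Hypothetical annotation of a query protein.
    query = {"plasma membrane", "endosome"}
    print(shares_location(query, grb2))
```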

Output Description

The output table summarizes all predicted protein-domain interactions.

Detailed descriptions:
  1. Input sequence ID
  2. Binding region in the protein
  3. Binding sequence
  4. Domain(s) predicted to bind the protein
Therein, Y indicates a binding tyrosine residue, while pY indicates a binding tyrosine residue whose phosphorylation has been validated and reported in the PhosphoSitePlus database.

Input Examples

?  SH2 interactions

This is an example for the interaction prediction of human SH2 domains from GRB2, CRK and CRKL proteins with the target proteins ERBB1, ERBB2 and ERBB3.
The example's result can be directly accessed here

Frequently Asked Questions

If your question is not listed, please send it to us!

? Is it possible to train an SH2 prediction model with my own data?

In case you want to use your own data to make a dedicated SH2-prediction model, please contact us.

List of Changes