Introduction
SH3PepInt is a fast, alignment-free tool based on graph kernels
for predicting SH3-peptide interactions. A total of 69 built-in models are
available for SH3-peptide predictions. The peptides do not need to be
pre-aligned. SH3PepInt scans the query proteins with a sliding window and
considers peptides with a length of 15 amino acids. The user can change the
step size of the window analysis. If requested by the user, it uses the
Gene Ontology database
to obtain more reliable interactions.
When using SH3PepInt please cite:
- Kousik Kundu, Martin Mann, Fabrizio Costa, and Rolf Backofen
MoDPepInt: An interactive webserver for prediction of modular domain-peptide interactions
Bioinformatics, 2014, in press.
- Kousik Kundu, Fabrizio Costa, and Rolf Backofen
A graph kernel approach for alignment-free domain-peptide interaction prediction with an application to human SH3 domains
Bioinformatics, 29(13), pp. i335-i343, 2013.
Results are computed with SH3PepInt version 1.0.0
Overview
The following parameters are used to control the execution of SH3PepInt.
Furthermore, additional information is available on the output, the input examples and the list of changes.
Input Parameters
Protein/Peptide FASTA
SH3PepInt accepts input in the form of a multiple FASTA file.
An example looks like this:
>PIK3C2B
MSSTQGNGEHWKSLESVGISRKELAMAEALQMEYDALSRLRHDKEENRAKQNADPSLISW
DEPGVDFYSKPAGRRTDLKLLRGLSGSDPTLNYNSLSPQEGPPNHSTSQGPQPGSDPWPK
>PPP3CA
MSEPKAIDPKLSTTDRVVKAVPFPPSHRLTAKEVFDNDGKPRVDILKAHLMKEGRLEESV
ALRIITEGASILRQEKNLLDIDAPVTVCGDIHGQFFDLMKLFEVGGSPANTRYLFLGDYV
>SYNJ1
MAFSKGFRIYHKLDPPPFSLIVETRHKEECLMFESGAVAVLSSAEKEAIKGTYSKVLDAY
GLLGVLRLNLGDTMLHYLVLVTGCMSVGKIQESEVFRVTSTEFISLRIDSSDEDRISEVR
Input can be given either as direct text input or by uploading a file.
Note: Input size is limited to restrict computation time and memory requirements.
The parameter constraints are: The input has to be in valid FASTA format. The number of sequences has to be at least 0 and at most 500. Sequence lengths have to be in the range 1-5000. The allowed sequence alphabet is 'GPAVLIMCFYWHKRQNEDST'. Either FASTA input or a UniProt ID list has to be provided. If an enabled filter requires UniProt IDs, each FASTA sequence header/name has to be either just a valid UniProt ID or a valid UniProt FASTA header.
Defaults to ()
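The constraints above can be checked before submitting a job. The following is a minimal sketch in Python, not the server's own validation code: the function names parse_fasta and check_input are hypothetical, and treating lowercase letters as invalid is an assumption.

```python
# Limits and alphabet as stated in the parameter constraints above.
ALLOWED = set("GPAVLIMCFYWHKRQNEDST")
MAX_SEQUENCES = 500
MAX_LENGTH = 5000


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from multi-FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first FASTA header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_input(text):
    """Parse the input and raise ValueError if a constraint is violated."""
    records = parse_fasta(text)
    if len(records) > MAX_SEQUENCES:
        raise ValueError("more than %d sequences" % MAX_SEQUENCES)
    for header, seq in records:
        if not 1 <= len(seq) <= MAX_LENGTH:
            raise ValueError("%s: length %d outside 1-%d" % (header, len(seq), MAX_LENGTH))
        bad = set(seq) - ALLOWED
        if bad:
            raise ValueError("%s: invalid characters %s" % (header, sorted(bad)))
    return records
```

Sequences that span several lines, as in the example above, are joined into one string per header before the length and alphabet checks.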
Protein UniProt IDs
Instead of providing protein sequences directly, you can provide
the UniProt IDs of the targeted proteins as input.
In this case, the protein sequences will be downloaded
automatically from the UniProt database. Multiple UniProt IDs,
separated by any of the separators listed below, are also accepted.
An example looks like this:
O00750, Q08209, O43426
The parameter constraints are: Has to be a list of 0-500 UniProt IDs that are separated by ([,\.;: \t\n]|\r\n). Access to the UniProt database is needed. The value has to match against the regular expression '^[^;'"]*$'.
Defaults to ()
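As an illustration of the separator rule, an ID list can be split as sketched below in Python. This is not the server's parser: the function name split_uniprot_ids is hypothetical, and collapsing runs of separators (such as ", ") into a single split point is an assumption.

```python
import re

# Separator pattern from the constraints above; the trailing "+" (an assumption)
# treats a run of separators such as ", " as a single split point.
SEPARATORS = re.compile(r"(?:[,.;: \t\n]|\r\n)+")
MAX_IDS = 500


def split_uniprot_ids(text):
    """Split a raw ID list into individual UniProt IDs, dropping empty tokens."""
    ids = [token for token in SEPARATORS.split(text) if token]
    if len(ids) > MAX_IDS:
        raise ValueError("more than %d UniProt IDs" % MAX_IDS)
    return ids


print(split_uniprot_ids("O00750, Q08209, O43426"))
```

The sketch only splits the list; it does not check that each token is a syntactically valid UniProt ID.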
SH3 Domains
List of all SH3 protein domains available for an interaction screening.
The parameter constraints are: The value has to match against the regular expression '^[^;'"]*$'. Only protein domains from the following list are allowed : ABL, AMPHIPHYSIN_I, AMPHIPHYSIN_II, ARGBP1, ARGBP2a-1, ARGBP2a-2, ARGBP2a-3, ARHGAP12, ARHGEF16, BOG25, BTK, CAP-1, CAP-2, CIN85-1, CIN85-3, COOL1, CSK, DDEF2, DLG1, DOCK1, ENDOPHB1, ENDOPHILIN1, ENDOPHILIN2, ENDOPHILIN3, EPS8, EPS8L2, FGR, FISH, FNBP1L, FRK, FYN, GRAP2-1, GRAP2-2, GRB2-1, GRB2-2, HCK, INTERSECTIN1-1, INTERSECTIN1-2, INTERSECTIN1-3, INTERSECTIN1-4, INTERSECTIN1-5, IRSP53, LCK, LYN, MLK3, MPP1, MYO7A, NCF1, NCK2, NPHP1, OSTF1, P51NOX, PAC2, PAC3, PIK3R1, PLCG1, RASGAP, RIMB1-2, RIMB1-3, RUSC1, SH3PX3, SNX18, SNX9, SRC, STAM1, STAM2, TUBA-1, TUBA-3, TUBA-6.
Defaults to ()
Step size
To retrieve the peptide sequences, the query
proteins are scanned automatically with a window size of 15
and the user-defined step size.
The parameter constraints are: Input value has to be parsable as Integer. The value must be greater than or equal to 1.
Defaults to (5)
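The window scan can be pictured with the following minimal Python sketch. It is an illustration under stated assumptions, not SH3PepInt's implementation: the function name peptide_windows is hypothetical, positions are reported 1-based, and trailing residues that do not fill a complete window are skipped (how the tool treats them is not stated here).

```python
WINDOW_SIZE = 15  # fixed peptide length used by the window analysis


def peptide_windows(sequence, step=5, window=WINDOW_SIZE):
    """Return (start, end, peptide) tuples with 1-based, inclusive positions."""
    if step < 1:
        raise ValueError("step size must be greater than or equal to 1")
    result = []
    # Assumption: only complete windows are kept.
    for start in range(0, len(sequence) - window + 1, step):
        result.append((start + 1, start + window, sequence[start:start + window]))
    return result


for start, end, peptide in peptide_windows("ABCDEFGHIJKLMNOPQRSTUVWXYZ", step=5):
    print(start, end, peptide)
```

A smaller step size yields more, heavily overlapping peptides per protein; a step size of 15 yields non-overlapping peptides.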
Filters
Proline-rich peptides
This filter has been introduced to find potential proline-rich binding
motifs using 31 regular expressions that are commonly used to describe
most of the SH3 domain binding specificity (see also this article).
The parameter constraints are: Input value has to be parsable as Boolean.
Defaults to (true)
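How such a regular-expression filter works can be sketched in Python as follows. The 31 expressions used by SH3PepInt are not listed on this page, so the sketch substitutes the two canonical SH3 binding motif classes, class I ([RK]xxPxxP) and class II (PxxPx[RK]), purely for illustration. The names MOTIFS, matching_motifs and is_proline_rich are hypothetical.

```python
import re

# Illustrative stand-ins, NOT the 31 expressions used by SH3PepInt:
# the two canonical SH3 binding motif classes.
MOTIFS = {
    "class I": re.compile(r"[RK]..P..P"),
    "class II": re.compile(r"P..P.[RK]"),
}


def matching_motifs(peptide):
    """Return the names of all motifs found anywhere in the peptide."""
    return [name for name, pattern in MOTIFS.items() if pattern.search(peptide)]


def is_proline_rich(peptide):
    """A peptide passes the filter if at least one motif matches."""
    return bool(matching_motifs(peptide))


print(matching_motifs("AAARPLPPLPAAAAA"))
print(is_proline_rich("GGGGGGGGGGGGGGG"))
```

With the filter enabled, peptides for which no expression matches would be discarded before the interaction prediction.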
Cellular localization (needs UniProt IDs)
This filter is based on the terms of the sub-cellular
localization hierarchy in the controlled
vocabulary of the
Gene Ontology database
(January, 2013). In
case of multiple cellular locations, a peptide is
considered viable for interaction if it shares at least
one term with the domain. This filter is not applicable
to domains without annotated sub-cellular localizations.
UniProt IDs of the query proteins are mandatory for using this
filter: within the FASTA input, either provide ONLY the UniProt ID as
the FASTA header OR use the header encoding from the UniProt
database; alternatively, just provide the UniProt IDs for an
automated download of the query sequences.
The parameter constraints are: Input value has to be parsable as Boolean.
Defaults to (false)
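The "shares at least one term" rule and the two accepted header forms can be sketched in Python as below. This is an illustration, not the server's code: both function names are hypothetical, the GO annotations in the usage lines are made up for the example, the entry name in the sample header is a placeholder, and keeping every peptide for a domain without annotation is an assumption about what "not applicable" means.

```python
def uniprot_id_from_header(header):
    """Extract the accession from a bare ID or a UniProt-style 'db|ID|NAME ...' header."""
    parts = header.split("|")
    if len(parts) >= 3 and parts[0] in ("sp", "tr"):
        return parts[1]
    return header.strip()


def shares_location(protein_terms, domain_terms):
    """True if the protein and the domain share at least one localization term."""
    if not domain_terms:
        # Assumption: without domain annotation the filter is skipped,
        # so the peptide is kept.
        return True
    return bool(set(protein_terms) & set(domain_terms))


# Hypothetical annotations (the entry name EXAMPLE_HUMAN is a placeholder).
print(uniprot_id_from_header("sp|O00750|EXAMPLE_HUMAN example description"))
print(shares_location({"GO:0005737", "GO:0005886"}, {"GO:0005737"}))
print(shares_location({"GO:0005634"}, {"GO:0005737"}))
```

The identifiers used here are the Gene Ontology terms for cytoplasm (GO:0005737), nucleus (GO:0005634) and plasma membrane (GO:0005886).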
Output Description
The output table summarizes all predicted protein-domain interactions.
Detailed descriptions:
- 1. Input sequence ID
- 2. Binding region in the protein
- 3. Binding sequence
- 4. Binding domain(s) for the protein
Input Examples
SH3 interactions
This is an example of the interaction prediction for the human
SH3 domains of the ABL, LCK and PIK3R1 proteins with the target proteins
PIK3C2B, PPP3CA and SYNJ1.
The example's result can be directly accessed here
List of Changes
- 3.2.0 : SH3PepInt v1.0.0 online