DeepPhospho


Prediction of fragment ion intensity and retention time for phosphopeptides


How To

I. Peptide format as an input

DeepPhospho supports the following modifications:

  • Phosphorylation on S/T/Y
  • Acetylation at peptide N-term
  • Oxidation on M
  • Fixed modification of carbamidomethylation on all C

We provide functions for converting between peptide formats. The five supported formats for modified peptides are listed below (other formats may also work if they follow the same rules for placing modifications):

  • SN13 format: Spectronaut 13 and later (modifications written with their full names, enclosed in square brackets):
    • [Phospho (STY)] for Phospho (STY)
    • [Oxidation (M)] for Oxidation (M)
    • [Acetyl (Protein N-term)] for Acetyl (N-term)
    • Example: _[Acetyl (Protein N-term)]M[Oxidation (M)]LSLRVPLAPITDPQQLQLS[Phospho (STY)]PLK_
  • MQ1.5 format: MaxQuant 1.5 and earlier (modifications written as lowercase abbreviations in parentheses):
    • (ph) for Phospho (STY)
    • (ox) for Oxidation (M)
    • (ac) for Acetyl (N-term)
    • Example: _(ac)GS(ph)QDM(ox)GS(ph)PLRET(ph)RK_
  • MQ1.6 format: MaxQuant 1.6 and later (modifications written with their full names, enclosed in parentheses):
    • (Phospho (STY)) for Phospho (STY)
    • (Oxidation (M)) for Oxidation (M)
    • (Acetyl (Protein N-term)) for Acetyl (N-term)
    • Example: _(Acetyl (Protein N-term))TM(Oxidation (M))DKS(Phospho (STY))ELVQK_
  • Comet format: Comet style (modifications annotated with symbols):
    • S@/T@/Y@ for Phospho (STY)
    • M* for Oxidation (M)
    • n# for Acetyl (N-term)
    • Example: n#DFM*SPKFS@LT@DVEY@PAWCQDDEVPITM*QEIR
  • DP format: DeepPhospho style, the form used as DeepPhospho input (modified residues are replaced by integers):
    • 2/3/4 for S/T/Y with Phospho (STY), respectively
    • 1 for Oxidation (M)
    • * as the first character of the peptide for Acetyl (N-term)
    • @ as the first character of the peptide when the N-terminus is not acetylated
    • Example: *1ED2MCLK = (ac)M(ox)EDS(ph)MCLK
    • Example: @Q3D2MCLK = QT(ph)DS(ph)MCLK
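
The mapping rules above can be sketched in a few lines of Python. This is a minimal illustration, not the converter used by the server: the function name to_dp_format is hypothetical, it handles only the SN13, MQ1.5, and MQ1.6 styles (Comet is left out), and it assumes that each modification tag directly follows the residue it modifies.

```python
import re

# Modification tags as written in the SN13, MQ1.6, and MQ1.5 styles.
ACETYL_TAGS = ("[Acetyl (Protein N-term)]", "(Acetyl (Protein N-term))", "(ac)")
OXIDATION_TAGS = ("[Oxidation (M)]", "(Oxidation (M))", "(ox)")
PHOSPHO_TAGS = ("[Phospho (STY)]", "(Phospho (STY))", "(ph)")

# DP format integers for phosphorylated S/T/Y.
PHOSPHO_CODE = {"S": "2", "T": "3", "Y": "4"}


def to_dp_format(peptide: str) -> str:
    """Convert an SN13/MQ1.5/MQ1.6 modified peptide to DP format (hypothetical helper)."""
    seq = peptide.strip().strip("_")  # drop the flanking underscores

    # N-terminal state: "*" if acetylated, "@" otherwise.
    prefix = "@"
    for tag in ACETYL_TAGS:
        if seq.startswith(tag):
            prefix = "*"
            seq = seq[len(tag):]
            break

    # Oxidized M becomes "1".
    for tag in OXIDATION_TAGS:
        seq = seq.replace("M" + tag, "1")

    # Phosphorylated S/T/Y become "2"/"3"/"4".
    for tag in PHOSPHO_TAGS:
        for residue, code in PHOSPHO_CODE.items():
            seq = seq.replace(residue + tag, code)

    # Anything left besides residues and DP integers is an unsupported annotation.
    if not re.fullmatch(r"[A-Z1-4]+", seq):
        raise ValueError(f"Unsupported annotation in peptide: {peptide!r}")
    return prefix + seq


print(to_dp_format("_(ac)M(ox)EDS(ph)MCLK_"))
print(to_dp_format("_QT(ph)DS(ph)MCLK_"))
```

The two calls reproduce the DP examples listed above (*1ED2MCLK and @Q3D2MCLK). Carbamidomethylation on C is a fixed modification, so the sketch expects no tag for it.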

II. Trained DeepPhospho models

We provide four individual models. Each was pre-trained on multiple phosphoproteomics datasets and then fine-tuned on a specific DIA/DDA MS dataset. The sample source and data acquisition conditions for each fine-tuning dataset are listed below:

  • U2OS DIA model
    • Sample: U2OS cell line
    • Instrument: Fusion Lumos
    • MS method: MS1: 350-1650 m/z, 120k resolution, 2e5 AGC, 100 ms IT; MS2: 200-1800 m/z, 30k resolution, 5e5 AGC, 50 ms IT, 28% NCE
    • LC condition: EASY-nLC 1200, 75 μm × 30 cm PicoFrit
  • RPE1 DIA model
    • Sample: RPE1 cell line
    • Instrument: QE HF-X
    • MS method: MS1: 472-1145 m/z, 120k resolution, 3e6 AGC, 45 ms IT; MS2: 200-2000 m/z, 15k resolution, 3e6 AGC, 22 ms IT, 25% NCE
    • LC condition: EASY-nLC 1200, 75 μm × 15 cm Reprosil-Pur C18
  • Dilution DIA model
    • Sample: HeLa + BY4742
    • Instrument: QE HF-X
    • MS method: MS1: 350-1400 m/z, 120k resolution, 3e6 AGC, 45 ms IT; MS2: 200+ m/z, 15k resolution, 3e6 AGC, 22 ms IT, 25% NCE
    • LC condition: EASY-nLC 1200, 75 μm × 15 cm Reprosil-Pur C18
  • RPE1 DDA model
    • Sample: RPE1 cell line
    • Instrument: QE + QE HF
    • MS method: MS1: 350-1400 m/z, 60k resolution, 3e6 AGC, 25 ms IT; MS2: 200+ m/z, 15k resolution, 1e5 AGC, 22 ms IT, 28% NCE
    • LC condition: EASY-nLC 1200, 75 μm × 15 cm Reprosil-Pur C18

III. Prediction for a single phosphopeptide

For a single phosphopeptide, fragment ion intensity prediction requires the input of both "Peptide Sequence" and "Charge"; iRT prediction only requires the input of "Peptide Sequence" in the correct format.


IV. Prediction in a batch mode and iRT calibration

Batch submission requires a tab-separated input file with columns titled "InputPeptide", "PrecCharge", and "CalibrationRT". The first two columns are required for predicting both fragment ion intensities and iRT values. The third is needed only for iRT calibration: when it is provided, the predicted iRT values are re-calibrated with a quadratic polynomial fit to match the experimental scale given in "CalibrationRT". (For this purpose, we recommend the offline app, which supports fine-tuning and gives much more accurate predictions.) We recommend including at least 7 peptides with measured (i)RTs in the input file for (i)RT calibration. The output file downloaded in batch mode is a ready-to-use spectral library for DIA data mining.
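
As a rough illustration of the batch input and the quadratic re-calibration described above, the following Python sketch writes a tab-separated file with the three column names and fits a quadratic by ordinary least squares. It is not the server's implementation. The helper names (write_batch_input, fit_quadratic, recalibrate) are hypothetical, the numbers are made up, and two points are assumptions: that "CalibrationRT" may be left empty for peptides without a measured RT, and that the fit maps predicted iRT to the experimental scale.

```python
import csv
import io


def write_batch_input(rows, handle):
    """Write (peptide, charge, measured_rt_or_None) rows as a tab-separated table."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["InputPeptide", "PrecCharge", "CalibrationRT"])
    for peptide, charge, rt in rows:
        # Assumption: an empty cell marks a peptide without a measured RT.
        writer.writerow([peptide, charge, "" if rt is None else rt])


def fit_quadratic(x, y):
    """Least-squares fit of y ~ a*x^2 + b*x + c; returns (a, b, c)."""
    s = [sum(xi ** k for xi in x) for k in range(5)]                   # s[k] = sum of x^k
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]  # t[k] = sum of y*x^k
    # Augmented normal equations for the unknowns (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def recalibrate(predicted_irt, coef):
    """Map a predicted iRT onto the experimental scale with the fitted quadratic."""
    a, b, c = coef
    return a * predicted_irt ** 2 + b * predicted_irt + c


# Made-up values for seven calibration peptides, purely for illustration.
predicted = [-20.0, 0.0, 15.0, 40.0, 65.0, 90.0, 120.0]
measured = [5.1, 12.3, 18.0, 27.9, 38.2, 49.0, 62.5]
coef = fit_quadratic(predicted, measured)
calibrated = [recalibrate(p, coef) for p in predicted]

buffer = io.StringIO()
write_batch_input([("@Q3D2MCLK", 2, 27.9), ("*1ED2MCLK", 3, None)], buffer)
print(buffer.getvalue())
```

With NumPy available, numpy.polyfit(predicted, measured, 2) computes the same kind of least-squares fit; the pure-Python version is used here only to keep the sketch dependency-free.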

About

I. DeepPhospho for DIA phosphoproteomics

DeepPhospho is a hybrid deep neural network that combines LSTM and transformer modules. It was developed specifically for accurate prediction of fragment ion intensity and indexed retention time (iRT) for any given phosphopeptide. In our published work (doi: 10.1038/s41467-021-26979-1), we built a new DIA phosphoproteomics workflow leveraging DeepPhospho predicted libraries, which substantially expanded phosphoproteome coverage while maintaining high quantification performance compared to the gold-standard experimental DDA library.


II. Generation of predicted libraries by DeepPhospho

Using DeepPhospho, an experimental project-specific DDA library or direct DIA library can be converted to a predicted DDA library or a predicted DIA library. A predicted library can also be generated from public phosphoproteome or phosphosite databases, or from external phosphoproteomics data. In our published work, we systematically compared the performance of a series of DeepPhospho predicted libraries with different compositions and built from different data sources.



III. DeepPhospho web server

To facilitate user access to predictions and library generation, DeepPhospho is provided as a web server. On the START page, users can predict MS/MS spectra and iRT values for either a single phosphopeptide or a batch of phosphopeptides with defined sequences and charge states. In batch mode, after entering the phosphopeptide information, users can download a .txt file as a ready-to-use spectral library for DIA data mining.

In this web server we provide four DeepPhospho models fine-tuned with specific DDA/DIA MS datasets that were acquired from different sample sources and under different LC conditions or MS settings. These trained models can make accurate predictions for phosphopeptides analyzed under similar instrument conditions. For data acquired under different conditions, we recommend that users download and explore our user-friendly and more flexible DeepPhospho pipeline from the GitHub repository (https://github.com/weizhenFrank/DeepPhospho).


IV. DeepPhospho offline app

We have created an offline DeepPhospho app for users who are interested in transfer learning with their own datasets. This offline app is a full wrapper of our DeepPhospho pipeline and can be easily launched on a desktop. It allows users to use the pre-trained model directly, train a new model, or fine-tune the model parameters with their own target datasets before making predictions with a selected model. The offline app generates a ready-to-use spectral library as its output file. It currently supports model training from result files in the Spectronaut library format or the MaxQuant msms format, and it supports fragment ion intensity and iRT prediction from input files in four different formats (Spectronaut style, MaxQuant style, Comet style, and the DeepPhospho self-defined style specified in the DeepPhospho web interface). The offline DeepPhospho app can be downloaded from the GitHub repository (https://github.com/weizhenFrank/DeepPhospho).