psite_annotation.addModifiedSequenceGroups

psite_annotation.addModifiedSequenceGroups(df, match_tolerance=2)

Annotate DataFrame with representative sequences from grouped localizations.

Requires a “Modified sequence” column to be present in the dataframe.

Adds the following annotation columns to the dataframe:

  • ‘Delocalized sequence’ = Canonical unmodified backbone with an index suffix to distinguish the number of modifications.

  • ‘Modified sequence group’ = All peptide variants belonging to the same delocalized group, concatenated with semicolons.
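The relationship between the two columns can be illustrated with a minimal pandas sketch. This is not the library's implementation: the helper `delocalize`, the `(ph)` modification marker, the underscore-flanked sequence format, and the `.N` suffix are all assumptions made for illustration, and the sketch ignores `match_tolerance` entirely.

```python
import pandas as pd


def delocalize(mod_seq: str) -> str:
    """Hypothetical helper: strip '(ph)' markers and append the modification count."""
    n_mods = mod_seq.count("(ph)")
    backbone = mod_seq.replace("(ph)", "").strip("_")
    # The '.N' suffix is an assumed format, chosen only for this sketch.
    return f"{backbone}.{n_mods}"


df = pd.DataFrame(
    {
        "Modified sequence": [
            "_AS(ph)PEPTIDEK_",
            "_ASPEPT(ph)IDEK_",
            "_AS(ph)PEPT(ph)IDEK_",
        ]
    }
)

# Same backbone and same number of modifications -> same delocalized sequence.
df["Delocalized sequence"] = df["Modified sequence"].map(delocalize)

# Concatenate all variants that share a delocalized sequence with semicolons.
df["Modified sequence group"] = df.groupby("Delocalized sequence")[
    "Modified sequence"
].transform(lambda s: ";".join(s))
```

In this sketch the two singly modified variants fall into one group, while the doubly modified peptide forms a group of its own.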

Example

import psite_annotation as pa

df = pa.addModifiedSequenceGroups(df)

Required columns:

Modified sequence

Parameters:
  • df (DataFrame) – pandas dataframe with a ‘Modified sequence’ column

  • match_tolerance (int) – group all modifiable positions within match_tolerance positions of modified sites (default: 2).
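One possible reading of the tolerance parameter is sketched below: find every modifiable residue that lies within a given distance of a modified site. The function `positions_near_modified_sites`, the `(ph)` marker, and the choice of S/T/Y as modifiable residues are assumptions for illustration only; the library's actual matching logic may differ.

```python
MODIFIABLE = set("STY")  # assumption: phospho-acceptor residues


def positions_near_modified_sites(mod_seq, match_tolerance=2):
    """Return 0-based backbone indices of S/T/Y within match_tolerance of a '(ph)' site."""
    seq = mod_seq.strip("_")
    backbone = []   # residues with modification markers removed
    modified = []   # backbone indices that carry a '(ph)' marker
    i = 0
    while i < len(seq):
        if seq.startswith("(ph)", i):
            # The marker refers to the residue immediately before it.
            modified.append(len(backbone) - 1)
            i += len("(ph)")
        else:
            backbone.append(seq[i])
            i += 1
    return [
        pos
        for pos, aa in enumerate(backbone)
        if aa in MODIFIABLE
        and any(abs(pos - m) <= match_tolerance for m in modified)
    ]
```

Under this reading, a larger tolerance makes more nearby residues candidates for the same localization group.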

Returns:

annotated and aggregated dataframe

Return type:

pd.DataFrame