psite_annotation.annotators.ModifiedSequenceGroupAnnotator
- class psite_annotation.annotators.ModifiedSequenceGroupAnnotator(match_tolerance=2)
Bases: object
Annotate pandas dataframe with modified sequence groups where localizations are within match_tolerance of each other.
Example
from psite_annotation.annotators import ModifiedSequenceGroupAnnotator
annotator = ModifiedSequenceGroupAnnotator()
df = annotator.annotate(df)
Initialize the options for ModifiedSequenceGroupAnnotator.
- Parameters:
match_tolerance (int) – group all modifiable positions that lie within match_tolerance positions of a modified site. Defaults to 2.
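To make the tolerance concrete, the sketch below lists the modifiable residues that lie within n positions of a modified site. It is an illustration only: the helper name positions_within_tolerance, the choice of S, T and Y as modifiable residues, and the 0-based indexing are assumptions, not the library's implementation.

```python
def positions_within_tolerance(sequence, modified_positions,
                               match_tolerance=2, modifiable="STY"):
    """Return sorted 0-based indices of modifiable residues near a modified site.

    Hypothetical helper for illustration; not part of psite_annotation.
    """
    nearby = set()
    for site in modified_positions:
        # Window of +/- match_tolerance residues, clipped to the sequence ends
        start = max(0, site - match_tolerance)
        end = min(len(sequence), site + match_tolerance + 1)
        for i in range(start, end):
            if sequence[i] in modifiable:
                nearby.add(i)
    return sorted(nearby)


# Modified site on the S at index 2 of "AASPTKSY"
print(positions_within_tolerance("AASPTKSY", [2]))  # prints [2, 4]
```

With the default tolerance of 2, the T at index 4 falls in the same window as the modified S, while the S at index 6 and the Y at index 7 do not.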
Methods
annotate – Group delocalized phospho-forms.
load_annotations
- load_annotations
- Return type:
None
- annotate(df, inplace=False)
Group delocalized phospho-forms.
This function identifies peptide sequences that differ only in the position of their phosphorylation group, written (ph), and collapses them into “delocalized” groups. Each group contains all modified sequence variants that share the same underlying peptide backbone.
The following columns are added to the dataframe:
‘Delocalized sequence’ = Canonical unmodified backbone with an index suffix to distinguish the number of modifications.
‘Modified sequence group’ = All peptide variants belonging to the same delocalized group, concatenated with semicolons.
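The two columns can be pictured with a simplified, standard-library sketch. The helper group_delocalized and the "_<count>" suffix format are hypothetical, and the sketch ignores match_tolerance entirely: it groups purely by backbone and number of (ph) marks, which is a simplification of what the annotator is described as doing.

```python
from collections import defaultdict

PHOSPHO = "(ph)"


def group_delocalized(modified_sequences):
    """Map each modified sequence to (delocalized key, semicolon-joined group).

    Hypothetical helper for illustration; not part of psite_annotation.
    """
    groups = defaultdict(list)
    for seq in modified_sequences:
        backbone = seq.replace(PHOSPHO, "")  # strip the phospho marks
        # Assumed suffix format: backbone plus the number of (ph) marks
        key = f"{backbone}_{seq.count(PHOSPHO)}"
        if seq not in groups[key]:
            groups[key].append(seq)

    result = {}
    for key, members in groups.items():
        joined = ";".join(sorted(members))
        for seq in members:
            result[seq] = (key, joined)
    return result


peptides = ["AAS(ph)PTK", "AASPT(ph)K", "AAS(ph)PT(ph)K"]
for seq, (delocalized, group) in group_delocalized(peptides).items():
    print(seq, delocalized, group)
```

Here the two singly phosphorylated variants share one group, while the doubly phosphorylated variant gets its own because its modification count differs.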
- Parameters:
df (DataFrame) – Input dataframe with a “Modified sequence” column containing peptide strings with (ph) annotations
inplace (bool) – add the new columns to df in place
- Returns:
Dataframe with the Delocalized sequence and Modified sequence group columns added
- Return type:
pd.DataFrame
- Required columns:
Modified sequence