psite_annotation.annotators.KinaseLibraryAnnotator
- class psite_annotation.annotators.KinaseLibraryAnnotator(motifs_file, quantiles_file, top_n=5, score_cutoff=3, split_sequences=False, threshold_type='total', sort_type='total')
Bases: object
Annotate pandas dataframe with the highest-scoring kinases from the kinase library.
Johnson et al. 2023, https://doi.org/10.1038/s41586-022-05575-3
Requires “Site sequence context” column in the dataframe to be present. The “Site sequence context” column can be generated with PeptidePositionAnnotator().
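A minimal sketch of checking this requirement before annotating, using only pandas. The helper name `has_site_sequence_context` and the example column "Modified sequence" are hypothetical and not part of psite_annotation.

```python
import pandas as pd

# Column name taken from the requirement stated above.
REQUIRED_COLUMN = "Site sequence context"


def has_site_sequence_context(df: pd.DataFrame) -> bool:
    """Return True if the dataframe carries the column the annotator requires.

    Hypothetical helper for illustration; not part of psite_annotation.
    """
    return REQUIRED_COLUMN in df.columns


# Made-up dataframe without the required column.
df_missing = pd.DataFrame({"Modified sequence": ["AAAS(ph)PEK"]})

# Made-up dataframe with the required column present (value is illustrative).
df_ready = df_missing.assign(**{REQUIRED_COLUMN: ["placeholder context"]})

missing_ok = has_site_sequence_context(df_missing)
ready_ok = has_site_sequence_context(df_ready)
```

If the check fails, the column would first have to be generated, for example with PeptidePositionAnnotator() as noted above.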
Example
annotator = KinaseLibraryAnnotator(<path_to_motifs_file>, <path_to_quantiles_file>)
annotator.load_annotations()
df = annotator.annotate(df)
Initialize the input files and options for KinaseLibraryAnnotator.
- Parameters:
motifs_file – tab separated file with motifs and their identifiers
Methods
annotate(df[, inplace]) – Adds columns with the highest-scoring kinases for each site sequence context.
load_annotations() – Reads in tab separated file with motif and quantile annotations.
- annotate(df, inplace=False)
Adds columns with the highest-scoring kinases for each site sequence context.
Adds the following annotation columns to dataframe:
Motif Kinases = semicolon separated list of kinases that match with the site sequence contexts
Motif Scores = semicolon separated list of scores corresponding to Motif Kinases
Motif Percentiles = semicolon separated list of percentiles corresponding to Motif Kinases
Motif Totals = semicolon separated list of score*percentile corresponding to Motif Kinases
- Parameters:
df (DataFrame) – pandas dataframe with “Site sequence context” column
inplace (bool) – Whether to modify the DataFrame rather than creating a new one.
- Returns:
annotated dataframe
- Return type:
pd.DataFrame
- Required columns:
Site sequence context
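Because the four annotation columns are semicolon separated lists, downstream analysis often needs one row per site and kinase. The following is a sketch of that reshaping with plain pandas. The function name `explode_kinase_columns`, the kinase names, and all numeric values are made up for illustration. It assumes the four lists in a row have equal length and requires pandas 1.3 or newer for multi-column `explode`.

```python
import pandas as pd

# Column names as listed in the annotate() description above.
LIST_COLUMNS = ["Motif Kinases", "Motif Scores", "Motif Percentiles", "Motif Totals"]
NUMERIC_COLUMNS = ["Motif Scores", "Motif Percentiles", "Motif Totals"]


def explode_kinase_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Turn the semicolon separated annotation columns into one row per kinase.

    Hypothetical helper; assumes all four lists in a row have the same length.
    """
    out = df.copy()
    for col in LIST_COLUMNS:
        # Missing annotations become a single empty string so explode still aligns.
        out[col] = out[col].fillna("").astype(str).str.split(";")
    out = out.explode(LIST_COLUMNS, ignore_index=True)
    # Drop rows that had no kinase annotation at all.
    out = out[out["Motif Kinases"] != ""].reset_index(drop=True)
    for col in NUMERIC_COLUMNS:
        out[col] = pd.to_numeric(out[col], errors="coerce")
    return out


# Made-up annotated dataframe: one annotated site and one without matches.
annotated = pd.DataFrame(
    {
        "Site sequence context": ["context_1", "context_2"],
        "Motif Kinases": ["AKT1;PKACA", None],
        "Motif Scores": ["3.2;2.1", None],
        "Motif Percentiles": ["0.98;0.91", None],
        "Motif Totals": ["3.136;1.911", None],
    }
)

long_format = explode_kinase_columns(annotated)
```

The long format makes it straightforward to filter or rank kinases per site with ordinary pandas operations such as `groupby` or `sort_values`.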
- load_annotations()
Reads in tab separated file with motif and quantile annotations.
- Return type:
None