psite_annotation.addSiteSequenceContext
- psite_annotation.addSiteSequenceContext(df, fastaFile, pspInput=False, context_left=15, context_right=15, retain_other_mods=False, return_unique=False, return_sorted=False, organism='human')
Annotate pandas dataframe with sequence context of a p-site.
Adds the following annotation columns to dataframe:
‘Site sequence context’ = sequence window around each of the modified sites (by default +/- 15 amino acids, set by context_left and context_right), with multiple sites separated by semicolons
- Required columns:
Site positions
- Parameters:
df (DataFrame) – pandas dataframe with ‘Site positions’ column
fastaFile (str) – fasta file containing protein sequences
pspInput (bool) – set to True if the fasta file was obtained from PhosphoSitePlus
context_left (int) – number of amino acids to the left of the modification to include
context_right (int) – number of amino acids to the right of the modification to include
retain_other_mods (bool) – retain other modifications from the modified peptide in the sequence context, written in lower case
return_unique (bool) – eliminate duplicated sequences from the ‘Site sequence context’ column, not preserving the order between this column and the rest of the dataframe
return_sorted (bool) – sort the sequences from the ‘Site sequence context’ column alphabetically, not preserving the order between this column and the rest of the dataframe
- Returns:
annotated dataframe
- Return type:
pd.DataFrame
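
To make the meaning of context_left, context_right, return_unique and return_sorted more concrete, the following is a minimal, self-contained sketch of the windowing idea. It is not the implementation used by psite_annotation. The helper names `site_context` and `join_contexts`, the 1-based site position, and the `_` padding character at the protein termini are all assumptions made for illustration.

```python
def site_context(sequence, position, left=15, right=15, pad="_"):
    """Return the residues around a site, padded at the protein termini.

    Assumptions (not taken from psite_annotation): `position` is 1-based
    and positions beyond the termini are filled with `pad`.
    """
    idx = position - 1  # convert the 1-based position to a 0-based index
    if not 0 <= idx < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    # Pad both ends so the window never runs off the sequence.
    padded = pad * left + sequence + pad * right
    # In `padded` the site sits at idx + left, so the window starts at idx.
    return padded[idx : idx + left + right + 1]


def join_contexts(contexts, unique=False, sort=False):
    """Join several site contexts with semicolons.

    `unique` drops repeated sequences and `sort` orders them alphabetically,
    mirroring the documented intent of return_unique and return_sorted.
    Either option breaks the one-to-one order with the input sites.
    """
    items = list(contexts)
    if unique:
        items = list(dict.fromkeys(items))  # keep the first occurrence only
    if sort:
        items = sorted(items)
    return ";".join(items)


# Hypothetical toy protein; the serine (S) is residue 11.
protein = "MKTAYIAKQRSPEL"
windows = [site_context(protein, pos, left=3, right=3) for pos in (11, 1, 11)]
print(join_contexts(windows))
print(join_contexts(windows, unique=True, sort=True))
```

With the library itself, the corresponding call would take the dataframe and a fasta file path, as in the signature above; the exact padding and formatting of its output should be checked against the installed version.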