psite_annotation.annotators.DomainAnnotator
- class psite_annotation.annotators.DomainAnnotator(annotation_file)
Bases: object
Annotate pandas dataframe with domains from UniProt.
Requires ‘Matched proteins’, ‘Start positions’ and ‘End positions’ columns in the dataframe to be annotated. These columns can be generated with PeptidePositionAnnotator().
Example
annotator = DomainAnnotator(<path_to_annotation_file>)
annotator.load_annotations()
df = annotator.annotate(df)
Initialize the input files and options for DomainAnnotator.
- Parameters:
annotation_file (Union[str, IO]) – comma separated file with domains and their positions within the protein
Methods
annotate(df, inplace=False) – Adds column with domains the peptide overlaps with.
load_annotations() – Reads in comma separated file with domain annotations extracted from ProteomicsDB.
- annotate(df, inplace=False)
Adds column with domains the peptide overlaps with.
Adds the following annotation columns to dataframe:
Domains = semicolon separated list of domains that overlap with the peptide
- Parameters:
df (DataFrame) – pandas dataframe with ‘Matched proteins’, ‘Start positions’ and ‘End positions’ columns
inplace (bool) – Whether to modify the DataFrame rather than creating a new one.
- Returns:
annotated dataframe
- Return type:
pd.DataFrame
- Required columns:
Matched proteins, Start positions, End positions
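To illustrate what "domains that overlap with the peptide" can mean, here is a minimal standalone sketch that does not call psite_annotation. The four-column annotation layout (protein ID, domain name, start, end), the helper names load_domains and overlapping_domains, and the use of closed intervals are all assumptions made for illustration; the library's actual file format and boundary handling may differ.

```python
import csv
import io

# Hypothetical annotation layout: protein ID, domain name, start, end.
ANNOTATION_CSV = """P12345,Kinase,10,50
P12345,SH2,60,90
Q99999,PH,5,40
"""


def load_domains(handle):
    """Group (domain, start, end) tuples by protein ID."""
    domains = {}
    for protein, name, start, end in csv.reader(handle):
        domains.setdefault(protein, []).append((name, int(start), int(end)))
    return domains


def overlapping_domains(domains, protein, pep_start, pep_end):
    """Return a semicolon separated list of domains overlapping the peptide.

    Assumes closed intervals: two ranges overlap when each one starts
    no later than the other one ends.
    """
    hits = [
        name
        for name, start, end in domains.get(protein, [])
        if pep_start <= end and start <= pep_end
    ]
    return ";".join(hits)


domains = load_domains(io.StringIO(ANNOTATION_CSV))
# Peptide spanning positions 45-65 touches both domains of P12345.
result_both = overlapping_domains(domains, "P12345", 45, 65)
# Peptide at 51-59 falls in the gap between the two domains.
result_none = overlapping_domains(domains, "P12345", 51, 59)
```

In this sketch a peptide with no overlapping domain yields an empty string; how the library represents that case is not stated in this documentation.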
- load_annotations()
Reads in comma separated file with domain annotations extracted from ProteomicsDB.
- Return type:
None