Protein table format
Use cases
The Protein table is a Parquet file that contains the details of the identified and quantified proteins.
Store proteins identified and quantified from an mzTab file, with the corresponding abundance and search engine scores.
Enable easy visualization and scanning at the protein level.
Format
protein_accession: A list of the protein’s accessions -> list[string] (e.g. [P02768, P02769])
best_id_score: The best search engine score for the given protein, as a key-value pair -> string
abundance: The protein’s abundance as measured in the given sample -> float
sample_accession: A unique sample accession corresponding to the source name in the SDRF -> string
global_qvalue: Global q-value from quantms -> double
is_decoy: Indicates whether the protein is a decoy -> boolean (0/1)
Optional fields:
gene_accessions: A list of gene accessions -> list[string] (e.g. [ENSG00000139618, ENSG00000139618])
gene_names: A list of gene names -> list[string] (e.g. [APOA1, APOA1])
number_of_peptides: Number of peptides for the protein in the given sample (sample_accession) -> int
number_of_psms: Number of PSMs for the protein in the given sample (sample_accession) -> int
number_of_unique_peptides: Number of unique peptides for the protein in the given sample (sample_accession) -> int
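The field list above can be pictured as one record per protein and sample. The following is a minimal sketch in plain Python, not part of the format itself: the names `ProteinRow` and `validate_row` are hypothetical, the `best_id_score` value is a placeholder because the exact key-value encoding is not specified here, and the checks (a non-empty accession list, a q-value in [0, 1], non-negative counts) are assumptions about what a sensible row looks like.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProteinRow:
    """One row of the Protein table (hypothetical in-memory model)."""

    # Required fields
    protein_accession: List[str]
    best_id_score: str          # key-value pair stored as a string
    abundance: float
    sample_accession: str
    global_qvalue: float
    is_decoy: bool
    # Optional fields: None means the column is absent for this row
    gene_accessions: Optional[List[str]] = None
    gene_names: Optional[List[str]] = None
    number_of_peptides: Optional[int] = None
    number_of_psms: Optional[int] = None
    number_of_unique_peptides: Optional[int] = None


def validate_row(row: ProteinRow) -> List[str]:
    """Return a list of problems found in one row (empty if none)."""
    problems = []
    # Assumption: every row names at least one protein accession.
    if not row.protein_accession:
        problems.append("protein_accession must contain at least one accession")
    # A q-value is a rate, so it should lie between 0 and 1.
    if not 0.0 <= row.global_qvalue <= 1.0:
        problems.append("global_qvalue should lie in [0, 1]")
    # Counts, when present, cannot be negative.
    for name in ("number_of_peptides", "number_of_psms", "number_of_unique_peptides"):
        value = getattr(row, name)
        if value is not None and value < 0:
            problems.append(f"{name} must be non-negative")
    return problems


row = ProteinRow(
    protein_accession=["P02768", "P02769"],
    best_id_score="hypothetical_score: 0.001",  # placeholder encoding
    abundance=1234.5,
    sample_accession="Sample 1",
    global_qvalue=0.005,
    is_decoy=False,
    gene_names=["APOA1"],
)
print(validate_row(row))
```

Optional fields default to `None` here so that a row can be built from the required fields alone, mirroring the split between required and optional columns.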