Project file

The project file is a JSON file that stores the project metadata and links the different files of the project. It contains the following fields:

  • project_accession -> ProteomeXchange Identifier -> string

  • project_title -> Project title -> string

  • project_description -> Project description -> string

  • project_sample_description -> Sample description of the project -> string

  • project_data_description -> Data description of the project -> string

  • project_pubmed_id -> PubMed identifier -> string

  • organism -> List of organism names -> list[string]

  • organism_part -> List of organism parts -> list[string]

  • disease -> List of diseases -> list[string]

  • cell line -> List of cell lines (if available) -> list[string]

  • instrument -> List of instrument names -> list[string]

  • enzyme -> List of proteases used for digestion -> list[string]

  • experiment_type -> List of all keywords associated with the dataset in ProteomeXchange or PRIDE -> list[string]

  • acquisition_properties -> List of key value pairs for the acquisition properties (see example below) -> list[Key/Value]

  • quantms_files -> List of all files generated by quantms and collected in the final results folder -> list[Key/Value]

  • quantms_version -> Version of quantms used to generate the files -> string

  • comments -> List of comments or additional information needed -> list[string]
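The field types listed above lend themselves to a simple sanity check. The following is a minimal sketch, not part of quantms: the helper name check_project_fields is hypothetical, the field names are copied from the list above as written, and every field is treated as optional because the list does not say which ones are required.

```python
# Field names and types are copied from the list above; the checker is illustrative only.
STRING_FIELDS = [
    "project_accession", "project_title", "project_description",
    "project_sample_description", "project_data_description",
    "project_pubmed_id", "quantms_version",
]
STRING_LIST_FIELDS = [
    "organism", "organism_part", "disease", "cell line",
    "instrument", "enzyme", "experiment_type", "comments",
]


def check_project_fields(project):
    """Return a list of type problems found in a project dict (hypothetical helper)."""
    problems = []
    for name in STRING_FIELDS:
        # Only check fields that are present; absent fields are not reported.
        if name in project and not isinstance(project[name], str):
            problems.append(f"{name} should be a string")
    for name in STRING_LIST_FIELDS:
        value = project.get(name)
        if value is None:
            continue
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{name} should be a list of strings")
    return problems


# A PubMed identifier given as a number is reported, because the field is a string.
print(check_project_fields({"project_accession": "PXD004683", "project_pubmed_id": 123}))
```

The key/value fields (acquisition_properties and quantms_files) are left out of this check on purpose, since their layout is described separately.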

Key/Value pair object:

Key/value pairs store the acquisition properties and the quantms files. In the examples in this document, each pair is written as a single-entry JSON object that maps the key to its value. A key/value pair has the following fields:

  • key -> Key of the pair -> string

  • value -> Value of the pair -> string

Example of acquisition_properties:

"acquisition_properties": [
     {"precursor tolerance": "0.05 Da"},
     {"dissociation method": "HCD"}
]

The instrument and the enzyme are not included in acquisition_properties; they are recorded separately in the instrument and enzyme fields.
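As a rough illustration of how such a fragment could be produced, the sketch below builds the acquisition_properties list from a Python dict and serializes a partial project file with the standard json module. The helper to_key_value_list is hypothetical, and the instrument and enzyme values are placeholders rather than real metadata.

```python
import json


def to_key_value_list(pairs):
    """Turn a dict into a list of single-entry objects, as in the example above."""
    return [{key: value} for key, value in pairs.items()]


project = {
    "project_accession": "PXD004683",
    # Instrument and enzyme are kept out of acquisition_properties.
    "instrument": ["<instrument name>"],  # placeholder value
    "enzyme": ["<enzyme name>"],          # placeholder value
    "acquisition_properties": to_key_value_list({
        "precursor tolerance": "0.05 Da",
        "dissociation method": "HCD",
    }),
}

project_json = json.dumps(project, indent=2)
print(project_json)
```

Only a handful of fields are shown; a complete project file would carry the remaining fields from the list above.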

Quantms files

File names in a quantms project should follow this format:

{user_prefix}-{uuid}.{file_section}.{file_extension}

Example of quantms_files:

"quantms_files": [
     {"protein_file": "PXD004683-550e8400-e29b-41d4-a716-446655440000.protein.parquet"},
     {"peptide_file": "PXD004683-550e8400-e29b-41d4-a716-446655440000.peptide.parquet"},
     {"psm_file":     "PXD004683-550e8400-e29b-41d4-a716-446655440000.psm.parquet"},
     {"feature_file": "PXD004683-958e8400-e29b-41f4-a716-446655440000.feature.parquet"},
     {"differential_file": "PXD004683-958e8400-e29b-41f4-a716-446655440000.differential.tsv"},
     {"absolute_file":     "PXD004683-958e8400-e29b-41f4-a716-446655440000.absolute.tsv"},
     {"sdrf_file":         "PXD004683-958e8400-e29b-41f4-a716-446655440000.sdrf.tsv"}
]
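To make the naming pattern concrete, here is a small sketch that splits a file name into its four parts with a regular expression. The function parse_quantms_filename and the exact pattern are assumptions for illustration; in particular, the sketch assumes that the file section and the extension contain no dots.

```python
import re

# user_prefix, then a hyphen, then a UUID written as 8-4-4-4-12 hexadecimal digits,
# then ".file_section.file_extension".
FILENAME_PATTERN = re.compile(
    r"^(?P<user_prefix>.+)-"
    r"(?P<uuid>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})"
    r"\.(?P<file_section>[^.]+)\.(?P<file_extension>[^.]+)$"
)


def parse_quantms_filename(name):
    """Return the parts of a file name as a dict, or None if it does not match."""
    match = FILENAME_PATTERN.match(name)
    return match.groupdict() if match else None


parts = parse_quantms_filename(
    "PXD004683-550e8400-e29b-41d4-a716-446655440000.protein.parquet"
)
print(parts["user_prefix"], parts["file_section"], parts["file_extension"])
```

A name without a UUID, such as PXD004683.protein.parquet, does not match and yields None.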

uuids: A Universally Unique Identifier (UUID), as defined in RFC 4122, is a 128-bit identifier designed to be globally unique across systems and applications without central coordination. In its text form, a UUID is written as five groups of hexadecimal digits separated by hyphens (8-4-4-4-12). Time-based (version 1) UUIDs encode a timestamp, a clock sequence, and a node identifier, whereas random (version 4) UUIDs are built from random bits. In both cases a generated UUID is highly unlikely to collide with any other UUID, even when produced by different entities and systems.

To generate file names using UUIDs in a programming language like Python, you can utilize the uuid module that provides functions to create UUIDs. Here’s an example of how you could generate and format UUID-based file names:

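import uuid

def generate_uuid_filename(user_prefix, file_section, file_extension):
    # uuid4() generates a random (version 4) UUID
    return f"{user_prefix}-{uuid.uuid4()}.{file_section}.{file_extension}"

# Generate and print a UUID-based file name
print("Generated UUID filename:", generate_uuid_filename("PXD004683", "psm", "parquet"))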

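In this Python code snippet, the generate_uuid_filename function creates a random (version 4) UUID using the uuid4 function. The UUID in quantms file names is written as five groups of hexadecimal digits separated by hyphens. Because it is random, these groups do not encode a timestamp, a clock sequence, or a node identifier; those components appear only in time-based (version 1) UUIDs.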
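The layout of a UUID can be checked directly with the standard uuid module. This short sketch prints the length of each hyphen-separated group and compares the version numbers reported for a random UUID and a time-based one; the variable names are illustrative.

```python
import uuid

random_id = uuid.uuid4()                # random UUID
groups = str(random_id).split("-")
group_lengths = [len(group) for group in groups]
print(group_lengths)                    # [8, 4, 4, 4, 12]
print(random_id.version)                # 4

time_based_id = uuid.uuid1()            # built from a timestamp, clock sequence and node
print(time_based_id.version)            # 1
```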

file_sections: File sections identify the type of file. The file sections that appear in the quantms_files example above are protein, peptide, psm, feature, differential, absolute, and sdrf.

Sample table

Only the SDRF format used to analyze the data with quantms is provided here. The SDRF file is a tab-delimited file that stores the metadata of the samples and links the different files of the project.

Read more about SDRF here.
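Since the SDRF file is tab-delimited, it can be read with Python's standard csv module. The sketch below uses a tiny in-memory table in place of a real file; the column names and all values are assumptions made for illustration, and read_sdrf is a hypothetical helper.

```python
import csv
import io

# Tiny in-memory stand-in for an SDRF file; columns and values are illustrative only.
sdrf_text = (
    "source name\tcharacteristics[organism]\tcomment[data file]\n"
    "sample 1\tHomo sapiens\trun_01.raw\n"
    "sample 2\tHomo sapiens\trun_02.raw\n"
)


def read_sdrf(handle):
    """Read a tab-delimited SDRF table into a list of row dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


rows = read_sdrf(io.StringIO(sdrf_text))
# Link each sample to its data file, as described in the paragraph above.
files_by_sample = {row["source name"]: row["comment[data file]"] for row in rows}
print(files_by_sample)
```

For a file on disk, the same function can be called with a handle from open(path, newline="").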