5ULTRA generates a Tab-Separated Values (TSV) file. The columns included in the output depend on the "Annotation" and "Splice Analysis" options you select. Below is a detailed breakdown of each potential column.
These columns are included in all output files, regardless of the options selected.
Column Name | Description |
---|---|
CHROM, POS, ID, REF, ALT | Standard variant identifiers, same as in the input file. |
CSQ | The variant's predicted consequence on a uORF or Kozak sequence (e.g., uStart_gain/loss, uStop_gain/loss, uKozak, mKozak). |
Translation | Predicted overall effect on protein translation: "Increased", "Decreased", or "N-terminal Extension". |
5ULTRA_Score | The final variant prioritization metric, ranging from 0 (benign) to 1 (pathogenic). |
GENE | The official gene symbol associated with the transcript. |
TRANSCRIPT | The Ensembl transcript ID (ENST...) on which the annotation was based. |
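
As a minimal sketch of how the core columns might be used downstream, the snippet below reads a 5ULTRA-style TSV with Python's standard `csv` module and keeps variants whose `5ULTRA_Score` meets a cutoff, highest score first. The function name `rank_variants`, the 0.5 cutoff, and the sample rows are all hypothetical and chosen only for illustration; the header names are assumed to match the table above exactly.

```python
import csv
import io


def rank_variants(handle, min_score=0.5):
    """Return rows with 5ULTRA_Score >= min_score, highest score first.

    `min_score` is an arbitrary illustrative cutoff, not a recommended one.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    kept = []
    for row in reader:
        try:
            score = float(row["5ULTRA_Score"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric score
        if score >= min_score:
            kept.append((score, row))
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in kept]


# Made-up rows for illustration only; this is not real 5ULTRA output.
SAMPLE = (
    "CHROM\tPOS\tID\tREF\tALT\tCSQ\tTranslation\t5ULTRA_Score\tGENE\tTRANSCRIPT\n"
    "chr1\t100\t.\tA\tG\tuStart_gain\tDecreased\t0.91\tGENE_A\tENST_A\n"
    "chr2\t200\t.\tC\tT\tuKozak\tIncreased\t0.12\tGENE_B\tENST_B\n"
    "chr3\t300\t.\tG\tA\tuStop_loss\tDecreased\t0.67\tGENE_C\tENST_C\n"
)

ranked = rank_variants(io.StringIO(SAMPLE), min_score=0.5)
for row in ranked:
    print(row["GENE"], row["CSQ"], row["5ULTRA_Score"])
```

For a real file, pass an open file handle (for example `open(path, newline="")`) instead of the `io.StringIO` wrapper.
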
These columns are added to the output when "Splice Analysis" is switched On.
Column Name | Description |
---|---|
SpliceAI | The raw SpliceAI prediction scores for the variant. |
Splicing_CSQ | Predicted missplicing consequence on the 5′ UTR sequence (e.g., DG: new donor site, AG: new acceptor site). |
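
Because these two columns exist only when "Splice Analysis" is On, a script that consumes the output may want to check for them before use. The sketch below does that and then tallies the values of `Splicing_CSQ`. The helper name `tally_splicing_csq` and the sample rows are hypothetical, and the `.` placeholder in the `SpliceAI` column is an assumption, since the raw score format is not specified here.

```python
import csv
import io
from collections import Counter

# Column names as listed in the table above.
SPLICE_COLUMNS = ("SpliceAI", "Splicing_CSQ")


def tally_splicing_csq(handle):
    """Count Splicing_CSQ values, or return None if the splice columns are absent."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    if not all(name in header for name in SPLICE_COLUMNS):
        return None  # the file was presumably produced with Splice Analysis Off
    return Counter(row["Splicing_CSQ"] for row in reader)


# Made-up, trimmed rows for illustration only; "." stands in for the raw scores.
WITH_SPLICE = (
    "CHROM\tPOS\tGENE\tSpliceAI\tSplicing_CSQ\n"
    "chr1\t100\tGENE_A\t.\tDG\n"
    "chr2\t200\tGENE_B\t.\tAG\n"
    "chr3\t300\tGENE_C\t.\tDG\n"
)
WITHOUT_SPLICE = "CHROM\tPOS\tGENE\nchr1\t100\tGENE_A\n"

counts = tally_splicing_csq(io.StringIO(WITH_SPLICE))
absent = tally_splicing_csq(io.StringIO(WITHOUT_SPLICE))
print(counts)
print(absent)
```
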
These additional columns are appended to the output when "Annotation" is set to Full.
Column Name | Description |
---|---|
MANE | The matched NCBI transcript ID (e.g., NM_...) if the transcript is part of the MANE Select set. |
5UTR_START / 5UTR_END | Genomic coordinates of the entire 5′ UTR. |
STRAND | The DNA strand of the transcript (+ or -). |
5UTR_LENGTH | Total length of the 5′ UTR in base pairs. |
START_EXON | The exon number where the canonical Coding Sequence (CDS) begins. |
mKOZAK / mKOZAK_STRENGTH | The Kozak sequence and its calculated strength ("Strong", "Moderate", "Weak") around the main CDS start codon. |
uORF_count | Total number of upstream Open Reading Frames (uORFs) found in the transcript's 5′ UTR. |
Overlapping_count, etc. | Counts of specific uORF types (Overlapping, N-terminal, Non-Overlapping). |
uORF_START / uORF_END | Genomic coordinates of the specific uORF affected by the variant. |
Ribo_seq | Indicates whether there is ribosome profiling (Ribo-seq) evidence for translation of the uORF ("True", "False", or "New uORF" if the uORF is created by the variant). |
uSTART_mSTART_DIST / uSTART_CAP_DIST | Distance from the uORF start to the main CDS start and to the 5′ cap, respectively. |
uSTOP_CODON | The specific stop codon of the uORF (TAA, TGA, or TAG). |
uORF_TYPE | The type of the affected uORF (Non-overlapping, Overlapping, N-terminal extension). |
uKOZAK / uKOZAK_STRENGTH | The Kozak sequence and strength around the uORF's start codon. |
uORF_LENGTH / uORF_AA_LENGTH | The length of the uORF in nucleotides and amino acids. |
uORF_rank | The relative rank of the uORF based on its proximity to the main CDS start. |
uSTART_PHYLOP / uSTART_PHASTCONS | Conservation scores (PhyloP, PhastCons) for the uORF start codon. |
pLI / LOEUF | Gene-level constraint/intolerance metrics. |
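
With "Annotation" set to Full, the extra columns can be combined into simple filters. The sketch below keeps variants whose affected uORF has Ribo-seq support and a strong Kozak context. The helper name `supported_strong_uorfs`, the choice of filter, and the trimmed sample rows are hypothetical; the string values "True" and "Strong" are taken from the descriptions above and assumed to appear verbatim in the file.

```python
import csv
import io

# Columns this filter relies on; both are present only in Full annotation output.
REQUIRED = {"Ribo_seq", "uKOZAK_STRENGTH"}


def supported_strong_uorfs(handle):
    """Keep rows with Ribo_seq == "True" and uKOZAK_STRENGTH == "Strong"."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = REQUIRED - set(reader.fieldnames or [])
    if missing:
        raise ValueError(
            "missing columns (is Annotation set to Full?): " + ", ".join(sorted(missing))
        )
    return [
        row
        for row in reader
        if row["Ribo_seq"] == "True" and row["uKOZAK_STRENGTH"] == "Strong"
    ]


# Made-up, trimmed rows for illustration only; this is not real 5ULTRA output.
FULL_SAMPLE = (
    "GENE\tuORF_TYPE\tRibo_seq\tuKOZAK_STRENGTH\n"
    "GENE_A\tNon-overlapping\tTrue\tStrong\n"
    "GENE_B\tOverlapping\tFalse\tStrong\n"
    "GENE_C\tNon-overlapping\tNew uORF\tWeak\n"
)

hits = supported_strong_uorfs(io.StringIO(FULL_SAMPLE))
genes = [row["GENE"] for row in hits]
print(genes)
```

Raising an error when the columns are missing, rather than returning an empty list, makes it harder to mistake a basic-annotation file for one with no matching variants.
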