Case Study¶

This case study demonstrates the application of the TRIAGE analysis pipeline using a publicly available single-nucleus RNA sequencing (snRNA-seq) dataset.

Data Source:

Dataset: Kuppe et al., Nature 2022, Spatial multi-omic map of human myocardial infarction.
Data Available at: https://www.nature.com/articles/s41586-022-05060-x#data-availability
The demonstration uses control endothelial and pericyte cells to showcase the TRIAGE analysis pipeline.

Objective: To apply TRIAGEcluster for cell clustering, byPeak() for calculating average expression data by various categories, TRIAGEgene for generating TRIAGE-weighted expression data, plotJaccard() for Jaccard Index Heatmap visualization, TRIAGEparser to dissect gene clusters and their associated ontologies, and plotGO() for GO enrichment visualization.

Steps:

TRIAGEcluster + byPeak()
TRIAGEgene + plotJaccard()
TRIAGEparser + plotGO()

# Load packages and set up the working directory
library(TRIAGE)
library(reticulate)
setwd("/path/to/working/directory")

Step 1: Run TRIAGEcluster for cell clustering, select a suitable bandwidth resolution based on UMAP reviews, and calculate average gene expression by various categories using the byPeak() function.

# Run TRIAGEcluster
expr_file <- system.file("extdata", "TRIAGEcluster_demo_expr_human.csv", package = "TRIAGE")
metadata_file <- system.file("extdata", "TRIAGEcluster_demo_metadata_human.csv", package = "TRIAGE")
TRIAGEcluster(expr_file, metadata_file, outdir = "tests", output_prefix = "demo")

# Select a suitable bandwidth and calculate average gene expression
peak_file <- "tests/demo_bw0.80_metadata.csv"
avg_peak <- byPeak(expr_file, peak_file)
# Save the average gene expression result to a CSV file
write.csv(avg_peak, file = "tests/AverageByPeak.csv", row.names = TRUE, quote = FALSE)

TRIAGEcluster generates nine UMAPs with bandwidths from 0.1 to 0.9, and a corresponding set of metadata files. Here, a bandwidth of 0.8 was selected for downstream application, resulting in four TRIAGEcluster-defined peaks within the UMAP.

UMAP showing TRIAGEcluster-defined peaks

A snippet of the “AverageByPeak.csv” file is shown below:

	Peak0	Peak1	Peak2	Peak3
ALDH1A2	0.04404541	0.04426352	1.18265890	0.12415153
PRRX1	0.63263328	0.19289223	0.08957362	0.11108879
NOVA2	0.00785169	0.31212372	0.22311596	0.28179011
NGFR	0.07502976	0.01140611	0.01078702	0.02280367
MCF2L	0.03431053	1.41983845	0.25927713	1.39706381
TBX4	0	0	0	0
…	…	…	…	…

Step 2: Run TRIAGEgene to generate TRIAGE-weighted expression data and plotJaccard() to generate Jaccard similarity index heatmap.

# Run TRIAGEgene to generate TRIAGE-weighted expression data (DS)
ds <- TRIAGEgene(avg_peak)
# Save the average DS result to a CSV file
write.csv(ds, file = "tests/AverageByPeak_DS.csv", row.names = TRUE, quote = FALSE)
# Save the average DS result to a tab-delimited text file
write.table(ds, file = "tests/AverageByPeak.txt", sep = "\t",
            row.names = TRUE, col.names = NA, quote = FALSE)

# Generate Jaccard Index Heatmap
plotJaccard(ds, "tests/Jaccard_heatmap_peak.pdf")

A snippet of the “AverageByPeak_DS.csv” file is shown below:

	Peak0	Peak1	Peak2	Peak3
ALDH1A2	0.000685619	0.000689014	0.018409486	0.001932566
PRRX1	0.019265802	0.005874214	0.002727817	0.003383026
NOVA2	0.000207595	0.008252406	0.005899082	0.0074504
NGFR	0.012602077	0.001915782	0.001811799	0.003830128
MCF2L	0.000594645	0.024607587	0.004493599	0.024212873
TBX4	0	0	0	0
…	…	…	…	…

The plotJaccard() function generates a heatmap to visualize the extent of similarity across identified peaks:

Step 3: Run TRIAGEparser to dissect gene clusters and their associated ontologies and plotGO() to generate GO enrichment heatmaps.

# Run TRIAGEparser to dissect gene clusters and their associated ontologies
input_file <- "tests/AverageByPeak_DS.csv"
TRIAGEparser(input_file, input_type = "table", outdir="tests/TRIAGEparser_peak")

# Generate GO Heatmaps
plotGO(indir="tests/TRIAGEparser_peak", outdir="tests/TRIAGEparser_peak")

The TRIAGEparser output comprises gene clusters and their associated Gene Ontology (GO) enrichments, which are organized into two separate folders named “gene_cluster” and “go”. The plotGO() function generates heatmaps for gene clusters in each peak. The folder structure is illustrated below:

GO enrichment heatmap for the ‘Peak0’ cluster:

GO enrichment heatmap for 'Peak0' cluster

GO enrichment heatmap for the ‘Peak1’ cluster:

GO enrichment heatmap for 'Peak1' cluster

GO enrichment heatmap for the ‘Peak2’ cluster:

GO enrichment heatmap for 'Peak2' cluster

GO enrichment heatmap for the ‘Peak3’ cluster:

GO enrichment heatmap for 'Peak3' cluster

Alternative Calculations 1: Average Gene Expression by Cluster:

Objective: To calculate average gene expression based on cluster categories using byPeak() function, followed by TRIAGEgene analysis and Jaccard index heatmap generation.

expr_file <- system.file("extdata", "TRIAGEcluster_demo_expr_human.csv", package = "TRIAGE")
peak_file <- "tests/demo_bw0.80_metadata.csv"

# Calculate average gene expression by cluster
avg_cluster <- byPeak(expr_file, peak_file, peak_col = "final_cluster")
write.csv(avg_cluster, file = "tests/AverageByCluster.csv", row.names = TRUE, quote = FALSE)

# Run TRIAGEgene
ds_cluster <- TRIAGEgene(avg_cluster)
write.csv(ds_cluster, file = "tests/AverageByCluster_DS.csv", row.names = TRUE, quote = FALSE)

# Generate Jaccard index heatmap
plotJaccard(ds_cluster, "tests/Jaccard_heatmap_cluster.pdf")

# Run TRIAGEparser
input_file <- "tests/AverageByCluster_DS.csv"
TRIAGEparser(input_file, input_type = "table", outdir="tests/TRIAGEparser_cluster")

# Generate GO Heatmaps
plotGO(indir="tests/TRIAGEparser_cluster", outdir="tests/TRIAGEparser_cluster")

Alternative Calculations 2: Average Gene Expression by Cell Type

Objective: To calculate average gene expression based on cell type categories using byPeak() function, followed by TRIAGEgene analysis and Jaccard index heatmap generation.

expr_file <- system.file("extdata", "TRIAGEcluster_demo_expr_human.csv", package = "TRIAGE")
peak_file <- "tests/demo_bw0.80_metadata.csv"

# Calculate average gene expression by cell type
avg_celltype <- byPeak(expr_file, peak_file, peak_col = "cell_type")
write.csv(avg_celltype, file = "tests/AverageByCelltype.csv", row.names = TRUE, quote = FALSE)

# Run TRIAGEgene
ds_celltype <- TRIAGEgene(avg_celltype)
write.csv(ds_celltype, file = "tests/AverageByCelltype_DS.csv", row.names = TRUE, quote = FALSE)

# Generate Jaccard index heatmap
plotJaccard(ds_celltype, "tests/Jaccard_heatmap_celltype.pdf")

# Run TRIAGEparser
input_file <- "tests/AverageByCelltype_DS.csv"
TRIAGEparser(input_file, input_type = "table", outdir="tests/TRIAGEparser_celltype")

# Generate GO Heatmaps
plotGO(indir="tests/TRIAGEparser_celltype", outdir="tests/TRIAGEparser_celltype")

Case Study¶

TRIAGE R Package

Navigation

Related Topics