Codes in paper

Calculation of dysfunction score in TIDE

August 26, 2020February 21, 2021 by Yuwei Liao

Introduction

The TIDE score calculated by the TIDE software consists of two parts: dysfunction score and exclusion score. The calculation principle of dysfunction score is that genes with immune disorders have a higher weight, and the dysfunction score can be obtained by multiplying by the amount of expression. The exclusion score is obtained by multiplying the expression amount by the higher weight of immune rejection genes. The weight of immune disorder is obtained by using cancer survival data, and the weight of immune rejection is obtained by using the gene expression of immune rejection samples. This article mainly talks about the calculation of imbalance score.

Code explanation

Read in the data, prepare the matrix

#!/usr/bin/env Rscript
suppressPackageStartupMessages(library(survival))

CTL_genes = c('CD8A', 'CD8B', 'GZMA', 'GZMB', 'PRF1')  # Genes associated with cytotoxic T lymphocytes

readmat = function(mat) as.matrix(read.table(mat, sep='\t', header=T, check.names=F, quote=NULL))

writemat = function(result, output)
{
  result = result[!is.na(result[,"p"]),,drop=F]
  FDR = p.adjust(result[,"p"], method="fdr")
  result = cbind(result, FDR)
  write.table(result, file=output, sep='\t', quote=F)
}

commands = commandArgs(trailingOnly=T)

# three inputs here
mat = t(readmat(commands[1])) # expression, mutation, or RPPA matrix
pivot = t(readmat(commands[2]))  # CD8A, CD8B, GZMA, GZMB, PRF1 expression, As a fulcrum
survival = readmat(commands[3]) # clinical information

Filter data

# remove negative survival values
survival = survival[survival[,1] > 0,]  #Remove the error message with negative time to live

output = commands[4]

# Take samples that are present in all data
common = Reduce(intersect, list(rownames(mat),rownames(survival),rownames(pivot)))
print(paste(length(common), "samples"))

# stop at low sample numbers
if(length(common) < 20) stop("two few samples")  # If the sample is too small, it cannot be analyzed

not_included = setdiff(CTL_genes, colnames(pivot))  # The CTL gene and the pivot gene cannot overlap

if(length(not_included) > 0) stop(paste(c("pivot genes", not_included, "are not included"), ' '))

pivot = rowMeans(pivot[common,CTL_genes])  # pivot (cytotoxic T cell expression) data
mat = mat[common,,drop=F]     # Matrix of expressions of all available samples
survival = survival[common,,drop=F]   # Survival information for all available samples

# stop at low death rate
death_rate = sum(survival[,2])/dim(survival)[1]
if(death_rate < 0.1) stop(paste("death rate", death_rate, "is too low"))  # If the mortality rate is too low, the dysfunction obtained is unreliable

# split up survival and background
surv = Surv(survival[,1], survival[,2])  # State of life and time

if(dim(survival)[2] > 2){    #Various variables other than survival status and time
  B = survival[,3:dim(survival)[2], drop=F]
}else{
  B = survival[,c(), drop=F]
}

# build up regression data space  # Build data
B = cbind(B, pivot, pivot, pivot)
B = as.data.frame(B)
N_B = dim(B)[2]  # Number of columns of data

Perform survival analysis

# test pivot effect first   # Get the regression coefficient of TIL
coxph.pivot = summary(coxph(surv~., data=B[,c(-N_B, -(N_B-1)), drop=F]))$coef
write.table(coxph.pivot, file=paste0(output, ".pivot"), sep='\t', quote=F)

colnames(B)[N_B-1] = "partner"
colnames(B)[N_B] = "Interaction"

# iterate over features # Get all the characteristic variables and prepare to loop
features = colnames(mat)
N = length(features)

# Matrix of results
result_interaction = matrix(nrow=N, ncol=2, dimnames=list(features, c('z','p')))
result_main = matrix(nrow=N, ncol=2, dimnames=list(features, c('z','p')))
result_partial = matrix(nrow=N, ncol=2, dimnames=list(features, c('z','p')))
result_base = matrix(nrow=N, ncol=2, dimnames=list(features, c('z','p')))

# Loop
for (i in 1:N)
{
  title = features[i]
  partner = mat[,i]

  B[,N_B-1] = partner   # partner就是当前循环的基因的表达量
  B[,N_B] = partner * pivot   # partner和pivot的乘积就是interaction

  # region 1: model with interaction  # 有interaction的模型，得到的interaction部分的z值就是dysfunction的权重大小
  errflag = F
  coxph.fit = tryCatch(coxph(surv~., data=B),
                       error = function(e) errflag <<- T,
                       warning = function(w) errflag <<- T)

  if(!errflag)
  {
    reg.summary = summary(coxph.fit)$coef
    result_interaction[i,] = reg.summary["Interaction", c("z", "Pr(>|z|)")]
    result_main[i,] = reg.summary["partner", c("z", "Pr(>|z|)")]
  }

  # region 2: model without interaction  # Model without interaction
  errflag = F
  coxph.fit = tryCatch(coxph(surv~., data=B[,-N_B]),
                       error = function(e) errflag <<- T,
                       warning = function(w) errflag <<- T)

  if(!errflag)
  {
    reg.summary = summary(coxph.fit)$coef
    result_partial[i,] = reg.summary["partner", c("z", "Pr(>|z|)")]
  }

  # region 3: model with only base effect  # Model with only other clinical information
  errflag = F
  coxph.fit = tryCatch(coxph(surv~., data=B[,c(-N_B, -(N_B-2)), drop=F]),
                       error = function(e) errflag <<- T,
                       warning = function(w) errflag <<- T)

  if(!errflag)
  {
    reg.summary = summary(coxph.fit)$coef
    result_base[i,] = reg.summary["partner", c("z", "Pr(>|z|)")]
  }
}

writemat(result_interaction, paste0(output, ".interaction"))
writemat(result_main, paste0(output, ".main"))
writemat(result_partial, paste0(output, ".partial"))
writemat(result_base, paste0(output, ".base"))

Result explanation

As shown in the figure above, the regression coefficient of TGFB1 multiplied by CTL is positive, then it is an antagnonistic gene with CTL, the regression coefficient of SOX10 multiplied by CTL is negative, then it is a positive CTL plays a synergistic role. The zscore of the gene obtained here is the weight of the dysfunction. Multiplying the weight by the expression of the gene and summing it up is the dysfunction of this sample. score

Reference

Peng J , Shengqing G , Deng P , et al. Signatures of T cell dysfunction and exclusion predict cancer immunotherapy response[J]. Nature Medicine, 2018.

Original code

http://tide.dfci.harvard.edu/download/

Recent Posts

Most Used Categories

The Code Search

Calculation of dysfunction score in TIDE

Introduction

Code explanation

Read in the data, prepare the matrix

Filter data

Perform survival analysis

Result explanation

Reference

Original code

Recent Posts

Most Used Categories

Calculation of dysfunction score in TIDE

Introduction

Code explanation

Read in the data, prepare the matrix

Filter data

Perform survival analysis

Result explanation

Reference

Original code

You may like

All codes for scRNA analysis in R

Significance test of PPI(protein–protein interaction) interface mutations.