Classify samples by their distance to reference class centroids
Introduction
The following code classifies samples by their distance to the centroids of the reference classes: each sample is assigned to the class whose centroid is closest in Euclidean distance.
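The rule can be written compactly as below. This is only a sketch of the notation, assuming Euclidean distance (the default of R's `dist`); the symbols $x_{gj}$ (expression of gene $g$ in sample $j$), $\mu_{gk}$ (value of gene $g$ in the centroid of class $k$), and $G$ (the genes shared by both tables) are introduced here for illustration.

```latex
\hat{c}_j \;=\; \arg\min_{k} \, \lVert x_j - \mu_k \rVert_2
        \;=\; \arg\min_{k} \sqrt{\sum_{g \in G} \left( x_{gj} - \mu_{gk} \right)^2}
```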
Code example
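library(data.table)
# Centroid table: one row per gene (pcrID), one column per reference class
centroids <- read.table("data/other/GSE10886/pam50_centroids.txt")
# Lookup table that maps pcrID to GeneName
gene_name <- fread("data/other/GSE10886/pam_gene_name.txt", header = TRUE)
# Replace the pcrID row names with gene names (join on pcrID, keeping the row order of centroids)
rownames(centroids) <- gene_name[data.table(pcrID = rownames(centroids)), GeneName, on = "pcrID"]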
# Keep only the genes shared by my data and the centroid data
# (mt is the expression matrix to classify: genes in rows, samples in columns)
inter_gene <- intersect(rownames(centroids), rownames(mt))
dat <- mt[inter_gene, ]
centroids <- centroids[inter_gene, ]
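# Calculate the Euclidean distance from every sample to each centroid
# (rows of d1 = samples, columns of d1 = reference classes)
d1 <- apply(centroids, 2, function(i) {
  # dist() stores the lower triangle column by column, so its first
  # ncol(dat) entries are the distances from centroid i to each sample
  dist(t(cbind(i, dat)))[seq_len(ncol(dat))]
})
rownames(d1) <- colnames(dat) # Label the rows with the sample names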
# Select the class with the minimum distance for each sample
p1 <- apply(d1, 1, function(x) {
  colnames(d1)[which.min(x)]
})
p1 # Predicted class for each sample
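Because the code above depends on the GSE10886 files and on an existing `mt`, the same steps can be tried on a small made-up data set. The sketch below is not the original workflow: the toy matrices, the gene, sample, and class names, and the helper `nearest_centroid()` are all hypothetical. It computes the distances directly with `colSums()` instead of `dist()`, which should give the same Euclidean distances when there are no missing values.

```r
# Hypothetical toy centroids: 4 genes (rows) x 3 reference classes (columns)
centroids <- matrix(
  c(1, 0, 0, 1,   # classA
    0, 1, 1, 0,   # classB
    1, 1, 0, 0),  # classC
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), c("classA", "classB", "classC"))
)

# Hypothetical expression matrix: 5 genes x 3 samples (gene9 has no centroid)
mt <- matrix(
  c(0.9, 0.1, 0.0, 1.1, 5,   # s1
    0.1, 0.8, 1.2, 0.0, 5,   # s2
    1.0, 0.9, 0.1, 0.2, 5),  # s3
  nrow = 5,
  dimnames = list(c(paste0("gene", 1:4), "gene9"), c("s1", "s2", "s3"))
)

# Keep only the genes present in both tables, in the same order
inter_gene <- intersect(rownames(centroids), rownames(mt))
dat <- mt[inter_gene, , drop = FALSE]
cen <- centroids[inter_gene, , drop = FALSE]

# Hypothetical helper: assign each sample to its nearest centroid
nearest_centroid <- function(dat, cen) {
  # d[j, k] = Euclidean distance between sample j and centroid k
  d <- apply(cen, 2, function(mu) sqrt(colSums((dat - mu)^2)))
  # Force a samples x classes matrix, even when there is a single sample
  d <- matrix(d, nrow = ncol(dat),
              dimnames = list(colnames(dat), colnames(cen)))
  # Name of the closest class for each sample
  setNames(colnames(d)[apply(d, 1, which.min)], rownames(d))
}

pred <- nearest_centroid(dat, cen)
print(pred)
```

With these toy values, `s1`, `s2`, and `s3` are assigned to `classA`, `classB`, and `classC` respectively. Avoiding `dist()` also skips the sample-to-sample distances that the classification never uses, which may matter when there are many samples.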