Title: | Identification of Cancer Dysfunctional Subpathway with Omics Data |
---|---|
Description: | Identify cancer dysfunctional subpathways by integrating gene expression, DNA methylation, copy number variation, and pathway topological information. 1) First, we calculate gene risk scores by integrating three kinds of data: DNA methylation, copy number variation, and gene expression. 2) Second, we perform a greedy search to identify the key dysfunctional subpathways within the pathways for which the discriminative scores are locally maximal. 3) Finally, a permutation test is used to calculate the statistical significance level of these key dysfunctional subpathways. |
Authors: | Junwei Han [cre], Baotong Zheng [aut], Siyao Liu [ctb] |
Maintainer: | Junwei Han <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1.3 |
Built: | 2024-10-31 05:07:24 UTC |
Source: | https://github.com/hanjunwei-lab/icds |
Maintainer: Junwei Han [email protected]
Authors:
Baotong Zheng [email protected]
Other contributors:
Siyao Liu [email protected] [contributor]
'combinep_three' combines three kinds of p-values, then calculates z-scores for them.
Usage:
combinep_three(p1, p2, p3)
Arguments:
p1 | A numeric vector of p-values or corrected p-values |
p2 | A numeric vector of p-values or corrected p-values |
p3 | A numeric vector of p-values or corrected p-values |
Value:
A numeric vector of z-scores
Examples:
exp.p <- GetExampleData("exp.p")
meth.p <- GetExampleData("meth.p")
cnv.p <- GetExampleData("cnv.p")
combinep_three(exp.p, meth.p, cnv.p)
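The manual does not state which rule is used to combine the p-values. As a minimal sketch, assuming Fisher's method and an upper-tail normal quantile for the z-score, the idea can be illustrated in base R. The helper name `combine_p_fisher` and the toy values are hypothetical, and the package's own calculation may differ.

```r
# Sketch only: Fisher's method is an assumed combination rule,
# not necessarily what combinep_three does internally.
combine_p_fisher <- function(...) {
  p_list <- list(...)
  k <- length(p_list)
  # Keep only genes present in every (named) input vector.
  genes <- Reduce(intersect, lapply(p_list, names))
  p_mat <- do.call(cbind, lapply(p_list, function(p) p[genes]))
  # Under the null, -2 * sum(log p) follows a chi-square with 2k df.
  stat <- -2 * rowSums(log(p_mat))
  p_comb <- pchisq(stat, df = 2 * k, lower.tail = FALSE)
  # Small combined p-values map to large positive z-scores.
  z <- qnorm(p_comb, lower.tail = FALSE)
  names(z) <- genes
  z
}

# Toy p-values (illustrative only).
p1 <- c(TP53 = 0.01, BRCA1 = 0.50, EGFR = 0.20)
p2 <- c(TP53 = 0.02, BRCA1 = 0.60, EGFR = 0.04)
p3 <- c(TP53 = 0.03, BRCA1 = 0.70, MYC = 0.10)
z <- combine_p_fisher(p1, p2, p3)
print(z)
```

Only genes shared by all three inputs are scored here; that is a design choice of the sketch, not documented package behavior.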
'combinep_two' combines two kinds of p-values, then calculates z-scores for them.
Usage:
combinep_two(p1, p2)
Arguments:
p1 | A numeric vector of p-values or corrected p-values |
p2 | A numeric vector of p-values or corrected p-values |
Value:
A numeric vector of z-scores
Examples:
exp.p <- GetExampleData("exp.p")
meth.p <- GetExampleData("meth.p")
combinep_two(exp.p, meth.p)
'coverp2zscore' calculates z-scores from p-values.
Usage:
coverp2zscore(pdata)
Arguments:
pdata | A numeric vector of p-values or corrected p-values |
Value:
A numeric vector of z-scores
Examples:
exp.p <- GetExampleData("exp.p")
meth.p <- GetExampleData("meth.p")
cnv.p <- GetExampleData("cnv.p")
coverp2zscore(exp.p)
coverp2zscore(meth.p)
coverp2zscore(cnv.p)
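A common way to turn a p-value into a z-score is the upper-tail quantile of the standard normal distribution. The sketch below assumes that convention; the helper name `p_to_z` and the clamping bounds are hypothetical and are not taken from the package.

```r
# Sketch only: assumes z = upper-tail standard normal quantile of p.
p_to_z <- function(p, lower = 1e-300, upper = 1 - 1e-15) {
  # Clamp so that p = 0 or p = 1 does not produce +/-Inf.
  p <- pmin(pmax(p, lower), upper)
  qnorm(p, lower.tail = FALSE)
}

# Toy p-values (illustrative only).
z <- p_to_z(c(a = 0.5, b = 0.025, c = 0))
print(z)
```

Under this convention a p-value of 0.5 maps to z = 0, and smaller p-values map to larger positive z-scores.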
An environment variable that includes the variables exp_data, meth_data, cnv_data, amp_gene, del_gene, zzz, exp.p, meth.p, cnv.p, label1, label2, subpathdata, and opt_subpathways.
Junwei Han [email protected], Baotong Zheng [email protected], Siyao Liu [email protected]
'FindSubPath' uses a greedy search algorithm to search for key subpathways in each entire pathway.
Usage:
FindSubPath(zz, Pathway = "kegg", delta = 0.05, seed_p = 0.05, min.size = 5, out.F = FALSE, out.file = "Subpath.txt")
Arguments:
zz | A numeric vector of z-scores |
Pathway | The name of the pathway database |
delta | Diffusion coefficient used in each step of the subpathway search |
seed_p | Genes whose p-value is smaller than seed_p are defined as seed genes |
min.size | The smallest allowed size of a subpathway |
out.F | Logical; whether to write the subpathways to a file |
out.file | Name of the output file for the subpathways |
Value:
Key dysfunctional subpathways in each pathway, in which the risk scores of the genes are significantly higher.
Examples:
require(graphite)
zz <- GetExampleData("zzz")
k <- FindSubPath(zz)
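The manual does not spell out the search procedure itself. The following base R sketch shows one generic form of greedy expansion from a seed gene, under stated assumptions: the subpathway score is the sum of z-scores divided by the square root of the subpathway size, and a neighbour is added only if it raises a positive score by more than the relative margin `delta`. The function `greedy_subpath`, the adjacency list, and the toy z-scores are hypothetical and do not reproduce the internals of FindSubPath.

```r
# Sketch only: a generic greedy expansion, not the package's algorithm.
# Assumes the seed has a positive z-score and every gene has an entry in zz.
greedy_subpath <- function(seed, adj, zz, delta = 0.05) {
  score <- function(genes) sum(zz[genes]) / sqrt(length(genes))
  sub <- seed
  repeat {
    # Candidates: neighbours of the current subpathway not yet included.
    cand <- setdiff(unique(unlist(adj[sub])), sub)
    cand <- intersect(cand, names(zz))
    if (length(cand) == 0) break
    gains <- vapply(cand, function(g) score(c(sub, g)), numeric(1))
    best <- which.max(gains)
    # Stop when the best extension does not beat the current score by delta.
    if (gains[best] <= score(sub) * (1 + delta)) break
    sub <- c(sub, cand[best])
  }
  sub
}

# Toy pathway: chain A - B - C - D, with E attached to B.
adj <- list(A = "B", B = c("A", "C", "E"), C = c("B", "D"), D = "C", E = "B")
zz <- c(A = 3, B = 2.5, C = 2, D = -1, E = 0.1)
res <- greedy_subpath("A", adj, zz, delta = 0.05)
print(res)
```

In this toy case the search grows from A through B to C and then stops, because adding either D or E lowers the score.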
'getCnvp' performs a t-test on copy number variation data.
Usage:
getCnvp(exp_data, cnv_data, amp_gene, del_gene, p.adjust = TRUE, method = "fdr")
Arguments:
exp_data | A data frame |
cnv_data | Copy number variation data |
amp_gene | A vector of strings, the IDs of amplified genes |
del_gene | A vector of strings, the IDs of deleted genes |
p.adjust | Logical; whether to return corrected p-values |
method | Correction method, such as "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", or "fdr" (the default) |
Details:
cnv_data is TCGA level 4 data. If p.adjust=TRUE, corrected p-values are returned; if p.adjust=FALSE, uncorrected p-values are returned.
Value:
A numeric vector of p-values or corrected p-values
Examples:
exp_data <- GetExampleData("exp_data")
meth_data <- GetExampleData("meth_data")
cnv_data <- GetExampleData("cnv_data")
amp_gene <- GetExampleData("amp_gene")
del_gene <- GetExampleData("del_gene")
getCnvp(exp_data, cnv_data, amp_gene, del_gene, p.adjust = FALSE, method = "fdr")
Get the example data of the package for small trials.
Usage:
GetExampleData(exampleData)
Arguments:
exampleData | A character string, which should be one of "exp_data", "meth_data", "cnv_data", "amp_gene", "del_gene", "label1", "label2", "zz", "exp.p", "meth.p", "cnv.p" and "pathdata" |
The function GetExampleData(exampleData = "exp.p") obtains a vector of lncRNAs confirmed to be related to breast cancer. The function GetExampleData(exampleData = "Profile") obtains the expression pr
Subramanian, A., Tamayo, P., Mootha, V.K., Mukherjee, S., Ebert, B.L., Gillette, M.A., Paulovich, A., Pomeroy, S.L., Golub, T.R., Lander, E.S. et al. (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A, 102, 15545-15550.
'getExpp' performs a t-test on expression profile data.
Usage:
getExpp(exp_data, label, p.adjust = TRUE, method = "fdr")
Arguments:
exp_data | A data frame, the expression profile used to calculate a p-value for each gene; the row names should be gene symbols |
label | A vector of 0/1s indicating the class of the samples in the expression profile; 0 represents case, 1 represents control |
p.adjust | Logical; whether to return corrected p-values |
method | Correction method, such as "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", or "fdr" (the default) |
Details:
For a given expression profile of two conditions, the ICDS package provides a t-test method to calculate p-values or corrected p-values for each gene (if p.adjust=TRUE, corrected p-values are returned; if p.adjust=FALSE, uncorrected p-values are returned). The rows of the expression profile should be gene symbols and the columns should be sample names. Samples should belong to two conditions, and the label should be given as 0 and 1.
Value:
A numeric vector of p-values or corrected p-values
Examples:
profile <- GetExampleData("exp_data")
label <- GetExampleData("label1")
getExpp(profile, label, p.adjust = FALSE)
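The per-gene test described above can be sketched in base R with `t.test` and `p.adjust` from the stats package. This is a minimal illustration under assumptions: the helper `row_ttest_p`, its `adjust` argument, and the simulated profile are hypothetical, and the package may configure its t-test differently (the sketch uses R's default Welch test).

```r
# Sketch only: row-wise two-sample t-test, optionally followed by correction.
row_ttest_p <- function(profile, label, adjust = TRUE, method = "fdr") {
  mat <- as.matrix(profile)
  stopifnot(length(label) == ncol(mat), all(label %in% c(0, 1)))
  # One p-value per gene (row), comparing label 0 against label 1.
  p <- apply(mat, 1, function(x) {
    t.test(x[label == 0], x[label == 1])$p.value
  })
  if (adjust) p <- stats::p.adjust(p, method = method)
  p
}

# Simulated profile: 3 genes x 10 samples, geneA shifted in the label-0 group.
set.seed(1)
label <- rep(c(0, 1), each = 5)
mat <- matrix(rnorm(30), nrow = 3,
              dimnames = list(c("geneA", "geneB", "geneC"), paste0("S", 1:10)))
mat["geneA", label == 0] <- mat["geneA", label == 0] + 10
profile <- as.data.frame(mat)

p_raw <- row_ttest_p(profile, label, adjust = FALSE)
p_adj <- row_ttest_p(profile, label, adjust = TRUE, method = "fdr")
print(cbind(p_raw, p_adj))
```

The same pattern applies to a methylation profile, since only the input matrix changes.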
'getMethp' performs a t-test on methylation profile data.
Usage:
getMethp(meth_data, label, p.adjust = TRUE, method = "fdr")
Arguments:
meth_data | A data frame, the methylation profile used to calculate a p-value for each gene; the row names should be gene symbols |
label | A vector of 0/1s indicating the class of the samples in the methylation profile; 0 represents case, 1 represents control |
p.adjust | Logical; whether to return corrected p-values |
method | Correction method, such as "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", or "fdr" (the default) |
Details:
For a given methylation profile of two conditions, the ICDS package provides a t-test method to calculate p-values or corrected p-values for each gene (if p.adjust=TRUE, corrected p-values are returned; if p.adjust=FALSE, uncorrected p-values are returned). The rows of the methylation profile should be gene symbols and the columns should be sample names. Samples should belong to two conditions, and the label should be given as 0 and 1.
Value:
A numeric vector of p-values or corrected p-values
Examples:
profile <- GetExampleData("meth_data")
label <- GetExampleData("label2")
getMethp(profile, label, p.adjust = FALSE)
'opt_subpath' optimizes the subpathways of interest. If the genes shared by two pathways account for more than the overlap ratio of the genes in each pathway, the two pathways are combined.
Usage:
opt_subpath(subpathdata, zz, overlap = 0.6)
Arguments:
subpathdata | The subpathways of interest |
zz | A vector of z-scores |
overlap | Overlap ratio between the genes of any two pathways |
Value:
Optimized subpathways: the genes shared by any two pathways account for less than the overlap ratio of the genes in each pathway.
Examples:
zz <- GetExampleData("zzz")
subpathdata <- GetExampleData("subpathdata")
optsubpath <- opt_subpath(subpathdata, zz, overlap = 0.6)
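The merging rule described above can be sketched in base R on plain gene sets. The sketch assumes that two sets are merged when the shared genes exceed the overlap ratio in both sets, and that merging repeats until no pair qualifies. The function `merge_by_overlap` and the toy sets are hypothetical; the sketch ignores the z-scores that opt_subpath also takes as input.

```r
# Sketch only: repeated pairwise merging of gene sets by overlap ratio.
merge_by_overlap <- function(gene_sets, overlap = 0.6) {
  repeat {
    merged <- FALSE
    n <- length(gene_sets)
    if (n < 2) break
    for (i in seq_len(n - 1)) {
      for (j in (i + 1):n) {
        shared <- length(intersect(gene_sets[[i]], gene_sets[[j]]))
        # Merge only if the shared genes exceed the ratio in BOTH sets.
        if (shared / length(gene_sets[[i]]) > overlap &&
            shared / length(gene_sets[[j]]) > overlap) {
          gene_sets[[i]] <- union(gene_sets[[i]], gene_sets[[j]])
          gene_sets[[j]] <- NULL
          merged <- TRUE
          break
        }
      }
      if (merged) break  # restart the scan after every merge
    }
    if (!merged) break
  }
  gene_sets
}

# Toy gene sets (illustrative only).
sets <- list(s1 = c("A", "B", "C", "D"),
             s2 = c("B", "C", "D", "E"),
             s3 = c("X", "Y", "Z"))
res <- merge_by_overlap(sets, overlap = 0.6)
print(res)
```

Here s1 and s2 share three of their four genes (ratio 0.75), so they are merged, while s3 is left unchanged.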
Permutation test methods 1 and 2 are used to calculate the statistical significance level for these optimal subpathways.
Usage:
Permutation(subpathwayz, zz, nperm1 = 1000, method1 = TRUE, nperm2 = 1000, method2 = FALSE)
Arguments:
subpathwayz | The optimized subpathways of interest |
zz | A vector of z-scores |
nperm1 | Number of permutations to perform with method1 |
method1 | Logical; whether to run permutation analysis method 1 |
nperm2 | Number of permutations to perform with method2 |
method2 | Logical; whether to run permutation analysis method 2 |
Value:
The statistical significance p-value and FDR for these optimal subpathways
Examples:
require(graphite)
keysubpathways <- GetExampleData("keysubpathways")
zzz <- GetExampleData("zzz")
Permutation(keysubpathways, zzz, nperm1 = 10, method1 = TRUE, nperm2 = 10, method2 = FALSE)
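The manual does not define what method 1 and method 2 permute. As a generic illustration only, the sketch below draws random gene sets of the same size as each subpathway, compares their scores with the observed score, and corrects the resulting p-values with `p.adjust`. The function `perm_test`, the scoring rule (sum of z-scores over the square root of the set size), and the simulated data are hypothetical and are not mapped to either package method.

```r
# Sketch only: a generic gene-sampling permutation test.
perm_test <- function(subpaths, zz, nperm = 1000) {
  score <- function(z) sum(z) / sqrt(length(z))
  p <- vapply(subpaths, function(genes) {
    genes <- intersect(genes, names(zz))
    obs <- score(zz[genes])
    # Null distribution: scores of random gene sets of the same size.
    null <- replicate(nperm, score(sample(zz, length(genes))))
    # Add-one correction keeps the empirical p-value above zero.
    (sum(null >= obs) + 1) / (nperm + 1)
  }, numeric(1))
  data.frame(p.value = p, fdr = p.adjust(p, method = "fdr"))
}

# Simulated z-scores: genes g1..g5 are strongly elevated.
set.seed(42)
zz <- setNames(c(rep(4, 5), rnorm(95)), paste0("g", 1:100))
subpaths <- list(high = paste0("g", 1:5), random = paste0("g", 50:54))
res <- perm_test(subpaths, zz, nperm = 200)
print(res)
```

With only 200 permutations the smallest attainable p-value is 1/201, so larger values of `nperm` are needed when finer resolution matters.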
PlotSubpathway: plots a network graph from a user-supplied list of genes.
Usage:
PlotSubpathway(subpID, pathway.name, zz, Pathway = "kegg", layout = layout.fruchterman.reingold)
Arguments:
subpID | Gene list of a subpathway of interest |
pathway.name | Name of the subpathway of interest |
zz | z-score of each gene |
Pathway | The name of the pathway database |
layout | The layout specification( |
Value:
Network graph
Examples:
require(graphite)
subpID <- unlist(strsplit("ACSS1/ALDH3B2/ADH1B/ADH1A/ALDH2/DLAT/ACSS2", "/"))
pathway.name <- "Glycolysis / Gluconeogenesis"
zzz <- GetExampleData("zzz")
PlotSubpathway(subpID = subpID, pathway.name = pathway.name, zz = zzz)