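While analyzing causal relationships in observational data recently, I came across a handy toolkit: CausalDiscoveryToolbox (Cdt for short). It is full-featured and makes it easy to get started with causal discovery. Below is a brief summary of how the toolkit works and how to use it.
The Cdt toolkit summarizes the general causal modeling workflow as follows: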
[Flowchart: observational data -> causal discovery algorithm -> causal directed graph; observational data -> graph recovery algorithm -> undirected graph -> causal discovery algorithm -> causal directed graph]
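Cdt can perform causal discovery directly on observational data (yielding a causal directed graph), or it can first recover the graph structure (yielding an undirected dependency graph) and then run causal discovery on that structure (yielding a causal directed graph).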
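To make the first stage of the two-step route concrete, here is a minimal sketch of recovering an undirected dependency graph by thresholding pairwise Pearson correlations. It is plain Python and is not Cdt's implementation; the function names, the toy data, and the 0.5 threshold are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no dependency information
    return cov / math.sqrt(var_x * var_y)


def correlation_skeleton(columns, threshold=0.5):
    """Return the undirected edges whose |correlation| exceeds the threshold.

    `columns` maps a variable name to its list of observations.
    `threshold` is a hypothetical cut-off chosen for this illustration.
    """
    edges = set()
    for a, b in combinations(sorted(columns), 2):
        if abs(pearson(columns[a], columns[b])) > threshold:
            edges.add((a, b))
    return edges


# Toy data (made up): B follows A closely, C is nearly unrelated to both.
toy = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "B": [2.1, 3.9, 6.2, 8.1, 9.8, 12.1],
    "C": [1.0, -1.0, -1.0, 1.0, 1.0, -1.0],
}
skeleton_edges = correlation_skeleton(toy)
print(skeleton_edges)
```

The result only says which variables depend on each other; deciding the direction of each edge is left to the causal discovery step.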
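Cdt implements two kinds of methods for recovering the undirected dependency graph from raw data; in the package structure listed further down they sit under independence as graph and stats.
The main focus of the Cdt package is discovering causal relationships from observational data, from the pairwise setting up to full graph modeling.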
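As an illustration of the pairwise setting (deciding whether X causes Y or Y causes X from samples of the two variables), here is a minimal sketch of the slope-based IGCI estimator with a uniform reference measure. IGCI is one of the pairwise methods Cdt lists, but this is a plain-Python sketch and not Cdt's code; the function names and the toy data are assumptions, and the inputs are assumed to be non-constant.

```python
import math


def _rescale(values):
    """Rescale values to the unit interval (uniform reference measure)."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def _slope_score(cause, effect):
    """Mean log-slope of `effect` along sorted `cause`.

    A smaller score favours the direction cause -> effect.
    """
    pairs = sorted(zip(cause, effect))
    logs = []
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx != 0 and dy != 0:  # skip ties, whose log-slope is undefined
            logs.append(math.log(abs(dy / dx)))
    return sum(logs) / len(logs)


def igci_direction(x, y):
    """Return 'X->Y' or 'Y->X' by comparing the two slope scores."""
    xs, ys = _rescale(x), _rescale(y)
    score_xy = _slope_score(xs, ys)
    score_yx = _slope_score(ys, xs)
    return "X->Y" if score_xy < score_yx else "Y->X"


# Toy data (made up): X is spread evenly, Y is a deterministic function of X.
x = [i / 50 for i in range(51)]
y = [v ** 3 for v in x]
direction = igci_direction(x, y)
print(direction)
```

The sketch assumes a noise-free, invertible relationship; real observational data is noisy, so a direction inferred this way should be treated as a hint rather than a conclusion.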
Install it directly with pip:
pip install cdt
import cdt
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
from cdt import SETTINGS

SETTINGS.verbose = True
# Optional hardware settings:
# SETTINGS.NJOBS = 16
# SETTINGS.GPU = 1

# Load data
data = pd.read_csv("lucas0_train.csv")
print(data.head())

# Find the structure (skeleton) of the graph
glasso = cdt.independence.graph.Glasso()
skeleton = glasso.predict(data)

# Pairwise setting: orient the edges of the skeleton with ANM
model = cdt.causality.pairwise.ANM()
output_graph = model.predict(data, skeleton)

# Visualize the causal graph
options = {
    "node_color": "#A0CBE2",
    "width": 1,
    "node_size": 400,
    "edge_cmap": plt.cm.Blues,
    "with_labels": True,
}
plt.axis("off")
nx.draw_networkx(output_graph, **options)
plt.show()
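Causal graph produced by the code above: [figure]
Sample data: https://download.csdn.net/download/Bit_Coders/16241408
For drawing the network structure, see the networkx documentation: https://networkx.org/documentation/stable/index.html
The package is divided into 5 modules: independence, data, causality, metrics, and utils.
Structure of the package and its algorithms: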
cdt package
|
|- independence
|  |- graph (Inferring the skeleton from data)
|  |  |- Lasso variants (Randomized Lasso[1], Glasso[2], HSICLasso[3])
|  |  |- FSGNN (CGNN[12] variant for feature selection)
|  |  |- Skeleton recovery using feature selection algorithms (RFECV[5], LinearSVR[6], RRelief[7], ARD[8,9], DecisionTree)
|  |
|  |- stats (pairwise methods for dependency)
|     |- Correlation (Pearson, Spearman, KendallTau)
|     |- Kernel based (NormalizedHSIC[10])
|     |- Mutual information based (MIRegression, Adjusted Mutual Information[11], Normalized mutual information[11])
|
|- data
|  |- CausalPairGenerator (Generate causal pairs)
|  |- AcyclicGraphGenerator (Generate FCM-based graphs)
|  |- load_dataset (load standard benchmark datasets)
|
|- causality
|  |- graph (methods for graph inference)
|  |  |- CGNN[12]
|  |  |- PC[13]
|  |  |- GES[13]
|  |  |- GIES[13]
|  |  |- LiNGAM[13]
|  |  |- CAM[13]
|  |  |- GS[23]
|  |  |- IAMB[24]
|  |  |- MMPC[25]
|  |  |- SAM[26]
|  |  |- CCDr[27]
|  |
|  |- pairwise (methods for pairwise inference)
|     |- ANM[14] (Additive Noise Model)
|     |- IGCI[15] (Information Geometric Causal Inference)
|     |- RCC[16] (Randomized Causation Coefficient)
|     |- NCC[17] (Neural Causation Coefficient)
|     |- GNN[12] (Generative Neural Network -- Part of CGNN )
|     |- Bivariate fit (Baseline method of regression)
|     |- Jarfo[20]
|     |- CDS[20]
|     |- RECI[28]
|
|- metrics (Implements the metrics for graph scoring)
|  |- Precision Recall
|  |- SHD
|  |- SID [29]
|
|- utils
   |- Settings -> SETTINGS class (hardware settings)
   |- loss -> MMD loss [21, 22] & various other loss functions
   |- io -> for importing data formats
   |- graph -> graph utilities
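Of the metrics listed above, SHD (Structural Hamming Distance) is the easiest to reason about by hand: it counts how many edges differ between the true graph and the predicted one. The following is a minimal plain-Python sketch on 0/1 adjacency matrices, not Cdt's own function; the function name, the `double_for_anticausal` flag (whether a reversed edge counts as two mistakes or one), and the toy graphs are assumptions for illustration.

```python
def structural_hamming_distance(target, pred, double_for_anticausal=True):
    """Count edge differences between two directed graphs.

    Both graphs are square 0/1 adjacency matrices (lists of lists) where
    entry [i][j] == 1 means there is an edge i -> j. Self-loops are ignored
    when `double_for_anticausal` is False.
    """
    n = len(target)
    diff = [[abs(target[i][j] - pred[i][j]) for j in range(n)] for i in range(n)]
    if double_for_anticausal:
        # A reversed edge is one missing edge plus one extra edge: 2 errors.
        return sum(sum(row) for row in diff)
    # Otherwise each unordered pair of nodes contributes at most 1 error.
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if diff[i][j] or diff[j][i]
    )


# Toy graphs (made up) over nodes 0, 1, 2.
# True graph: 0 -> 1, 1 -> 2.
true_graph = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
# Prediction: 1 -> 0 (reversed), 1 -> 2 (correct), 0 -> 2 (extra).
predicted_graph = [
    [0, 0, 1],
    [1, 0, 1],
    [0, 0, 0],
]
shd_double = structural_hamming_distance(true_graph, predicted_graph)
shd_single = structural_hamming_distance(
    true_graph, predicted_graph, double_for_anticausal=False
)
print(shd_double, shd_single)
```

A lower value means the predicted graph is closer to the true one, and 0 means the two graphs are identical.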
Published by 全栈程序员栈长. Please credit the source when reposting: https://javaforall.cn/170417.html Original link: https://javaforall.cn