Interpretable Contrastive Learning for Networks

1. Source Code of cNRL and i-cNRL

Available on GitHub [github].

2. Source Code for Generating Experimental Results

[download]

3. Network Datasets

N1. Dolphin [1] [original source] [graphml format]

N2. Karate [2] [original source] [graphml format]

N3. Random [graphml format]

N4. Price [graphml format]

N5. p2p-Gnutella08 [3,4] [original source] [graphml format]

N6. Price 2 [graphml format]

N7. Enhanced Price [graphml format]

N8. Combined-AP/MS [5] [original source] [graphml format]

N9. LC-multiple [6] [original source] [graphml format]

N10. School-Day1 [7] [original source] [graphml format]

N11. School-Day2 [7] [original source] [graphml format]

* If you use these datasets for your publication, please follow the citation policy of each original source.

* We converted each original dataset to GraphML format with graph-tool. You can load a network with:
>>> import graph_tool.all as gt
>>> g = gt.load_graph("NAME_OF_FILE.xml.gz")
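If graph-tool is not installed, a downloaded file can still be sanity-checked with the Python standard library alone. The sketch below is not part of the released code. It assumes the file is gzip-compressed GraphML that uses the standard GraphML XML namespace, and the helper name count_nodes_and_edges is hypothetical.

```python
import gzip
import xml.etree.ElementTree as ET

# Standard GraphML XML namespace (assumed to be the one used in the files).
GRAPHML_NS = "{http://graphml.graphdrawing.org/xmlns}"


def count_nodes_and_edges(path):
    """Return (number of nodes, number of edges) in a gzipped GraphML file."""
    with gzip.open(path, "rb") as f:
        root = ET.parse(f).getroot()
    # Count every <node> and <edge> element anywhere in the document.
    n_nodes = sum(1 for _ in root.iter(GRAPHML_NS + "node"))
    n_edges = sum(1 for _ in root.iter(GRAPHML_NS + "edge"))
    return n_nodes, n_edges
```

For example, count_nodes_and_edges("NAME_OF_FILE.xml.gz") returns a pair of counts that can be compared against the numbers reported by the original source.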

References

[1]

Lusseau et al., The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations. Behavioral Ecology and Sociobiology 54, 4 (2003), 396–405.

[2]

Zachary, An information flow model for conflict and fission in small groups. Journal of Anthropological Research 33, 4 (1977), 452–473.

[3]

Leskovec et al., Graph evolution: Densification and shrinking diameters. ACM Transactions on Knowledge Discovery from Data 1, 1 (2007), 2–es.

[4]

Ripeanu et al., Mapping the Gnutella Network. IEEE Internet Computing 6, 1 (2002), 50.

[5]

Collins et al., Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & Cellular Proteomics 6, 3 (2007), 439–450.

[6]

Reguly et al., Comprehensive curation and analysis of global interaction networks in Saccharomyces cerevisiae. Journal of Biology 5, 4 (2006), 11.

[7]

Stehlé et al., High-resolution measurements of face-to-face contact patterns in a primary school. PLoS ONE 6, 8 (2011).

5. Input Data and Commands for GraphSAGE to Generate Feature Matrices

G_T: Dolphin, G_B: Karate [data]

G_T: p2p-Gnutella08, G_B: Price 2 [data]

G_T: LC-multiple, G_B: Combined-AP/MS [data]

G_T: School-Day2, G_B: School-Day1 [data]

To generate feature matrices with GraphSAGE, run the command below after setting up GraphSAGE. Replace DATA_DIR/FILE_PREFIX with your own file path (e.g., ./data/dolphin-karate).

$ python -m graphsage.unsupervised_train --train_prefix DATA_DIR/FILE_PREFIX --model graphsage_maxpool --max_total_steps 1000 --validate_iter 10 --dim_1 12 --dim_2 12 --base_log_dir .

Node indices in the generated data cover both G_T and G_B, where n_T and n_B are the numbers of nodes in G_T and G_B. Indices from 0 to (n_T-1) correspond to nodes in G_T. Indices from n_T to (n_T + n_B-1) correspond to nodes in G_B.
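The index convention above can be sketched as a small lookup. This is an illustration only: the function name locate_node is hypothetical, and treating the offset (index - n_T) as the position of the node within G_B is an assumption.

```python
def locate_node(index, n_t, n_b):
    """Map a combined index to ("G_T" or "G_B", index within that graph)."""
    if not 0 <= index < n_t + n_b:
        raise IndexError(f"index {index} is outside [0, {n_t + n_b - 1}]")
    if index < n_t:
        # Indices 0 .. n_T-1 belong to the target network.
        return "G_T", index
    # Indices n_T .. n_T+n_B-1 belong to the background network
    # (assumption: the offset is the node's position within G_B).
    return "G_B", index - n_t
```

With placeholder sizes n_t=5 and n_b=3, locate_node(4, 5, 3) gives ("G_T", 4) and locate_node(5, 5, 3) gives ("G_B", 0).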

6. Learned Feature Matrices with DeepGL / GraphSAGE

G_T: Dolphin, G_B: Karate [DeepGL][GraphSAGE]

G_T: Price, G_B: Random [DeepGL][GraphSAGE]

G_T: Random, G_B: Price [DeepGL][GraphSAGE]

G_T: p2p-Gnutella08, G_B: Price 2 [DeepGL][GraphSAGE]

G_T: p2p-Gnutella08, G_B: Enhanced Price [DeepGL]

G_T: LC-multiple, G_B: Combined-AP/MS [DeepGL][GraphSAGE]

G_T: School-Day2, G_B: School-Day1 [DeepGL][GraphSAGE]

Each set of features learned with DeepGL contains the following files:

*_tg.npy: feature matrix of G_T

*_bg.npy: feature matrix of G_B

*.feat_defs.npy: learned feature definitions
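A minimal sketch of loading the three DeepGL files listed above with NumPy. The helper name load_deepgl_features and the prefix argument are hypothetical. It assumes that both matrices are two-dimensional with one column per learned feature; that assumption is what the column check encodes. If the feature definitions were saved as a Python object array, numpy.load may need allow_pickle=True, which should only be used on files you trust.

```python
import numpy as np


def load_deepgl_features(prefix, allow_pickle=False):
    """Load (G_T matrix, G_B matrix, feature definitions) for one file prefix."""
    x_tg = np.load(prefix + "_tg.npy")  # feature matrix of G_T
    x_bg = np.load(prefix + "_bg.npy")  # feature matrix of G_B
    feat_defs = np.load(prefix + ".feat_defs.npy", allow_pickle=allow_pickle)
    # Assumption: both networks are described by the same learned features.
    if x_tg.shape[1] != x_bg.shape[1]:
        raise ValueError("G_T and G_B matrices have different numbers of columns")
    return x_tg, x_bg, feat_defs
```

The prefix is everything before the _tg.npy suffix, so a file named PREFIX_tg.npy is loaded with load_deepgl_features("PREFIX").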

Each set of features learned with GraphSAGE contains the following files:

*_tg_bg.npy: feature values for each node of both G_T and G_B

*_node_id.txt: node indices corresponding to the rows of *_tg_bg.npy. Indices from 0 to (n_T-1) correspond to nodes in G_T. Indices from n_T to (n_T + n_B-1) correspond to nodes in G_B.
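The two GraphSAGE files can be combined to recover separate matrices for G_T and G_B. The sketch below is one way to do this, not the authors' code: the function names are hypothetical, *_node_id.txt is assumed to hold one integer index per line, and n_t (the number of nodes in G_T) must be supplied by the caller.

```python
import numpy as np


def load_graphsage_features(prefix):
    """Load the combined feature matrix and the node index of each row."""
    features = np.load(prefix + "_tg_bg.npy")
    # Assumption: one integer node index per line, in row order.
    node_ids = np.atleast_1d(np.loadtxt(prefix + "_node_id.txt", dtype=int))
    return features, node_ids


def split_graphsage_features(features, node_ids, n_t):
    """Return (G_T rows, G_B rows), each sorted by ascending node index."""
    features = np.asarray(features)
    node_ids = np.asarray(node_ids, dtype=int)
    if features.shape[0] != node_ids.shape[0]:
        raise ValueError("number of rows and number of node indices differ")
    order = np.argsort(node_ids)  # rows may be stored in arbitrary order
    sorted_feats = features[order]
    sorted_ids = node_ids[order]
    x_tg = sorted_feats[sorted_ids < n_t]   # indices 0 .. n_T-1
    x_bg = sorted_feats[sorted_ids >= n_t]  # indices n_T .. n_T+n_B-1
    return x_tg, x_bg
```

Sorting by node index before splitting means row k of the G_T result corresponds to index k, and row k of the G_B result corresponds to index n_T + k, provided every index is present.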

A .npy file can be loaded with NumPy: numpy.load(FILE_PATH)