

【Code】OGB: Benchmark Datasets for Graph Machine Learning
source link: http://mp.weixin.qq.com/s?__biz=MjM5ODkzMzMwMQ%3D%3D&%3Bmid=2650414659&%3Bidx=3&%3Bsn=a5c63ab9dbfbcf49d011aff49af0890a

1.1 Overview
Open Graph Benchmark (hereafter OGB) is an open-source Python library from Stanford. It contains benchmark datasets, data loaders, and evaluators for graph machine learning (hereafter graph ML), and aims to promote scalable, robust, and reproducible graph ML research.
OGB covers a variety of graph ML tasks and a wide range of domains, from social and information networks to biological networks, molecular graphs, and knowledge graphs. Each dataset ships with a specific data split and evaluation metric, providing a unified evaluation protocol.
OGB provides an automated end-to-end graph ML pipeline that simplifies and standardizes graph data loading, experiment setup, and model evaluation, as shown in the figure below:
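In code, that pipeline boils down to three steps: load a dataset (downloaded and preprocessed on first use), read its standardized split, and score predictions with the matching evaluator. Below is a minimal sketch of those steps for the ogbn-arxiv dataset using the DGL loader that is also used later in this post; the random predictions are placeholders only, not a real model.

import torch
from ogb.nodeproppred import DglNodePropPredDataset, Evaluator

# 1. Load the dataset: a DGLGraph plus a node-label tensor.
dataset = DglNodePropPredDataset(name='ogbn-arxiv')
g, labels = dataset[0]

# 2. Use the standardized split that ships with the dataset.
split_idx = dataset.get_idx_split()  # dict with 'train' / 'valid' / 'test' index tensors

# 3. Score predictions with the dataset's own evaluator (accuracy for ogbn-arxiv).
evaluator = Evaluator(name='ogbn-arxiv')
random_pred = torch.randint(0, dataset.num_classes, labels.shape)  # placeholder predictions
print(evaluator.eval({
    'y_true': labels[split_idx['test']],
    'y_pred': random_pred[split_idx['test']],
}))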

The figure below shows the three dimensions along which OGB is organized: task type (Tasks), scale (Scale), and domain (Rich domains).

1.2 Dataset
Let's take a look at the datasets OGB currently includes:

And the summary statistics of the datasets:

1.3 Leaderboard
OGB also provides standardized evaluators and leaderboards to track state-of-the-art results. Let's look at part of the leaderboard for each task type (a short sketch of the evaluator API behind these leaderboards follows the list).
Node classification:
Link prediction:
Graph classification:
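Every leaderboard submission is scored with the same Evaluator class, which is what makes the reported numbers directly comparable. As a small illustrative sketch (shown here for ogbn-arxiv; link- and graph-level datasets follow the same pattern with task-specific metrics such as Hits@K or ROC-AUC), each evaluator can describe the exact input it expects:

from ogb.nodeproppred import Evaluator

evaluator = Evaluator(name='ogbn-arxiv')
# Each OGB evaluator documents the input and output format it expects,
# so every leaderboard entry is computed in exactly the same way.
print(evaluator.expected_input_format)
print(evaluator.expected_output_format)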
2. OGB + DGL
The official examples are implemented with PyG; here we implement an example based on DGL instead.
2.1 Environment Setup
Import the packages:
import dgl
import ogb
import math
import time
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

from ogb.nodeproppred import DglNodePropPredDataset, Evaluator
Check the versions:
print(dgl.__version__)
print(torch.__version__)
print(ogb.__version__)
0.4.3post2
1.5.0+cu101
1.1.1
CUDA-related information:
print(torch.version.cuda)
print(torch.cuda.is_available())
print(torch.cuda.device_count())
print(torch.cuda.get_device_name(0))
print(torch.cuda.current_device())
10.1
True
1
Tesla P100-PCIE-16GB
0
2.2 Data Preparation
Set the hyperparameters:
device_id = 0     # id of the GPU to use
n_layers = 3      # number of layers: input + hidden + output
n_hiddens = 256   # number of hidden units per layer
dropout = 0.5
lr = 0.01
epochs = 300
runs = 10         # run 10 times and take the average
log_steps = 50
Define the training function, the test function, and a logger:
def train(model, g, feats, y_true, train_idx, optimizer):
    """Training function"""
    model.train()
    optimizer.zero_grad()
    out = model(g, feats)[train_idx]
    loss = F.nll_loss(out, y_true.squeeze(1)[train_idx])
    loss.backward()
    optimizer.step()
    return loss.item()


@torch.no_grad()
def test(model, g, feats, y_true, split_idx, evaluator):
    """Test function"""
    model.eval()
    out = model(g, feats)
    y_pred = out.argmax(dim=-1, keepdim=True)
    train_acc = evaluator.eval({
        'y_true': y_true[split_idx['train']],
        'y_pred': y_pred[split_idx['train']],
    })['acc']
    valid_acc = evaluator.eval({
        'y_true': y_true[split_idx['valid']],
        'y_pred': y_pred[split_idx['valid']],
    })['acc']
    test_acc = evaluator.eval({
        'y_true': y_true[split_idx['test']],
        'y_pred': y_pred[split_idx['test']],
    })['acc']
    return train_acc, valid_acc, test_acc


class Logger(object):
    """Records the results of each run"""
    def __init__(self, runs, info=None):
        self.info = info
        self.results = [[] for _ in range(runs)]

    def add_result(self, run, result):
        assert len(result) == 3
        assert run >= 0 and run < len(self.results)
        self.results[run].append(result)

    def print_statistics(self, run=None):
        if run is not None:
            result = 100 * torch.tensor(self.results[run])
            argmax = result[:, 1].argmax().item()
            print(f'Run {run + 1:02d}:')
            print(f'Highest Train: {result[:, 0].max():.2f}')
            print(f'Highest Valid: {result[:, 1].max():.2f}')
            print(f' Final Train: {result[argmax, 0]:.2f}')
            print(f' Final Test: {result[argmax, 2]:.2f}')
        else:
            result = 100 * torch.tensor(self.results)
            best_results = []
            for r in result:
                train1 = r[:, 0].max().item()
                valid = r[:, 1].max().item()
                train2 = r[r[:, 1].argmax(), 0].item()
                test = r[r[:, 1].argmax(), 2].item()
                best_results.append((train1, valid, train2, test))
            best_result = torch.tensor(best_results)
            print(f'All runs:')
            r = best_result[:, 0]
            print(f'Highest Train: {r.mean():.2f} ± {r.std():.2f}')
            r = best_result[:, 1]
            print(f'Highest Valid: {r.mean():.2f} ± {r.std():.2f}')
            r = best_result[:, 2]
            print(f' Final Train: {r.mean():.2f} ± {r.std():.2f}')
            r = best_result[:, 3]
            print(f' Final Test: {r.mean():.2f} ± {r.std():.2f}')
Load the data:
device = f'cuda:{device_id}' if torch.cuda.is_available() else 'cpu'
device = torch.device(device)

# Load the dataset; name is 'ogbn-' + the dataset name.
# You can print `dataset` yourself to inspect it.
dataset = DglNodePropPredDataset(name='ogbn-arxiv')
split_idx = dataset.get_idx_split()

g, labels = dataset[0]
feats = g.ndata['feat']
g = dgl.to_bidirected(g)

feats, labels = feats.to(device), labels.to(device)
train_idx = split_idx['train'].to(device)
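Optionally, you can sanity-check what was loaded before moving on. The sketch below simply prints the graph and the tensor shapes; the values in the comments are the ones expected for ogbn-arxiv (roughly 169k nodes, 128-dimensional features, 40 classes).

# Optional sanity check of the loaded data (expected values are for ogbn-arxiv).
print(g)                    # bidirected graph: number of nodes and edges
print(feats.shape)          # torch.Size([169343, 128]) - 128-dim node features
print(labels.shape)         # torch.Size([169343, 1])   - one class label per node
print(dataset.num_classes)  # 40 subject areas
for split in ('train', 'valid', 'test'):
    print(split, split_idx[split].shape)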
2.3 GCN
We implement a basic GCN. Batch Normalization is applied after every layer except the output layer; removing it costs about 2% accuracy.
from dgl.nn import GraphConv


class GCN(nn.Module):
    def __init__(self, in_feats, n_hiddens, n_classes, n_layers, dropout):
        super(GCN, self).__init__()
        self.layers = nn.ModuleList()
        self.bns = nn.ModuleList()
        self.layers.append(GraphConv(in_feats, n_hiddens, 'both'))
        self.bns.append(nn.BatchNorm1d(n_hiddens))
        for _ in range(n_layers - 2):
            self.layers.append(GraphConv(n_hiddens, n_hiddens, 'both'))
            self.bns.append(nn.BatchNorm1d(n_hiddens))
        self.layers.append(GraphConv(n_hiddens, n_classes, 'both'))
        self.dropout = dropout

    def reset_parameters(self):
        for layer in self.layers:
            layer.reset_parameters()
        for bn in self.bns:
            bn.reset_parameters()

    def forward(self, g, x):
        for i, layer in enumerate(self.layers[:-1]):
            x = layer(g, x)
            x = self.bns[i](x)
            x = F.relu(x)
            x = F.dropout(x, p=self.dropout, training=self.training)
        x = self.layers[-1](g, x)
        return x.log_softmax(dim=-1)
model = GCN(in_feats=feats.size(-1),
            n_hiddens=n_hiddens,
            n_classes=dataset.num_classes,
            n_layers=n_layers,
            dropout=dropout).to(device)

evaluator = Evaluator(name='ogbn-arxiv')
logger = Logger(runs)

for run in range(runs):
    model.reset_parameters()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(1, 1 + epochs):
        loss = train(model, g, feats, labels, train_idx, optimizer)
        result = test(model, g, feats, labels, split_idx, evaluator)
        logger.add_result(run, result)

        if epoch % log_steps == 0:
            train_acc, valid_acc, test_acc = result
            print(f'Run: {run + 1:02d}, '
                  f'Epoch: {epoch:02d}, '
                  f'Loss: {loss:.4f}, '
                  f'Train: {100 * train_acc:.2f}%, '
                  f'Valid: {100 * valid_acc:.2f}% '
                  f'Test: {100 * test_acc:.2f}%')
    logger.print_statistics(run)
logger.print_statistics()
Run: 01, Epoch: 50, Loss: 1.1489, Train: 68.71%, Valid: 68.93% Test: 68.32%
Run: 01, Epoch: 100, Loss: 1.0565, Train: 71.29%, Valid: 69.61% Test: 68.03%
Run: 01, Epoch: 150, Loss: 1.0010, Train: 72.28%, Valid: 70.57% Test: 70.00%
Run: 01, Epoch: 200, Loss: 0.9647, Train: 73.18%, Valid: 69.79% Test: 67.97%
Training time/epoch 0.2617543590068817
Run 01:
Highest Train: 73.54
Highest Valid: 71.16
Final Train: 73.08
Final Test: 70.43
Run: 02, Epoch: 50, Loss: 1.1462, Train: 68.83%, Valid: 68.69% Test: 68.50%
Run: 02, Epoch: 100, Loss: 1.0583, Train: 71.17%, Valid: 69.54% Test: 68.06%
Run: 02, Epoch: 150, Loss: 1.0013, Train: 71.98%, Valid: 69.71% Test: 68.06%
Run: 02, Epoch: 200, Loss: 0.9626, Train: 73.23%, Valid: 69.76% Test: 67.79%
Training time/epoch 0.26154680013656617
Run 02:
Highest Train: 73.34
Highest Valid: 70.87
Final Train: 72.56
Final Test: 70.42
Run: 03, Epoch: 50, Loss: 1.1508, Train: 68.93%, Valid: 68.49% Test: 67.14%
Run: 03, Epoch: 100, Loss: 1.0527, Train: 70.90%, Valid: 69.75% Test: 68.77%
Run: 03, Epoch: 150, Loss: 1.0042, Train: 72.54%, Valid: 70.71% Test: 69.36%
Run: 03, Epoch: 200, Loss: 0.9679, Train: 73.13%, Valid: 69.92% Test: 68.05%
Training time/epoch 0.26173179904619853
Run 03:
Highest Train: 73.44
Highest Valid: 71.04
Final Train: 73.06
Final Test: 70.53
Run: 04, Epoch: 50, Loss: 1.1507, Train: 69.02%, Valid: 68.81% Test: 68.09%
Run: 04, Epoch: 100, Loss: 1.0518, Train: 71.30%, Valid: 70.19% Test: 68.78%
Run: 04, Epoch: 150, Loss: 0.9951, Train: 72.05%, Valid: 68.20% Test: 65.38%
Run: 04, Epoch: 200, Loss: 0.9594, Train: 72.98%, Valid: 70.47% Test: 69.26%
Training time/epoch 0.2618525844812393
Run 04:
Highest Train: 73.34
Highest Valid: 70.88
Final Train: 72.86
Final Test: 70.60
Run: 05, Epoch: 50, Loss: 1.1500, Train: 68.82%, Valid: 69.00% Test: 68.47%
Run: 05, Epoch: 100, Loss: 1.0566, Train: 71.13%, Valid: 70.15% Test: 69.47%
Run: 05, Epoch: 150, Loss: 0.9999, Train: 72.48%, Valid: 70.88% Test: 70.27%
Run: 05, Epoch: 200, Loss: 0.9648, Train: 73.37%, Valid: 70.51% Test: 68.96%
Training time/epoch 0.261941517829895
Run 05:
Highest Train: 73.37
Highest Valid: 70.93
Final Train: 72.77
Final Test: 70.24
Run: 06, Epoch: 50, Loss: 1.1495, Train: 69.00%, Valid: 68.76% Test: 67.89%
Run: 06, Epoch: 100, Loss: 1.0541, Train: 71.24%, Valid: 69.74% Test: 68.21%
Run: 06, Epoch: 150, Loss: 0.9947, Train: 71.89%, Valid: 69.81% Test: 69.77%
Run: 06, Epoch: 200, Loss: 0.9579, Train: 73.45%, Valid: 70.50% Test: 69.60%
Training time/epoch 0.2620268513758977
Run 06:
Highest Train: 73.70
Highest Valid: 70.97
Final Train: 73.70
Final Test: 70.12
Run: 07, Epoch: 50, Loss: 1.1544, Train: 68.93%, Valid: 68.81% Test: 67.97%
Run: 07, Epoch: 100, Loss: 1.0562, Train: 71.17%, Valid: 69.79% Test: 68.45%
Run: 07, Epoch: 150, Loss: 1.0016, Train: 72.41%, Valid: 70.65% Test: 69.87%
Run: 07, Epoch: 200, Loss: 0.9627, Train: 73.12%, Valid: 69.97% Test: 68.20%
Training time/epoch 0.2620680228301457
Run 07:
Highest Train: 73.40
Highest Valid: 71.02
Final Train: 73.08
Final Test: 70.49
Run: 08, Epoch: 50, Loss: 1.1508, Train: 68.89%, Valid: 68.42% Test: 67.68%
Run: 08, Epoch: 100, Loss: 1.0536, Train: 71.24%, Valid: 69.24% Test: 67.01%
Run: 08, Epoch: 150, Loss: 1.0015, Train: 72.36%, Valid: 69.57% Test: 67.76%
Run: 08, Epoch: 200, Loss: 0.9593, Train: 73.42%, Valid: 70.86% Test: 70.02%
Training time/epoch 0.2621182435750961
Run 08:
Highest Train: 73.43
Highest Valid: 70.93
Final Train: 73.43
Final Test: 69.92
Run: 09, Epoch: 50, Loss: 1.1457, Train: 69.17%, Valid: 68.83% Test: 67.67%
Run: 09, Epoch: 100, Loss: 1.0496, Train: 71.45%, Valid: 69.86% Test: 68.53%
Run: 09, Epoch: 150, Loss: 0.9941, Train: 72.51%, Valid: 69.38% Test: 67.02%
Run: 09, Epoch: 200, Loss: 0.9587, Train: 73.49%, Valid: 70.35% Test: 68.59%
Training time/epoch 0.2621259101231893
Run 09:
Highest Train: 73.64
Highest Valid: 70.97
Final Train: 73.22
Final Test: 70.46
Run: 10, Epoch: 50, Loss: 1.1437, Train: 69.16%, Valid: 68.43% Test: 67.17%
Run: 10, Epoch: 100, Loss: 1.0473, Train: 71.43%, Valid: 70.33% Test: 69.29%
Run: 10, Epoch: 150, Loss: 0.9936, Train: 71.98%, Valid: 67.93% Test: 65.06%
Run: 10, Epoch: 200, Loss: 0.9583, Train: 72.93%, Valid: 68.05% Test: 65.43%
Training time/epoch 0.26213142466545103
Run 10:
Highest Train: 73.44
Highest Valid: 70.93
Final Train: 73.44
Final Test: 70.26
All runs:
Highest Train: 73.46 ± 0.12
Highest Valid: 70.97 ± 0.09
Final Train: 73.12 ± 0.34
Final Test: 70.35 ± 0.21
3. Conclusion
OGB is still in its early days; the first major version was released on May 4, and it is expected to grow to datasets with tens of millions of nodes. A diverse and unified benchmark like OGB is an important step for GNN research. Hopefully it will lead to leaderboards similar to those in NLP and CV, so that papers no longer have to run experiments only on toy datasets such as Cora and CiteSeer.
4. Reference
- Hu, Weihua et al. "Open Graph Benchmark: Datasets for Machine Learning on Graphs." arXiv preprint arXiv:2005.00687 (2020).
- Open Graph Benchmark
- GitHub: snap-stanford/ogb
- GitHub: dmlc/dgl
- Presentation and Discussion: Open Graph Benchmark