

Introducing the ArangoDB-DGL Adapter
source link: https://www.arangodb.com/2022/01/introducing-the-arangodb-dgl-adapter/

ArangoDB DGL Adapter Getting Started Guide
Version: 1.0.2
Objective: Export graphs from ArangoDB, a multi-model graph database, to Deep Graph Library (DGL), a Python package for graph neural networks, and vice versa.
Setup

%%capture
!git clone -b oasis_connector --single-branch https://github.com/arangodb/interactive_tutorials.git
!git clone -b 1.0.2 --single-branch https://github.com/arangoml/dgl-adapter.git
!rsync -av interactive_tutorials/ ./ --exclude=.git
!pip3 install adbdgl_adapter==1.0.2
!pip3 install matplotlib
!pip3 install pyArango
!pip3 install networkx ## For drawing purposes

import json
import oasis
import matplotlib.pyplot as plt

import dgl
import torch
import networkx as nx
from dgl import remove_self_loop
from dgl.data import KarateClubDataset
from dgl.data import MiniGCDataset

from adbdgl_adapter.adapter import ADBDGL_Adapter
from adbdgl_adapter.controller import ADBDGL_Controller
from adbdgl_adapter.typings import Json, ArangoMetagraph, DGLCanonicalEType, DGLDataDict

DGL backend not selected or invalid. Assuming PyTorch for now.
Setting the default backend to "pytorch". You can change it in the ~/.dgl/config.json file or export the DGLBACKEND environment variable. Valid options are: pytorch, mxnet, tensorflow (all lowercase)
Using backend: pytorch
Understanding DGL
(referenced from docs.dgl.ai)
Deep Graph Library (DGL) is a Python package built for easy implementation of the graph neural network model family on top of existing deep learning frameworks (currently PyTorch, MXNet, and TensorFlow).
DGL represents a directed graph as a DGLGraph object. You can construct a graph by specifying the number of nodes in the graph as well as the lists of source and destination nodes. Nodes in the graph have consecutive IDs starting from 0.
The following code constructs a directed "star" homogeneous graph with 6 nodes and 5 edges.

# A homogeneous graph with 6 nodes and 5 edges
g = dgl.graph(([0, 0, 0, 0, 0], [1, 2, 3, 4, 5]))
print(g)

# Print the graph's canonical edge types
print("\nCanonical Edge Types: ", g.canonical_etypes)
# [('_N', '_E', '_N')]
# '_N' being the only Node type
# '_E' being the only Edge type

Graph(num_nodes=6, num_edges=5,
      ndata_schemes={}
      edata_schemes={})

Canonical Edge Types:  [('_N', '_E', '_N')]

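To make the source/destination encoding concrete, here is a minimal pure-Python sketch (no DGL required) of what the two lists describe. The helper name summarize_edge_lists is hypothetical and not part of DGL; it only mirrors the "consecutive IDs starting from 0" rule described above.

```python
from collections import Counter


def summarize_edge_lists(src, dst):
    """Pair up source/destination IDs and infer basic graph statistics."""
    if len(src) != len(dst):
        raise ValueError("src and dst must have the same length")
    edges = list(zip(src, dst))
    # Node IDs are consecutive and start at 0, so the node count is max ID + 1.
    num_nodes = max(src + dst) + 1 if edges else 0
    # Number of outgoing edges per source node.
    out_degree = Counter(src)
    return edges, num_nodes, out_degree


# Same "star" graph as above: node 0 points at nodes 1..5.
edges, num_nodes, out_degree = summarize_edge_lists([0, 0, 0, 0, 0], [1, 2, 3, 4, 5])
print(edges)
print(num_nodes, out_degree[0])
```

Edge i runs from src[i] to dst[i], which is why both lists must have the same length.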
In DGL, a heterogeneous graph (heterograph for short) is specified with a series of graphs as below, one per relation. Each relation is a string triplet (source node type, edge type, destination node type). Since relations disambiguate the edge types, DGL calls them canonical edge types:

# A heterogeneous graph with 8 nodes and 7 edges
g = dgl.heterograph({
    ('user', 'follows', 'user'): (torch.tensor([0, 1]), torch.tensor([1, 2])),
    ('user', 'follows', 'game'): (torch.tensor([0, 1, 2]), torch.tensor([1, 2, 3])),
    ('user', 'plays', 'game'): (torch.tensor([1, 3]), torch.tensor([2, 3]))
})

print(g)
print("\nCanonical Edge Types: ", g.canonical_etypes)
print("\nNode Types: ", g.ntypes)
print("\nEdge Types: ", g.etypes)

Graph(num_nodes={'game': 4, 'user': 4},
      num_edges={('user', 'follows', 'game'): 3, ('user', 'follows', 'user'): 2, ('user', 'plays', 'game'): 2},
      metagraph=[('user', 'game', 'follows'), ('user', 'game', 'plays'), ('user', 'user', 'follows')])

Canonical Edge Types:  [('user', 'follows', 'game'), ('user', 'follows', 'user'), ('user', 'plays', 'game')]

Node Types:  ['game', 'user']

Edge Types:  ['follows', 'follows', 'plays']

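The reason DGL needs the full triplet can be seen without DGL at all. The following pure-Python sketch (the function summarize_relations is a hypothetical helper, not a DGL API) derives the per-type node counts and the edge type names from the same relations, using plain lists in place of tensors.

```python
relations = {
    ("user", "follows", "user"): ([0, 1], [1, 2]),
    ("user", "follows", "game"): ([0, 1, 2], [1, 2, 3]),
    ("user", "plays", "game"): ([1, 3], [2, 3]),
}


def summarize_relations(relations):
    """Derive per-type node counts and edge type names from canonical edge types."""
    max_id = {}
    for (src_type, _, dst_type), (src, dst) in relations.items():
        # Each node type has its own ID space, so track the largest ID per type.
        for ntype, ids in ((src_type, src), (dst_type, dst)):
            max_id[ntype] = max([max_id.get(ntype, -1), *ids])
    num_nodes = {ntype: top + 1 for ntype, top in sorted(max_id.items())}
    canonical = sorted(relations)
    etypes = [etype for _, etype, _ in canonical]
    return num_nodes, canonical, etypes


num_nodes, canonical, etypes = summarize_relations(relations)
print(num_nodes)
print(etypes)
```

The edge type name 'follows' appears twice in etypes, so only the canonical triplet identifies a relation unambiguously.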
Graph data often contains attributes on nodes and edges. Although node and edge attributes can be of arbitrary types in the real world, DGLGraph only accepts attributes stored in tensors (with numerical contents). Consequently, an attribute must have the same shape for all nodes (or all edges). In the context of deep learning, those attributes are often called features.
You can assign and retrieve node and edge features via the ndata and edata interfaces.

# A homogeneous graph with 6 nodes and 5 edges
g = dgl.graph(([0, 0, 0, 0, 0], [1, 2, 3, 4, 5]))

# Assign an integer value for each node.
g.ndata['x'] = torch.tensor([151, 124, 41, 89, 76, 55])
# Assign a 4-dimensional edge feature vector for each edge.
g.edata['a'] = torch.randn(5, 4)

print(g)
print("\nNode Data X attribute: ", g.ndata['x'])
print("\nEdge Data A attribute: ", g.edata['a'])

# NOTE: The following ndata assignment will fail, since it supplies only 5 values for 6 nodes
# g.ndata['bad_attribute'] = torch.tensor([0,10,20,30,40])

Graph(num_nodes=6, num_edges=5,
      ndata_schemes={'x': Scheme(shape=(), dtype=torch.int64)}
      edata_schemes={'a': Scheme(shape=(4,), dtype=torch.float32)})

Node Data X attribute:  tensor([151, 124, 41, 89, 76, 55])

Edge Data A attribute:  tensor([[-0.9712, 0.3131, -1.7787, -0.4953],
        [ 1.5366, -0.8591, -1.4719, 0.5857],
        [-0.5803, 0.6757, 0.9276, -0.9756],
        [ 0.4396, 1.0612, 0.0943, 0.6856],
        [-0.8685, -1.3693, -0.1184, -1.0903]])

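The "one row per node, same shape for every row" rule can be checked up front before data ever reaches DGL. This is a minimal sketch using nested Python lists in place of tensors; check_feature is a hypothetical helper, not part of DGL or the adapter.

```python
def check_feature(num_items, values):
    """Return the per-item shape if `values` is a valid feature column, else raise."""
    if len(values) != num_items:
        raise ValueError(f"expected {num_items} rows, got {len(values)}")

    def shape_of(value):
        # Scalars have shape (); each level of list nesting adds one dimension.
        if isinstance(value, (list, tuple)):
            inner = {shape_of(v) for v in value}
            if len(inner) > 1:
                raise ValueError("ragged feature value")
            return (len(value),) + (inner.pop() if inner else ())
        return ()

    shapes = {shape_of(v) for v in values}
    if len(shapes) != 1:
        raise ValueError(f"inconsistent shapes: {shapes}")
    return shapes.pop()


# One scalar per node (6 nodes) and one 4-vector per edge (5 edges), as above.
print(check_feature(6, [151, 124, 41, 89, 76, 55]))
print(check_feature(5, [[0.0, 0.0, 0.0, 0.0]] * 5))
```

Calling check_feature(6, [0, 10, 20, 30, 40]) raises a ValueError, which corresponds to the failing bad_attribute assignment in the cell above.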
When multiple node/edge types are introduced, users need to specify the particular node/edge type when invoking a DGLGraph API for type-specific information. In addition, nodes/edges of different types have separate IDs.

g = dgl.heterograph({
    ('user', 'follows', 'user'): (torch.tensor([0, 1]), torch.tensor([1, 2])),
    ('user', 'follows', 'game'): (torch.tensor([0, 1, 2]), torch.tensor([1, 2, 3])),
    ('user', 'plays', 'game'): (torch.tensor([1, 3]), torch.tensor([2, 3]))
})

# Get the number of all nodes in the graph
print("All nodes: ", g.num_nodes())

# Get the number of user nodes
print("User nodes: ", g.num_nodes('user'))

# Nodes of different types have separate IDs,
# hence not well-defined without a type specified
# print(g.nodes())
# DGLError: Node type name must be specified if there are more than one node types.

print(g.nodes('user'))

All nodes:  8
User nodes:  4
tensor([0, 1, 2, 3])

To set/get features for a specific node/edge type, DGL provides two new types of syntax: g.nodes['node_type'].data['feat_name'] and g.edges['edge_type'].data['feat_name'].
Note: If the graph only has one node/edge type, there is no need to specify the node/edge type.

g = dgl.heterograph({
    ('user', 'follows', 'user'): (torch.tensor([0, 1]), torch.tensor([1, 2])),
    ('user', 'follows', 'game'): (torch.tensor([0, 1, 2]), torch.tensor([1, 2, 3])),
    ('user', 'plays', 'game'): (torch.tensor([1, 3]), torch.tensor([2, 3]))
})

g.nodes['user'].data['age'] = torch.tensor([21, 16, 38, 64])
# An alternative (yet equivalent) syntax:
# g.ndata['age'] = {'user': torch.tensor([21, 16, 38, 64])}

print(g.ndata)

defaultdict(<class 'dict'>, {'age': {'user': tensor([21, 16, 38, 64])}})

For more info, visit https://docs.dgl.ai/en/0.6.x/.
Create a Temporary ArangoDB Instance

# Request temporary instance from the managed ArangoDB Cloud Oasis.
con = oasis.getTempCredentials()

# Connect to the db via the python-arango driver
db = oasis.connect_python_arango(con)

print('\n--------------------')
print("https://{}:{}".format(con["hostname"], con["port"]))
print("Username: " + con["username"])
print("Password: " + con["password"])
print("Database: " + con["dbName"])
print('--------------------\n')

Requesting new temp credentials.
Temp database ready to use.

--------------------
https://tutorials.arangodb.cloud:8529
Username: TUT487i8kal98gb73c2iklds
Password: TUTn5t85w8t50kcupmo2mmyb
Database: TUTn187e39v9qho3768ilyk4
--------------------

Feel free to use the above URL to check out the UI!
Data Import
For demo purposes, we will be using the ArangoDB Fraud Detection example graph.

%%capture
!chmod -R 755 ./tools
!./tools/arangorestore -c none --server.endpoint http+ssl://{con["hostname"]}:{con["port"]} --server.username {con["username"]} --server.database {con["dbName"]} --server.password {con["password"]} --replication-factor 3 --input-directory "dgl-adapter/examples/data/fraud_dump" --include-system-collections true

Instantiate the Adapter
Connect the ArangoDB-DGL Adapter to our temporary ArangoDB cluster:
adbdgl_adapter = ADBDGL_Adapter(con)
Connecting to https://tutorials.arangodb.cloud:8529
ArangoDB to DGL
Via ArangoDB Graph
Data source
- ArangoDB Fraud-Detection Graph
Package methods used
- adbdgl_adapter.arangodb_graph_to_dgl()
Important notes
- The name parameter in this case must point to an existing ArangoDB graph in your ArangoDB instance.

# Define graph name
graph_name = "fraud-detection"

# Create DGL graph from ArangoDB graph
dgl_g = adbdgl_adapter.arangodb_graph_to_dgl(graph_name)

# You can also provide valid Python-Arango AQL query options to the command above, like such:
# dgl_g = adbdgl_adapter.arangodb_graph_to_dgl(graph_name, ttl=1000, stream=True)
# See more here: https://docs.python-arango.com/en/main/specs.html#arango.aql.AQL.execute

# Show graph data
print('\n--------------------')
print(dgl_g)
print(dgl_g.ntypes)
print(dgl_g.etypes)

DGL: fraud-detection created

--------------------
Graph(num_nodes={'account': 54, 'customer': 17},
      num_edges={('account', 'accountHolder', 'customer'): 54, ('account', 'transaction', 'account'): 62},
      metagraph=[('account', 'customer', 'accountHolder'), ('account', 'account', 'transaction')])
['account', 'customer']
['accountHolder', 'transaction']

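The output above has one canonical edge type per (from collection, edge collection, to collection) combination in the ArangoDB graph. The sketch below shows that correspondence in plain Python. It assumes edge definitions shaped like the dictionaries python-arango returns for a graph (keys edge_collection, from_vertex_collections, to_vertex_collections); the sample definitions are inferred from the printed result rather than read from the database, and canonical_etypes is a hypothetical helper, not the adapter's implementation.

```python
from itertools import product


def canonical_etypes(edge_definitions):
    """Expand ArangoDB-style edge definitions into (src, edge, dst) triplets."""
    triplets = []
    for definition in edge_definitions:
        pairs = product(
            definition["from_vertex_collections"],
            definition["to_vertex_collections"],
        )
        for src, dst in pairs:
            triplets.append((src, definition["edge_collection"], dst))
    return sorted(triplets)


# Assumed edge definitions for the fraud-detection graph (inferred from the output).
fraud_edge_definitions = [
    {
        "edge_collection": "accountHolder",
        "from_vertex_collections": ["account"],
        "to_vertex_collections": ["customer"],
    },
    {
        "edge_collection": "transaction",
        "from_vertex_collections": ["account"],
        "to_vertex_collections": ["account"],
    },
]

print(canonical_etypes(fraud_edge_definitions))
```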
Via ArangoDB Collections
Data source
- ArangoDB Fraud-Detection Collections
Package methods used
- adbdgl_adapter.arangodb_collections_to_dgl()
Important notes
- The name parameter in this case is simply for naming your DGL graph.
- The vertex_collections & edge_collections parameters must point to existing ArangoDB collections within your ArangoDB instance.

# Define collections
vertex_collections = {"account", "Class", "customer"}
edge_collections = {"accountHolder", "Relationship", "transaction"}

# Create DGL graph from ArangoDB collections
dgl_g = adbdgl_adapter.arangodb_collections_to_dgl("fraud-detection", vertex_collections, edge_collections)

# You can also provide valid Python-Arango AQL query options to the command above, like such:
# dgl_g = adbdgl_adapter.arangodb_collections_to_dgl("fraud-detection", vertex_collections, edge_collections, ttl=1000, stream=True)
# See more here: https://docs.python-arango.com/en/main/specs.html#arango.aql.AQL.execute

# Show graph data
print('\n--------------------')
print(dgl_g)
print(dgl_g.ntypes)
print(dgl_g.etypes)

DGL: fraud-detection created

--------------------
Graph(num_nodes={'Class': 4, 'account': 54, 'customer': 17},
      num_edges={('Class', 'Relationship', 'Class'): 4, ('account', 'accountHolder', 'customer'): 54, ('account', 'transaction', 'account'): 62},
      metagraph=[('Class', 'Class', 'Relationship'), ('account', 'customer', 'accountHolder'), ('account', 'account', 'transaction')])
['Class', 'account', 'customer']
['Relationship', 'accountHolder', 'transaction']

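Since DGL node IDs are consecutive integers per node type while ArangoDB documents are identified by string _id values of the form collection/key, any conversion has to translate between the two. The following is a simplified sketch of one way to do that, not the adapter's actual code; the helper names and the sample documents are made up for illustration.

```python
def build_id_maps(vertices_by_collection):
    """Assign each ArangoDB _id a consecutive integer ID within its own collection."""
    return {
        col: {doc["_id"]: i for i, doc in enumerate(docs)}
        for col, docs in vertices_by_collection.items()
    }


def edge_endpoints(edge_docs, id_maps):
    """Translate _from/_to document IDs into source/destination integer lists."""
    src, dst = [], []
    for edge in edge_docs:
        # The collection name is the part of the _id before the slash.
        from_col = edge["_from"].split("/")[0]
        to_col = edge["_to"].split("/")[0]
        src.append(id_maps[from_col][edge["_from"]])
        dst.append(id_maps[to_col][edge["_to"]])
    return src, dst


# Hypothetical sample documents, for illustration only.
vertices = {
    "account": [{"_id": "account/a1"}, {"_id": "account/a2"}],
    "customer": [{"_id": "customer/c1"}],
}
holder_edges = [
    {"_from": "account/a1", "_to": "customer/c1"},
    {"_from": "account/a2", "_to": "customer/c1"},
]

id_maps = build_id_maps(vertices)
src, dst = edge_endpoints(holder_edges, id_maps)
print(src, dst)
```

The two resulting lists have the same (source IDs, destination IDs) shape that dgl.heterograph expects for each canonical edge type.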
Via ArangoDB Metagraph
Data source
- ArangoDB Fraud-Detection Collections
Package methods used
- adbdgl_adapter.arangodb_to_dgl()
Important notes
- The name parameter in this case is simply for naming your DGL graph.
- The metagraph parameter should contain collections & associated document attribute names that exist within your ArangoDB instance.

# Define Metagraph
fraud_detection_metagraph = {
    "vertexCollections": {
        "account": {"rank", "Balance", "customer_id"},
        "Class": {"concrete"},
        "customer": {"rank"},
    },
    "edgeCollections": {
        "accountHolder": {},
        "Relationship": {},
        "transaction": {"receiver_bank_id", "sender_bank_id", "transaction_amt"},
    },
}

# Create DGL Graph from attributes
dgl_g = adbdgl_adapter.arangodb_to_dgl('FraudDetection', fraud_detection_metagraph)

# You can also provide valid Python-Arango AQL query options to the command above, like such:
# dgl_g = adbdgl_adapter.arangodb_to_dgl('FraudDetection', fraud_detection_metagraph, ttl=1000, stream=True)
# See more here: https://docs.python-arango.com/en/main/specs.html#arango.aql.AQL.execute

# Show graph data
print('\n--------------')
print(dgl_g)
print('\n--------------')
print(dgl_g.ndata)
print('--------------\n')
print(dgl_g.edata)

DGL: FraudDetection created -------------- Graph(num_nodes={'Class': 4, 'account': 54, 'customer': 17}, num_edges={('Class', 'Relationship', 'Class'): 4, ('account', 'accountHolder', 'customer'): 54, ('account', 'transaction', 'account'): 62}, metagraph=[('Class', 'Class', 'Relationship'), ('account', 'customer', 'accountHolder'), ('account', 'account', 'transaction')]) -------------- defaultdict(<class 'dict'>, {'concrete': {'Class': tensor([True, True, True, True])}, 'Balance': {'account': tensor([5331, 7630, 1433, 2201, 4837, 5817, 1689, 1042, 4104, 10, 2338, 10, 3779, 0, 529, 0, 1992, 2912, 6367, 1819, 0, 221, 5062, 2372, 841, 5393, 1138, 8414, 4064, 5686, 6294, 6540, 7358, 3452, 0, 3993, 10, 0, 471, 8148, 5832, 1758, 1747, 1679, 6789, 1599, 8320, 0, 10, 8626, 7199, 8644, 3879, 10])}, 'customer_id': {'account': tensor([10000009, 10000004, 10000004, 10000010, 10000002, 10000011, 10000015, 10000006, 10000010, 10810, 10000002, 10000014, 10000008, 0, 10000002, 0, 10000008, 10000006, 10000012, 10000015, 10000001, 10000010, 10000015, 10000005, 10000009, 10000008, 10000011, 10000014, 10000010, 10000006, 10000002, 10000007, 10000006, 10000005, 0, 10000010, 10810, 0, 10000009, 10000006, 10000002, 10000005, 10000009, 10000012, 10000007, 10000002, 10000014, 0, 10810, 10000016, 10000006, 10000016, 10000013, 10810])}, 'rank': {'account': tensor([0.0021, 0.0031, 0.0052, 0.0021, 0.0046, 0.0037, 0.0032, 0.0042, 0.0021, 0.0021, 0.0030, 0.0037, 0.0040, 0.0037, 0.0021, 0.0046, 0.0040, 0.0030, 0.0026, 0.0032, 0.0021, 0.0034, 0.0032, 0.0021, 0.0021, 0.0035, 0.0026, 0.0026, 0.0046, 0.0021, 0.0021, 0.0035, 0.0036, 0.0036, 0.0038, 0.0055, 0.0021, 0.0041, 0.0044, 0.0021, 0.0030, 0.0035, 0.0033, 0.0026, 0.0071, 0.0036, 0.0032, 0.0059, 0.0021, 0.0090, 0.0057, 0.0032, 0.0026, 0.0021]), 'customer': tensor([0.0135, 0.0050, 0.0062, 0.0066, 0.0096, 0.0088, 0.0089, 0.0047, 0.0066, 0.0045, 0.0062, 0.0103, 0.0081, 0.0039, 0.0054, 0.0044, 0.0093])}}) -------------- defaultdict(<class 'dict'>, 
{'sender_bank_id': {('account', 'transaction', 'account'): tensor([10000000003, 10000000002, 10000000001, 10000000001, 10000000002, 10000000003, 10000000003, 10000000002, 10000000002, 10000000003, 10000000001, 10000000001, 0, 10000000003, 10000000003, 0, 10000000002, 0, 10000000001, 10000000003, 10000000001, 10000000003, 10000000002, 0, 10000000003, 10000000003, 10000000003, 10000000003, 10000000001, 10000000001, 10000000002, 10000000001, 10000000003, 10000000003, 10000000001, 10000000001, 0, 10000000003, 10000000002, 10000000001, 10000000002, 10000000003, 10000000003, 10000000003, 10000000002, 10000000003, 10000000002, 10000000003, 10000000002, 10000000001, 10000000001, 0, 10000000003, 10000000003, 0, 10000000003, 10000000003, 10000000001, 10000000001, 10000000003, 10000000003, 10000000002])}, 'receiver_bank_id': {('account', 'transaction', 'account'): tensor([10000000003, 10000000003, 10000000001, 10000000002, 10000000002, 10000000003, 10000000001, 10000000003, 10000000001, 10000000003, 10000000002, 10000000003, 0, 10000000003, 10000000003, 0, 10000000001, 0, 10000000002, 10000000003, 10000000003, 10000000003, 10000000001, 0, 10000000003, 10000000002, 10000000003, 10000000003, 10000000001, 10000000001, 10000000003, 10000000003, 10000000003, 10000000003, 10000000001, 10000000002, 0, 10000000001, 10000000001, 10000000002, 10000000001, 10000000003, 10000000003, 10000000003, 10000000001, 10000000003, 10000000002, 10000000003, 10000000002, 10000000001, 10000000003, 0, 10000000003, 10000000003, 0, 10000000003, 10000000002, 10000000002, 10000000001, 10000000003, 10000000003, 10000000003])}, 'transaction_amt': {('account', 'transaction', 'account'): tensor([9000, 299, 498, 954, 756, 627, 142, 946, 920, 9000, 421, 343, 9000, 457, 9000, 9000, 53, 9000, 284, 120, 441, 9000, 364, 901, 9000, 279, 9000, 9000, 273, 127, 952, 354, 795, 9000, 835, 761, 9000, 478, 172, 804, 665, 995, 9000, 9000, 670, 9000, 340, 9000, 747, 347, 52, 911, 762, 9000, 0, 790, 619, 491, 954, 9000, 9000, 
843])}})
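The ndata and edata printouts above are keyed first by attribute name and then by collection (or canonical edge type). The sketch below reproduces that nesting from a metagraph section and a handful of documents in plain Python. It is only an illustration of the layout: collect_features is a hypothetical helper, the documents are a made-up two-row sample, and lists stand in for tensors.

```python
def collect_features(metagraph_section, documents_by_collection):
    """Gather the attributes named in the metagraph into per-attribute columns."""
    features = {}
    for col, attributes in metagraph_section.items():
        # `attributes` may be a set of names or an empty dict, as in the metagraph above.
        for attr in sorted(attributes):
            column = [doc[attr] for doc in documents_by_collection[col]]
            features.setdefault(attr, {})[col] = column
    return features


metagraph_vertices = {
    "account": {"rank", "Balance"},
    "customer": {"rank"},
    "Class": {},
}
# Hypothetical sample documents, for illustration only.
documents = {
    "account": [{"rank": 0.0021, "Balance": 5331}, {"rank": 0.0031, "Balance": 7630}],
    "customer": [{"rank": 0.0135}],
    "Class": [{"name": "Bank"}],
}

features = collect_features(metagraph_vertices, documents)
print(features)
```

An attribute such as rank that is requested for two collections ends up with one column per collection, which matches the 'rank' entry in the ndata output.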
Via ArangoDB Metagraph with a custom controller
Data source
- ArangoDB Fraud-Detection Collections
Package methods used
- adbdgl_adapter.arangodb_to_dgl()
Important notes
- The name parameter in this case is simply for naming your DGL graph.
- The metagraph parameter should contain collections & associated document attribute names that exist within your ArangoDB instance.
- We are creating a custom ADBDGL_Controller to specify how to convert our ArangoDB vertex/edge attributes into DGL node/edge features. The default ADBDGL_Controller is defined in the adbdgl_adapter.controller module.

# Define Metagraph
fraud_detection_metagraph = {
    "vertexCollections": {
        "account": {"rank"},
        "Class": {"concrete", "name"},
        "customer": {"Sex", "Ssn", "rank"},
    },
    "edgeCollections": {
        "accountHolder": {},
        "Relationship": {},
        "transaction": {"receiver_bank_id", "sender_bank_id", "transaction_amt", "transaction_date", "trans_time"},
    },
}

# When converting to DGL via an ArangoDB Metagraph that contains non-numerical values, a user-defined
# Controller class is required to specify how ArangoDB attributes should be converted to DGL features.
class FraudDetection_ADBDGL_Controller(ADBDGL_Controller):
    """ArangoDB-DGL controller.

    Responsible for controlling how ArangoDB attributes
    are converted into DGL features, and vice-versa.

    You can derive your own custom ADBDGL_Controller if you want to maintain
    consistency between your ArangoDB attributes & your DGL features.
    """

    def _adb_attribute_to_dgl_feature(self, key: str, col: str, val):
        """
        Given an ArangoDB attribute key, its assigned value (for an arbitrary document),
        and the collection it belongs to, convert it to a valid
        DGL feature: https://docs.dgl.ai/en/0.6.x/guide/graph-feature.html.

        NOTE: You must override this function if you want to transfer non-numerical
        ArangoDB attributes to DGL (DGL only accepts 'attributes' (a.k.a features)
        of numerical types). Read more about DGL features here:
        https://docs.dgl.ai/en/0.6.x/new-tutorial/2_dglgraph.html#assigning-node-and-edge-features-to-graph.

        :param key: The ArangoDB attribute key name
        :type key: str
        :param col: The ArangoDB collection of the ArangoDB document.
        :type col: str
        :param val: The assigned attribute value of the ArangoDB document.
        :type val: Any
        :return: The attribute's representation as a DGL Feature
        :rtype: Any
        """
        try:
            if col == "transaction":
                if key == "transaction_date":
                    return int(str(val).replace("-", ""))

                if key == "trans_time":
                    return int(str(val).replace(":", ""))

            if col == "customer":
                if key == "Sex":
                    return 0 if val == "M" else 1

                if key == "Ssn":
                    return int(str(val).replace("-", ""))

            if col == "Class":
                if key == "name":
                    if val == "Bank":
                        return 0
                    elif val == "Branch":
                        return 1
                    elif val == "Account":
                        return 2
                    elif val == "Customer":
                        return 3
                    else:
                        return -1

        except (ValueError, TypeError, SyntaxError):
            # Fall back to 0 for values that cannot be parsed (e.g. missing attributes)
            return 0

        return super()._adb_attribute_to_dgl_feature(key, col, val)

fraud_adbdgl_adapter = ADBDGL_Adapter(con, FraudDetection_ADBDGL_Controller())

# Create DGL Graph from attributes
dgl_g = fraud_adbdgl_adapter.arangodb_to_dgl('FraudDetection', fraud_detection_metagraph)

# You can also provide valid Python-Arango AQL query options to the command above, like such:
# dgl_g = fraud_adbdgl_adapter.arangodb_to_dgl('FraudDetection', fraud_detection_metagraph, ttl=1000, stream=True)
# See more here: https://docs.python-arango.com/en/main/specs.html#arango.aql.AQL.execute

# Show graph data
print('\n--------------')
print(dgl_g)
print('\n--------------')
print(dgl_g.ndata)
print('--------------\n')
print(dgl_g.edata)

Connecting to https://tutorials.arangodb.cloud:8529 DGL: FraudDetection created -------------- Graph(num_nodes={'Class': 4, 'account': 54, 'customer': 17}, num_edges={('Class', 'Relationship', 'Class'): 4, ('account', 'accountHolder', 'customer'): 54, ('account', 'transaction', 'account'): 62}, metagraph=[('Class', 'Class', 'Relationship'), ('account', 'customer', 'accountHolder'), ('account', 'account', 'transaction')]) -------------- defaultdict(<class 'dict'>, {'concrete': {'Class': tensor([True, True, True, True])}, 'name': {'Class': tensor([0, 1, 2, 3])}, 'rank': {'account': tensor([0.0021, 0.0031, 0.0052, 0.0021, 0.0046, 0.0037, 0.0032, 0.0042, 0.0021, 0.0021, 0.0030, 0.0037, 0.0040, 0.0037, 0.0021, 0.0046, 0.0040, 0.0030, 0.0026, 0.0032, 0.0021, 0.0034, 0.0032, 0.0021, 0.0021, 0.0035, 0.0026, 0.0026, 0.0046, 0.0021, 0.0021, 0.0035, 0.0036, 0.0036, 0.0038, 0.0055, 0.0021, 0.0041, 0.0044, 0.0021, 0.0030, 0.0035, 0.0033, 0.0026, 0.0071, 0.0036, 0.0032, 0.0059, 0.0021, 0.0090, 0.0057, 0.0032, 0.0026, 0.0021]), 'customer': tensor([0.0135, 0.0050, 0.0062, 0.0066, 0.0096, 0.0088, 0.0089, 0.0047, 0.0066, 0.0045, 0.0062, 0.0103, 0.0081, 0.0039, 0.0054, 0.0044, 0.0093])}, 'Ssn': {'customer': tensor([123456786, 123456780, 123456780, 123456787, 123456780, 123456789, 123456780, 123456785, 123456783, 123456784, 123456780, 123456788, 123456782, 123456781, 123456780, 123456780, 111223333])}, 'Sex': {'customer': tensor([1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1])}}) -------------- defaultdict(<class 'dict'>, {'sender_bank_id': {('account', 'transaction', 'account'): tensor([10000000003, 10000000002, 10000000001, 10000000001, 10000000002, 10000000003, 10000000003, 10000000002, 10000000002, 10000000003, 10000000001, 10000000001, 0, 10000000003, 10000000003, 0, 10000000002, 0, 10000000001, 10000000003, 10000000001, 10000000003, 10000000002, 0, 10000000003, 10000000003, 10000000003, 10000000003, 10000000001, 10000000001, 10000000002, 10000000001, 10000000003, 
10000000003, 10000000001, 10000000001, 0, 10000000003, 10000000002, 10000000001, 10000000002, 10000000003, 10000000003, 10000000003, 10000000002, 10000000003, 10000000002, 10000000003, 10000000002, 10000000001, 10000000001, 0, 10000000003, 10000000003, 0, 10000000003, 10000000003, 10000000001, 10000000001, 10000000003, 10000000003, 10000000002])}, 'receiver_bank_id': {('account', 'transaction', 'account'): tensor([10000000003, 10000000003, 10000000001, 10000000002, 10000000002, 10000000003, 10000000001, 10000000003, 10000000001, 10000000003, 10000000002, 10000000003, 0, 10000000003, 10000000003, 0, 10000000001, 0, 10000000002, 10000000003, 10000000003, 10000000003, 10000000001, 0, 10000000003, 10000000002, 10000000003, 10000000003, 10000000001, 10000000001, 10000000003, 10000000003, 10000000003, 10000000003, 10000000001, 10000000002, 0, 10000000001, 10000000001, 10000000002, 10000000001, 10000000003, 10000000003, 10000000003, 10000000001, 10000000003, 10000000002, 10000000003, 10000000002, 10000000001, 10000000003, 0, 10000000003, 10000000003, 0, 10000000003, 10000000002, 10000000002, 10000000001, 10000000003, 10000000003, 10000000003])}, 'transaction_date': {('account', 'transaction', 'account'): tensor([ 201966, 201721, 2017528, 2018924, 2017516, 2018128, 2019213, 201847, 2017914, 201966, 2017810, 20181020, 0, 2017724, 201966, 0, 2019311, 0, 2018211, 2018125, 201932, 201966, 201795, 0, 201966, 2017111, 201966, 201966, 2019822, 2017317, 2019124, 2017121, 2017110, 201966, 2017717, 20181012, 0, 20181023, 2019724, 2019611, 2019928, 2019117, 201966, 201966, 2017328, 201966, 2019316, 201966, 2017914, 2017521, 201713, 0, 2018124, 201966, 0, 201784, 201713, 20171212, 2019413, 201966, 201966, 201887])}, 'trans_time': {('account', 'transaction', 'account'): tensor([1136, 1516, 1340, 1030, 1552, 1116, 1450, 924, 1046, 1426, 1247, 1459, 0, 1459, 1258, 0, 1758, 0, 1230, 1210, 1252, 1039, 1741, 0, 1420, 1713, 1710, 1028, 1636, 1054, 1658, 1332, 1316, 955, 1629, 1642, 0, 1710, 
932, 1652, 1018, 1527, 1555, 1640, 1158, 1035, 1015, 1133, 1320, 1514, 1213, 0, 1133, 1340, 0, 1026, 1312, 1027, 1745, 1342, 1520, 1141])}, 'transaction_amt': {('account', 'transaction', 'account'): tensor([9000, 299, 498, 954, 756, 627, 142, 946, 920, 9000, 421, 343, 9000, 457, 9000, 9000, 53, 9000, 284, 120, 441, 9000, 364, 901, 9000, 279, 9000, 9000, 273, 127, 952, 354, 795, 9000, 835, 761, 9000, 478, 172, 804, 665, 995, 9000, 9000, 670, 9000, 340, 9000, 747, 347, 52, 911, 762, 9000, 0, 790, 619, 491, 954, 9000, 9000, 843])}})
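The transaction_date values in the output above mix integers of different lengths (for example 201966 next to 20181020), which suggests the stored dates are not zero-padded, so the resulting integers do not sort chronologically. If ordering matters for your model, one option is to pad the month and day while converting. This is a minimal sketch under the assumption that dates arrive as 'YYYY-M-D' strings (inferred from the output, not confirmed by the source); date_to_int is a hypothetical helper, not part of the adapter.

```python
def date_to_int(value, default=0):
    """Convert a 'YYYY-M-D' string to the integer YYYYMMDD, or `default` if parsing fails."""
    try:
        year, month, day = (int(part) for part in str(value).split("-"))
    except ValueError:
        # Raised for non-numeric parts and for a wrong number of parts.
        return default
    # No calendar validation is done here; the parts are only combined positionally.
    return year * 10_000 + month * 100 + day


print(date_to_int("2019-6-6"))
print(date_to_int("2018-10-20"))
print(date_to_int(None))
```

Such a function could be called from the transaction_date branch of a custom _adb_attribute_to_dgl_feature in place of the plain replace("-", "") conversion.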
DGL to ArangoDB
Example 1: DGL Karate Graph
Data source
- DGL KarateClubDataset
Package methods used
- adbdgl_adapter.dgl_to_arangodb()
Important notes
- The name parameter in this case is simply for naming your ArangoDB graph.

# Load the dgl graph & draw
dgl_karate_graph = KarateClubDataset()[0]
nx.draw(dgl_karate_graph.to_networkx(), with_labels=True)

# Create the ArangoDB graph
name = "Karate"
db.delete_graph(name, drop_collections=True, ignore_missing=True)
adb_karate_graph = adbdgl_adapter.dgl_to_arangodb(name, dgl_karate_graph)

print('\n--------------------')
print("https://{}:{}".format(con["hostname"], con["port"]))
print("Username: " + con["username"])
print("Password: " + con["password"])
print("Database: " + con["dbName"])
print('--------------------\n')
print(f"Inspect the graph here: https://tutorials.arangodb.cloud:8529/_db/{con['dbName']}/_admin/aardvark/index.html#graph/{name}\n")
print("View the original graph below:")

ArangoDB: Karate created

--------------------
https://tutorials.arangodb.cloud:8529
Username: TUT487i8kal98gb73c2iklds
Password: TUTn5t85w8t50kcupmo2mmyb
Database: TUTn187e39v9qho3768ilyk4
--------------------

Inspect the graph here: https://tutorials.arangodb.cloud:8529/_db/TUTn187e39v9qho3768ilyk4/_admin/aardvark/index.html#graph/Karate

View the original graph below:

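Going in this direction, every DGL edge has to become an ArangoDB edge document whose _from and _to fields reference vertex documents by collection/key. The sketch below illustrates that shape for a homogeneous graph. The collection name (the graph name followed by "_N") and the use of the integer node ID as the document key are assumptions made for illustration, and dgl_edges_to_documents is a hypothetical helper rather than the adapter's code.

```python
def dgl_edges_to_documents(name, src, dst):
    """Build ArangoDB-style edge documents from source/destination node ID lists."""
    vertex_col = f"{name}_N"  # assumed vertex collection name, for illustration only
    return [
        {"_key": str(i), "_from": f"{vertex_col}/{s}", "_to": f"{vertex_col}/{d}"}
        for i, (s, d) in enumerate(zip(src, dst))
    ]


# Three made-up edges standing in for the Karate graph's edge list.
docs = dgl_edges_to_documents("Karate", [0, 0, 1], [1, 2, 2])
for doc in docs:
    print(doc)
```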
Example 2: DGL MiniGCDataset Graphs
Data source
- DGL MiniGCDataset
Package methods used
- adbdgl_adapter.dgl_to_arangodb()
Important notes
- The name parameters in this case are simply for naming your ArangoDB graphs.

# Load the dgl graphs & draw
dgl_lollipop_graph = remove_self_loop(MiniGCDataset(8, 7, 8)[3][0])
plt.figure(1)
nx.draw(dgl_lollipop_graph.to_networkx(), with_labels=True)

dgl_hypercube_graph = remove_self_loop(MiniGCDataset(8, 8, 9)[4][0])
plt.figure(2)
nx.draw(dgl_hypercube_graph.to_networkx(), with_labels=True)

dgl_clique_graph = remove_self_loop(MiniGCDataset(8, 6, 7)[6][0])
plt.figure(3)
nx.draw(dgl_clique_graph.to_networkx(), with_labels=True)

# Create the ArangoDB graphs
lollipop = "Lollipop"
hypercube = "Hypercube"
clique = "Clique"

db.delete_graph(lollipop, drop_collections=True, ignore_missing=True)
db.delete_graph(hypercube, drop_collections=True, ignore_missing=True)
db.delete_graph(clique, drop_collections=True, ignore_missing=True)

adb_lollipop_graph = adbdgl_adapter.dgl_to_arangodb(lollipop, dgl_lollipop_graph)
adb_hypercube_graph = adbdgl_adapter.dgl_to_arangodb(hypercube, dgl_hypercube_graph)
adb_clique_graph = adbdgl_adapter.dgl_to_arangodb(clique, dgl_clique_graph)

print('\n--------------------')
print("https://{}:{}".format(con["hostname"], con["port"]))
print("Username: " + con["username"])
print("Password: " + con["password"])
print("Database: " + con["dbName"])
print('--------------------\n')
print("\nInspect the graphs here:\n")
print(f"1) https://tutorials.arangodb.cloud:8529/_db/{con['dbName']}/_admin/aardvark/index.html#graph/{lollipop}")
print(f"2) https://tutorials.arangodb.cloud:8529/_db/{con['dbName']}/_admin/aardvark/index.html#graph/{hypercube}")
print(f"3) https://tutorials.arangodb.cloud:8529/_db/{con['dbName']}/_admin/aardvark/index.html#graph/{clique}\n")
print("\nView the original graphs below:")

ArangoDB: Lollipop created
ArangoDB: Hypercube created
ArangoDB: Clique created

--------------------
https://tutorials.arangodb.cloud:8529
Username: TUT487i8kal98gb73c2iklds
Password: TUTn5t85w8t50kcupmo2mmyb
Database: TUTn187e39v9qho3768ilyk4
--------------------

Inspect the graphs here:

1) https://tutorials.arangodb.cloud:8529/_db/TUTn187e39v9qho3768ilyk4/_admin/aardvark/index.html#graph/Lollipop
2) https://tutorials.arangodb.cloud:8529/_db/TUTn187e39v9qho3768ilyk4/_admin/aardvark/index.html#graph/Hypercube
3) https://tutorials.arangodb.cloud:8529/_db/TUTn187e39v9qho3768ilyk4/_admin/aardvark/index.html#graph/Clique

View the original graphs below:

Example 3: DGL MiniGCDataset Graphs with a custom controller
Data source
- DGL MiniGCDataset
Package methods used
- adbdgl_adapter.dgl_to_arangodb()
Important notes
- The name parameters in this case are simply for naming your ArangoDB graphs.
- We are creating a custom ADBDGL_Controller to specify how to convert our DGL node/edge features into ArangoDB vertex/edge attributes. The default ADBDGL_Controller is defined in the adbdgl_adapter.controller module.

from torch import Tensor

# Load the dgl graphs
dgl_lollipop_graph = remove_self_loop(MiniGCDataset(8, 7, 8)[3][0])
dgl_hypercube_graph = remove_self_loop(MiniGCDataset(8, 8, 9)[4][0])
dgl_clique_graph = remove_self_loop(MiniGCDataset(8, 6, 7)[6][0])

# Add DGL Node & Edge Features to each graph
dgl_lollipop_graph.ndata["random_ndata"] = torch.tensor(
    [[i, i, i] for i in range(0, dgl_lollipop_graph.num_nodes())]
)
dgl_lollipop_graph.edata["random_edata"] = torch.rand(dgl_lollipop_graph.num_edges())

dgl_hypercube_graph.ndata["random_ndata"] = torch.rand(dgl_hypercube_graph.num_nodes())
dgl_hypercube_graph.edata["random_edata"] = torch.tensor(
    [[[i], [i], [i]] for i in range(0, dgl_hypercube_graph.num_edges())]
)

dgl_clique_graph.ndata['clique_ndata'] = torch.tensor([1,2,3,4,5,6])
dgl_clique_graph.edata['clique_edata'] = torch.tensor(
    [1 if i % 2 == 0 else 0 for i in range(0, dgl_clique_graph.num_edges())]
)

# When converting to ArangoDB from DGL, a user-defined Controller class
# is required to specify how DGL features (aka attributes) should be converted
# into ArangoDB attributes. NOTE: A custom Controller is NOT needed if you want to
# keep the numerical-based values of your DGL features.
class Clique_ADBDGL_Controller(ADBDGL_Controller):
    """ArangoDB-DGL controller.

    Responsible for controlling how ArangoDB attributes
    are converted into DGL features, and vice-versa.

    You can derive your own custom ADBDGL_Controller if you want to maintain
    consistency between your ArangoDB attributes & your DGL features.
    """

    def _dgl_feature_to_adb_attribute(self, key: str, col: str, val: Tensor):
        """
        Given a DGL feature key, its assigned value (for an arbitrary node or edge),
        and the collection it belongs to, convert it to a valid ArangoDB attribute
        (e.g string, list, number, ...).

        NOTE: No action is needed here if you want to keep the numerical-based values
        of your DGL features.

        :param key: The DGL attribute key name
        :type key: str
        :param col: The ArangoDB collection of the (soon-to-be) ArangoDB document.
        :type col: str
        :param val: The assigned attribute value of the DGL node.
        :type val: Tensor
        :return: The feature's representation as an ArangoDB Attribute
        :rtype: Any
        """
        if key == "clique_ndata":
            if val == 1:
                return "one is fun"
            elif val == 2:
                return "two is blue"
            elif val == 3:
                return "three is free"
            elif val == 4:
                return "four is more"
            else:
                # No special string for values 5 & 6
                return f"ERROR! Unrecognized value, got {val}"

        if key == "clique_edata":
            return bool(val)

        return super()._dgl_feature_to_adb_attribute(key, col, val)

# Re-instantiate a new adapter specifically for the Clique Graph Conversion
clique_adbgl_adapter = ADBDGL_Adapter(con, Clique_ADBDGL_Controller())

# Create the ArangoDB graphs
lollipop = "Lollipop_With_Attributes"
hypercube = "Hypercube_With_Attributes"
clique = "Clique_With_Attributes"

db.delete_graph(lollipop, drop_collections=True, ignore_missing=True)
db.delete_graph(hypercube, drop_collections=True, ignore_missing=True)
db.delete_graph(clique, drop_collections=True, ignore_missing=True)

adb_lollipop_graph = adbdgl_adapter.dgl_to_arangodb(lollipop, dgl_lollipop_graph)
adb_hypercube_graph = adbdgl_adapter.dgl_to_arangodb(hypercube, dgl_hypercube_graph)
adb_clique_graph = clique_adbgl_adapter.dgl_to_arangodb(clique, dgl_clique_graph)  # Notice the new adapter here!

print('\n--------------------')
print("https://{}:{}".format(con["hostname"], con["port"]))
print("Username: " + con["username"])
print("Password: " + con["password"])
print("Database: " + con["dbName"])
print('--------------------\n')
print("\nInspect the graphs here:\n")
print(f"1) https://tutorials.arangodb.cloud:8529/_db/{con['dbName']}/_admin/aardvark/index.html#graph/{lollipop}")
print(f"2) https://tutorials.arangodb.cloud:8529/_db/{con['dbName']}/_admin/aardvark/index.html#graph/{hypercube}")
print(f"3) https://tutorials.arangodb.cloud:8529/_db/{con['dbName']}/_admin/aardvark/index.html#graph/{clique}\n")

Connecting to https://tutorials.arangodb.cloud:8529
ArangoDB: Lollipop_With_Attributes created
ArangoDB: Hypercube_With_Attributes created
ArangoDB: Clique_With_Attributes created

--------------------
https://tutorials.arangodb.cloud:8529
Username: TUT487i8kal98gb73c2iklds
Password: TUTn5t85w8t50kcupmo2mmyb
Database: TUTn187e39v9qho3768ilyk4
--------------------

Inspect the graphs here:

1) https://tutorials.arangodb.cloud:8529/_db/TUTn187e39v9qho3768ilyk4/_admin/aardvark/index.html#graph/Lollipop_With_Attributes
2) https://tutorials.arangodb.cloud:8529/_db/TUTn187e39v9qho3768ilyk4/_admin/aardvark/index.html#graph/Hypercube_With_Attributes
3) https://tutorials.arangodb.cloud:8529/_db/TUTn187e39v9qho3768ilyk4/_admin/aardvark/index.html#graph/Clique_With_Attributes
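The branching inside a feature-to-attribute conversion like the one used for the Clique graph can also be expressed as a lookup table, which keeps the mapping in one place as it grows. The sketch below is a standalone illustration with plain Python numbers standing in for tensor values; clique_feature_to_attribute and CLIQUE_LABELS are hypothetical names, and the function is not wired into ADBDGL_Controller.

```python
CLIQUE_LABELS = {
    1: "one is fun",
    2: "two is blue",
    3: "three is free",
    4: "four is more",
}


def clique_feature_to_attribute(key, value):
    """Map a numeric feature value to the attribute that would be stored in ArangoDB."""
    if key == "clique_ndata":
        # Values without a label (5 and 6 in the example) fall through to the error string.
        return CLIQUE_LABELS.get(int(value), f"ERROR! Unrecognized value, got {value}")
    if key == "clique_edata":
        return bool(value)
    # Any other feature keeps its numerical value.
    return value


print(clique_feature_to_attribute("clique_ndata", 1))
print(clique_feature_to_attribute("clique_ndata", 5))
print(clique_feature_to_attribute("clique_edata", 0))
```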