Saving and loading graphs

The fastest way to ingest a graph is to load it from Raphtory's on-disk format using Graph.load_from_file(). This requires that you first ingest the data with one of the methods described earlier and save the resulting graph with save_to_file(), but for large datasets it means you do not need to parse the raw data every time you run a Raphtory script.

Info

This is similar to pickling and can drastically reduce ingestion time, especially if your dataset requires a lot of preprocessing.

In the example below we ingest the edge dataframe from the last section, save the resulting graph and reload it into a second graph. Both graphs are printed to show that they contain the same data.

Warning

Due to the ongoing development of Raphtory, a saved graph is not guaranteed to be consistent across versions.

from raphtory import Graph
import pandas as pd

edges_df = pd.read_csv("data/network_traffic_edges.csv")
edges_df["timestamp"] = pd.to_datetime(edges_df["timestamp"])

g = Graph()
g.load_edges_from_pandas(
    df=edges_df,
    src="source",
    dst="destination",
    time="timestamp",
    properties=["data_size_MB"],
    layer="transaction_type",
)
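# Save the graph in Raphtory's on-disk format, then load it into a second graph.
g.save_to_file("/tmp/saved_graph")
loaded_graph = Graph.load_from_file("/tmp/saved_graph")

# Both summaries should be identical.
print(g)
print(loaded_graph)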

Output

Graph(number_of_nodes=5, number_of_edges=7, number_of_temporal_edges=7, earliest_time=1693555200000, latest_time=1693557000000)
Graph(number_of_nodes=5, number_of_edges=7, number_of_temporal_edges=7, earliest_time=1693555200000, latest_time=1693557000000)
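
One way to take advantage of this is to treat the saved graph as a cache: load it when it exists and fall back to a full ingest otherwise. The sketch below is one possible approach and is not part of Raphtory itself. The helper names load_or_build and load_or_build_raphtory and the .version sidecar file are hypothetical. The sidecar records the installed Raphtory version so that a graph saved by a different version is rebuilt rather than loaded, in line with the warning above. The wrapper assumes the package is installed under the distribution name raphtory and uses only the save_to_file() and Graph.load_from_file() calls shown in the example.

```python
import os
import shutil
from typing import Callable, TypeVar

G = TypeVar("G")


def _discard(path: str) -> None:
    """Remove a stale cache entry, whether it is a file or a directory."""
    if os.path.isdir(path):
        shutil.rmtree(path)
    elif os.path.exists(path):
        os.remove(path)


def load_or_build(
    path: str,
    version: str,
    build: Callable[[], G],
    save: Callable[[G, str], None],
    load: Callable[[str], G],
) -> G:
    """Load a cached graph from `path`, or build and save it if the cache is unusable.

    The cache is reused only when `path` exists and the sidecar file records
    the same `version` string that is passed in.
    """
    stamp_path = path + ".version"  # hypothetical sidecar file, not a Raphtory feature
    if os.path.exists(path) and os.path.exists(stamp_path):
        with open(stamp_path) as f:
            if f.read().strip() == version:
                return load(path)

    # Cache is missing or was written by a different version: rebuild from scratch.
    _discard(path)
    graph = build()
    save(graph, path)
    with open(stamp_path, "w") as f:
        f.write(version)
    return graph


def load_or_build_raphtory(path: str, build: Callable[[], G]) -> G:
    """Wire the generic helper to Raphtory (assumes `raphtory` is installed)."""
    from importlib.metadata import version  # standard library
    from raphtory import Graph

    return load_or_build(
        path,
        version("raphtory"),
        build,
        save=lambda g, p: g.save_to_file(p),
        load=Graph.load_from_file,
    )
```

With this in place, a script could call load_or_build_raphtory("/tmp/saved_graph", build), where build is a function that runs the pandas ingestion shown earlier and returns the graph. The first run pays the full ingestion cost, and later runs with the same Raphtory version load directly from disk.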