Saving and loading graphs
The fastest way to ingest a graph is to load it from Raphtory's on-disk format using the load_from_file() function on the graph. This requires first ingesting the data via one of the prior methods and saving the resulting graph with save_to_file(), but it means that for large datasets you do not need to parse the data every time you run a Raphtory script.
Info
This is similar to pickling and can drastically reduce ingestion time, especially if your datasets require a lot of preprocessing.
In the example below we ingest the edge dataframe from the last section, save the graph, and reload it into a second graph. Both graphs are printed to show that they contain the same data.
Warning
Due to the ongoing development of Raphtory, a saved graph is not guaranteed to be compatible across versions.
from raphtory import Graph
import pandas as pd
edges_df = pd.read_csv("data/network_traffic_edges.csv")
edges_df["timestamp"] = pd.to_datetime(edges_df["timestamp"])
g = Graph()
g.load_edges_from_pandas(
    df=edges_df,
    src="source",
    dst="destination",
    time="timestamp",
    properties=["data_size_MB"],
    layer="transaction_type",
)
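# Save the ingested graph in Raphtory's on-disk format
g.save_to_file("/tmp/saved_graph")
# Reload the saved graph into a second graph
loaded_graph = Graph.load_from_file("/tmp/saved_graph")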
print(g)
print(loaded_graph)
Output
Graph(number_of_nodes=5, number_of_edges=7, number_of_temporal_edges=7, earliest_time=1693555200000, latest_time=1693557000000)
Graph(number_of_nodes=5, number_of_edges=7, number_of_temporal_edges=7, earliest_time=1693555200000, latest_time=1693557000000)
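The save-once, load-afterwards workflow described above can be wrapped in a small helper that parses the data only when no saved copy exists. The following is a minimal sketch and is not part of Raphtory: load_or_build and the pickle-based stand-ins are hypothetical names used so the example runs without Raphtory installed. With Raphtory you would presumably pass a function that ingests the data as build, a wrapper around save_to_file() as save, and Graph.load_from_file as load.

```python
import os
import pickle
import tempfile


def load_or_build(path, build, save, load):
    """Return the object saved at `path` if it exists; otherwise build and save it.

    Hypothetical helper, not a Raphtory API.
    """
    if os.path.exists(path):
        return load(path)
    obj = build()
    save(obj, path)
    return obj


# Stand-ins (assumptions) so the sketch is runnable without Raphtory.
def build_edges():
    # Pretend this is the expensive parsing/preprocessing step.
    return [("a", "b", 1), ("b", "c", 2)]


def save_pickle(obj, path):
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_pickle(path):
    with open(path, "rb") as f:
        return pickle.load(f)


cache_path = os.path.join(tempfile.mkdtemp(), "edges.pkl")

# First call: nothing saved yet, so the data is built and written to disk.
first = load_or_build(cache_path, build_edges, save_pickle, load_pickle)
# Second call: the saved copy exists, so it is loaded instead of rebuilt.
second = load_or_build(cache_path, build_edges, save_pickle, load_pickle)

print(first == second)
```

Because a saved graph may not load under a different Raphtory version, one option is to delete or rename the saved file whenever you upgrade so that the helper rebuilds it from the raw data.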