Getting Started with PySGN#

Introduction to Geospatial Networks#

Geospatial networks are a type of network where nodes are associated with specific geographic locations. These networks are used to model and analyze spatial relationships and interactions, such as transportation systems, communication networks, and social interactions within geographic constraints. By incorporating spatial data, geospatial networks provide insights into how location influences connectivity and network dynamics.

Geospatial Erdős-Rényi, Watts-Strogatz & Barabási-Albert Networks#

PySGN implements three popular models for generating synthetic geospatial networks: the Geospatial Erdős-Rényi model, the Geospatial Watts-Strogatz model, and the Geospatial Barabási-Albert model. All models are adaptations of classical network models that incorporate spatial considerations, allowing for the creation of networks that reflect real-world geographic constraints.

Geospatial Erdős-Rényi Model: This model is a variant of the classic Erdős-Rényi model, where each possible edge between nodes is connected with a probability that decreases with distance.

Geospatial Watts-Strogatz Model: This model extends the classic Watts-Strogatz model. Initially, each node is connected to its nearest neighbors, and edges are then rewired with a probability that depends on the distance between nodes.

Geospatial Barabási-Albert Model: This model is based on the classic Barabási–Albert model by combining preferential attachment with geographic distances. In this geospatial version, new nodes are added sequentially based on their proximity to the already placed nodes, and each new node attaches to existing nodes with a probability that increases with the target node’s degree while decreasing with distance.

By using these models, PySGN enables users to generate synthetic geospatial networks that capture the spatial characteristics and connectivity patterns of real-world systems.

Installation and Setup#

To get started with PySGN, you can install it using pip. This will also install its dependencies, such as NetworkX and GeoPandas.

pip install pysgn

Let’s import everything we need, and define functions to be used later:

import random
import statistics
from collections import Counter

import contextily as cx
import geodatasets
import geopandas as gpd
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
import matplotlib_inline
import networkx as nx
import numpy as np
import pandas as pd
from IPython.display import Image
from matplotlib.ticker import MaxNLocator
from mpl_toolkits.axes_grid1 import make_axes_locatable
from shapely.geometry import Point

from pysgn import (
    geo_barabasi_albert_network,
    geo_erdos_renyi_network,
    geo_watts_strogatz_network,
)
from pysgn.ordering import attribute_order, density_order, random_order

matplotlib_inline.backend_inline.set_matplotlib_formats("svg")


def plot_degree_distribution(graph, ax=None):
    degree_sequence = sorted([d for _, d in graph.degree()], reverse=True)
    degree_count = Counter(degree_sequence)
    deg, cnt = zip(*degree_count.items())
    if ax is None:
        _, ax = plt.subplots(figsize=(12, 8))
    ax.xaxis.set_major_locator(MaxNLocator(integer=True))
    ax.set_xlim([-0.5, max(degree_sequence) + 1.5])
    ax.bar(deg, cnt, width=0.80, color="#43a2ca")
    ax.set_title(
        f"Network Degree Histogram ({graph.number_of_nodes():,} nodes)\n(mean={np.mean(degree_sequence):.1f}, std={np.std(degree_sequence):.2f}, mode={statistics.mode(degree_sequence)})",
    )
    ax.set_ylabel("Count")
    ax.set_xlabel("Degree")

Matplotlib is building the font cache; this may take a moment.

Example: Geospatial Erdős-Rényi Network#

To ground the example in a real dataset, we’ll load the geoda.groceries points from the geodatasets package. Each point corresponds to a grocery store location that GeoDa uses in its tutorials, giving us reproducible coordinates with real-world context.

gdf = (
    gpd.read_file(geodatasets.get_path("geoda.groceries"))
    .explode(index_parts=False)
    .reset_index(drop=True)
    .to_crs("EPSG:26971")
)

_, ax = plt.subplots(figsize=(6, 6))
gdf.plot(ax=ax, markersize=6, color="#2b8cbe", alpha=0.8)
cx.add_basemap(ax, source=cx.providers.CartoDB.Positron, crs=gdf.crs)
ax.set_title("Grocery store locations (geodatasets)")
ax.set_axis_off()

Downloading file 'grocery.zip' from 'https://geodacenter.github.io/data-and-lab//data/grocery.zip' to '/home/docs/.cache/geodatasets'.

Extracting 'grocery/chicago_sup.shp' from '/home/docs/.cache/geodatasets/grocery.zip' to '/home/docs/.cache/geodatasets/grocery.zip.unzip'

Extracting 'grocery/chicago_sup.dbf' from '/home/docs/.cache/geodatasets/grocery.zip' to '/home/docs/.cache/geodatasets/grocery.zip.unzip'

Extracting 'grocery/chicago_sup.shx' from '/home/docs/.cache/geodatasets/grocery.zip' to '/home/docs/.cache/geodatasets/grocery.zip.unzip'

Extracting 'grocery/chicago_sup.prj' from '/home/docs/.cache/geodatasets/grocery.zip' to '/home/docs/.cache/geodatasets/grocery.zip.unzip'

_images/bc6456a92947958a6bd9e5b43cbbbc903660eb65483d38d6134401ca92615e08.svg

The geospatial Erdős-Rényi network is a good place to start with if you don’t know much about how the resulting network should look like. It can be created by simply using the geo_erdos_renyi_network function:

graph = geo_erdos_renyi_network(gdf)

The result can be visualized using NetworkX and Matplotlib:

_, ax = plt.subplots(1, 2, figsize=(12, 5))
gdf.plot(ax=ax[0], markersize=6, color="#888888", alpha=0.6)
nx.draw(
    graph,
    pos=nx.get_node_attributes(graph, "pos"),
    node_size=20,
    width=0.5,
    edge_color="#2b8cbe",
    node_color="#084081",
    ax=ax[0],
)
cx.add_basemap(ax[0], source=cx.providers.CartoDB.Positron, crs=gdf.crs)
ax[0].set_title("Geospatial Erdős-Rényi Network\n(geoda.groceries)")
ax[0].set_axis_off()
plot_degree_distribution(graph, ax=ax[1])

_images/ceeaf2d3016505312f33ee74e812f2416f9992b8dc058654185715b0aabea009.svg

The Geospatial Erdős-Rényi model is a variant of the Erdős-Rényi model that incorporates spatial considerations. Each possible edge in the network is connected with probability:

\[ p(d|a, \textrm{min\_dist}) = \textrm{min}\left(1, \left(\frac{d}{\textrm{min\_dist}}\right) ^ {-a}\right) \]

where d is edge length, min_dist is the minimum edge length, and a is the distance decay exponent parameter, default is 3.

The minimum length is a threshold, below which edges are created with probability 1. It is 1/20 of the bounding box diagonal by default. Users can set the scaling_factor parameter if needed, which is the inverse of the minimum length.

In summary, the probability of two nodes being connected decays exponentially with respect to their distance. Conversely, if two nodes are too close with a distance smaller than 1/scaling_factor, then they are always connected.

The probability decay function, parameterized by a and scaling_factor, looks like the following:

def connection_probability(
    distance: np.ndarray, a: float, scaling_factor: float
) -> np.ndarray:
    # Replace zero with a small number to avoid division by zero
    distance[distance == 0.0] = 1e-6
    min_dist = 1 / scaling_factor
    return np.minimum(1, (distance / min_dist) ** -a)


distances = np.linspace(0, 10, 500)
a_values = [1, 2, 3, 4]
scaling_factors = [0.1, 0.2, 0.5]

fig, axes = plt.subplots(
    1, len(scaling_factors), figsize=(15, 5), sharex=True, sharey=True
)

for j, scaling_factor in enumerate(scaling_factors):
    for a in a_values:
        probabilities = connection_probability(distances, a, scaling_factor)
        axes[j].plot(distances, probabilities, label=f"a={a}")
    axes[j].set_title(
        f"scaling_factor={scaling_factor}\n(min_dist={1/scaling_factor})", fontsize=22
    )
    axes[j].set_xlabel("Distance", fontsize=20)
    axes[j].grid(True, linestyle="--")
    axes[j].tick_params(axis="both", which="major", labelsize=20)

axes[0].set_ylabel("Probability", fontsize=20)
lines, labels = axes[0].get_legend_handles_labels()
fig.legend(
    lines, labels, loc="upper left", bbox_to_anchor=(1.0, 0.87), prop={"size": 20}
)

plt.tight_layout()

_images/f647090b33ef446581643e31374793edfb46bbd151eda2e2794e9a623203daae.svg

Example: Geospatial Watts-Strogatz Network#

The geospatial Watts-Strogatz network provides more control on the resulting network, by allowing two more parameters:

k (int or str): The number of nearest neighbors each node is initially connected to.
- If an integer, it specifies the initial number of neighbors for all nodes. The resulting network has a mean degree of k.
- If a string, it refers to a column in the GeoDataFrame that contains the expected degree centrality for each node.
p (float): The probability of rewiring each edge. This parameter controls the randomness of the network. A value of 0 results in a regular lattice, while a value of 1 results in a random network.

First, the model connects each node to its k nearest neighbors.

Then, it rewires each edge with probability p. When an edge is rewired, it is removed and a new edge is added to a random node. The probability of being rewired to a new node is determined by the distance between the nodes:

\[ p(d|a, \textrm{min\_dist}) = \textrm{min}\left(1, \left(\frac{d}{\textrm{min\_dist}}\right) ^ {-a}\right) \]

where d is edge length, min_dist is the minimum edge length, and a is the distance decay exponent parameter, default is 3.

This decay function of probabilities with respect to node distance works the same as above in the geospatial Erdős-Rényi network.

We’ll first use a simple grid to visualize how k and p shape the resulting graph, then apply the model to the real grocery-store dataset introduced earlier.

x = np.linspace(0, 10, 5)
y = np.linspace(0, 10, 5)
xx, yy = np.meshgrid(x, y)
points = gpd.GeoDataFrame(
    geometry=gpd.points_from_xy(xx.flatten(), yy.flatten()), crs="EPSG:3857"
)


k_values = [2, 6]
p_values = [0, 0.1, 0.5]
fig, axes = plt.subplots(2, 3, figsize=(18, 12))

for i, k in enumerate(k_values):
    for j, p in enumerate(p_values):
        graph = geo_watts_strogatz_network(
            points, k=k, p=p, a=3, scaling_factor=0.05, random_state=42, verbose=False
        )
        ax = axes[i, j]
        pos = nx.get_node_attributes(graph, "pos")
        nx.draw(graph, pos=pos, node_size=50, ax=ax, with_labels=False)
        if p == 0:
            title = f"Local Connections Only: p={p}"
        elif p == 0.1:
            title = f"With Rewiring: p={p}"
        else:
            title = f"More Rewiring: p={p}"

        ax.set_title(f"\n{title}\n(k={k})", fontsize=22)

plt.tight_layout()

_images/3d691050fc6563d8a31decb3689b61d60e8daee516634b7de0ab7663178bd34d.svg

Now let’s switch back to the grocery-store points, so we can see how the same parameters behave on real-world geometry:

graph = geo_watts_strogatz_network(gdf, k=6, p=0.3)

_, ax = plt.subplots(1, 2, figsize=(10, 5))
gdf.plot(ax=ax[0], markersize=6, color="#b0b0b0", alpha=0.5)
nx.draw(
    graph,
    pos=nx.get_node_attributes(graph, "pos"),
    node_size=20,
    width=0.5,
    edge_color="#3182bd",
    node_color="#08519c",
    ax=ax[0],
)
cx.add_basemap(ax[0], source=cx.providers.CartoDB.Positron, crs=gdf.crs)
ax[0].set_title("Geospatial Watts-Strogatz Network\n(geoda.groceries)")
ax[0].set_axis_off()
plot_degree_distribution(graph, ax=ax[1])

_images/37d59f623bf9c5ce5eb20011fd1efbbaeb1db9e733b1597b8677e253c0cf0483.svg

As we can see, the average degree centrality is 6, the same as our k parameter.

Instead of setting a global k parameter for the average degree centrality of all nodes, it is possible to specify each node’s expected degree individually. In this case, the k parameter can be set to the column name containing such information.

For instance, you may want to generate a synthetic geospatial network to represent people’s social connections. From certain studies, you find out that people live in certain areas in general have higher degrees than those who live in other areas. Hence, you want to create a synthetic geospatial network that captures this information.

This is illustrated in the following example, where each node has an expected degree value based on its location.

grid_size = 3
expected_degrees = np.array([[4, 4, 4], [4, 6, 4], [4, 6, 10]])

xmin, ymin, xmax, ymax = gdf.total_bounds
x_edges = np.linspace(xmin, xmax, grid_size + 1)
y_edges = np.linspace(ymin, ymax, grid_size + 1)


def assign_expected_degree(point, x_edges, y_edges, expected_degrees):
    x_bin = np.clip(np.digitize(point.x, x_edges) - 1, 0, grid_size - 1)
    y_bin = np.clip(grid_size - np.digitize(point.y, y_edges), 0, grid_size - 1)
    return expected_degrees[y_bin, x_bin]


gdf["expected_degree"] = gdf.geometry.apply(
    assign_expected_degree, args=(x_edges, y_edges, expected_degrees)
)

gdf.head()

	OBJECTID	Ycoord	Xcoord	Status	Address	Chain	Category	geometry	expected_degree
0	16	41.973266	-87.657073	OPEN	1051 W ARGYLE ST, CHICAGO, IL. 60640	VIET HOA PLAZA	NaN	POINT (356089.003 589348.545)	4
1	18	41.696367	-87.681315	OPEN	10800 S WESTERN AVE, CHICAGO, IL. 60643-3226	COUNTY FAIR FOODS	NaN	POINT (354270.546 558669.106)	6
2	22	41.868634	-87.638638	OPEN	1101 S CANAL ST, CHICAGO, IL. 60607-4932	WHOLE FOODS MARKET	NaN	POINT (357627.75 577726.825)	4
3	23	41.877590	-87.654953	OPEN	1101 W JACKSON BLVD, CHICAGO, IL. 60607-2905	TARGET/SUPER	new	POINT (356310.838 578755.826)	4
4	27	41.737696	-87.625795	OPEN	112 W 87TH ST, CHICAGO, IL. 60620-1318	FOOD 4 LESS	NaN	POINT (358747.876 563046.713)	10

_, ax = plt.subplots(1, 1, figsize=(6, 6))
divider = make_axes_locatable(ax)
cax = divider.append_axes("bottom", size="5%", pad=0.3)
gdf.plot(
    column="expected_degree",
    cmap="viridis_r",
    legend=True,
    cax=cax,
    legend_kwds={"label": "expected_degree", "orientation": "horizontal"},
    ax=ax,
    markersize=15,
    alpha=0.9,
)
cx.add_basemap(ax, source=cx.providers.CartoDB.Positron, crs=gdf.crs)
ax.set_title("Grocery locations colored by expected degree")
ax.set_axis_off()

_images/065716695a25ea9bf6dfbb29f78b222a5496e1247b60de3046dced58b33fa27a.svg

We can then set k to be the expected_degree column:

graph = geo_watts_strogatz_network(gdf, k="expected_degree", p=0.3)

degrees = dict(graph.degree())
cmap = plt.colormaps.get_cmap("viridis_r")
node_colors = [cmap(degree / max(degrees.values())) for node, degree in degrees.items()]

_, ax = plt.subplots(1, 2, figsize=(10, 5))
gdf.plot(ax=ax[0], markersize=4, color="#d9d9d9", alpha=0.4)
nx.draw(
    graph,
    pos=nx.get_node_attributes(graph, "pos"),
    node_color=node_colors,
    node_size=20,
    width=0.5,
    ax=ax[0],
)
cx.add_basemap(ax[0], source=cx.providers.CartoDB.Positron, crs=gdf.crs)
ax[0].set_title("Watts-Strogatz network with per-node expected degrees")
ax[0].set_axis_off()
plot_degree_distribution(graph, ax=ax[1])

_images/2ae809b777874cf09c258f2fbec0651c065bc35e48d7488b1c30b2bba22f7387.svg

As a result, nodes in the lower right corner have higher degrees (i.e., more connections) than nodes in other areas.

Example: Geospatial Barabási-Albert Network#

The geospatial Barabási-Albert network model combines preferential attachment with spatial constraints. In the original Barabási-Albert model, new nodes are more likely to connect to existing nodes with high degrees, leading to a “rich-get-richer” effect and a scale-free network structure.

In the geospatial variant, the probability of connection is influenced by both the node’s degree and its geographic distance:

\[ P(i) \propto k_i \cdot \textrm{min}\left(1, \left(\frac{\textrm{distance}}{\textrm{min\_dist}}\right) ^ {-a}\right) \]

where:

\(k_i\) is the degree of existing node i
\(\textrm{distance}\) is the geographic distance between existing node i and the new node
\(\textrm{min\_dist}\) is the minimum distance threshold (1/scaling_factor)
\(a\) is the distance decay exponent parameter

The model has several key parameters:

m (int): Number of edges to attach from a new node to existing nodes (and size of the seed network).
node_order (callable or str): Order in which nodes are added to the network
a (int): Distance decay exponent, controlling how strongly distance affects connection probability
scaling_factor (float): Inverse of minimum distance threshold

Distance decay exponent a and parameter scaling_factor work the same as above in the geospatial Erdős-Rényi and Watts-Strogatz networks.

The node_order parameter controls the order in which nodes are added to the network. By default, nodes are added sequentially as they appear in the GeoDataFrame (first row is added first, second row is added second, and so on), but users can specify different ordering strategies:

random order
attribute order: Order by values in specific column(s) in ascending order
density order (KNN or KDE): Order by spatial density (high to low)
custom function: User-defined function that takes a GeoDataFrame and returns an order array

The sequence in which nodes join the network impacts the final structure. Early nodes typically become hubs with higher connectivity. Our implementation provides flexible ordering strategies based on spatial patterns, attributes, or custom logic. The choice of ordering strategy allows modeling various real-world network growth processes, such as urban expansion from dense centers or infrastructure development based on population needs.

Next, we’ll reuse the grocery-store dataset but assign a few synthetic attributes (e.g., store “utility” or density ranking) so we can explore different node-ordering strategies with real coordinates.

rng = np.random.default_rng(42)
gdf["utility"] = rng.integers(100, 10000, size=len(gdf))
points = np.column_stack((gdf.geometry.x.values, gdf.geometry.y.values))

gdf.head()

	OBJECTID	Ycoord	Xcoord	Status	Address	Chain	Category	geometry	expected_degree	utility
0	16	41.973266	-87.657073	OPEN	1051 W ARGYLE ST, CHICAGO, IL. 60640	VIET HOA PLAZA	NaN	POINT (356089.003 589348.545)	4	983
1	18	41.696367	-87.681315	OPEN	10800 S WESTERN AVE, CHICAGO, IL. 60643-3226	COUNTY FAIR FOODS	NaN	POINT (354270.546 558669.106)	6	7762
2	22	41.868634	-87.638638	OPEN	1101 S CANAL ST, CHICAGO, IL. 60607-4932	WHOLE FOODS MARKET	NaN	POINT (357627.75 577726.825)	4	6580
3	23	41.877590	-87.654953	OPEN	1101 W JACKSON BLVD, CHICAGO, IL. 60607-2905	TARGET/SUPER	new	POINT (356310.838 578755.826)	4	4444
4	27	41.737696	-87.625795	OPEN	112 W 87TH ST, CHICAGO, IL. 60620-1318	FOOD 4 LESS	NaN	POINT (358747.876 563046.713)	10	4386

_, ax = plt.subplots(1, 1, figsize=(6, 6))
divider = make_axes_locatable(ax)
cax = divider.append_axes("bottom", size="5%", pad=0.3)
gdf.plot(
    column="utility",
    legend=True,
    cmap="viridis_r",
    cax=cax,
    legend_kwds={"label": "utility", "orientation": "horizontal"},
    ax=ax,
    markersize=20,
    alpha=0.9,
)
cx.add_basemap(ax, source=cx.providers.CartoDB.Positron, crs=gdf.crs)
ax.set_title("Grocery locations with synthetic 'utility'")
ax.set_axis_off()

_images/e9d49accac602d77e1308e79ee032d30ad4d719dbf63ab743f61cfb23b98bcc4.svg

Let’s visualize different node orderings:

# Apply different ordering strategies
random_idx = random_order(gdf, random_state=42)
utility_idx = attribute_order(gdf, "utility")
density_knn_idx = density_order(gdf, method="knn", k=5)
density_kde_idx = density_order(gdf, method="kde", bandwidth=0.5)

# Visualize different orderings
fig, axs = plt.subplots(2, 2, figsize=(13, 10))
titles = [
    "Random Order",
    "Attribute Order ('utility' column)",
    "Density Order (KNN)",
    "Density Order (KDE)",
]
orders = [random_idx, utility_idx, density_knn_idx, density_kde_idx]

for i, (ax, title, order) in enumerate(zip(axs.flatten(), titles, orders)):
    x, y = points[order, 0], points[order, 1]
    scatter = ax.scatter(x, y, c=np.arange(len(x)), cmap="viridis", s=80, alpha=0.9)
    ax.set_title(title)
    ax.set_axis_off()

fig.tight_layout()
fig.subplots_adjust(right=0.8)
cbar_ax = fig.add_axes([0.8, 0.055, 0.02, 0.9])
cbar = fig.colorbar(scatter, cax=cbar_ax)
cbar.set_label("Node Addition Order")

_images/7e602b77b8e91455011e8e4f10e3f7dbce055b9920c5dd1b5e91e46a029512f6.svg

We can create synthetic networks using each ordering strategy and compare the results.

ordering_functions = [
    lambda frame: random_order(frame, random_state=42),
    lambda frame: attribute_order(frame, "utility"),
    lambda frame: density_order(frame, method="knn", k=5),
    lambda frame: density_order(frame, method="kde", bandwidth=0.5),
]

fig, axs = plt.subplots(2, 4, figsize=(18, 8))
for i, (title, order_fn) in enumerate(zip(titles, ordering_functions)):
    graph = geo_barabasi_albert_network(
        gdf,
        m=2,
        node_order=order_fn,
        random_state=42,
    )
    degrees = dict(graph.degree())
    cmap = plt.colormaps.get_cmap("viridis_r")
    node_colors = [
        cmap(degree / max(degrees.values())) for node, degree in degrees.items()
    ]
    gdf.plot(ax=axs[0, i], markersize=3, color="#cccccc", alpha=0.4)
    nx.draw(
        graph,
        pos=nx.get_node_attributes(graph, "pos"),
        node_color=node_colors,
        node_size=15,
        width=0.3,
        ax=axs[0, i],
        alpha=0.9,
    )
    cx.add_basemap(axs[0, i], source=cx.providers.CartoDB.Positron, crs=gdf.crs)
    axs[0, i].set_title(title)
    axs[0, i].set_axis_off()
    plot_degree_distribution(graph, axs[1, i])

_images/3c8291a4155f7e91309c29d63e88aa1ffbe701b16e54bd1c17b8acd644d6f6fe.svg

The degree distribution plots show the characteristic power-law behavior of Barabási-Albert networks. The spatial constraints modify this pure power-law behavior, as geographic distance limits the potential connections. However, the network still maintains the general scale-free property, where a few hub nodes have many connections while most nodes have relatively few.

This model is particularly useful for modeling networks where:

The network grows over time
New nodes preferentially attach to well-connected nodes
Geographic distance influences connection probability
A scale-free degree distribution is expected

Examples include transportation networks, power grids, and some types of social networks where both popularity and proximity affect connection patterns.

More Options#

Including Node Attributes#

When generating a network, you can choose to include node attributes from your GeoDataFrame. This is done using the node_attributes parameter in PySGN functions. You can specify:

True: To include all variables from the GeoDataFrame as node attributes. This is the default behavior.
A string or list of strings: To include only specific attributes by their column names.
False: Only the position of the nodes will be saved as a pos attribute in the network.

Including node attributes allows you to leverage additional data in your network analysis, such as demographic information, geographic features, or any other relevant metadata.

We can keep using the grocery stores: assign each store to a synthetic “Group A/B” and then visualize how that attribute flows through a generated network.

gdf["group"] = random.choices(["Group A", "Group B"], k=len(gdf))
gdf[["group", "geometry"]].head()

	group	geometry
0	Group B	POINT (356089.003 589348.545)
1	Group A	POINT (354270.546 558669.106)
2	Group B	POINT (357627.75 577726.825)
3	Group A	POINT (356310.838 578755.826)
4	Group A	POINT (358747.876 563046.713)

By default, PySGN functions keep all variables from the GeoDataFrame as node attributes:

graph = geo_erdos_renyi_network(gdf)

groups = nx.get_node_attributes(graph, "group")
colors = {"Group A": "tab:blue", "Group B": "tab:orange"}
node_colors = [colors[groups[node]] for node in graph.nodes]

fig, ax = plt.subplots(figsize=(6, 6))
gdf.plot(ax=ax, markersize=8, color="#d9d9d9", alpha=0.4)
nx.draw(
    graph,
    pos=nx.get_node_attributes(graph, "pos"),
    node_color=node_colors,
    edge_color="#636363",
    node_size=20,
    width=0.5,
    ax=ax,
)
cx.add_basemap(ax, source=cx.providers.CartoDB.Positron, crs=gdf.crs)
legend_handles = [
    mpatches.Patch(color=color, label=group) for group, color in colors.items()
]
ax.legend(
    handles=legend_handles,
    title="Groups",
    loc="upper right",
    bbox_to_anchor=[1.5, 1.0],
)
ax.set_title("Erdős-Rényi network with group attribute")
ax.set_axis_off()

_images/3fc7e7d2673905ddd1dfe30236129b39bcf53e8ecc699d8360e57450911fe89e.svg

You may turn this off by setting node_attributes=False, or providing the column names that should be included in the network:

# saving only a `pos` node attribute
geo_erdos_renyi_network(gdf, node_attributes=False)

# each node has `pos` and `group` attributes
geo_erdos_renyi_network(gdf, node_attributes="group")

# each node has `pos`, `group`, and `some_other_col` attributes
geo_erdos_renyi_network(gdf, node_attributes=["group", "some_other_col"])

This works with the geo_watts_strogatz_network and geo_barabasi_albert_network methods too.

Custom Constraints#

Custom constraints allow you to impose specific rules on the connections between nodes in your network. This can be useful for modeling real-world scenarios where certain connections are only possible under specific conditions.

In PySGN, you can define a custom constraint function that takes two nodes as input and returns a boolean value indicating whether an edge should be created between them. This function can be passed to the network generation method using the constraint parameter.

We’ll continue with the grocery stores and the synthetic group attribute - first restricting edges to stores sharing the same group, then enforcing cross-group connections only.

graph = geo_watts_strogatz_network(
    gdf, k=4, p=0.3, constraint=lambda u, v: u.group == v.group
)

groups = nx.get_node_attributes(graph, "group")
colors = {"Group A": "tab:blue", "Group B": "tab:orange"}
node_colors = [colors[groups[node]] for node in graph.nodes]
edge_colors = [colors[groups[edge[0]]] for edge in graph.edges]

fig, ax = plt.subplots(figsize=(6, 6))
gdf.plot(ax=ax, markersize=6, color="#bdbdbd", alpha=0.5)
nx.draw(
    graph,
    pos=nx.get_node_attributes(graph, "pos"),
    node_color=node_colors,
    edge_color=edge_colors,
    node_size=20,
    width=0.4,
    ax=ax,
)
cx.add_basemap(ax, source=cx.providers.CartoDB.Positron, crs=gdf.crs)
legend_handles = [
    mpatches.Patch(color=color, label=group) for group, color in colors.items()
]
ax.legend(
    handles=legend_handles, title="Groups", loc="upper right", bbox_to_anchor=[1.5, 1.0]
)
ax.set_title("Same-group constraint (Watts-Strogatz)")
ax.set_axis_off()

_images/5524700e2e22f05be76afc9abe02d265a64c87fe4a9fcc9a22e18e92002028d5.svg

Conversely, if you want nodes to be connected to nodes of different group only:

graph = geo_watts_strogatz_network(
    gdf, k=4, p=0.3, constraint=lambda u, v: u.group != v.group
)

groups = nx.get_node_attributes(graph, "group")
colors = {"Group A": "tab:blue", "Group B": "tab:orange"}
node_colors = [colors[groups[node]] for node in graph.nodes]

fig, ax = plt.subplots(figsize=(6, 6))
gdf.plot(ax=ax, markersize=6, color="#bdbdbd", alpha=0.5)
nx.draw(
    graph,
    pos=nx.get_node_attributes(graph, "pos"),
    node_color=node_colors,
    edge_color="#636363",
    node_size=20,
    width=0.4,
    ax=ax,
)
cx.add_basemap(ax, source=cx.providers.CartoDB.Positron, crs=gdf.crs)
legend_handles = [
    mpatches.Patch(color=color, label=group) for group, color in colors.items()
]
ax.legend(
    handles=legend_handles, title="Groups", loc="upper right", bbox_to_anchor=[1.5, 1.0]
)
ax.set_title("Cross-group constraint (Watts-Strogatz)")
ax.set_axis_off()

_images/9b3744643e8e951fbd8fcefca1c6ef3d5e2be8ff1a38bf4f16a39bbb493a2f7b.svg