pysgn.geo_barabasi_albert_network
- geo_barabasi_albert_network(gdf, m: int, *, a: int = 3, scaling_factor: float | None = None, max_degree: int = 150, id_col: str | None = None, node_attributes: bool | str | list[str] = True, constraint: Callable | None = None, node_order: Callable[[DataFrame], ndarray] | str | None = None, random_state: int | None = None, verbose: bool = False) -> Graph
Construct a geo Barabási-Albert network with geospatial preferential attachment.
The Geospatial Barabási-Albert (BA) model is a geospatial modification of the classical BA model, incorporating spatial factors into the preferential attachment mechanism. When adding a new node, the probability of attaching to an existing node is proportional to the existing node’s degree and a geospatial decay function based on the distance between the nodes:
\[p_i(\textrm{distance} \mid a, \textrm{min\_dist}) \propto k_i \cdot \min\left(1, \left(\frac{\textrm{distance}}{\textrm{min\_dist}}\right)^{-a}\right)\]
where \(k_i\) is the degree of existing node \(i\), \(\textrm{min\_dist}\) is the minimum distance between nodes, and \(a\) is the distance decay exponent (default 3). The minimum distance is a threshold: at or below it, the decay factor is capped at 1, so the attachment probability depends only on the existing node's degree. By default it is 1/20 of the bounding box diagonal. Users can instead set the scaling factor directly, which is the inverse of the minimum distance.
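The default threshold can be worked through by hand. The following is a minimal sketch, assuming the bounding box is given as `(minx, miny, maxx, maxy)` (for example, the values returned by `GeoDataFrame.total_bounds`). The helper name `default_scaling_factor` is hypothetical and is not part of pysgn; the library's internal computation may differ in detail.

```python
import math


def default_scaling_factor(minx, miny, maxx, maxy):
    """Hypothetical helper: scaling factor implied by a bounding box."""
    # Length of the bounding box diagonal.
    diagonal = math.hypot(maxx - minx, maxy - miny)
    # Default minimum distance: 1/20 of the diagonal.
    min_dist = diagonal / 20
    # The scaling factor is the inverse of the minimum distance.
    return 1 / min_dist


# A 30 x 40 box has a diagonal of 50, so min_dist = 2.5.
scaling_factor = default_scaling_factor(0, 0, 30, 40)
```

Passing a larger `scaling_factor` corresponds to a smaller minimum distance, so the distance decay starts to apply at shorter range.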
The new node attaches to m different nodes chosen without replacement with these normalized probabilities.
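The attachment rule can be illustrated with a small pure-Python sketch. It is not the library's implementation: the function names `attachment_probabilities` and `choose_targets` are hypothetical, and drawing targets one at a time with renormalization is only one way to sample without replacement.

```python
import random


def attachment_probabilities(degrees, distances, a=3, min_dist=1.0):
    """Normalized attachment probabilities for one incoming node."""
    weights = []
    for k, d in zip(degrees, distances):
        # Decay factor min(1, (d / min_dist) ** -a), capped at 1 at or
        # below min_dist (this also avoids dividing by a zero distance).
        decay = 1.0 if d <= min_dist else (d / min_dist) ** (-a)
        weights.append(k * decay)
    total = sum(weights)
    return [w / total for w in weights]


def choose_targets(probs, m, rng):
    """Draw m distinct node positions, one at a time, without replacement."""
    candidates = list(range(len(probs)))
    weights = list(probs)
    chosen = []
    for _ in range(m):
        # rng.choices normalizes the remaining weights internally.
        pick = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        chosen.append(candidates.pop(pick))
        weights.pop(pick)
    return chosen


# Four existing nodes: degree and distance to the incoming node.
degrees = [2, 2, 2, 4]
distances = [0.5, 1.0, 2.0, 2.0]
probs = attachment_probabilities(degrees, distances, a=3, min_dist=1.0)
targets = choose_targets(probs, m=2, rng=random.Random(0))
```

In this example the first two nodes lie within the minimum distance and have equal degree, so they receive equal probability. The last two nodes are at the same distance, so the one with twice the degree is twice as likely to be chosen.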
For the first m nodes, a seed network is created by fully connecting them. The seed network is then used to grow the network by adding one node at a time.
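As a rough illustration of the seed step, the sketch below lists the edges of a fully connected seed over the first m node IDs. The helper `seed_edges` is hypothetical, and the node IDs are made up for the example.

```python
from itertools import combinations


def seed_edges(node_ids, m):
    """Hypothetical helper: edges of a complete graph on the first m nodes."""
    seed = node_ids[:m]
    # Every unordered pair of seed nodes becomes an edge.
    return list(combinations(seed, 2))


edges = seed_edges(["a", "b", "c", "d", "e"], m=3)
```

A seed of m nodes therefore starts with m(m - 1)/2 edges, and each remaining node is then added in turn.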
- Args:
gdf (gpd.GeoDataFrame): GeoDataFrame containing nodes.
m (int): Number of edges to attach from a new node to existing nodes (and size of the seed network).
- Keyword Args:
a (int): distance decay exponent parameter, default is 3
- scaling_factor (float): scaling factor is the inverse of the minimum distance between nodes, default is None.
The minimum distance is a threshold: at or below it, the distance decay factor is capped at 1, so the attachment probability depends only on the existing node's degree. If None, the scaling factor will be calculated based on the bounding box of the GeoDataFrame.
max_degree (int): maximum degree allowed for any node, default is 150
- id_col (str): column name containing unique IDs, default is None.
If “index”, the index of the GeoDataFrame will be used as the unique ID. If a column name, the values in the column will be used as the unique ID. If None, the positional index of the node will be used as the unique ID.
- node_attributes (bool | str | list[str]): node attributes to save in the graph, default is True.
If True, all attributes will be saved as node attributes. If False, only the position of the nodes will be saved as a pos attribute. If a string or a list of strings, only the named attributes will be saved as node attributes.
- constraint (Callable | None): constraint function to filter out invalid neighbors, default is None
Example: constraint=lambda u, v: u.household != v.household
This ensures that nodes from the same household are not connected.
- node_order (Callable[[gpd.GeoDataFrame], np.ndarray] | str | None): A function or column name to determine the order in which nodes are added.
If None, nodes are added sequentially as they appear in the GeoDataFrame. If a callable, the function should take a GeoDataFrame and return an array of indices. If a string, the string is interpreted as a column name that contains order indices.
random_state (int | None): random seed for reproducibility, default is None.
verbose (bool): whether to show detailed progress messages, default is False
- Returns:
nx.Graph: a geo Barabási-Albert network graph.
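The constraint argument described above can also be written as a named function. The sketch below assumes, based on the lambda in the example, that `u` and `v` are row-like objects exposing columns as attributes. `SimpleNamespace` is only a stand-in for such rows, and the function name and household values are made up.

```python
from types import SimpleNamespace


def different_household(u, v):
    """Allow an edge only between nodes from different households."""
    return u.household != v.household


# Stand-ins for rows of the GeoDataFrame (assumed attribute access).
alice = SimpleNamespace(household=1)
bob = SimpleNamespace(household=1)
carol = SimpleNamespace(household=2)

# Only candidates for which the constraint returns True remain valid.
allowed = [v for v in (bob, carol) if different_household(alice, v)]
```

A function like this would be passed as `constraint=different_household`. A named function is easier to test and reuse than an inline lambda.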