spatialcells.measurements

Module for computing region-based measurements.

Access via either spatialcells.measurements or spatialcells.msmt.

getCellKDE

spatialcells.msmt.getCellKDE(adata, regions, phenotype_col=None, phenotype_subset=[], bandwidth=1, name='kde_likelihood')

Get per cell log likelihood based on a kernel density estimate. This can be normalized by the area of the region to be comparable across regions. Likelihoods will be stored in adata.obs[name].

Parameters:
  • adata – AnnData object

  • regions – A list of regions to compute the density in

  • phenotype_col – A list of columns to stratify the density by.

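  • phenotype_subset – A list of cell type markers to subset the data by

  • bandwidth – Bandwidth of the kernel density estimate. Default is 1.

  • name – Name of the adata.obs column to store the likelihoods in. Default is “kde_likelihood”.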
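To give some intuition for a per-cell log likelihood under a kernel density estimate, here is a minimal pure-Python sketch using an isotropic 2D Gaussian kernel. It is not the library's implementation: the helper name gaussian_kde_log_likelihood and the toy coordinates are hypothetical, and the kernel choice is an assumption.

```python
import math

def gaussian_kde_log_likelihood(points, bandwidth=1.0):
    """Log density of each (x, y) point under a 2D Gaussian KDE fit to all points.

    ASSUMPTION: isotropic Gaussian kernel; the library may use another estimator.
    """
    n = len(points)
    # log of (n * kernel normalising constant) for a 2D Gaussian with std = bandwidth
    log_norm = math.log(n * 2.0 * math.pi * bandwidth ** 2)
    scores = []
    for px, py in points:
        total = 0.0
        for qx, qy in points:
            sq_dist = (px - qx) ** 2 + (py - qy) ** 2
            total += math.exp(-0.5 * sq_dist / bandwidth ** 2)
        scores.append(math.log(total) - log_norm)
    return scores

# Three clustered cells and one isolated cell (hypothetical coordinates).
cells = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (10.0, 10.0)]
scores = gaussian_kde_log_likelihood(cells, bandwidth=1.0)
```

Cells in the dense cluster receive a higher log likelihood than the isolated cell, which is the kind of signal the stored column is meant to carry.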

getDistanceFromObject

spatialcells.msmt.getDistanceFromObject(adata, object, x='X_centroid', y='Y_centroid', region_col='region', region_subset=None, name='distance', inplace=True, binned=False, binsize=10)

Get the minimum Euclidean distance between each cell and a shapely object.

Parameters:
  • adata – AnnData object

  • object – Shapely object to measure distance from

  • x – Name of the column containing the x coordinate. Default is “X_centroid”.

  • y – Name of the column containing the y coordinate. Default is “Y_centroid”.

  • region_col – Name of the column containing the region. Default is “region”.

  • region_subset – List of regions to consider. If None, consider all cells.

  • name – Name of the column to store the distance in. Default is “distance”.

  • inplace – If True, add the distance column to adata.obs. If False, return a copy of adata with the distance column added.

  • binned – If True, bin the distances into bins of size binsize.

  • binsize – Size of the bins to use for binning. Default is 10.

Returns:

If inplace is False, return a copy of adata with the distance column added
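As a geometric illustration of "minimum Euclidean distance to an object", the sketch below computes the distance from each cell to the boundary ring of a polygon in pure Python. The helper names and coordinates are hypothetical, and this is not the library's code. Note one difference: a filled shapely polygon reports a distance of 0 for points inside it, whereas this sketch always measures to the boundary.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment's line and clamp to the segment endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_ring(p, ring):
    """Minimum distance from p to a closed ring given as a list of vertices."""
    edges = zip(ring, ring[1:] + ring[:1])
    return min(point_segment_distance(p, a, b) for a, b in edges)

# Hypothetical square region boundary and three cell centroids outside it.
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
cells = [(15.0, 5.0), (13.0, 14.0), (5.0, -2.0)]
distances = [distance_to_ring(c, square) for c in cells]
```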

getDistanceFromPoint

spatialcells.msmt.getDistanceFromPoint(adata, point, x='X_centroid', y='Y_centroid', region_col='region', region_subset=None, metric='angular', name='distance', inplace=True, binned=False, binsize=10)

Get the distance of each cell from a point.

Parameters:
  • adata – AnnData object

  • point – iterable coordinate of a point in (x, y) to calculate distance from

  • x – Name of the column containing the x coordinate. Default is “X_centroid”.

  • y – Name of the column containing the y coordinate. Default is “Y_centroid”.

  • region_col – Name of the column containing the region. Default is “region”.

  • region_subset – List of regions to consider. If None, consider all cells.

  • metric – metric to use for distance calculation. Metric can be “angular” or “euclidean”. Default is “angular”.

  • name – Name of the column to store the distance in. Default is “distance”.

  • inplace – If True, add the distance column to adata.obs. If False, return a copy of adata with the distance column added.

  • binned – If True, bin the distances into bins of size binsize.

  • binsize – Size of the bins to use for binning. Default is 10.

Returns:

If inplace is False, return a copy of adata with the distance column added
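The two metrics and the binning option can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the library's implementation: "angular" is assumed here to mean the angle of the cell around the point in degrees, and binning is assumed to snap each value to the lower edge of its bin. The helper names are hypothetical.

```python
import math

def distance_from_point(cells, point, metric="euclidean"):
    """Distance (or angle) of each (x, y) cell relative to `point`."""
    px, py = point
    if metric == "euclidean":
        return [math.hypot(x - px, y - py) for x, y in cells]
    if metric == "angular":
        # ASSUMPTION: angle of the cell around the point, in degrees.
        return [math.degrees(math.atan2(y - py, x - px)) for x, y in cells]
    raise ValueError(f"unknown metric: {metric!r}")

def bin_values(values, binsize=10):
    """Snap each value to the lower edge of its bin (an assumed convention)."""
    return [math.floor(v / binsize) * binsize for v in values]

# Hypothetical cell centroids measured from the origin.
cells = [(3.0, 4.0), (0.0, 25.0), (-12.0, 0.0)]
euclid = distance_from_point(cells, (0.0, 0.0), metric="euclidean")
angles = distance_from_point(cells, (0.0, 0.0), metric="angular")
binned = bin_values(euclid, binsize=10)
```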

getMPI

spatialcells.msmt.getMPI(adata, prolif_markers, arrest_markers, thresh_prolif=0.5, thresh_arrest=0.5, use_obs=False, use_layer=None, col_name='MPI', inplace=True)

Get MPI from a list of markers and thresholds, adapted from Gaglia et al. 2022 https://doi.org/10.1038/s41556-022-00860-9. The MPI is defined as follows:

  • -1 if max(arrest_markers) > thresh_arrest

  • 1 else if max(prolif_markers) > thresh_prolif

  • 0 otherwise

Parameters:
  • adata – AnnData object

  • prolif_markers – List of proliferation markers

  • arrest_markers – List of arrest markers

  • thresh_prolif – Threshold for proliferation. Default is 0.5

  • thresh_arrest – Threshold for arrest, which should be set based on the expression levels of KI67 marker. Default is 0.5

  • use_obs – If True, use adata.obs[use_obs] to get the markers. Overrides use_layer. If use_obs==False and use_layer is None, use adata.X

  • use_layer – Layer to use for the analysis. If use_obs==False and use_layer is None, use adata.X

  • col_name – Name of the column to add to adata.obs

  • inplace – If True, add the column to adata.obs. If False, return a copy of adata with the column added

Returns:

None if inplace is True (a column is added to adata.obs). If inplace is False, a copy of adata with the column added.
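The three-way MPI rule can be written out for a single cell as follows. This is a minimal sketch of the definition given in this section; the helper name mpi_score and the marker intensities are hypothetical.

```python
def mpi_score(prolif_values, arrest_values, thresh_prolif=0.5, thresh_arrest=0.5):
    """Return -1 (arrested), 1 (proliferating) or 0 (neither) for one cell."""
    if max(arrest_values) > thresh_arrest:
        return -1  # arrest is checked first, so it takes precedence
    if max(prolif_values) > thresh_prolif:
        return 1
    return 0

# Hypothetical per-cell intensities: (proliferation markers, arrest markers).
cells = [
    ([0.9, 0.2], [0.1]),  # proliferating
    ([0.9, 0.8], [0.7]),  # arrest marker above threshold wins
    ([0.1, 0.3], [0.2]),  # neither
]
mpi = [mpi_score(p, a) for p, a in cells]
```

Because the arrest check comes first, a cell that exceeds both thresholds is scored -1.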

getMinCellTypesDistance

spatialcells.msmt.getMinCellTypesDistance(adata1, adata2)

Return the minimum distance between cell types of two AnnData objects.

Parameters:
  • adata1 – AnnData object

  • adata2 – AnnData object

Returns:

minimum distance between cell types
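One plausible reading of this measurement is the smallest Euclidean distance over all pairs of cells, one from each object. The brute-force sketch below illustrates that reading on plain coordinate lists; the interpretation, the helper name, and the coordinates are assumptions rather than the library's code.

```python
import math

def min_pairwise_distance(coords1, coords2):
    """Smallest Euclidean distance between any point in coords1 and any in coords2."""
    return min(
        math.hypot(x1 - x2, y1 - y2)
        for x1, y1 in coords1
        for x2, y2 in coords2
    )

# Hypothetical centroids for two cell populations.
tumor = [(0.0, 0.0), (1.0, 1.0)]
immune = [(4.0, 5.0), (10.0, 10.0)]
d = min_pairwise_distance(tumor, immune)
```

The brute-force loop is quadratic in the number of cells; for large datasets a spatial index would be the usual alternative.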

getRegionArea

spatialcells.msmt.getRegionArea(boundary, exclude_holes=True)

Get the area of a region defined by a MultiPolygon boundary. If exclude_holes is True, the area of the holes in the region is subtracted from the overall area of the region.

Parameters:
  • boundary – MultiPolygon boundary of the region

  • exclude_holes – whether to exclude the holes in the region

Returns:

area of the region
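The effect of exclude_holes can be illustrated with the shoelace formula on plain vertex lists. This is a sketch, not the library's code: each polygon is represented here as a hypothetical (exterior_ring, [hole_rings]) pair rather than a shapely MultiPolygon.

```python
def ring_area(ring):
    """Unsigned area of a closed ring via the shoelace formula."""
    twice_area = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1])
    )
    return abs(twice_area) / 2.0

def region_area(polygons, exclude_holes=True):
    """Total area of polygons given as (exterior_ring, [hole_rings]) pairs."""
    total = 0.0
    for exterior, holes in polygons:
        total += ring_area(exterior)
        if exclude_holes:
            total -= sum(ring_area(h) for h in holes)
    return total

# A 10 x 10 square with a 2 x 2 hole (hypothetical region).
outer = [(0, 0), (10, 0), (10, 10), (0, 10)]
hole = [(2, 2), (4, 2), (4, 4), (2, 4)]
region = [(outer, [hole])]
area_without_holes = region_area(region, exclude_holes=True)
area_with_holes = region_area(region, exclude_holes=False)
```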

getRegionCentroid

spatialcells.msmt.getRegionCentroid(boundary)

Get the centroid of a region defined by a list of region boundary components.

Parameters:
  • boundary – A MultiPolygon object defining the boundary of the region

Returns:

The centroid of the region
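For a region made of several components, the centroid is the area-weighted mean of the component centroids. The sketch below shows this for hole-free components given as plain vertex lists; the helper names are hypothetical, and holes (which a MultiPolygon centroid would account for) are deliberately left out to keep the example short.

```python
def ring_area_and_centroid(ring):
    """Signed area and centroid of a closed ring (shoelace-based formulas)."""
    area2 = cx = cy = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        cross = x1 * y2 - x2 * y1
        area2 += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    area = area2 / 2.0
    return area, (cx / (6.0 * area), cy / (6.0 * area))

def region_centroid(rings):
    """Area-weighted centroid of several hole-free components."""
    total = sx = sy = 0.0
    for ring in rings:
        area, (cx, cy) = ring_area_and_centroid(ring)
        weight = abs(area)  # orientation of the ring should not matter
        total += weight
        sx += weight * cx
        sy += weight * cy
    return sx / total, sy / total

# Two equal squares, so the region centroid lies midway between them.
left = [(0, 0), (2, 0), (2, 2), (0, 2)]
right = [(4, 0), (6, 0), (6, 2), (4, 2)]
centroid = region_centroid([left, right])
```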

getRegionComposition

spatialcells.msmt.getRegionComposition(adata, phenotype_col, regions=None, regioncol='region')

Get the cell type composition of a region.

Parameters:
  • adata – AnnData object

  • phenotype_col – list of columns containing the cell type markers

  • regions – List of regions to consider. If None, consider all cells.

  • regioncol – Column containing the region information

Returns:

A dataframe containing the cell type composition of the region
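Conceptually, region composition is a count (and fraction) of each phenotype among the cells that fall in the selected regions. The sketch below shows that idea on plain lists; the helper name, the returned dictionary layout, and the labels are hypothetical and do not reflect the exact dataframe the function returns.

```python
from collections import Counter

def region_composition(phenotypes, regions, keep_regions=None):
    """Map each phenotype to (count, fraction) among cells in the kept regions."""
    kept = [
        p for p, r in zip(phenotypes, regions)
        if keep_regions is None or r in keep_regions
    ]
    counts = Counter(kept)
    total = sum(counts.values())
    return {p: (n, n / total) for p, n in counts.items()}

# Hypothetical per-cell phenotype and region labels.
phenotypes = ["tumor", "tumor", "T cell", "tumor", "T cell"]
regions = ["A", "A", "A", "B", "B"]
comp_a = region_composition(phenotypes, regions, keep_regions=["A"])
comp_all = region_composition(phenotypes, regions)
```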

getRegionDensity

spatialcells.msmt.getRegionDensity(adata, boundary, region_col='region', region_subset=None, phenotype_col=[], exclude_holes=True)

Get the density of cells in a region defined by a MultiPolygon boundary. If phenotype_col is empty, return the total density. If exclude_holes is True, the area of the holes in the region is subtracted from the overall area of the region for density calculation.

Parameters:
  • adata – AnnData object

  • boundary – A MultiPolygon object defining the boundary of the region

  • region_col – Name of the column containing the region. Default is “region”.

  • region_subset – List of regions to consider. If None, consider all cells.

  • phenotype_col – A list of columns to stratify the density by. If empty, return the total density.

  • exclude_holes – whether to exclude the holes in the region

Returns:

density of cells in the region as a pandas Series stratified by phenotype_col
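Density here is the number of cells divided by the region's area, optionally split by phenotype. A minimal sketch on plain Python data follows; the helper name, the phenotype labels, and the area value are hypothetical, and the area is assumed to already have any holes subtracted.

```python
from collections import Counter

def region_density(phenotypes, area):
    """Cells per unit area: (total density, {phenotype: density})."""
    counts = Counter(phenotypes)
    per_type = {p: n / area for p, n in counts.items()}
    return len(phenotypes) / area, per_type

# Four cells inside a hypothetical region of area 96 (holes already excluded).
phenotypes = ["tumor", "tumor", "tumor", "T cell"]
total_density, per_type = region_density(phenotypes, area=96.0)
```

Excluding holes shrinks the denominator, so the same cell count yields a higher density than it would over the filled outline.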

getSlidingWindowsComposition

spatialcells.msmt.getSlidingWindowsComposition(adata, window_size, step_size, phenotype_col, region_col='region', region_subset=None, min_cells=0)

Get the sliding-window cell composition for cells in the region subset.

Parameters:
  • adata – AnnData object

  • window_size – Size of the sliding window

  • step_size – Size of the step

  • phenotype_col – list of columns containing the cell type markers, for cell type composition

  • region_col – Column containing the region information

  • region_subset – List of regions to consider. If None, consider all cells.

  • min_cells – Minimum number of cells in a window to consider it

Returns:

A dataframe containing the cell type composition of the region in each window
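The interplay of window_size, step_size and min_cells can be sketched on plain tuples. The window layout below (square windows anchored at the origin, one every step_size units, keyed by their lower-left corner) is an assumption for illustration, as are the helper names; it is not the library's implementation.

```python
import math
from collections import Counter, defaultdict

def window_indices(v, window_size, step_size):
    """Indices i of windows [i*step, i*step + window) that contain coordinate v."""
    lo = max(0, math.floor((v - window_size) / step_size) + 1)
    hi = math.floor(v / step_size)
    return range(lo, hi + 1)

def sliding_window_composition(cells, window_size, step_size, min_cells=0):
    """cells: (x, y, phenotype) tuples. Returns {window origin: Counter of phenotypes}."""
    windows = defaultdict(Counter)
    for x, y, pheno in cells:
        for i in window_indices(x, window_size, step_size):
            for j in window_indices(y, window_size, step_size):
                windows[(i * step_size, j * step_size)][pheno] += 1
    # Drop sparse windows (assumed rule: keep windows with at least min_cells cells).
    return {k: c for k, c in windows.items() if sum(c.values()) >= min_cells}

# Hypothetical cells with non-negative coordinates.
cells = [(2, 2, "tumor"), (7, 3, "tumor"), (8, 8, "T cell"), (22, 22, "T cell")]
comp = sliding_window_composition(cells, window_size=10, step_size=5, min_cells=2)
```

With step_size smaller than window_size the windows overlap, so a single cell contributes to several windows; min_cells then filters out windows with too few cells to give a stable composition.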

get_comp_mask

spatialcells.msmt.get_comp_mask(df, pheno_col, pheno_vals, step_size)

Get a mask of the composition of the region in each window.

Parameters:
  • df – A dataframe containing the cell type composition of pheno_vals in each window

  • pheno_col – Column containing the cell type information

  • pheno_vals – List of cell types to consider

  • step_size – Size of the step

Returns:

A NumPy array mask of the composition of the region in each window