spatialcells.measurements

Module for computing region-based measurements.

Access via either spatialcells.measurements or spatialcells.msmt.

getCellKDE

spatialcells.msmt.getCellKDE(adata, regions, phenotype_col=None, phenotype_subset=[], bandwidth=1, name='kde_likelihood')

Get the per-cell log likelihood from a kernel density estimate. The likelihoods can be normalized by the area of the region so that they are comparable across regions. They are stored in adata.obs[name].

Parameters:

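adata – AnnData object
regions – A list of regions to compute the density in
phenotype_col – A list of columns to stratify the density by. Default is None.
phenotype_subset – A list of cell type markers to subset the data by
bandwidth – Bandwidth of the kernel density estimate. Default is 1.
name – Name of the column in adata.obs to store the likelihoods in. Default is “kde_likelihood”.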
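To make "per cell log likelihood based on a kernel density estimate" concrete, here is a minimal standard-library sketch. It assumes an isotropic 2D Gaussian kernel, which the documentation above does not state, and it does not apply any region-area normalization. The helper name `gaussian_kde_log_likelihood` is hypothetical and is not part of spatialcells.

```python
import math

def gaussian_kde_log_likelihood(points, bandwidth=1.0):
    """Log of a 2D Gaussian KDE evaluated at each of the input points.

    ASSUMPTION: isotropic Gaussian kernel; the library's kernel may differ.
    """
    n = len(points)
    # Log of the 2D Gaussian normalising constant, averaged over n kernels.
    log_norm = -math.log(2.0 * math.pi * bandwidth ** 2) - math.log(n)
    log_lik = []
    for xi, yi in points:
        exponents = [
            -((xi - xj) ** 2 + (yi - yj) ** 2) / (2.0 * bandwidth ** 2)
            for xj, yj in points
        ]
        # log-sum-exp keeps the computation stable for distant points.
        m = max(exponents)
        lse = m + math.log(sum(math.exp(e - m) for e in exponents))
        log_lik.append(log_norm + lse)
    return log_lik

# Three clustered cells and one isolated cell (made-up coordinates).
cells = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (10.0, 10.0)]
log_lik = gaussian_kde_log_likelihood(cells, bandwidth=1.0)
```

Cells in the dense cluster receive a higher log likelihood than the isolated cell, which is the property that makes the value useful as a local density score.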

getDistanceFromObject

spatialcells.msmt.getDistanceFromObject(adata, object, x='X_centroid', y='Y_centroid', region_col='region', region_subset=None, name='distance', inplace=True, binned=False, binsize=10)

Get the minimum euclidean distance between each cell and a shapely object.

Parameters:

adata – AnnData object
object – Shapely object to measure distance from
x – Name of the column containing the x coordinate. Default is “X_centroid”.
y – Name of the column containing the y coordinate. Default is “Y_centroid”.
region_col – Name of the column containing the region. Default is “region”.
region_subset – List of regions to consider. If None, consider all cells.
name – Name of the column to store the distance in. Default is “distance”.
inplace – If True, add the distance column to adata.obs. If False, return a copy of adata with the distance column added.
binned – If True, bin the distances into bins of size binsize.
binsize – Size of the bins to use for binning. Default is 10.

Returns:

If inplace is False, return a copy of adata with the distance column added
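The sketch below illustrates the idea of a minimum Euclidean distance to a geometric object, followed by optional binning. It is not the library's implementation: it avoids shapely and treats the object as a polyline given by its vertices, and it assumes that binning means flooring each distance to the lower edge of its bin. The helper names `point_to_polyline_distance` and `bin_distance` are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def point_to_polyline_distance(p, vertices):
    """Minimum distance from p to any segment of the polyline."""
    return min(
        point_segment_distance(p, a, b)
        for a, b in zip(vertices, vertices[1:])
    )

def bin_distance(d, binsize=10):
    """ASSUMPTION: a binned distance is the lower edge of its bin."""
    return math.floor(d / binsize) * binsize

boundary = [(0.0, 0.0), (100.0, 0.0)]               # a straight boundary line
cells = [(50.0, 25.0), (-3.0, 4.0), (120.0, 0.0)]   # made-up cell centroids
distances = [point_to_polyline_distance(c, boundary) for c in cells]
binned = [bin_distance(d, binsize=10) for d in distances]
```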

getDistanceFromPoint

spatialcells.msmt.getDistanceFromPoint(adata, point, x='X_centroid', y='Y_centroid', region_col='region', region_subset=None, metric='angular', name='distance', inplace=True, binned=False, binsize=10)

Get the distance of each cell from a point.

Parameters:

adata – AnnData object
point – Coordinates of the point to calculate distance from, given as an iterable (x, y)
x – Name of the column containing the x coordinate. Default is “X_centroid”.
y – Name of the column containing the y coordinate. Default is “Y_centroid”.
region_col – Name of the column containing the region. Default is “region”.
region_subset – List of regions to consider. If None, consider all cells.
metric – Metric to use for the distance calculation, either “angular” or “euclidean”. Default is “angular”.
name – Name of the column to store the distance in. Default is “distance”.
inplace – If True, add the distance column to adata.obs. If False, return a copy of adata with the distance column added.
binned – If True, bin the distances into bins of size binsize.
binsize – Size of the bins to use for binning. Default is 10.

Returns:

If inplace is False, return a copy of adata with the distance column added
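The two metrics can be contrasted with a small standard-library sketch. The documentation above does not define “angular”, so the sketch assumes it means the direction of the cell as seen from the point, in degrees, computed with atan2. That reading is an assumption and should be checked against the library source. The helper name `distance_from_point` is hypothetical.

```python
import math

def distance_from_point(cell, point, metric="euclidean"):
    """Distance of one cell (x, y) from a reference point (x, y)."""
    dx = cell[0] - point[0]
    dy = cell[1] - point[1]
    if metric == "euclidean":
        return math.hypot(dx, dy)
    if metric == "angular":
        # ASSUMPTION: angle of the vector point -> cell, in degrees,
        # in the range [-180, 180]. The library may define this differently.
        return math.degrees(math.atan2(dy, dx))
    raise ValueError(f"unknown metric: {metric!r}")

origin = (0.0, 0.0)
euclid = distance_from_point((3.0, 4.0), origin, metric="euclidean")
angle = distance_from_point((0.0, 1.0), origin, metric="angular")
```

Under this reading, the Euclidean metric orders cells by how far they are from the point, while the angular metric orders them by direction around it.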

getMPI

spatialcells.msmt.getMPI(adata, prolif_markers, arrest_markers, thresh_prolif=0.5, thresh_arrest=0.5, use_obs=False, use_layer=None, col_name='MPI', inplace=True)

Get MPI from a list of markers and thresholds, adapted from Gaglia et al. 2022 (https://doi.org/10.1038/s41556-022-00860-9). The MPI is defined as follows:

-1 if max(arrest_markers) > thresh_arrest
1 else if max(prolif_markers) > thresh_prolif
0 otherwise

Parameters:

adata – AnnData object
prolif_markers – List of proliferation markers
arrest_markers – List of arrest markers
thresh_prolif – Threshold for proliferation. Default is 0.5
thresh_arrest – Threshold for arrest, which should be set based on the expression levels of KI67 marker. Default is 0.5
use_obs – If True, read the marker values from adata.obs. Overrides use_layer. If use_obs==False and use_layer is None, use adata.X
use_layer – Layer to use for the analysis. If use_obs==False and use_layer is None, use adata.X
col_name – Name of the column to add to adata.obs
inplace – If True, add the column to adata.obs. If False, return a copy of adata with the column added

Returns:

None if inplace is True (the column is added to adata.obs). If inplace is False, a copy of adata with the column added.
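The three-way MPI rule can be written out for a single cell as follows. This is a sketch of the rule as stated in the description, not the library's code; the helper name `mpi_label` and the marker values are hypothetical.

```python
def mpi_label(prolif_values, arrest_values, thresh_prolif=0.5, thresh_arrest=0.5):
    """Return -1 (arrested), 1 (proliferating) or 0 (neither) for one cell."""
    # The arrest check comes first, so it wins when both thresholds are exceeded.
    if max(arrest_values) > thresh_arrest:
        return -1
    if max(prolif_values) > thresh_prolif:
        return 1
    return 0

# Made-up marker intensities for three cells.
cells = [
    {"prolif": [0.9, 0.2], "arrest": [0.1]},  # proliferating
    {"prolif": [0.9, 0.2], "arrest": [0.8]},  # arrest overrides proliferation
    {"prolif": [0.3, 0.1], "arrest": [0.2]},  # neither
]
labels = [mpi_label(c["prolif"], c["arrest"]) for c in cells]
```

Because both comparisons are strict, a value exactly equal to its threshold does not trigger the corresponding label.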

getMinCellTypesDistance

spatialcells.msmt.getMinCellTypesDistance(adata1, adata2)

Return the minimum distance between cell types of two AnnData objects.

Parameters:

adata1 – AnnData object
adata2 – AnnData object

Returns:

minimum distance between cell types

getRegionArea

spatialcells.msmt.getRegionArea(boundary, exclude_holes=True)

Get the area of a region defined by a MultiPolygon boundary. If exclude_holes is True, the area of the holes in the region is subtracted from the overall area of the region.

Parameters:

boundary – MultiPolygon boundary of the region
exclude_holes – Whether to exclude the holes in the region. Default is True.

Returns:

area of the region
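The effect of exclude_holes can be illustrated with the shoelace formula. The sketch below does not use shapely: it represents a multi-part region as a list of (exterior ring, list of hole rings) pairs, which is a hypothetical stand-in for a MultiPolygon, and the helper names are likewise hypothetical.

```python
def ring_area(ring):
    """Unsigned area of a closed ring of (x, y) vertices (shoelace formula)."""
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

def region_area(polygons, exclude_holes=True):
    """Total area of a list of (exterior, holes) polygons."""
    total = 0.0
    for exterior, holes in polygons:
        total += ring_area(exterior)
        if exclude_holes:
            # Holes lie inside the exterior, so their area is subtracted.
            total -= sum(ring_area(hole) for hole in holes)
    return total

square = [(0, 0), (10, 0), (10, 10), (0, 10)]
hole = [(4, 4), (6, 4), (6, 6), (4, 6)]
region = [(square, [hole])]

area_without_holes = region_area(region, exclude_holes=True)   # 100 - 4
area_with_holes = region_area(region, exclude_holes=False)     # 100
```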

getRegionCentroid

spatialcells.msmt.getRegionCentroid(boundary)

Get the centroid of a region defined by a MultiPolygon boundary.

Parameters:

boundary – A MultiPolygon object defining the boundary of the region

Returns:

The centroid of the region

getRegionComposition

spatialcells.msmt.getRegionComposition(adata, phenotype_col, regions=None, regioncol='region')

Get the cell type composition of a region.

Parameters:

adata – AnnData object
phenotype_col – List of columns containing the cell type markers
regions – List of regions to consider. If None, consider all cells.
regioncol – Column containing the region information

Returns:

A dataframe containing the cell type composition of the region
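As an illustration of what a cell type composition contains, the sketch below counts phenotypes among the cells of selected regions and converts the counts to fractions. It uses plain dictionaries instead of an AnnData object, and the output layout is an assumption: the documentation only says that a dataframe is returned. The helper name `region_composition` is hypothetical.

```python
from collections import Counter

def region_composition(cells, regions=None):
    """Count and fraction of each phenotype among cells in the given regions.

    cells: list of dicts with "region" and "phenotype" keys.
    regions: list of region names, or None to use all cells.
    """
    selected = [c for c in cells if regions is None or c["region"] in regions]
    counts = Counter(c["phenotype"] for c in selected)
    total = sum(counts.values())
    return {
        phenotype: {"count": n, "fraction": n / total}
        for phenotype, n in counts.items()
    }

# Made-up cells: four in region "A", one in region "B".
cells = [
    {"region": "A", "phenotype": "tumor"},
    {"region": "A", "phenotype": "tumor"},
    {"region": "A", "phenotype": "tumor"},
    {"region": "A", "phenotype": "T cell"},
    {"region": "B", "phenotype": "B cell"},
]
composition_a = region_composition(cells, regions=["A"])
composition_all = region_composition(cells)
```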

getRegionDensity

spatialcells.msmt.getRegionDensity(adata, boundary, region_col='region', region_subset=None, phenotype_col=[], exclude_holes=True)

Get the density of cells in a region defined by a MultiPolygon boundary. If phenotype_col is empty, return the total density. If exclude_holes is True, the area of the holes in the region is subtracted from the overall area of the region for the density calculation.

Parameters:

adata – AnnData object
boundary – A MultiPolygon object defining the boundary of the region
region_col – Name of the column containing the region. Default is “region”.
region_subset – List of regions to consider. If None, consider all cells.
phenotype_col – A list of columns to stratify the density by. If empty, return the total density.
exclude_holes – Whether to exclude the holes in the region. Default is True.

Returns:

density of cells in the region as a pandas Series stratified by phenotype_col
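Density here is a cell count divided by the region area, optionally stratified by phenotype. The sketch below shows that arithmetic with made-up numbers. It takes the area as a plain number rather than computing it from a boundary, and the helper name `region_density` is hypothetical.

```python
from collections import Counter

def region_density(phenotypes, area, stratify=True):
    """Cells per unit area, either per phenotype or in total."""
    if area <= 0:
        raise ValueError("area must be positive")
    if not stratify:
        return len(phenotypes) / area
    counts = Counter(phenotypes)
    return {phenotype: n / area for phenotype, n in counts.items()}

# Made-up region: 48 tumor cells and 24 immune cells.
phenotypes = ["tumor"] * 48 + ["immune"] * 24

# The same cells over an area of 96 (holes excluded) or 100 (holes included).
density_excl = region_density(phenotypes, area=96.0)
density_incl = region_density(phenotypes, area=100.0)
total_density = region_density(phenotypes, area=96.0, stratify=False)
```

Excluding holes shrinks the denominator, so the reported density is higher than when the hole area is counted.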

getSlidingWindowsComposition

spatialcells.msmt.getSlidingWindowsComposition(adata, window_size, step_size, phenotype_col, region_col='region', region_subset=None, min_cells=0)

Get the sliding-window cell composition for cells in the region subset.

Parameters:

adata – AnnData object
window_size – Size of the sliding window
step_size – Size of the step
phenotype_col – List of columns containing the cell type markers, used for the cell type composition
region_col – Column containing the region information
region_subset – List of regions to consider. If None, consider all cells.
min_cells – Minimum number of cells in a window to consider it

Returns:

A dataframe containing the cell type composition of the region in each window
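A sliding-window composition can be sketched with the standard library as follows. The sketch assumes square windows of side window_size whose lower-left corners advance by step_size in both x and y, starting at the minimum cell coordinates, and it keeps a window when it holds at least min_cells cells. None of these details are stated in the documentation above, and the helper name `sliding_window_composition` is hypothetical.

```python
from collections import Counter

def sliding_window_composition(cells, window_size, step_size, min_cells=0):
    """Phenotype counts per window for cells given as (x, y, phenotype)."""
    xs = [c[0] for c in cells]
    ys = [c[1] for c in cells]
    x_max, y_max = max(xs), max(ys)
    rows = []
    wx = min(xs)
    while wx <= x_max:
        wy = min(ys)
        while wy <= y_max:
            # Half-open window [wx, wx + size) x [wy, wy + size).
            inside = [
                ph for x, y, ph in cells
                if wx <= x < wx + window_size and wy <= y < wy + window_size
            ]
            # ASSUMPTION: windows with fewer than min_cells cells are dropped.
            if len(inside) >= min_cells:
                rows.append({"x": wx, "y": wy, "counts": dict(Counter(inside))})
            wy += step_size
        wx += step_size
    return rows

# Made-up cells on a small grid.
cells = [(0, 0, "tumor"), (1, 1, "tumor"), (5, 5, "immune"), (9, 9, "immune")]
rows = sliding_window_composition(cells, window_size=4, step_size=2, min_cells=1)
```

Because step_size is smaller than window_size, neighbouring windows overlap and a single cell can contribute to several rows.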

get_comp_mask

spatialcells.msmt.get_comp_mask(df, pheno_col, pheno_vals, step_size)

Get a mask of the composition of the region in each window.

Parameters:

df – A dataframe containing the cell type composition of pheno_vals in each window
pheno_col – Column containing the cell type information
pheno_vals – List of cell types to consider
step_size – Size of the step

Returns:

A NumPy array mask of the composition of the region in each window