spatialcells.measurements
Module for computing region-based measurements.
Access via either spatialcells.measurements or spatialcells.msmt.
getCellKDE
- spatialcells.msmt.getCellKDE(adata, regions, phenotype_col=None, phenotype_subset=[], bandwidth=1, name='kde_likelihood')
Get the per-cell log-likelihood based on a kernel density estimate. The result can be normalized by the area of the region so that it is comparable across regions. Likelihoods are stored in adata.obs[name].
- Parameters:
adata – AnnData object
regions – A list of regions to compute the density in
phenotype_col – A list of columns to stratify the density by.
phenotype_subset – A list of cell type markers to subset the data by
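As a rough illustration of what a per-cell log-likelihood under a kernel density estimate means, here is a dependency-free sketch. It assumes an isotropic 2D Gaussian kernel, and the function name gaussian_kde_log_likelihood is hypothetical. It is not the library's implementation, which may use a different kernel or estimator.

```python
import math

def gaussian_kde_log_likelihood(points, bandwidth=1.0):
    """Log of a 2D Gaussian KDE evaluated at each input point.

    ASSUMPTION: isotropic Gaussian kernel; illustrative only, not the
    spatialcells implementation.
    """
    n = len(points)
    # Normalization constant of a 2D Gaussian kernel, averaged over n points.
    norm = n * 2.0 * math.pi * bandwidth ** 2
    log_lik = []
    for xi, yi in points:
        total = 0.0
        for xj, yj in points:
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        log_lik.append(math.log(total / norm))
    return log_lik

# Three clustered cells and one isolated cell (hypothetical coordinates).
cells = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
log_lik = gaussian_kde_log_likelihood(cells, bandwidth=1.0)
```

Cells in dense neighborhoods receive a higher log-likelihood than isolated cells, which is what makes the value useful as a local density score.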
getDistanceFromObject
- spatialcells.msmt.getDistanceFromObject(adata, object, x='X_centroid', y='Y_centroid', region_col='region', region_subset=None, name='distance', inplace=True, binned=False, binsize=10)
Get the minimum euclidean distance between each cell and a shapely object.
- Parameters:
adata – AnnData object
object – Shapely object to measure distance from
x – Name of the column containing the x coordinate. Default is “X_centroid”.
y – Name of the column containing the y coordinate. Default is “Y_centroid”.
region_col – Name of the column containing the region. Default is “region”.
region_subset – List of regions to consider. If None, consider all cells.
name – Name of the column to store the distance in. Default is “distance”.
inplace – If True, add the distance column to adata.obs. If False, return a copy of adata with the distance column added.
binned – If True, bin the distances into bins of size binsize.
binsize – Size of the bins to use for binning. Default is 10.
- Returns:
If inplace is False, return a copy of adata with the distance column added
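To make "minimum euclidean distance to an object" concrete, the following sketch measures the distance from each cell to the outline of a polygon using only the standard library. The helper names and coordinates are hypothetical. Note one difference from a filled shapely Polygon: this sketch measures to the outline, so a cell inside the shape gets a positive distance, whereas shapely reports 0 for a point inside a filled polygon.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_ring(p, ring):
    """Minimum distance from p to a closed ring given as a vertex list."""
    n = len(ring)
    return min(
        point_segment_distance(p, ring[i], ring[(i + 1) % n])
        for i in range(n)
    )

square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
cells = [(15.0, 5.0), (5.0, 2.0), (13.0, 14.0)]
distances = [distance_to_ring(c, square) for c in cells]
```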
getDistanceFromPoint
- spatialcells.msmt.getDistanceFromPoint(adata, point, x='X_centroid', y='Y_centroid', region_col='region', region_subset=None, metric='angular', name='distance', inplace=True, binned=False, binsize=10)
Get the distance of each cell from a point.
- Parameters:
adata – AnnData object
point – Iterable (x, y) coordinate of the point to calculate distance from
x – Name of the column containing the x coordinate. Default is “X_centroid”.
y – Name of the column containing the y coordinate. Default is “Y_centroid”.
region_col – Name of the column containing the region. Default is “region”.
region_subset – List of regions to consider. If None, consider all cells.
metric – Metric to use for the distance calculation, either “angular” or “euclidean”. Default is “angular”.
name – Name of the column to store the distance in. Default is “distance”.
inplace – If True, add the distance column to adata.obs. If False, return a copy of adata with the distance column added.
binned – If True, bin the distances into bins of size binsize.
binsize – Size of the bins to use for binning. Default is 10.
- Returns:
If inplace is False, return a copy of adata with the distance column added
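The sketch below illustrates the two metrics and the binning option with plain Python. Two points are assumptions, not documented behavior: the “angular” metric is read here as the direction of the cell from the point in degrees, and a binned value is labeled by the lower edge of its bin. The function names are hypothetical.

```python
import math

def distance_from_point(cell, point, metric="euclidean"):
    """Distance of one cell from a reference point."""
    dx = cell[0] - point[0]
    dy = cell[1] - point[1]
    if metric == "euclidean":
        return math.hypot(dx, dy)
    if metric == "angular":
        # ASSUMPTION: "angular" means the direction of the cell from the
        # point, in degrees within [0, 360). The library may define it
        # differently.
        return math.degrees(math.atan2(dy, dx)) % 360.0
    raise ValueError(f"unknown metric: {metric!r}")

def bin_value(value, binsize=10):
    """ASSUMPTION: a binned value is the lower edge of its bin."""
    return math.floor(value / binsize) * binsize

center = (0.0, 0.0)
euclid = distance_from_point((3.0, 4.0), center, metric="euclidean")
angle = distance_from_point((0.0, 1.0), center, metric="angular")
binned = bin_value(27.3, binsize=10)
```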
getMPI
- spatialcells.msmt.getMPI(adata, prolif_markers, arrest_markers, thresh_prolif=0.5, thresh_arrest=0.5, use_obs=False, use_layer=None, col_name='MPI', inplace=True)
Get MPI from a list of markers and thresholds, adapted from Gaglia et al. 2022 https://doi.org/10.1038/s41556-022-00860-9. The MPI is defined as follows:
-1 if max(arrest_markers) > thresh_arrest
1 else if max(prolif_markers) > thresh_prolif
0 otherwise
- Parameters:
adata – AnnData object
prolif_markers – List of proliferation markers
arrest_markers – List of arrest markers
thresh_prolif – Threshold for proliferation. Default is 0.5
thresh_arrest – Threshold for arrest, which should be set based on the expression levels of KI67 marker. Default is 0.5
use_obs – If True, use adata.obs[use_obs] to get the markers. Overrides use_layer. If use_obs==False and use_layer is None, use adata.X
use_layer – Layer to use for the analysis. If use_obs==False and use_layer is None, use adata.X
col_name – Name of the column to add to adata.obs
inplace – If True, add the column to adata.obs. If False, return a copy of adata with the column added
- Returns:
If inplace is True, None (the column is added to adata.obs). If inplace is False, a copy of adata with the column added.
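The MPI rule can be written out for a single cell as a minimal sketch. The function name mpi and the marker values are hypothetical. The order of the checks matters: arrest takes precedence over proliferation.

```python
def mpi(prolif_values, arrest_values, thresh_prolif=0.5, thresh_arrest=0.5):
    """Return -1, 1, or 0 for one cell, following the rule stated above."""
    # Arrest is checked first, so it overrides proliferation.
    if max(arrest_values) > thresh_arrest:
        return -1
    if max(prolif_values) > thresh_prolif:
        return 1
    return 0

# Hypothetical marker intensities: (proliferation markers, arrest markers).
cells = {
    "arrested": ([0.9, 0.8], [0.7]),
    "proliferating": ([0.9, 0.1], [0.2]),
    "neither": ([0.1, 0.2], [0.3]),
}
scores = {name: mpi(p, a) for name, (p, a) in cells.items()}
```

In this example the "arrested" cell has high proliferation markers as well, yet it is scored -1 because an arrest marker exceeds its threshold.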
getMinCellTypesDistance
- spatialcells.msmt.getMinCellTypesDistance(adata1, adata2)
Return the minimum distance between cell types of two AnnData objects.
- Parameters:
adata1 – AnnData object
adata2 – AnnData object
- Returns:
minimum distance between cell types
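One way to read this measurement is the smallest euclidean distance over all pairs of cells, one taken from each object. The brute-force sketch below assumes that reading and works on plain coordinate lists. The function name is hypothetical, and the library may compute the value differently, for example with a spatial index.

```python
import math

def min_pairwise_distance(coords1, coords2):
    """Smallest euclidean distance between any point of coords1 and any
    point of coords2 (brute force, O(len(coords1) * len(coords2)))."""
    return min(math.dist(a, b) for a in coords1 for b in coords2)

# Hypothetical centroids of two cell populations.
tumor_cells = [(0.0, 0.0), (1.0, 1.0)]
immune_cells = [(4.0, 5.0), (10.0, 10.0)]
closest = min_pairwise_distance(tumor_cells, immune_cells)
```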
getRegionArea
- spatialcells.msmt.getRegionArea(boundary, exclude_holes=True)
Get the area of a region defined by a MultiPolygon boundary. If exclude_holes is True, the area of the holes in the region is subtracted from the overall area of the region.
- Parameters:
boundary – MultiPolygon boundary of the region
exclude_holes – whether to exclude the holes in the region
- Returns:
area of the region
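The effect of exclude_holes can be illustrated with the shoelace formula on plain vertex lists. This is a sketch only: the function names are hypothetical, and the region is represented as a list of (exterior, holes) pairs standing in for the components of a MultiPolygon.

```python
def ring_area(ring):
    """Unsigned area of a closed ring via the shoelace formula."""
    n = len(ring)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

def region_area(polygons, exclude_holes=True):
    """polygons: list of (exterior_ring, [hole_rings]) pairs."""
    total = 0.0
    for exterior, holes in polygons:
        total += ring_area(exterior)
        if exclude_holes:
            # Subtract each hole from the area of its enclosing polygon.
            total -= sum(ring_area(h) for h in holes)
    return total

exterior = [(0, 0), (10, 0), (10, 10), (0, 10)]
hole = [(4, 4), (6, 4), (6, 6), (4, 6)]
region = [(exterior, [hole])]
area_without_holes = region_area(region, exclude_holes=True)
area_with_holes = region_area(region, exclude_holes=False)
```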
getRegionCentroid
- spatialcells.msmt.getRegionCentroid(boundary)
Get the centroid of a region defined by a MultiPolygon boundary.
- Parameters:
boundary – A MultiPolygon object defining the boundary of the region
- Returns:
The centroid of the region
getRegionComposition
- spatialcells.msmt.getRegionComposition(adata, phenotype_col, regions=None, regioncol='region')
Get the cell type composition of a region.
- Parameters:
adata – AnnData object
phenotype_col – list of columns containing the cell type markers
regions – List of regions to consider. If None, consider all cells.
regioncol – Column containing the region information
- Returns:
A dataframe containing the cell type composition of the region
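Conceptually, a region's composition is the count and fraction of each cell type among the cells assigned to that region. The sketch below shows this with the standard library. The function name and the output layout are assumptions and do not reflect the columns of the dataframe the library returns.

```python
from collections import Counter, defaultdict

def region_composition(cells):
    """cells: iterable of (region, phenotype) pairs.

    Returns {region: {phenotype: (count, fraction)}}.
    """
    counts = defaultdict(Counter)
    for region, phenotype in cells:
        counts[region][phenotype] += 1
    result = {}
    for region, counter in counts.items():
        total = sum(counter.values())
        result[region] = {p: (n, n / total) for p, n in counter.items()}
    return result

# Hypothetical region and phenotype labels.
cells = [
    ("tumor", "T cell"),
    ("tumor", "T cell"),
    ("tumor", "B cell"),
    ("stroma", "B cell"),
]
composition = region_composition(cells)
```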
getRegionDensity
- spatialcells.msmt.getRegionDensity(adata, boundary, region_col='region', region_subset=None, phenotype_col=[], exclude_holes=True)
Get the density of cells in a region defined by a MultiPolygon boundary. If phenotype_col is empty, return the total density. If exclude_holes is True, the area of the holes in the region is subtracted from the overall area of the region for the density calculation.
- Parameters:
adata – AnnData object
boundary – A MultiPolygon object defining the boundary of the region
region_col – Name of the column containing the region. Default is “region”.
region_subset – List of regions to consider. If None, consider all cells.
phenotype_col – A list of columns to stratify the density by. If empty, return the total density.
exclude_holes – whether to exclude the holes in the region
- Returns:
density of cells in the region as a pandas Series stratified by phenotype_col
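Density here is cell count divided by region area, optionally split by phenotype. A minimal sketch, with a hypothetical function name and made-up numbers, assuming the area has already been computed (with or without holes):

```python
from collections import Counter

def region_density(phenotypes, area):
    """Cells per unit area for each phenotype label."""
    if area <= 0:
        raise ValueError("area must be positive")
    counts = Counter(phenotypes)
    return {p: n / area for p, n in counts.items()}

# Four cells inside a region of area 96 (e.g. a 10 x 10 square minus a
# 2 x 2 hole).
phenotypes = ["T cell", "T cell", "T cell", "B cell"]
area = 96.0
density = region_density(phenotypes, area)
total_density = len(phenotypes) / area
```

The stratified densities sum to the total density, so excluding holes raises every value by the same factor.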
getSlidingWindowsComposition
- spatialcells.msmt.getSlidingWindowsComposition(adata, window_size, step_size, phenotype_col, region_col='region', region_subset=None, min_cells=0)
Get the sliding-window cell composition for cells in the region subset.
- Parameters:
adata – AnnData object
window_size – Size of the sliding window
step_size – Size of the step
phenotype_col – list of columns containing the cell type markers, for cell type composition
region_col – Column containing the region information
region_subset – List of regions to consider. If None, consider all cells.
min_cells – Minimum number of cells in a window to consider it
- Returns:
A dataframe containing the cell type composition of the region in each window
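The sliding-window idea can be sketched as follows: square windows of side window_size are placed every step_size units along x and y, the phenotypes inside each window are counted, and windows with fewer than min_cells cells are dropped. The window shape, the half-open membership test, the grid origin at the minimum coordinates, and the output layout are all assumptions for illustration.

```python
from collections import Counter

def sliding_window_composition(cells, window_size, step_size, min_cells=0):
    """cells: list of (x, y, phenotype) tuples.

    Returns one dict per kept window with its origin and phenotype counts.
    """
    xs = [c[0] for c in cells]
    ys = [c[1] for c in cells]
    x0, y0 = min(xs), min(ys)
    # Number of window origins needed to cover the data along each axis.
    nx = int((max(xs) - x0) // step_size) + 1
    ny = int((max(ys) - y0) // step_size) + 1
    rows = []
    for j in range(ny):
        wy = y0 + j * step_size
        for i in range(nx):
            wx = x0 + i * step_size
            counts = Counter(
                p for x, y, p in cells
                if wx <= x < wx + window_size and wy <= y < wy + window_size
            )
            if sum(counts.values()) >= min_cells:
                rows.append({"x": wx, "y": wy, "counts": dict(counts)})
    return rows

cells = [(0, 0, "T"), (1, 1, "T"), (2, 2, "B"), (9, 9, "B")]
rows = sliding_window_composition(cells, window_size=5, step_size=5, min_cells=1)
```

With step_size smaller than window_size the windows overlap, which smooths the composition map at the cost of more windows to evaluate.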
get_comp_mask
- spatialcells.msmt.get_comp_mask(df, pheno_col, pheno_vals, step_size)
Get a mask of the composition of the region in each window
- Parameters:
df – A dataframe containing the cell type composition of pheno_vals in each window
pheno_col – Column containing the cell type information
pheno_vals – List of cell types to consider
step_size – Size of the step
- Returns:
A np array mask of the composition of the region in each window