This page was generated from notebooks/training/panopticnets/Nuclear Segmentation - DeepWatershed.ipynb
Training a segmentation model
deepcell-tf leverages Jupyter Notebooks in order to train models. Example notebooks are available for most model architectures in the notebooks folder. Most notebooks are structured similarly to this example, and thus this notebook serves as a core reference for the deepcell approach to model training.
[ ]:
import os
import matplotlib.pyplot as plt
import numpy as np
from skimage.feature import peak_local_max
import tensorflow as tf
from deepcell.applications import NuclearSegmentation
from deepcell.image_generators import CroppingDataGenerator
from deepcell.losses import weighted_categorical_crossentropy
from deepcell.model_zoo.panopticnet import PanopticNet
from deepcell.utils.train_utils import count_gpus, rate_scheduler
from deepcell_toolbox.deep_watershed import deep_watershed
from deepcell_toolbox.metrics import Metrics
from deepcell_toolbox.processing import histogram_normalization
File paths
[ ]:
data_dir = '/notebooks/data'
model_path = 'NuclearSegmentation'
metrics_path = 'metrics.yaml'
train_log = 'train_log.csv'
Load the data
The DynamicNuclearNet segmentation dataset can be downloaded from https://datasets.deepcell.org/
[ ]:
with np.load(os.path.join(data_dir, 'train.npz')) as data:
    X_train = data['X']
    y_train = data['y']

with np.load(os.path.join(data_dir, 'val.npz')) as data:
    X_val = data['X']
    y_val = data['y']

with np.load(os.path.join(data_dir, 'test.npz')) as data:
    X_test = data['X']
    y_test = data['y']
Training parameters
The majority of DeepCell models support a variety of backbone choices, specified in the “backbone” parameter. Backbones are provided through keras_applications and can be instantiated with weights that are pretrained on ImageNet.
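As a minimal sketch (not part of the original notebook; it assumes PanopticNet forwards use_imagenet to the backbone constructor and that the backbone string is one deepcell supports), a different backbone with ImageNet-pretrained weights could be requested like this:

from deepcell.model_zoo.panopticnet import PanopticNet

# Hypothetical illustration: swap in a ResNet50 backbone and request
# ImageNet-pretrained backbone weights.
alt_model = PanopticNet(
    backbone="resnet50",
    input_shape=(256, 256, 1),
    num_semantic_classes=[1, 1, 2],
    use_imagenet=True,
)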
[ ]:
# Model architecture
backbone = "efficientnetv2bl"
location = True
pyramid_levels = ["P1","P2","P3","P4","P5","P6","P7"]
[ ]:
# Augmentation and transform parameters
seed = 0
min_objects = 1
zoom_min = 0.75
crop_size = 256
outer_erosion_width = 1
inner_distance_alpha = "auto"
inner_distance_beta = 1
inner_erosion_width = 0
[ ]:
# Post processing parameters
maxima_threshold = 0.1
interior_threshold = 0.01
exclude_border = False
small_objects_threshold = 0
min_distance = 10
radius = 10  # peak-detection radius used by deep_watershed below (value assumed)
[ ]:
# Training configuration
epochs = 16
batch_size = 16
lr = 1e-4
Create data generators
[ ]:
# data augmentation parameters
zoom_max = 1 / zoom_min
# Preprocess the data
X_train = histogram_normalization(X_train)
X_val = histogram_normalization(X_val)
# use augmentation for training but not validation
datagen = CroppingDataGenerator(
    rotation_range=180,
    zoom_range=(zoom_min, zoom_max),
    horizontal_flip=True,
    vertical_flip=True,
    crop_size=(crop_size, crop_size),
)

datagen_val = CroppingDataGenerator(
    crop_size=(crop_size, crop_size)
)
[ ]:
transforms = ["inner-distance", "outer-distance", "fgbg"]
transforms_kwargs = {
    "outer-distance": {"erosion_width": outer_erosion_width},
    "inner-distance": {
        "alpha": inner_distance_alpha,
        "beta": inner_distance_beta,
        "erosion_width": inner_erosion_width,
    },
}

train_data = datagen.flow(
    {'X': X_train, 'y': y_train},
    seed=seed,
    min_objects=min_objects,
    transforms=transforms,
    transforms_kwargs=transforms_kwargs,
    batch_size=batch_size,
)
print("Created training data generator.")

val_data = datagen_val.flow(
    {'X': X_val, 'y': y_val},
    seed=seed,
    min_objects=min_objects,
    transforms=transforms,
    transforms_kwargs=transforms_kwargs,
    batch_size=batch_size,
)
print("Created validation data generator.")
print("Created validation data generator.")
Visualize the data generator output.
[ ]:
inputs, outputs = train_data.next()
img = inputs[0]
inner_distance = outputs[0]
outer_distance = outputs[1]
fgbg = outputs[2]
fig, axes = plt.subplots(1, 4, figsize=(15, 15))
axes[0].imshow(img[..., 0])
axes[0].set_title('Source Image')
axes[1].imshow(inner_distance[0, ..., 0])
axes[1].set_title('Inner Distance')
axes[2].imshow(outer_distance[0, ..., 0])
axes[2].set_title('Outer Distance')
axes[3].imshow(fgbg[0, ..., 0])
axes[3].set_title('Foreground/Background')
plt.show()
Create the PanopticNet Model
Here we instantiate a PanopticNet model from deepcell.model_zoo using three semantic heads: inner distance (1 class), outer distance (1 class), and foreground/background (2 classes).
[ ]:
input_shape = (crop_size, crop_size, 1)
model = PanopticNet(
    backbone=backbone,
    input_shape=input_shape,
    norm_method=None,
    num_semantic_classes=[1, 1, 2],  # inner distance, outer distance, fgbg
    location=location,
    include_top=True,
    backbone_levels=["C1", "C2", "C3", "C4", "C5"],
    pyramid_levels=pyramid_levels,
)
Create a loss function for each semantic head
Each semantic head is trained with its own loss function. Mean Square Error is used for regression-based heads, whereas weighted_categorical_crossentropy is used for classification heads.
The losses are saved as a dictionary and passed to model.compile.
[ ]:
def semantic_loss(n_classes):
    def _semantic_loss(y_pred, y_true):
        if n_classes > 1:
            return 0.01 * weighted_categorical_crossentropy(
                y_pred, y_true, n_classes=n_classes
            )
        return tf.keras.losses.MSE(y_pred, y_true)
    return _semantic_loss

loss = {}

# Give losses for all of the semantic heads
for layer in model.layers:
    if layer.name.startswith("semantic_"):
        n_classes = layer.output_shape[-1]
        loss[layer.name] = semantic_loss(n_classes)

optimizer = tf.keras.optimizers.Adam(lr=lr, clipnorm=0.001)
model.compile(loss=loss, optimizer=optimizer)
Train the model
Call fit
on the compiled model, along with a default set of callbacks.
[ ]:
# Clear clutter from previous TensorFlow graphs.
tf.keras.backend.clear_session()
monitor = "val_loss"
csv_logger = tf.keras.callbacks.CSVLogger(train_log)
# Create callbacks for early stopping and pruning.
callbacks = [
    tf.keras.callbacks.ModelCheckpoint(
        model_path,
        monitor=monitor,
        save_best_only=True,
        verbose=1,
        save_weights_only=False,
    ),
    tf.keras.callbacks.LearningRateScheduler(rate_scheduler(lr=lr, decay=0.99)),
    tf.keras.callbacks.ReduceLROnPlateau(
        monitor=monitor,
        factor=0.1,
        patience=5,
        verbose=1,
        mode="auto",
        min_delta=0.0001,
        cooldown=0,
        min_lr=0,
    ),
    tf.keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True),
    csv_logger,
]
print(f"Training on {count_gpus()} GPUs.")
# Train model.
history = model.fit(
    train_data,
    steps_per_epoch=train_data.y.shape[0] // batch_size,
    epochs=epochs,
    validation_data=val_data,
    validation_steps=val_data.y.shape[0] // batch_size,
    callbacks=callbacks,
)
print("Final", monitor, ":", history.history[monitor][-1])
Save prediction model
We can now create a new prediction model without the foreground/background semantic head. While this head is very useful during training, its output is unused during prediction. By using model.load_weights(path, by_name=True), the semantic head can be removed.
[ ]:
import tempfile

with tempfile.TemporaryDirectory() as tmpdirname:
    weights_path = os.path.join(str(tmpdirname), "model_weights.h5")
    model.save_weights(weights_path, save_format="h5")

    prediction_model = PanopticNet(
        backbone=backbone,
        input_shape=input_shape,
        norm_method=None,
        num_semantic_heads=2,
        num_semantic_classes=[1, 1],  # inner distance, outer distance
        location=location,  # should always be true
        include_top=True,
        backbone_levels=["C1", "C2", "C3", "C4", "C5"],
        pyramid_levels=pyramid_levels,
    )
    prediction_model.load_weights(weights_path, by_name=True)
Predict on test data
[ ]:
X_test = histogram_normalization(X_test)
test_images = prediction_model.predict(X_test)
[ ]:
index = np.random.choice(X_test.shape[0])
print(index)

fig, axes = plt.subplots(1, 4, figsize=(20, 20))

masks = deep_watershed(
    test_images,
    radius=radius,
    maxima_threshold=maxima_threshold,
    interior_threshold=interior_threshold,
    exclude_border=exclude_border,
    small_objects_threshold=small_objects_threshold,
    min_distance=min_distance
)

# calculated in the postprocessing above, but useful for visualizing
inner_distance = test_images[0]
outer_distance = test_images[1]

coords = peak_local_max(
    inner_distance[index],
    min_distance=min_distance
)

# raw image with centroids
axes[0].imshow(X_test[index, ..., 0])
axes[0].scatter(coords[..., 1], coords[..., 0],
                color='r', marker='.', s=10)
axes[0].set_title('Source Image')

axes[1].imshow(inner_distance[index, ..., 0], cmap='jet')
axes[2].imshow(outer_distance[index, ..., 0], cmap='jet')
axes[3].imshow(masks[index, ...], cmap='jet')
plt.show()
Evaluate results
The deepcell.metrics
package is used to measure advanced metrics for instance segmentation predictions.
[ ]:
outputs = model.predict(X_test)

y_pred = []

for i in range(outputs[0].shape[0]):
    mask = deep_watershed(
        [t[[i]] for t in outputs],
        radius=radius,
        maxima_threshold=maxima_threshold,
        interior_threshold=interior_threshold,
        exclude_border=exclude_border,
        small_objects_threshold=small_objects_threshold,
        min_distance=min_distance)
    y_pred.append(mask[0])

y_pred = np.stack(y_pred, axis=0)
y_pred = np.expand_dims(y_pred, axis=-1)
y_true = y_test.copy()
m = Metrics('DeepWatershed', seg=False)
m.calc_object_stats(y_true, y_pred)
This page was generated from notebooks/training/tracking/Training and Tracking with GNNs.ipynb
This notebook is part of the deepcell-tf documentation: https://deepcell.readthedocs.io/.
Training a cell tracking model
[ ]:
import os
import numpy as np
import tensorflow as tf
from tensorflow.keras.callbacks import CSVLogger
from tensorflow_addons.optimizers import RectifiedAdam
import yaml
import deepcell
from deepcell.data.tracking import Track, random_rotate, random_translate, temporal_slice
from deepcell.losses import weighted_categorical_crossentropy
from deepcell.model_zoo.tracking import GNNTrackingModel
from deepcell.utils.tfrecord_utils import get_tracking_dataset, write_tracking_dataset_to_tfr
from deepcell.utils.train_utils import count_gpus, rate_scheduler
from deepcell_toolbox.metrics import Metrics
from deepcell_tracking import CellTracker
from deepcell_tracking.metrics import benchmark_tracking_performance, calculate_summary_stats
from deepcell_tracking.trk_io import load_trks
from deepcell_tracking.utils import get_max_cells, is_valid_lineage
The DynamicNuclearNet tracking dataset can be downloaded from https://datasets.deepcell.org/
[ ]:
# Please change these file paths to match your file system.
data_dir = '/notebooks/data'
inf_model_path = "NuclearTrackingInf"
ne_model_path = "NuclearTrackingNE"
metrics_path = "train-metrics.yaml"
train_log_path = "train_log.csv"
prediction_dir = 'output'
# Check that prediction directory exists and make if needed
if not os.path.exists(prediction_dir):
    os.makedirs(prediction_dir)
Prepare the data for training
Tracked data are stored as .trks files. These files include images and lineage data in np.arrays. To manipulate .trks files, use deepcell_tracking.trk_io.load_trks and deepcell_tracking.trk_io.save_trks, as shown in the sketch below.
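For example (a short sketch; the keys shown match the test-data cell later in this notebook):

from deepcell_tracking.trk_io import load_trks

data = load_trks(os.path.join(data_dir, 'train.trks'))
X = data['X']                 # raw movies
y = data['y']                 # label masks
lineages = data['lineages']   # one lineage dictionary per movie
print(X.shape, y.shape, len(lineages))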
To facilitate training, we transform each movie's image and lineage data into a Track object. Tracks help to encapsulate all of the feature creation from the movie, including:
- Appearances: (num_frames, num_objects, 32, 32, 1)
- Morphologies: (num_frames, num_objects, 32, 32, 3)
- Centroids: (num_frames, num_objects, 2)
- Normalized Adjacency Matrix: (num_frames, num_objects, num_objects, 3)
- Temporal Adjacency Matrix (comparing across frames): (num_frames - 1, num_objects, num_objects, 3)
Each Track is then saved as a tfrecord file in order to load data from disk during training and reduce the total memory footprint.
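As a rough sketch of what a Track exposes once a .trks file has been loaded (as in the next cell); the attribute names below are assumptions for illustration and may differ between deepcell versions:

# Hypothetical attribute names -- check dir(track) in your installed version.
track = Track(tracked_data=train_trks, appearance_dim=32, distance_threshold=64)
print(track.appearances.shape)            # (num_frames, num_objects, 32, 32, 1)
print(track.centroids.shape)              # (num_frames, num_objects, 2)
print(track.temporal_adj_matrices.shape)  # (num_frames - 1, num_objects, num_objects, 3)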
[ ]:
appearance_dim = 32
distance_threshold = 64
crop_mode = "resize"
[ ]:
# This cell may take ~20 minutes to run
train_trks = load_trks(os.path.join(data_dir, "train.trks"))
val_trks = load_trks(os.path.join(data_dir, "val.trks"))

max_cells = max([get_max_cells(train_trks["y"]), get_max_cells(val_trks["y"])])

for split, trks in zip(["train", "val"], [train_trks, val_trks]):
    print(f"Preparing {split} as tf record")

    with tf.device("/cpu:0"):
        tracks = Track(
            tracked_data=trks,
            appearance_dim=appearance_dim,
            distance_threshold=distance_threshold,
            crop_mode=crop_mode,
        )

        write_tracking_dataset_to_tfr(
            tracks, target_max_cells=max_cells, filename=split
        )
Training
Define training parameters
[ ]:
# Model architecture
n_layers = 1 # Number of graph convolution layers
n_filters = 64
encoder_dim = 64
embedding_dim = 64
graph_layer = "gat"
norm_layer = "batch"
[ ]:
# Data and augmentation
seed = 0
track_length = 8 # Number of frames per track object
rotation_range = 180
translation_range = 512
buffer_size = 128
[ ]:
# Training configuration
batch_size = 8
epochs = 50
steps_per_epoch = 1000
validation_steps = 200
lr = 1e-3
Load TFRecord Data
[ ]:
# Augmentation functions
def sample(X, y):
    return temporal_slice(X, y, track_length=track_length)

def rotate(X, y):
    return random_rotate(X, y, rotation_range=rotation_range)

def translate(X, y):
    return random_translate(X, y, range=translation_range)

with tf.device("/cpu:0"):
    train_data = get_tracking_dataset("train")
    train_data = train_data.shuffle(buffer_size, seed=seed).repeat()
    train_data = train_data.map(sample, num_parallel_calls=tf.data.AUTOTUNE)
    train_data = train_data.map(rotate, num_parallel_calls=tf.data.AUTOTUNE)
    train_data = train_data.map(translate, num_parallel_calls=tf.data.AUTOTUNE)
    train_data = train_data.batch(batch_size).prefetch(tf.data.AUTOTUNE)

    val_data = get_tracking_dataset("val")
    val_data = val_data.shuffle(buffer_size, seed=seed).repeat()
    val_data = val_data.map(sample, num_parallel_calls=tf.data.AUTOTUNE)
    val_data = val_data.batch(batch_size).prefetch(tf.data.AUTOTUNE)

max_cells = list(train_data.take(1))[0][0]["appearances"].shape[2]
Initialize the model
[ ]:
def filter_and_flatten(y_true, y_pred):
    n_classes = tf.shape(y_true)[-1]
    new_shape = [-1, n_classes]
    y_true = tf.reshape(y_true, new_shape)
    y_pred = tf.reshape(y_pred, new_shape)

    # Mask out the padded cells
    y_true_reduced = tf.reduce_sum(y_true, axis=-1)
    good_loc = tf.where(y_true_reduced == 1)[:, 0]

    y_true = tf.gather(y_true, good_loc, axis=0)
    y_pred = tf.gather(y_pred, good_loc, axis=0)
    return y_true, y_pred

class Recall(tf.keras.metrics.Recall):
    def update_state(self, y_true, y_pred, sample_weight=None):
        y_true, y_pred = filter_and_flatten(y_true, y_pred)
        super().update_state(y_true, y_pred, sample_weight)

class Precision(tf.keras.metrics.Precision):
    def update_state(self, y_true, y_pred, sample_weight=None):
        y_true, y_pred = filter_and_flatten(y_true, y_pred)
        super().update_state(y_true, y_pred, sample_weight)

def loss_function(y_true, y_pred):
    y_true, y_pred = filter_and_flatten(y_true, y_pred)
    return weighted_categorical_crossentropy(
        y_true, y_pred, n_classes=tf.shape(y_true)[-1], axis=-1
    )
[ ]:
strategy = tf.distribute.MirroredStrategy()
print(f"Number of devices: {strategy.num_replicas_in_sync}")

with strategy.scope():
    model = GNNTrackingModel(
        max_cells=max_cells,
        graph_layer=graph_layer,
        track_length=track_length,
        n_filters=n_filters,
        embedding_dim=embedding_dim,
        encoder_dim=encoder_dim,
        n_layers=n_layers,
        norm_layer=norm_layer,
    )

    loss = {"temporal_adj_matrices": loss_function}

    optimizer = RectifiedAdam(learning_rate=lr, clipnorm=0.001)

    training_metrics = [
        Recall(class_id=0, name="same_recall"),
        Recall(class_id=1, name="different_recall"),
        Recall(class_id=2, name="daughter_recall"),
        Precision(class_id=0, name="same_precision"),
        Precision(class_id=1, name="different_precision"),
        Precision(class_id=2, name="daughter_precision"),
    ]

    model.training_model.compile(
        loss=loss, optimizer=optimizer, metrics=training_metrics
    )
Train the model
[ ]:
# Clear clutter from previous TensorFlow graphs.
tf.keras.backend.clear_session()
monitor = "val_loss"
csv_logger = CSVLogger(train_log_path)
# Create callbacks for early stopping and pruning.
callbacks = [
    tf.keras.callbacks.LearningRateScheduler(rate_scheduler(lr=lr, decay=0.99)),
    tf.keras.callbacks.ReduceLROnPlateau(
        monitor=monitor,
        factor=0.1,
        patience=5,
        verbose=1,
        mode="auto",
        min_delta=0.0001,
        cooldown=0,
        min_lr=0,
    ),
    csv_logger,
]
print(f"Training on {count_gpus()} GPUs.")
# Train model.
history = model.training_model.fit(
    train_data,
    steps_per_epoch=steps_per_epoch,
    epochs=epochs,
    validation_data=val_data,
    validation_steps=validation_steps,
    callbacks=callbacks,
)
print("Final", monitor, ":", history.history[monitor][-1])
[ ]:
# Save models
model.inference_model.save(inf_model_path, include_optimizer=False, overwrite=True)
model.neighborhood_encoder.save(
    ne_model_path, include_optimizer=False, overwrite=True
)
[ ]:
# Record training metrics
all_metrics = {
    "metrics": {"training": {k: float(v[-1]) for k, v in history.history.items()}}
}

# save a metadata.yaml file in the saved model directory
with open(metrics_path, "w") as f:
    yaml.dump(all_metrics, f)
Evaluate model performance
Set tracking parameters and CellTracker
[ ]:
death = 0.99
birth = 0.99
division = 0.01
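The benchmarking loop below also expects the trained models and an IoU threshold for matching. A minimal sketch, assuming the SavedModel directories written in the training section above (the threshold value here is illustrative):

# Reload the models saved after training
ne_model = tf.keras.models.load_model(ne_model_path)
inf_model = tf.keras.models.load_model(inf_model_path)

# IoU threshold passed to benchmark_tracking_performance (illustrative value)
iou_thresh = 1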
Load test data
[ ]:
test_data = load_trks(os.path.join(data_dir, "test.trks"))
X_test = test_data["X"]
y_test = test_data["y"]
lineages_test = test_data["lineages"]

# Load metadata array
with np.load(os.path.join(data_dir, "data-source.npz"), allow_pickle=True) as data:
    meta = data["test"]
Predict and benchmark
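The loop relies on a helper that returns the indices of frames containing labeled objects. If it is not already available in your environment, a minimal sketch is:

def find_frames_with_objects(y):
    # Frames with at least one non-background label
    return np.array([i for i in range(y.shape[0]) if len(np.unique(y[i])) > 1])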
[ ]:
metrics = {}
exp_metrics = {}
bad_batches = []
for b in range(len(X_test)):
    # currently NOT saving any recall/precision information
    gt_path = os.path.join(prediction_dir, f"{b}-gt.trk")
    res_path = os.path.join(prediction_dir, f"{b}-res.trk")

    # Check that lineage is valid before proceeding
    if not is_valid_lineage(y_test[b], lineages_test[b]):
        bad_batches.append(b)
        continue

    frames = find_frames_with_objects(y_test[b])

    tracker = CellTracker(
        movie=X_test[b][frames],
        annotation=y_test[b][frames],
        track_length=track_length,
        neighborhood_encoder=ne_model,
        tracking_model=inf_model,
        death=death,
        birth=birth,
        division=division,
    )

    try:
        tracker.track_cells()
    except Exception as err:
        print(
            "Failed to track batch {} due to {}: {}".format(
                b, err.__class__.__name__, err
            )
        )
        bad_batches.append(b)
        continue

    tracker.dump(res_path)

    gt = {
        "X": X_test[b][frames],
        "y_tracked": y_test[b][frames],
        "tracks": lineages_test[b],
    }

    tracker.dump(filename=gt_path, track_review_dict=gt)

    results = benchmark_tracking_performance(
        gt_path, res_path, threshold=iou_thresh
    )

    exp = meta[b, 1]  # Grab the experiment column from metadata
    tmp_exp = exp_metrics.get(exp, {})

    for k in results:
        if k in metrics:
            metrics[k] += results[k]
        else:
            metrics[k] = results[k]

        if k in tmp_exp:
            tmp_exp[k] += results[k]
        else:
            tmp_exp[k] = results[k]

    exp_metrics[exp] = tmp_exp
[ ]:
# Calculate summary stats for each set of metrics
tmp_metrics = metrics.copy()
del tmp_metrics["mismatch_division"]
summary = calculate_summary_stats(**tmp_metrics, n_digits=3)
metrics = {**metrics, **summary}

for exp, m in exp_metrics.items():
    tmp_m = m.copy()
    del tmp_m["mismatch_division"]
    summary = calculate_summary_stats(**tmp_m, n_digits=3)
    exp_metrics[exp] = {**m, **summary}

# Add the tracking benchmark results to the training metrics recorded earlier
all_metrics["metrics"]["tracking"] = {"overall": metrics, "experiment": exp_metrics}

# save a metadata.yaml file in the saved model directory
with open(metrics_path, "w") as f:
    yaml.dump(all_metrics, f)
deepcell API
deepcell.applications
Application
- class deepcell.applications.application.Application(model, model_image_shape=(128, 128, 1), model_mpp=0.65, preprocessing_fn=None, postprocessing_fn=None, format_model_output_fn=None, dataset_metadata=None, model_metadata=None)[source]
Bases:
object
Application object that takes a model with weights and manages predictions
- Parameters:
model (tensorflow.keras.Model) – tf.keras.Model with loaded weights.
model_image_shape (tuple) – Shape of input expected by model.
dataset_metadata (str or dict) – Metadata for the data that model was trained on.
model_mpp (float) – Microns per pixel resolution of the training data used for model.
preprocessing_fn (function) – Pre-processing function to apply to data prior to prediction.
postprocessing_fn (function) – Post-processing function to apply to data after prediction. Must accept an input of a list of arrays and then return a single array.
format_model_output_fn (function) – Convert model output from a list of matrices to a dictionary with keys for each semantic head.
- Raises:
ValueError – preprocessing_fn must be a callable function
ValueError – postprocessing_fn must be a callable function
ValueError – model_output_fn must be a callable function
- _batch_predict(tiles, batch_size)[source]
Batch process tiles to generate model predictions.
The built-in keras.predict function has support for batching, but loads the entire image stack into GPU memory, which is prohibitive for large images. This function uses similar code to the underlying model.predict function without soaking up GPU memory.
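Conceptually (a simplified sketch, not the library's actual implementation, and assuming a single-output model), batching looks like:

    import numpy as np

    def batch_predict_sketch(model, tiles, batch_size):
        """Run inference one batch at a time to limit GPU memory use."""
        outputs = []
        for start in range(0, tiles.shape[0], batch_size):
            batch = tiles[start:start + batch_size]
            outputs.append(model(batch, training=False).numpy())
        return np.concatenate(outputs, axis=0)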
- _format_model_output(output_images)[source]
Applies the formatting function to the model output if one was provided. Otherwise, returns the unmodified model output.
- _postprocess(image, **kwargs)[source]
Applies postprocessing function to image if one has been defined. Otherwise returns unmodified image.
- Parameters:
image (numpy.array or list) – Input to the postprocessing function, either a numpy.array or a list of numpy.arrays.
- Returns:
labeled image
- Return type:
numpy.array
- _predict_segmentation(image, batch_size=4, image_mpp=None, pad_mode='constant', preprocess_kwargs={}, postprocess_kwargs={})[source]
Generates a labeled image of the input running prediction with appropriate pre and post processing functions.
Input images are required to have 4 dimensions [batch, x, y, channel]. Additional empty dimensions can be added using np.expand_dims.
- Parameters:
image (numpy.array) – Input image with shape [batch, x, y, channel].
batch_size (int) – Number of images to predict on per batch.
image_mpp (float) – Microns per pixel for image.
pad_mode (str) – The padding mode, one of “constant” or “reflect”.
preprocess_kwargs (dict) – Keyword arguments to pass to the pre-processing function.
postprocess_kwargs (dict) – Keyword arguments to pass to the post-processing function.
- Raises:
ValueError – Input data must match required rank, calculated as one dimension more (batch dimension) than expected by the model.
ValueError – Input data must match required number of channels.
- Returns:
Labeled image
- Return type:
numpy.array
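For reference, a single-channel 2D image can be expanded to the required rank like this:

    import numpy as np

    im2d = np.zeros((512, 512))           # a single 2D image
    im = np.expand_dims(im2d, axis=-1)    # add a channel axis -> (512, 512, 1)
    im = np.expand_dims(im, axis=0)       # add a batch axis   -> (1, 512, 512, 1)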
- _preprocess(image, **kwargs)[source]
Preprocess image if preprocessing_fn is defined. Otherwise return image unmodified.
- Parameters:
image (numpy.array) – 4D stack of images
kwargs (dict) – Keyword arguments for preprocessing_fn.
- Returns:
The pre-processed image.
- Return type:
numpy.array
- _resize_input(image, image_mpp)[source]
Checks if there is a difference between image and model resolution and resizes if they are different. Otherwise returns the unmodified image.
- Parameters:
image (numpy.array) – Input image to resize.
image_mpp (float) – Microns per pixel for the image.
- Returns:
Input image resized if necessary to match model_mpp
- Return type:
numpy.array
- _resize_output(image, original_shape)[source]
Rescales input if the shape does not match the original shape excluding the batch and channel dimensions.
- Parameters:
image (numpy.array) – Image to be rescaled to original shape
original_shape (tuple) – Shape of the original input image
- Returns:
Rescaled image
- Return type:
numpy.array
- _run_model(image, batch_size=4, pad_mode='constant', preprocess_kwargs={})[source]
Run the model to generate output probabilities on the data.
- Parameters:
- Returns:
Model outputs
- Return type:
numpy.array
- _tile_input(image, pad_mode='constant')[source]
Tile the input image to match the shape expected by the model, using the deepcell_toolbox function. Only supports 4D images.
- Parameters:
image (numpy.array) – Input image to tile
pad_mode (str) – The padding mode, one of “constant” or “reflect”.
- Raises:
ValueError – Input images must have only 4 dimensions
- Returns:
Tuple of tiled image and dict of tiling information.
- Return type:
(numpy.array, dict)
CytoplasmSegmentation
- class deepcell.applications.cytoplasm_segmentation.CytoplasmSegmentation(model=None, preprocessing_fn=deepcell_toolbox.processing.histogram_normalization, postprocessing_fn=deepcell_toolbox.deep_watershed.deep_watershed)[source]
Bases:
Application
Loads a deepcell.model_zoo.panopticnet.PanopticNet model for cytoplasm segmentation with pretrained weights.
The predict method handles prep and post processing steps to return a labeled image.
Example:

    import numpy as np
    from skimage.io import imread
    from deepcell.applications import CytoplasmSegmentation

    # Load the image
    im = imread('HeLa_cytoplasm.png')

    # Expand image dimensions to rank 4
    im = np.expand_dims(im, axis=-1)
    im = np.expand_dims(im, axis=0)

    # Create the application
    app = CytoplasmSegmentation()

    # Create the labeled image
    labeled_image = app.predict(im)

- Parameters:
model (tf.keras.Model) – The model to load. If None, a pre-trained model will be downloaded.
- dataset_metadata = {'name': 'general_cyto', 'other': 'Pooled phase and fluorescent cytoplasm data - computationally curated'}
Metadata for the dataset used to train the model
- model_metadata = {'batch_size': 16, 'lr': 0.0001, 'lr_decay': 0.9, 'n_epochs': 8, 'training_seed': 0, 'training_steps_per_epoch': 3949, 'validation_steps_per_epoch': 986}
Metadata for the model and training process
- predict(image, batch_size=4, image_mpp=None, pad_mode='reflect', preprocess_kwargs=None, postprocess_kwargs=None)[source]
Generates a labeled image of the input running prediction with appropriate pre and post processing functions.
Input images are required to have 4 dimensions
[batch, x, y, channel]
.Additional empty dimensions can be added using
np.expand_dims
.- Parameters:
image (numpy.array) – Input image with shape
[batch, x, y, channel]
.batch_size (int) – Number of images to predict on per batch.
image_mpp (float) – Microns per pixel for
image
.pad_mode (str) – The padding mode, one of “constant” or “reflect”.
preprocess_kwargs (dict) – Keyword arguments to pass to the pre-processing function.
postprocess_kwargs (dict) – Keyword arguments to pass to the post-processing function.
- Raises:
ValueError – Input data must match required rank of the application, calculated as one dimension more (batch dimension) than expected by the model.
ValueError – Input data must match required number of channels.
- Returns:
Labeled image
- Return type:
numpy.array
NuclearSegmentation
- class deepcell.applications.nuclear_segmentation.NuclearSegmentation(model=None, preprocessing_fn=deepcell_toolbox.processing.histogram_normalization, postprocessing_fn=deepcell_toolbox.deep_watershed.deep_watershed)[source]
Bases:
Application
Loads a deepcell.model_zoo.panopticnet.PanopticNet model for nuclear segmentation with pretrained weights.
The predict method handles prep and post processing steps to return a labeled image.
Example:

    import numpy as np
    from skimage.io import imread
    from deepcell.applications import NuclearSegmentation

    # Load the image
    im = imread('HeLa_nuclear.png')

    # Expand image dimensions to rank 4
    im = np.expand_dims(im, axis=-1)
    im = np.expand_dims(im, axis=0)

    # Create the application
    app = NuclearSegmentation()

    # Create the labeled image
    labeled_image = app.predict(im)

- Parameters:
model (tf.keras.Model) – The model to load. If None, a pre-trained model will be downloaded.
- dataset_metadata = {'name': 'general_nuclear_train_large', 'other': 'Pooled nuclear data from HEK293, HeLa-S3, NIH-3T3, and RAW264.7 cells.'}
Metadata for the dataset used to train the model
- model_metadata = {'backbone': 'efficientnetv2bl', 'batch_size': 16, 'crop_size': 256, 'epochs': 16, 'location': True, 'lr': 0.0001, 'min_objects': 1, 'pyramid_levels': 'P1-P2-P3-P4-P5-P6-P7', 'zoom_min': 0.75}
Metadata for the model and training process
- predict(image, batch_size=4, image_mpp=None, pad_mode='reflect', preprocess_kwargs=None, postprocess_kwargs=None)[source]
Generates a labeled image of the input running prediction with appropriate pre and post processing functions.
Input images are required to have 4 dimensions
[batch, x, y, channel]
.Additional empty dimensions can be added using
np.expand_dims
.- Parameters:
image (numpy.array) – Input image with shape
[batch, x, y, channel]
.batch_size (int) – Number of images to predict on per batch.
image_mpp (float) – Microns per pixel for
image
.pad_mode (str) – The padding mode, one of “constant” or “reflect”.
preprocess_kwargs (dict) – Keyword arguments to pass to the pre-processing function.
postprocess_kwargs (dict) – Keyword arguments to pass to the post-processing function.
- Raises:
ValueError – Input data must match required rank of the application, calculated as one dimension more (batch dimension) than expected by the model.
ValueError – Input data must match required number of channels.
- Returns:
Labeled image
- Return type:
numpy.array
Mesmer
- class deepcell.applications.mesmer.Mesmer(model=None)[source]
Bases:
Application
Loads a deepcell.model_zoo.panopticnet.PanopticNet model for tissue segmentation with pretrained weights.
The predict method handles prep and post processing steps to return a labeled image.
Example:

    import numpy as np
    from skimage.io import imread
    from deepcell.applications import Mesmer

    # Load the images
    im1 = imread('TNBC_DNA.tiff')
    im2 = imread('TNBC_Membrane.tiff')

    # Combine together and expand to 4D
    im = np.stack((im1, im2), axis=-1)
    im = np.expand_dims(im, 0)

    # Create the application
    app = Mesmer()

    # Create the labeled image
    labeled_image = app.predict(im)

- Parameters:
model (tf.keras.Model) – The model to load. If None, a pre-trained model will be downloaded.
- dataset_metadata = {'name': '20200315_IF_Training_6.npz', 'other': 'Pooled whole-cell data across tissue types'}
Metadata for the dataset used to train the model
- model_metadata = {'batch_size': 1, 'lr': 1e-05, 'lr_decay': 0.99, 'n_epochs': 30, 'training_seed': 0, 'training_steps_per_epoch': 1739, 'validation_steps_per_epoch': 193}
Metadata for the model and training process
- predict(image, batch_size=4, image_mpp=None, preprocess_kwargs={}, compartment='whole-cell', pad_mode='constant', postprocess_kwargs_whole_cell={}, postprocess_kwargs_nuclear={})[source]
Generates a labeled image of the input running prediction with appropriate pre and post processing functions.
Input images are required to have 4 dimensions
[batch, x, y, channel]
. Additional empty dimensions can be added usingnp.expand_dims
.- Parameters:
image (numpy.array) – Input image with shape
[batch, x, y, channel]
.batch_size (int) – Number of images to predict on per batch.
image_mpp (float) – Microns per pixel for
image
.compartment (str) – Specify type of segmentation to predict. Must be one of
"whole-cell"
,"nuclear"
,"both"
.preprocess_kwargs (dict) – Keyword arguments to pass to the pre-processing function.
postprocess_kwargs (dict) – Keyword arguments to pass to the post-processing function.
- Raises:
ValueError – Input data must match required rank of the application, calculated as one dimension more (batch dimension) than expected by the model.
ValueError – Input data must match required number of channels.
- Returns:
Instance segmentation mask.
- Return type:
numpy.array
CellTracking
- class deepcell.applications.cell_tracking.CellTracking(model=None, neighborhood_encoder=None, distance_threshold=64, appearance_dim=32, birth=0.99, death=0.99, division=0.01, track_length=8, embedding_axis=0, crop_mode='resize', norm=True)[source]
Bases:
Application
Loads a
deepcell.model_zoo.tracking.GNNTrackingModel
model for object tracking with pretrained weights using a simplepredict
interface.- Parameters:
model (tf.keras.model) – Tracking inference model, defaults to latest published model
neighborhood_encoder (tf.keras.model) – Tracking neighborhood encoder, defaults to latest published model
distance_threshold (int) – Maximum distance between two cells to be considered adjacent
appearance_dim (int) – Length of appearance dimension
birth (float) – Cost of new cell in linear assignment matrix.
death (float) – Cost of cell death in linear assignment matrix.
division (float) – Cost of cell division in linear assignment matrix.
track_length (int) – Number of frames per track
crop_mode (str) – Type of cropping around each cell
norm (str) – Type of normalization layer
- dataset_metadata = {'name': 'tracked_nuclear_train_large', 'other': 'Pooled tracked nuclear data from HEK293, HeLa-S3, NIH-3T3, and RAW264.7 cells.'}
Metadata for the dataset used to train the model
- model_metadata = {'appearance_dim': 32, 'batch_size': 8, 'buffer_size': 128, 'crop_mode': 'resize', 'data_fraction': 1, 'distance_threshold': 64, 'embedding_dim': 64, 'encoder_dim': 64, 'epochs': 50, 'graph_layer': 'gat', 'lr': 0.001, 'n_filters': 64, 'n_layers': 1, 'norm_layer': 'batch', 'rotation_range': 180, 'steps_per_epoch': 1000, 'translation_range': 512, 'validation_steps': 200}
Metadata for the model and training process
LabelDetectionModel
- deepcell.applications.label_detection.LabelDetectionModel(input_shape=(None, None, 1), inputs=None, backbone='mobilenetv2', num_classes=3)[source]
Classify a microscopy image as Nuclear, Cytoplasm, or Phase.
This can be helpful in determining the type of data (nuclear, cytoplasm, etc.) so that this data can be forwarded to the correct segmentation model.
Based on a standard backbone with an initial ImageNormalization2D and final AveragePooling2D, TensorProduct, and Softmax layers.
- Parameters:
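A construction sketch using the defaults shown in the signature above (illustrative only):

    from deepcell.applications.label_detection import LabelDetectionModel

    # Classify whether input data are nuclear, cytoplasm, or phase images
    model = LabelDetectionModel(
        input_shape=(None, None, 1),
        backbone='mobilenetv2',
        num_classes=3,
    )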
ScaleDetectionModel
- deepcell.applications.scale_detection.ScaleDetectionModel(input_shape=(None, None, 1), inputs=None, backbone='mobilenetv2')[source]
Create a ScaleDetectionModel for detecting scales of input data.
This enables data to be scaled appropriately for other segmentation models which may not be resolution tolerant.
Based on a standard backbone with an initial ImageNormalization2D and final AveragePooling2D and TensorProduct layers.
deepcell.datasets
deepcell.datasets.dynamic_nuclear_net
deepcell.datasets.tissue_net
deepcell.datasets.spot_net
Module contents
deepcell.image_generators
fully_convolutional
sample
scale
tracking
deepcell.layers
Custom Layers
location
Layers to encode location data
- class deepcell.layers.location.Location2D(*args: Any, **kwargs: Any)[source]
Bases:
Layer
Location Layer for 2D cartesian coordinate locations.
- Parameters:
data_format (str) – A string, one of
channels_last
(default) orchannels_first
. The ordering of the dimensions in the inputs.channels_last
corresponds to inputs with shape(batch, height, width, channels)
whilechannels_first
corresponds to inputs with shape(batch, channels, height, width)
.
- class deepcell.layers.location.Location3D(*args: Any, **kwargs: Any)[source]
Bases:
Layer
Location Layer for 3D cartesian coordinate locations.
- Parameters:
data_format (str) – A string, one of
channels_last
(default) orchannels_first
. The ordering of the dimensions in the inputs.channels_last
corresponds to inputs with shape(batch, height, width, channels)
whilechannels_first
corresponds to inputs with shape(batch, channels, height, width)
.
normalization
Layers to normalize input images for 2D and 3D data
- class deepcell.layers.normalization.ImageNormalization2D(*args: Any, **kwargs: Any)[source]
Bases:
Layer
Image Normalization layer for 2D data.
- Parameters:
norm_method (str) – Normalization method to use, one of: “std”, “max”, “whole_image”, None.
filter_size (int) – The length of the convolution window.
data_format (str) – A string, one of
channels_last
(default) orchannels_first
. The ordering of the dimensions in the inputs.channels_last
corresponds to inputs with shape(batch, height, width, channels)
whilechannels_first
corresponds to inputs with shape(batch, channels, height, width)
.activation (function) – Activation function to use. If you don’t specify anything, no activation is applied (ie. “linear” activation:
a(x) = x
).use_bias (bool) – Whether the layer uses a bias.
kernel_initializer (function) – Initializer for the
kernel
weights matrix, used for the linear transformation of the inputs.bias_initializer (function) – Initializer for the bias vector. If None, the default initializer will be used.
kernel_regularizer (function) – Regularizer function applied to the
kernel
weights matrix.bias_regularizer (function) – Regularizer function applied to the bias vector.
activity_regularizer (function) – Regularizer function applied to.
kernel_constraint (function) – Constraint function applied to the
kernel
weights matrix.bias_constraint (function) – Constraint function applied to the bias vector.
- class deepcell.layers.normalization.ImageNormalization3D(*args: Any, **kwargs: Any)[source]
Bases:
Layer
Image Normalization layer for 3D data.
- Parameters:
norm_method (str) – Normalization method to use, one of: “std”, “max”, “whole_image”, None.
filter_size (int) – The length of the convolution window.
data_format (str) – A string, one of
channels_last
(default) orchannels_first
. The ordering of the dimensions in the inputs.channels_last
corresponds to inputs with shape(batch, height, width, channels)
whilechannels_first
corresponds to inputs with shape(batch, channels, height, width)
.activation (function) – Activation function to use. If you don’t specify anything, no activation is applied (ie. “linear” activation:
a(x) = x
).use_bias (bool) – Whether the layer uses a bias.
kernel_initializer (function) – Initializer for the
kernel
weights matrix, used for the linear transformation of the inputs.bias_initializer (function) – Initializer for the bias vector. If None, the default initializer will be used.
kernel_regularizer (function) – Regularizer function applied to the
kernel
weights matrix.bias_regularizer (function) – Regularizer function applied to the bias vector.
activity_regularizer (function) – Regularizer function applied to.
kernel_constraint (function) – Constraint function applied to the
kernel
weights matrix.bias_constraint (function) – Constraint function applied to the bias vector.
padding
Layers for padding for 2D and 3D images
- class deepcell.layers.padding.ReflectionPadding2D(*args: Any, **kwargs: Any)[source]
Bases:
ZeroPadding2D
Reflection-padding layer for 2D input (e.g. picture).
This layer can add rows and columns of reflected values at the top, bottom, left and right side of an image tensor.
- Parameters:
padding (int, tuple) – If int, the same symmetric padding is applied to height and width. If tuple of 2 ints, interpreted as two different symmetric padding values for height and width:
(symmetric_height_pad, symmetric_width_pad)
. If tuple of 2 tuples of 2 ints, interpreted as((top_pad, bottom_pad), (left_pad, right_pad))
.data_format (str) – A string, one of
channels_last
(default) orchannels_first
. The ordering of the dimensions in the inputs.channels_last
corresponds to inputs with shape(batch, height, width, channels)
whilechannels_first
corresponds to inputs with shape(batch, channels, height, width)
.
- class deepcell.layers.padding.ReflectionPadding3D(*args: Any, **kwargs: Any)[source]
Bases:
ZeroPadding3D
Reflection-padding layer for 3D data (spatial or spatio-temporal).
- Parameters:
padding (int, tuple) – The pad-width to add in each dimension. If an int, the same symmetric padding is applied to height and width. If a tuple of 3 ints, interpreted as two different symmetric padding values for height and width:
(symmetric_dim1_pad, symmetric_dim2_pad, symmetric_dim3_pad)
. If tuple of 3 tuples of 2 ints, interpreted as((left_dim1_pad, right_dim1_pad), (left_dim2_pad, right_dim2_pad), (left_dim3_pad, right_dim3_pad))
data_format (str) – A string, one of
channels_last
(default) orchannels_first
. The ordering of the dimensions in the inputs.channels_last
corresponds to inputs with shape(batch, height, width, channels)
whilechannels_first
corresponds to inputs with shape(batch, channels, height, width)
.
pooling
Layers to encode location data
- class deepcell.layers.pooling.DilatedMaxPool2D(*args: Any, **kwargs: Any)[source]
Bases:
Layer
Dilated max pooling layer for 2D inputs (e.g. images).
- Parameters:
pool_size (int) – An integer or tuple/list of 2 integers: (pool_height, pool_width) specifying the size of the pooling window. Can be a single integer to specify the same value for all spatial dimensions.
strides (int) – An integer or tuple/list of 2 integers, specifying the strides of the pooling operation. Can be a single integer to specify the same value for all spatial dimensions.
dilation_rate (int) – An integer or tuple/list of 2 integers, specifying the dilation rate for the pooling.
padding (str) – The padding method, either
"valid"
or"same"
(case-insensitive).data_format (str) – A string, one of
channels_last
(default) orchannels_first
. The ordering of the dimensions in the inputs.channels_last
corresponds to inputs with shape(batch, height, width, channels)
whilechannels_first
corresponds to inputs with shape(batch, channels, height, width)
.
- class deepcell.layers.pooling.DilatedMaxPool3D(*args: Any, **kwargs: Any)[source]
Bases:
Layer
Dilated max pooling layer for 3D inputs.
- Parameters:
pool_size (int) – An integer or tuple/list of 2 integers: (pool_height, pool_width) specifying the size of the pooling window. Can be a single integer to specify the same value for all spatial dimensions.
strides (int) – An integer or tuple/list of 2 integers, specifying the strides of the pooling operation. Can be a single integer to specify the same value for all spatial dimensions.
dilation_rate (int) – An integer or tuple/list of 2 integers, specifying the dilation rate for the pooling.
padding (str) – The padding method, either
"valid"
or"same"
(case-insensitive).data_format (str) – A string, one of
channels_last
(default) orchannels_first
. The ordering of the dimensions in the inputs.channels_last
corresponds to inputs with shape(batch, height, width, channels)
whilechannels_first
corresponds to inputs with shape(batch, channels, height, width)
.
tensor_product
Layers to generate tensor products for 2D and 3D data
- class deepcell.layers.tensor_product.TensorProduct(*args: Any, **kwargs: Any)[source]
Bases:
Layer
Just your regular densely-connected NN layer.
Dense implements the operation: output = activation(dot(input, kernel) + bias), where activation is the element-wise activation function passed as the activation argument, kernel is a weights matrix created by the layer, and bias is a bias vector created by the layer (only applicable if use_bias is True).
Note: if the input to the layer has a rank greater than 2, then it is flattened prior to the initial dot product with kernel.
- Parameters:
output_dim (int) – Positive integer, dimensionality of the output space.
data_format (str) – A string, one of
channels_last
(default) orchannels_first
. The ordering of the dimensions in the inputs.channels_last
corresponds to inputs with shape(batch, height, width, channels)
whilechannels_first
corresponds to inputs with shape(batch, channels, height, width)
.activation (function) – Activation function to use. If you don’t specify anything, no activation is applied (ie. “linear” activation:
a(x) = x
).use_bias (bool) – Whether the layer uses a bias.
kernel_initializer (function) – Initializer for the
kernel
weights matrix, used for the linear transformation of the inputs.bias_initializer (function) – Initializer for the bias vector. If None, the default initializer will be used.
kernel_regularizer (function) – Regularizer function applied to the
kernel
weights matrix.bias_regularizer (function) – Regularizer function applied to the bias vector.
activity_regularizer (function) – Regularizer function applied to.
kernel_constraint (function) – Constraint function applied to the
kernel
weights matrix.bias_constraint (function) – Constraint function applied to the bias vector.
- Input shape:
nD tensor with shape: (batch_size, …, input_dim). The most common situation would be a 2D input with shape (batch_size, input_dim).
- Output shape:
nD tensor with shape: (batch_size, …, output_dim). For instance, for a 2D input with shape (batch_size, input_dim), the output would have shape (batch_size, output_dim).
upsample
Upsampling layers
- class deepcell.layers.upsample.UpsampleLike(*args: Any, **kwargs: Any)[source]
Bases:
Layer
Layer for upsampling a Tensor to be the same shape as another Tensor.
Adapted from https://github.com/fizyr/keras-retinanet.
- Parameters:
data_format (str) – A string, one of
channels_last
(default) orchannels_first
. The ordering of the dimensions in the inputs.channels_last
corresponds to inputs with shape(batch, height, width, channels)
whilechannels_first
corresponds to inputs with shape(batch, channels, height, width)
.
deepcell.losses
deepcell.metrics
deepcell.model_zoo
FeatureNet
PanOpticNet
FPN
Tracking
deepcell.running
deepcell.tracking
deepcell.training
deepcell.utils
Deepcell Utilities Module
backbone_utils
Functions for creating model backbones
- deepcell.utils.backbone_utils.featurenet_3D_backbone(input_tensor=None, input_shape=None, n_filters=32, **kwargs)[source]
Construct the deepcell backbone with five convolutional units
- deepcell.utils.backbone_utils.featurenet_3D_block(x, n_filters)[source]
Add a set of layers that make up one unit of the featurenet 3D backbone
- Parameters:
x (tensorflow.keras.Layer) – Keras layer object to pass to backbone unit
n_filters (int) – Number of filters to use for convolutional layers
- Returns:
Keras layer object
- Return type:
tensorflow.keras.Layer
- deepcell.utils.backbone_utils.featurenet_backbone(input_tensor=None, input_shape=None, n_filters=32, **kwargs)[source]
Construct the deepcell backbone with five convolutional units
- deepcell.utils.backbone_utils.featurenet_block(x, n_filters)[source]
Add a set of layers that make up one unit of the featurenet backbone
- Parameters:
x (tensorflow.keras.Layer) – Keras layer object to pass to backbone unit
n_filters (int) – Number of filters to use for convolutional layers
- Returns:
Keras layer object
- Return type:
tensorflow.keras.Layer
- deepcell.utils.backbone_utils.get_backbone(backbone, input_tensor=None, input_shape=None, use_imagenet=False, return_dict=True, frames_per_batch=1, **kwargs)[source]
Retrieve backbones for the construction of feature pyramid networks.
- Parameters:
backbone (str) – Name of the backbone to be retrieved.
input_tensor (tensor) – The input tensor for the backbone. Should have channel dimension of size 3
use_imagenet (bool) – Load pre-trained weights for the backbone
return_dict (bool) – Whether to return a dictionary of backbone layers, e.g.
{'C1': C1, 'C2': C2, 'C3': C3, 'C4': C4, 'C5': C5}
. If false, the whole model is returned insteadkwargs (dict) – Keyword dictionary for backbone constructions. Relevant keys include
'include_top'
,'weights'
(should beNone
),'input_shape'
, and'pooling'
.
- Returns:
An instantiated backbone
- Return type:
tensorflow.keras.Model
- Raises:
ValueError – bad backbone name
ValueError – featurenet backbone with pre-trained imagenet
data_utils
Functions for making training data
- deepcell.utils.data_utils.get_data(file_name, mode='sample', test_size=0.2, seed=0)[source]
Load data from NPZ file and split into train and test sets
- Parameters:
file_name (str) – path to NPZ file to load
mode (str) – if ‘siamese_daughters’, returns lineage information from .trk file otherwise, returns the same data that was loaded.
test_size (float) – percent of data to leave as testing holdout
seed (int) – seed number for random train/test split repeatability
- Returns:
dict of training data, and a dict of testing data
- Return type:
- deepcell.utils.data_utils.get_max_sample_num_list(y, edge_feature, output_mode='sample', padding='valid', window_size_x=30, window_size_y=30)[source]
For each set of images and each feature, find the maximum number of samples to be used. This will be used to balance class sampling.
- Parameters:
- Returns:
list of maximum sample size for all classes
- Return type:
- deepcell.utils.data_utils.relabel_movie(y)[source]
Relabels unique instance IDs to be from 1 to N
- Parameters:
y (numpy.array) – tensor of integer labels
- Returns:
relabeled tensor with sequential labels
- Return type:
numpy.array
- deepcell.utils.data_utils.reshape_matrix(X, y, reshape_size=256)[source]
Reshape matrix of dimension 4 to have x and y of size reshape_size. Adds overlapping slices to batches. E.g. a reshape_size of 256 yields (1, 1024, 1024, 1) -> (16, 256, 256, 1). The input image is divided into subimages of side length reshape_size, with the last row and column of subimages overlapping the one before last if the original image side lengths are not divisible by reshape_size.
- Parameters:
X (numpy.array) – raw 4D image tensor
y (numpy.array) – label mask of 4D image data
reshape_size (int, list) – size of the output tensor If input is int, output images are square with side length equal reshape_size. If it is a list of 2 ints, then the output images size is reshape_size[0] x reshape_size[1]
- Returns:
reshaped
X
andy
4D tensors inshape[1:3] = (reshape_size, reshape_size)
, ifreshape_size
is anint
, andshape[1:3] = reshape_size
, ifreshape_size
is a list of length 2- Return type:
numpy.array
- Raises:
ValueError – X.ndim is not 4
ValueError – y.ndim is not 4
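A short usage sketch of the example in the docstring above (assuming the function returns the reshaped (X, y) pair):

    import numpy as np
    from deepcell.utils.data_utils import reshape_matrix

    X = np.zeros((1, 1024, 1024, 1))
    y = np.zeros((1, 1024, 1024, 1), dtype=int)
    new_X, new_y = reshape_matrix(X, y, reshape_size=256)
    print(new_X.shape)  # expected: (16, 256, 256, 1)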
- deepcell.utils.data_utils.reshape_movie(X, y, reshape_size=256)[source]
Reshape tensor of dimension 5 to have x and y of size
reshape_size
. Adds overlapping slices to batches. E.g.reshape_size
of 256 yields(1, 5, 1024, 1024, 1) -> (16, 5, 256, 256, 1)
- Parameters:
X (numpy.array) – raw 5D image tensor
y (numpy.array) – label mask of 5D image tensor
reshape_size (int) – size of the square output tensor
- Returns:
reshaped
X
andy
tensors in shape(reshape_size, reshape_size)
- Return type:
numpy.array
- Raises:
ValueError –
X.ndim
is not 5ValueError –
y.ndim
is not 5
- deepcell.utils.data_utils.sample_label_matrix(y, window_size=(30, 30), padding='valid', max_training_examples=10000000.0, data_format=None)[source]
Sample a 4D Tensor, creating many small images of shape window_size.
- Parameters:
- Returns:
4 arrays of coordinates of each sampled pixel
- Return type:
- deepcell.utils.data_utils.sample_label_movie(y, window_size=(30, 30, 5), padding='valid', max_training_examples=10000000.0, data_format=None)[source]
Sample a 5D Tensor, creating many small voxels of shape window_size.
- Parameters:
- Returns:
5 arrays of coordinates of each sampled pixel
- Return type:
- deepcell.utils.data_utils.trim_padding(nparr, win_x, win_y, win_z=None)[source]
Trim the boundaries of the numpy array to allow for a sliding window of size (win_x, win_y) to not slide over regions without pixel data
- Parameters:
- Returns:
trimmed numpy array of size
x - 2 * win_x - 1, y - 2 * win_y - 1
- Return type:
numpy.array
- Raises:
ValueError – nparr.ndim is not 4 or 5
export_utils
Save Keras models as a SavedModel for TensorFlow Serving
- deepcell.utils.export_utils.export_model(keras_model, export_path, model_version=0, weights_path=None, include_optimizer=True, overwrite=True, save_format='tf')[source]
Export a model for use with TensorFlow Serving.
DEPRECATED:
tf.keras.models.save_model
is preferred.- Parameters:
keras_model (tensorflow.keras.Model) – Instantiated Keras model.
export_path (str) – Destination to save the exported model files.
model_version (int) – Integer version of the model.
weights_path (str) – Path to a
.h5
or.tf
weights file.include_optimizer (bool) – Whether to export the optimizer.
overwrite (bool) – Whether to overwrite any existing files in
export_path
.save_format (str) – Saved model format, one of
'tf'
or'h5'
.
- deepcell.utils.export_utils.export_model_to_tflite(model_file, export_path, calibration_images, norm=True, location=True, file_name='model.tflite')[source]
Export a saved keras model to tensorflow-lite with int8 precision.
Deprecated since version 0.12.4: The
export_model_to_tflite
function is deprecated and will be removed in 0.13. Usetf.keras.models.save_model
instead.This export function has only been tested with
PanopticNet
models. For the export to be successful, thePanopticNet
model must havenorm_method
set toNone
,location
set toFalse
, and the upsampling layers must usebilinear
interpolation.- Parameters:
model_file (str) – Path to saved model file
export_path (str) – Directory to save the exported tflite model
calibration_images (numpy.array) – Array of images used for calibration during model quantization
norm (bool) – Whether to normalize calibration images.
location (bool) – Whether to append a location image to calibration images.
file_name (str) – File name for the exported model. Defaults to ‘model.tflite’
io_utils
Utilities for reading/writing files
- deepcell.utils.io_utils.get_image(file_name)[source]
DEPRECATED. Use skimage.io.imread instead.
Read an image from a file and return it as a tensor.
- Parameters:
file_name (str) – path to image file
- Returns:
numpy array of image data
- Return type:
numpy.array
misc_utils
Miscellaneous utility functions
plot_utils
Utilities plotting data
- deepcell.utils.plot_utils.cf(x_coord, y_coord, sample_image)[source]
Format x and y coordinates for printing
- deepcell.utils.plot_utils.create_rgb_image(input_data, channel_colors)[source]
Takes a stack of 1- or 2-channel data and converts it to an RGB image
- Parameters:
input_data – 4D stack of images to be converted to RGB
channel_colors – list specifying the color for each channel
- Returns:
transformed version of input data into RGB version
- Return type:
numpy.array
- Raises:
ValueError – if
len(channel_colors)
is not equal to number of channelsValueError – if invalid
channel_colors
providedValueError – if input_data is not 4D, with 1 or 2 channels
- deepcell.utils.plot_utils.get_js_video(images, batch=0, channel=0, cmap='jet', vmin=0, vmax=0, interval=200, repeat_delay=1000)[source]
Create a JavaScript video as HTML for visualizing 3D data as a movie.
- Parameters:
- Returns:
JS HTML to display video
- Return type:
- deepcell.utils.plot_utils.make_outline_overlay(rgb_data, predictions)[source]
Overlay a segmentation mask with image data for easy visualization
- Parameters:
rgb_data – 3 channel array of images, output of create_rgb_image
predictions – segmentation predictions to be visualized
- Returns:
overlay image of input data and predictions
- Return type:
numpy.array
- Raises:
ValueError – If predictions are not 4D
ValueError – If there is not matching RGB data for each prediction
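A minimal sketch combining create_rgb_image and make_outline_overlay (the color name is an assumption; both inputs must be 4D):

    import numpy as np
    from deepcell.utils.plot_utils import create_rgb_image, make_outline_overlay

    X = np.random.rand(1, 256, 256, 1)               # 4D single-channel image stack
    y_pred = np.zeros((1, 256, 256, 1), dtype=int)   # 4D segmentation predictions
    rgb = create_rgb_image(X, channel_colors=['green'])
    overlay = make_outline_overlay(rgb_data=rgb, predictions=y_pred)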
- deepcell.utils.plot_utils.plot_error(loss_hist_file, saved_direc, plot_name)[source]
Plot the training and validation error from the npz file
tracking_utils
Utilities for tracking cells
train_utils
Utilities for training neural nets
- deepcell.utils.train_utils.count_gpus()[source]
Get the number of available GPUs.
- Returns:
count of GPUs as integer
- Return type:
- deepcell.utils.train_utils.get_callbacks(model_path, save_weights_only=False, lr_sched=None, tensorboard_log_dir=None, reduce_lr_on_plateau=False, monitor='val_loss', verbose=1)[source]
Returns a list of callbacks used for training
- Parameters:
model_path – (str) path for the
h5
model file.save_weights_only – (bool) if True, then only the model’s weights will be saved.
lr_sched (function) – learning rate scheduler per epoch. from
rate_scheduler
.tensorboard_log_dir (str) – log directory for tensorboard.
monitor (str) – quantity to monitor.
verbose (int) – verbosity mode, 0 or 1.
- Returns:
a list of callbacks to be passed to
model.fit()
- Return type:
transform_utils
Utilities for data transformations
- deepcell.utils.transform_utils.inner_distance_transform_2d(mask, bins=None, erosion_width=None, alpha=0.1, beta=1)[source]
Transform a label mask with an inner distance transform.
inner_distance = 1 / (1 + beta * alpha * distance_to_center)
- Parameters:
mask (numpy.array) – A label mask (
y
data).bins (int) – The number of transformed distance classes.
erosion_width (int) – number of pixels to erode edges of each labels
alpha (float, str) – coefficient to reduce the magnitude of the distance value. If “auto”, determines alpha for each cell based on the cell area.
beta (float) – scale parameter that is used when alpha is “auto”.
- Returns:
a mask of same shape as input mask, with each label being a distance class from 1 to
bins
.- Return type:
numpy.array
- Raises:
ValueError – alpha is a string but not set to “auto”.
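A short usage sketch of the formula above (assuming a plain 2D label mask is accepted):

    import numpy as np
    from deepcell.utils.transform_utils import inner_distance_transform_2d

    mask = np.zeros((64, 64), dtype=int)
    mask[20:30, 20:30] = 1   # a single square "cell"
    inner = inner_distance_transform_2d(mask, alpha=0.1, beta=1)
    # values follow 1 / (1 + beta * alpha * distance_to_center) and peak at the cell center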
- deepcell.utils.transform_utils.inner_distance_transform_3d(mask, bins=None, erosion_width=None, alpha=0.1, beta=1, sampling=[0.5, 0.217, 0.217])[source]
Transform a label mask for a z-stack with an inner distance transform.
inner_distance = 1 / (1 + beta * alpha * distance_to_center)
- Parameters:
mask (numpy.array) – A label mask (
y
data).bins (int) – The number of transformed distance classes.
erosion_width (int) – Number of pixels to erode edges of each labels
alpha (float, str) – Coefficient to reduce the magnitude of the distance value. If
'auto'
, determines alpha for each cell based on the cell area.beta (float) – Scale parameter that is used when
alpha
is “auto”.sampling (list) – Spacing of pixels along each dimension.
- Returns:
A mask of same shape as input mask, with each label being a distance class from 1 to
bins
.- Return type:
numpy.array
- Raises:
ValueError –
alpha
is a string but not set to “auto”.
- deepcell.utils.transform_utils.inner_distance_transform_movie(mask, bins=None, erosion_width=None, alpha=0.1, beta=1)[source]
Transform a label mask with an inner distance transform. Applies the 2D transform to each frame.
- Parameters:
mask (numpy.array) – A label mask (y data).
bins (int) – The number of transformed distance classes.
erosion_width (int) – Number of pixels to erode edges of each label.
alpha (float, str) – Coefficient to reduce the magnitude of the distance value. If “auto”, determines alpha for each cell based on the cell area.
beta (float) – Scale parameter that is used when alpha is “auto”.
- Returns:
A mask of same shape as the input mask, with each label being a distance class from 1 to bins.
- Return type:
numpy.array
- Raises:
ValueError – alpha is a string but not set to “auto”.
- deepcell.utils.transform_utils.outer_distance_transform_2d(mask, bins=None, erosion_width=None, normalize=True)[source]
Transform a label mask with an outer distance transform.
- Parameters:
mask (numpy.array) – A label mask (y data).
bins (int) – The number of transformed distance classes. If None, returns the continuous outer transform.
erosion_width (int) – Number of pixels to erode edges of each label.
normalize (bool) – Normalize the transform of each cell by that cell’s largest distance.
- Returns:
A mask of same shape as the input mask, with each label being a distance class from 1 to bins.
- Return type:
numpy.array
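For example (a hedged sketch; label_mask is again a hypothetical 2D label image):
from deepcell.utils.transform_utils import outer_distance_transform_2d
# bins=None keeps the continuous outer transform, normalized per cell
outer_distance = outer_distance_transform_2d(label_mask, bins=None, erosion_width=1, normalize=True)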
- deepcell.utils.transform_utils.outer_distance_transform_3d(mask, bins=None, erosion_width=None, normalize=True, sampling=[0.5, 0.217, 0.217])[source]
Transforms a label mask for a z-stack with an outer distance transform. Uses scipy’s distance_transform_edt.
- Parameters:
mask (numpy.array) – A z-stack of label masks (y data).
bins (int) – The number of transformed distance classes.
erosion_width (int) – Number of pixels to erode edges of each label.
normalize (bool) – Normalize the transform of each cell by that cell’s largest distance.
sampling (list) – Spacing of pixels along each dimension.
- Returns:
3D Euclidean distance transform
- Return type:
numpy.array
- deepcell.utils.transform_utils.outer_distance_transform_movie(mask, bins=None, erosion_width=None, normalize=True)[source]
Transform a label mask for a movie with an outer distance transform. Applies the 2D transform to each frame.
- Parameters:
mask (numpy.array) – A label mask (y data).
bins (int) – The number of transformed distance classes. If None, returns the continuous outer transform.
erosion_width (int) – Number of pixels to erode edges of each label.
normalize (bool) – Normalize the transform of each cell by that cell’s largest distance.
- Returns:
A mask of same shape as the input mask, with each label being a distance class from 1 to bins.
- Return type:
numpy.array
- deepcell.utils.transform_utils.pixelwise_transform(mask, dilation_radius=None, data_format=None, separate_edge_classes=False)[source]
Transforms a label mask for a z-stack into edge, interior, and background classes.
- Parameters:
mask (numpy.array) – tensor of labels
dilation_radius (int) – width to enlarge the edge feature of each instance
data_format (str) – A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while channels_first corresponds to inputs with shape (batch, channels, height, width).
separate_edge_classes (bool) – Whether to separate the cell edge class into 2 distinct cell-cell edge and cell-background edge classes.
- Returns:
An array with the same shape as mask, except the channel axis will be a one-hot encoded semantic segmentation for 3 main features: [cell_edge, cell_interior, background]. If separate_edge_classes is True, the cell_edge feature is split into 2 features and the resulting channels are: [bg_cell_edge, cell_cell_edge, cell_interior, background].
- Return type:
numpy.array
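A hedged usage sketch (the label tensor y below is hypothetical, with a channels_last layout):
from deepcell.utils.transform_utils import pixelwise_transform
# one-hot output channels: [cell_edge, cell_interior, background]
pixelwise = pixelwise_transform(y, dilation_radius=1, separate_edge_classes=False)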
DeepCell API Key
DeepCell models and training datasets are licensed under a modified Apache license for non-commercial academic use only. An API key for accessing datasets and models can be obtained at https://users.deepcell.org/login/.
For more information about datasets published through DeepCell, please see Deepcell Datasets.
API Key Usage
The token that is issued by users.deepcell.org should be added as an environment variable through one of the following methods:
Save the token in your shell config script (e.g.
.bashrc
,.zshrc
,.bash_profile
, etc.)
export DEEPCELL_ACCESS_TOKEN=<token-from-users.deepcell.org>
Save the token as an environment variable during a Python session. Please be careful to avoid committing your token to any public repositories.
import os
os.environ.update({"DEEPCELL_ACCESS_TOKEN": "<token-from-users.deepcell.org>"})
Deepcell Datasets
Note
Go to the end to download the full example code
SpotNet

SpotNet is a training dataset for a deep learning model for spot detection published in Laubscher et al. 2023.
This dataset is licensed under a modified Apache license for non-commercial academic use only.
The dataset can be accessed using deepcell.datasets
with a DeepCell API key.
For more information about using a DeepCell API key, please see DeepCell API Key.
Each batch of the dataset contains two components:
X: raw images of fluorescent spots
y: coordinate annotations for spot locations
from deepcell.datasets import SpotNet
spotnet = SpotNet(version='1.0')
X_val, y_val = spotnet.load_data(split='val')
Note
Go to the end to download the full example code
TissueNet

TissueNet is a training dataset for nuclear and whole cell segmentation in tissues published in Greenwald, Miller et al. 2022.
The TissueNet dataset is composed of a train, val, and test split.
The train split is composed of approximately 2600 images, each of which is 512x512 pixels. During training, we select random crops of size 256x256 from each image as a form of data augmentation.
The val split is composed of approximately 300 images, each of which is originally of size 512x512. However, because we do not perform any augmentation on the validation dataset during training, we reshape these 512x512 images into 256x256 images so that no cropping is needed in order to pass them through the model. Finally, we make two copies of the val set at different image resolutions and concatenate them all together, resulting in a total of approximately 3000 images of size 256x256.
The test split is composed of approximately 300 images, each of which is originally of size 512x512. However, because the model was trained on images that are size 256x256, we reshape these 512x512 images into 256x256 images, resulting in approximately 1200 images.
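For intuition, the 512x512 to 256x256 reshaping amounts to tiling each image into four non-overlapping quadrants; a hedged numpy sketch (not the exact preprocessing code used to build the splits) is:
import numpy as np
# image: hypothetical (512, 512, channels) array
tiles = np.stack([
    image[r:r + 256, c:c + 256]
    for r in (0, 256)
    for c in (0, 256)
])  # shape (4, 256, 256, channels)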
Change Log
TissueNet 1.0 (July 2021): The original dataset used for all experiments in Greenwald, Miller et al.
TissueNet 1.1 (April 2022): This version of TissueNet has gone through an additional round of manual QC to ensure consistency in labeling across the entire dataset.
This dataset is licensed under a modified Apache license for non-commercial academic use only.
The dataset can be accessed using deepcell.datasets
with a DeepCell API key.
For more information about using a DeepCell API key, please see DeepCell API Key
from deepcell.datasets import TissueNet
tissuenet = TissueNet(version='1.1')
X_val, y_val, meta_val = tissuenet.load_data(split='val')
Note
Go to the end to download the full example code
DynamicNuclearNet

DynamicNuclearNet is a training dataset for nuclear segmentation and tracking published in Schwartz et al. 2023. The dataset is made up of two subsets, one for tracking and one for segmentation.
This dataset is licensed under a modified Apache license for non-commercial academic use only
The dataset can be accessed using deepcell.datasets
with a DeepCell API key.
For more information about using a DeepCell API key, please see DeepCell API Key
Tracking
Each batch of the dataset contains three components:
X: raw fluorescent nuclear data
y: nuclear segmentation masks
lineages: lineage records including the cell id, frames present and division links from parent to daughter cells
from deepcell.datasets import DynamicNuclearNetTracking
dnn_trk = DynamicNuclearNetTracking(version='1.0')
X_val, y_val, lineage_val = dnn_trk.load_data(split='val')
data_source = dnn_trk.load_source_metadata()
Segmentation
Each batch of the dataset includes three components:
X: raw fluorescent nuclear data
y: nuclear segmentation masks
metadata: description of the source of each batch
from deepcell.datasets import DynamicNuclearNetSegmentation
dnn_seg = DynamicNuclearNetSegmentation(version='1.0')
X_val, y_val, meta_val = dnn_seg.load_data(split='val')
Deepcell Applications
Note
Go to the end to download the full example code
Caliban: Nuclear Segmentation and Tracking
Caliban is a pipeline for nuclear segmentation and tracking in live cell imaging datasets.
The models associated with Caliban can be accessed using deepcell.applications
with a DeepCell API key.
For more information about using a DeepCell API key, please see DeepCell API Key.
import copy
import imageio
import matplotlib as mpl
from matplotlib.colors import ListedColormap
import matplotlib.pyplot as plt
import numpy as np
from deepcell.applications import NuclearSegmentation, CellTracking
from deepcell.datasets import DynamicNuclearNetSample
def shuffle_colors(ymax, cmap):
    """Utility function to generate a colormap for a labeled image"""
    cmap = mpl.colormaps[cmap].resampled(ymax)
    nmap = cmap(range(ymax))
    np.random.shuffle(nmap)
    cmap = ListedColormap(nmap)
    cmap.set_bad('black')
    return cmap
Prepare nuclear data
x, y, _ = DynamicNuclearNetSample().load_data()
def plot(im):
    fig, ax = plt.subplots(figsize=(6, 6))
    ax.imshow(im, 'Greys_r', vmax=3000)
    plt.axis('off')
    plt.title('Raw Image Data')
    fig.canvas.draw()  # draw the canvas, cache the renderer
    image = np.frombuffer(fig.canvas.tostring_rgb(), dtype='uint8')
    image = image.reshape(fig.canvas.get_width_height()[::-1] + (3,))
    plt.close(fig)
    return image
imageio.mimsave('caliban-raw.gif', [plot(x[i, ..., 0]) for i in range(x.shape[0])])
View .GIF of raw cells

Nuclear Segmentation
Initialize nuclear model
The application will download pretrained weights for nuclear segmentation. For more information about application objects, please see our documentation.
app = NuclearSegmentation()
Use the application to generate labeled images
Typically, neural networks perform best on test data that is similar to the training data.
In the realm of biological imaging, the most common difference between datasets is the resolution
of the data measured in microns per pixel. The training resolution of the model can be identified
using app.model_mpp
.
print('Training Resolution:', app.model_mpp, 'microns per pixel')
The resolution of the input data can be specified in app.predict
using the image_mpp
option.
The Application
will rescale the input data to match the training resolution and then rescale
to the original size before returning the labeled image.
y_pred = app.predict(x, image_mpp=0.65)
print(y_pred.shape)
Save labeled images as a gif to visualize
ymax = np.max(y_pred)
cmap = shuffle_colors(ymax, 'tab20')
def plot(x, y):
    yy = copy.deepcopy(y)
    yy = np.ma.masked_equal(yy, 0)
    fig, ax = plt.subplots(1, 2, figsize=(12, 6))
    ax[0].imshow(x, cmap='Greys_r', vmax=3000)
    ax[0].axis('off')
    ax[0].set_title('Raw')
    ax[1].imshow(yy, cmap=cmap, vmax=ymax)
    ax[1].set_title('Segmented')
    ax[1].axis('off')
    fig.canvas.draw()  # draw the canvas, cache the renderer
    image = np.frombuffer(fig.canvas.tostring_rgb(), dtype='uint8')
    image = image.reshape(fig.canvas.get_width_height()[::-1] + (3,))
    plt.close(fig)
    return image
imageio.mimsave(
'./caliban-labeled.gif',
[plot(x[i,...,0], y_pred[i,...,0])
for i in range(y_pred.shape[0])]
)
View .GIF of segmented cells
The NuclearSegmentation
application was able to create a label mask for every cell in every
frame!

Cell Tracking
The NuclearSegmentation
worked well, but the cell labels of the same cell are not preserved
across frames. To resolve this problem, we can use the CellTracker
! This object will use
another CellTrackingModel
to compare all cells and determine which cells are the same across
frames, as well as if a cell split into daughter cells.
Initialize CellTracking application
Create an instance of deepcell.applications.CellTracking
.
tracker = CellTracking()
Track the cells
tracked_data = tracker.track(x, y_pred)
y_tracked = tracked_data['y_tracked']
Visualize tracking results
ymax = np.max(y_tracked)
cmap = shuffle_colors(ymax, 'tab20')
def plot(x, y):
    yy = copy.deepcopy(y)
    yy = np.ma.masked_equal(yy, 0)
    fig, ax = plt.subplots(1, 2, figsize=(12, 6))
    ax[0].imshow(x, cmap='Greys_r', vmax=3000)
    ax[0].axis('off')
    ax[0].set_title('Raw')
    ax[1].imshow(yy, cmap=cmap, vmax=ymax)
    ax[1].set_title('Tracked')
    ax[1].axis('off')
    fig.canvas.draw()  # draw the canvas, cache the renderer
    image = np.frombuffer(fig.canvas.tostring_rgb(), dtype='uint8')
    image = image.reshape(fig.canvas.get_width_height()[::-1] + (3,))
    plt.close(fig)
    return image
imageio.mimsave(
'./caliban-tracks.gif',
[plot(x[i,...,0], y_tracked[i,...,0])
for i in range(y_tracked.shape[0])]
)
View .GIF of tracked cells
Now that we’ve finished using CellTracker.track_cells
, not only do the annotations preserve
labels across frames, but the lineage information has been saved in CellTracker.tracks
.

Note
Go to the end to download the full example code
Mesmer: Tissue Segmentation
Mesmer can be accessed using deepcell.applications
with a DeepCell API key.
For more information about using a DeepCell API key, please see DeepCell API Key.
from matplotlib import pyplot as plt
from deepcell.datasets import TissueNetSample
from deepcell.utils.plot_utils import create_rgb_image, make_outline_overlay
# Download multiplex data
X, y, _ = TissueNetSample().load_data()
create rgb overlay of image data for visualization
rgb_images = create_rgb_image(X, channel_colors=['green', 'blue'])
plot the data
fig, ax = plt.subplots(1, 3, figsize=(15, 5))
ax[0].imshow(X[0, ..., 0], cmap='Greys_r')
ax[1].imshow(X[0, ..., 1], cmap='Greys_r')
ax[2].imshow(rgb_images[0, ...])
ax[0].set_title('Nuclear channel')
ax[1].set_title('Membrane channel')
ax[2].set_title('Overlay')
for a in ax:
    a.axis('off')
plt.show()
fig.savefig('mesmer-input.png')

The application will download pretrained weights for tissue segmentation. For more information about application objects, please see our documentation.
from deepcell.applications import Mesmer
app = Mesmer()
Whole Cell Segmentation
Typically, neural networks perform best on test data that is similar to the training data.
In the realm of biological imaging, the most common difference between datasets is the resolution
of the data measured in microns per pixel. The training resolution of the model can be identified
using app.model_mpp
.
print('Training Resolution:', app.model_mpp, 'microns per pixel')
The resolution of the input data can be specified in app.predict
using the image_mpp
option.
The Application
will rescale the input data to match the training resolution and then rescale
to the original size before returning the labeled image.
segmentation_predictions = app.predict(X, image_mpp=0.5)
create overlay of predictions
overlay_data = make_outline_overlay(rgb_data=rgb_images, predictions=segmentation_predictions)
select index for displaying
idx = 0
# plot the data
fig, ax = plt.subplots(1, 2, figsize=(10, 5))
ax[0].imshow(rgb_images[idx, ...])
ax[1].imshow(overlay_data[idx, ...])
ax[0].set_title('Raw data')
ax[1].set_title('Predictions')
for a in ax:
    a.axis('off')
plt.show()
fig.savefig('mesmer-wc.png')

Nuclear Segmentation
In addition to predicting whole-cell segmentation, Mesmer can also be used for nuclear predictions.
segmentation_predictions_nuc = app.predict(X, image_mpp=0.5, compartment='nuclear')
overlay_data_nuc = make_outline_overlay(
rgb_data=rgb_images,
predictions=segmentation_predictions_nuc)
select index for displaying
idx = 0
# plot the data
fig, ax = plt.subplots(1, 2, figsize=(10, 5))
ax[0].imshow(rgb_images[idx, ...])
ax[1].imshow(overlay_data_nuc[idx, ...])
ax[0].set_title('Raw data')
ax[1].set_title('Nuclear Predictions')
for a in ax:
    a.axis('off')
plt.show()
fig.savefig('mesmer-nuc.png')

Fine-tuning the model output
In most cases, we find that the default settings for the model work quite well across a range of tissues. However, if you notice specific, consistent errors in your data, there are a few things you can change.
The first is the interior_threshold
parameter. This controls how conservative the model is in
estimating what is a cell vs what is background. Lower values of interior_threshold
will
result in larger cells, whereas higher values will result in smaller cells.
The second is the maxima_threshold
parameter. This controls what the model considers a unique
cell. Lower values will result in more separate cells being predicted, whereas higher values
will result in fewer cells.
To demonstrate the effect of interior_threshold
, we’ll compare the default with a much more
stringent setting
segmentation_predictions_interior = app.predict(
X,
image_mpp=0.5,
postprocess_kwargs_whole_cell={'interior_threshold': 0.5})
overlay_data_interior = make_outline_overlay(
rgb_data=rgb_images,
predictions=segmentation_predictions_interior)
select index for displaying
idx = 0
# plot the data
fig, ax = plt.subplots(1, 2, figsize=(10, 5))
ax[0].imshow(overlay_data[idx, ...])
ax[1].imshow(overlay_data_interior[idx, ...])
ax[0].set_title('Default settings')
ax[1].set_title('More restrictive interior threshold')
for a in ax:
    a.axis('off')
plt.show()
fig.savefig('mesmer-interior-threshold.png')

To demonstrate the effect of maxima_threshold
, we’ll compare the default with a much more
stringent setting
segmentation_predictions_maxima = app.predict(
X,
image_mpp=0.5,
postprocess_kwargs_whole_cell={'maxima_threshold': 0.8})
overlay_data_maxima = make_outline_overlay(
rgb_data=rgb_images,
predictions=segmentation_predictions_maxima)
select index for displaying
idx = 0
# plot the data
fig, ax = plt.subplots(1, 2, figsize=(10, 5))
ax[0].imshow(overlay_data[idx, ...])
ax[1].imshow(overlay_data_maxima[idx, ...])
ax[0].set_title('Default settings')
ax[1].set_title('More stringent maxima threshold')
for a in ax:
    a.axis('off')
plt.show()
fig.savefig('mesmer-maxima-threshold.png')

Finally, if your data doesn’t include a strong membrane marker, the model will default to just
predicting the nuclear segmentation, even for whole-cell mode. If you’d like to add a manual
pixel expansion after segmentation, you can do that using the pixel_expansion
argument. This
will universally apply an expansion after segmentation to each cell.
To demonstrate the effect of pixel_expansion
, we’ll compare the nuclear output
with expanded output
segmentation_predictions_expansion = app.predict(
X,
image_mpp=0.5,
compartment='nuclear',
postprocess_kwargs_nuclear={'pixel_expansion': 5}
)
overlay_data_expansion = make_outline_overlay(
rgb_data=rgb_images,
predictions=segmentation_predictions_expansion
)
select index for displaying
idx = 0
# plot the data
fig, ax = plt.subplots(1, 2, figsize=(10, 5))
ax[0].imshow(overlay_data_nuc[idx, ...])
ax[1].imshow(overlay_data_expansion[idx, ...])
ax[0].set_title('Default nuclear segmentation')
ax[1].set_title('Nuclear segmentation with an expansion')
for a in ax:
    a.axis('off')
plt.show()
fig.savefig('mesmer-nuc-expansion.png')

There’s a separate dictionary passed to the model that controls the post-processing for whole-cell and nuclear predictions. You can modify them independently to fine-tune the output. The current defaults the model is using can be found here.
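As a hedged sketch, both dictionaries can be supplied in a single call (the threshold values below are illustrative only, and compartment='both' is used here to return whole-cell and nuclear masks together):
segmentation_predictions_both = app.predict(
    X,
    image_mpp=0.5,
    compartment='both',
    postprocess_kwargs_whole_cell={'interior_threshold': 0.3, 'maxima_threshold': 0.1},
    postprocess_kwargs_nuclear={'maxima_threshold': 0.1, 'pixel_expansion': 5},
)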

deepcell-tf
is a deep learning library for single-cell analysis of biological images. It is written in Python and built using TensorFlow 2.
This library allows users to apply pre-existing models to imaging data as well as to develop new deep learning models for single-cell analysis. This library specializes in models for cell segmentation (whole-cell and nuclear) in 2D and 3D images as well as cell tracking in 2D time-lapse datasets. These models are applicable to data ranging from multiplexed images of tissues to dynamic live-cell imaging movies.
deepcell-tf
is one of several resources created by the Van Valen lab to facilitate the development and application of new deep learning methods to biology. Other projects within our DeepCell ecosystem include the DeepCell Toolbox for pre and post-processing the outputs of deep learning models, DeepCell Tracking for creating cell lineages with deep-learning-based tracking models, and the DeepCell Kiosk for deploying workflows on large datasets in the cloud. Additionally, we have developed DeepCell Label for annotating high-dimensional biological images to use as training data.
Read the documentation at deepcell.readthedocs.io.
For more information on deploying models in the cloud refer to the Kiosk documentation.
Examples
Raw Image | Tracked Image
Getting Started
Install with pip
The fastest way to get started with deepcell-tf
is to install the package with pip
:
pip install deepcell
Install with Docker
There are also docker containers with GPU support available on DockerHub. To run the library locally on a GPU, make sure you have CUDA and Docker v19.03 or later installed. For prior Docker versions, use nvidia-docker. Alternatively, Google Cloud Platform (GCP) offers prebuilt virtual machines preinstalled with CUDA, Docker, and the NVIDIA Container Toolkit.
Once docker
is installed, run the following command:
# Start a GPU-enabled container on one GPU
docker run --gpus '"device=0"' -it --rm \
-p 8888:8888 \
-v $PWD/notebooks:/notebooks \
-v $PWD/data:/data \
vanvalenlab/deepcell-tf:latest-gpu
This will start a Docker container with deepcell-tf
installed and start a Jupyter session using the default port 8888. This command also mounts a data folder ($PWD/data
) and a notebooks folder ($PWD/notebooks
) to the docker container so it can access data and Jupyter notebooks stored on the host workstation. Data and models must be saved in these mounted directories to persist them outside of the running container. The default port can be changed to any non-reserved port by updating -p 8888:8888
to, e.g., -p 8080:8888
. If you run across any errors getting started, you should either refer to the deepcell-tf
for developers section or raise an issue on GitHub.
For examples of how to train models with the deepcell-tf
library, check out the following notebooks:
DeepCell Applications and DeepCell Datasets
deepcell-tf
contains two modules that greatly simplify the development and usage of deep learning models for single cell analysis. The first is deepcell.datasets, a collection of biological images that have single-cell annotations. These data include live-cell imaging movies of fluorescent nuclei (approximately 10,000 single-cell trajectories over 30 frames), as well as static images of whole cells (both phase and fluorescence images - approximately 75,000 single cell annotations). The second is deepcell.applications, which contains pre-trained models (fluorescent nuclear and phase/fluorescent whole cell) for single-cell analysis. Provided data is scaled so that the physical size of each pixel matches that in the training dataset, these models can be used out of the box on live-cell imaging data. We are currently working to expand these modules to include data and models for tissue images. Please note that they may be spun off into their own GitHub repositories in the near future.
DeepCell-tf for Developers
deepcell-tf
uses docker
and tensorflow
to enable GPU processing. If using GCP, there are pre-built images which come with CUDA and Docker pre-installed. Otherwise, you will need to install docker and CUDA separately.
Build a local docker container, specifying the tensorflow version with TF_VERSION
git clone https://github.com/vanvalenlab/deepcell-tf.git
cd deepcell-tf
docker build --build-arg TF_VERSION=2.8.0-gpu -t $USER/deepcell-tf .
Run the new docker image
# '"device=0"' refers to the specific GPU(s) to run DeepCell-tf on, and is not required
docker run --gpus '"device=0"' -it \
-p 8888:8888 \
$USER/deepcell-tf:latest-gpu
It can also be helpful to mount the local copy of the repository and the notebooks to speed up local development. However, if you are going to mount a local version of the repository, you must first run the docker image without the local repository mounted so that the C extensions can be compiled and then copied over to your local version.
# First run the docker image without mounting externally
docker run --gpus '"device=0"' -it \
-p 8888:8888 \
$USER/deepcell-tf:latest-gpu
# Use ctrl-p, ctrl-q (or ctrl+p+q) to exit the running docker image without shutting it down
# Then, get the container_id corresponding to the running image of DeepCell-tf
container_id=$(docker ps -q --filter ancestor="$USER/deepcell-tf")
# Copy the compiled c extensions into your local version of the codebase:
docker cp "$container_id:/usr/local/lib/python3.6/dist-packages/deepcell/utils/compute_overlap.cpython-36m-x86_64-linux-gnu.so" deepcell/utils/compute_overlap.cpython-36m-x86_64-linux-gnu.so
# close the running docker
docker kill $container_id
# you can now start the docker image with the code mounted for easy editing
docker run --gpus '"device=0"' -it \
-p 8888:8888 \
-v $PWD/deepcell:/usr/local/lib/python3.6/dist-packages/deepcell/ \
-v $PWD/notebooks:/notebooks \
-v $PWD:/data \
$USER/deepcell-tf:latest-gpu
How to Cite
Copyright
Copyright © 2016-2023 The Van Valen Lab at the California Institute of Technology (Caltech), with support from the Shurl and Kay Curci Foundation, Google Research Cloud, the Paul Allen Family Foundation, & National Institutes of Health (NIH) under Grant U24CA224309-01. All rights reserved.
License
This software is licensed under a modified APACHE2. See LICENSE for full details.
Trademarks
All other trademarks referenced herein are the property of their respective owners.