High-dimensional cytometry is a powerful technique for deciphering the immunopathological factors common to multiple individuals. However, rational comparisons of multiple batches of experiments performed on different occasions or at different sites are challenging because of batch effects. In this study, we describe the integration of multibatch cytometry datasets (iMUBAC), a flexible, scalable, and robust computational framework for unsupervised cell-type identification across multiple batches of high-dimensional cytometry datasets, even without technical replicates. After overlaying cells from multiple healthy controls across batches, iMUBAC learns batch-specific cell-type classification boundaries and identifies aberrant immunophenotypes in patient samples from multiple batches in a unified manner. We illustrate unbiased and streamlined immunophenotyping using both public and in-house mass cytometry and spectral flow cytometry datasets. The method is available as the R package iMUBAC (https://github.com/casanova-lab/iMUBAC).

This article is featured in Top Reads, p.1

High-dimensional cytometry—including mass cytometry (i.e., cytometry by time-of-flight [CyTOF]) and spectral flow cytometry (e.g., Cytek Aurora)—facilitates the phenotyping of precious samples from patients for various immune cell subsets at single-cell resolution, which is of particular interest in human immunology. Ideally, a large number of samples should be processed simultaneously to ensure comparability. However, this is not always possible. For example, patients may be recruited across the globe and over decades, and they may be longitudinally monitored in prospectively expanding cohorts for rare diseases, such as inborn errors of immunity (1). Similarly, some investigations may involve multiple pilot studies with small numbers of patients followed by larger-scale validation studies. In such situations, the integration of multiple batches of experiments processed on different occasions and at different sites is crucial. The simplest solution for multibatch integration is to gate cell subsets with manual batch-to-batch adjustments. However, manual analyses of cytometry data are inherently subjective, knowledge driven, and nonscalable for multiple batches of datasets. More objective, unbiased, and scalable methods are therefore desired.

Efforts have been made to facilitate high-dimensional data inspection and unsupervised cell-subset identification [e.g., viSNE (2), SPADE (3), FlowSOM (4, 5), CITRUS (6), and CellCNN (7)]. However, these automated approaches are themselves sensitive to the batch effects resulting from the separate processing of different experiments. Both single-dimensional [i.e., CytoNorm (8) and CytofBatchAdjust (9)] and multidimensional [i.e., CytofRUV (10)] signal intensity normalization methods have been described for reducing batch effects before unsupervised cell-type identification. However, the requirement for technical replicates shared across all batches is a major practical obstacle inherent to these methods. For example, in a prospectively expanding cohort of patients with rare immunological disorders, as experiments accumulate and with the discovery and recruitment of new patients, it becomes increasingly difficult, if not almost impossible, to include identical technical replicate samples in all experiments. Moreover, batch correction on cells from patients is undesirable because of the inherent uncertainty about over- or undercorrection, particularly for patients expected to have markedly altered immunophenotypes. Thus, the computational approaches currently available remain inadequate for unbiased and high-resolution immunophenotyping on multiple batches of cytometry datasets without technical replicates.

PBMCs were isolated from heparinized venous blood samples by Ficoll-Hypaque density gradient centrifugation (GE Healthcare). Cells were cryopreserved in FBS supplemented with 10% DMSO and stored at −150°C until use. Patients with various severe infectious diseases were included in this study. When blood samples from patients were collected at distant sites, the blood samples were transported to either the Paris or the New York branch of our laboratory overnight and processed. We accounted for the effect of blood sample transportation by collecting samples from healthy volunteers or healthy family members and transporting and processing them simultaneously with the transported samples (travel/family controls). Cells from healthy local volunteers collected and processed locally were also used (local controls). We included multiple local controls in each cytometry experiment.

PBMCs from 38 healthy individuals (29 adults and nine children/adolescents age 1–17 y), 31 patients with various forms of unusually severe infectious diseases and autoimmune diseases, and 16 travel/family controls for these patients were studied in seven different batches of experiments. Nine healthy adult individuals and one patient were studied twice in different batches as either technical or biological replicates. Freshly thawed PBMCs (1.0 × 106 cells per panel) were first stained with Fc block and then with a panel of metal-conjugated Abs obtained from Fluidigm or by customized conjugation. The panel consisted of the following Abs: anti-CD45-89Y, anti-CD57-113In, anti-CD11c-115In, anti-CD33-141Pr, anti-CD19-142Nd, anti-CD45RA-143Nd, anti-CD141-144Nd, anti-CD4-145Nd, anti-CD8-146Nd, anti-CD20-147Sm, anti-CD16-148Nd, anti-CD127-149Sm, anti-CD1c-150Nd, anti-CD123-151Eu, anti-CD66b-152Sm, anti–PD-1–153Eu, anti-CD86-154Sm, anti-CD27-155Gd, anti-CCR5-156Gd, anti-CD117-158Gd, anti-CD24-159Tb, anti-CD14-160Gd, anti-CD56-161Dy, anti-γδTCR-162Dy, anti-CRTH2-163Dy, anti-CLEC12A-164Dy, anti-CCR6-165Ho, anti-CD25-166Er, anti-CCR7-167Er, anti-CD3-168Er, anti-CX3CR1-169Tm, anti-CD38-170Er, anti-CD161-171Yb, anti-CD209-172Yb, anti-CXCR3-173Yb, anti-HLA-DR-174Yb, anti-CCR4-176Yb, and anti-CD11b-209Bi. Stained cells were washed, fixed, and permeabilized with barcode permeabilization buffer (Fluidigm) and barcoded with the Cell-ID 20-Plex Pd Barcoding Kit (Fluidigm). Samples were then washed, pooled into a single tube, and stored until acquisition. Dead cells and doublets were excluded by staining the cells with a rhodium-based dead cell exclusion intercalator (Rh103) before fixation and with cationic iridium nucleic acid intercalators (Ir191 and Ir193) after fixation. Cells were acquired on a Helios mass cytometer (Fluidigm). Data are available from FlowRepository (accession: FR-FCM-Z3YK, https://flowrepository.org/id/FR-FCM-Z3YK).

Two experiments were performed on separate dates. We studied frozen PBMCs from 11 locally recruited healthy adult controls and three patients (two patients with homozygous loss-of-function mutations of FAS and one patient with a heterozygous gain-of-function STAT3 mutation). Freshly thawed PBMCs (2 × 106 cells for controls and 5 × 106 cells for the patient) were stained with the Zombie NIR Fixable Viability Kit (1:2000; BioLegend) for 15 min on ice. Cells were then stained with the following panel on ice for 30 min: FcR Blocking Reagent (1:50; Miltenyi Biotec), anti-CD3-BD Horizon V450 (1:450; BD Biosciences), anti-CD4-BUV563 (1:450; BD Biosciences), anti-CD8-BUV737 (1:150; BD Biosciences), anti-CD14-BUV395 (1:100; BD Biosciences), anti-CD16-PE-Dazzle (1:150; BioLegend), anti-CD20-BV785 (1:150; BD Biosciences), anti-CD56-BV605 (1:50; BioLegend), anti-γδTCR-Alexa Fluor 647 (1:50; BioLegend), anti-Vδ1-FITC (1:450; Miltenyi Biotec), anti-Vδ2-APC-Fire750 (1:1350; BioLegend), anti-Vα7.2-Alexa Fluor 700 (1:50; BioLegend), MR1-BV421 (provided by the NIH Tetramer Core Facility, 1:200), anti-Vα24-Jα18-BV480 (1:50; BD Biosciences), and anti-Vβ11-APC (1:150; Miltenyi Biotec) Abs. The cells were fixed on ice for 30 min and permeabilized with the True-Nuclear Transcription Factor Buffer Set (BioLegend). The cells were then stained for intracellular transcription factors overnight in the dark at 4°C with the following panel: anti-T-bet-PE-Cy7 (1:1350; BioLegend), anti-GATA3-BV711 (1:50; BioLegend), anti-RORγT-PE (1:50; BioLegend), and anti-EOMES-PerCP-eFluor 710 (1:50; eBioscience) Abs. Cells were acquired with an Aurora cytometer (Cytek). Data were inspected with FlowJo, and manually gated live singlets were then exported as flow cytometry standard (fcs) files for subsequent integration of multibatch cytometry datasets (iMUBAC) analysis. Data are available from FlowRepository (accession: FR-FCM-Z3YL, https://flowrepository.org/id/FR-FCM-Z3YL).

As a technical validation, PBMCs from two locally recruited healthy adult controls and three patients (two patients with homozygous loss-of-function mutations of FAS and one patient with a homozygous gain-of-function STAT3 mutation) were also analyzed by conventional flow cytometry. Cells were stained with the following panel on ice for 20 min: FcR Blocking Reagent (1:50; Miltenyi Biotec), anti-αβTCR-PE/Cy7 (1:100; BioLegend), anti-CD3-BV421 (1:100; BioLegend), anti-CD4-redFluor 710 (1:100; Tonbo Biosciences), anti-CD8-PerCP/Cy7 (1:100; BioLegend), anti-CD14-APC/Cy7 (1:100; BioLegend), anti-CD16-APC (1:100; BioLegend), anti-CD19-Super Bright 645 (1:100; eBioscience), and anti-CD56-Alexa Fluor 488 (1:100; BD Biosciences). Cells were then stained with 7-AAD (1:200; Tonbo Biosciences) on ice for 10 min and then acquired with a BD FACS Aria (BD Biosciences). Compensation was performed with single-stained PBMCs as controls. Data were analyzed with FlowJo.

The CyTOF dataset used in the CytofRUV study (10) was downloaded from FlowRepository (accession: FR-FCM-Z2L2, https://flowrepository.org/id/FR-FCM-Z2L2). The dataset consists of PBMC samples from nine healthy controls. All samples are stained with a panel of Abs targeting 19 surface and 12 intracellular proteins. Each sample was studied twice in two batches of experiments (i.e., technical replicates). Only samples from healthy controls were used in this study. In the benchmark analysis, the first healthy control (VBDR996) was used as an anchor sample for CytoNorm, CytofBatchAdjust, and CytofRUV.

The CyTOF dataset from the checkpoint blockade study (11) was downloaded from FlowRepository (accession: FR-FCM-ZYQR, https://flowrepository.org/id/FR-FCM-ZYQR). The dataset consists of PBMC samples from six healthy controls studied in four different batches of experiments. All samples are stained with a panel of Abs targeting 33 surface and nine intracellular proteins. Only samples from healthy controls were used in this study. One control (GCND398064) was included as a technical replicate in all batches.

The CyTOF datasets from the PD-1 blockade study (12) were downloaded from FlowRepository (accession: FR-FCM-ZY34, https://flowrepository.org/id/FR-FCM-ZY34). The datasets consist of PBMC samples from 20 patients with stage IV melanoma before and ∼12 wk after treatment with either nivolumab or pembrolizumab and 10 healthy donors at two corresponding time points. Patients are classified as responders or nonresponders based on treatment outcomes for the first 15 wk of treatment. The datasets consist of two batches (i.e., experiments performed on two different dates), designated as discovery and validation cohorts in the original study. The first batch contains five healthy donors, five responders, and five nonresponders, whereas the second batch contains five healthy donors, six responders, and four nonresponders. The samples are stained with three different Ab panels.

All computational analyses were performed with a laptop personal computer equipped with quad-core CPUs and 64GB RAM, using the free versions of R (version 4.0) (https://www.r-project.org/) (13) and RStudio Desktop (version 1.3) (https://rstudio.com/products/rstudio/). The R package iMUBAC and its instructions, including example codes and datasets, are available from GitHub (https://github.com/casanova-lab/iMUBAC). Full scripts are available upon request.

The iMUBAC workflow consists of four steps: 1) preprocessing, 2) batch correction, 3) unsupervised clustering and cell-type annotation, and 4) batch-specific cell-type prediction.

Preprocessing.

Batch-specific preprocessing was performed as follows. First, CyTOF data files in the fcs format were imported into R with the ncdfFlow package. The ncdfFlow package can be used for the memory-efficient HDF5-based storage of cytometry data. The truncate_max_range option was disabled. For CyTOF data, the transformation option, implemented in the underlying function to read fcs files (read.FCS), was disabled as this step is optimized for flow cytometry. Channel names were then organized. This step resolves batch-to-batch differences in the panel design such that identical markers measured with different channels (e.g., fluorochrome- or metal-conjugated Abs) are aligned. For CyTOF data, we then excluded doublets and dead cells in a data-adaptive manner. In this step, all cells from all samples in a single batch of an experiment were pooled, such that identical gates were applied to all samples in a given batch. For DNA-based gating, the dnaGate function in the cydar package was used, and outliers on both the higher and lower sides (considered to be doublets and debris, respectively) were excluded. For event length and dead cell exclusion dye-based gating, the outlierGate function in the cydar package was used, and outliers on the higher side were excluded. In our in-house CyTOF datasets, the intercalator Rh103 was used to exclude dead cells, whereas the intercalators Ir191 and Ir193 were used to exclude doublets and debris. In the PD-1 blockade CyTOF datasets, the intercalator Pt198 was used to exclude dead cells, whereas the intercalators Ir191 and Ir193 were used to exclude doublets and debris. For our in-house pregated spectral flow cytometry datasets, the channel for the Zombie NIR Fixable Viability dye was used to exclude dead cells, and automated gating for doublets and DNA content was disabled. The expression values were then transformed. For CyTOF data, a hyperbolic arcsin transformation was applied, with a cofactor of five. For spectral flow cytometry data, Logicle transformation was applied, with parameters estimated in a data-adaptive manner with the estimateLogicle function implemented in the flowCore package. Finally, any event with zeros for all markers was discarded.

After the batch-specific preprocessing, the outputs were concatenated, with only markers common to all batches retained, to form a single SingleCellExperiment object in R.

Batch correction.

The goal of this step is to enable the system to learn batch-to-batch deviations due to technical effects but not biological variability. We, therefore, used only data from healthy controls for batch correction, excluding data for patients and travel/family controls. Data were first downsampled to 200,000 cells per batch (unless otherwise stated) by taking approximately equal numbers of cells from each control to reduce the computational burden. For the in-house spectral flow datasets, we used 500,000 cells per batch to ensure the robust identification of invariant NK T cells, an extremely rare innate-like T cell subset. For the PD-1 blockade CyTOF panel 3 (Myeloid Panel) dataset, we used 50,000 cells per batch because of the low total cell counts in the dataset. We then batch corrected expression values for all markers with Harmony (14), using the default parameters. Rather than performing principal component analysis, we used each marker directly as an input for batch correction. The effect of batch correction was assessed by manual inspection, with uniform manifold approximation and projection (UMAP) used for data visualization (L. McInnes, J. Healy, and J. Melville, manuscript posted on arXiv, Bibcode: 2018arXiv180203426M).

Unsupervised clustering and cell-type identification.

The goal of this step was to identify cell types in an unsupervised manner. We implemented two methods: 1) FlowSOM-guided clustering and 2) UMAP-based dimension reduction followed by the shared nearest neighbor (SNN) graph-based clustering (15) as described below.

FlowSOM method.

This approach was inspired by the workflow described by Nowicka et al. (5). Briefly, batch-corrected expression values were subjected to unsupervised clustering with FlowSOM (4), using the FlowSOM package, followed by metaclustering with the ConsensusClusterPlus package. Euclidean distance was used for metaclustering. For the in-house spectral flow cytometry and CyTOF datasets, we generated 50 and 60 clusters, respectively, to improve the resolution of cell-type identification. For the PD-1 blockade CyTOF datasets, we generated 40 clusters. We manually determined the identity of these clusters from their locations in the UMAP plot and the heatmap summarizing the median expression levels of all markers for each cluster.

SNN graph method.

This approach was inspired by the workflow for single-cell RNA sequencing datasets implemented in the scran package. Batch-corrected expression values were first dimension reduced via UMAP into 10 dimensions. These dimensions were then used to construct an SNN graph with the buildSNNGraph function in scran, using the default settings. Finally, the graph was divided into clusters with the Louvain algorithm implemented as the cluster_louvain function in the igraph package (16).

Batch-specific cell-type prediction.

The goal of this step is to allow the system to learn, automatically, the cell-type classification rules or cell-type boundaries in a batch-specific manner and to propagate the boundaries to all the cells in a given batch, including cells from travel/family controls and patients. Importantly, we used non–batch-corrected expression values tied to cluster labels defined in the unsupervised clustering section. First, cells from the healthy controls used for unsupervised clustering were further downsampled, retaining a maximum of 100 cells per cluster from a given batch. This step reduces both the computational burden and the class imbalance problem during machine learning, as there are both highly abundant cell subsets (e.g., CD14+ monocytes) and rare cell subsets (e.g., plasmacytoid dendritic cells) among human PBMCs. We tested several conditions and found that cell-type classification rules can be learned successfully from 100 events per cell type. A classifier was then trained, using the caret package (17). We selected the extremely randomized trees (18) algorithm implemented in the extraTrees package. After centering and scaling the non–batch-corrected expression values, we performed five-repeat 10-fold cross-validations with internal upsampling to maximize the κ statistic. Hyperparameters were tuned for each batch; mtry was tuned from 5 to 15, whereas numRandomCuts was tuned from one to two, the ranges being empirically determined. The batch-specific classifier was then applied for all cells in a given batch, including the cells of patients and travel/family controls, to determine the best-matching clusters in a probabilistic manner. This approach assigns all cells of patients and travel/family controls into one of the clusters defined with the cells of healthy controls. It is, therefore, still possible that the cells of patients assigned to a particular cluster display differential expression of a certain subset of markers relative to the cells of controls in the same cluster. Cluster-wise differential expression analysis is required to characterize these differential expression markers comprehensively, but this topic is not dealt with further in this article.

CytoNorm (8), CytofBatchAdjust (9), and CytofRUV (10) were downloaded from their GitHub repositories and used with default settings. Batch correction was performed with one technical replicate sample (replicated across all batches). For iMUBAC analysis, batch correction was performed without specifying technical replicates. Signals for all markers were simultaneously corrected. Earth mover’s distance (EMD) was calculated between batches for each of the markers, with the EMDomics package. When multiple healthy controls were included in a given batch, the EMD was calculated without distinguishing between individuals. Computation time was measured in R. Per-cluster batch representation was defined as the proportion of cells from a certain batch in a given cluster. All clusters defined through unsupervised clustering were subjected to the analysis.

The raw counts of cell subsets identified from the three panels were jointly tested for differential abundance (DA) with the quasi-likelihood F-test framework of the edgeR package (19). When both pre- and posttreatment datasets were used, the DA between groups (i.e., responders and nonresponders) was assessed, with adjustment for both treatment and batch effects. When only pretreatment datasets were used, the DA between groups was assessed with adjustment for batch effects only. The DA values for subsets with an absolute log2 fold-change of at least 0.5 and an adjusted p value < 0.05 were considered statistically significant.

We present iMUBAC, a flexible, robust, and scalable computational framework for rational interbatch comparisons through high-dimensional batch correction and unsupervised cell-type identification across multiple batches (Fig. 1). The workflow can be broken down into four steps. First, iMUBAC performs data-adaptive, automated preprocessing, including the exclusion of doublets and dead cells and ensuring interbatch consistency in panel design. Second, iMUBAC batch corrects cells from healthy local controls, but not from travel/family controls or patients, using Harmony (14) to reduce batch effects before clustering. The major advantage of Harmony is that it does not require technical replicates shared across batches. Third, iMUBAC performs unsupervised clustering with the batch-corrected expression values, by 1) FlowSOM-guided clustering (4) or 2) dimension reduction by UMAP (L. McInnes, et al., manuscript posted on arXiv, Bibcode: 2018arXiv180203426M) followed by SNN graph-based clustering (15). If desired, the clusters can be further merged manually and identified to improve interpretability in subsequent analyses. Fourth, iMUBAC trains batch-specific classifiers through machine learning with noncorrected expression values. In this study, the idea is to “back-propagate” cell-type annotations defined in the batch-corrected high-dimensional space into the noncorrected, batch-specific spaces in which the patients’ cells are embedded. The entire workflow is independent of patients (and of travel/family controls), for whom immunophenotypes need to be determined, thereby preventing over- or undercorrection.

We first benchmarked the batch-correction efficacy of iMUBAC against other algorithms, using two different public multibatch CyTOF datasets for PBMCs from multiple healthy volunteers, including technical replicate samples shared across all batches of experiments. The CytofRUV dataset reported by Trussart et al. (10) contains nine healthy controls analyzed in two batches of experiments. The first control (VBDR996) was used as an anchor for CytoNorm, CytofBatchAdjust, and CytofRUV. By contrast, Harmony-guided batch correction in iMUBAC used 200,000 cells per batch randomly sampled from all the available healthy controls. The checkpoint blockade dataset reported by Wei et al. (11) contains 14 samples from six healthy controls studied in four batches of experiments, including one control (GCND398064) studied as technical replicates in all batches. The shared control was used as an anchor for CytoNorm, CytofBatchAdjust, and CytofRUV. By contrast, Harmony-guided batch correction in iMUBAC used 1,000,000 cells per batch randomly sampled from all the available controls. We compared the EMD between batches for each marker before and after batch correction, as described in previous studies (8, 10). We aggregated all controls in each batch for EMD calculation. We therefore expect interindividual variability to persist even after the removal of batch effects due to technical variability. The comparison showed that iMUBAC reduced EMD in both datasets as effectively as CytoNorm, and perhaps more effectively than CytofBatchAdjust (Fig. 2A), whereas CytofRUV gave the most effective reduction of EMD in both datasets.

We then evaluated the efficacy of batch correction by iMUBAC using datasets lacking shared technical replicates across batches. To this end, we used the following multibatch cytometry datasets: in-house CyTOF, in-house spectral flow, and PD-1 blockade CyTOF datasets. The in-house CyTOF dataset contains 47 PBMC samples from 38 healthy local controls analyzed in seven batches of experiments with a panel consisting of 38 cell-surface markers. The in-house spectral flow cytometry dataset contains 11 PBMC samples from 11 healthy controls analyzed in two batches of experiments with a panel consisting of 14 cell-surface and four intracellular markers. The PD-1 blockade CyTOF dataset, reported by Krieg et al. (12), contains 10 PBMC samples from 10 healthy controls analyzed in two batches of experiments (i.e., discovery and validation cohorts) with three different Ab panels. These datasets cannot be processed by existing methods (i.e., CytoNorm, CytofBatchAdjust, or CytofRUV) because of the lack of replicates. Batch correction was performed by iMUBAC, using cells randomly sampled from all controls (n = 500,000 for the spectral flow dataset, n = 50,000 for the PD-1 blockade panel 3 CyTOF dataset, and n = 200,000 for other datasets). We found that batch correction decreased EMD in all datasets (Fig. 2B). We then evaluated the benefits of batch correction for subsequent unsupervised clustering by calculating the proportion of cells from a given batch of experiments in a given cluster, as the “per-cluster batch representation.” FlowSOM defined 60, 50, 40, 30, and 40 clusters, using noncorrected or corrected expression values, for the in-house CyTOF, in-house spectral flow, and PD-1 blockade CyTOF panel 1, 2, and 3 datasets, respectively. Batch correction improved the evenness of per-cluster batch representation in all datasets (Fig. 2C). Finally, we assessed the impact of the number of healthy controls per batch on the batch-correction performance of iMUBAC, using the in-house CyTOF dataset as a case study. We calculated both EMDs and per-cluster batch representations with the inclusion of a limited number of controls. We found that at least three controls per batch were required to achieve effective batch correction (Fig. 2D, 2E). Collectively, iMUBAC successfully integrates multibatch CyTOF and spectral flow cytometry datasets, even without shared technical replicates, by overlaying cells from multiple healthy controls as anchors.

Scalability is also an issue when a large number of high-dimensional cytometry experiments need to be integrated. We thus evaluated the scalability of the iMUBAC workflow by assessing the computation time required. We used our in-house CyTOF dataset, which was the largest dataset tested in this study. We first evaluated the relationship between the number of batches and the computation time required for Harmony-guided batch correction (using ∼200,000 cells per batch). We found that computation time increased linearly with the number of batches (Fig. 3A). It took ∼2.5 h to integrate all seven batches. We also assessed the time required for cell-type back-propagation by machine learning. Our in-house CyTOF dataset contains 95 PBMC samples from healthy local controls, patients with unusually severe infectious diseases and autoimmune diseases, and their travel/family controls (∼18 million cells from 85 individuals in total), analyzed in seven batches of experiments. After batch correction with healthy local controls, FlowSOM defined 60 clusters. Batch-specific classifiers were trained on non–batch-corrected expression values, and the cluster labels were back-propagated to all cells, including cells from travel/family controls and patients. The classifier training phase took less than an hour per ∼200,000 cells per batch, whereas the prediction phase was completed in most cases within 15 min per batch (Fig. 3B, 3C). Collectively, iMUBAC scales up linearly as the number of batches or cells increases.

The successful identification of both abundant and rare leukocyte subsets is an absolute requirement for optimal immunophenotyping. We, thus, evaluated the accuracy and immunological interpretability of the iMUBAC workflow with our in-house CyTOF dataset. After batch correction with healthy local controls, FlowSOM defined 60 clusters, which we merged and identified manually (Supplemental Fig. 1). We then calculated the percentages of the various subsets among live single leukocytes (CD45+CD66b cells, excluding CD45CD66b+ granulocytes). The cell-type frequencies of both technical (i.e., experiments performed on different dates with aliquots of identical biological materials) and biological (i.e., experiments performed on different dates with biological materials obtained from identical donors on different occasions) duplicates correlated well between the two batches tested (Supplemental Fig. 2). We also evaluated the accuracy of automatic cell-type identification by iMUBAC, by manually gating the same CyTOF dataset in FlowJo. The manually determined frequencies of both abundant (e.g., αβ T cells) and rare (e.g., conventional dendritic cells) subsets were remarkably consistent with the frequencies determined by the iMUBAC workflow (independent of manual gating) across a wide range of percentages (from ∼0.01% to ∼70%), not only for local controls but also for travel/family controls and patients (Fig. 4). Indeed, iMUBAC successfully identified both the reduction (e.g., NK cells) and expansion (e.g., B cells) of multiple cell subsets in samples from patients, demonstrating the robustness of cell-type identification regardless of abundance in a given sample. Collectively, iMUBAC rationally integrates multibatch cytometry datasets to identify cell types consistent with state-of-the-art gating in both controls and patients.

The identification, description, and quantification of disease-associated rare leukocyte subsets are crucial in many immunological studies. We, therefore, applied iMUBAC to our in-house multibatch spectral flow cytometry dataset, which contains PBMCs from 11 healthy controls and three patients with two monogenic forms of autoimmunity: FAS deficiency (n = 2) (20, 21) and STAT3 gain-of-function (n = 1) (2224). Both these disorders are associated with lymphoproliferation and high counts of CD4CD8 double-negative αβ T (DN T) cells in the peripheral bloodstream. After the manual gating out of dead cells and doublets in FlowJo, iMUBAC batch corrected the cells (∼500,000 per batch) from healthy controls and defined 50 clusters, which we then merged and identified manually (Fig. 5A, 5B, Supplemental Fig. 3). As expected, we observed an expansion of a cluster representing DN T cells in patients with FAS deficiency and, to a lesser extent, in those with STAT3 gain-of-function (Fig. 5C). Conventional flow cytometry with a different Ab panel validated this expansion of DN T cells (data not shown). iMUBAC also revealed a decrease in the levels of Vδ2 γδ T, CD8+ mucosal-associated invariant T, and CD16+ NK cells in both groups, possibly reflecting a previously unappreciated level of immunopathological homogeneity between these two inborn errors of immunity (Fig. 5C). Thus, iMUBAC readily identifies both known and unappreciated immunophenotypes in patients with rare immunological disorders.

Finally, we demonstrated the potential advantage of streamlined immunophenotyping on multibatch cytometry datasets by applying iMUBAC to the PD-1 blockade CyTOF datasets previously reported by Krieg et al. (12). These datasets consist of PBMCs from 10 healthy controls and 20 patients with stage IV melanoma before and after PD-1 blockade immunotherapy (11 responders and nine nonresponders) studied in two batches of experiments (i.e., the discovery and validation cohorts) with three different Ab panels. Healthy controls were used for batch correction, clustering with FlowSOM, and classifier training (Fig. 5D, Supplemental Fig. 4). We tested the DA between responders and nonresponders, using the quasi-likelihood F-test in edgeR (19), as previously applied to CyTOF analysis (25, 26). Eight DA subsets were identified, which remained statistically significant after adjustment for treatment status and batch effects. In particular, the CD14 Mono 2 cluster from panel 3, corresponding to CD14+CD16HLA-DRhiCD86hiPD-L1hi monocytes, was expanded in responders, consistent with the findings of the original report (12) (Fig. 5E, Supplemental Fig. 4A). Moreover, among the eight DA subsets, four T cell subsets from panel 1 were reproducibly identified even when only pretreatment datasets were used for the DA analysis (Fig. 5F, Supplemental Fig. 4B). These subsets could potentially be used as biomarkers of a better prognosis, before the initiation of PD-1 blockade immunotherapy. Thus, iMUBAC can be used to streamline an exploratory immunophenotyping analysis in clinical pilot studies of common diseases.

High-dimensional cytometry is a promising technique for human immunological studies, which are often characterized by the limited amounts of clinical samples available. A rational comparison of leukocyte subsets from patients with immunological conditions studied on different occasions would make it possible to identify the immunophenotypes unique to particular patients and those common to these patients. Moreover, platforms such as FlowRepository and Cytobank facilitate the sharing of cytometry data, making it possible to foster discoveries through meta-analysis. However, the lack of appropriate technical replicates across experiments hinders batch correction, making integrative analyses challenging. We present iMUBAC, a technical replicate-independent batch-correction and unsupervised clustering framework, available on GitHub as a readily installable R package (https://github.com/casanova-lab/iMUBAC), for the rational comparative immunophenotyping analysis of multiple batches of high-dimensional cytometry datasets. We show that iMUBAC is flexible in terms of study design (e.g., technical replicates), linearly scalable with increases in the number of batches and cells to be integrated, and robust with respect to the abundance of each of the immune cell subsets in a given sample. The combination of batch correction and unsupervised clustering provided by this package enables investigators to define leukocyte subsets comparatively across batches, eliminating the need to interpret “batch-specific” clusters originating from batch effects or artifacts. Furthermore, we demonstrate that iMUBAC can successfully identify known disease condition–specific subsets in both CyTOF and spectral flow cytometry datasets. Given the increasing popularity of high-dimensional spectral flow cytometry, a unified workflow applicable to both CyTOF and spectral flow cytometry is of major pragmatic significance.

There are several caveats to the use of iMUBAC. First, we assume that both healthy controls and patients have identical sets of cell subsets. All samples must, therefore, be derived from the same type of biological material. For example, iMUBAC cannot simultaneously handle PBMC samples from controls and tumor-infiltrating cells from patients. However, we successfully show that the subset of DN T cells, a rare T cell subset, in healthy controls is expanded in patients with lymphoproliferative autoimmunity, suggesting that iMUBAC can handle samples from patients that are quantitatively different from control samples. Second, iMUBAC makes use of multiple healthy controls per batch, rather than a single shared technical replicate, enabling it to capture an “average” immunophenotype across controls. Given the inherent variability of human immunophenotypes, any single control may deviate from the average immunophenotype. As we showed, the use of multiple controls (at least three) mitigates this unpredictable variability. Third, the functionality of iMUBAC may be restricted by the computational resources available, although this package can operate on any conventional desktop personal computer or laptop computer. The computation time (of the order of hours) of iMUBAC is undoubtedly longer than that for the other tools currently available. When shared technical replicates are available, CytofRUV may be a better choice, as it outperforms iMUBAC in terms of both the speed and efficacy of batch correction. Nevertheless, we demonstrate in this study the feasibility of comparative immunophenotyping analysis on multibatch cytometry experiments without relying on shared technical replicates. We envisage that our framework can be applied in many areas of immunological studies, thereby fostering discoveries that might otherwise be overlooked.

We thank the patients, relatives, and physicians for participating in this study; Dominick Papandrea, Yelena Nemirovskaya, Mark Woollett, Lazaro Lorenzo-Diaz, and Cécile Patissier for administrative assistance; Tatiana Kochetkov for technical assistance; and the members of the Casanova laboratory (Rockefeller University) for helpful discussions. We thank the Flow Cytometry Resource Center at the Rockefeller University for technical support for flow cytometry. We also thank the Human Immune Monitoring Core at the Icahn School of Medicine at Mount Sinai for technical assistance with mass cytometry. We thank the National Institutes of Health Tetramer Core Facility for providing the MR1 tetramer, which was developed jointly with Dr. James McCluskey, Dr. Jamie Rossjohn, and Dr. David Fairlie.

This study was supported in part by grants from the St. Giles Foundation, Rockefeller University, INSERM, the Université de Paris, the Howard Hughes Medical Institute, the National Institute of Allergy and Infectious Diseases/National Institutes of Health (R37AI095983 to J.-L.C., U19AI142737 to S.B.-D., and R01AI127372 and R01AI148963 to D.B.), the French Foundation for Medical Research (EQU201903007798), the French National Research Agency under the Investments for the Future Program (ANR-10-IAHU-01), the Integrative Biology of Emerging Infectious Diseases Laboratory of Excellence (ANR-10-LABX-62-IBEID), the GENMSMD project (ANR-16-CE17-0005-01 to J.B.), the SCOR Corporate Foundation for Science, and Fondation du Souffle (SRC2017 to J.B.). M.O. was supported by the David Rockefeller Graduate Program, the Funai Foundation for Information Technology, the Honjo International Scholarship Foundation, and the New York Hideyo Noguchi Memorial Society, Inc. R.Y. was supported by the Immune Deficiency Foundation and the Stony Wold-Herbert Fund. A.N.S. was supported by the European Commission (Horizon 2020 Marie Skłodowska-Curie Individual Fellowship 789645), the Dutch Research Council (Rubicon Grant 019.171LW.015), and the European Molecular Biology Organization (Long-Term Fellowship ALTF 84-2017, nonstipendiary). J.R. was supported by the INSERM Ph.D. Program (Poste d’Accueil INSERM).

The cytometry datasets presented in this article have been submitted to FlowRepository (https://flowrepository.org/id/FR-FCM-Z3YK) under accession numbers FR-FCM-Z3YK and FR-FCMZ3YL.

The online version of this article contains supplemental material.

Abbreviations used in this article:

CyTOF

cytometry by time-of-flight

DA

differential abundance

DN T

double-negative αβ T

EMD

earth mover’s distance

fcs

flow cytometry standard

iMUBAC

integration of multibatch cytometry datasets

SNN

shared nearest neighbor

UMAP

uniform manifold approximation and projection.

1
Tangye
,
S. G.
,
W.
Al-Herz
,
A.
Bousfiha
,
T.
Chatila
,
C.
Cunningham-Rundles
,
A.
Etzioni
,
J. L.
Franco
,
S. M.
Holland
,
C.
Klein
,
T.
Morio
, et al
.
2020
.
Human inborn errors of immunity: 2019 update on the classification from the International Union of Immunological Societies expert committee. [Published erratum appears in 2020 J. Clin. Immunol. 40: 65.]
J. Clin. Immunol.
40
:
24
64
.
2
Amir
,
A. D.
,
K. L.
Davis
,
M. D.
Tadmor
,
E. F.
Simonds
,
J. H.
Levine
,
S. C.
Bendall
,
D. K.
Shenfeld
,
S.
Krishnaswamy
,
G. P.
Nolan
,
D.
Pe’er
.
2013
.
viSNE enables visualization of high dimensional single-cell data and reveals phenotypic heterogeneity of leukemia.
Nat. Biotechnol.
31
:
545
552
.
3
Qiu
,
P.
,
E. F.
Simonds
,
S. C.
Bendall
,
K. D.
Gibbs
Jr.
,
R. V.
Bruggner
,
M. D.
Linderman
,
K.
Sachs
,
G. P.
Nolan
,
S. K.
Plevritis
.
2011
.
Extracting a cellular hierarchy from high-dimensional cytometry data with SPADE.
Nat. Biotechnol.
29
:
886
891
.
4
Van Gassen
,
S.
,
B.
Callebaut
,
M. J.
Van Helden
,
B. N.
Lambrecht
,
P.
Demeester
,
T.
Dhaene
,
Y.
Saeys
.
2015
.
FlowSOM: using self-organizing maps for visualization and interpretation of cytometry data.
Cytometry A
87
:
636
645
.
5
Nowicka
,
M.
,
C.
Krieg
,
H. L.
Crowell
,
L. M.
Weber
,
F. J.
Hartmann
,
S.
Guglietta
,
B.
Becher
,
M. P.
Levesque
,
M. D.
Robinson
.
2017
.
CyTOF workflow: differential discovery in high-throughput high-dimensional cytometry datasets.
F1000 Res.
6
:
748
.
6
Bruggner
,
R. V.
,
B.
Bodenmiller
,
D. L.
Dill
,
R. J.
Tibshirani
,
G. P.
Nolan
.
2014
.
Automated identification of stratifying signatures in cellular subpopulations.
Proc. Natl. Acad. Sci. USA
111
:
E2770
E2777
.
7
Arvaniti
,
E.
,
M.
Claassen
.
2017
.
Sensitive detection of rare disease-associated cell subsets via representation learning.
Nat. Commun.
8
:
14825
.
8
Van Gassen
,
S.
,
B.
Gaudilliere
,
M. S.
Angst
,
Y.
Saeys
,
N.
Aghaeepour
.
2020
.
CytoNorm: a normalization algorithm for cytometry data.
Cytometry A
97
:
268
278
.
9
Schuyler
,
R. P.
,
C.
Jackson
,
J. E.
Garcia-Perez
,
R. M.
Baxter
,
S.
Ogolla
,
R.
Rochford
,
D.
Ghosh
,
P.
Rudra
,
E. W. Y.
Hsieh
.
2019
.
Minimizing batch effects in mass cytometry data.
Front. Immunol.
10
:
2367
.
10
Trussart
,
M.
,
C. E.
Teh
,
T.
Tan
,
L.
Leong
,
D. H.
Gray
,
T. P.
Speed
.
2020
.
Removing unwanted variation with CytofRUV to integrate multiple CyTOF datasets.
Elife
9
: e59630.
11
Wei
,
S. C.
,
N.-A. A. S.
Anang
,
R.
Sharma
,
M. C.
Andrews
,
A.
Reuben
,
J. H.
Levine
,
A. P.
Cogdill
,
J. J.
Mancuso
,
J. A.
Wargo
,
D.
Pe’er
,
J. P.
Allison
.
2019
.
Combination anti-CTLA-4 plus anti-PD-1 checkpoint blockade utilizes cellular mechanisms partially distinct from monotherapies.
Proc. Natl. Acad. Sci.
116
:
22699
22709
.
12
Krieg
,
C.
,
M.
Nowicka
,
S.
Guglietta
,
S.
Schindler
,
F. J.
Hartmann
,
L. M.
Weber
,
R.
Dummer
,
M. D.
Robinson
,
M. P.
Levesque
,
B.
Becher
.
2018
.
High-dimensional single-cell analysis predicts response to anti-PD-1 immunotherapy. [Published erratum appears in 2018 Nat. Med. 24: 1773–1775.]
Nat. Med.
24
:
144
153
.
13
R Core Team
.
2018
.
R: A Language and Environment for Statistical Computing.
R Foundation for Statistical Computing
,
Vienna, Austria
.
14
Korsunsky
,
I.
,
N.
Millard
,
J.
Fan
,
K.
Slowikowski
,
F.
Zhang
,
K.
Wei
,
Y.
Baglaenko
,
M.
Brenner
,
P. R.
Loh
,
S.
Raychaudhuri
.
2019
.
Fast, sensitive and accurate integration of single-cell data with Harmony.
Nat. Methods
16
:
1289
1296
.
15
Xu
,
C.
,
Z.
Su
.
2015
.
Identification of cell types from single-cell transcriptomes using a novel clustering method.
Bioinformatics
31
:
1974
1980
.
16
Csárdi
,
G.
,
T.
Nepusz
.
2006
.
The igraph software package for complex network research.
InterJournal Complex Systems
1695
:
1
9
.
17
Kuhn
,
M.
2008
.
Building predictive models in R using the caret package.
J. Stat. Softw.
28
:
1
26
.
18
Geurts
,
P.
,
D.
Ernst
,
L.
Wehenkel
.
2006
.
Extremely randomized trees.
Mach. Learn.
63
:
3
42
.
19
Robinson
,
M. D.
,
D. J.
McCarthy
,
G. K.
Smyth
.
2010
.
edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.
Bioinformatics
26
:
139
140
.
20
Oliveira
,
J. B.
,
J. J.
Bleesing
,
U.
Dianzani
,
T. A.
Fleisher
,
E. S.
Jaffe
,
M. J.
Lenardo
,
F.
Rieux-Laucat
,
R. M.
Siegel
,
H. C.
Su
,
D. T.
Teachey
,
V. K.
Rao
.
2010
.
Revised diagnostic criteria and classification for the autoimmune lymphoproliferative syndrome (ALPS): report from the 2009 NIH International Workshop.
Blood
116
:
e35
e40
.
21
Magerus-Chatinet
,
A.
,
M.-C.
Stolzenberg
,
M. S.
Loffredo
,
B.
Neven
,
C.
Schaffner
,
N.
Ducrot
,
P. D.
Arkwright
,
B.
Bader-Meunier
,
J.
Barbot
,
S.
Blanche
, et al
.
2009
.
FAS-L, IL-10, and double-negative CD4- CD8- TCR α/β+ T cells are reliable markers of autoimmune lymphoproliferative syndrome (ALPS) associated with FAS loss of function.
Blood
113
:
3027
3030
.
22
Haapaniemi
,
E. M.
,
M.
Kaustio
,
H. L. M.
Rajala
,
A. J.
van Adrichem
,
L.
Kainulainen
,
V.
Glumoff
,
R.
Doffinger
,
H.
Kuusanmäki
,
T.
Heiskanen-Kosma
,
L.
Trotta
, et al
.
2015
.
Autoimmunity, hypogammaglobulinemia, lymphoproliferation, and mycobacterial disease in patients with activating mutations in STAT3.
Blood
125
:
639
648
.
23
Milner
,
J. D.
,
T. P.
Vogel
,
L.
Forbes
,
C. A.
Ma
,
A.
Stray-Pedersen
,
J. E.
Niemela
,
J. J.
Lyons
,
K. R.
Engelhardt
,
Y.
Zhang
,
N.
Topcagic
, et al
.
2015
.
Early-onset lymphoproliferation and autoimmunity caused by germline STAT3 gain-of-function mutations.
Blood
125
:
591
599
.
24
Nabhani
,
S.
,
C.
Schipp
,
H.
Miskin
,
C.
Levin
,
S.
Postovsky
,
T.
Dujovny
,
A.
Koren
,
D.
Harlev
,
A. M.
Bis
,
F.
Auer
, et al
.
2017
.
STAT3 gain-of-function mutations associated with autoimmune lymphoproliferative syndrome like disease deregulate lymphocyte apoptosis and can be targeted by BH3 mimetic compounds.
Clin. Immunol.
181
:
32
42
.
25
Lun
,
A. T. L.
,
A. C.
Richard
,
J. C.
Marioni
.
2017
.
Testing for differential abundance in mass cytometry data.
Nat. Methods
14
:
707
709
.
26
Weber
,
L. M.
,
M.
Nowicka
,
C.
Soneson
,
M. D.
Robinson
.
2019
.
diffcyt: differential discovery in high-dimensional cytometry via high-resolution clustering.
Commun. Biol.
2
:
183
.

The authors have no financial conflicts of interest.

Supplementary data