# Transcriptional programs of neoantigen-specific TIL in anti-PD-1-treated lung cancers

No statistical methods were used to predetermine sample size. The experiments were not randomized. The investigators were not blinded to allocation during experiments and outcome assessment.

### Patients and biospecimens

This study was approved by the Institutional Review Boards (IRB) at Johns Hopkins University (JHU) and Memorial Sloan Kettering Cancer Center and was conducted in accordance with the Declaration of Helsinki and the International Conference on Harmonization Good Clinical Practice guidelines. The patients described in this study provided written informed consent. All biospecimens were obtained from patients with stage I-IIIA NSCLC who were enrolled to a phase II clinical trial evaluating the safety and feasibility of administering two doses of anti-PD-1 (nivolumab) before surgical resection. Pathological response assessments of primary tumours were reported previously8,31. Tumours with no more than 10% residual viable tumour cells were considered to have a MPR.

### scRNA-seq–TCR-seq

Cryobanked T cells were thawed and washed twice with pre-warmed RPMI with 20% FBS and gentamicin. Cells were resuspended in PBS and stained with a viability marker (LIVE/DEAD Fixable Near-IR; ThermoFisher) for 15 min at room temperature in the dark. Cells were the incubated with Fc block for 15 min on ice and stained with antibody against CD3 (BV421, clone SK7) for 30 min on ice. After staining, highly viable CD3+ T cells were sorted into 0.04% BSA in PBS using a BD FACSAria II Cell Sorter. Sorted cells were manually counted using a hemocytometer and prepared at the desired cell concentration (1,000 cells per μl), when possible. The Single Cell 5′ V(D)J and 5′ DGE kits (10X Genomics) were used to capture immune repertoire information and gene expression from the same cell in an emulsion-based protocol at the single-cell level. Cells and barcoded gel beads were partitioned into nanolitre-scale droplets using the 10X Genomics Chromium platform to partition up to 10,000 cells per sample followed by RNA capture and cell-barcoded cDNA synthesis using the manufacturer’s standard protocols. Libraries were generated and sequenced on an Illumina NovaSeq instrument using 2 × 150-bp paired end sequencing. 5′ VDJ libraries were sequenced to a depth of ~5,000 reads per cell, for a total of 5 million to 25 million reads. The 5′ DGE libraries were sequenced to a target depth of ~50,000 reads per cell.

### Whole-exome sequencing, mutation calling and neoantigen prediction

Genomic data for most individuals in our study were reported previously8, and whole-exome sequencing, variant calling and neoantigen predictions for individuals MD043-003 and NY016-025 were performed prospectively for the present study. Whole-exome sequencing was performed on pre-treatment tumours unless otherwise noted (Supplementary Table 4) and matched normal samples. DNA was extracted from tumours and matched peripheral blood using the Qiagen DNA kit (Qiagen). Fragmented genomic DNA from tumour and normal samples was used for Illumina TruSeq library construction (Illumina) and exonic regions were captured in solution using the Agilent SureSelect v.4 kit (Agilent,) according to the manufacturers’ instructions as previously described32. Paired-end sequencing, resulting in 100 bases from each end of the fragments for the exome libraries was performed using Illumina HiSeq 2000/2500 instrumentation (Illumina). The depth of total and distinct coverage is shown in Supplementary Table 4. Somatic mutations, consisting of point mutations, insertions, and deletions across the whole exome were identified using the VariantDx custom software for identifying mutations in matched tumour and normal samples as previously described32,33. Somatic mutations, consisting of nonsynonymous single base substitutions, insertions and deletions, were evaluated for putative MHC class I neoantigens using the ImmunoSelect-R pipeline (Personal Genome Diagnostics) as previously described30. Somatic sequence alterations are listed in Supplementary Table 5.

### Identification of neoantigen-specific TCRVβ CDR3 clonotypes

We used the MANAFEST assay3 to evaluate T cell responsiveness to MANA and viral antigens. In brief, pools of MHC class I-restricted CMV, EBV and influenza peptide epitopes (CEFX, jpt Peptide Technologies), pools representing the matrix protein and nucleoprotein from H1N1 and H3N2 (jpt Peptide Technologies), and putative neoantigenic peptides defined by the ImmunoSelect-R pipeline (jpt Peptide Technologies; Supplementary Table 6) were each used to stimulate 250,000 T cells in vitro for 10 days as previously described3. The time point of peripheral blood collection used for each MANAFEST assay is described in Supplementary Tables 2, 7. In brief, on day 0, T cells were isolated from PBMC by negative selection (EasySep; STEMCELL Technologies). The T cell-negative fraction was co-cultured with an equal number of selected T cells in culture medium (IMDM/5% human serum with 50 μg ml−1 gentamicin) with 1 μg ml-1 relevant neoantigenic peptide, 1 μg ml−1 of an MHC class I-restricted CMV, EBV, and influenza peptide epitope pool (CEFX, jpt Peptide Technologies), 1 μg ml−1 of pools representing the matrix protein and nucleoprotein from H1N1 and H3N2 (jpt Peptide Technologies), or no peptide (to use as a reference for non-specific or background clonotypic expansion). On day 3, half the medium was replaced with fresh medium containing cytokines for a final concentration of 50 IU ml−1 IL-2 (Chiron), 25 ng ml−1 IL-7 (Miltenyi) and 25 ng ml−1 IL-15 (PeproTech). On day 7, half the medium was replaced with fresh culture medium containing cytokines for a final concentration of 100 IU ml−1 IL-2 and 25 ng ml−1 IL-7 and IL-15. On day 10, cells were harvested, washed twice with PBS, and the CD8+ fraction was isolated using a CD8+ negative enrichment kit (EasySep; STEMCELL Technologies). DNA was extracted from each CD8-enriched culture condition using the Qiamp micro-DNA kit according to the manufacturer’s instructions. TCR sequencing was performed on each individual peptide-stimulated T cell culture using survey-level sequencing (max depth ~60,000 reads) by Adaptive Biotechnologies using their established platform34 or by the Sidney Kimmel Comprehensive Cancer Center FEST and TCR Immunogenomics Core (FTIC) facility using the Oncomine TCR Beta short-read assay (Illumina) and sequenced on an Illumina iSeq 100 using unique dual indexes, for a maximum of ~40,000 reads per sample.

Data pre-processing was performed to eliminate non-productive TCR sequences and to align and trim the nucleotide sequences to obtain only the CDR3 region. Sequences not beginning with C or ending with F or W and having less than seven amino acids in the CDR3 were eliminated. TCR sequencing samples with less than 1,000 productive reads were excluded from downstream analysis. MD043-011-MANA_22 was the only such sample in the present study (see Supplementary Table 7). Resultant processed data files were uploaded to our publicly available MANAFEST analysis web app (http://www.stat-apps.onc.jhmi.edu/FEST) to bioinformatically identify antigen-specific T cell clonotypes.

Bioinformatic analysis of productive clones was performed to identify antigen-specific T cell clonotypes meeting the following criteria: (1) significant expansion (Fisher’s exact test with Benjamini–Hochberg correction for false discovery rate (FDR), P < 0.05) compared to T cells cultured without peptide, (2) significant expansion compared to every other peptide-stimulated culture (FDR <0.05) except for conditions stimulated with similar neoantigens derived from the same mutation, (3) an odds ratio >5 compared to the no peptide control, and (4) present in at least 10% of the cultured wells to ensure adequate distribution among culture wells. A lower read threshold of 300 was used for assays sequenced by the FTIC and a lower threshold of 30 was used for samples sequenced by Adaptive Biotechnologies. In MANAFEST assays testing less than 10 peptides or peptide pools, cultures were performed in triplicate and reactive clonotypes were defined as being significantly expanded relative to T cells cultured without peptide (FDR <0.05) in two out of three triplicates, and not significantly expanded in any other well tested. When available, TCRseq was also performed on DNA extracted from tumour, normal lung, and lymph node tissue obtained before treatment and at the time of surgical resection, as well as serial peripheral blood samples. The assays performed on each biospecimen are outlined in Supplementary Table 2.

### Peptide affinity and stability measurements

Peptide affinity for cognate HLA molecules was assessed using a luminescent oxygen channeling immunoassay (LOCI; AlphaScreen, Perkin Elmer) as previously described35. This is a proximity-based system using a donor and acceptor bead, each conjugated with an epitope tag. When the donor bead is excited with light at 650 nm and can activate an acceptor bead, resulting in a signal at 520–620 nm, which can be quantified per second as a surrogate of affinity. A higher number of counts per second indicates higher affinity of the peptide:HLA pair. The stability of peptide loaded complexes was measured by refolding MHC with peptide and subsequently challenging complexes with a titration of urea. The denaturation of MHC was monitored by ELISA as described previously36.

### TCR reconstruction and cloning

Ten MANAFEST+ TCR sequences for which the TCRα chain could be enumerated (>3 cells in single-cell data with the same TCRα–TCRβ pair) were selected for cloning. In addition, seven clones (from three individuals: MD01-004, MD01-005 and MD043-011) that have high composite signature (using the AddModuleScore function) consisting of differential gene programs of MANA-specific T cell relative to influenza-specific T cells in the TRM were selected for cloning. Relevant TCRs were analysed with the IMGT/V-Quest database (http://www.imgt.org). The database allows us to identify the TRAV and TRBV families with the highest likelihood to contain the identified segments which match the sequencing data. To generate the TCRs, the identified TCRA V-J region sequences were fused to the human TRA constant chain, and the TCRB V-D-J regions to the human TRB constant chain. The full-length TCRA and TCRB chains were then synthesized as individual gene blocks (IDT) and cloned into the pCI mammalian expression vector, containing a CMV promoter, and transformed into competent Escherichia coli cells according to the manufacturer’s instructions (NEBuilder HiFi DNA Assembly, NEB). Post transformation and plasmid miniprep, the plasmids were sent for Sanger sequencing to ensure no mutations were introduced (Genewiz).

### T cell transfection, transient TCR expression and MANA-recognition assays

To generate a Jurkat reporter cell in which we could transfer our TCRs of interest, the endogenous TCR α- and β-chains were knocked out of a specific Jurkat line that contains a luciferase reporter driven by an NFAT response element (Promega) using the Alt-R CRISPR system (Integrated DNA Technologies, IDT). Two sequential rounds of CRISPR knockout were performed using crDNA targeting the TCRα constant region (AGAGTCTCTCAGCTGGTACA) and the TCRβ constant region (AGAAGGTGGCCGAGACCCTC). Limiting dilution was then used to acquire single cell clones and clones with both TCRα and TCRβ knocked out, as confirmed by Sanger sequencing and restoration of CD3 expression only by the co-transfection of TCRα or TCRβ chains, were chose. CD8α and CD8β chains were then transduced into the TCRαTCRβ Jurkat reporter cells using the MSCV retroviral expression system (Clontech). Jurkat reporter cells were then co-electroporated with the pCI vector encoding the TCRB and TCRA gene blocks, respectively, using ECM830 Square wave electroporation system (BTX) at 275 V for 10 ms in OptiMem media in a 4-mm cuvette. Post electroporation, cells were rested overnight by incubating in in RPMI 10% FBS at 37 °C, 5% CO2. TCR expression was confirmed by flow cytometric staining for CD3 on a BD FACSCelesta and 50,000 CD3+ T cells were plated in each well of a 96-well plate. Reactivity of the TCR-transduced Jurkat cells was assessed by co-culturing with 1 × 105 autologous EBV-transformed B cells, loaded with titrating concentrations of MANA peptides, viral peptide pools or negative controls. After overnight incubation, activation of the NFAT reporter gene was measured by the Bio-Glo Luciferase Assay per manufacturer’s instructions (Promega). Jurkat cells were routinely tested for mycoplasma contamination. No cell line authentication was performed.

### COS-7 transfection with HLA allele and p53 plasmids

gBlocks (IDT) encoding HLA A*68:01, p53(R248L) and wild-type p53 were cloned into pcDNA3.4 vector (Thermo Fisher Scientific, A14697). COS-7 cells were transfected with plasmids at 70–80% confluency using Lipofectamine 3000 (Thermo Fisher Scientific, L3000015) and incubated at 37 °C overnight in T75 flasks. A total of 30 μg plasmid (1:1 ratio of HLA plasmid per target protein plasmid in co-transfections) was used. Post transfection, COS-7 cells were plated with TCRαβ-transfected JurkaT cells containing NFAT reporter gene at a 1:1 ratio. After overnight incubation, activation of the NFAT reporter gene was measured by the Bio-Glo Luciferase Assay per manufacturer’s instructions (Promega).

### Single-cell data pre-processing and quality control

Cell Ranger v3.1.0 was used to demultiplex the FASTQ reads, align them to the GRCh38 human transcriptome, and extract their cell and unique molecular identifier (UMI) barcodes. The output of this pipeline is a digital gene expression (DGE) matrix for each sample, which records the number of UMIs for each gene that are associated with each cell barcode. The quality of cells was then assessed based on (1) the number of genes detected per cell and (2) the proportion of mitochondrial gene/ribosomal gene counts. Low-quality cells were filtered if the number of detected genes was below 250 or above 3× the median absolute deviation away from the median gene number of all cells. Cells were filtered out if the proportion of mitochondrial gene counts was higher than 10% or the proportion of ribosomal genes was less than 10%. For single-cell VDJ sequencing, only cells with full-length sequences were retained. Dissociation/stress associated genes37,38, mitochondrial genes (annotated with the prefix ‘MT-’), high abundance lincRNA genes, genes linked with poorly supported transcriptional models (annotated with the prefix ‘RP-’)39 and TCR (TR) genes (TRA/TRB/TRD/TRG, to avoid clonotype bias) were removed from further analysis. In addition, genes that were expressed in less than five cells were excluded.

### Single-cell data integration and clustering

Seurat40 (3.1.5) was used to normalize the raw count data, identify highly variable features, scale features, and integrate samples. PCA was performed based on the 3,000 most variable features identified using the vst method implemented in Seurat. Gene features associated with type I Interferon (IFN) response, immunoglobulin genes and specific mitochondrial related genes were excluded from clustering to avoid cell subsets driven by the above genes39. Dimension reduction was done using the RunUMAP function. Cell markers were identified by using a two-sided Wilcoxon rank sum test. Genes with adjusted P <0.05 were retained. Clusters were labelled based on the expression of the top differential gene in each cluster as well as canonical immune cell markers. Global clustering on all CD3 T cells and refined clustering on CD8 T cells were performed using same procedure. To select for CD8+ T cells, SAVER41 was used to impute dropouts by borrowing information across similar genes and cells. A density curve was fitted to the log2-transformed SAVER imputed CD8A expression values (using the ‘density’ function in R) of all cells from all samples. A cut-off is determined as the trough of the bimodal density curve (that is, the first location where the first derivative is zero and the second derivative is positive). All cells with log2-transformed SAVER imputed CD8A expression larger than the cut-off are defined as CD8+ T cells. TRB amino acid sequences were used as a biological barcode to match MANA, EBV or influenza A-specific T cell clonotypes identified from the FEST assay with single-cell VDJ profile and were projected onto CD8+ T cell refined UMAP.

### Single-cell subset pseudobulk gene expression analysis

PCA was performed on a standardized pseudobulk gene expression profile, where each feature was standardized to have a mean of zero and unit variance. In global CD3 and CD8 TIL PCA, for each cell cluster we first aggregated read counts across cells within the cluster to produce a pseudobulk expression profile for each sample and normalized these pseudobulk expression profiles across samples by library size. Combat function in the sva R package42,43 was applied to address potential batch effects on the normalized pseudobulk profile. Highly variable genes (HVGs) were selected for each cell cluster by fitting a locally weighted scatterplot smoothing (LOESS) regression of standard deviation against the mean for each gene and identifying genes with positive residuals. For each sample, all cell clusters were then concatenated by retaining each cluster’s HVGs to construct a concatenated gene expression vector consisting of all highly variable features identified from different cell clusters. Each element in this vector represents the pseudobulk expression of a HVG in a cell cluster. Samples were embedded into the PCA space based on these concatenated gene expression vectors. Canonical correlation44,45 between the first two PCs (that is, PC1 and PC2) and a covariate of interest (that is, tissue type or response status) was calculated. Permutation test was used to assess the significance by randomly permuting the sample labels 10,000 times. In the MANA-specific PCA (Extended Data Fig. 11), MANA-enriched cell clusters, defined by clusters with MANA-specific T cell frequency at least two fold higher than randomly expected, were aggregated as one combined cell cluster. Then, a similar procedure by first identifying HVGs, computing the first 2 PCs and then calculating the canonical correlation was repeated for the combined MANA-enriched cell cluster and each of the other CD8 clusters.

### Differential analysis comparing MPR and non-MPR by total CD8 or CD4 TIL and by cell cluster

The gene expression read counts were adjusted by library size. SAVER41 was used to impute the dropouts, and further log2-transformed the imputed values after adding a pseudocount of 1. A linear mixed-effect model46 was constructed to identify genes that are significantly differential between MPR and non-MPR among total CD8/CD4 TIL and by each cell cluster, respectively. The B-H procedure47 was used to adjust the P values for multiple testing, and the statistical significance is determined using a cut-off of FDR <0.05.

### Differential-expression tests and antigen-specific T cell marker genes

Differential-expression tests for antigen-specific T cells were performed using FindAllMarkers functions in Seurat with Wilcoxon rank-sum test on SAVER imputed expression values. Genes with >0.25 log2-fold changes, at least 25% expressed in tested groups, and Bonferroni-corrected p values <0.05 were regarded as significantly differentially expressed genes (DEGs). Antigen-specific (MANA versus influenza versus EBV) T cell marker genes were identified by applying the differential expression tests for upregulated genes between cells of one antigen specificity to all other antigen-specific T cells in the dataset. MANA-specific T cell genes associated with response to ICB were identified by applying the differential expression tests comparing MANA-specific T cells from MPR versus those from non-MPR. Top ranked DEGs (by log-fold changes) with a log2-fold changes >0.8 and DEGs relating to T cell function were extracted for further visualization in a heat map using pheatmap package. SAVER-imputed expression values of selective marker genes (transcriptional regulators, memory markers, tissue-resident markers, T cell checkpoints, effector and activation markers) were plotted using the RidgePlot function in Seurat.

### In vitro short-term TIL stimulation with IL-7

Cryopreserved TIL from patient MD01-004 were thawed, counted, and stained with the LIVE/DEAD Fixable Aqua (ThermoFisher) viability marker and antibodies specific for CD3 (PE, clone SK1) and CD8 (BV786, clone RPA-T8). Thirty-thousand CD8+ T cells per condition were sorted on a BD FACSAria II Cell Sorter into a 96-well plate. Autologous T cell-depleted PBMC were added as antigen presenting cells (APC) at 1:1 ratio. The cells were stimulated with either influenza A or MD01-004-MANA 12 peptide and titrating concentrations of recombinant human IL-7 (Miltenyi) for 12 h in a round-bottomed 96-well plate.

### Gene expression analysis of IL-7-stimulated MANA- and influenza-specific TIL

Following 12 h of antigen and IL-7 stimulation, cells were spun down, counted and re-suspended in 1% BSA at desired concentration. scRNA-seq and VDJ libraries were prepared using 10X Chromium single cell platform using 5′ DGE library preparation reagents and kits according to manufacturer’s protocols (10X Genomics) and as described above. MANA- or influenza-specific T cell clonotypes from the single-cell dataset were identified by using the TRB amino acid sequences as a biological barcode. SAVER imputed gene expression was scaled and centred using the ScaleData function in Seurat. A composite score for the IL-7-upregulated gene set48 (Supplementary Data 2.3) expression was computed using the AddModuleScore function and subsequently visualized using ridgeplot. Mean ± standard error was used to show the dose–response curve of the IL-7-upregulated gene-set score by antigen-specific T cells and peptide-stimulation groups.

### Immune checkpoint and exhaustion score generation and highly correlated genes

To characterize dysfunctional CD8 MANA TIL, six best-characterized (and clinically targeted) checkpoints: CTLA4, PDCD1, LAG3, HAVCR2, TIGIT and ENTPD1, were used to compute the T cell checkpoint score, and a published gene list from exhausted T cells was used to compute the T cell exhaustion score, using AddModuleScore function in Seurat. Applying T cell checkpoint score as an anchor, genes that were maximally correlated to the score were identified using linear correlation in MANA-specific TIL from MPR and non-MPR, respectively. Top-30 genes (from HVG selected using FindVariableGenes function in Seurat and excluded the 6 genes included in immune checkpoint score generation) with the highest correlation coefficients were plotted as a bar plot. The difference of correlation coefficients of the above genes was additionally computed between MPR and non-MPR and visualized using waterfall plot.

### Evaluation of peripheral MANA-specific T cell transcriptome changes during treatment

Peripheral blood T cells from patient MD01-005 were sorted based on expression of CD8 and TCRVβ2, followed by scRNA-seq–TCR-seq and clustering on conventional CD8+ T cells (MAIT cells excluded). To evaluate whether there was a statistically significant change in the cell types of MANA cells between week 2 (W2) and week 4 (W4) samples in Fig. 4d, e, we first conducted a Fisher’s exact test, which yields a P = 0.021, indicating a statistically significant phenotype change in MANA-specific cells (Fig. 4e). We also conducted a more sophisticated test that adjust for potential background differences in cell type abundance between W2 and W4 samples. In this test, we let mc,t denote the probability that a MANA-specific T cell collected at time point t (W2 or W4) comes from cell type c, and let pc,t denote the proportion of all cells in time point t that come from cell type c. We evaluated the ratio Rc,t = mc,t/pc,t, which characterizes the relative abundance of MANA-specific T cells in each cell type. We compared the null model where this ratio does not change over time (H0: Rc,W2 = Rc,W4 for all cell type c) versus the alternative model where W2 and W4 T cells have different ratios (H1: Rc,W2Rc,W4). To do this, we computed the test statistic $$S=\sum _{c}{({R}_{c,W2}-{R}_{c,W4})}^{2}$$ using the observed data and compared it to its null distribution obtained using Monte Carlo simulations. To construct the null distribution for $$S$$, we pooled cells from W2 and W4 together and treated them as one sample to estimate the common ratio Rc,W2 = Rc,W4 = Rc shared by W2 and W4, and then derived the probability that a MANA-specific T cell collected at time point t comes from cell type c under the null model H0, which is proportional to pc,t Rc (that is, the product of the sample-specific background cell type proportion pc,t and the common MANA-abundance ratio Rc shared between samples). The MANA-specific T cells at time point t were then redistributed to different cell types randomly based on a multinomial distribution with this expected MANA-specific T cell type proportion (that is, the expected probability that a MANA-specific T cell at time point t comes from cell type c under H0 is $${p}_{c,t}{R}_{c}/(\sum _{c{\prime} }{p}_{{c}^{{\prime} },t}{R}_{{c}^{{\prime} }})$$), while keeping the total number of MANA-specific T cells at each time point the same as the observed MANA-specific T cell number at that time point. The test statistic S was then computed using this simulated sample. We repeated this simulation 10,000 times to derive the null distribution of S. Comparing the observed S to its null distribution yields a P < 10−4.

### RNA velocity-based differentiation-trajectory tracing

The RNA velocity analysis was performed by first recounting the spliced reads and unspliced reads based on aligned bam files of scRNA-seq data using the velocyto Python package. The calculation of RNA velocity values for each gene in each cell and embedding RNA velocity vector to low-dimension space were done using the SeuratWapper workflow for estimating RNA velocity using Seurat (https://github.com/satijalab/seurat-wrappers/blob/master/docs/velocity.md). The first two diffusion components from Diffusion map were used to construct the coordinates along with velocity. TSCAN (v.1.7.0) was used to reconstruct the cellular pseudotime on diffusion maps space for the PBMC T cells from three time points (samples) of one patient (MD01-005). Based on velocity analysis, the Tmem(3) cluster was specified as the starting cluster for the pseudotemporal trajectory which has branches. For each branch, log2-transformed and library size-normalized SAVER-imputed gene expression values were used for analysing gene expression dynamics along the pseudotime. 10,325 genes with normalized expression ≥ 0.01 in at least 1% of cells were retained. For each gene g, the gene expression along pseudotime t in each sample s was described as a function $${f}_{gs}(t)$$ which was obtained by fitting B-spline regression to the gene’s normalized expression values in single cells. The red curves in Fig. 4h are the mean of the function $${f}_{gs}(t)$$ of the three samples. In order to test whether the gene expression shows a significant change along pseudotime, we compared the above model with a null model in which $${f}_{gs}(t)$$ is assumed to be a constant over time. The likelihood ratio statistic between the two models was computed. To determine the P value, the null distribution of the likelihood ratio statistic was constructed by permuting the pseudotime of cells in each sample, refitting the models and recomputing the likelihood ratio statistic. The P value was calculated as the number of permutations out of a total of 1,000 permutations that produce a likelihood ratio statistic larger than the observed one. The P values from all genes were converted to FDR by Benjamini–Hochberg procedure to adjust for multiple testing. Genes with FDR < 0.05 were considered as dynamic genes with statistical significance. k-Means clustering was applied to group genes with similar dynamic expression patterns into clusters. topGO (v.2.42.0) was used to identify the enriched Gene Ontology terms by comparing the genes in each cluster to all 10,325 genes as background.

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Top