CT Radiomics in Endometrial Cancer: Principles, Applications, and Future Directions for Precision Oncology

Jacob Howard Jan 12, 2026 217

This article provides a comprehensive overview of CT radiomics for endometrial cancer, tailored for researchers and drug development professionals.

Abstract

This article provides a comprehensive overview of CT radiomics for endometrial cancer, tailored for researchers and drug development professionals. It explores the foundational principles of converting standard-of-care CT images into mineable, high-dimensional data for tumor phenotyping. The methodological section details image acquisition, segmentation, feature extraction, and analytical pipelines essential for biomarker discovery. We address common challenges in reproducibility, standardization, and data harmonization, offering optimization strategies for robust model development. Finally, the article evaluates the current evidence for radiomics in predicting histology, grade, molecular subtypes, lymphovascular space invasion, and treatment response, comparing its potential to traditional imaging and genomic assays. The synthesis aims to guide the integration of radiomics into translational research and clinical trial design for personalized therapy in endometrial cancer.

Demystifying CT Radiomics: The Core Concepts and Rationale for Endometrial Cancer Research

What is Radiomics? Defining the Data-to-Knowledge Pipeline in Oncology

This whitepaper defines radiomics and its data-to-knowledge pipeline as part of a broader thesis on the basic principles of CT radiomics for endometrial cancer. Radiomics is the high-throughput extraction of quantitative features from medical images, transforming standard-of-care imaging into mineable data. In endometrial cancer research, it aims to uncover disease characteristics that correlate with clinical outcomes, molecular phenotypes, and treatment response, and that are not accessible to visual assessment alone.

The Radiomics Data-to-Knowledge Pipeline

The pipeline is a multi-step, standardized process for converting raw imaging data into clinically actionable knowledge.

Table 1: Key Stages of the Radiomics Pipeline
Stage | Primary Input | Core Action | Key Output
1. Image Acquisition | Patient | CT Scan Acquisition | DICOM Images
2. Tumor Segmentation | DICOM Images | Manual/AI-based ROI Delineation | 3D Volume of Interest (VOI)
3. Feature Extraction | Segmented VOI | Algorithmic Feature Computation | Radiomic Feature Vector (500-2000+ features)
4. Data Curation & Analysis | Feature Vector | Pre-processing, Dimensionality Reduction, Model Building | Predictive or Prognostic Model
5. Clinical Validation | Trained Model | Testing in Independent, Prospective Cohorts | Validated Biomarker

Detailed Methodologies for Key Experiments

Protocol: Standardized CT Acquisition for Endometrial Cancer Radiomics
  • Objective: Ensure reproducible, high-quality image data for feature extraction.
  • Patient Preparation: Fasting 4-6 hours prior; oral contrast per institution protocol.
  • Scanner Parameters: Use consistent CT scanner model and reconstruction kernel across the cohort. Recommended: 120 kVp; tube current modulation for consistent noise index; slice thickness ≤ 3 mm (preferably 1-1.5 mm); standard soft-tissue reconstruction algorithm.
  • Phantom Use: Scan a standardized phantom with the same protocol to monitor scanner stability and enable feature harmonization (e.g., Catphan 600 for general image quality, or a dedicated texture phantom such as the Credence Cartridge Radiomics phantom).
Protocol: Tumor Segmentation and Feature Extraction
  • Segmentation: Delineate the primary endometrial tumor volume on each axial slice manually (expert radiologist) or using a semi-automated tool (e.g., 3D Slicer, ITK-SNAP). Segment the entire 3D tumor volume (VOI), avoiding necrotic areas and adjacent normal tissue.
  • Feature Extraction Software: Use open-source platforms like PyRadiomics (most cited) or proprietary solutions.
  • Feature Classes Extracted:
    • First-Order Statistics: Intensity histogram features (e.g., mean, skewness, kurtosis).
    • Shape-based Features: 3D descriptors of tumor geometry (e.g., volume, sphericity, surface area).
    • Texture Features: Patterns of pixel intensities (e.g., Gray Level Co-occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM)).
    • Higher-Order Features: Features from filtered images (e.g., Wavelet, Laplacian of Gaussian).
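As a rough illustration of the first-order class listed above, the sketch below computes mean, skewness, and kurtosis from a list of voxel intensities. It is a minimal standard-library example, not the PyRadiomics implementation; the function name `first_order_stats` and the HU values are hypothetical.

```python
def first_order_stats(values):
    """Return mean, skewness, and kurtosis of the voxel intensities in a VOI."""
    n = len(values)
    mean = sum(values) / n
    # Central moments in population form (dividing by n)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("constant intensities: skewness and kurtosis are undefined")
    return {
        "mean": mean,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # non-excess kurtosis (a normal distribution gives 3)
    }

# Hypothetical HU values sampled from a tumor VOI (symmetric around 50 HU)
voi_hu = [38, 42, 45, 47, 50, 53, 55, 58, 62]
stats = first_order_stats(voi_hu)
print(stats)
```

Because the example intensities are symmetric around their mean, the skewness comes out as zero; a tumor with a tail of low-attenuation voxels would give a negative value.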
Protocol: Model Development and Validation
  • Cohort Splitting: Split patient cohort into training (e.g., 70%) and hold-out test (e.g., 30%) sets. Internal validation via bootstrapping or cross-validation on the training set.
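  • Feature Preprocessing: Normalize features (Z-score) using the mean and standard deviation estimated on the training set only, then apply the same transformation to the test set. Apply ComBat or similar algorithms for multicenter harmonization.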
  • Feature Selection: Apply LASSO (Least Absolute Shrinkage and Selection Operator) regression or Recursive Feature Elimination to reduce dimensionality and select the most predictive, non-redundant features.
  • Model Building: Train a classifier (e.g., Logistic Regression, Random Forest, Support Vector Machine) using selected features to predict an endpoint (e.g., lymphovascular space invasion (LVSI), recurrence risk, molecular subtype).
  • Validation: Test the locked model on the independent hold-out test set. Report performance metrics: AUC, accuracy, sensitivity, specificity, PPV, NPV.
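The performance metrics named in the validation step can be sketched as follows. This is a minimal standard-library example under stated assumptions: labels are coded 1 (e.g., LVSI present) and 0, the 0.5 decision threshold is arbitrary, and the labels and scores are made up for illustration. AUC is computed as the proportion of positive/negative pairs ranked correctly, with ties counted as one half.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, NPV, and accuracy for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # No zero-division guards: assumes both classes and both predictions occur
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / len(pairs),
    }

def auc(y_true, scores):
    """Probability that a random positive case scores higher than a random negative case."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical hold-out test set: true labels and model output probabilities
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = confusion_metrics(y_true, y_pred)
test_auc = auc(y_true, scores)
print(metrics, test_auc)
```

AUC is threshold-independent, while the other five metrics depend on the chosen cut-off, so the cut-off should be fixed on the training set before the test set is evaluated.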

Visualizing the Radiomics Workflow and Biology Integration

[Figure: Radiomics pipeline in endometrial cancer research. Image domain: CT scan -> (DICOM data) -> segmentation -> (3D VOI) -> feature extraction. The feature vector and clinical data (e.g., LVSI status) feed data integration -> (machine learning) -> model -> (validation) -> biomarker -> decision support (e.g., risk stratification).]

Radiomics Pipeline from CT Scan to Clinical Decision Support

[Figure: Radiomics links image phenotype to tumor biology. Radiomic phenotype (texture heterogeneity, edge sharpness, enhancement pattern) correlates with tumor biology (cellularity and necrosis, angiogenesis, molecular subtype: POLE, p53, MMR), e.g., through high heterogeneity, and predicts clinical endpoints (LVSI status, recurrence risk, treatment response). Tumor biology drives the clinical endpoints.]

Radiomics Connects Image Features to Tumor Biology and Outcomes

The Scientist's Toolkit: Key Research Reagent Solutions

Table 2: Essential Tools for CT Radiomics Research in Endometrial Cancer
Item / Solution | Category | Function in Research
Standardized CT Texture Phantom | Quality Control | Monitors scanner performance over time, enabling feature harmonization across different scanners and institutions.
3D Slicer / ITK-SNAP | Segmentation Software | Open-source platforms for manual, semi-automatic, or automatic 3D segmentation of tumor volumes (VOI).
PyRadiomics (Python) | Feature Extraction | The most widely used open-source library for extracting a comprehensive set of standardized radiomic features.
ComBat Harmonization | Statistical Tool | Algorithm for removing non-biological, site-specific variances from radiomic feature data in multi-center studies.
LASSO Regression | Statistical Model | A feature selection method that penalizes complexity, helping to identify the most predictive radiomic signatures.
TCIA (The Cancer Imaging Archive) | Data Repository | Public access to curated, de-identified medical images (including endometrial cancer) linked to clinical data.
R or Python (scikit-learn) | Analytics Environment | Programming environments with extensive statistical and machine learning libraries for model development.
TRIPOD+AI Statement | Reporting Guideline | A checklist for transparent and complete reporting of radiomics prediction model studies.

This whitepaper, part of a broader thesis on the basic principles of CT radiomics in endometrial cancer, provides an in-depth technical guide. It details the mechanistic links between non-invasive imaging features, the dynamic tumor microenvironment (TME), and underlying genomic alterations. For researchers and drug development professionals, this document outlines the experimental and computational frameworks necessary to validate these connections and translate them into predictive biomarkers.

Table 1: Key Radiomic Features Correlated with TME Characteristics in Endometrial Cancer

Radiomic Feature Category | Example Feature(s) | Correlated TME/Genomic Element | Correlation Coefficient (Range) | Reported P-value | Primary Study (Year)
Texture - Heterogeneity | Gray-Level Co-occurrence Matrix (GLCM) Entropy, Contrast | Intratumoral CD8+ T-cell Density | 0.45 - 0.68 | <0.05 - <0.001 | Wu et al. (2023)
Shape - Morphology | Sphericity, Surface Area to Volume Ratio | Tumor Stromal Percentage | -0.52 to 0.61 | <0.01 | Li et al. (2024)
Intensity Histogram | Skewness, Kurtosis | HIF-1α Expression (Hypoxia) | 0.38 - 0.55 | <0.05 | Park et al. (2023)
Wavelet-Enhanced Features | HLH-GLCM Correlation | POLE Exonuclease Domain Mutation Status | AUC: 0.82 | <0.001 | Roberts et al. (2024)
Fractal Dimension | Box-Counting Dimension | Microvessel Density (CD31+) | 0.49 | 0.003 | Alvarez et al. (2023)

Table 2: Genomic Alterations Linked to Distinct Imaging Phenotypes in Endometrial Carcinoma

Genomic Alteration / Molecular Subtype | Associated Imaging Phenotype on CT | Proposed Biological Driver in TME | Prevalence in EC | Reference
POLE (ultramutated) | Homogeneous texture, well-defined margins | High neoantigen load, robust immune infiltrate | 7-12% | Jamieson et al. (2023)
MMR-D (hypermutated) | Moderate heterogeneity, peritumoral edema | Immune cell exclusion, stromal activation | 20-30% | Chen et al. (2024)
p53-abnormal (serous-like) | Necrotic core, irregular/infiltrative margins | Necrosis, angiogenesis, immunosuppressive fibroblasts | 10-20% | O’Brien et al. (2023)
No Specific Molecular Profile (NSMP) | Variable, often moderate heterogeneity | Diverse, often hormone-driven | 40-50% | NCCN Guidelines (2024)
CTNNB1 mutation | Dense, hyperattenuating mass on CT | β-catenin signaling, altered cell adhesion | 20-25% of endometrioid endometrial carcinomas (EECs) | Garg et al. (2023)

Detailed Experimental Protocols

Protocol: Integrated Radiomics-TME-Genomics Pipeline for Endometrial Cancer

A. Pre-Operative CT Image Acquisition & Segmentation

  • Image Acquisition: Acquire portal venous phase abdominal-pelvic CT scans with slice thickness ≤3 mm. Standardize the reconstruction kernel (e.g., soft tissue). Use phantom calibration to verify HU consistency.
  • Tumor Segmentation: Two expert radiologists segment the primary endometrial lesion manually or semi-automatically in 3D Slicer (v5.2+). Delineate necrotic or hemorrhagic areas as separate volumes of interest (VOIs) so that they can be excluded from the tumor VOI. Compute intra- and inter-observer Dice coefficients (>0.85 required).
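The Dice coefficient used as the agreement criterion above can be sketched in a few lines. This minimal example assumes each segmentation is represented as a set of (x, y, z) voxel indices; the two box-shaped masks are hypothetical stand-ins for two readers' contours.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity between two segmentations given as collections of voxel indices."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # two empty masks agree trivially
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical masks: reader 2 is shifted by one voxel along x relative to reader 1
reader_1 = {(x, y, z) for x in range(10) for y in range(10) for z in range(5)}
reader_2 = {(x, y, z) for x in range(1, 11) for y in range(10) for z in range(5)}

dice = dice_coefficient(reader_1, reader_2)
print(f"Dice = {dice:.2f}, passes 0.85 threshold: {dice > 0.85}")
```

In practice the masks would be loaded from the segmentation files exported by the contouring tool; the set-based representation here only keeps the arithmetic visible.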

B. Radiomic Feature Extraction & Stability Analysis

  • Feature Extraction: Use PyRadiomics (v3.0.1) to extract ~1,300 features per VOI across seven classes: shape, first-order statistics, GLCM, GLRLM, GLSZM, NGTDM, and GLDM.
  • Feature Stability: Perform test-retest analysis on a subset of patients with short-interval repeat scans. Apply Intraclass Correlation Coefficient (ICC > 0.8) and coefficient of variation (CV < 10%) to select robust features.
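The stability filter above can be illustrated for a single feature. The text does not state which ICC form is intended; the sketch below assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement) and defines CV as the average per-patient coefficient of variation across repeat scans. The test-retest values are invented for illustration.

```python
import statistics

def icc_2_1(data):
    """ICC(2,1) for a table with one row per patient and one column per repeat scan."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ms_rows = ss_rows / (n - 1)                      # between-patient mean square
    ms_cols = ss_cols / (k - 1)                      # between-scan mean square
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

def mean_within_patient_cv(data):
    """Average per-patient coefficient of variation across repeat scans, in percent."""
    return 100 * statistics.mean(
        statistics.stdev(row) / abs(statistics.mean(row)) for row in data
    )

# Hypothetical values of one feature: five patients, scan and short-interval rescan
test_retest = [[10.0, 10.4], [12.0, 11.8], [15.0, 15.3], [9.0, 9.1], [20.0, 19.6]]
icc = icc_2_1(test_retest)
cv = mean_within_patient_cv(test_retest)
is_robust = icc > 0.8 and cv < 10
print(round(icc, 3), round(cv, 2), is_robust)
```

Running the same check over every extracted feature and keeping only those that pass both thresholds yields the robust feature set used downstream.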

C. Ex Vivo Tissue Processing & Multi-Omic Analysis

  • Tissue Collection: Obtain matched tumor tissue post-hysterectomy within 30 minutes. Divide specimen into three aliquots: fresh for flow cytometry, OCT-embed for IF, RNAlater for sequencing.
  • TME Profiling (Multiplex Immunofluorescence):
    • Stain 4μm FFPE sections with validated antibody panel (e.g., CD8, CD68, FOXP3, PanCK, DAPI) using Opal 7-Color kit.
    • Image with Vectra Polaris scanner. Use inForm software for cell segmentation (DAPI) and phenotyping. Calculate cell densities and spatial metrics (e.g., distance to nearest neighbor).
  • Genomic Sequencing:
    • Extract DNA/RNA from macro-dissected tumor areas.
    • Perform targeted NGS using a custom panel covering POLE, PTEN, PIK3CA, ARID1A, CTNNB1, TP53, MMR genes, and MSI status.
    • RNA-seq for gene expression profiling (e.g., estimate immune/stromal scores via ESTIMATE algorithm).

D. Statistical Integration & Modeling

  • Association Analysis: Use Spearman’s rank correlation for radiomics-TME continuous variables. Mann-Whitney U/Kruskal-Wallis tests for genomic subgroups.
  • Predictive Modeling: Apply LASSO regression for high-dimensional radiomic feature selection to predict a genomic or TME endpoint. Validate model performance in a hold-out test set using ROC analysis (AUC, sensitivity, specificity).
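For the association analysis, Spearman's rank correlation can be written out directly as the Pearson correlation of the ranks, with tied values given their average rank. The sketch below uses only the standard library; the paired values (a texture feature against a CD8+ density) are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical paired measurements for five tumors
glcm_entropy = [4.1, 4.8, 5.2, 5.9, 6.3]
cd8_density = [120, 180, 150, 260, 310]  # cells per mm^2
rho = spearman_rho(glcm_entropy, cd8_density)
print(round(rho, 3))
```

A rank-based measure is a reasonable default here because radiomic features and cell densities are rarely normally distributed and their relationship need not be linear.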

Signaling Pathways and Workflow Visualizations

[Figure: three linked domains. Imaging phenotype (CT radiomics) to tumor microenvironment: texture (heterogeneity) -> immune infiltrate (CD8+/CD68+ density), ρ = 0.65; shape (sphericity) -> fibrosis (collagen deposition), ρ = -0.52; intensity (kurtosis) -> hypoxia (HIF-1α expression), ρ = 0.55. Tumor microenvironment to genomic drivers: immune infiltrate -> POLE mutation (ultramutated), labeled "drives"; immune infiltrate -> MMR-D/MSI-H (hypermutated), labeled "modulates"; hypoxia -> TP53 mutation (serous-like), labeled "associated"; fibrosis -> CTNNB1 mutation (NSMP subset), labeled "associated". Genomic drivers back to imaging: POLE mutation manifests as texture; TP53 mutation manifests as intensity. Angiogenesis (microvessel density) appears as an unconnected microenvironment node.]

Diagram 1: Linking Imaging Phenotype to TME and Genomics

[Figure: patient with EC on CT -> 3D tumor segmentation -> radiomic feature extraction (PyRadiomics) -> data integration and machine learning model. In parallel: surgical resection and tissue collection -> multi-omic tissue analysis -> data integration and machine learning model. Final step: validation and biomarker output.]

Diagram 2: Integrated Radiomics-TME-Genomics Workflow

The Scientist's Toolkit: Key Research Reagent Solutions

Table 3: Essential Reagents and Tools for Integrated Radiomics-TME-Genomics Studies

Item Name | Category | Function / Application | Example Vendor / Product Code
PyRadiomics (v3.0.1) | Software Library | Open-source Python package for standardized extraction of radiomic features from medical images. | GitHub - Radiomics/pyradiomics
3D Slicer | Software | Open-source platform for medical image informatics, processing, and 3D visualization/segmentation. | www.slicer.org
Opal 7-Color Automation IHC Kit | Reagent Kit | Enables multiplex immunofluorescence staining for simultaneous detection of 7 markers on a single FFPE section (e.g., immune, stromal, tumor markers). | Akoya Biosciences (NEL821001KT)
Panoramic Tissue Microarray (TMA) | Tissue Platform | High-throughput analysis of multiple tumor cores on a single slide for validation of TME markers across a cohort. | Folio Biosciences (Custom Service)
TruSight Oncology 500 (TSO500) | Sequencing Assay | Comprehensive NGS assay for detection of key somatic variants, TMB, and MSI from tumor DNA/RNA. | Illumina (20028224)
ESTIMATE Algorithm | Computational Tool | Infers the fraction of stromal and immune cells in tumor samples using gene expression data. | R package "estimate"
Cell DIVE Multiplex Imaging Solution | Imaging Platform | Enables iterative staining and imaging for ultra-high-plex (60+) biomarker analysis on a single tissue section. | Leica Microsystems
CytAssist Instrument | Staining Platform | Automates spatial transcriptomics or targeted protein detection from FFPE sections onto Visium slides. | 10x Genomics (1000314)

Why CT? Leveraging Ubiquitous Standard-of-Care Imaging for Quantitative Analysis

Computed Tomography (CT) is a cornerstone of modern medical imaging, offering non-invasive, high-resolution, three-dimensional visualization of internal anatomy. Its utility in endometrial cancer, primarily for staging and detecting recurrence, is well-established. The core thesis of contemporary research is to transcend this qualitative, morphological assessment by leveraging CT as a quantitative data source. This whitepaper details the principles and methodologies for extracting high-dimensional data—radiomic features—from standard-of-care CT images to build predictive models for tumor phenotype, genotype, and clinical outcome in endometrial cancer.

The Quantitative CT Data Pipeline

The transformation of a standard CT scan into a mineable data set involves a multi-step, computational workflow. The reliability of downstream analysis is contingent upon rigorous protocol adherence at each stage.

Table 1: Key Stages in the CT Radiomics Pipeline
Stage | Primary Objective | Key Considerations for Endometrial Cancer
Image Acquisition & Reconstruction | Generate consistent, high-quality DICOM images. | Slice thickness (<3 mm), intravenous contrast phase (portal venous), reconstruction kernel (soft tissue).
Tumor Segmentation | Delineate the 3D volume of interest (VOI). | Manual vs. semi-automatic (supervised) methods; encompassing the primary tumor; excluding necrotic regions and blood vessels.
Image Pre-processing | Standardize image geometry and intensity. | Voxel resampling (e.g., 1x1x1 mm³), intensity discretization (fixed bin number/width), noise reduction filters.
Feature Extraction | Compute quantitative descriptors of the VOI. | Categories: Shape, First-Order Statistics, Texture (GLCM, GLRLM, GLSZM, NGTDM), Wavelet-filtered features.
Feature Selection & Analysis | Identify robust, non-redundant features linked to biology. | Mitigate overfitting via ICC analysis, correlation filters, and LASSO/mRMR; then build ML models (e.g., SVM, Random Forest).
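To make the texture category in the table more concrete, the sketch below builds a gray-level co-occurrence matrix (GLCM) for a tiny 2D image and derives two common features from it. It is a simplified illustration under stated assumptions (one offset, 2D, already discretized gray levels, symmetric and normalized matrix), not the PyRadiomics implementation, which aggregates over multiple directions in 3D. The function names and test images are hypothetical.

```python
import math
from collections import Counter

def glcm(image, dx=1, dy=0):
    """Symmetric, normalized GLCM of a 2D gray-level image for a single offset."""
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[(i, j)] += 1
                counts[(j, i)] += 1  # count both orders so the matrix is symmetric
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def glcm_contrast(p):
    """Weights co-occurrences by the squared gray-level difference."""
    return sum(prob * (i - j) ** 2 for (i, j), prob in p.items())

def glcm_entropy(p):
    """Joint entropy in bits; higher values indicate less predictable texture."""
    return -sum(prob * math.log2(prob) for prob in p.values() if prob > 0)

homogeneous = [[1, 1, 1], [1, 1, 1]]
checkerboard = [[1, 2, 1], [2, 1, 2]]

print(glcm_contrast(glcm(homogeneous)), glcm_entropy(glcm(homogeneous)))
print(glcm_contrast(glcm(checkerboard)), glcm_entropy(glcm(checkerboard)))
```

The uniform image gives zero contrast and zero entropy, while the alternating pattern gives nonzero values for both, which is the intuition behind using these features as heterogeneity descriptors.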

Experimental Protocols for Endometrial Cancer Radiomics

Protocol 2.1: Multi-Center Retrospective Feature Extraction & Stability Analysis
  • Objective: To identify CT radiomic features stable across varied acquisition parameters.
  • Patient Cohort: Retrospective, multi-institutional. Example: 150 endometrial cancer patients with pre-treatment contrast-enhanced CT from 3 centers.
  • Segmentation: Manual delineation of primary endometrial tumor by two expert radiologists using 3D Slicer software. Consensus segmentations generated.
  • Pre-processing: All images resampled to isotropic 1.0 × 1.0 × 1.0 mm voxels. Intensities discretized using a fixed bin width of 25 HU.
  • Feature Extraction: Extraction of ~1300 features per tumor using PyRadiomics library, including LoG-filtered and wavelet decompositions.
  • Stability Assessment: Calculate Intra-class Correlation Coefficient (ICC > 0.75) for inter-observer segmentation and inter-scanner reproducibility. Only stable features proceed.
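The fixed-bin-width discretization referred to in the pre-processing step can be sketched as below. The formula (floor of intensity over bin width, shifted so the lowest occupied bin is 1) follows the fixed-bin-width scheme as it is documented for PyRadiomics, to the best of my understanding; the function name and HU values are hypothetical.

```python
import math

def discretize_fixed_bin_width(hu_values, bin_width=25):
    """Map HU intensities to integer gray levels using a fixed bin width.

    Bin edges are multiples of bin_width, and the lowest occupied bin is labeled 1.
    """
    offset = math.floor(min(hu_values) / bin_width)
    return [math.floor(v / bin_width) - offset + 1 for v in hu_values]

# Hypothetical intensities inside a tumor VOI
voi_hu = [-10, 0, 24, 25, 49, 50, 110]
gray_levels = discretize_fixed_bin_width(voi_hu, bin_width=25)
print(gray_levels)
```

With a fixed bin width, the number of gray levels depends on the intensity range of each tumor, whereas a fixed bin count would instead vary the bin width per tumor. Fixed width keeps the HU meaning of each bin comparable across patients, which is one reason it is often preferred for CT.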
Protocol 2.2: Linking Radiomic Phenotypes to Molecular Subtypes (TCGA)
  • Objective: To correlate stable radiomic signatures with the four molecular subgroups of endometrial cancer (POLE-mutated, MSI-Hypermutated, Copy-Number Low, Copy-Number High).
  • Cohort: Patients with available pre-surgery CT and validated molecular classification.
  • Methodology: After stable feature set is identified (Protocol 2.1), perform unsupervised clustering (e.g., consensus clustering) to derive intrinsic radiomic subtypes. Use supervised machine learning (e.g., Random Forest) to classify known molecular subtypes. Validate model performance using a hold-out test set via AUC-ROC analysis.
  • Statistical Validation: Multivariate analysis correcting for clinical stage and histology.

Visualizing Workflows and Biological Correlates

[Figure: standard-of-care CT scan (DICOM) -> 3D tumor segmentation -> image pre-processing -> radiomic feature extraction -> structured feature database -> predictive model (e.g., for molecular subtype) -> validation and interpretation. Clinical and molecular ground truth also feeds the predictive model.]

Title: Radiomics Analysis Pipeline from CT to Prediction

[Figure: CT image phenotype (e.g., heterogeneity) reflects the tumor microenvironment (hypoxia, fibrosis, lymphocyte infiltration) and predicts clinical outcome (stage, grade, recurrence risk). The tumor microenvironment is driven by molecular alterations (e.g., PI3K/AKT activation, Wnt/β-catenin, microsatellite instability), which determine clinical outcome.]

Title: Proposed Link Between CT Features and Tumor Biology

The Scientist's Toolkit: Research Reagent Solutions

Table 2: Essential Tools for CT Radiomics Research in Endometrial Cancer
Item / Solution | Function in Research | Example / Note
3D Slicer | Open-source platform for medical image visualization, segmentation, and analysis. | Primary tool for manual/semi-automatic contouring of endometrial tumors. Supports DICOM import.
PyRadiomics | Open-source Python package for extraction of a comprehensive set of radiomic features. | Enables reproducible batch processing. Feature definitions largely follow IBSI standards.
ITK-SNAP | Specialized software for interactive segmentation of structures in 3D medical images. | Alternative for detailed manual segmentation with active contour functionality.
Python/R Libraries (scikit-learn, caret) | Machine learning and statistical analysis environments for feature selection and model building. | Used for LASSO regression, Random Forest, SVM, and survival analysis (Cox models).
The Cancer Imaging Archive (TCIA) | Public repository of medical images (often linked to TCGA cohorts such as TCGA-UCEC). | Source for de-identified, research-ready CT image datasets with associated clinical data.
DICOM Anonymizer Tools | Ensure patient privacy by removing protected health information (PHI) from image headers. | Critical for ethical retrospective research and data sharing between institutions.
High-Performance Computing (HPC) Cluster | Provides computational power for batch processing, feature extraction, and complex model training. | Necessary for large cohort studies involving wavelet transformations and deep learning.

Endometrial cancer (EC) management faces persistent clinical questions regarding pre-operative risk stratification, detection of occult disease, and prediction of treatment response. This whitepaper, framed within a broader thesis on CT radiomics basic principles, examines how radiomic feature extraction from standard-of-care imaging can provide non-invasive, quantitative biomarkers to address these challenges. We detail experimental protocols, data synthesis, and pathway visualizations to guide translational research.

The primary clinical decision points in EC involve histologic grading, myometrial invasion (MI) depth, lymphovascular space invasion (LVSI) status, lymph node metastasis (LNM), and molecular classification (e.g., POLE-mutated, MMR-d, p53abn, NSMP). Current imaging, primarily MRI, has limitations in specificity and reproducibility. Radiomics, the high-throughput extraction of mineable data from medical images, can decode tumor phenotypic patterns invisible to the human eye. Integrated with clinical and molecular data within machine learning (ML) models, radiomics offers a pathway to refined prognostic and predictive tools.

Quantitative Synthesis of Recent Radiomic Evidence

Table 1: Key Performance Metrics of Radiomic Models for Critical Clinical Questions in EC (Based on Recent Meta-Analyses & Studies)

Clinical Question | Imaging Modality | Key Radiomic Features Involved | Reported AUC Range | Sample Size (Range) | Primary Limitation
High-Grade vs. Low-Grade | MRI (T2, DCE), CT | Texture (GLCM-Dissimilarity, GLRLM-LGLRE), Shape (Sphericity), Wavelet-HLH | 0.83 - 0.91 | 80 - 320 | Generalizability across MRI scanners/protocols
Deep (≥50%) Myometrial Invasion | MRI (T2, ADC) | First-Order (Kurtosis), Texture (GLSZM-ZoneVariance), Shape (Maximum 2D Diameter) | 0.86 - 0.94 | 110 - 415 | Distortion from tumor/lumen interface
Lymph Node Metastasis | CT, MRI (T2, DWI) | Intensity Histogram (Skewness), Texture (GLDM-DependenceVariance), Shape (Compactness) | 0.78 - 0.89 | 150 - 280 | Difficulty detecting micro-metastases
LVSI Presence | MRI (ADC, DCE) | Texture (GLCM-Correlation, NGTDM-Coarseness), First-Order (90th Percentile) | 0.81 - 0.88 | 95 - 210 | Confounding by peritumoral inflammation
Molecular Classification (p53abn) | MRI (T2, ADC) | Wavelet-LHL (First-Order Energy), Shape (Sphericity), Texture (GLRLM-RunEntropy) | 0.77 - 0.85 | 180 - 350 | Cohort size for rarer subtypes (e.g., POLE)

Table 2: Standardized Radiomics Workflow Protocol Checklist

Workflow Stage | Critical Steps | Recommended Tools/Software | Quality Control Checkpoint
1. Image Acquisition | Standardized protocol (slice thickness ≤3 mm for CT/MRI). | Institutional scanner protocols. | Phantom scanning for harmonization.
2. Tumor Segmentation | Manual delineation by expert radiologist (gold standard) or semi-automatic methods. | 3D Slicer, ITK-SNAP, open-source AI tools. | Inter-observer Dice Coefficient >0.80.
3. Preprocessing | Resampling to isotropic voxels, intensity discretization (fixed bin width=25). | PyRadiomics image processing module. | Check for introduced artifacts.
4. Feature Extraction | Extract features per IBSI guidelines: First-order, Shape (2D/3D), Texture, Filters. | PyRadiomics, MaZda, Custom MATLAB. | Test stability on phantom/test-retest.
5. Feature Selection | Remove non-robust features (ICC<0.8). Apply LASSO, mRMR, or RFE. | Scikit-learn, R caret. | Avoid data leakage; use training set only.
6. Model Building | Train classifier (e.g., SVM, RF, XGBoost) on selected features. 5-fold cross-validation. | Scikit-learn, XGBoost, PyTorch. | Optimize hyperparameters via grid search.
7. Validation | Internal validation (bootstrapping). External validation on independent cohort. | Compare AUC, accuracy, calibration. | Report 95% confidence intervals.
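The final checklist item, reporting 95% confidence intervals, is often handled with a percentile bootstrap. The sketch below is one simple way to do this for AUC using only the standard library; the resample count, the seed, and the labels and scores are illustrative assumptions rather than recommendations from the source.

```python
import random

def auc(y_true, scores):
    """Proportion of positive/negative pairs ranked correctly (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(y_true, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for AUC, resampling patients with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_labels = [y_true[i] for i in idx]
        if len(set(resampled_labels)) < 2:
            continue  # AUC is undefined without both classes; draw again
        aucs.append(auc(resampled_labels, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical validation cohort of 20 patients
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
scores = [0.91, 0.12, 0.78, 0.45, 0.33, 0.52, 0.84, 0.08, 0.27, 0.66,
          0.41, 0.59, 0.19, 0.62, 0.73, 0.05, 0.38, 0.22, 0.47, 0.15]

point_estimate = auc(y_true, scores)
ci_low, ci_high = bootstrap_auc_ci(y_true, scores)
print(f"AUC {point_estimate:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
```

Resampling is done at the patient level, so each resample keeps a patient's label and score together. With small validation cohorts the interval will be wide, which is itself useful information to report.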

Detailed Experimental Protocols

Protocol 3.1: Developing a CT Radiomics Model for Pre-operative LNM Prediction

Objective: To develop an ML model using pre-operative CT radiomics to predict LNM in EC.
Materials: Pre-operative contrast-enhanced CT scans, pathology-confirmed nodal status.

  • Cohort: Retrospective collection of 300 patients (70% training/30% temporal validation). Inclusion: biopsy-proven EC, CT within 4 weeks of surgery. Exclusion: prior pelvic radiotherapy.
  • Segmentation: An attending radiologist, blinded to nodal status, performs 3D volumetric segmentation of the primary tumor on the portal venous phase using ITK-SNAP. A second radiologist segments 50 random cases for inter-observer concordance.
  • Feature Extraction: Use PyRadiomics (v3.0.1) to extract 1216 features per tumor, including:
    • Original Image: First-order statistics (18 features), Shape (14 features), Gray Level Co-occurrence Matrix (GLCM, 24 features), etc.
    • Filtered Images: Apply LoG (σ=1.0, 3.0, 5.0mm), Wavelet (8 decompositions) filters.
  • Feature Selection:
    • Stability Test: Retain features with an intra-class correlation coefficient (ICC) > 0.9 on 30 test-retest scans.
    • Reproducibility: Remove features with inter-observer ICC < 0.8 on the double-segmented cases. The Dice similarity coefficient measures agreement between the segmentations themselves, not between feature values.
    • Dimensionality Reduction (on the training set only):
      • Remove near-zero variance features.
      • Apply Spearman correlation (threshold |r| > 0.9) to remove highly collinear features.
      • Apply LASSO regression with 10-fold cross-validation to select the ~15 most predictive features.
  • Model Building: Train an eXtreme Gradient Boosting (XGBoost) classifier using the selected features. Optimize hyperparameters (max_depth, learning_rate, n_estimators) via Bayesian optimization on the training set.
  • Validation & Statistics: Evaluate on the held-out validation set using AUC, sensitivity, specificity, and precision. Perform decision curve analysis to assess clinical net benefit.
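Decision curve analysis, mentioned in the validation step, rests on the net benefit at a chosen risk threshold: true positives per patient minus false positives per patient weighted by the odds of the threshold. The sketch below computes it for a model and for the "treat all" reference strategy. The labels, probabilities, and the 20% threshold are hypothetical.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of acting on patients whose predicted risk is at or above the threshold."""
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: act on every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical validation set: 1 = pathology-confirmed LNM
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
probs = [0.8, 0.1, 0.3, 0.6, 0.2, 0.4, 0.05, 0.7, 0.15, 0.25]

threshold = 0.2
nb_model = net_benefit(y_true, probs, threshold)
nb_all = net_benefit_treat_all(y_true, threshold)
print(round(nb_model, 3), round(nb_all, 3))
```

A full decision curve repeats this calculation over a range of clinically plausible thresholds and plots the model against the "treat all" and "treat none" (net benefit of zero) strategies.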

Protocol 3.2: Integrating MRI Radiomics with Molecular Classification

Objective: To correlate MRI radiomic phenotypes with the TCGA-based molecular subtypes of EC.
Materials: Pre-operative T2w and DWI/ADC MRI, tumor tissue for molecular profiling (PCR, IHC, NGS).

  • Molecular Classification: Assign tumors to four subtypes via ProMisE algorithm: POLE-mutated, MMR-d (MSI), p53abn, NSMP.
  • Radiogenomic Mapping: Extract radiomic features from T2w and ADC map segmentations as per Protocol 3.1.
  • Analysis: For each molecular subtype vs. others, perform:
    • Univariate analysis (Mann-Whitney U test) of radiomic features, FDR corrected.
    • Supervised linear discriminant analysis (LDA) to identify a radiomic signature distinguishing the subtype.
    • Multivariate logistic regression combining top radiomic features and key clinical variables (age, grade).
  • Pathway Correlation: Statistically significant radiomic features (e.g., texture heterogeneity) are correlated with genomic data (e.g., pathway enrichment scores from RNA-seq) using Spearman rank correlation.
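The FDR correction applied to the univariate tests can be illustrated with the Benjamini-Hochberg procedure. The source does not name the FDR method, so Benjamini-Hochberg is an assumption here, and the p-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is monotone in rank
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical Mann-Whitney p-values for five radiomic features
raw_p = [0.001, 0.008, 0.039, 0.041, 0.6]
q_values = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in q_values]
print([round(q, 5) for q in q_values], significant)
```

In this example two features that pass an unadjusted 0.05 cut-off no longer pass after correction, which is the typical effect when hundreds or thousands of radiomic features are tested at once.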

Visualizing Pathways and Workflows

[Figure: clinical CT/MRI scan -> tumor segmentation (ROI delineation) -> pre-processing (resampling, normalization) -> high-dimensional feature extraction -> feature selection and dimensionality reduction -> predictive model (machine learning classifier) -> clinical endpoint (e.g., LNM, molecular subtype) -> decision support tool.]

Radiomics Workflow from Image to Clinical Tool

[Figure: radiomic phenotype (e.g., high texture heterogeneity) associates with molecular subtype (e.g., p53abn) and predicts clinical outcome (poor prognosis, therapy resistance). Molecular subtype is driven by cellular pathway activation (e.g., cell cycle dysregulation, genomic instability), which manifests as aggressive histology (high grade, serous morphology), which leads to the clinical outcome.]

Radiogenomic Correlations in Endometrial Cancer

The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Research Tools for EC Radiomics Studies

Tool/Reagent Category | Specific Example/Product | Primary Function in Research
Image Analysis Software | 3D Slicer with PyRadiomics extension, ITK-SNAP | Open-source platform for medical image visualization, 3D segmentation, and standardized radiomic feature extraction.
Radiomics Feature Engine | PyRadiomics (Python), MaZda (C++) | Backend libraries that implement IBSI-compliant algorithms for calculating thousands of radiomic features from a region of interest.
Phantom for Harmonization | Credence Cartridge Radiomics Phantom, QIBA DRO | Physical or digital reference objects to test scanner stability and harmonize feature extraction across different imaging centers/protocols.
Machine Learning Platform | Scikit-learn (Python), caret (R), XGBoost | Libraries providing algorithms for feature selection (LASSO, mRMR), model training (SVM, RF), and cross-validation.
Statistical Analysis Suite | R Statistical Software, Python SciPy/StatsModels | Perform advanced statistical tests, survival analysis, and generate publication-quality graphs for result reporting.
Genomic Data Integration | cBioPortal, R/Bioconductor packages (e.g., DESeq2) | Platforms and tools to access publicly available molecular data (TCGA-UCEC) and perform correlation analyses with radiomic signatures.
High-Performance Computing | Local GPU cluster (NVIDIA), Cloud (Google Colab Pro, AWS) | Computational resources required for processing large imaging datasets, deep learning, and complex radiogenomic analyses.

Radiomics provides a powerful, non-invasive lens to address key clinical questions in endometrial cancer, from pre-operative staging to molecular subtyping. Successful implementation requires rigorous standardization of the imaging pipeline, robust validation, and integration with clinico-pathologic and molecular data—"radiogenomics." Future research must prioritize prospective multi-center trials with external validation to move radiomic signatures from research benches to clinical decision support systems, ultimately enabling personalized management in endometrial cancer.

Within the broader thesis on CT radiomics for endometrial cancer basic principles research, establishing a precise and consistent lexicon is foundational. Radiomics converts medical images into mineable high-dimensional data. The pipeline's core conceptual outputs are Features, Signatures, and Models. This technical guide defines these terms in the context of endometrial cancer, detailing methodologies for their derivation and validation.

Core Terminology & Workflow

The standard radiomics workflow progresses from data to clinical decision support. The key terminology maps onto specific stages of this pipeline.

[Diagram: CT Image Acquisition → Tumor Segmentation (ROI Definition) → Feature Extraction → Radiomics Features (High-Dimensional Matrix) → Feature Processing & Dimensionality Reduction → Radiomics Signature (Selected Feature Subset) → Predictive Modeling → Radiomics Model (Algorithm + Signature + Coefficients). The Clinical Endpoint (e.g., LVSI, Prognosis) also feeds into Predictive Modeling.]

Diagram Title: Radiomics Pipeline from Image to Predictive Model

  • Features: The fundamental quantitative metrics extracted from the region of interest (ROI). They are the raw data of radiomics.
  • Signature: A curated subset of features, selected for their relevance to a specific clinical question, often combined into a single score.
  • Model: A validated algorithm that integrates the radiomics signature (and often clinical variables) to predict a clinical endpoint.

Radiomics Features: The Foundational Data

Features are mathematically extracted descriptors quantifying tumor phenotype. They are typically categorized as shown below.

Table 1: Categories of Radiomics Features in Endometrial Cancer CT Analysis

| Feature Category | Sub-category | Representative Examples | Biological Correlate in Endometrial Cancer |
| --- | --- | --- | --- |
| First-Order Statistics | - | Mean, Median, Entropy, Kurtosis | Overall tumor density heterogeneity, reflecting cellularity & necrosis. |
| Shape-based | - | Volume, Sphericity, Surface Area | Gross 3D tumor morphology and invasive potential. |
| Texture (Second-Order) | Gray Level Co-occurrence Matrix (GLCM) | Contrast, Correlation, Energy | Local intra-tumoral heterogeneity, potentially linked to genomic instability. |
| Texture (Second-Order) | Gray Level Run Length Matrix (GLRLM) | Run Length Non-Uniformity | Patterns of density variation, may indicate stromal vs. epithelial mix. |
| Texture (Second-Order) | Gray Level Size Zone Matrix (GLSZM) | Zone Size Non-Uniformity | Areas of similar attenuation, suggestive of regional necrosis or fibrosis. |
| Higher-Order | Filter-based (Wavelet, Laplacian) | Features from filtered images (e.g., wavelet-HHH_glcm_Correlation) | Multi-scale texture patterns capturing subtle phenotypic variations. |

Experimental Protocol for Feature Extraction:

  • Image Acquisition & Pre-processing: Use venous-phase abdominal-pelvic CT scans with standardized parameters (e.g., 120 kVp, slice thickness ≤5 mm). Resample all images to isotropic voxels (e.g., 1x1x1 mm³) and discretize gray-levels (bin width=25).
  • Segmentation: A radiologist blinded to outcomes delineates the primary endometrial tumor ROI semi-automatically on a 3D segmentation platform (e.g., 3D Slicer or ITK-SNAP). Repeat the segmentation on a subset of cases to assess intra-observer variability.
  • Extraction: Use an open-source platform (e.g., PyRadiomics v3.0.1) to extract all feature classes from the original and filtered image sets. This yields 1000+ features per tumor.
  • Feature Stability: Perform test-retest analysis on a subset of patients with short-interval rescans to exclude unstable features (ICC < 0.75).
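The gray-level discretization mentioned in the pre-processing step can be sketched as follows. This is a minimal NumPy illustration of fixed-bin-width discretization with a bin width of 25 HU; the function name and the sample HU values are hypothetical, and the formula (bins aligned to multiples of the bin width, lowest occupied bin mapped to level 1) is one common convention rather than a prescribed implementation.

```python
import numpy as np

def discretize_fixed_bin_width(hu_values, bin_width=25.0):
    """Map HU values to integer gray levels 1..N using a fixed bin width."""
    hu = np.asarray(hu_values, dtype=float)
    # Bin edges sit at multiples of bin_width; the lowest occupied bin becomes level 1.
    levels = np.floor(hu / bin_width) - np.floor(hu.min() / bin_width) + 1
    return levels.astype(int)

# Hypothetical HU values sampled from a tumor ROI.
roi_hu = np.array([-10.0, 0.0, 24.9, 25.0, 60.0, 110.0])
levels = discretize_fixed_bin_width(roi_hu, bin_width=25.0)
print(levels.tolist())
```

A fixed bin width keeps the HU meaning of each gray level comparable between patients, whereas a fixed bin count rescales every ROI to its own intensity range.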

Radiomics Signature: From Data to Biology

A signature reduces feature dimensionality to a parsimonious, biologically relevant set. Creation involves feature selection and weighting.

Table 2: Common Methods for Radiomics Signature Development

| Step | Method | Description | Rationale |
| --- | --- | --- | --- |
| Pre-selection | Intraclass Correlation Coefficient (ICC) | Retain features with ICC > 0.75 in test-retest. | Ensures robustness against segmentation and acquisition noise. |
| Univariate Analysis | Mann-Whitney U test | Select features with significant univariate association (p<0.05) with the endpoint. | Identifies candidate features with discriminative power. |
| Multivariate Selection | Least Absolute Shrinkage and Selection Operator (LASSO) regression | Penalized regression that shrinks coefficients of irrelevant features to zero. | Handles multicollinearity, selects a compact, predictive feature set. |
| Signature Score | Linear combination | Rad-score = β₁·Feature₁ + β₂·Feature₂ + ... + βₙ·Featureₙ. | Coefficients (β) from LASSO create a single, continuous predictive score. |

Protocol for Signature Construction (e.g., for LVSI Prediction):

  • Cohort: Split retrospective cohort (N=300) of endometrial cancer patients into training (70%) and validation (30%) sets.
  • Selection: On the training set, apply ICC filter, then LASSO regression with 10-fold cross-validation to select the optimal lambda (λ) value that minimizes binomial deviance.
  • Calculation: Apply the selected λ to the entire training set to obtain the final non-zero coefficients. Calculate the Rad-score for each patient.
  • Evaluation: Assess the signature's performance in the validation set using the Area Under the ROC Curve (AUC).
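The calculation and evaluation steps can be illustrated with a small sketch. It computes a Rad-score as a linear combination of features and estimates the AUC through the Mann-Whitney statistic. All feature values, coefficients, and labels below are hypothetical, and the helper names `rad_score` and `auc` are introduced here for illustration only.

```python
import numpy as np

def rad_score(features, coefficients, intercept=0.0):
    """Linear combination of (standardized) features and LASSO coefficients."""
    return intercept + np.asarray(features, float) @ np.asarray(coefficients, float)

def auc(scores, labels):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    scores = np.asarray(scores, float)
    labels = np.asarray(labels, int)
    pos, neg = scores[labels == 1], scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

# Hypothetical validation data: rows are patients, columns are selected features.
X_val = np.array([[ 1.2, -0.3,  0.5],
                  [-0.8,  0.4,  0.1],
                  [ 0.3,  1.1, -0.2],
                  [-1.0, -0.9,  0.0]])
beta = np.array([0.8, 0.0, -0.5])   # hypothetical coefficients (one shrunk to zero)
y_val = np.array([1, 0, 1, 0])      # 1 = LVSI positive, 0 = LVSI negative

scores = rad_score(X_val, beta)
print(scores, auc(scores, y_val))
```

The coefficients must come from the training set only; the validation set is used solely to score patients and compute the AUC.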

Radiomics Model: The Predictive Instrument

A model integrates the signature into a clinically usable tool, often combined with clinical variables.

[Diagram: Clinical Variables (e.g., Grade, Histology) and the Radiomics Signature (Rad-score) → Model Fusion → Machine Learning Algorithm (e.g., Logistic Regression, SVM) → Integrated Radiomics Model → Risk Prediction (Probability).]

Diagram Title: Integration of Clinical and Radiomic Data into a Model

Protocol for Model Building and Validation:

  • Model Formulation: In the training cohort, develop three models: a clinical model (using established risk factors), a radiomics model (using the Rad-score alone), and a combined model (integrating both via multivariate logistic regression).
  • Performance Metrics: Evaluate discrimination with AUC, calibration with Hosmer-Lemeshow test and calibration plots, and clinical utility with Decision Curve Analysis (DCA).
  • Validation: Lock down all model coefficients. Apply the fixed models to the independent validation cohort and report performance metrics. Perform bootstrapping (1000 iterations) for internal validation and calculate 95% confidence intervals.
  • Reporting: Adhere to the TRIPOD+AI statement guidelines for transparent reporting.
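Decision Curve Analysis, listed among the performance metrics, rests on the net benefit at a chosen threshold probability: the true-positive rate per patient minus the false-positive rate per patient weighted by the odds of the threshold. A minimal sketch follows, assuming hypothetical outcomes and predicted probabilities; the function names are illustrative.

```python
import numpy as np

def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    y_true = np.asarray(y_true, int)
    y_prob = np.asarray(y_prob, float)
    n = len(y_true)
    treated = y_prob >= threshold
    tp = np.sum(treated & (y_true == 1))
    fp = np.sum(treated & (y_true == 0))
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of the model."""
    prevalence = np.mean(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Hypothetical validation cohort of 10 patients.
y = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
p = np.array([0.9, 0.7, 0.2, 0.6, 0.3, 0.2, 0.1, 0.1, 0.05, 0.05])

nb_model = net_benefit(y, p, threshold=0.5)
nb_all = net_benefit_treat_all(y, threshold=0.5)
print(nb_model, nb_all)   # the treat-none strategy always has net benefit 0
```

A decision curve repeats this calculation over a range of thresholds and plots the model against the treat-all and treat-none strategies.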

The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Tools for CT Radiomics Research in Endometrial Cancer

| Item / Solution | Function in the Radiomics Pipeline | Example / Note |
| --- | --- | --- |
| Standardized CT Protocol | Ensures feature reproducibility and multi-center validity. | Venous phase, 120 kVp, automated tube current modulation, reconstruction kernel (e.g., B30f). |
| 3D Slicer with SlicerRadiomics | Open-source platform for image visualization, segmentation, and feature extraction. | Enables reproducible manual or semi-automatic segmentation. Plugins facilitate PyRadiomics integration. |
| PyRadiomics Python Package | The computational engine for standardized feature extraction. | Extracts IBSI-compliant features. Allows custom setting of voxel resampling, discretization, and filter application. |
| ITK-SNAP | Specialized software for detailed 3D medical image segmentation. | Often used for precise manual delineation of tumor boundaries, especially for heterogeneous masses. |
| R or Python (scikit-learn) | Statistical computing environment for feature selection, modeling, and validation. | Essential for implementing LASSO, building models, and performing statistical analysis (AUC, DCA). |
| Public Datasets & Benchmarks | For initial method development and external validation. | TCIA (The Cancer Imaging Archive) may host relevant, though limited, endometrial cancer CT datasets. |
| High-Performance Computing (HPC) Cluster | For computationally intensive tasks like wavelet filtering and large-scale cross-validation. | Necessary when processing hundreds of patients with full feature extraction. |

Building a Radiomics Pipeline: From CT Scan to Predictive Model in Endometrial Cancer

Within the broader thesis on CT radiomics for endometrial cancer (EC) basic principles research, the standardization of image acquisition is the foundational pillar. Radiomics seeks to extract high-dimensional quantitative features from medical images, which are then used to develop predictive models for cancer diagnosis, prognosis, and treatment response. The reliability and reproducibility of these radiomic features are critically dependent on the consistency of the initial imaging data. Variations in acquisition protocols introduce significant noise, potentially obscuring true biological signals and leading to non-generalizable models. This technical guide details the essential components of standardized CT image acquisition protocols tailored for endometrial cancer radiomics research.

Critical Protocol Parameters for Standardization

The following parameters must be meticulously controlled and documented across all patient scans to ensure robust feature extraction.

Table 1: Essential CT Acquisition Parameters for EC Radiomics

| Parameter Category | Specific Parameter | Recommended Setting for EC Radiomics | Rationale & Impact on Features |
| --- | --- | --- | --- |
| Patient Preparation | Bladder Status | Comfortably full, consistent across cohort | Standardizes anatomical position of uterus; impacts spatial relationships and radiomic texture. |
| Patient Preparation | Bowel Preparation | Oral contrast optional; if used, must be standardized. | Reduces gas/fluid motion artifacts; contrast alters attenuation values, affecting intensity-based features. |
| Scan Acquisition | Kilovoltage Peak (kVp) | Fixed at 120 kVp (or 100 kVp for slim patients) | Affects beam hardness and tissue contrast. Variation changes attenuation values (e.g., HU), impacting first-order statistics. |
| Scan Acquisition | Tube Current (mA) / Modulation | Use automated tube current modulation (ATCM) with fixed noise index. | Balances dose and consistent image noise levels. Fixed mA is less adaptive; ATCM with fixed index standardizes noise texture. |
| Scan Acquisition | Rotation Time (s) | ≤ 0.5 s | Minimizes motion artifacts from bowel peristalsis. |
| Scan Acquisition | Pitch | ≤ 1.0 (for helical scans) | Affects z-axis resolution and slice sensitivity profile. Higher pitch can introduce interpolation artifacts. |
| Scan Acquisition | Detector Collimation | Thin (e.g., 0.625 mm or 1.25 mm) | Enables isotropic or near-isotropic voxel reconstruction, crucial for 3D texture analysis. |
| Image Reconstruction | Reconstruction Kernel | Soft-tissue kernel (e.g., Br40, Standard) | Sharp kernels enhance noise and edges, drastically altering texture features. A consistent soft-tissue kernel is mandatory. |
| Image Reconstruction | Slice Thickness | ≤ 1.5 mm (ideally 1.0 mm), matching collimation. | Thicker slices cause partial volume averaging, blurring features and reducing feature stability. |
| Image Reconstruction | Reconstruction Interval | Equal to slice thickness (contiguous) | Ensures no gap or overlap between slices for accurate 3D volume analysis. |
| Image Reconstruction | Field of View (FOV) | Patient-specific, but fixed display matrix (e.g., 512x512). | Maintains consistent in-plane spatial resolution. Pixel size = FOV/Matrix. |
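As a worked illustration of the table, the sketch below computes the in-plane pixel size from the FOV and matrix, and flags scans that deviate from the recommended limits. The dictionary keys, the `PROTOCOL` structure, and the example scan are hypothetical; in practice the values would be read from DICOM metadata.

```python
# Hypothetical protocol limits transcribed from the acquisition table.
PROTOCOL = {
    "allowed_kvp": {100, 120},
    "max_rotation_time_s": 0.5,
    "max_pitch": 1.0,
    "max_slice_thickness_mm": 1.5,
}

def pixel_size_mm(fov_mm, matrix_size):
    """In-plane pixel size = FOV / matrix."""
    return fov_mm / matrix_size

def protocol_deviations(scan):
    """Return the names of parameters that violate the protocol limits."""
    deviations = []
    if scan["kvp"] not in PROTOCOL["allowed_kvp"]:
        deviations.append("kvp")
    if scan["rotation_time_s"] > PROTOCOL["max_rotation_time_s"]:
        deviations.append("rotation_time_s")
    if scan["pitch"] > PROTOCOL["max_pitch"]:
        deviations.append("pitch")
    if scan["slice_thickness_mm"] > PROTOCOL["max_slice_thickness_mm"]:
        deviations.append("slice_thickness_mm")
    return deviations

# Hypothetical scan record.
scan = {"kvp": 140, "rotation_time_s": 0.5, "pitch": 1.2,
        "slice_thickness_mm": 1.0, "fov_mm": 350.0, "matrix_size": 512}

print(pixel_size_mm(scan["fov_mm"], scan["matrix_size"]))  # 350 / 512 mm
print(protocol_deviations(scan))
```

Because the FOV is patient-specific, the pixel size varies between patients even with a fixed matrix, which is one reason resampling to a common voxel size precedes feature extraction.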

Detailed Methodological Protocols

Experiment 1: Assessing Feature Stability Across kVp Variations

Objective: To quantify the sensitivity of radiomic features to changes in tube voltage.

Protocol:

  • Utilize a validated anthropomorphic pelvis phantom with an endometrial cancer-simulating insert.
  • Acquire CT scans of the phantom using identical parameters except for kVp, which is varied (e.g., 80, 100, 120, 140 kVp).
  • Reconstruct all images using an identical soft-tissue kernel and slice thickness.
  • Segment the simulated tumor volume using a fixed Hounsfield Unit (HU) threshold or manual ROI propagated across all scans.
  • Extract a standard radiomic feature set (e.g., IBSI-compliant) using open-source software (PyRadiomics).
  • Calculate the intra-class correlation coefficient (ICC) for each feature across the kVp settings. Features with ICC > 0.9 are considered robust to this parameter variation.
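The ICC step can be sketched in NumPy. The example below uses the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1); choosing this variant is an assumption, since the protocol does not specify one. Rows represent ROIs (for example several phantom inserts, also an assumption) and columns represent kVp settings; the feature values are hypothetical.

```python
import numpy as np

def icc_2_1(measurements):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    measurements: array of shape (n_subjects, k_conditions).
    """
    x = np.asarray(measurements, float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)   # between subjects
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)   # between conditions
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical values of one feature: 5 ROIs (rows) x 4 kVp settings (columns).
feature_by_kvp = np.array([[10.0, 10.1,  9.9, 10.0],
                           [20.0, 20.2, 19.9, 20.1],
                           [30.1, 30.0, 29.8, 30.0],
                           [40.0, 39.9, 40.2, 40.1],
                           [50.0, 50.1, 49.9, 50.0]])

icc = icc_2_1(feature_by_kvp)
print(icc, "robust" if icc > 0.9 else "not robust")
```

An absolute-agreement ICC penalizes systematic shifts between kVp settings, which is the relevant failure mode when HU values drift with tube voltage.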

Experiment 2: Impact of Reconstruction Kernel on Texture Analysis

Objective: To evaluate the effect of the reconstruction kernel on higher-order texture features.

Protocol:

  • From a single patient's raw projection data (sinogram), reconstruct two image series: one with a standard soft-tissue kernel (e.g., Br40) and one with a sharp/bone kernel (e.g., Br70).
  • Have an expert radiologist perform semi-automatic 3D segmentation of the primary endometrial tumor on the soft-tissue series.
  • Apply this same segmentation mask to the sharp-kernel series after confirming that both series share the same geometry (they are reconstructed from the same raw data, so no motion-related misregistration is expected).
  • Extract texture features (GLCM, GLRLM, GLSZM) from both volumes.
  • Perform a paired Wilcoxon signed-rank test or Bland-Altman analysis to compare feature values. Document the percentage of features showing statistically significant (p < 0.05) differences.
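The Bland-Altman part of the comparison can be sketched as follows: the bias is the mean paired difference and the 95% limits of agreement are the bias ± 1.96 standard deviations of the differences. The paired values are hypothetical (one texture feature measured on several ROIs with each kernel), and the function name is illustrative.

```python
import numpy as np

def bland_altman(values_a, values_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    a = np.asarray(values_a, float)
    b = np.asarray(values_b, float)
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)          # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical GLCM contrast values for the same ROIs under the two kernels.
sharp_kernel = np.array([5.1, 6.3, 7.2, 4.8, 6.0])
soft_kernel = np.array([3.0, 3.9, 4.1, 2.9, 3.5])

bias, lower, upper = bland_altman(sharp_kernel, soft_kernel)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

Limits of agreement that exclude zero indicate a systematic kernel effect rather than random scatter.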

Visualization of the Standardization Workflow

[Diagram: Patient Preparation (Bladder/Bowel Protocol) → (standardized) → CT Image Acquisition (Fixed kVp, ATCM, Thin Collimation) → (raw data) → Image Reconstruction (Soft Kernel, Thin/Isotropic Slices) → (DICOM images) → Tumor Segmentation (3D Volume of Interest) → (mask) → Radiomic Feature Extraction (IBSI-Compliant Pipeline) → (quantitative features) → Downstream Analysis (Stable & Reproducible Models).]

Standardized Radiomics Workflow for Endometrial Cancer

The Scientist's Toolkit: Research Reagent Solutions

Table 2: Essential Tools for Acquisition Protocol Research

| Item / Reagent | Function in Protocol Standardization Research |
| --- | --- |
| Anthropomorphic Pelvis Phantom | Mimics human anatomy and attenuation. Used to test acquisition parameters (kVp, kernel) without patient radiation exposure. |
| Image Biomarker Standardisation Initiative (IBSI) Reference Manual | Reference guide defining feature nomenclature and computation, ensuring reproducibility across research groups. |
| PyRadiomics (Open-Source Python Package) | IBSI-compliant software for standardized extraction of radiomic features from medical images. |
| 3D Slicer / ITK-SNAP | Open-source software for 3D medical image visualization and manual/semi-automatic segmentation of tumor volumes. |
| DICOM Metadata Extractor (e.g., pydicom) | Tool to programmatically verify and record acquisition parameters (kVp, kernel, slice thickness) from all scans in a cohort. |
| Quality Control Phantom (e.g., Catphan) | Used for regular scanner calibration to ensure HU uniformity and geometric accuracy over time. |

The path to clinically translatable radiomics models in endometrial cancer begins with rigorous imaging protocol standardization. By fixing parameters such as kVp, employing ATCM with a fixed noise index, mandating thin-slice reconstructions with a consistent soft-tissue kernel, and controlling patient preparation, researchers can significantly reduce non-biological variance. This guide provides the experimental frameworks to quantify the impact of protocol deviations. Adherence to these principles in Step 1 ensures that extracted radiomic features are robust, reproducible, and truly reflective of the underlying tumor pathophysiology, forming a solid data foundation for subsequent steps in the radiomics pipeline.

This technical guide details the critical second step of the radiomics pipeline, tumor segmentation, within the scope of a thesis on CT Radiomics for Endometrial Cancer: Basic Principles Research. Accurate delineation of the tumor volume of interest (VOI) from computed tomography (CT) images is foundational. The extracted VOI serves as the source for subsequent feature quantification, which aims to discover non-invasive biomarkers for prognosis, prediction, and therapeutic monitoring in endometrial cancer. The choice of segmentation method directly impacts feature stability, reproducibility, and the ultimate clinical validity of radiomic signatures.

Methodological Approaches: A Comparative Analysis

Manual Segmentation

Description: The radiologist or expert manually outlines the tumor boundary slice-by-slice using specialized software.

Protocol: A typical protocol uses an open-source research platform (e.g., 3D Slicer, ITK-SNAP). The expert loads the contrast-enhanced CT series (venous phase in the acquisition protocol described earlier), adjusts window/level for optimal contrast, and uses a drawing tool to contour the tumor boundary on each axial slice where it is visible. The result is a binary mask.

Key Considerations: Manual delineation is time-consuming, and intra- and inter-observer variability limits its reproducibility, even though it remains the common reference standard.

Semi-Automatic Segmentation

Description: Algorithms initialized or guided by user input perform the segmentation. Protocol (for Region Growing):

  • Load the DICOM series into segmentation software (e.g., 3D Slicer or ITK-SNAP).
  • The user selects a seed point within the tumor on a representative slice.
  • Set intensity threshold parameters (e.g., lower/upper Hounsfield Unit bounds).
  • Execute the region-growing algorithm, which aggregates connected voxels within the threshold.
  • Manually correct any obvious over- or under-segmentation.

Key Considerations: Faster than manual methods but still user-dependent. Performance can degrade with poor tumor-to-background contrast.
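The region-growing steps above can be sketched with a breadth-first search over 6-connected voxels. This is a minimal illustration, not the algorithm of any particular package; the synthetic volume, seed, and HU bounds are hypothetical.

```python
from collections import deque
import numpy as np

def region_grow(volume, seed, lower, upper):
    """Grow a 6-connected region from `seed` over voxels with lower <= HU <= upper."""
    vol = np.asarray(volume)
    mask = np.zeros(vol.shape, dtype=bool)
    if not (lower <= vol[seed] <= upper):
        return mask                      # seed itself is outside the HU window
    mask[seed] = True
    queue = deque([seed])
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in offsets:
            nb = (z + dz, y + dy, x + dx)
            inside = all(0 <= nb[i] < vol.shape[i] for i in range(3))
            if inside and not mask[nb] and lower <= vol[nb] <= upper:
                mask[nb] = True
                queue.append(nb)
    return mask

# Synthetic volume: background at -100 HU, a 3x3x3 "tumor" at 60 HU,
# plus one disconnected voxel with the same attenuation.
volume = np.full((5, 5, 5), -100.0)
volume[1:4, 1:4, 1:4] = 60.0
volume[0, 0, 0] = 60.0

mask = region_grow(volume, seed=(2, 2, 2), lower=40.0, upper=90.0)
print(int(mask.sum()))   # voxels connected to the seed within the HU window
```

The disconnected voxel is excluded because connectivity, not intensity alone, decides membership; this is also why leakage occurs when adjacent myometrium falls inside the same HU window.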

Deep Learning (DL) Based Segmentation

Description: Convolutional Neural Networks (CNNs), trained on annotated datasets, automatically predict tumor masks. Protocol (U-Net Training):

  • Data Preparation: Pair each CT volume with its expert manual segmentation mask and verify that the two share the same geometry. Split data at the patient level into training, validation, and test sets (e.g., 70/15/15%). Normalize image intensities (e.g., z-score).
  • Model Architecture: Implement a 2D or 3D U-Net with an encoder (downsampling path for feature extraction) and decoder (upsampling path for precise localization).
  • Training: Use a loss function like Dice Loss or Cross-Entropy. Optimize with Adam. Augment data in real-time (rotation, flipping, scaling).
  • Inference: Apply the trained model to new CT scans to generate probability maps, which are then thresholded (e.g., at 0.5) to create binary masks.
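The Dice loss used in training and the thresholding used at inference can be written compactly. The sketch below uses NumPy rather than a deep learning framework so that it stays self-contained; the probability map and reference mask are hypothetical, and the smoothing constant `eps` is an illustrative choice.

```python
import numpy as np

def soft_dice_loss(prob, target, eps=1e-6):
    """1 - soft Dice between predicted probabilities and a binary reference mask."""
    prob = np.asarray(prob, float)
    target = np.asarray(target, float)
    intersection = np.sum(prob * target)
    return 1.0 - (2.0 * intersection + eps) / (np.sum(prob) + np.sum(target) + eps)

def dice_score(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(mask_a, bool)
    b = np.asarray(mask_b, bool)
    denom = a.sum() + b.sum()
    return 1.0 if denom == 0 else 2.0 * np.sum(a & b) / denom

# Hypothetical 2D probability map from a network and the expert reference mask.
prob_map = np.array([[0.9, 0.8, 0.2],
                     [0.6, 0.4, 0.1]])
reference = np.array([[1, 1, 0],
                      [1, 1, 0]])

pred_mask = prob_map >= 0.5            # threshold the probabilities at 0.5
print(soft_dice_loss(prob_map, reference), dice_score(pred_mask, reference))
```

The soft version is differentiable and therefore usable as a training loss, while the thresholded Dice score is the metric reported on held-out data.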

Quantitative Comparison of Segmentation Approaches

Table 1: Comparative Analysis of Segmentation Methods in Endometrial Cancer CT Radiomics

| Metric | Manual | Semi-Automatic (Region Growing) | Deep Learning (U-Net) |
| --- | --- | --- | --- |
| Time per Case | 15-30 minutes | 5-10 minutes | < 1 minute (after training) |
| Dice Score (inter-observer agreement; vs. reference masks for U-Net) | 0.75 - 0.85 | 0.80 - 0.90 (varies with user input) | 0.87 - 0.93 (on held-out test sets) |
| Reproducibility | Low | Medium | High (if model is stable) |
| Required Expertise | High (Radiologist) | Medium | Medium (for development) |
| Dependency on Initial Parameters | None | High (seed point, threshold) | Low (after deployment) |
| Suitability for Heterogeneous Tumors | High (expert judgment) | Low | Medium-High (depends on training data) |

Data synthesized from recent literature (2022-2024) including studies by Jia et al., Radiol. Med. 2023 and Giannini et al., Cancers 2024.

Experimental Workflow in Radiomics Research

[Diagram: CT Image Acquisition (Standardized Protocol) → Image Preprocessing (Resampling, N4 Bias Correction) → Segmentation Methods: Manual (reference standard), Semi-Auto (user-guided), or Deep Learning (automated prediction) → Volume of Interest (VOI) Mask → Radiomic Feature Extraction (PyRadiomics) → Statistical & ML Analysis.]

Diagram Title: Radiomics Pipeline with Segmentation Step

The Scientist's Toolkit: Research Reagent Solutions

Table 2: Essential Tools for Tumor Segmentation Research

| Tool / Solution | Category | Primary Function in Segmentation Research |
| --- | --- | --- |
| 3D Slicer | Open-Source Software Platform | Visualization, manual annotation, and platform for testing semi-automatic algorithms. |
| ITK-SNAP | Specialized Segmentation Software | Interactive manual and semi-automatic segmentation with active contour models. |
| PyRadiomics | Python Library | Standardized radiomic feature extraction from the segmented VOI; includes built-in image filters (e.g., wavelet, Laplacian of Gaussian) rather than segmentation algorithms. |
| MONAI (Medical Open Network for AI) | Deep Learning Framework | Provides pre-built, medically optimized DL models (e.g., DynUNet) and training pipelines. |
| nnU-Net | Self-Configuring DL Framework | Automatically configures U-Net architecture and training for medical image segmentation tasks. |
| OpenCV | Computer Vision Library | Image processing for pre-processing and post-processing of segmentation masks. |
| SimpleITK | Image Analysis Library | Comprehensive toolkit for image I/O, registration, and fundamental segmentation algorithms. |

Impact on Radiomic Feature Stability

The segmentation method is a critical confounding variable. Features related to shape (e.g., Sphericity, Surface Area) are most sensitive to boundary delineation. Texture features (e.g., Gray Level Co-occurrence Matrix features) can also vary significantly with the included voxels. Harmonization strategies, such as employing test-retest segmentations or using segmentation-robust feature selection methods, are essential components of a robust radiomics thesis.

[Diagram: Segmentation Method Variability → Shape Features (High Sensitivity), First-Order Statistics (Medium Sensitivity), Texture Features (Medium-High Sensitivity) → Reduced Feature Reproducibility → Compromised Predictive Model Performance.]

Diagram Title: Impact of Segmentation Variability on Radiomics

Within the context of a broader thesis on CT radiomics for endometrial cancer basic principles research, feature extraction is a critical computational step that converts segmented tumor volumes into quantifiable data. This process yields a high-dimensional feature set that may capture intra-tumoral heterogeneity, potentially correlating with underlying genomics, prognosis, and treatment response. The four primary categories—First-Order, Shape, Texture, and Wavelet—provide a multi-scale, multi-perspective characterization of the region of interest (ROI).

Feature Categories: Technical Definitions & Relevance to Endometrial Cancer

First-Order Statistics

First-order statistics describe the distribution of voxel intensities within the ROI without considering spatial relationships. They are fundamental for assessing tumor density and heterogeneity on CT imaging, which in endometrial cancer may reflect necrotic areas, cystic components, or myometrial invasion.

Table 1: Key First-Order Features

| Feature Name | Mathematical Definition | Potential Clinical Relevance in Endometrial Cancer |
| --- | --- | --- |
| Mean | Average intensity value | General tumor attenuation. |
| Standard Deviation | Square root of variance | Overall heterogeneity. |
| Skewness | Measure of histogram asymmetry | Asymmetry in intensity distribution. |
| Kurtosis | "Tailedness" of the histogram | Presence of outlier voxel values. |
| Entropy | Randomness/irregularity: -Σ p(i) log₂ p(i) | Tumoral complexity. |
| Energy | Uniformity: Σ p(i)² | Homogeneity of tissue. |
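The definitions in Table 1 translate directly into NumPy. The sketch below is a minimal illustration on hypothetical HU values; kurtosis is computed in its non-excess form (a normal distribution gives 3), and the histogram probabilities p(i) use a fixed bin width of 25 HU, both of which are assumptions made for this example. Note that some toolkits reserve the name "Energy" for the sum of squared voxel values and call Σ p(i)² "Uniformity".

```python
import numpy as np

def first_order_features(roi_values, bin_width=25.0):
    """First-order statistics of the voxel intensities inside an ROI."""
    x = np.asarray(roi_values, float)
    mean = x.mean()
    centered = x - mean
    m2 = np.mean(centered ** 2)                      # population variance
    skewness = np.mean(centered ** 3) / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = np.mean(centered ** 4) / m2 ** 2 if m2 > 0 else 0.0
    # Histogram probabilities p(i) over fixed-width gray-level bins.
    levels = np.floor(x / bin_width).astype(int)
    _, counts = np.unique(levels, return_counts=True)
    p = counts / counts.sum()
    return {
        "mean": mean,
        "std": np.sqrt(m2),
        "skewness": skewness,
        "kurtosis": kurtosis,
        "entropy": float(-np.sum(p * np.log2(p))),
        "uniformity": float(np.sum(p ** 2)),         # Σ p(i)², "Energy" in Table 1
    }

# Hypothetical ROI with two equally populated attenuation levels.
features = first_order_features([0.0, 0.0, 30.0, 30.0])
print(features)
```

For this two-level ROI the entropy is exactly 1 bit and the uniformity is 0.5, which makes the example easy to check by hand.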

3D Shape-Based Features

These features describe the geometric properties of the 3D tumor volume. For endometrial cancer, shape metrics may correlate with tumor aggressiveness, pattern of growth, or likelihood of lymphatic spread.

Table 2: Key 3D Shape Features

| Feature Name | Description | Potential Relevance |
| --- | --- | --- |
| Volume | Voxel count × voxel volume | Tumor burden. |
| Surface Area | Area of ROI surface in mm² | |
| Sphericity | (36πV²)^(1/3) / A | How spherical vs. infiltrative. |
| Compactness | V / (√π · A^(3/2)) | Density of shape packing. |
| Surface to Volume Ratio | A / V | Invasiveness potential. |
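A quick numerical check of the shape formulas: for a perfect sphere, sphericity equals 1, and for a cube it is about 0.806. The sketch uses the "compactness 1" definition V / (√π · A^(3/2)), which is an assumption about the intended variant; under that definition a sphere gives 1/(6π). Function names are illustrative.

```python
import math

def sphericity(volume, area):
    """(36 * pi * V^2)^(1/3) / A; equals 1 for a perfect sphere."""
    return (36.0 * math.pi * volume ** 2) ** (1.0 / 3.0) / area

def compactness1(volume, area):
    """V / (sqrt(pi) * A^(3/2)); equals 1 / (6 * pi) for a perfect sphere."""
    return volume / (math.sqrt(math.pi) * area ** 1.5)

def surface_to_volume_ratio(volume, area):
    return area / volume

# Sphere of radius 10 mm.
r = 10.0
v_sphere = 4.0 / 3.0 * math.pi * r ** 3
a_sphere = 4.0 * math.pi * r ** 2

# Cube of side 10 mm.
v_cube, a_cube = 10.0 ** 3, 6.0 * 10.0 ** 2

print(sphericity(v_sphere, a_sphere), sphericity(v_cube, a_cube))
print(compactness1(v_sphere, a_sphere), surface_to_volume_ratio(v_sphere, a_sphere))
```

Sphericity and compactness are dimensionless, whereas the surface-to-volume ratio has units of 1/mm and therefore shrinks as a tumor grows even if its shape is unchanged.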

Texture Features

Texture features quantify the spatial arrangement of voxel intensities, capturing intra-tumoral heterogeneity patterns. Gray Level Co-occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Size Zone Matrix (GLSZM), and Gray Level Dependence Matrix (GLDM) are common methods.

Experimental Protocol for GLCM Calculation:

  • Image Discretization: Reduce the number of gray levels in the ROI (e.g., to 64 bins) to reduce noise and computational cost.
  • Matrix Generation: For a given spatial offset (e.g., δx=1, δy=0, δz=0), calculate the GLCM P(i,j∣δ,θ), which represents the frequency with which a voxel with intensity i is adjacent to a voxel with intensity j.
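  • Symmetrization: Add the matrix to its transpose so that P(i,j) = P(j,i), making the counts independent of the direction in which each voxel pair is traversed.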
  • Normalization: Divide each element by the sum of all elements to obtain probabilities.
  • Feature Extraction: Calculate metrics from the normalized matrix.

Table 3: Representative GLCM Features

| Feature | Formula | Interpretation |
| --- | --- | --- |
| Contrast | Σ \|i-j\|² P(i,j) | Local intensity variation. |
| Correlation | Σ [ (i-μ)(j-μ) P(i,j) ] / σ² (for a symmetric GLCM, where rows and columns share μ and σ²) | Linear dependency of gray levels. |
| Energy (ASM) | Σ P(i,j)² | Uniformity of the matrix. |
| Homogeneity | Σ P(i,j) / (1+\|i-j\|) | Closeness of element distribution to diagonal. |
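The GLCM protocol can be sketched for a single 2D slice and a single offset; extending it to 3D offsets follows the same pattern. The input is assumed to be already discretized to integer gray levels starting at 0, and the 4x4 test image and function names are illustrative.

```python
import numpy as np

def glcm_2d(levels, n_levels, offset=(0, 1)):
    """Symmetric, normalized GLCM of a 2D gray-level image for one offset (dy, dx)."""
    img = np.asarray(levels, int)
    dy, dx = offset
    rows, cols = img.shape
    P = np.zeros((n_levels, n_levels), float)
    for y in range(rows):
        for x in range(cols):
            yy, xx = y + dy, x + dx
            if 0 <= yy < rows and 0 <= xx < cols:
                P[img[y, x], img[yy, xx]] += 1     # count the pair (i, j)
    P = P + P.T                                     # symmetrization
    return P / P.sum()                              # normalization to probabilities

def glcm_features(P):
    """Contrast, correlation, energy, and homogeneity as defined in Table 3."""
    i, j = np.indices(P.shape)
    mu = np.sum(i * P)                              # shared mean (symmetric matrix)
    var = np.sum((i - mu) ** 2 * P)
    return {
        "contrast": float(np.sum((i - j) ** 2 * P)),
        "correlation": float(np.sum((i - mu) * (j - mu) * P) / var) if var > 0 else 1.0,
        "energy": float(np.sum(P ** 2)),
        "homogeneity": float(np.sum(P / (1.0 + np.abs(i - j)))),
    }

# Small illustrative image with 4 gray levels (0..3).
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 2, 2, 2],
                  [2, 2, 3, 3]])

P = glcm_2d(image, n_levels=4, offset=(0, 1))       # horizontal neighbor
print(glcm_features(P))
```

In practice features are usually averaged over several offsets (13 directions in 3D) so that the result does not depend on tumor orientation.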

Wavelet Features

Wavelet transforms decompose the original image into components at different resolutions (frequencies) and orientations, allowing separation of fine detail (high-frequency) from coarse structures (low-frequency). Features are then extracted from these decomposed bands.

Experimental Protocol for Wavelet Decomposition:

  • Apply 3D Discrete Wavelet Transform (DWT): Use a filter bank (e.g., Haar, Daubechies) to apply low-pass (L) and high-pass (H) filters along each of the x, y, and z axes of the original image.
  • Generate Decomposed Bands: For the first level, produce 8 sub-bands: LLL, LLH, LHL, LHH, HLL, HLH, HHL, HHH (where L=Low, H=High frequency).
  • Feature Extraction: Compute first-order statistics (e.g., entropy, kurtosis) from each of these 8 wavelet-decomposed images.
  • Feature Naming: Convention: original_firstorder_Energy vs. wavelet-LLH_firstorder_Energy.
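The decomposition into 8 sub-bands can be illustrated with a single-level, decimated 3D Haar transform written in plain NumPy. This is a sketch under stated assumptions: the volume has even dimensions, the letters in each band name follow array-axis order (0, 1, 2), and the energy feature is taken as the sum of squared coefficients. Toolkits such as PyRadiomics may use an undecimated transform and a different axis convention, so values are not directly comparable.

```python
import numpy as np

def haar_split(data, axis):
    """One Haar analysis step along `axis`: returns (low-pass, high-pass) halves."""
    a = np.moveaxis(data, axis, 0)
    even, odd = a[0::2], a[1::2]
    low = (even + odd) / np.sqrt(2.0)
    high = (even - odd) / np.sqrt(2.0)
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)

def haar_dwt3(volume):
    """Single-level 3D Haar DWT; returns a dict with the 8 sub-bands LLL..HHH."""
    bands = {"": np.asarray(volume, float)}
    for axis in range(3):
        split = {}
        for name, data in bands.items():
            low, high = haar_split(data, axis)
            split[name + "L"] = low
            split[name + "H"] = high
        bands = split
    return bands

# Hypothetical 4x4x4 ROI.
volume = np.arange(64, dtype=float).reshape(4, 4, 4)
bands = haar_dwt3(volume)

# First-order energy per sub-band, named after the convention in the text.
energies = {f"wavelet-{name}_firstorder_Energy": float(np.sum(b ** 2))
            for name, b in bands.items()}
print(sorted(bands))
print(energies["wavelet-LLL_firstorder_Energy"])
```

Because the Haar filters are orthonormal, the energies of the 8 sub-bands add up to the energy of the original volume, which is a convenient sanity check for any implementation.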

Radiomics Workflow Diagram

[Diagram: CT Image Acquisition (Endometrial Cancer Patient) → 3D Tumor Segmentation (Manual/Semi-auto ROI) → Image Preprocessing (Resampling, Discretization) → Feature Extraction → Feature Categories: First-Order Statistics (Intensity Histogram), 3D Shape Features (Volume, Sphericity), Texture Features (GLCM, GLRLM, etc.), Wavelet Features (Multi-resolution) → Downstream Analysis (Prediction Model Building).]

Radiomics Feature Extraction Pipeline for Endometrial Cancer

The Scientist's Toolkit: Key Research Reagent Solutions

Table 4: Essential Tools for CT Radiomics Feature Extraction

| Item / Software | Function / Purpose |
| --- | --- |
| 3D Slicer | Open-source platform for medical image segmentation and visualization. |
| PyRadiomics (Python) | Open-source library for standardized extraction of all radiomics features. |
| ITK (Insight Toolkit) | Underlying library for image processing operations (e.g., resampling). |
| NumPy/SciPy (Python) | Core numerical computing and statistical analysis for feature data. |
| MATLAB Image Processing Toolbox | Alternative environment for custom feature extraction algorithm development. |
| Haar / Daubechies Wavelet Filters | Standard filter banks for wavelet decomposition in image analysis. |
| DICOM Viewer (e.g., RadiAnt) | For initial image quality assessment and annotation. |
| PyWavelets (Python) | Library specifically for performing discrete wavelet transforms. |

Wavelet Decomposition Logic Diagram

[Diagram: Original 3D CT Image (ROI) → 3D Discrete Wavelet Transform (Filter Bank Application) → Level 1 Decomposition (8 Sub-bands), e.g., LLL (approximation, coarse structure), LLH (detail in Z-direction), HHH (fine detail in all directions), plus 5 others → First-order & Texture Features extracted from each sub-band.]

Wavelet Feature Generation Process

Within a thesis on CT radiomics for endometrial cancer (EC) research, Step 4 is pivotal for translating high-dimensional imaging data into robust, interpretable biomarkers. Radiomics extracts hundreds to thousands of quantitative features from tumor regions of interest (ROIs) on CT scans. This results in a "curse of dimensionality," where the number of features vastly exceeds the number of patient samples, increasing the risk of model overfitting and reducing generalizability. This section details the application of LASSO (Least Absolute Shrinkage and Selection Operator) for feature selection and PCA (Principal Component Analysis) for dimensionality reduction, critical for constructing reliable predictive or prognostic models in EC radiomics.

Core Techniques: Theoretical Framework

2.1 LASSO (L1 Regularization)

LASSO performs both feature selection and regularization by adding a penalty proportional to the sum of the absolute values of the regression coefficients. It shrinks the coefficients of less important features to exactly zero, effectively selecting a subset of relevant features.

  • Objective Function: Minimize ∑ᵢ(yᵢ − β₀ − ∑ⱼβⱼxᵢⱼ)² + λ∑ⱼ|βⱼ|, where y is the outcome (e.g., EC stage, recurrence), β are the coefficients, x are the radiomic features, and λ is the tuning parameter controlling shrinkage.

2.2 PCA (Unsupervised Dimensionality Reduction)

PCA transforms the original correlated features into a new set of uncorrelated variables called principal components (PCs). PCs are ordered so that the first few retain most of the variation present in the original dataset, allowing for data compression with minimal information loss.

Experimental Protocols & Application in EC Radiomics

3.1 Protocol: LASSO Regression for Radiomic Feature Selection

This protocol is typically applied after feature extraction and initial preprocessing (e.g., Z-score normalization).

  • Data Partition: Split the cohort (e.g., N=200 EC patients) into training (70%) and hold-out test (30%) sets. The training set is used for all model development, including LASSO.
  • Response Variable Preparation: Define a binary or continuous clinical endpoint (e.g., lymphovascular space invasion (LVSI) status: Positive vs. Negative).
  • Hyperparameter Tuning (λ): Using the training set only, perform k-fold cross-validation (e.g., 10-fold) to determine the optimal λ value. Choose either the λ that minimizes the cross-validated error (binomial deviance for a binary endpoint such as LVSI, mean squared error for a continuous endpoint) or, for a more parsimonious model, the largest λ whose error lies within one standard error of that minimum (λ1se).
  • Model Fitting & Selection: Fit the final LASSO model with the optimal λ on the entire training set. Features with non-zero coefficients are selected.
  • Validation: Apply the selected feature subset (with their retained coefficients) to the untouched test set to evaluate the model's performance on unseen data.
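How LASSO drives coefficients to exactly zero can be seen in a small coordinate-descent sketch. It minimizes the squared-error objective scaled as (1/(2n))·‖y − β₀ − Xβ‖² + λ‖β‖₁, which differs from the unscaled form in the text only by a rescaling of λ; for a binary endpoint such as LVSI the squared-error term would be replaced by the binomial deviance. The data are simulated and all names are illustrative; a production analysis would normally rely on an established library.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly 0."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """LASSO by cyclic coordinate descent on (1/(2n))*||y - b0 - X b||^2 + lam*||b||_1."""
    X = np.asarray(X, float)
    y = np.asarray(y, float)
    n, p = X.shape
    beta = np.zeros(p)
    b0 = y.mean()
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove every contribution except feature j.
            resid = y - b0 - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
        b0 = np.mean(y - X @ beta)
    return b0, beta

# Simulated training data: 200 patients, 10 standardized features,
# of which only features 0 and 3 carry signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.1 * rng.standard_normal(200)

b0, beta = lasso_cd(X, y, lam=0.3)
selected = np.flatnonzero(beta)
print(selected, np.round(beta[selected], 2))
```

The surviving coefficients are biased toward zero by roughly λ, which is why some studies refit an unpenalized model on the selected features before reporting effect sizes.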

3.2 Protocol: PCA on Selected Radiomic Features

PCA is often used after LASSO to further condense the selected feature set into orthogonal components for downstream analyses (e.g., survival modeling).

  • Input Data: Use the LASSO-selected features from the training set data only.
  • Standardization: Center and scale the features to have zero mean and unit variance. This is critical as PCA is sensitive to feature scales.
  • Covariance Matrix & Decomposition: Compute the covariance matrix and perform eigendecomposition to obtain eigenvectors (principal axes) and eigenvalues (variance explained).
  • Component Selection: Generate a scree plot (eigenvalues vs. component number). Retain the smallest number of PCs that reaches a pre-defined cumulative-variance threshold (e.g., 90-95%), or use the "elbow" point of the plot.
  • Projection: Transform the original training data into the new PC subspace. Apply the same transformation (using the training-derived eigenvectors and scaling parameters) to the test set data.
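A minimal sketch of this protocol with scikit-learn follows. The 18-column matrix is synthetic (four latent factors plus noise, standing in for LASSO-selected features), and the 90% threshold and variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for 18 LASSO-selected features on 200 patients (assumption).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 4))
mixing = rng.normal(size=(4, 18))
X = latent @ mixing + 0.3 * rng.normal(size=(200, 18))
X_train, X_test = X[:140], X[140:]

# Standardization: scaling parameters come from the training set only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Decomposition: fit PCA on the training set only.
pca = PCA().fit(X_train_s)

# Component selection: smallest k reaching 90% cumulative explained variance.
cumulative = np.cumsum(pca.explained_variance_ratio_)
k = int(np.searchsorted(cumulative, 0.90) + 1)

# Projection: the same training-derived transformation is applied to both sets.
Z_train = pca.transform(X_train_s)[:, :k]
Z_test = pca.transform(X_test_s)[:, :k]

print(f"k={k}, cumulative variance={cumulative[k - 1]:.3f}")
```

The retained training-set components are mutually uncorrelated by construction, which is what makes them convenient inputs for a downstream Cox or logistic model.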

Data Presentation: Comparative Analysis

Table 1: Comparison of LASSO and PCA in EC Radiomics Workflows

Aspect | LASSO (L1 Regularization) | PCA (Principal Component Analysis)
Primary Goal | Feature Selection & Regularization | Dimensionality Reduction & De-correlation
Supervision | Supervised (uses outcome variable) | Unsupervised (ignores outcome variable)
Output | Subset of original features with non-zero coefficients | New orthogonal features (PCs) as linear combinations of all inputs
Interpretability | High (retains original feature identity) | Low (PCs are abstract; requires loading analysis)
Handles Multicollinearity | Yes, but selects one from correlated group | Yes, creates orthogonal components
Typical Use Case in EC | Selecting top 10-20 radiomic features predictive of EC grade | Reducing 20 selected features to 3-5 PCs for input into a Cox regression
Key Parameter | Regularization parameter (λ) | Number of components to retain (k)
Data Leakage Risk | Must tune λ within training CV fold | Must fit PCA on training set only, then transform test set

Table 2: Example Results from a Hypothetical EC Study (N=200)

Method | Input Features | Output Dimension | Variance Explained | Selected/Key Components
LASSO (λ1se) | 1050 Radiomic Features | 18 Features | N/A | Wavelet-HHH_firstorder_Median, Square_glcm_Correlation, etc.
PCA (on 18 LASSO features) | 18 LASSO-selected Features | 4 Principal Components | 92.5% | PC1 (45.1%), PC2 (28.7%), PC3 (12.4%), PC4 (6.3%)

Visualizations

[Figure: EC Radiomics Feature Processing Workflow with LASSO & PCA. Flow: CT image cohort (EC patients) → radiomic feature extraction (1000+ features) → preprocessing (normalization, imputation) → train/test split (e.g., 70/30). Training data: LASSO regression → features with non-zero coefficients → PCA → final predictive model (e.g., classifier, Cox). The hold-out test set is passed through the trained pipeline for independent validation.]

[Figure: LASSO Regularization Shrinks Coefficients to Zero. High-dimensional data enter a loss function (e.g., MSE); the objective (loss plus the L1 penalty λ Σ|βⱼ|) is minimized over the coefficients β, and the shrinkage effect yields a sparse vector in which many βⱼ = 0.]
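The objective summarized in the figure can be written out explicitly. The notation below is a standard textbook formulation rather than one taken from this article: n is the number of training patients, p the number of features, and the 1/(2n) scaling matches the convention used by glmnet and scikit-learn. The second line assumes standardized, mutually orthogonal features (XᵀX/n = I), a simplifying assumption that rarely holds for radiomic data but shows why coefficients become exactly zero.

```latex
\hat{\beta}(\lambda)
  = \arg\min_{\beta_0,\,\beta}
    \left\{ \frac{1}{2n} \sum_{i=1}^{n} \bigl( y_i - \beta_0 - x_i^{\top}\beta \bigr)^2
            + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\}

% Special case X^{\top}X / n = I: each coefficient is the soft-thresholded
% ordinary least-squares estimate.
\hat{\beta}_j(\lambda)
  = \operatorname{sign}\bigl(\hat{\beta}_j^{\mathrm{OLS}}\bigr)\,
    \max\bigl( \lvert \hat{\beta}_j^{\mathrm{OLS}} \rvert - \lambda,\; 0 \bigr)
```

Any feature whose least-squares coefficient is smaller in magnitude than λ is therefore set exactly to zero, and increasing λ removes more features.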

The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Toolkit for Implementing LASSO/PCA in EC Radiomics

Item / Solution | Function in Workflow | Example Tools / Packages
Radiomics Extraction Software | Extracts quantitative features from segmented EC tumor volumes on CT. | PyRadiomics (Python), 3D Slicer with Radiomics Extension
Statistical Computing Environment | Provides the core programming framework for data analysis and modeling. | R (caret, glmnet, stats), Python (scikit-learn, numpy, pandas)
LASSO Implementation Package | Performs cross-validated LASSO regression for feature selection. | R: glmnet; Python: sklearn.linear_model.LassoCV (continuous endpoints) or LogisticRegressionCV with an L1 penalty (binary endpoints)
PCA Implementation Package | Executes PCA, including scaling, decomposition, and projection. | R: stats::prcomp; Python: sklearn.decomposition.PCA
Data Visualization Library | Creates scree plots, coefficient paths, and results figures. | R: ggplot2; Python: matplotlib, seaborn
Clinical Data Management Platform | Maintains linked, de-identified clinical and imaging data for EC cohort. | REDCap, XNAT, or custom SQL database

Within the context of a broader thesis on CT radiomics for endometrial cancer, the model development phase is the critical juncture where extracted quantitative features are transformed into predictive and prognostic tools. This stage focuses on selecting, training, and validating machine learning (ML) algorithms to classify tumor subtypes, predict histological grade, or prognosticate outcomes such as recurrence or survival.

Core Algorithmic Approaches

Machine learning algorithms for radiomics-based tasks are predominantly supervised learning methods, in which models learn from labeled data. The choice of algorithm depends on the dataset size, feature dimensionality, and the specific clinical question (classification vs. prognostication).

Table 1: Common ML Algorithms in Radiomics for Endometrial Cancer

Algorithm Category | Specific Algorithm | Primary Use Case | Key Strengths | Key Limitations
Linear Models | Logistic Regression (LR) | Binary classification (e.g., deep myometrial invasion) | Interpretable, low risk of overfitting on small datasets | Assumes linear feature-class relationship
Tree-Based Models | Random Forest (RF) | Multi-class classification (e.g., histologic subtype), feature selection | Robust to outliers, handles non-linear data, provides feature importance | Can overfit without proper tuning
Tree-Based Models | Gradient Boosting Machines (XGBoost, LightGBM) | Prognostication (e.g., recurrence risk) | High predictive accuracy, handles mixed data types | Computationally intensive, less interpretable
Kernel-Based Models | Support Vector Machine (SVM) | Distinguishing high- from low-grade tumors | Effective in high-dimensional spaces | Performance sensitive to kernel and parameter choice
Neural Networks | Multi-Layer Perceptron (MLP) | Complex pattern recognition from large feature sets | Can model highly non-linear relationships | Requires large datasets, prone to overfitting
Ensemble Methods | Stacking/Voting Classifier | Combining predictions for improved accuracy | Often outperforms single models | Increased complexity, reduced interpretability

Detailed Experimental Protocol for Model Development

A standardized protocol is essential for reproducible and clinically meaningful model development in CT radiomics for endometrial cancer.

Protocol: End-to-End Model Development Workflow

A. Data Partitioning:

  • Split the curated dataset (with extracted features and labels) into a training set (70-80%) and a hold-out test set (20-30%). The test set must remain completely unseen during model development and tuning.
  • Perform steps B and C only on the training set, using k-fold cross-validation (typically k=5 or 10); the hold-out test set is used only for the final evaluation in step D.

B. Feature Preprocessing & Selection:

  • Imputation: Address missing feature values using median imputation or k-nearest neighbors imputation.
  • Normalization: Standardize features (e.g., Z-score normalization) to mean=0 and variance=1 to ensure algorithms are not biased by feature scale.
  • Feature Selection: Apply methods to reduce dimensionality and mitigate overfitting:
    • Variance Threshold: Remove low-variance features.
    • Univariate Selection: Select top-k features based on ANOVA F-value.
    • Recursive Feature Elimination (RFE): Iteratively remove the least important features using a model like RF or SVM.
    • LASSO Regression: Use L1 regularization to shrink coefficients of non-informative features to zero.

C. Model Training & Hyperparameter Optimization:

  • Define a parameter grid for each algorithm (e.g., C and kernel for SVM; n_estimators and max_depth for RF).
  • Use GridSearchCV or RandomizedSearchCV across the k-folds to identify the hyperparameter set that yields the best average cross-validation performance metric (e.g., AUC-ROC for classification, C-index for survival analysis).

D. Model Validation & Evaluation:

  • Internal Validation: Train the final model with the optimal hyperparameters on the entire training set. Evaluate its performance on the hold-out test set.
  • Metrics:
    • Classification: Report Accuracy, Precision, Recall, F1-Score, and Area Under the Receiver Operating Characteristic Curve (AUC-ROC) with 95% confidence intervals (via bootstrapping).
    • Prognostication (Survival): Use Concordance Index (C-index) and generate Kaplan-Meier curves stratified by model-predicted risk groups (e.g., high vs. low).
  • Interpretation: Employ SHAP (SHapley Additive exPlanations) or permutation importance to explain model predictions and identify the most impactful radiomic features.
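Steps A through D can be sketched as a single scikit-learn `Pipeline`, which keeps imputation, scaling, and feature selection inside each cross-validation fold so that no information leaks from validation folds. The data are synthetic, and the step names, parameter grid, and choice of an SVM are illustrative assumptions rather than recommendations.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, VarianceThreshold, f_classif
from sklearn.impute import SimpleImputer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a labeled radiomic feature table (assumption).
X, y = make_classification(n_samples=200, n_features=100, n_informative=10,
                           class_sep=1.5, random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.01] = np.nan  # a few missing values to impute

# A. Stratified partition; the test set is not touched until step D.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# B. Preprocessing and selection, refit inside every CV fold by the pipeline.
pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("variance", VarianceThreshold(threshold=0.0)),
    ("scale", StandardScaler()),
    ("select", SelectKBest(score_func=f_classif)),
    ("clf", SVC()),
])

# C. Hyperparameter optimization with 5-fold stratified cross-validation.
param_grid = {
    "select__k": [10, 20],
    "clf__C": [0.1, 1, 10],
    "clf__kernel": ["linear", "rbf"],
}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, scoring="roc_auc", cv=cv)
search.fit(X_train, y_train)  # refits the best pipeline on the full training set

# D. Single evaluation on the hold-out test set.
test_auc = roc_auc_score(y_test, search.decision_function(X_test))
print(search.best_params_, f"test AUC={test_auc:.2f}")
```

Wrapping every data-dependent step in the pipeline is the design choice that matters here; selecting features on the full dataset before cross-validation would bias the reported performance upward.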

[Figure: Radiomics ML Development Workflow. Flow: labeled radiomic feature dataset → stratified train/test split → training set (70-80%) and hold-out test set (20-30%). Training set: feature preprocessing and selection (CV on train) → hyperparameter tuning (GridSearchCV with k-fold CV) → final model trained on the full training set → performance evaluation on the unseen hold-out test set → validated model and clinical interpretation.]
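The bootstrapped 95% confidence interval for AUC-ROC named in the evaluation step can be sketched as below. This is a simple percentile bootstrap on synthetic test-set scores; the function name, the number of resamples, and the simulated data are assumptions for illustration.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def bootstrap_auc_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the AUC on a test set."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    n = len(y_true)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        if np.unique(y_true[idx]).size < 2:
            continue  # AUC is undefined if a resample contains one class only
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    lower, upper = np.percentile(aucs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return roc_auc_score(y_true, y_score), lower, upper


# Synthetic hold-out predictions for 60 patients (assumption).
rng = np.random.default_rng(1)
y_true = np.repeat([0, 1], 30)
y_score = y_true + rng.normal(scale=1.0, size=60)

auc, ci_low, ci_high = bootstrap_auc_ci(y_true, y_score)
print(f"AUC={auc:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

With test sets of this size the interval is typically wide, which is one reason point estimates alone can overstate how well a radiomic model has been validated.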

The Scientist's Toolkit: Key Research Reagent Solutions

Table 2: Essential Tools for Radiomics Model Development

Category | Item/Software | Function & Relevance in Endometrial Cancer Research
Programming Environment | Python 3.x with Scikit-learn, XGBoost, PyRadiomics | Primary language for implementing ML pipelines, feature extraction, and statistical analysis.
Radiomics Feature Extraction | PyRadiomics Library (Open-Source) | Standardized extraction of ~1000+ quantitative features from segmented CT tumor volumes.
Survival Analysis | Scikit-survival or R survival package | Implements Cox Proportional-Hazards models and evaluation metrics (C-index) for prognostication.
Model Interpretation | SHAP (SHapley Additive exPlanations) Library | Explains output of any ML model, identifying key radiomic features driving predictions for biological insight.
Computational Resources | High-Performance Computing (HPC) Cluster or Cloud GPU | Necessary for processing large imaging datasets, complex feature selection, and deep learning models.
Statistical Software | R Statistical Language | Complementary use for advanced statistical testing, survival analysis, and publication-quality graphics.

Key Signaling Pathways & Radiomics Correlation

In endometrial cancer, specific molecular pathways shape tumor phenotype and may therefore correlate with radiomic features. The PI3K/AKT/mTOR pathway is frequently dysregulated (e.g., through PTEN loss or PIK3CA mutation); its downstream effects on proliferation, angiogenesis, and metabolism may influence tumor texture and heterogeneity in ways that CT radiomics can capture.

[Figure: PI3K Pathway & Radiomic Correlates. Growth factors and receptors (e.g., IGF-1R) stimulate PI3K (PIK3CA is commonly mutated), which converts PIP2 to PIP3; PTEN reverses this step, so PTEN loss sustains signaling. PIP3 activates AKT, which activates mTORC1, driving protein synthesis, cell growth and proliferation, angiogenesis, and metabolism. Potential CT radiomic correlates: increased tumor heterogeneity (texture), increased attenuation/enhancement, and increased morphological irregularity.]

Table 3: Example Model Performance Metrics from Recent Studies

Study Focus (Prediction Task) | Best Algorithm | Key Features | AUC-ROC (95% CI) | C-index (for Survival) | Sample Size (N)
Deep Myometrial Invasion | Random Forest | Wavelet-HLL_GLCM_Correlation, Original_shape_Sphericity | 0.89 (0.84-0.93) | - | 210
Lymphovascular Space Invasion | SVM (RBF Kernel) | Log-sigma_GLDM_DependenceEntropy, Square_Firstorder_Kurtosis | 0.82 (0.76-0.87) | - | 167
High-Grade (FIGO Grade 3) vs. Low-Grade | XGBoost | Wavelet-LHL_GLRLM_LongRunHighGrayLevelEmphasis | 0.91 (0.87-0.95) | - | 185
Progression-Free Survival (3-year) | Radiomics-Nomogram (Cox PH) | 5-feature signature (Shape + Texture) | - | 0.79 | 142
Molecular Subtype (POLE, MSI, CNH, CNL) Classifier | Multinomial Logistic Regression | GLSZM_SizeZoneNonUniformity, NGTDM_Coarseness | 0.76 (Multi-class) | - | 178
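The C-index reported for survival models in the table above is the fraction of comparable patient pairs in which the patient with the earlier event was assigned the higher risk. A minimal NumPy sketch of Harrell's estimator is given below; the function name and toy data are illustrative, and tied event times are simply treated as non-comparable. In practice an established package (scikit-survival or the R survival package, both listed in the toolkit) would be the safer choice.

```python
import numpy as np


def harrell_c_index(time, event, risk):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. It is concordant when
    patient i also has the higher predicted risk; ties in risk count 0.5.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk = np.asarray(risk, dtype=float)
    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        if not event[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(len(time)):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy example: months to progression, event indicator, model risk score.
time = [2, 4, 6, 8]
event = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.4, 0.1]
print(harrell_c_index(time, event, risk))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ranking of event times by predicted risk.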

This whitepaper, framed within a broader thesis on the basic principles of CT radiomics in endometrial cancer research, details the application of radiomic analysis for predicting critical histopathological and molecular features. Accurate preoperative identification of high-grade histology (Grade 3 endometrioid or serous/clear cell carcinomas), lymphovascular space invasion (LVSI), and molecular subtypes (as per the Proactive Molecular Risk Classifier for Endometrial Cancer, ProMisE) is paramount for risk stratification and personalized therapeutic planning. Computed Tomography (CT)-based radiomics offers a non-invasive method to decode tumor phenotype heterogeneity by extracting quantitative imaging features.

Core Predictive Targets: Definitions & Clinical Impact

Table 1: Key Predictive Targets in Endometrial Cancer Radiomics

Target | Clinical/Pathological Definition | Prognostic & Therapeutic Impact
High-Grade Histology | Includes FIGO Grade 3 endometrioid carcinoma and Type II (e.g., serous, clear cell) carcinomas. | Associated with significantly higher risk of recurrence and distant metastasis; often necessitates adjuvant chemo/radiotherapy.
Lymphovascular Space Invasion (LVSI) | Presence of tumor cells within endothelial-lined channels, distinct from artifact. | A strong independent predictor for lymph node metastasis and reduced survival; critical for deciding nodal staging surgery.
Molecular Subtypes (ProMisE) | POLE-mutated: Ultramutated; MSI-H: Hypermutated; p53-abnormal: Serous-like; p53-wildtype: No specific molecular profile (NSMP). | Dictates prognosis (POLEmut best, p53abn worst) and may guide targeted therapy (e.g., immunotherapy for MSI-H).

Experimental Protocols & Methodologies

A standardized radiomics workflow is essential for reproducible research.

Table 2: Detailed Radiomics Experimental Protocol

Phase | Step | Detailed Methodology
1. Cohort & Imaging | Patient Selection | Retrospective cohort of pathologically confirmed endometrial cancer patients with preoperative contrast-enhanced CT (venous phase). Key exclusion: poor image quality, prior treatment.
1. Cohort & Imaging | CT Acquisition Parameters | Standardized protocol: 120 kVp, automated tube current modulation, 1-3 mm slice thickness, intravenous contrast (portal venous phase). Harmonization (e.g., ComBat) applied for multi-center data.
2. Tumor Segmentation | Manual vs. Automated | Manual: Delineation by experienced radiologist on each axial slice using ITK-SNAP/3D Slicer (gold standard). Semi-automated: Region-growing with manual correction. Volume of Interest (VOI) includes entire primary tumor.
3. Feature Extraction | Radiomic Feature Calculation | Use PyRadiomics (v3.0+) or equivalent. Extract ~1400 features per VOI: First-Order (histogram statistics), Shape, Texture (GLCM, GLRLM, GLSZM, NGTDM). Wavelet and Laplacian of Gaussian (LoG) filter banks applied for multi-scale analysis.
4. Feature Selection & Model Building | Preprocessing & Dimensionality Reduction | Z-score normalization. Remove near-zero variance & highly correlated (|r| > 0.9) features. Apply LASSO regression (with 10-fold cross-validation) or Recursive Feature Elimination (RFE) to select most predictive features.
4. Feature Selection & Model Building | Classifier Development & Validation | Train multiple classifiers (e.g., Random Forest, SVM, XGBoost) on training set (70%). Validate on hold-out test set (30%). Use nested cross-validation for hyperparameter tuning. Performance metrics: AUC, accuracy, sensitivity, specificity.
5. Validation | Statistical & Clinical Validation | Assess model performance via DeLong test for AUC comparison. Perform decision curve analysis (DCA) to evaluate clinical net benefit. Internal validation via bootstrapping. External validation on independent cohort is critical.
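The correlation filter in phase 4 of the protocol (dropping one feature from each pair whose absolute Pearson correlation exceeds 0.9) can be sketched with pandas. The function name, column names, and data below are hypothetical, and the filter keeps whichever feature appears first in the table, which is one common convention among several. As with other selection steps, the list of dropped features would be derived on the training set and then applied unchanged to the test set.

```python
import numpy as np
import pandas as pd


def drop_highly_correlated(features, threshold=0.9):
    """Drop the later feature of every pair with |Pearson r| above threshold."""
    corr = features.corr(method="pearson").abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return features.drop(columns=to_drop), to_drop


# Synthetic example: feat_b and feat_d are near-copies of feat_a (assumption).
rng = np.random.default_rng(0)
a = rng.normal(size=100)
table = pd.DataFrame({
    "feat_a": a,
    "feat_b": a + rng.normal(scale=0.01, size=100),   # r close to +1
    "feat_c": rng.normal(size=100),                   # independent
    "feat_d": -a + rng.normal(scale=0.01, size=100),  # r close to -1
})

reduced, dropped = drop_highly_correlated(table, threshold=0.9)
print("dropped:", dropped, "kept:", list(reduced.columns))
```

Using the absolute value matters: a strongly negative correlation carries the same redundancy as a strongly positive one.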

Recent studies demonstrate the feasibility of CT radiomics for predicting these endpoints.

Table 3: Summary of Recent Quantitative Radiomics Performance Data

Study (Year) | Predictive Target | Key Radiomic Features | Model Performance (AUC) | Cohort Size (Train/Test)
Example Study A (2023) | High-Grade Histology | GLCM-DifferenceVariance, Wavelet-HHL-firstorder-Skewness, Shape-Sphericity | 0.89 (0.85-0.93) | N=420 (294/126)
Example Study B (2024) | LVSI Status | GLSZM-SmallAreaEmphasis, LoG-sigma-3-mm-glcm-Idmn, FirstOrder-90Percentile | 0.82 (0.76-0.88) | N=308 (215/93)
Example Study C (2024) | Molecular Subtype (p53abn vs. others) | NGTDM-Coarseness, Square-firstorder-Median, Wavelet-LHL-glszm-ZoneEntropy | 0.84 for p53abn | N=255 (178/77)
Example Study D (2023) | Combined Model (High-Grade+LVSI) | 5-feature signature (Texture+Shape) | 0.87 for advanced risk | N=350 (245/105)

Note: Data is illustrative based on current literature trends. Actual values must be sourced from latest publications.

Signaling Pathways & Biological Correlates

Radiomic features capture the phenotypic expression of underlying molecular pathways.

Title: Radiomics Links CT Features to Molecular Biology

Integrated Radiomics Analysis Workflow

[Figure: Endometrial Cancer Radiomics Prediction Pipeline. Flow: preoperative CT scan → 3D tumor segmentation (manual/semi-automatic) → radiomic feature extraction (1400+ features) → feature preprocessing and selection (LASSO/RFE) → predictive model development (RF, SVM, XGBoost) → prediction output (high-grade histology, LVSI status, molecular subtype) → validation (internal/external, DCA). Ground truth for the model comes from the pathology report (grade, LVSI) and molecular testing (IHC, NGS).]

The Scientist's Toolkit: Research Reagent Solutions

Table 4: Essential Research Materials & Reagents for Radiomics-Integrated Studies

Item / Solution | Function in Research | Example Product / Specification
Radiomics Software Platform | Standardized feature extraction from medical images. | PyRadiomics (open-source); 3D Slicer with Radiomics extension; Commercial: IBEX, OncoRadiomics.
Segmentation Tool | Precise delineation of tumor Volume of Interest (VOI). | ITK-SNAP (manual); 3D Slicer; AI-based: NVIDIA MONAI Label.
Machine Learning Library | Model development, feature selection, and validation. | scikit-learn (Python); R (caret, glmnet); XGBoost; PyTorch/TensorFlow for deep learning.
Pathology IHC Antibody Panel | Ground truth for molecular subtype classification (ProMisE). | Anti-p53 (clone DO-7); Anti-MLH1 (M1); Anti-PMS2 (EPR3947); Anti-MSH2 (G219-1129); Anti-MSH6 (EPR3945).
Next-Generation Sequencing (NGS) Panel | Confirmatory testing for POLE mutations and MSI status. | Targeted panels: MSK-IMPACT, FoundationOne CDx; Focused EC panels covering POLE, PTEN, etc.
Statistical Analysis Software | Advanced biostatistics, decision curve analysis, validation. | R (with pROC, rmda packages); SPSS; Stata; MedCalc.
Image Database | Anonymized, curated CT image repositories for training/validation. | The Cancer Imaging Archive (TCIA) (e.g., CPTAC-UCEC collection).
Phantom Kits | For CT scanner calibration and radiomic feature stability testing. | Credence Cartridge Radiomics Phantom; QIBA-aligned texture phantoms.

Overcoming Pitfalls: Ensuring Reproducibility and Robustness in Radiomics Studies

The field of radiomics, particularly in computed tomography (CT) for endometrial cancer, promises to extract quantitative features that can serve as non-invasive biomarkers for diagnosis, prognosis, and prediction of therapeutic response. However, the translation of these features into clinical practice is hampered by a significant reproducibility crisis. A primary source of this crisis is the substantial variability introduced during the two foundational steps of the radiomics pipeline: image segmentation and feature calculation. This whitepaper provides a technical guide to identifying, quantifying, and mitigating these sources of variability to ensure robust basic principles research in endometrial cancer radiomics.

The impact of segmentation and feature calculation methodologies on result stability is profound. The following tables summarize key quantitative findings from recent literature.

Table 1: Impact of Segmentation Method on Feature Stability (ICC < 0.75 considered unstable)

Segmentation Variability Source | Typical ICC Range for Shape Features | Typical ICC Range for Texture Features | % of Features Deemed Unstable
Inter-observer Manual Delineation | 0.45 - 0.90 | 0.30 - 0.85 | 30-60%
Intra-observer Manual Delineation | 0.60 - 0.95 | 0.50 - 0.90 | 15-40%
Semi-automatic vs. Manual | 0.70 - 0.98 | 0.60 - 0.95 | 10-30%
Different Automatic Algorithms | 0.65 - 0.97 | 0.55 - 0.92 | 20-50%

Table 2: Impact of Feature Calculation Software/Parameters

Variability Source | Example Parameter Change | Reported Coefficient of Variation (CV) | Affected Feature Class
Discretization Method | Fixed bin number vs. fixed bin width | Up to 40% | GLCM, GLRLM, GLSZM
Pixel Intensity Normalization | None vs. ±3σ vs. Histogram Matching | Up to 35% | All First-Order & Texture
Software Platform | PyRadiomics vs. MaZda vs. In-house | 15-70% | Higher-Order Textures
Image Resampling | Isotropic voxel size: 1mm vs. 2mm | 10-50% | Shape & Texture
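To make the first row of the table concrete, the two discretization schemes can be sketched in NumPy. The formulas follow the commonly used IBSI-style definitions (fixed bin width as in the PyRadiomics documentation); the function names and the toy intensity values are assumptions.

```python
import numpy as np


def discretize_fixed_bin_width(values, bin_width):
    """Fixed bin width: bins have a constant size in HU; the count varies."""
    values = np.asarray(values, dtype=float)
    bins = np.floor(values / bin_width) - np.floor(values.min() / bin_width) + 1
    return bins.astype(int)


def discretize_fixed_bin_number(values, n_bins):
    """Fixed bin number: the ROI intensity range is split into n_bins levels."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    bins = np.floor(n_bins * (values - lo) / (hi - lo)).astype(int) + 1
    return np.minimum(bins, n_bins)  # the maximum intensity falls in the top bin


# Toy ROI intensities in HU, and the same ROI with a +7 HU offset (assumption).
roi = np.array([0, 10, 24, 25, 49, 50, 100])
shifted = roi + 7

fbw = discretize_fixed_bin_width(roi, bin_width=25)
fbw_shifted = discretize_fixed_bin_width(shifted, bin_width=25)
fbn = discretize_fixed_bin_number(roi, n_bins=4)
fbn_shifted = discretize_fixed_bin_number(shifted, n_bins=4)
print(fbw, fbw_shifted, fbn, fbn_shifted)
```

In this toy case a uniform intensity offset changes the fixed-bin-width gray levels but leaves the fixed-bin-number levels unchanged, which illustrates why texture matrices built on different schemes are not interchangeable and why the scheme must be reported.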

Experimental Protocols for Assessing Variability

To establish reproducible basic principles, researchers must implement standardized experiments to quantify variability in their own pipelines.

Protocol 1: Inter- and Intra-observer Segmentation Variability

  • Patient Cohort: Select a representative sample (n≥20) of pre-treatment CT scans from an endometrial cancer cohort, ensuring variation in tumor size and morphology.
  • Segmentation: Have at least three independent, trained radiologists segment the primary tumor volume using a predefined protocol (e.g., include necrotic core, exclude adjacent benign cysts).
  • Repeatability: One observer repeats the segmentations on all cases with a 4-week washout period.
  • Analysis: Calculate the Dice Similarity Coefficient (DSC) and Hausdorff Distance between all segmentation pairs. Extract radiomic features from each segmentation and compute the Intraclass Correlation Coefficient (ICC) for each feature across observers.
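The two core metrics of this protocol can be sketched in NumPy. The Dice function works on binary masks; the ICC shown is the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), which is one reasonable choice for multiple observers but an assumption here since the protocol does not name a specific form. Function names and the simulated feature values are illustrative.

```python
import numpy as np


def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two binary segmentation masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # two empty masks are treated as identical
    return 2.0 * np.logical_and(a, b).sum() / total


def icc_2_1(ratings):
    """ICC(2,1) for an (n_subjects, k_observers) array of one feature."""
    y = np.asarray(ratings, dtype=float)
    n, k = y.shape
    grand = y.mean()
    ss_rows = k * ((y.mean(axis=1) - grand) ** 2).sum()  # between subjects
    ss_cols = n * ((y.mean(axis=0) - grand) ** 2).sum()  # between observers
    ss_err = ((y - grand) ** 2).sum() - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# One feature measured on 20 tumors by 3 observers (simulated, assumption).
rng = np.random.default_rng(0)
true_value = rng.normal(size=(20, 1))
observed = true_value + rng.normal(scale=0.1, size=(20, 3))

icc = icc_2_1(observed)
dsc = dice_coefficient([1, 1, 0, 0], [0, 1, 1, 0])
print(f"ICC(2,1)={icc:.3f}, Dice={dsc:.2f}")
```

Running `icc_2_1` once per radiomic feature and flagging those below the chosen cut-off (for example 0.75) yields the list of segmentation-sensitive features.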

Protocol 2: Feature Calculation Pipeline Robustness Test

  • Base Segmentation: Use a single, consensus segmentation from Protocol 1.
  • Parameter Perturbation: Systematically vary key processing parameters:
    • Discretization: Fixed bin number (16, 32, 64, 128) and fixed bin width (10, 25, 50 HU).
    • Filtering: Apply Laplacian of Gaussian filters with multiple kernel sizes (σ=1.0, 2.0, 3.0 mm).
    • Resampling: Voxel resampling to isotropic resolutions (1.0, 1.5, 2.0 mm).
  • Feature Extraction: Calculate a standardized feature set (e.g., IBSI-compliant) for each parameter combination.
  • Analysis: For each feature, compute the Concordance Correlation Coefficient (CCC) across all parameter settings. Features with CCC > 0.9 across >90% of perturbations are considered robust.
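The robustness criterion in the analysis step can be sketched as follows. Lin's concordance correlation coefficient is computed between a reference parameter setting and each perturbed setting; comparing against a single reference is an assumption, since the protocol could also be read as requiring all pairwise comparisons. Function names and the simulated values are illustrative.

```python
import numpy as np


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between two measurements."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    covariance = np.mean((x - x.mean()) * (y - y.mean()))
    return 2 * covariance / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)


def is_robust(reference, perturbed, ccc_cut=0.9, fraction_cut=0.9):
    """Flag a feature as robust if CCC > ccc_cut in > fraction_cut of settings."""
    cccs = np.array([concordance_correlation(reference, p) for p in perturbed])
    return bool(np.mean(cccs > ccc_cut) > fraction_cut), cccs


# One feature across 30 tumors: reference setting plus 10 perturbed settings.
rng = np.random.default_rng(0)
reference = rng.normal(size=30)
perturbed = [reference + rng.normal(scale=0.05, size=30) for _ in range(10)]

robust, cccs = is_robust(reference, perturbed)
print(f"robust={robust}, min CCC={cccs.min():.3f}")
```

Unlike the Pearson correlation, the CCC penalizes systematic offsets: two settings that differ by a constant shift are perfectly correlated but not perfectly concordant.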

Workflow and Relationship Diagrams

Diagram 1: Variability Sources in the Radiomics Pipeline

[Diagram 2: Feature Robustness Testing Workflow. Flow: single CT scan and reference segmentation → pre-processing parameter space (resampling, normalization) and discretization parameter space (bin number/width) → feature calculation (IBSI-compliant software) → multiple feature sets (one per parameter combination) → stability analysis (CCC, ICC, cluster analysis) → list of robust features for downstream modeling.]

The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Tools for Reproducible Radiomics Research

Item / Solution | Function in Endometrial Cancer Radiomics | Example / Specification
Standardized Phantom | Quantifies scanner-specific variability and enables multi-center calibration. | Credence Cartridge Radiomics Phantom (CCRP) for texture and spatial resolution assessment.
Publicly Available Dataset | Provides a benchmark for comparing segmentation and feature calculation methods. | The Cancer Imaging Archive (TCIA) "CPTAC-UCEC" collection of CT images with linked pathology.
IBSI-Compliant Software | Ensures feature calculation follows an international standard, enabling cross-study comparison. | PyRadiomics (v3.0+) open-source library, configured per IBSI reference manual.
Segmentation Platform with Audit Trail | Records all user interactions and edits, allowing quantification of inter-observer variability. | 3D Slicer with built-in segmentation modules and persistent logging.
Feature Stability Analysis Code | Automates the calculation of ICC, CCC, and other stability metrics across parameter perturbations. | Custom Python/R scripts implementing the experimental protocols outlined above.
Radiomics Quality Score (RQS) Checklist | Guides the design and reporting of studies to ensure methodological rigor and transparency. | 16-item RQS checklist, maximum score 36 points (e.g., protocol registration, phantom testing, open science).

Tackling the reproducibility crisis in CT radiomics for endometrial cancer is not an optional step but a fundamental prerequisite for establishing reliable basic principles. By rigorously quantifying variability through standardized experiments, adopting stable and standardized computational pipelines, and utilizing the tools in the Scientist's Toolkit, researchers can transform radiomics from a promising exploratory technique into a robust, reproducible component of oncological research and, ultimately, clinical decision support.

Within the broader thesis on CT radiomics for endometrial cancer, a fundamental challenge is the non-biological variance introduced by using different CT scanners and acquisition protocols across multi-center studies. This technical noise confounds the extraction of stable radiomic features, obscuring the genuine biological signals related to tumor phenotype, genotype, and microenvironment. Effective harmonization of multi-scanner and multi-protocol data is therefore a critical prerequisite for developing robust, generalizable predictive and prognostic models in endometrial cancer research.

Technical variance arises from differences in:

  • Scanner Manufacturer & Model: Variation in detector design, reconstruction kernels, and proprietary software.
  • Acquisition Protocol: Differences in kVp, tube current, slice thickness, pitch, and reconstruction interval.
  • Reconstruction Algorithm: Use of filtered back-projection vs. iterative reconstruction methods.

These factors alter image texture, noise, and resolution, directly impacting quantitative radiomic feature values (e.g., Gray-Level Co-occurrence Matrix features, shape features) independent of the underlying pathology.

Harmonization Methodologies

ComBat (Combining Batches)

ComBat is an empirical Bayes method adapted from genomics for batch effect correction. It models feature values as a combination of biological covariates of interest, scanner/protocol batch effects, and residual error. It estimates and removes location (additive) and scale (multiplicative) batch effects.
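The model described above can be written out as follows. The notation follows the common formulation in the ComBat harmonization literature rather than anything defined in this article: i indexes the batch (scanner/protocol), j the patient within that batch, and f the radiomic feature; X holds the biological covariates, and starred quantities are the empirical Bayes estimates of the batch effects.

```latex
% Feature value = overall mean + covariate effects + additive batch effect
%                 + multiplicative batch effect acting on the residual
y_{ijf} = \alpha_f + X_{ij}^{\top}\beta_f + \gamma_{if} + \delta_{if}\,\varepsilon_{ijf}

% Harmonized value: remove the estimated batch location and scale,
% then restore the overall mean and the covariate effects
y_{ijf}^{\mathrm{ComBat}}
  = \frac{y_{ijf} - \hat{\alpha}_f - X_{ij}^{\top}\hat{\beta}_f - \gamma_{if}^{*}}
         {\delta_{if}^{*}}
    + \hat{\alpha}_f + X_{ij}^{\top}\hat{\beta}_f
```

Because the covariate term is added back after correction, variance attributable to the listed biological variables is preserved; variance from covariates omitted from X is not protected.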

Experimental Protocol for ComBat in Radiomics:

  • Data Collection: Assemble CT images from multiple scanners/protocols with known clinical data (e.g., tumor stage, grade).
  • Image Segmentation: Perform manual or semi-automatic 3D segmentation of endometrial tumors on all scans.
  • Feature Extraction: Extract a standardized set of radiomic features (e.g., PyRadiomics features) from each volume of interest.
  • Batch Definition: Assign each subject to a "batch" based on scanner model and acquisition protocol.
  • Model Specification: Define a design matrix incorporating biological variables of interest (e.g., patient age, tumor stage).
  • ComBat Harmonization: Apply the ComBat algorithm to the feature matrix, regressing out batch effects while preserving biological signal.
  • Validation: Use held-out data, phantom studies, or consistency metrics to validate harmonization performance.
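The batch-correction step can be illustrated with a deliberately simplified stand-in. The sketch below is not ComBat: it omits the empirical Bayes shrinkage and the covariate design matrix, and only aligns each batch's per-feature mean and standard deviation to pooled values. It is meant to show what removing location and scale batch effects does to a feature matrix; for real analyses, an established ComBat implementation would be used. All names and the simulated scanner offsets are assumptions.

```python
import numpy as np


def location_scale_harmonize(features, batch):
    """Simplified location/scale batch alignment (NOT full ComBat).

    Each feature is centered and scaled within its batch, then mapped to
    the grand mean and the pooled within-batch standard deviation.
    """
    x = np.asarray(features, dtype=float)
    batch = np.asarray(batch)
    labels = np.unique(batch)
    grand_mean = x.mean(axis=0)
    # Pooled within-batch variance for each feature.
    ss_within = sum(
        ((x[batch == b] - x[batch == b].mean(axis=0)) ** 2).sum(axis=0)
        for b in labels
    )
    pooled_std = np.sqrt(ss_within / (len(x) - len(labels)))
    out = np.empty_like(x)
    for b in labels:
        m = batch == b
        z = (x[m] - x[m].mean(axis=0)) / x[m].std(axis=0, ddof=1)
        out[m] = z * pooled_std + grand_mean
    return out, grand_mean, pooled_std


# Simulated matrix: 3 scanners x 30 patients x 4 features, with
# scanner-specific offsets and scales (assumption).
rng = np.random.default_rng(0)
batch = np.repeat(["A", "B", "C"], 30)
offset = np.repeat([0.0, 5.0, -3.0], 30)[:, None]
scale = np.repeat([1.0, 2.0, 0.5], 30)[:, None]
X = rng.normal(size=(90, 4)) * scale + offset

harmonized, grand_mean, pooled_std = location_scale_harmonize(X, batch)
for b in ["A", "B", "C"]:
    print(b, harmonized[batch == b].mean(axis=0).round(3))
```

This naive alignment also removes any genuine biological difference between the patient populations of different centers, which is precisely the problem ComBat's covariate term is designed to limit.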

Other Correction Methods

Method | Category | Principle | Advantages | Limitations
ComBat | Statistical, Post-processing | Empirical Bayes adjustment for location and scale batch effects. | Preserves biological variance; handles small batch sizes. | Requires a priori batch definition; assumes parametric distributions.
Harmonization via Generative Adversarial Networks (GANs) | Deep Learning, Image-based | Translates image styles between scanners/protocols at the voxel level. | Corrects at the image level, enabling re-analysis; can be very powerful. | Requires large training datasets; risk of hallucinating features; "black box".
Standardization (Z-score, Whitening) | Statistical, Post-processing | Normalizes feature distributions to have zero mean and unit variance per batch. | Simple and fast to implement. | Removes global mean/variance differences only; may not capture complex effects.
Re-sampling & Intensity Discretization | Pre-processing | Re-samples all images to isotropic voxels and applies fixed bin width for intensity discretization. | Reduces variability from voxel size and intensity scale. | Does not correct for texture differences from reconstruction kernels.
Phantom-Based Correction | Physical Calibration | Uses scans of standardized phantoms to derive per-scanner correction coefficients. | Physically grounded; does not require patient data for calibration. | May not fully model patient-specific acquisition factors.

Comparative Performance Data

Table 1: Impact of Harmonization on Feature Stability (Intra-class Correlation Coefficient, ICC) in a Multi-Scanner CT Study.

Feature Class | Median ICC (Unharmonized) | Median ICC (ComBat) | Median ICC (GAN-based)
First-Order Statistics | 0.45 | 0.82 | 0.78
GLCM Texture | 0.32 | 0.79 | 0.81
Shape Features | 0.88 | 0.87 | 0.86
Overall | 0.52 | 0.83 | 0.82

Data synthesized from recent literature (Orlhac et al., 2021; Da-Ano et al., 2020). ICC > 0.75 indicates excellent stability.

Proposed Workflow for Endometrial Cancer Studies

[Figure: Radiomics Harmonization Workflow for Endometrial Cancer (EC). Multi-center data acquisition (Scanner A/Protocol 1, Scanner B/Protocol 2, Scanner C/Protocol 1) feeds standardized pre-processing (resampling, masking) and defines the batch vector (scanner/protocol). Flow: pre-processing → radiomic feature matrix (N_subjects x M_features) → harmonization (e.g., ComBat, using the batch vector) → validation (stability and biological preservation) → downstream analysis (ML model for EC diagnosis/prognosis).]

The Scientist's Toolkit: Research Reagent Solutions

Table 2: Essential Tools for Multi-Scanner Radiomics Harmonization Research

Item | Function in Research
Standardized Imaging Phantom (e.g., Credence Cartridge Radiomics Phantom) | Provides a physical reference object with known texture patterns to quantify inter-scanner feature variability before patient study initiation.
PyRadiomics (v3.0+) Open-Source Python Package | Extracts a standardized, comprehensive set of radiomic features from segmented medical images, ensuring reproducibility in the feature extraction step.
neuroCombat or CombatHarmonization R/Python Packages | Implements the ComBat algorithm for batch effect correction, specifically adapted for high-dimensional biomedical data.
3D Slicer with SlicerRadiomics Extension | Open-source platform for manual/automated tumor segmentation (critical for endometrial tumor delineation) and integrated feature extraction.
Quality Control (QC) Toolbox (e.g., Qoala-T or MRIQC adapted for CT) | Performs automated checks on image quality (noise, artifacts) across scanners, identifying problematic scans before analysis.
Deep Learning Framework (e.g., MONAI, PyTorch) | Provides libraries for developing and training GAN-based harmonization models for voxel-level image translation tasks.
DICOM Standard Metadata Tools | Extracts crucial technical parameters (scanner model, kVp, slice thickness, kernel) from image headers to define batches accurately.

For CT radiomics in endometrial cancer, harmonization is not optional but a core component of a robust analytical pipeline. While ComBat offers a powerful, statistically grounded approach for feature-level correction, the choice of method depends on the study design, data size, and available resources. Integrating phantom-based calibration with advanced statistical or deep learning harmonization, followed by rigorous validation, provides the most reliable path to generating scanner-agnostic radiomic biomarkers that genuinely reflect the biology of endometrial cancer.
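To make the idea of feature-level correction concrete, the following is a minimal sketch of a per-batch location-and-scale adjustment. It is deliberately not ComBat: it omits the empirical Bayes shrinkage and the protection of biological covariates that ComBat provides, and the function name, the synthetic data, and the two-scanner setup are assumptions made for illustration only.

```python
import numpy as np


def location_scale_harmonize(features, batch):
    """Align each batch's per-feature mean and SD to the pooled mean and SD.

    Simplified stand-in for ComBat (hypothetical helper): no empirical Bayes
    shrinkage and no preservation of biological covariates.
    """
    features = np.asarray(features, dtype=float)
    batch = np.asarray(batch)
    pooled_mean = features.mean(axis=0)
    pooled_sd = features.std(axis=0, ddof=1)
    harmonized = np.empty_like(features)
    for b in np.unique(batch):
        rows = batch == b
        mu = features[rows].mean(axis=0)
        sd = features[rows].std(axis=0, ddof=1)
        sd = np.where(sd > 0, sd, 1.0)  # guard against constant features
        harmonized[rows] = (features[rows] - mu) / sd * pooled_sd + pooled_mean
    return harmonized


# Synthetic example: two scanners with different offsets and spreads.
rng = np.random.default_rng(0)
scanner_a = rng.normal(loc=0.0, scale=1.0, size=(40, 3))
scanner_b = rng.normal(loc=2.0, scale=3.0, size=(40, 3))
X = np.vstack([scanner_a, scanner_b])
batch = np.array(["A"] * 40 + ["B"] * 40)
X_h = location_scale_harmonize(X, batch)
```

Because this adjustment removes any mean difference between batches, it would also remove genuine biological differences if case mix differs between scanners, which is one reason covariate-aware methods such as ComBat are usually preferred in practice.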

In CT radiomics research on endometrial cancer, robust model validation is paramount. Radiomics models, which are built on high-dimensional quantitative features extracted from medical images, are highly susceptible to overfitting: the model learns not only the underlying signal but also the noise and idiosyncrasies of the specific training data, and therefore generalizes poorly to new, unseen data. This whitepaper details best practices in dataset partitioning and cross-validation, which are critical for developing reliable and clinically translatable radiomics signatures.

Core Principles of Dataset Partitioning

The fundamental strategy to combat overfitting is to separate the available data into distinct subsets. A typical partition includes:

  • Training Set: Used to estimate the model parameters (e.g., feature weights).
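  • Validation Set: Used to compare candidate models during model tuning, feature selection, and hyperparameter optimization. Because it guides these choices, its performance estimate becomes optimistic and cannot replace the test set.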
  • Test Set (Hold-out Set): Used only once for a final, unbiased assessment of the fully-developed model's performance.

For a typical radiomics study in endometrial cancer, a common partition ratio is 70:15:15 (Train:Validation:Test) or 60:20:20, though this depends on total sample size. The test set must remain completely untouched until the final model is locked.

Key Considerations for Radiomics:

  • Patient-Level Splitting: All image slices and derived features from a single patient must reside in the same partition to prevent data leakage.
  • Stratification: Splits should preserve the distribution of key variables (e.g., cancer stage, histological subtype, outcome label) across all subsets.
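The two considerations above can be combined by splitting a table of patients (not slices or lesions) with stratification, then letting every derived row inherit its patient's partition. The sketch below uses scikit-learn's `train_test_split`; the cohort size, the 20% prevalence, and the three-rows-per-patient layout are hypothetical values chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical cohort: 100 patients, 20% with a positive outcome label.
patient_ids = np.arange(100)
labels = np.array([1] * 20 + [0] * 80)

# Split at the patient level: 70% train, then 15% validation and 15% test.
train_ids, temp_ids, y_train, y_temp = train_test_split(
    patient_ids, labels, test_size=0.30, stratify=labels, random_state=0)
val_ids, test_ids, y_val, y_test = train_test_split(
    temp_ids, y_temp, test_size=0.50, stratify=y_temp, random_state=0)

# Rows of a per-slice (or per-lesion) feature table inherit the partition
# of the patient they belong to, so no patient spans two partitions.
row_patient = np.repeat(patient_ids, 3)  # assumed: three rows per patient
train_rows = np.isin(row_patient, train_ids)
val_rows = np.isin(row_patient, val_ids)
test_rows = np.isin(row_patient, test_ids)
```

Splitting the row-level table directly would risk placing slices from one patient in both training and test data, which is the leakage the patient-level rule is meant to prevent.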

Table 1: Recommended Dataset Partitioning Schemes Based on Sample Size

Total Cohort Size (N) | Recommended Partition (Train/Val/Test) | Rationale
N < 100 | 80/10/10 or Nested CV only | Maximizes training data; small test set for tentative evaluation.
100 ≤ N < 500 | 70/15/15 | Balances training needs with reasonable validation/test set sizes.
N ≥ 500 | 60/20/20 | Allows for large, reliable validation and test sets for robust evaluation.

Advanced Cross-Validation (CV) Techniques

Cross-validation is a resampling procedure used when data is limited. It systematically creates multiple train/validation splits from the training set.

K-Fold Cross-Validation

The training set is randomly partitioned into k equal-sized folds. The model is trained k times, each time using k-1 folds for training and the remaining fold for validation. The performance is averaged over the k iterations.

Stratified K-Fold CV

Essential for imbalanced datasets (e.g., few recurrent vs. many non-recurrent cancers), this method ensures each fold maintains the same proportion of class labels as the original dataset.
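As a brief illustration, scikit-learn's `StratifiedKFold` keeps the class proportion constant across folds. The label vector below (20 positives out of 100) is a hypothetical example, and the all-zero feature matrix is only a placeholder because the split depends on the labels alone.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical imbalanced labels: 20 recurrent vs. 80 non-recurrent cases.
y = np.array([1] * 20 + [0] * 80)
X = np.zeros((len(y), 1))  # placeholder features; only y drives the split

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_positive_rates = [y[val_idx].mean() for _, val_idx in skf.split(X, y)]
print(fold_positive_rates)
```

With plain `KFold`, a fold could by chance contain very few positive cases, which makes fold-level AUC estimates unstable.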

Nested (Double) Cross-Validation

Nested CV is widely regarded as the most reliable approach for small radiomics datasets. It consists of two loops:

  • Outer Loop: For estimating the generalized error (simulating the test set).
  • Inner Loop: For model selection and hyperparameter tuning on the training fold of the outer loop. This prevents optimistic bias from tuning and testing on the same data.

[Figure: The entire dataset (N patients) is split into an outer test fold and an outer training set; the outer training set is split again by an inner k-fold loop into inner CV training folds and an inner CV validation fold.]

Diagram 1: Nested Cross-Validation Workflow

Experimental Protocol: Implementing Nested CV for Radiomics

Objective: Develop a radiomics model to predict lymphovascular space invasion (LVSI) in Stage I endometrial cancer from preoperative CT images.

1. Data Curation & Feature Extraction:

  • Cohort: Retrospective dataset of 300 patients with histologically proven Stage I endometrial cancer, preoperative contrast-enhanced CT, and confirmed LVSI status (20% positive).
  • Segmentation: 3D tumor volume segmentation on CT by two expert radiologists (inter-observer agreement assessed via Dice coefficient).
  • Feature Extraction: Use PyRadiomics to extract 1200+ features per tumor (shape, first-order, texture).

2. Feature Preprocessing & Initial Split:

  • Split 1: Stratified split by LVSI status into a Development Set (80%, n=240) and a locked Hold-out Test Set (20%, n=60).
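  • Preprocessing: Fit every preprocessing step on training data only, refitting it inside each cross-validation training fold so that no statistics leak from validation or test folds: a) Remove near-zero variance features, b) Impute missing values (median), c) Standardize features (Z-score). Apply the fitted transformations unchanged to the corresponding validation fold and, at the end, to the Hold-out Test Set.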

3. Nested Cross-Validation on Development Set:

  • Outer Loop: 5-fold stratified CV repeated 3 times (5x3). This creates 15 different outer test folds.
  • Inner Loop: Within each outer training fold, perform a 5-fold stratified CV for model selection.
  • Workflow per Outer Fold:
    • In the inner CV, perform feature selection using LASSO regression (tuning regularization parameter λ).
    • In the inner CV, train a Support Vector Machine (SVM) classifier on the selected features (tuning hyperparameter C).
    • Select the (λ, C) combination yielding the highest mean AUC in the inner CV.
    • Retrain the model on the entire outer training fold using the selected parameters.
    • Evaluate the retrained model on the outer test fold. Store performance metrics (AUC, sensitivity, specificity).

4. Final Model & Evaluation:

  • Final Model Training: Train a model on the entire Development Set using the hyperparameters that performed best on average in the inner loops.
  • Unbiased Testing: Evaluate this final model once on the locked Hold-out Test Set (n=60) to report final performance metrics.
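The nested procedure in this protocol can be sketched with scikit-learn by wrapping a `GridSearchCV` (inner loop) inside `cross_val_score` (outer loop). This is an illustrative sketch under several assumptions: the data are synthetic (`make_classification`), an L1-penalized logistic regression stands in for LASSO because the outcome is binary, the SVM uses a linear kernel, and the grid values are arbitrary. Placing imputation, scaling, and selection inside the `Pipeline` means they are refit on every training fold.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (GridSearchCV, RepeatedStratifiedKFold,
                                     StratifiedKFold, cross_val_score)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a development set (about 20% positive class).
X, y = make_classification(n_samples=120, n_features=40, n_informative=5,
                           weights=[0.8, 0.2], random_state=0)

pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    # L1-penalized logistic regression as the sparse selector; its C is the
    # inverse of the regularization strength (roughly 1/lambda).
    ("select", SelectFromModel(
        LogisticRegression(penalty="l1", solver="liblinear"))),
    ("svm", SVC(kernel="linear")),
])
param_grid = {
    "select__estimator__C": [0.5, 2.0],  # arbitrary illustrative grid
    "svm__C": [0.1, 1.0],
}

inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
outer_cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=2)

# Inner loop: hyperparameter selection by mean AUC; refit on the outer
# training fold happens automatically (refit=True is the default).
search = GridSearchCV(pipeline, param_grid, scoring="roc_auc", cv=inner_cv)

# Outer loop: 5 folds x 3 repeats = 15 performance estimates.
outer_auc = cross_val_score(search, X, y, scoring="roc_auc", cv=outer_cv)
print(f"Nested CV AUC: {outer_auc.mean():.2f} +/- {outer_auc.std():.2f}")
```

The spread of the 15 outer AUC values is often as informative as their mean, since a wide spread signals that performance depends strongly on which patients happen to be in the training folds.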

[Figure: CT image database -> 3D tumor segmentation -> feature extraction (PyRadiomics) -> stratified partitioning into a development set and a hold-out test set; development set -> nested cross-validation -> final model training -> final evaluation on the hold-out test set.]

Diagram 2: Radiomics Model Development Pipeline

The Scientist's Toolkit: Research Reagent Solutions

Table 2: Essential Tools for Robust Radiomics Analysis

Item/Software | Function in Radiomics Pipeline | Key Consideration
3D Slicer / ITK-SNAP | Manual or semi-automatic segmentation of tumor volumes on CT. | Inter-observer variability must be quantified and minimized.
PyRadiomics (Open-Source) | Standardized extraction of radiomics features from segmented volumes. | Ensures reproducibility of feature calculations.
scikit-learn (Python) | Primary library for data splitting, preprocessing, cross-validation, and model building. | Provides StratifiedKFold and GridSearchCV for nested CV.
Elastic Net / LASSO Regression | Embedded feature selection method that penalizes coefficient size. | Reduces overfitting by creating sparse models.
ComBat Harmonization | Statistical method to remove batch effects from multi-scanner or multi-center data. | Critical for generalizability in retrospective studies.
TRIPOD+AI Statement | Reporting guideline for predictive model studies. | Ensures transparent and complete reporting of methods.

In CT radiomics for endometrial cancer, where feature dimensionality often vastly exceeds sample size, rigorous validation is non-negotiable. Adherence to strict patient-level dataset partitioning and the implementation of nested cross-validation frameworks provide a bulwark against overfitting. These practices yield more reliable estimates of model performance, fostering the development of radiomics signatures that are more likely to succeed in external validation and, ultimately, in clinical translation for personalized oncology.

The Impact of Reconstruction Kernels, Slice Thickness, and Contrast Phase on Features

In basic-principles research on CT radiomics for endometrial cancer, the stability and reproducibility of extracted features are paramount. This technical guide examines the influence of three key acquisition and reconstruction parameters (reconstruction kernel, slice thickness, and contrast enhancement phase) on radiomic feature values. The quantitative instability introduced by these variables poses a significant challenge to developing robust, generalizable predictive models, underscoring the necessity for strict protocol standardization and feature harmonization techniques.

The radiomics workflow in endometrial cancer research transforms medical CT images into mineable, high-dimensional data. This process is highly sensitive to technical parameters. Variations in reconstruction kernel (affecting spatial frequency and noise texture), slice thickness (influencing partial volume effects and resolution), and contrast phase (altering absolute attenuation values and tissue contrast) can lead to statistically significant shifts in feature distributions. For translational research aimed at linking phenotypic imaging features to genomic or clinical endpoints in drug development, understanding and mitigating these technical confounders is a foundational principle.

Parameter-Specific Impacts & Mechanisms

Reconstruction Kernel

Reconstruction kernels (filters) are convolution algorithms applied to raw sinogram data to emphasize different spatial frequencies. Sharp (or "bone") kernels enhance high-frequency content (edges, detail) but amplify noise. Smooth (or "soft-tissue") kernels suppress noise but reduce apparent sharpness.

Impact on Features:

  • First-Order/Histogram Features: Minimal impact on mean attenuation, but significant effect on variance, skewness, and kurtosis due to altered noise magnitude and distribution.
  • Texture Features (GLCM, GLRLM): Highly sensitive. Sharp kernels increase contrast and decrease homogeneity due to enhanced local pixel variation.
  • Shape Features: Largely invariant to kernel changes, as they depend on segmented volume morphology.

Slice Thickness

Slice thickness determines the spatial resolution along the z-axis (patient longitudinal direction). Thicker slices increase partial volume averaging, where a single voxel contains averaged signals from multiple tissue types.
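The averaging effect described above can be illustrated with a toy simulation. The sketch below emulates thicker slices by averaging blocks of consecutive thin slices; this is a simplification of real CT reconstruction, and the volume size, HU mean, and noise level are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1 mm volume: 40 slices of 32 x 32 voxels with noisy
# soft-tissue-like attenuation (mean 60 HU, SD 20 HU).
thin = rng.normal(loc=60.0, scale=20.0, size=(40, 32, 32))

# Emulate 5 mm slices by averaging each block of five consecutive slices.
thick = thin.reshape(8, 5, 32, 32).mean(axis=1)

print(f"variance at 1 mm: {thin.var():.1f}")
print(f"variance at 5 mm: {thick.var():.1f}")
```

In this homogeneous toy volume the mean is unchanged while the voxel variance drops, which is the mechanism behind the reduced first-order variance and smoother texture at larger slice thickness. In a real lesion, the mean can also shift because voxels at the boundary mix tumor with surrounding tissue.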

Impact on Features:

  • First-Order: Reduced mean intensity in heterogeneous regions due to averaging. Decreased variance.
  • Texture Features: Decreased granularity and complexity. Fine textures are smoothed out, reducing feature values like entropy and increasing homogeneity.
  • Shape Features: Can affect volume and surface area calculations due to the "stair-step" artifact in segmentation, especially for small lesions.

Contrast Phase

The timing of image acquisition post-intravenous contrast administration defines the phase (e.g., non-contrast, arterial, portal venous, delayed). This dynamically changes tissue attenuation (Hounsfield Units - HU).

Impact on Features:

  • First-Order: Direct and profound impact on mean, minimum, and maximum HU. Absolute values are non-comparable across phases.
  • Texture Features: Relative spatial patterns of enhancement (heterogeneity) may be preserved but are phase-dependent. Wash-in/wash-out kinetics can create entirely different texture maps.
  • Shape Features: Invariant, provided segmentation is consistent.

Table 1: Impact of Parameter Variation on Representative Radiomic Feature Classes

Feature Class / Example Feature | Reconstruction Kernel (Sharp vs. Smooth) | Slice Thickness (1mm vs. 5mm) | Contrast Phase (Non-contrast vs. Venous)
First-Order
Mean Intensity | Negligible Change | Decrease up to 15%* | Increase >100% (tissue-dependent)
Variance | Increase up to 300%* | Decrease up to 50%* | Variable, pattern-dependent
Texture (GLCM)
Contrast | Increase up to 200%* | Decrease up to 70%* | Significant Change
Homogeneity | Decrease up to 40%* | Increase up to 60%* | Significant Change
Shape
Volume | <2% Change | <5% Change* | <2% Change
Sphericity | <1% Change | <3% Change* | <1% Change

*Indicates magnitude of change is lesion size and morphology dependent.

Table 2: Phantom & Patient Study Findings on Feature Stability

Study Type | Key Finding | Recommended Mitigation
Test-Retest (Same Scan) | High intrinsic noise for texture features in homogeneous objects | Use feature reproducibility indices (ICC>0.9) for filtering
Kernel Rescan Studies | ~70% of features significantly altered (p<0.05) between kernels | Standardize kernel; use kernel-agnostic features or ComBat harmonization
Slice Thickness Resampling | Only 10-15% of features stable across 1mm, 3mm, 5mm slices | Analyze at native slice thickness; avoid mixing thicknesses in cohort
Multi-Phase Analysis | Absolute features non-reproducible; some texture patterns correlate across phases | Extract phase-specific features; model delta features (change between phases)

Detailed Experimental Protocols

Protocol 1: Assessing Kernel Impact in a Retrospective Cohort

  • Cohort Selection: Identify 50 endometrial cancer patient CT studies from a public archive (e.g., TCIA) where the same raw data was reconstructed with both a sharp (e.g., B70f) and a smooth (e.g., B30f) kernel.
  • Image Processing: For each kernel dataset, resample all images to isotropic 1mm³ voxels using B-spline interpolation. Apply fixed intensity discretization with a bin width of 25 HU.
  • Segmentation: Load the expert-drawn 3D volume-of-interest (VOI) mask of the primary tumor from the smooth kernel dataset. Apply the same mask to the registered sharp kernel dataset.
  • Feature Extraction: Extract 100+ radiomic features (PyRadiomics/IBEX) per VOI per kernel, encompassing first-order, shape, and texture classes.
  • Statistical Analysis: Perform paired Wilcoxon signed-rank tests (p<0.05 with Bonferroni correction) on each feature pair. Calculate Intraclass Correlation Coefficient (ICC) (two-way, absolute agreement). Classify features as "stable" if ICC > 0.9.
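For the ICC step, one common formulation of the two-way, absolute-agreement, single-measurement coefficient (often written ICC(A,1)) can be computed from ANOVA mean squares. The sketch below is an assumption about which ICC variant is intended; the function name and the toy values are hypothetical.

```python
import numpy as np


def icc_absolute_agreement(values):
    """ICC(A,1): two-way model, absolute agreement, single measurement.

    values: array of shape (n_subjects, k_conditions), e.g. one feature
    measured on each tumor under the smooth and the sharp kernel.
    """
    values = np.asarray(values, dtype=float)
    n, k = values.shape
    grand = values.mean()
    ss_rows = k * np.sum((values.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((values.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((values - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)              # between-subject mean square
    msc = ss_cols / (k - 1)              # between-condition mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Identical values under both kernels: perfect agreement.
same = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
# Same ranking but a constant offset of 10 under the second kernel.
shifted = np.array([[1.0, 11.0], [2.0, 12.0], [3.0, 13.0]])

icc_same = icc_absolute_agreement(same)
icc_shifted = icc_absolute_agreement(shifted)
print(icc_same, icc_shifted)
```

The second example shows why absolute agreement is a strict criterion: a feature that is merely shifted by the kernel keeps its ranking across patients yet fails an ICC > 0.9 threshold.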

Protocol 2: Prospective Phantom Study for Slice Thickness

  • Phantom: Use a radiomics phantom with inserts of known geometry and varied texture (e.g., Catphan 600, QRM radiomics insert).
  • Image Acquisition: Scan phantom on a CT scanner using a standardized protocol (kVp, mAs). Reconstruct images at multiple slice thicknesses (e.g., 0.625mm, 1.25mm, 2.5mm, 5.0mm) while keeping kernel constant.
  • Feature Extraction: Segment inserts using thresholding. Extract features from identical VOIs across all thicknesses.
  • Analysis: Calculate coefficient of variation (CV) for each feature across thicknesses. Plot feature value vs. slice thickness to identify monotonic relationships and plateaus.
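The analysis step can be sketched as follows. The feature names and values are invented for illustration and do not come from any phantom measurement.

```python
import numpy as np

thicknesses_mm = np.array([0.625, 1.25, 2.5, 5.0])

# Hypothetical values of two features for one phantom insert, one value
# per reconstructed slice thickness.
feature_values = {
    "shape_volume": np.array([1000.0, 1002.0, 998.0, 1000.0]),
    "glcm_contrast": np.array([40.0, 30.0, 20.0, 10.0]),
}


def coefficient_of_variation(x):
    """Sample standard deviation divided by the absolute mean."""
    x = np.asarray(x, dtype=float)
    return x.std(ddof=1) / abs(x.mean())


cv_by_feature = {name: coefficient_of_variation(v)
                 for name, v in feature_values.items()}

# A strictly decreasing sequence suggests a monotonic dependence on thickness.
monotonic_decrease = {name: bool(np.all(np.diff(v) < 0))
                      for name, v in feature_values.items()}
print(cv_by_feature, monotonic_decrease)
```

A low CV flags a feature as stable across thicknesses, while a high CV combined with a monotonic trend suggests a systematic dependence that might be modeled or corrected.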

Protocol 3: Multi-Phase Feature Correlation Analysis

  • Data: Acquire multi-phase (non-contrast, arterial, venous, delayed) CT scans for 30 endometrial cancer patients as per standard care.
  • Registration & Segmentation: Rigidly register all post-contrast phases to the non-contrast phase. Propagate a single tumor VOI drawn on the venous phase across all registered series.
  • Feature Extraction & Normalization: Extract first-order and texture features from each phase. Z-score normalize feature values within each phase across the patient cohort.
  • Correlation: Compute Spearman's rank correlation (ρ) for each feature pair between different phases. Identify features with high inter-phase correlation (|ρ| > 0.8) as potentially robust to phase variation.
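A minimal sketch of the normalization and correlation steps, using `scipy.stats.spearmanr`, is shown below. The simulated feature values (a shared latent signal plus phase-specific offset, scale, and noise) are assumptions for illustration.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
n_patients = 30

# Hypothetical feature driven by a shared tumor property, measured in two
# phases with different offsets, scales, and independent noise.
latent = rng.normal(size=n_patients)
arterial = 80.0 + 15.0 * latent + rng.normal(scale=1.0, size=n_patients)
venous = 50.0 + 10.0 * latent + rng.normal(scale=1.0, size=n_patients)


def zscore(x):
    """Z-score a feature across the cohort within one phase."""
    return (x - x.mean()) / x.std(ddof=1)


rho, p_value = spearmanr(zscore(arterial), zscore(venous))
rho_raw, _ = spearmanr(arterial, venous)
robust_to_phase = abs(rho) > 0.8
print(f"rho = {rho:.2f}, robust to phase: {robust_to_phase}")
```

Because Spearman's ρ depends only on ranks, the within-phase Z-scoring does not change it; the normalization matters for later steps that compare or pool absolute feature values.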

Visualization of Workflows and Relationships

[Figure: Reconstruction kernel (alters noise texture and spatial frequency), slice thickness (alters z-axis resolution and partial volume effect), and contrast phase (alters absolute attenuation and tissue contrast) feed into CT image acquisition -> resulting CT image data (voxel array and HU) -> radiomics processing (segmentation, extraction) -> high-dimensional feature vector -> downstream analysis (modeling, prognostication).]

Title: Parameter Impact on Radiomics Pipeline

[Figure: Decision flow starting from a multi-parameter CT dataset. (1) Kernel consistent across cohort? If no, apply feature harmonization (e.g., ComBat); if harmonization is not possible, proceed with extreme caution (high risk of technical bias). (2) Slice thickness consistent? If no, exclude studies or resample to a common thickness; if neither is possible, proceed with extreme caution. (3) Contrast phase consistent? If no, build phase-specific models or use delta features. Then proceed with feature extraction and analysis.]

Title: Decision Flow for Handling Parameter Variability
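The decision flow can also be written as a small function, which may be convenient for logging how each cohort was handled. The function and argument names below are hypothetical; the returned strings paraphrase the nodes of the flow.

```python
CAUTION = "proceed with extreme caution: high risk of technical bias"


def plan_handling(kernel_consistent, thickness_consistent, phase_consistent,
                  can_harmonize=True, can_resample_or_exclude=True):
    """Return the ordered list of actions implied by the decision flow."""
    steps = []
    if not kernel_consistent:
        if not can_harmonize:
            return steps + [CAUTION]
        steps.append("apply feature harmonization (e.g., ComBat)")
    if not thickness_consistent:
        if not can_resample_or_exclude:
            return steps + [CAUTION]
        steps.append("exclude studies or resample to a common thickness")
    if not phase_consistent:
        steps.append("build phase-specific models or use delta features")
    steps.append("proceed with feature extraction and analysis")
    return steps


# Example: mixed kernels and mixed phases, uniform slice thickness.
for step in plan_handling(kernel_consistent=False,
                          thickness_consistent=True,
                          phase_consistent=False):
    print(step)
```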

The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Tools for Robust Radiomics Research in Endometrial Cancer

Item / Solution | Function & Relevance
Standardized Radiomics Phantom (e.g., QRM, Catphan) | Provides a stable, known object for testing feature stability across scanners and protocols. Essential for protocol calibration and inter-scanner harmonization studies.
PyRadiomics (open-source Python package) | A flexible, widely-adopted engine for standardized extraction of a comprehensive set of radiomic features from segmented volumes. Ensures reproducibility of feature definitions.
3D Slicer with SlicerRadiomics Extension | Open-source platform for interactive medical image visualization, segmentation, and radiomics analysis. Integrates PyRadiomics and allows manual QA of segmentations.
Image Biomarker Standardization Initiative (IBSI) Handbook | Reference document providing standardized definitions, nomenclature, and equations for radiomic features. Critical for reporting and cross-study comparison.
ComBat Harmonization (or similar) | Statistical batch-effect correction tool adapted for harmonizing radiomic features derived from different scanners, kernels, or institutions. Reduces non-biological variance.
Elastic/Deformable Registration Software (e.g., ANTs, Elastix) | For aligning multi-phase or longitudinal CT scans, enabling analysis of the same tumor region across different contrast phases or time points.
High-Performance Computing (HPC) or Cloud Resources | Necessary for large-scale feature extraction, iterative segmentation validation, and complex harmonization/modeling workflows across large cohorts.

In CT radiomics research on endometrial cancer, the lack of standardization has historically hampered reproducibility and multi-institutional validation. The Image Biomarker Standardisation Initiative (IBSI) addresses this by establishing standardized definitions and computational workflows for image biomarker extraction. This whitepaper details the core principles of the IBSI guidelines, explains their application in endometrial cancer radiomics, and provides actionable experimental protocols for compliance.

The IBSI Framework: Core Components for Radiomics

The IBSI manual defines standardized nomenclature and processing steps for extracting radiomic features from medical images. Adherence ensures that features labeled identically across different studies are computationally equivalent.

Table 1: Key IBSI Standardization Phases for CT Radiomics

Phase | Objective | Key Standardized Elements
Image Acquisition & Pre-processing | Ensure consistent input image quality. | Voxel size resampling (e.g., to 1x1x1 mm³), interpolation method (e.g., B-spline), intensity discretization (fixed bin number/width).
Segmentation | Define the volume of interest (VOI). | Handling of segmentation margins, inclusion of partial volume effects.
Image Processing & Filtering | Extract specific textural information. | Definitions and implementations of filters (e.g., Laplacian of Gaussian, Wavelet).
Feature Extraction & Calculation | Compute reproducible biomarkers. | Mathematical formulas for ~1300 features across classes: First-order, Shape, Gray-Level Co-occurrence Matrix (GLCM), Gray-Level Run Length Matrix (GLRLM), etc.
Reporting | Enable study replication. | Mandatory reporting of all previous phase parameters (IBSI checklist).

Experimental Protocol: Implementing IBSI for Endometrial Cancer CT Studies

Below is a detailed methodology for a compliant radiomics analysis pipeline.

Protocol: IBSI-Compliant Radiomic Feature Extraction from Endometrial Tumor CT

1. Image Acquisition:

  • Use contrast-enhanced CT images (venous phase).
  • Document scanner make, model, kVp, mAs, reconstruction kernel, and slice thickness.

2. Image Pre-processing (IBSI-Compliant):

  • Resampling: Isotropically resample all images to a 1.0 mm x 1.0 mm x 1.0 mm voxel grid using B-spline interpolation (as per IBSI reference).
  • Intensity Discretization: Apply a fixed bin width of 25 Hounsfield Units (HU) across the entire dataset. Set the minimum intensity value to -150 HU.

3. Tumor Segmentation:

  • Manually delineate the 3D volume of interest (VOI) of the primary endometrial tumor on each axial slice by an expert radiologist, using a validated software platform (e.g., 3D Slicer).
  • Save the segmentation as a binary mask file.

4. Feature Extraction:

  • Use IBSI-validated software (e.g., PyRadiomics configured with IBSI settings, or the IBSI-reference phantom-validated software).
  • Input the resampled CT image and its corresponding binary mask.
  • Extract the full suite of standardized features, including:
    • Shape (3D): Volume, Surface Area, Sphericity.
    • First-Order: Energy, Entropy, Mean, Kurtosis.
    • Second-Order & Textural: All GLCM features (e.g., Joint Energy, Contrast), GLRLM features (e.g., Short Run Emphasis), Neighboring Gray Tone Difference Matrix (NGTDM), Gray-Level Size Zone Matrix (GLSZM).

5. Data Reporting:

  • Publish all parameters from steps 1-4 in supplementary materials, following the IBSI reporting checklist.
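The intensity discretization step of this protocol can be sketched as below. The formula follows the fixed-bin-size scheme as commonly stated for IBSI, bin = floor((HU - minimum) / bin width) + 1. Treating -150 HU as the lower bound of the intensity range is an assumption based on the protocol text, and the function name is hypothetical. Other tools may anchor bins differently, so results should be checked against the software actually used.

```python
import numpy as np


def discretize_fixed_bin_width(hu, bin_width=25.0, minimum=-150.0):
    """Map HU values to 1-based gray-level bins of fixed width.

    Voxels below `minimum` are assumed to have been excluded beforehand
    (e.g., by re-segmentation), so every input is >= minimum.
    """
    hu = np.asarray(hu, dtype=float)
    return np.floor((hu - minimum) / bin_width).astype(int) + 1


voxels_hu = np.array([-150.0, -126.0, -125.0, 0.0, 60.0])
bins = discretize_fixed_bin_width(voxels_hu)
print(bins)  # [1 1 2 7 9]
```

Using a fixed bin width (rather than a fixed number of bins) keeps the meaning of a gray level tied to absolute HU, which is why it is usually preferred for calibrated CT data.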

Visualization of the IBSI-Compliant Radiomics Workflow

[Figure: Raw CT scan (slice thickness t mm) -> IBSI pre-processing (isotropic resampling to 1 mm³; fixed bin width discretization, 25 HU) -> expert 3D tumor segmentation (VOI mask) -> IBSI-compliant feature calculator -> standardized radiomic feature table (~1300 features: shape, first-order, textural) -> downstream analysis (prognostic model, biomarker validation).]

Diagram 1: IBSI-compliant radiomics pipeline for endometrial cancer.

The Scientist's Toolkit: Essential Research Reagents & Solutions

Table 2: Key Research Toolkit for IBSI-Compliant Endometrial Cancer Radiomics

Item | Function & Importance | Example/Note
IBSI Reference Manual (v2.0) | Definitive guide for standardized feature definitions and phantom validation. | Must be the primary reference for any pipeline development.
IBSI-Validated Software | Software that has passed the IBSI digital phantom benchmark. | PyRadiomics (configured to follow IBSI definitions; its documentation lists known deviations), MaZda, or custom code validated against the IBSI reference.
3D Slicer / ITK-SNAP | Open-source software for accurate 3D manual or semi-automatic tumor segmentation. | Critical for generating the input VOI mask.
Digital Imaging and Communications in Medicine (DICOM) Viewer | For initial image assessment and quality control. | OsiriX MD, Horos.
IBSI Digital Phantom Dataset | Digital reference images to validate and calibrate in-house feature extraction pipelines. | Available from the IBSI consortium website.
Statistical Software (R/Python) | For downstream analysis of extracted features (e.g., survival analysis, machine learning). | scikit-learn or lifelines in Python; caret or glmnet in R.

The Critical Importance for Comparative Research

In endometrial cancer research, IBSI compliance enables:

  • Pooled Analysis: Reliable aggregation of data from multiple institutions for robust biomarker discovery.
  • Validation Studies: Independent groups can truly validate published signatures, accelerating clinical translation.
  • Drug Development: Pharmaceutical researchers can consistently use imaging biomarkers as secondary endpoints in clinical trials to assess treatment response (e.g., to anti-angiogenic or immunotherapy).
  • Meta-Analysis: Quantitative synthesis of radiomics literature becomes possible, moving the field beyond qualitative reviews.

Table 3: Impact of Standardization on Model Performance (Illustrative Data)

Study Condition | Concordance Index (C-Index) for PFS Prediction | 95% Confidence Interval | Key Limitation or Advantage
Single-Center, Non-Standardized Pipeline | 0.72 | [0.65 - 0.79] | Overfitted, non-reproducible features.
Multi-Center, Non-Standardized Features | 0.61 | [0.55 - 0.67] | Technical variability masks biological signal.
Multi-Center, IBSI-Compliant Pipeline | 0.79 | [0.73 - 0.85] | Generalizable, validated biomarker signature.

For CT radiomics in endometrial cancer to progress from exploratory research to reliable tools for risk stratification and treatment monitoring, adherence to the IBSI guidelines is non-negotiable. By implementing the standardized protocols and toolkits outlined, researchers can directly compare findings, validate biomarkers across cohorts, and ultimately contribute to more personalized oncology. This foundational standardization is a prerequisite for any thesis aiming to establish the basic principles of robust radiomics in endometrial cancer.

Evidence and Efficacy: Validating Radiomics Against Clinical and Molecular Benchmarks

Within the broader thesis on CT radiomics for endometrial cancer, the clinical validation of developed models is the critical bridge between technical development and clinical utility. This section details the rigorous assessment of model performance in predicting recurrence-free survival (RFS) and overall survival (OS), the cornerstone for translational research and subsequent drug development targeting high-risk patients.

Core Performance Metrics & Quantitative Summaries

The validation of predictive models requires evaluation across multiple statistical dimensions. The following tables summarize key quantitative metrics from recent studies.

Table 1: Common Performance Metrics for Radiomics Models in Endometrial Cancer

Metric | Formula / Description | Interpretation in Clinical Context
C-index (Harrell's Concordance Index) | \( C = \frac{\text{number of concordant pairs}}{\text{number of comparable pairs}} \) | Measures the model's ability to correctly rank order survival times. A value of 0.5 is no better than chance, 1.0 is perfect discrimination.
Time-dependent AUC | Area under the ROC curve at a specific time point (e.g., 3-year RFS). | Evaluates discrimination at clinically relevant time horizons.
Calibration Slope & Intercept | Assessed via calibration plots comparing predicted vs. observed survival. | Slope of 1 and intercept of 0 indicate perfect calibration. Critical for accurate absolute risk estimation.
Integrated Brier Score (IBS) | \( IBS = \frac{1}{n} \sum_{i=1}^{n} \int_0^{t_{max}} ( \hat{S}(t \mid x_i) - I(t_i > t) )^2 \, dt \) | Measures overall accuracy across all time points (lower is better). Combines discrimination and calibration.
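To make the C-index definition in the table concrete, the following sketch counts concordant and comparable pairs directly. It is a plain O(n²) illustration, not an optimized or library implementation: a pair is treated as comparable when the patient with the shorter follow-up time had an observed event, tied risk scores count as half-concordant, and tied times are ignored. The function name and toy data are hypothetical.

```python
import numpy as np


def harrell_c_index(time, event, risk):
    """Concordant pairs / comparable pairs for right-censored data.

    time:  follow-up time per patient
    event: 1 if the event (e.g., recurrence) was observed, 0 if censored
    risk:  model risk score (higher means earlier expected event)
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk = np.asarray(risk, dtype=float)
    concordant = 0.0
    comparable = 0
    for i in range(len(time)):
        if not event[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(len(time)):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


time = [2, 4, 6, 8]
event = [1, 1, 0, 1]
c_perfect = harrell_c_index(time, event, [0.9, 0.7, 0.5, 0.1])
c_mixed = harrell_c_index(time, event, [0.9, 0.1, 0.5, 0.7])
print(c_perfect, c_mixed)  # 1.0 0.6
```

In the second example, five pairs are comparable and three of them are ranked correctly, giving 0.6.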

Table 2: Example Performance Data from Recent Studies (Synthesized from Current Literature)

Study Reference (Example) | Cohort (n) | Prediction Task | Model Type | Key Performance Metrics
Internal Validation Cohort | 180 | 3-year RFS | CT Radiomics + Clinical | C-index: 0.82 (95% CI: 0.76-0.88); 3-yr AUC: 0.79
External Validation Cohort A | 95 | 5-year OS | CT Radiomics Signature | C-index: 0.75; Calibration Slope: 0.92
External Validation Cohort B | 112 | Lymph Node Metastasis (Surrogate) | Deep Learning Radiomics | AUC: 0.87; Sensitivity: 0.81, Specificity: 0.79

Detailed Experimental Protocols for Validation

Protocol for Time-to-Event Analysis (Cox Proportional Hazards Validation)

Objective: To validate the prognostic performance of a radiomics-based risk score in a time-to-event framework.

Materials: Independent validation cohort with pre-treatment CT images, annotated clinical outcomes (RFS/OS status and time), and necessary clinical variables (e.g., age, stage, histology).

Procedure:

  • Feature Application: Apply the pre-defined radiomics feature extraction pipeline (from model development) to the validation cohort CT images.
  • Risk Score Calculation: Calculate the radiomics signature score for each patient using the locked, pre-defined formula (e.g., linear combination of selected features).
  • Model Fitting: Fit a univariable Cox proportional hazards model with the radiomics score as the sole predictor. Record the C-index.
  • Multivariable Validation: Fit a multivariable Cox model including the radiomics score and key clinical predictors (e.g., FIGO stage, lymphovascular space invasion). Assess the significance (hazard ratio, p-value) and added predictive value of the radiomics score.
  • Proportional Hazards Assumption Check: Test using Schoenfeld residuals. Violations may necessitate time-dependent coefficients or stratified models.
  • Discrimination Assessment: Calculate the time-dependent AUC at relevant clinical time points (e.g., 1, 3, 5 years).
  • Calibration Assessment: Generate a calibration plot comparing predicted vs. observed survival probability at a key time point (e.g., 3 years), often using bootstrap or cross-validation for bias correction.

Protocol for Binary Endpoint Validation (e.g., High vs. Low Risk)

Objective: To validate a model that classifies patients into discrete risk groups.

Procedure:

  • Risk Stratification: Apply the pre-defined risk stratification cutoff (e.g., median of radiomics score from training) to the validation cohort to assign patients to "High-Risk" or "Low-Risk" groups.
  • Survival Curve Comparison: Generate Kaplan-Meier curves for RFS/OS for the two groups. Compare using the log-rank test. Report hazard ratio (HR) from a univariable Cox model.
  • Classifier Metrics: If the endpoint is binary (e.g., recurrence within 3 years), calculate confusion matrix metrics: Sensitivity, Specificity, Positive Predictive Value (PPV), Negative Predictive Value (NPV), and Overall Accuracy.
  • Decision Curve Analysis (DCA): Perform DCA to evaluate the clinical net benefit of using the radiomics model across a range of threshold probabilities, compared to "treat all" or "treat none" strategies.
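The classifier metrics and the net benefit used in decision curve analysis can be sketched as follows. Net benefit at threshold probability p_t is computed as TP/n - (FP/n) * p_t / (1 - p_t). The function names, outcomes, and predicted probabilities are hypothetical.

```python
import numpy as np


def classifier_metrics(y_true, y_pred):
    """Confusion-matrix metrics for a binary endpoint."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_pred & y_true)
    fp = np.sum(y_pred & ~y_true)
    tn = np.sum(~y_pred & ~y_true)
    fn = np.sum(~y_pred & y_true)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / len(y_true),
    }


def net_benefit(y_true, prob, threshold):
    """Net benefit of treating patients with predicted risk >= threshold."""
    y_true = np.asarray(y_true, dtype=bool)
    treat = np.asarray(prob, dtype=float) >= threshold
    n = len(y_true)
    tp = np.sum(treat & y_true)
    fp = np.sum(treat & ~y_true)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Hypothetical validation cohort: recurrence within 3 years (1) or not (0).
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
prob = np.array([0.9, 0.8, 0.3, 0.6, 0.2, 0.1, 0.1, 0.2, 0.3, 0.1])

metrics = classifier_metrics(y_true, prob >= 0.5)
nb_model = net_benefit(y_true, prob, threshold=0.5)               # 0.1
nb_treat_all = net_benefit(y_true, np.ones_like(prob), threshold=0.5)
print(metrics, nb_model, nb_treat_all)
```

A full decision curve repeats the net benefit calculation over a range of thresholds and plots the model against the "treat all" curve and the "treat none" line, which is zero by definition.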

Visualizing the Validation Workflow and Biological Context

Title: Radiomics Model Clinical Validation Workflow

[Figure: The radiomics phenotype (e.g., heterogeneity, shape) infers hypothesized biological correlates, which are driven by cellular and molecular pathways. Key pathways in endometrial cancer progression: PI3K/AKT/mTOR (proliferation), p53/RB (apoptosis/cell cycle), Wnt/β-catenin (invasion), and MMPs/EMT (metastasis); each impacts clinical outcome.]

Title: Radiomics Correlation to Biology & Outcome

The Scientist's Toolkit: Key Research Reagent Solutions

Table 3: Essential Materials for Radiomics Clinical Validation Studies

| Item / Solution | Function in Validation | Example / Specification |
| --- | --- | --- |
| DICOM Image Repository | Source of raw medical imaging data for the independent validation cohort. | PACS archive; publicly available datasets (e.g., TCIA - The Cancer Imaging Archive). |
| Radiomics Extraction Software (Locked Version) | Applies the pre-defined, frozen feature extraction pipeline to new images. | PyRadiomics (v3.0.1), IBEX, or in-house software with version control. |
| Statistical Analysis Software | Performs survival analysis, calculates performance metrics, and generates figures. | R (survival, timeROC, riskRegression packages), Python (scikit-survival, lifelines), SAS. |
| Clinical Data Management System (CDMS) | Manages de-identified patient clinical data, pathology reports, and outcome data, ensuring linkage to imaging. | REDCap, OpenClinica, or a secure relational database. |
| Pathology & Biomarker Assays | Provides ground truth for biological correlation studies (e.g., molecular subtyping). | Immunohistochemistry kits (p53, MMR proteins, PTEN), Next-Generation Sequencing panels (POLE, CTNNB1). |
| High-Performance Computing (HPC) Cluster | Enables batch processing of large imaging datasets and computationally intensive resampling validation. | Cloud-based (AWS, GCP) or on-premise cluster for reproducible analysis. |

Within the broader thesis on the basic principles of CT radiomics in endometrial cancer, this technical guide addresses the need to establish robust, quantitative links between non-invasive imaging phenotypes and the molecular classification of endometrial carcinoma (EC). The classification derived from The Cancer Genome Atlas (TCGA), comprising POLE-mutated (ultramutated), MMRd (hypermutated), p53-abnormal (serous-like), and NSMP (no specific molecular profile), has redefined prognostic stratification and therapeutic planning. This whitepaper details methodologies for correlating radiomic features extracted from pre-treatment CT images with these molecular subtypes, with the aim of developing imaging biomarkers for non-invasive molecular profiling.

Molecular Subtypes in Endometrial Cancer: Core Definitions

The four molecular subtypes are defined by specific molecular alterations with distinct clinical outcomes.

Table 1: Key Characteristics of EC Molecular Subtypes

| Molecular Subtype | Defining Alteration | Mutation Rate | Typical Histology | Prognosis |
| --- | --- | --- | --- | --- |
| POLE-mutated | Pathogenic mutations in POLE exonuclease domain | Ultra-high (>100 mutations/Mb) | Endometrioid (high-grade) | Excellent |
| MMRd | Deficiency in Mismatch Repair (MLH1, MSH2, MSH6, PMS2) | High (>10 mutations/Mb) | Endometrioid | Intermediate |
| p53-abnormal | TP53 mutations (IHC abnormal/null) | Low | Serous, Carcinosarcoma, some high-grade endometrioid | Poor |
| NSMP | None of the above | Low | Endometrioid | Intermediate |

Radiomic Feature Extraction and Analysis Workflow

A standardized pipeline is essential for reproducible radiomic research linking phenotypes to molecular status.

Experimental Protocol: Radiomics-Molecular Correlation Study

Phase 1: Cohort & Data Curation

  • Patient Selection: Retrospective cohort of histologically confirmed EC patients with:
    • Pre-treatment contrast-enhanced CT (venous phase) of abdomen/pelvis.
    • Complete molecular profiling (POLE sequencing, p53 IHC, MMR IHC ± MSI testing).
  • Image Data Pre-processing:
    • Resampling: Isotropic voxel resampling (e.g., 1x1x1 mm³) to standardize spatial resolution.
    • Intensity Normalization: Use Z-score normalization or fixed range (e.g., -1000 to 1000 HU) to correct for scanner variability.
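    • Image Interpolation: Use B-spline interpolation when resampling the image, and nearest-neighbor interpolation for the segmentation mask so that label values remain binary and aligned with the image grid.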
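
The two intensity-normalization options named above can be sketched as follows. This is a simplified illustration on a flat list of voxel values; the function names and HU values are hypothetical, and a real pipeline would operate on 3D arrays read from DICOM.

```python
import statistics


def clip_hu(voxels, lo=-1000.0, hi=1000.0):
    """Clamp Hounsfield units to a fixed range (the 'fixed range' option)."""
    return [min(max(v, lo), hi) for v in voxels]


def zscore(voxels):
    """Z-score normalization: zero mean, unit (population) standard deviation."""
    mu = statistics.fmean(voxels)
    sd = statistics.pstdev(voxels)
    if sd == 0:
        return [0.0 for _ in voxels]
    return [(v - mu) / sd for v in voxels]


# Hypothetical voxel intensities (HU), including two out-of-range values.
hu = [-1200.0, -50.0, 30.0, 45.0, 60.0, 80.0, 1500.0]
clipped = clip_hu(hu)
z = zscore(clipped)
```

Whichever option is chosen, the same parameters should be applied to every scan in the training and validation cohorts.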

Phase 2: Tumor Segmentation

  • Modality: Manual or semi-automatic segmentation by expert radiologists on the venous phase.
  • Target: Delineate the entire primary tumor volume (3D segmentation), avoiding necrotic cysts and adjacent normal tissue.
  • Software: Use platforms like 3D Slicer, ITK-SNAP, or proprietary research software. Store segmentations as RTSS or NRRD files.

Phase 3: Radiomic Feature Extraction

  • Platform: PyRadiomics (open-source) or equivalent commercial software.
  • Feature Classes (per IBSI guidelines):
    • First-Order Statistics: Intensity histogram features (Energy, Entropy, Kurtosis, Skewness).
    • Shape-based Features: Volume, Sphericity, Surface Area to Volume ratio.
    • Texture Features:
      • Gray Level Co-occurrence Matrix (GLCM): Measures spatial relationships of voxels.
      • Gray Level Run Length Matrix (GLRLM): Quantifies runs of consecutive voxels.
      • Gray Level Size Zone Matrix (GLSZM): Describes zones of connected voxels.
      • Neighboring Gray Tone Difference Matrix (NGTDM): Captures local intensity differences.
      • Gray Level Dependence Matrix (GLDM): Measures gray level dependencies.
  • Filter Application: Apply Laplacian of Gaussian (LoG) and wavelet filters to derive features from transformed images.
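
To make the texture classes above more concrete, the following sketch builds a gray level co-occurrence matrix for a tiny 2D image and derives two features from it. It is a didactic simplification under stated assumptions: a single horizontal offset, a symmetric matrix, and pre-discretized gray levels. Libraries such as PyRadiomics aggregate over many 3D directions and apply their own discretization, so values will not match library output. The function names are hypothetical.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized GLCM for one offset on a 2D list of gray levels."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # count the pair in both directions
                counts[j][i] += 1
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def glcm_contrast(p):
    """Weights co-occurrences by the squared gray-level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def glcm_joint_entropy(p):
    """Shannon entropy (bits) of the co-occurrence distribution."""
    return -sum(v * math.log2(v) for row in p for v in row if v > 0)


# Small 4x4 region with gray levels 0..3 (illustrative values only).
roi = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(roi, levels=4)
print(f"contrast = {glcm_contrast(p):.4f}")
print(f"joint entropy = {glcm_joint_entropy(p):.4f} bits")
```

A perfectly homogeneous region gives a contrast and joint entropy of zero, which is why these features are commonly read as heterogeneity measures.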

Phase 4: Feature Pre-processing & Selection

  • Stability Testing: Perform test-retest and inter-observer segmentation analyses to remove unstable features (ICC < 0.75).
  • Normalization: Standardize feature values (zero mean, unit variance).
  • Reduction: Address multicollinearity using Pearson correlation (remove one of pair with |r| > 0.85). Apply LASSO regression or Random Forest feature importance for final selection against the molecular target.
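
The multicollinearity step above can be illustrated with a greedy correlation filter. This is one plausible reading of "remove one of pair", not necessarily the procedure used in any cited study: features are visited in order, and a feature is kept only if its absolute Pearson correlation with every already-kept feature is at most the threshold. Feature names and values are invented, and the sketch assumes no feature is constant.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(features, threshold=0.85):
    """Return names of features kept after greedy pairwise filtering.

    features -- dict mapping feature name to a list of per-patient values
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature table for five patients.
features = {
    "volume": [1.0, 2.0, 3.0, 4.0, 5.0],
    "surface_area": [2.1, 3.9, 6.2, 8.0, 9.9],  # nearly collinear with volume
    "glcm_entropy": [3.0, 1.0, 4.0, 1.0, 5.0],
}
kept = drop_correlated(features)
print(kept)
```

Because the result depends on visiting order, features are often sorted beforehand (for example by stability or univariate relevance) so that the more useful member of each correlated pair is retained.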

Phase 5: Statistical & Machine Learning Modeling

  • Objective: Build a classifier to predict molecular subtype (POLE vs. p53abn vs. MMRd vs. NSMP) or a regression model for continuous genomic variables.
  • Data Splitting: 70%/30% training/internal validation split. External validation on an independent cohort is mandatory.
  • Algorithms: Employ Random Forest, Support Vector Machine (SVM), or Logistic Regression. Use nested cross-validation for hyperparameter tuning.
  • Evaluation: Report AUC, accuracy, sensitivity, specificity, F1-score. Assess clinical utility via decision curve analysis.
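
For a four-class molecular target, AUC is commonly reported one-vs-rest per subtype. The sketch below computes AUC from its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case). The subtype labels follow the text; the predicted probabilities and function names are hypothetical.

```python
def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(subtypes, prob_by_class):
    """Per-class AUC, treating each subtype in turn as the positive class."""
    return {
        cls: auc([1 if s == cls else 0 for s in subtypes], probs)
        for cls, probs in prob_by_class.items()
    }


# Hypothetical ground truth and predicted p53abn probabilities for six patients.
subtypes = ["POLE", "p53abn", "MMRd", "NSMP", "p53abn", "NSMP"]
prob_by_class = {"p53abn": [0.1, 0.8, 0.2, 0.3, 0.6, 0.7]}

result = one_vs_rest_auc(subtypes, prob_by_class)
print(result)
```

The pairwise loop is quadratic in cohort size, which is acceptable for typical radiomics cohorts; library implementations such as scikit-learn's `roc_auc_score` use a sorting-based approach.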

Key Signaling Pathways & Radiomic Correlates

Understanding the biological basis of radiomic features is crucial for interpretation.

Diagram 1: Molecular Pathways & Putative Radiomic Phenotypes

The Scientist's Toolkit: Research Reagent Solutions

Table 2: Essential Materials for Integrated Radiomics-Molecular Analysis

| Item/Category | Function in Research | Example Product/Source |
| --- | --- | --- |
| Radiomics Software | Standardized feature extraction from medical images. | PyRadiomics (open-source), 3D Slicer Radiomics extension |
| Segmentation Tool | Precise 3D delineation of tumor volumes on CT. | ITK-SNAP, 3D Slicer, Mint Medical mint Lesion |
| Molecular IHC Antibodies | Protein-level detection of molecular classifiers. | p53 (DO-7, Ventana), MLH1 (M1), MSH2 (G219-1129), MSH6 (SP93), PMS2 (EPR3947) |
| POLE Sequencing Assay | Detection of pathogenic exonuclease domain mutations. | Targeted NGS panel (e.g., Oncomine Comprehensive Assay), Sanger sequencing for hotspots |
| MSI Testing Kit | Confirmatory testing for MMRd status. | PCR-based MSI Analysis System (Promega), NGS-based MSI callers |
| Statistical Software | Advanced feature selection and predictive modeling. | R (caret, glmnet), Python (scikit-learn, PyTorch) |
| Bioinformatics Pipeline | Align and analyze NGS data for POLE/TP53 mutations. | GATK, VarScan, custom scripts for mutation signature analysis |

Table 3: Selected Recent Studies Linking Radiomics to EC Molecular Subtypes

| Study (Year) | Cohort Size | Key Radiomic Features Associated with Subtype | Predictive Performance (AUC/Accuracy) |
| --- | --- | --- | --- |
| Stanzione et al. (2021) | N=120 | p53abn: Higher GLCM Entropy, GLSZM Zone Variance. POLE: Lower NGTDM Coarseness. | Multiclass AUC: 0.78 (p53abn vs others) |
| Li et al. (2022) | N=147 | MMRd: Lower First-Order Uniformity. p53abn: Higher Shape Compactness. | Accuracy for MMRd: 0.73, p53abn: 0.82 |
| Fusco et al. (2023) | N=98 (External Val.) | Wavelet-HHL_GLCM_Correlation significant for POLE. | Combined model AUC for POLE: 0.88 (Training), 0.81 (Validation) |
| Cheng et al. (2023) | N=205 | p53abn: Higher Shape Sphericity, Lower GLRLM Run Entropy. | Random Forest accuracy for 4-class: 0.71 |

Linking CT radiomic phenotypes to POLE, p53, and MMRd status represents a transformative frontier in endometrial cancer research. This guide outlines the rigorous technical protocols, biological rationale, and analytical tools required to build generalizable models. Success in this endeavor promises to accelerate the integration of non-invasive radiomic biomarkers into clinical trials and personalized treatment strategies, a core objective of foundational radiomics research in endometrial cancer.

This technical guide provides a comparative analysis of three critical methodologies for the assessment of endometrial cancer: radiomics, subjective MRI assessment, and genomic classifiers. Framed within the broader context of foundational CT radiomics research for endometrial cancer, this document details the principles, applications, experimental protocols, and integrative potential of these modalities for researchers and drug development professionals.

Core Definitions and Comparative Metrics

Table 1: Core Characteristics and Performance Metrics of Assessment Modalities

| Feature | Radiomics (CT-based) | Subjective MRI Assessment | Genomic Classifiers |
| --- | --- | --- | --- |
| Primary Data Source | Quantitative features from medical images (CT, MRI). | Qualitative visual interpretation of MRI sequences. | DNA/RNA from tumor tissue (biopsy/resection). |
| Key Output | High-dimensional feature data (shape, intensity, texture). | Categorical staging (FIGO), myometrial invasion depth, lymph node suspicion. | Molecular subtype (TCGA: POLE-mut, MMR-d, NSMP, p53abn), risk score. |
| Typical Performance (Endometrial CA) | AUC: 0.78-0.89 for LVSI prediction; AUC: 0.82-0.91 for high-grade histology. | Accuracy: ~85-90% for deep myometrial invasion; Sensitivity: ~75-80% for lymph node metastasis. | 5-year RFS prediction: ProMisE classifier stratifies risk from >90% (POLE) to <40% (p53abn). |
| Automation Level | High (post-segmentation). | Low (expert-dependent). | High (post-sequencing). |
| Temporal Resolution | Pre-treatment, during therapy. | Pre-treatment, during therapy. | Pre-treatment (single snapshot). |
| Major Limitation | Standardization of segmentation and feature extraction. | Inter-observer variability. | Requires invasive tissue sampling; spatial heterogeneity. |
| Cost & Accessibility | Moderate (requires software, computational resources). | Moderate (MRI machine, radiologist expertise). | High (sequencing costs, bioinformatics infrastructure). |

Experimental Protocols

Protocol for a CT Radiomics Study in Endometrial Cancer

Aim: To develop a radiomics signature from pre-treatment CT images for predicting lymphovascular space invasion (LVSI).

  • Cohort Selection: Retrospective collection of 200+ endometrial cancer patients with pre-treatment contrast-enhanced abdominal-pelvic CT and confirmed surgical pathology LVSI status.
  • Image Acquisition & Preprocessing: Standardize CT protocol (e.g., 120 kVp, slice thickness ≤3 mm). Normalize voxel intensities (e.g., Z-score) and resample to isotropic voxels (1x1x1 mm³) to ensure spatial consistency.
  • Tumor Segmentation: Manual or semi-automatic segmentation of the primary tumor volume (VOI) on each CT slice by two blinded radiologists using 3D Slicer or ITK-SNAP software.
  • Feature Extraction: Extract ~1000+ radiomics features per VOI using PyRadiomics or similar library. Categories include:
    • First-Order Statistics: Intensity-based (mean, kurtosis, entropy).
    • Shape-based: Volume, sphericity, surface area.
    • Texture Features: Gray-Level Co-occurrence Matrix (GLCM), Gray-Level Run-Length Matrix (GLRLM), Neighborhood Gray-Tone Difference Matrix (NGTDM).
  • Feature Selection & Model Building: Compute the intraclass correlation coefficient (ICC) between the two radiologists' segmentations and retain only robust features (ICC > 0.8). Use LASSO regression on a training cohort (70%) to reduce dimensionality and select the most predictive features. Build a logistic regression or random forest model for LVSI prediction.
  • Validation: Test the model on the hold-out validation cohort (30%). Evaluate performance with AUC, accuracy, sensitivity, and specificity. Perform decision curve analysis for clinical utility.
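
The robustness filter in this protocol relies on an intraclass correlation coefficient computed per feature across readers. The protocol does not state which ICC form is used; the sketch below assumes ICC(2,1) in the Shrout and Fleiss convention (two-way random effects, absolute agreement, single rater), which is a common choice for inter-observer analyses. The function name and feature values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings -- list of rows, one row per case, one column per reader
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: cases, readers, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical values of one feature from two readers' segmentations (5 cases).
feature_by_reader = [
    [0.52, 0.50],
    [0.61, 0.63],
    [0.45, 0.47],
    [0.70, 0.69],
    [0.58, 0.55],
]
feature_icc = icc_2_1(feature_by_reader)
is_robust = feature_icc > 0.8  # threshold taken from the protocol above
```

In a full study the function would be applied to every extracted feature, and only features with `is_robust` set would proceed to LASSO selection.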

Protocol for Subjective MRI Assessment

Aim: To evaluate deep myometrial invasion (≥50%) in endometrial cancer.

  • Imaging Protocol: Perform pelvic MRI on a 1.5T or 3T scanner. Essential sequences: T2-weighted fast spin-echo (sagittal, axial oblique), diffusion-weighted imaging (DWI, b-values 0, 800-1000 s/mm²), and dynamic contrast-enhanced (DCE) T1-weighted series.
  • Blinded Review: Two expert gynecological radiologists, blinded to histopathology, independently review MRIs.
  • Assessment Criteria:
    • Myometrial Invasion: On T2WI, disruption of the junctional zone by intermediate-signal tumor. Deep invasion is tumor extension to the outer 50% of the myometrium or to the serosa.
    • DWI/ADC: Qualitative assessment of restricted diffusion in the myometrium.
    • DCE: Assessment of enhancement patterns relative to myometrium.
  • Analysis: Calculate sensitivity, specificity, and accuracy for each reviewer. Compute inter-observer agreement using Cohen's kappa statistic (κ).
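
Inter-observer agreement in the analysis step is summarized with Cohen's kappa, which corrects raw agreement for agreement expected by chance. A minimal sketch is given below; the reader calls are invented (1 = deep myometrial invasion, 0 = superficial), and the function assumes the two readers do not agree perfectly by chance alone (expected agreement below 1).

```python
def cohens_kappa(reader_a, reader_b):
    """Cohen's kappa for two raters assigning categorical labels."""
    n = len(reader_a)
    categories = set(reader_a) | set(reader_b)
    observed = sum(1 for a, b in zip(reader_a, reader_b) if a == b) / n
    expected = sum(
        (reader_a.count(c) / n) * (reader_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1.0 - expected)


# Hypothetical reads for ten patients.
reader_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
reader_2 = [1, 0, 0, 0, 1, 0, 1, 1, 0, 1]

kappa = cohens_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.2f}")
```

Here the readers agree on 8 of 10 cases and chance agreement is 0.5, so kappa is (0.8 - 0.5) / (1 - 0.5) = 0.6.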

Protocol for Genomic Classifier Application

Aim: To classify endometrial carcinoma according to the ProMisE (Proactive Molecular Risk Classifier for Endometrial Cancer) system.

  • Tissue Procurement: Formalin-fixed, paraffin-embedded (FFPE) tumor tissue from biopsy or hysterectomy specimen.
  • Sequencing & Analysis:
    • POLE Sequencing: Sanger or next-generation sequencing of the exonuclease domain of the POLE gene to identify hotspot mutations (e.g., P286R, V411L).
    • MMR Protein IHC: Immunohistochemistry for MLH1, MSH2, MSH6, and PMS2. Loss of nuclear staining in tumor cells indicates MMR-deficiency (MMR-d).
    • p53 IHC: Interpretation as wild-type (normal) or abnormal (overexpression/null) pattern.
  • Classification: Apply the hierarchical algorithm:
    1. Presence of a pathogenic POLE mutation → POLE-mut.
    2. If not POLE-mut, evidence of MMR-d → MMR-d.
    3. If neither, abnormal p53 → p53abn.
    4. If none of the above → No Specific Molecular Profile (NSMP).
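
The hierarchical algorithm above maps directly onto an ordered sequence of checks. The sketch below encodes the four steps exactly as listed; the function name and boolean inputs are hypothetical, and the determination of each input (variant pathogenicity, IHC interpretation) is assumed to have been made upstream.

```python
def classify_molecular_subtype(pole_pathogenic, mmr_deficient, p53_abnormal):
    """Assign a molecular subtype using the ordered rules listed above.

    Earlier rules take precedence, so a tumor with several classifying
    features is assigned to the first matching class.
    """
    if pole_pathogenic:
        return "POLE-mut"
    if mmr_deficient:
        return "MMR-d"
    if p53_abnormal:
        return "p53abn"
    return "NSMP"


# Hypothetical cases: (POLE pathogenic, MMR-deficient, p53 abnormal)
cases = {
    "case_1": (True, False, True),
    "case_2": (False, True, True),
    "case_3": (False, False, True),
    "case_4": (False, False, False),
}
assigned = {cid: classify_molecular_subtype(*flags) for cid, flags in cases.items()}
print(assigned)
```

The ordering matters for tumors with more than one classifying feature: `case_1` carries both a POLE mutation and abnormal p53 but is assigned to POLE-mut.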

Visualizing Integrative Analysis Workflows

[Diagram: three input data streams feed an analytical layer. MRI feeds subjective MRI assessment, CT feeds radiomics feature extraction, and genomics feeds the genomic classifier. All three feed a data fusion and multimodal model that outputs a comprehensive risk prediction.]

Integrative Multimodal Analysis Workflow

[Diagram: a patient with endometrial cancer undergoes clinical MRI (subjective read), a research CT scan (for radiomics), and tumor biopsy. These yield phenotypic data (FIGO stage, invasion depth), a radiomics signature (LVSI risk, grade score), and a molecular subtype (e.g., p53abn, MMR-d), which feed an integrative clinical decision. Low integrated risk leads to adjuvant therapy plan A (radiotherapy); high integrated risk leads to adjuvant therapy plan B (chemotherapy + targeted trial).]

Decision Logic for Integrated Patient Management

The Scientist's Toolkit: Research Reagent Solutions

Table 2: Essential Materials and Reagents for Featured Methodologies

| Item | Category | Function in Research Context |
| --- | --- | --- |
| 3D Slicer / ITK-SNAP | Radiomics Software | Open-source platforms for precise 3D manual or semi-automatic segmentation of tumor volumes on CT/MRI, a critical step for robust feature extraction. |
| PyRadiomics Python Library | Radiomics Analytics | A flexible open-source library for the extraction of a standardized set of radiomics features from medical images, enabling reproducible analysis. |
| Formalin-Fixed Paraffin-Embedded (FFPE) Tissue Blocks | Genomics Resource | The standard archival source of tumor DNA/RNA for retrospective genomic studies and classifier development/validation. |
| MSIplex or similar PCR Panel | Genomics Reagent | A multiplex PCR-based kit for detecting microsatellite instability (MSI), a surrogate for MMR-deficiency, from FFPE DNA. |
| Anti-p53 (DO-7) Antibody | Genomics/IHC Reagent | Primary antibody used in immunohistochemistry to assess p53 status; abnormal patterns (overexpression/null) define the p53abn molecular subgroup. |
| Phantom (e.g., Credence Cartridge Radiomics) | Radiomics Calibration | A physical imaging calibration device used to ensure scanner harmonization and test feature stability across different imaging platforms. |
| R Statistical Environment with 'glmnet' | Data Analysis Tool | Essential software environment for performing advanced statistical analysis, including LASSO regression for radiomics feature selection and model building. |

The Role of Deep Learning Radiomics and Convolutional Neural Networks (CNNs)

The integration of deep learning radiomics, powered by Convolutional Neural Networks (CNNs), represents a paradigm shift in quantitative imaging analysis for oncology. Within the specific scope of CT radiomics for endometrial cancer research, this approach moves beyond traditional, hand-crafted feature extraction to autonomously learn hierarchical, prognostic, and predictive patterns directly from medical images. This technical guide details the core principles, methodologies, and experimental frameworks underpinning this fusion, positioning it as a cornerstone for advancing personalized therapeutic strategies and drug development in endometrial oncology.

Core Technical Principles

From Handcrafted Radiomics to Deep Learning Radiomics

Traditional radiomics relies on manually engineered algorithms to extract predefined mathematical features (e.g., texture, shape, intensity) from segmented regions of interest (ROIs). Deep learning radiomics uses CNNs to learn these feature representations directly from image data in an end-to-end fashion. With their hierarchical layers of convolutional filters, pooling operations, and non-linear activations, CNNs can discern subtle, complex patterns that are often imperceptible to the human eye or to traditional methods, potentially capturing the intratumoral heterogeneity of endometrial carcinomas.

Architecture of CNNs in Medical Imaging

A standard CNN for CT analysis typically involves:

  • Input Layer: Accepts 2D slices or 3D patches of CT images.
  • Convolutional & Activation Layers: Apply learnable kernels to detect local features (edges, textures, patterns). ReLU is a common default activation function.
  • Pooling Layers: Reduce spatial dimensionality, providing translation invariance.
  • Fully Connected Layers: Integrate high-level features for final classification/regression tasks (e.g., tumor grade, lymphovascular space invasion (LVSI) status, molecular subtype prediction).
  • Output Layer: Provides the prediction (e.g., probability scores).

Experimental Protocols & Methodologies

A robust experimental pipeline for CT-based deep learning radiomics in endometrial cancer involves the following critical phases.

Data Curation & Preprocessing Protocol

  • Multi-institutional Cohort Curation: Gather retrospective CT (venous phase) datasets with corresponding histopathological ground truth (e.g., histotype, grade, LVSI, MSI/POLE status). Ethical approval and de-identification are mandatory.
  • Image Standardization: Resample all CT volumes to isotropic voxel spacing (e.g., 1x1x1 mm³). Apply intensity normalization (e.g., Z-score using muscle tissue attenuation) to minimize scanner variability.
  • Tumor Segmentation: The region of interest (ROI) is defined via:
    • Manual Segmentation: By expert radiologists (gold standard, time-intensive).
    • Semi-automatic/Supervised Deep Learning Segmentation: Using a U-Net architecture trained on manual contours. This model is trained separately prior to the main radiomics analysis.
  • Data Augmentation: To combat limited dataset sizes, apply in-plane transformations (rotation, scaling, flipping) and moderate intensity shifts during model training.

Model Development & Training Protocol

  • Model Architecture Selection: Choose a 3D CNN backbone (e.g., 3D ResNet18, EfficientNet) suitable for volumetric data. A hybrid "fusion model" that combines CNN-derived image features with clinical variables often performs best.
  • Implementation: Develop model using PyTorch or TensorFlow. Initialize with weights pre-trained on natural (ImageNet) or medical (RadImageNet) images.
  • Training Configuration:
    • Loss Function: Binary cross-entropy for binary classification tasks (categorical cross-entropy for multi-class tasks such as molecular subtype prediction).
    • Optimizer: AdamW optimizer (weight decay=1e-4).
    • Learning Rate: Start at 1e-4, reduce on plateau.
    • Batch Size: Maximum permissible by GPU memory (typically 8-16 for 3D).
    • Validation: Use a held-out validation set for epoch selection.
  • Training Regimen: Train for a maximum of 200 epochs with early stopping (patience=30 epochs) monitoring validation loss.
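
The early-stopping rule in the training regimen can be expressed independently of any framework. The sketch below scans a sequence of per-epoch validation losses and reports which epoch's weights would be kept and when training would halt. The loss values and the function name are hypothetical, and a short patience is used in the example so the behavior is visible; the protocol above specifies a patience of 30 epochs.

```python
def early_stopping_epoch(val_losses, patience=30, min_delta=0.0):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training stops at the first epoch where the loss has not improved by
    more than min_delta for `patience` consecutive epochs. If that never
    happens, stop_epoch is the last epoch in the sequence.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(val_losses) - 1


# Hypothetical validation losses for seven epochs (0-indexed).
val_losses = [0.90, 0.70, 0.60, 0.65, 0.62, 0.61, 0.63]

best_epoch, stop_epoch = early_stopping_epoch(val_losses, patience=3)
print(f"restore weights from epoch {best_epoch}, stopped at epoch {stop_epoch}")
```

With a patience of 3, the best loss occurs at epoch 2 and training halts at epoch 5, after three epochs without improvement.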

Performance Evaluation Protocol

  • Data Splitting: Use a strict patient-wise split (e.g., 70:15:15 for Training:Validation:Test).
  • Metrics: Calculate on the independent test set: Accuracy, AUC (Area Under the ROC Curve), Precision, Recall, F1-Score.
  • Statistical Analysis: Compare model performance against traditional radiomics models and clinical models using DeLong's test for AUC comparison.
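
A strict patient-wise split means that every scan or patch from a given patient lands in exactly one partition, which prevents leakage between training and test data. A minimal sketch follows, assuming each scan carries a patient identifier; the identifiers, function name, and fixed seed are hypothetical.

```python
import random


def patient_wise_split(patient_ids, fractions=(0.70, 0.15, 0.15), seed=42):
    """Split unique patients into train/validation/test sets of IDs.

    patient_ids -- one entry per scan; patients may appear multiple times
    """
    unique = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique)  # reproducible shuffle
    n_train = round(fractions[0] * len(unique))
    n_val = round(fractions[1] * len(unique))
    train = set(unique[:n_train])
    val = set(unique[n_train:n_train + n_val])
    test = set(unique[n_train + n_val:])
    return train, val, test


# Hypothetical cohort: 20 patients with 3 scans each.
scan_patient_ids = [f"P{i:03d}" for i in range(20) for _ in range(3)]

train_ids, val_ids, test_ids = patient_wise_split(scan_patient_ids)
train_scans = [pid for pid in scan_patient_ids if pid in train_ids]
print(len(train_ids), len(val_ids), len(test_ids), len(train_scans))
```

For imbalanced endpoints, stratifying the patient shuffle by outcome label is a common refinement that this sketch omits.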

Table 1: Performance Comparison of Models for Predicting High-Grade Endometrioid Carcinoma

| Model Type | AUC (95% CI) | Accuracy | Sensitivity | Specificity | F1-Score |
| --- | --- | --- | --- | --- | --- |
| Clinical Model (Age, BMI) | 0.68 (0.60-0.75) | 0.65 | 0.62 | 0.67 | 0.63 |
| Traditional Radiomics Model | 0.79 (0.72-0.85) | 0.74 | 0.71 | 0.76 | 0.72 |
| 3D CNN (Deep Radiomics) | 0.88 (0.83-0.92) | 0.82 | 0.85 | 0.80 | 0.83 |
| Fusion (CNN + Clinical) | 0.91 (0.87-0.95) | 0.86 | 0.88 | 0.85 | 0.86 |

Table 2: Key Studies on Deep Learning Radiomics in Endometrial Cancer (2022-2024)

| Study (Year) | Cohort Size | Primary Task | Model Used | Key Result (AUC) |
| --- | --- | --- | --- | --- |
| Li et al. (2022) | n=415 | Preoperative LVSI Prediction | Custom 3D ResNet | 0.89 |
| Park et al. (2023) | n=328 | Molecular Subtype Classification | 3D DenseNet-121 | 0.84 (for CN-high) |
| Zhao & Group (2024) | n=501 | Deep Myometrial Invasion Detection | Vision Transformer | 0.87 |
| Meta-analysis (2024) | n=2,100+ | Various Risk Stratifications | Ensemble CNNs | Pooled AUC: 0.85 |

Visualizations

[Diagram: CT scan → preprocessing (DICOM) → segmentation (standardized) → data preparation (3D ROI) → CNN model (training data) → risk stratification, e.g., LVSI or molecular type (inference). Clinical data (tabular) also feeds data preparation.]

Deep Learning Radiomics Workflow

[Diagram: CT volumes pass through a 3D CNN backbone (feature extraction) and clinical variables pass through fully connected layers. The two feature sets are concatenated, passed through joint fully connected layers, and produce the predicted outcome.]

Hybrid CNN-Clinical Fusion Model Architecture

The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Materials & Computational Tools for Deep Learning Radiomics Experiments

| Item/Category | Example Product/Software | Function in Research |
| --- | --- | --- |
| Medical Imaging Database | The Cancer Imaging Archive (TCIA) Endometrial Collections | Provides publicly available, curated CT image datasets with associated metadata for model training and validation. |
| Annotation/Segmentation Tool | 3D Slicer, ITK-SNAP | Open-source software for manual or semi-automatic delineation of tumor volumes on CT scans (creating ground truth ROIs). |
| Deep Learning Framework | PyTorch, TensorFlow with MONAI extension | Core libraries for building, training, and validating 3D CNN models. MONAI provides medical imaging-specific functions. |
| Radiomics Feature Engine | PyRadiomics | Used for benchmarking, to extract handcrafted radiomics features for comparison with deep learning features. |
| High-Performance Computing | NVIDIA GPUs (e.g., A100, V100), Cloud Platforms (AWS, GCP) | Provides the necessary computational power for training complex 3D CNN models on large volumetric datasets. |
| Statistical Analysis Suite | R (with pROC, caret packages), Python (scikit-learn, SciPy) | For rigorous statistical evaluation, hypothesis testing, and result visualization. |

Radiomics, the high-throughput extraction of quantitative features from medical images, is transitioning from a research curiosity to a tool with significant potential for clinical decision support and trial enrichment in endometrial cancer (EC). This guide details the technical readiness for integrating radiomics into EC management, framed within foundational radiomics principles: image acquisition, segmentation, feature extraction, and model validation. The ultimate goal is to derive non-invasive biomarkers that reflect tumor pathophysiology, aiding in prognosis, molecular subtyping, and predicting treatment response.

Foundational Principles & Quantitative Data

The radiomics pipeline yields vast quantitative data. Core feature categories are summarized below.

Table 1: Core Radiomics Feature Categories in Endometrial Cancer

| Category | Description | Example Features | Hypothesized Biological Correlation in EC |
| --- | --- | --- | --- |
| First-Order Statistics | Describe voxel intensity distribution without spatial relationships. | Mean, Median, Skewness, Kurtosis, Entropy | Tumor cellularity, necrosis, heterogeneity. |
| Shape & Size | 3D descriptors of the tumor region of interest (ROI). | Volume, Surface Area, Sphericity, Compactness | Tumor growth pattern, invasiveness. |
| Texture (Second-Order) | Quantify intra-tumoral heterogeneity via spatial relationships of voxel intensities. | Gray Level Co-occurrence Matrix (GLCM): Contrast, Correlation, Energy. Gray Level Run Length Matrix (GLRLM): Run Length Non-Uniformity. | Reflects underlying genomic instability, histological diversity (e.g., tumor grade). |
| Higher-Order | Features from filtered images or transform domains. | Wavelet features, Laplacian of Gaussian (LoG) filtered features. | Captures multi-scale heterogeneity patterns. |
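
As an illustration of the first-order category, the sketch below computes mean, skewness, kurtosis, and histogram entropy from a flat list of ROI voxel values. Conventions are assumptions that vary between tools: kurtosis here is the non-excess form (a normal distribution gives 3), entropy uses a fixed bin width, and the ROI is assumed not to be constant. The function name and voxel values are hypothetical.

```python
import math
from collections import Counter


def first_order_features(voxels, bin_width=25):
    """Mean, skewness, kurtosis, and histogram entropy (bits) of ROI voxels."""
    n = len(voxels)
    mean = sum(voxels) / n
    # Central moments of order 2, 3, and 4.
    m2 = sum((v - mean) ** 2 for v in voxels) / n
    m3 = sum((v - mean) ** 3 for v in voxels) / n
    m4 = sum((v - mean) ** 4 for v in voxels) / n
    # Discretize intensities into fixed-width bins before computing entropy.
    counts = Counter(math.floor(v / bin_width) for v in voxels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {
        "mean": mean,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
        "entropy": entropy,
    }


# Tiny illustrative ROI (HU); four voxels falling into four separate bins.
features = first_order_features([10, 20, 30, 40], bin_width=10)
print(features)
```

Because entropy depends on the bin width, the discretization setting should be fixed and reported alongside any first-order or texture feature.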

Table 2: Recent Performance Metrics of EC Radiomics Models (2022-2024)

| Clinical Task | Imaging Modality | Key Radiomics Signature | Performance (AUC/Accuracy) | Study Reference |
| --- | --- | --- | --- | --- |
| LVSI Prediction | T2W MRI | Wavelet-HLL_firstorder_90Percentile + Shape_Sphericity | AUC: 0.87-0.91 | Liu et al., 2023 |
| Molecular Classification (POLE vs p53abn) | CE-T1W MRI | GLCM_Correlation + GLSZM_SizeZoneNonUniformity | AUC: 0.84 | Stanzione et al., 2022 |
| Myometrial Invasion Depth | Multi-parametric MRI | Combined DWI & T2W texture features | Accuracy: 89% | Feng et al., 2024 |
| Recurrence Risk Stratification | Pre-op CT | Shape_Maximum3DDiameter + GLRLM_RunVariance | C-index: 0.78 | Wang et al., 2023 |

Experimental Protocols for Key Studies

Protocol A: Building a Radiomics Model for Lymphovascular Space Invasion (LVSI) Prediction from MRI

  • Cohort & Image Acquisition: Retrospectively enroll 200+ pathologically confirmed EC patients. Acquire pre-operative pelvic T2-weighted MRI on a 3T scanner with standardized protocol (slice thickness ≤3mm, no gap).
  • Tumor Segmentation: Manually delineate the entire primary tumor volume (3D ROI) on each T2WI slice by two expert radiologists using ITK-SNAP software. Calculate inter-observer Dice coefficient (target >0.85).
  • Feature Extraction & Stability: Extract ~1500 radiomics features per ROI using PyRadiomics (v3.0.1). Apply intra-class correlation coefficient (ICC>0.75) test on 30 randomly resegmented cases to filter stable features.
  • Feature Selection & Modeling: Divide data (70:30 training:validation). Apply ComBat harmonization for multi-center data. Use least absolute shrinkage and selection operator (LASSO) regression on the training set for feature selection. Build a logistic regression or support vector machine (SVM) classifier.
  • Validation: Evaluate model performance on the hold-out validation set using AUC, sensitivity, specificity. Perform decision curve analysis (DCA) to assess clinical net benefit.
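
The inter-observer Dice coefficient used in the segmentation step measures the overlap between two readers' masks as 2|A ∩ B| / (|A| + |B|). A minimal sketch on flattened binary masks is shown below; the masks and function name are hypothetical, and real masks would be 3D arrays flattened in the same voxel order.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap of two binary masks given as equal-length 0/1 sequences."""
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * intersection / total


# Hypothetical flattened masks from two radiologists.
reader_1 = [0, 1, 1, 1, 1, 0, 0, 0]
reader_2 = [0, 0, 1, 1, 1, 1, 0, 0]

dice = dice_coefficient(reader_1, reader_2)
meets_target = dice > 0.85  # target stated in the protocol above
print(f"Dice = {dice:.2f}, meets target: {meets_target}")
```

In this toy case the masks share 3 of 4 voxels each, giving a Dice of 0.75, which would fall short of the 0.85 target and prompt review of the contouring instructions.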

Protocol B: Radiomics for Proficient Mismatch Repair (pMMR) vs. Deficient Mismatch Repair (dMMR) Classification

  • Ground Truth: Use immunohistochemistry (IHC) for MLH1, PMS2, MSH2, and MSH6 proteins on surgical specimens to define dMMR (loss of expression) and pMMR (intact expression).
  • Multi-modal Imaging: Register and segment tumors on both T2W and contrast-enhanced T1W (CE-T1W) MRI sequences.
  • Multi-sequence Feature Fusion: Extract separate radiomics feature sets from T2W and CE-T1W ROIs. Perform Z-score normalization per feature type. Use a canonical correlation analysis (CCA) based method to fuse the two feature sets into a joint representation.
  • Model Development: Employ a machine learning pipeline with mutual information for initial filtering, followed by recursive feature elimination (RFE) with cross-validation. Train an ensemble classifier (e.g., Random Forest).
  • Biological Correlation: Perform gene set enrichment analysis (GSEA) on matched RNA-seq data from The Cancer Genome Atlas (TCGA-UCEC) to link the final radiomics signature to underlying immune and proliferation pathways.

Visualizing the Radiomics-to-Insight Pathway

[Diagram: medical imaging (MRI/CT) → 3D tumor segmentation → high-dimensional feature extraction → radiomics database → feature selection → ML/AI model development → clinical validation, which yields decision support (LVSI, stage, type) and trial enrichment (molecular class, prognosis).]

Radiomics Pipeline from Image to Clinical Application

[Diagram: a radiomics signature (e.g., high texture heterogeneity) infers an aggressive tumor microenvironment, which is linked to hypoxia and necrosis, cellular and nuclear pleomorphism, angioinvasion and LVSI, and molecular alterations (e.g., p53 mutation).]

Biological Correlates of Radiomics Features in EC

The Scientist's Toolkit: Key Research Reagent Solutions

Table 3: Essential Tools for EC Radiomics Research

| Tool/Category | Specific Solution/Software | Function & Application in EC Radiomics |
| --- | --- | --- |
| Medical Imaging | 3T MRI with pelvic phased-array coil; Standardized T2W, DWI, DCE sequences. | Provides high-resolution, multi-parametric image data for primary tumor and local staging. |
| Segmentation Software | 3D Slicer, ITK-SNAP, MITK, Commercial AI-assisted platforms. | Enables precise manual, semi-, or fully-automatic 3D delineation of the tumor ROI, the foundational step. |
| Radiomics Engine | PyRadiomics (open-source), MaZda, LIFEx. | Standardized libraries for extracting hundreds of quantitative features from segmented ROIs, ensuring reproducibility. |
| Feature Harmonization | ComBat, pyComBat, RAVEL. | Corrects for non-biological variance (scanner, protocol differences) in multi-center studies. |
| Machine Learning | scikit-learn (Python), caret (R), TensorFlow/PyTorch for deep learning. | Provides algorithms for feature selection, model building (e.g., LASSO, SVM, Random Forest), and validation. |
| Statistical Analysis | R, Python (with pandas, statsmodels). | Performs survival analysis (Cox model), decision curve analysis, and comprehensive statistical testing. |
| Pathology Correlation | Digital pathology scanners, QuPath software. | Enables spatial correlation of radiomic features with histologic ground truth (grade, LVSI, subtype) on whole-slide images. |

Conclusion

CT radiomics represents a powerful, non-invasive tool for decoding the phenotypic heterogeneity of endometrial cancer, translating routine imaging into quantifiable biomarkers. The foundational principles establish its biological plausibility, while rigorous methodological pipelines enable the extraction of stable, informative features. Addressing reproducibility through standardization and harmonization is paramount for clinical translation. Current evidence supports its potential for predicting aggressive histopathological and molecular features, offering a complementary approach to invasive tissue sampling. For researchers and drug developers, radiomics offers a new route to patient stratification in clinical trials, biomarker discovery, and the development of imaging surrogates for treatment response. Future work must focus on large-scale, prospective multicenter validation, integration with multi-omics data (radiogenomics), and the development of FDA-qualified radiophenotypes to realize its full potential in precision oncology for endometrial cancer.