Calcium Correction

Mechanistic ML model predicting ionized calcium from nine routine ICU labs, outp...

clinical-ml, icu, biostatistics, Python, scikit-learn, XGBoost

Premise

Clinicians need ionized calcium (the physiologically active form), but only total calcium is routinely measured in ICUs. The standard fix is a 1973 correction formula (Payne) that adjusts for albumin alone. The problem: it was derived on healthy outpatients and achieves only r² = 0.42 in the ICU populations where it is actually used. Fifty years of wide adoption, and demonstrably wrong for the patients who need it most.
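For reference, the albumin adjustment usually attributed to Payne can be sketched as below. This is the widely taught SI-unit form (0.02 mmol/L per g/L of albumin below 40 g/L); coefficients vary between sources, and which variant was evaluated in this project is not stated here, so treat the constants and the function name as assumptions.

```python
def payne_corrected_calcium(total_ca_mmol_l: float, albumin_g_l: float) -> float:
    """Albumin-adjusted total calcium in mmol/L.

    Assumed form: corrected = total + 0.02 * (40 - albumin), with albumin in g/L.
    The result is an adjusted *total* calcium, not a predicted ionized calcium.
    """
    return total_ca_mmol_l + 0.02 * (40.0 - albumin_g_l)


# Illustrative values only: a hypoalbuminemic patient with low measured total calcium.
corrected = payne_corrected_calcium(total_ca_mmol_l=2.00, albumin_g_l=25.0)
print(f"corrected total calcium: {corrected:.2f} mmol/L")
```

Albumin is the only input, so two patients with the same albumin and total calcium but different pH or chloride receive the same corrected value.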

How it evolved

Started as a straightforward formula comparison on MIMIC-III. Grew into a full mechanistic modeling study when it became clear that existing corrections were missing two things the literature had mostly ignored: pH (which governs calcium-albumin binding equilibrium directly) and the anion gap components (sodium, chloride, bicarbonate), which Goldwasser and Yap had shown matter but hadn't incorporated mechanistically. The dataset grew from ~4K to 130K+ paired measurements as MIMIC-IV v3.1 became available, and the validation layer expanded to include both temporal (MIMIC-III) and geographic (eICU, 131 hospitals) external datasets.

Technical crux

The model is a fraction-offset sigmoid, iCa = offset + sigmoid(β·x) × total_calcium, where x is the vector of lab inputs and β their coefficients. The sigmoid constrains the ionized fraction to physiological bounds (0–1) while keeping the model interpretable. Unlike a black-box regression, the parameters map directly onto binding chemistry: the pH coefficient (β = −0.187) reflects that higher pH drives more calcium onto albumin. Chloride ranks #2 in feature importance, while albumin ranks only fifth, which undercuts the premise of every existing albumin-only formula. Missing data was handled with 20-imputation MICE and Rubin's rules pooling; overfitting was addressed with Harrell's bootstrap optimism correction (shrinkage factor 1.017, close to the ideal value of 1).
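A minimal sketch of the fraction-offset sigmoid form, assuming the sigmoid is applied to a linear combination of standardized lab values. Only the pH coefficient (−0.187) comes from the text above; the other coefficients, the intercept, the offset, the feature names, and the assumption that inputs are z-scored are all hypothetical placeholders, not the fitted model.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def ionized_fraction(features: dict, coefs: dict, intercept: float) -> float:
    """Fraction of total calcium predicted to be ionized, always in (0, 1)."""
    z = intercept + sum(coefs[name] * features[name] for name in coefs)
    return sigmoid(z)


def predict_ica(total_ca: float, features: dict, coefs: dict,
                intercept: float, offset: float) -> float:
    """iCa = offset + sigmoid(linear predictor) * total calcium (mmol/L)."""
    return offset + ionized_fraction(features, coefs, intercept) * total_ca


# pH coefficient from the text; everything else here is a hypothetical placeholder.
COEFS = {"ph_z": -0.187, "chloride_z": 0.10, "albumin_z": -0.05}
INTERCEPT = 0.0  # hypothetical: gives a fraction of 0.5 at average labs
OFFSET = 0.0     # hypothetical

average_labs = {"ph_z": 0.0, "chloride_z": 0.0, "albumin_z": 0.0}
alkalotic = {"ph_z": 2.0, "chloride_z": 0.0, "albumin_z": 0.0}

ica_average = predict_ica(2.2, average_labs, COEFS, INTERCEPT, OFFSET)
ica_alkalotic = predict_ica(2.2, alkalotic, COEFS, INTERCEPT, OFFSET)
print(f"average labs: {ica_average:.3f} mmol/L, alkalotic: {ica_alkalotic:.3f} mmol/L")
```

With a negative pH coefficient, raising pH lowers the predicted ionized fraction at the same total calcium, which is the direction the binding chemistry predicts.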

Findings

Test RMSE 0.068 mmol/L (95% CI 0.067–0.068), a 22% improvement over Payne and 7% over the best published comparator (Yap 2022). Bootstrap-corrected R² = 0.673. External validation: MIMIC-III RMSE 0.068 (virtually identical); eICU RMSE 0.087 across 131 hospitals (expected degradation from institutional heterogeneity, still clinically useful). Performance held across 18 subgroups, including patients with low albumin, impaired kidney function, and advanced age. XGBoost ceiling was R² = 0.67, only 4% above the mechanistic model, not worth the interpretability cost. Manuscript-stage; poster drafts prepared.
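The headline metrics are of a kind that can be computed as sketched below. How the reported confidence interval was obtained is not stated; a percentile bootstrap is one common choice and is used here as an assumption. The data are synthetic and the function names are illustrative.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def bootstrap_rmse_ci(y_true, y_pred, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap CI for RMSE, resampling (truth, prediction) pairs."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(rmse([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def relative_improvement(rmse_new, rmse_ref):
    """Fractional RMSE reduction of a new model relative to a reference."""
    return (rmse_ref - rmse_new) / rmse_ref


# Synthetic stand-in data, not study data.
rng = random.Random(42)
y_true = [rng.gauss(1.15, 0.10) for _ in range(300)]
y_pred = [t + rng.gauss(0.0, 0.07) for t in y_true]

point = rmse(y_true, y_pred)
ci_low, ci_high = bootstrap_rmse_ci(y_true, y_pred)
print(f"RMSE {point:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
```

Under this definition, a 22% improvement means the new model's RMSE is 78% of the reference formula's RMSE on the same test set.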

Open questions

The model explains 67% of variance; 33% remains unexplained. PTH is the minute-to-minute regulator of ionized calcium and has never been included in a correction model, likely because it is not part of a comprehensive metabolic panel (CMP). Checking MIMIC-IV for PTH availability is the obvious next step; estimated ceiling with PTH: R² ≈ 0.72–0.77. Prospective clinical validation is the real test: does model-guided calcium management change outcomes, or does the gain in prediction accuracy fail to translate into clinical decisions that matter?

Detailed case study in progress.


2024