Calcium Correction
Mechanistic ML model predicting ionized calcium from nine routine ICU labs — out...
clinical-ml, icu, biostatistics, Python, scikit-learn, XGBoost

Premise

Clinicians need ionized calcium, the physiologically active form, but ICUs routinely measure only total calcium. The standard fix is a 1973 correction formula (Payne) that adjusts for albumin alone. The problem: it was derived on healthy outpatients and achieves only R² = 0.42 in the ICU populations where it is actually used. Fifty years of wide adoption, and it is demonstrably wrong for this setting.
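For reference, the albumin adjustment is usually quoted in SI units as a linear shift toward a reference albumin of 40 g/L. The sketch below uses that commonly cited form. The 0.02 coefficient, the 40 g/L reference, and the function name are assumptions, since the exact variant evaluated in this project is not stated here. Note that the result is an albumin-adjusted total calcium, not an ionized value.

```python
def payne_adjusted_calcium(total_ca_mmol_l: float, albumin_g_l: float) -> float:
    """Albumin-adjusted total calcium in mmol/L (commonly quoted SI form).

    Assumed coefficients: 0.02 mmol/L per g/L of albumin below a 40 g/L
    reference. The variant used in the project may differ.
    """
    return total_ca_mmol_l + 0.02 * (40.0 - albumin_g_l)


# Hypothetical ICU patient: total calcium 2.00 mmol/L, albumin 25 g/L.
adjusted = payne_adjusted_calcium(2.00, 25.0)
print(f"adjusted total calcium: {adjusted:.2f} mmol/L")  # prints 2.30
```

Albumin is the only input, so two patients with the same albumin but very different pH or chloride receive the same adjustment.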

How it evolved

The project started as a straightforward formula comparison on MIMIC-III. It grew into a full mechanistic modeling study once it became clear that existing corrections were missing two things the literature had mostly ignored: pH, which directly governs the calcium-albumin binding equilibrium, and the anion gap components (sodium, chloride, bicarbonate), which Goldwasser and Yap had shown to matter but had not incorporated mechanistically. The dataset grew from ~4K to 130K+ paired measurements as MIMIC-IV v3.1 became available, and the validation layer expanded to include both temporal (MIMIC-III) and geographic (eICU, 131 hospitals) external datasets.

Technical crux

The model is a fraction-offset sigmoid, iCa = offset + sigmoid(β) × total_calcium, in which the sigmoid term is the ionized fraction. This constrains the fraction to physiological bounds (0–1) while keeping the model interpretable. Unlike a black-box regression, the parameters map directly onto binding chemistry: the negative pH coefficient (β = −0.187) reflects that higher pH drives more calcium onto albumin. Chloride ranks #2 in feature importance, while albumin ranks only fifth, which undercuts the premise of every existing albumin-based formula. Missing data was handled with 20-imputation MICE and Rubin's rules pooling. Overfitting was assessed with Harrell's bootstrap optimism correction; the shrinkage factor of 1.017 is close to the ideal value of 1, indicating minimal overfitting.
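The model form and the pooling step described above can be sketched as follows. This is a minimal illustration, not the project's code. It assumes the sigmoid argument is a linear predictor (intercept plus coefficient times feature) and that pH enters as a standardized z-score. All names and values are hypothetical except the pH coefficient of −0.187.

```python
import math
from statistics import mean, variance


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_ica(total_ca, features, coefs, intercept, offset):
    """iCa = offset + sigmoid(linear predictor) * total calcium.

    Assumption: the sigmoid argument is intercept + sum(coef * feature).
    """
    z = intercept + sum(coefs[name] * value for name, value in features.items())
    ionized_fraction = sigmoid(z)  # bounded between 0 and 1
    return offset + ionized_fraction * total_ca


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed datasets with Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (n - 1 denominator)
    total_var = within + (1.0 + 1.0 / m) * between
    return pooled, total_var


# Hypothetical inputs: only the pH coefficient (-0.187) comes from the text.
coefs = {"ph_z": -0.187}
low_ph = predict_ica(2.2, {"ph_z": -1.0}, coefs, intercept=0.0, offset=0.0)
high_ph = predict_ica(2.2, {"ph_z": 1.0}, coefs, intercept=0.0, offset=0.0)
print(low_ph > high_ph)  # True: the negative coefficient lowers the fraction as pH rises
```

With 20 imputations, `pool_rubin` would be called once per parameter, passing the 20 fitted estimates and their squared standard errors.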

Findings

Test RMSE is 0.068 mmol/L (95% CI 0.067–0.068), a 22% improvement over Payne and 7% over the best published comparator (Yap 2022). Bootstrap-corrected R² = 0.673. External validation: MIMIC-III RMSE 0.068 (virtually identical); eICU RMSE 0.087 across 131 hospitals (the expected degradation from institutional heterogeneity, still clinically useful). Performance held across 18 subgroups, including patients with low albumin, patients with impaired kidney function, and elderly patients. XGBoost reached a ceiling of R² = 0.67, only 4% above the mechanistic model, which is not worth the interpretability cost. The work is at manuscript stage, with poster drafts prepared.
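A bootstrap-corrected R² of this kind follows the general shape of Harrell's optimism correction: refit the model on each resample, measure how much better it scores on its own resample than on the original data, and subtract the average gap from the apparent R². The sketch below uses a one-predictor least-squares line as a stand-in for the real model. The synthetic data, function names, and number of resamples are hypothetical, and the shrinkage factor is not computed.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(x, y, slope, intercept):
    my = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def optimism_corrected_r2(x, y, n_boot=200, seed=0):
    rng = random.Random(seed)
    n = len(x)
    apparent = r_squared(x, y, *fit_line(x, y))
    optimism = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        xb = [x[i] for i in idx]
        yb = [y[i] for i in idx]
        slope, intercept = fit_line(xb, yb)
        # Score on the resample it was fit to, minus score on the original data.
        optimism.append(
            r_squared(xb, yb, slope, intercept) - r_squared(x, y, slope, intercept)
        )
    return apparent - mean(optimism), apparent


# Synthetic data for illustration only.
rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [xi + rng.gauss(0.0, 1.0) for xi in x]
corrected, apparent = optimism_corrected_r2(x, y)
print(f"apparent R2 = {apparent:.3f}, optimism-corrected R2 = {corrected:.3f}")
```

With few parameters relative to the sample size the correction is small, which is consistent with a shrinkage factor near 1.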

Open questions

The model explains 67% of variance; 33% remains. Parathyroid hormone (PTH) is the minute-to-minute regulator of ionized calcium and has never been included in a correction model, likely because it is not part of a comprehensive metabolic panel (CMP). Checking MIMIC-IV for PTH availability is the obvious next step; the estimated ceiling with PTH is R² ≈ 0.72–0.77. Prospective clinical validation is the real test: does model-guided calcium management change outcomes, or does the prediction accuracy fail to translate into clinical decisions that matter?

Detailed case study in progress.

2024