# Package handles the figure
fig <- plot(corr_analysis(data))
# Analyst decorates to context
fig +
  hv_theme("manuscript") +
  scale_color_brewer(palette = "Set1") +
  annotate("text", ...)

Scaling Best Practices in a Large Hybrid SAS/R Team
Cleveland Clinic,
Heart, Vascular & Thoracic Institute,
Cardiovascular Outcome Registries and Research
April 27, 2026
We built methods in-house. Now we maintain them too.
Discipline doesn’t scale under production pressure
| Study (Researcher) | Analysis Report (CORR) | Manuscript (Researcher) |
|---|---|---|
| Research questions | Data ingestion | Introduction |
| Statistical approaches | Model building | Methods |
| Communication | Figures & Tables | Results |
| | | Discussion |
“Bus insurance” — any analyst can pick up any job.
Make the safe path the easy path.
Is this the right data? Has it changed?
Layer governance on top of existing structure.
study_root/
  analyses/       ← jobs
  datasets/       ← *this problem*
  descriptive/
  distributions/
  documents/      ← deliverables
  estimates/      ← cached results
  graphs/         ← pdf
datasets/built260430.manifest.yml
Register explicitly — separate from build
Verify before analysis
Filename says when. Checksum says whether.
Rebuild without re-registering → every downstream job fails.
That failure is the feature.
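The register and verify steps can be sketched in base R. This is a minimal illustration, not the team's implementation: the function names `register_dataset()` and `verify_dataset()`, the manifest fields, and the use of MD5 are all assumptions, and `write.dcf()` stands in for a real YAML writer by emitting flat `key: value` lines.

```r
# Hypothetical helpers: names, manifest fields, and the MD5 choice are
# illustrative assumptions, not taken from the slides.

# Default manifest path: <dataset stem>.manifest.yml next to the dataset.
manifest_for <- function(data_path) {
  paste0(tools::file_path_sans_ext(data_path), ".manifest.yml")
}

# Register: an explicit step, separate from the build, that records the
# dataset's checksum in the manifest.
register_dataset <- function(data_path,
                             manifest_path = manifest_for(data_path)) {
  entry <- data.frame(
    file       = basename(data_path),
    md5        = unname(tools::md5sum(data_path)),
    registered = format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
    stringsAsFactors = FALSE
  )
  write.dcf(entry, file = manifest_path)  # flat "key: value" lines
  invisible(manifest_path)
}

# Verify: stop the job if the dataset is unregistered or no longer
# matches its registered checksum.
verify_dataset <- function(data_path,
                           manifest_path = manifest_for(data_path)) {
  if (!file.exists(manifest_path)) {
    stop("No manifest for ", data_path, "; register the dataset first.")
  }
  manifest <- read.dcf(manifest_path)
  current  <- unname(tools::md5sum(data_path))
  if (!identical(current, unname(manifest[1, "md5"]))) {
    stop("Checksum mismatch for ", data_path,
         ": dataset changed since it was registered.")
  }
  invisible(TRUE)
}
```

Under these assumptions, each analysis job would call `verify_dataset()` before reading its data, so a rebuild without a matching `register_dataset()` call stops the job instead of silently changing results.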
Make the safe path the easy path.
SAS
R & RStudio
estimates/ fail to load
Can you reproduce this analysis six months from now?
renv with RStudio projects: In flight
update.packages() breaks other analyses silently
estimates/ fail to load 6 months later
SAS macro library
R functions
SAS: did that update break something? R: did anyone else get that fix?
An R package is a versioned, distributable library.
Functions in the study folder
Code in a package
renv::update()
The hard problems shift; they don’t disappear.
SAS: partially solved this
.lst pairs code and output
R scripts: no equivalent
Can a new analyst navigate this job without asking whoever wrote it?
report.qmd = job.sas
Always render before committing.
Make the safe path the easy path.
SAS: plot.sas
R: ggplot2
Can we capture that effort once — make figures consistent?
The package does the hard work. The analyst decorates.
Package owns
Analyst owns
ggRandomForests — github.com/ehrlinger/ggRandomForests
Variable importance, partial dependence, minimal depth: ggplot2 for randomForestSRC.
hvtiPlotR — github.com/ehrlinger/hvtiPlotR
Survival curves, hazard plots, forest plots in CORR themes.
Bridges ggplot2 to plot.sas standards.
As the science keeps moving and CORR evolves, so do the packages.
ggRandomForests
Survival difference between two treatment groups (blue, red)
randomForestSRC fits a survival forest; ggRandomForests plots the results.
Right now, this is all SAS.
gt and gtsummary exist — they aren’t tamed to our standards yet.
That’s next.
| Practice or tool | Status |
|---|---|
| Data Ingestion | |
| Study folder structure | Adopted: standard |
| Dataset naming + manifests + checksums | In development |
| Model Building | |
| RStudio Projects + renv | In flight: “easy” win |
| Quarto templates | In development |
| hazard (SAS), randomForestSRC (R) | Adopted |
| TemporalHazard (R) | In development |
| Figures & Tables | |
| ggRandomForests, hvtiPlotR | In flight |
| hvtiRtables | Aspirational: coming “soon” |
“Reproducibility” → “Production reliability”
“Any biostatistician can pick up any job.”
Make the safe path the easy path.
| Package | Link |
|---|---|
| hazard (SAS) | github.com/ehrlinger/hazard |
| TemporalHazard | github.com/ehrlinger/temporal_hazard |
| randomForestSRC | cran.r-project.org/package=randomForestSRC |
| ggRandomForests | github.com/ehrlinger/ggRandomForests |
| hvtiPlotR | github.com/ehrlinger/hvtiPlotR |
| hvtiRutilities | github.com/ehrlinger/hvtiRutilities |
These slides: github.com/ehrlinger/CareFeedingBiostats
Contact: ehrlinj@ccf.org
We’re hiring a biostatistician — Contact me!