Extracts the predicted response values from a fitted
rfsrc (or randomForest) object and formats them for plotting
with plot.gg_rfsrc.
Arguments
- object
A fitted rfsrc or randomForest object.
- oob
Logical; if TRUE (default), return out-of-bag predictions. Set to FALSE to use full in-bag (training) predictions. Forced to FALSE automatically for predict.rfsrc objects, which carry no OOB estimates.
- by
Optional stratifying variable. Either a character column name present in the training data, or a vector/factor of the same length as the training set. When supplied, a group column is added to the returned data and bootstrap CI bands (survival) are computed per group. Omit or leave missing to return an unstratified result.
- ...
Additional arguments controlling output for specific forest families:
- surv_type
Character; one of "surv" (default), "chf", or "mortality" for survival forests.
- conf.int
Numeric coverage probability (e.g. 0.95) to request bootstrap pointwise confidence bands for survival forests. Triggers wide-format output with lower, upper, median, and mean columns.
- bs.sample
Integer; number of bootstrap resamples when conf.int is set. Defaults to the number of observations.
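As a minimal sketch of how the oob and by arguments described above might be used on a regression forest (assuming the randomForestSRC and ggRandomForests packages are installed; the object names rf_mtcars, gg_oob, gg_inbag, and gg_by are illustrative only):

```r
library(randomForestSRC)
library(ggRandomForests)

set.seed(42)
# Small regression forest on a built-in data set
rf_mtcars <- rfsrc(mpg ~ ., data = mtcars, ntree = 50)

# Default: out-of-bag predictions
gg_oob <- gg_rfsrc(rf_mtcars)

# Full in-bag (training) predictions instead
gg_inbag <- gg_rfsrc(rf_mtcars, oob = FALSE)

# Stratify by a factor of the same length as the training set;
# per the argument description, this should add a `group` column
gg_by <- gg_rfsrc(rf_mtcars, by = factor(mtcars$cyl))

names(gg_by)
```

Passing by as a column name (for example by = "cyl") should be equivalent here, since cyl is one of the training predictors.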
Value
A gg_rfsrc object (a classed data.frame) whose
structure depends on the forest family:
- regression
Columns yhat and the response name; optionally a group column when by is supplied.
- classification
One column per class with predicted probabilities; a y column with observed class labels; optionally group.
- survival (no CI / grouping)
Long-format with columns variable (event time), value (survival probability), obs_id, and event.
- survival (with conf.int or by)
Wide-format with pointwise bootstrap CI columns (lower, upper, median, mean) per time point; a group column when by is supplied.
The object carries class attributes for the forest family so that
plot.gg_rfsrc dispatches correctly.
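One way to check the structure described above is to inspect the returned object directly. The following is a sketch for the classification case, assuming both packages are installed; rf_iris and gg_iris are illustrative names:

```r
library(randomForestSRC)
library(ggRandomForests)

set.seed(42)
rf_iris <- rfsrc(Species ~ ., data = iris, ntree = 50)
gg_iris <- gg_rfsrc(rf_iris)

# Class vector: "gg_rfsrc" plus the data.frame and family classes
class(gg_iris)

# Column layout: per-class probabilities and the observed labels in `y`
str(gg_iris)

# The object is still a data.frame, so the usual tools apply
head(gg_iris)
```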
Details
For survival forests, use the surv_type argument
("surv", "chf", or "mortality") to select the
predicted quantity. Bootstrap confidence bands are requested by passing
conf.int (e.g. conf.int = 0.95); the number of resamples
is controlled by bs.sample.
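The survival options above can be sketched as follows, assuming both packages are installed. The names rf_vet, gg_chf, and gg_ci are illustrative, and bs.sample = 100 is an arbitrary choice for the example rather than a recommended value:

```r
library(randomForestSRC)
library(ggRandomForests)

data(veteran, package = "randomForestSRC")
set.seed(42)
rf_vet <- rfsrc(Surv(time, status) ~ ., data = veteran, ntree = 50)

# Cumulative hazard instead of the default survival probability;
# without conf.int or by, the result is in long format
gg_chf <- gg_rfsrc(rf_vet, surv_type = "chf")
names(gg_chf)

# Pointwise 95% bootstrap bands from 100 resamples;
# requesting conf.int switches to the wide format
gg_ci <- gg_rfsrc(rf_vet, conf.int = 0.95, bs.sample = 100)
names(gg_ci)
```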
Examples
## ------------------------------------------------------------
## classification example (small, runs on CRAN)
## ------------------------------------------------------------
## -------- iris data
set.seed(42)
rfsrc_iris <- rfsrc(Species ~ ., data = iris, ntree = 50)
gg_dta <- gg_rfsrc(rfsrc_iris)
plot(gg_dta)
# \donttest{
## ------------------------------------------------------------
## Additional regression / survival examples are guarded with
## \donttest because the cumulative example time exceeds the
## 10-second CRAN budget. Run locally with `R CMD check --run-donttest`
## (or `devtools::check(run_dont_test = TRUE)`) to exercise them.
## ------------------------------------------------------------
## -------- air quality data (regression)
rfsrc_airq <- rfsrc(Ozone ~ ., data = airquality,
                    na.action = "na.impute", ntree = 50)
plot(gg_rfsrc(rfsrc_airq))
## -------- Boston data (rfsrc + randomForest)
if (requireNamespace("MASS", quietly = TRUE)) {
  data(Boston, package = "MASS")
  Boston$chas <- as.logical(Boston$chas)
  rfsrc_boston <- rfsrc(medv ~ ., data = Boston, ntree = 50,
                        forest = TRUE, importance = TRUE,
                        tree.err = TRUE, save.memory = TRUE)
  plot(gg_rfsrc(rfsrc_boston))
  rf_boston <- randomForest::randomForest(medv ~ ., data = Boston,
                                          ntree = 50)
  plot(gg_rfsrc(rf_boston))
}
## -------- mtcars data
rfsrc_mtcars <- rfsrc(mpg ~ ., data = mtcars, ntree = 50)
plot(gg_rfsrc(rfsrc_mtcars))
## -------- veteran data (survival; with CI and group-by)
data(veteran, package = "randomForestSRC")
rfsrc_veteran <- rfsrc(Surv(time, status) ~ ., data = veteran,
                       ntree = 50)
plot(gg_rfsrc(rfsrc_veteran))
plot(gg_rfsrc(rfsrc_veteran, conf.int = .95))
plot(gg_rfsrc(rfsrc_veteran, by = "trt"))
# }