| Title: | Decision Tree Analysis for Longitudinal Measurement Data |
|---|---|
| Description: | Implements tree-based methods for longitudinal data. The package constructs decision trees that evaluate both the main effect of a covariate and its interaction with time through a weighted splitting criterion. It supports single-tree construction, bootstrap-based multiple-tree selection, and tree visualisation. For methodological details, see Obata and Sugimoto (2026) <doi:10.1007/s11634-025-00665-2>. |
| Authors: | Ryoto Obata [aut, cre], Tomoyuki Sugimoto [aut] |
| Maintainer: | Ryoto Obata <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 1.0.1 |
| Built: | 2026-05-16 18:12:31 UTC |
| Source: | https://github.com/cran/longitree |
Constructs a single decision tree for longitudinal data. The method evaluates both the main effect of a covariate and its interaction with time, incorporating a weighting mechanism to balance the two effects. Three single-tree construction procedures (ST1, ST2, ST3) are available; see Details. For the underlying methodology, refer to Obata and Sugimoto (2026).

    longitree(
      formula, time, random, weight = "w", data,
      alpha = "no", gamma = "no", cv = "no",
      maxdepth = 5, minbucket = 5, minsplit = 20, xval = 10
    )

    ## S3 method for class 'longitree'
    summary(object, ...)

    ## S3 method for class 'longitree'
    print(x, ...)

    ## S3 method for class 'longitree'
    predict(object, ...)

    ## S3 method for class 'longitree'
    plot(x, ...)

| Argument | Description |
|---|---|
| formula | A formula specifying the model. The response variable should be on the left side and covariates on the right side. Use |
| time | Character string giving the column name of the time variable. All individuals are assumed to be observed at the same time points. |
| random | Character string giving the column name of the random effect (subject identifier). |
| weight | Weight for balancing the main effect of a covariate and its interaction with time. A value in |
| data | A data frame containing the variables in |
| alpha | Significance level used as the stopping rule for tree growth. A smaller value produces a more conservative (smaller) tree. Specify a numeric value or |
| gamma | Complexity parameter for pruning. A larger value prunes more aggressively, yielding a smaller and simpler tree; a smaller value retains more branches. Specify a numeric value or |
| cv | Set |
| maxdepth | Maximum depth of the tree (default 5). |
| minbucket | Minimum number of subjects in a terminal node (default 5). |
| minsplit | Minimum number of subjects required to attempt a split (default 20). |
| xval | Number of cross-validation folds (default 10). Used to compute the cross-validated coefficient of determination ( |
| object | A |
| ... | Additional arguments passed to |
| x | A |

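The time argument assumes that all individuals are observed at the same time points. A minimal base R sketch of how one might check this before fitting is shown below. The helper `is_balanced` and the `toy` data are hypothetical illustrations and are not part of longitree.

```r
# Toy long-format data: 3 subjects, each observed at time points 1..4.
toy <- data.frame(
  subject = rep(1:3, each = 4),
  time    = rep(1:4, times = 3),
  y       = c(1.0, 1.5, 2.1, 2.4,  0.8, 0.9, 1.1, 1.0,  2.0, 2.6, 3.1, 3.9)
)

# Hypothetical helper (not part of longitree): TRUE when every subject has
# exactly one record at every observed time point.
is_balanced <- function(data, time, random) {
  counts <- table(data[[random]], data[[time]])
  all(counts == 1)
}

balanced_ok  <- is_balanced(toy, time = "time", random = "subject")

# Dropping one row removes a (subject, time) cell, so the check fails.
balanced_bad <- is_balanced(toy[-1, ], time = "time", random = "subject")
```

Cross-tabulating subject by time makes both missing and duplicated visits visible, since either one produces a count different from 1.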
Exactly one of alpha, gamma, or cv must be specified. Specifying more than one will result in an error. These correspond to the three single-tree construction procedures:

- ST1 (cv = "yes"): Tree growth, pruning, and final tree selection via cross-validation.
- ST2 (alpha): Tree growth with a significance threshold. No pruning or final tree selection via cross-validation.
- ST3 (gamma): Tree growth followed by pruning with a pre-specified complexity parameter. No final tree selection via cross-validation.

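The "exactly one of alpha, gamma, or cv" rule can be illustrated with a small base R sketch. The helper `choose_procedure` is hypothetical and is not the package's internal check. It only assumes, from the usage signature, that "no" marks an argument as not specified.

```r
# Hypothetical helper (not part of longitree): map the three control
# arguments to a procedure label, treating "no" as "not specified".
choose_procedure <- function(alpha = "no", gamma = "no", cv = "no") {
  specified <- c(
    ST1 = !identical(cv, "no"),
    ST2 = !identical(alpha, "no"),
    ST3 = !identical(gamma, "no")
  )
  if (sum(specified) != 1) {
    stop("Exactly one of 'alpha', 'gamma', or 'cv' must be specified.")
  }
  names(specified)[specified]
}

choose_procedure(cv = "yes")     # "ST1"
choose_procedure(alpha = 0.05)   # "ST2"
choose_procedure(gamma = 3)      # "ST3"
```

Calling the helper with two of the arguments set, or with none, stops with an error, which mirrors the documented behaviour.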
Since the time variable is not used as a splitting variable, each terminal node (leaf) contains the full longitudinal responses for every subject assigned to it, allowing direct evaluation of longitudinal trajectories within each leaf.
An object of class "longitree". Use
summary.longitree, predict.longitree,
or plot.longitree to inspect the results.
summary(longitree): Print a brief summary of a longitree
object.
print(longitree): Print method (calls summary).
predict(longitree): Extract predicted values and terminal node
assignments from a longitree object. Returns a data frame
with columns predict (predicted values) and
terminalnode (terminal node assignments).
plot(longitree): Plot a longitree object.
A convenience wrapper around treeplot.
Obata, R. and Sugimoto, T. (2026). A decision tree analysis for longitudinal measurement data and its applications. Advances in Data Analysis and Classification. doi:10.1007/s11634-025-00665-2

    data(ltreedata)

    # ST1: tree construction via cross-validation
    result_st1 <- longitree(y ~ ., time = "time", random = "subject",
                            weight = 0.7, data = ltreedata, cv = "yes")
    summary(result_st1)
    predict(result_st1)
    plot(result_st1)

    # ST2: tree growth with a significance threshold
    result_st2 <- longitree(y ~ ., time = "time", random = "subject",
                            weight = 0.1, data = ltreedata, alpha = 0.05)
    summary(result_st2)
    predict(result_st2)
    plot(result_st2)

    # ST3: pruning with a complexity parameter
    result_st3 <- longitree(y ~ ., time = "time", random = "subject",
                            weight = "w", data = ltreedata, gamma = 3)
    summary(result_st3)
    predict(result_st3)
    plot(result_st3)

Generates multiple trees from bootstrap samples and evaluates all three-tree combinations based on two criteria: cross-validated prediction error and tree diversification measured by the adjusted Rand index (ARI). Bootstrap sampling is performed at the subject level to preserve longitudinal structure.

    longitrees(
      formula, time, random, weight = "w", data,
      alpha = "no", gamma = "no", cv = "no",
      maxdepth = 5, minbucket = 5, minsplit = 20, xval = 10,
      bootsize, trees = 100, mins = 40
    )

| Argument | Description |
|---|---|
| formula | A formula specifying the model. The response variable should be on the left side and covariates on the right side. Use |
| time | Character string giving the column name of the time variable. All individuals are assumed to be observed at the same time points. |
| random | Character string giving the column name of the random effect (subject identifier). |
| weight | Weight for balancing the main effect of a covariate and its interaction with time. A value in |
| data | A data frame containing the variables in |
| alpha | Significance level used as the stopping rule for tree growth. A smaller value produces a more conservative (smaller) tree. Specify a numeric value or |
| gamma | Complexity parameter for pruning. A larger value prunes more aggressively, yielding a smaller and simpler tree; a smaller value retains more branches. Specify a numeric value or |
| cv | Set |
| maxdepth | Maximum depth of the tree (default 5). |
| minbucket | Minimum number of subjects in a terminal node (default 5). |
| minsplit | Minimum number of subjects required to attempt a split (default 20). |
| xval | Number of cross-validation folds (default 10). Used to compute the cross-validated coefficient of determination ( |
| bootsize | Number of subjects in each bootstrap sample. |
| trees | Number of bootstrap trees to grow (default 100). |
| mins | Number of top-ranking candidate three-tree subsets to retain (default 40). |

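To give a sense of scale for the trees and mins arguments, the base R sketch below counts the three-tree subsets among the default 100 trees and enumerates them for a small case. The `score` vector is a made-up stand-in for whatever ranking criterion is used; how longitrees actually ranks subsets is not shown here.

```r
trees <- 100   # default number of bootstrap trees

# Number of distinct three-tree subsets among 100 trees.
n_subsets <- choose(trees, 3)   # 161700

# Enumerating subsets for a small case: 5 trees give choose(5, 3) = 10
# triples, stored one per column.
small_triples <- combn(5, 3)
dim(small_triples)              # 3 10

# Hypothetical scores (one per triple, smaller is better); keep the `mins`
# best-ranked triples.
set.seed(1)
score <- runif(ncol(small_triples))
mins  <- 4
kept  <- small_triples[, order(score)[seq_len(mins)], drop = FALSE]
```

Because the number of triples grows roughly with the cube of the number of trees, retaining only the top mins candidates keeps the later selection step manageable.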
See longitree for a description of the three single-tree
construction procedures (ST1, ST2, ST3) corresponding to cv,
alpha, and gamma.
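Bootstrap sampling at the subject level means that whole subjects, not individual rows, are drawn with replacement, so each drawn subject keeps its complete trajectory. A minimal base R sketch under that reading follows. The helper `boot_subjects` and the `boot_id` column are hypothetical and do not reflect the package's internal code.

```r
# Toy long-format data: 4 subjects, 3 time points each.
toy <- data.frame(
  subject = rep(1:4, each = 3),
  time    = rep(1:3, times = 4),
  y       = seq_len(12)
)

# Hypothetical helper (not part of longitree): draw `bootsize` subjects with
# replacement and keep every row of each drawn subject.
boot_subjects <- function(data, random, bootsize) {
  ids   <- unique(data[[random]])
  drawn <- ids[sample.int(length(ids), size = bootsize, replace = TRUE)]
  pieces <- lapply(seq_along(drawn), function(i) {
    block <- data[data[[random]] == drawn[i], , drop = FALSE]
    block$boot_id <- i   # relabel so a subject drawn twice stays distinct
    block
  })
  do.call(rbind, pieces)
}

set.seed(10)
boot <- boot_subjects(toy, random = "subject", bootsize = 4)
```

Every `boot_id` in the result has one row per time point, so the bootstrap sample remains balanced even when the same subject is drawn more than once.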
An object of class "longitrees". Pass to
selectionplot to select the optimal three-tree combination.
Obata, R. and Sugimoto, T. (2026). A decision tree analysis for longitudinal measurement data and its applications. Advances in Data Analysis and Classification. doi:10.1007/s11634-025-00665-2
longitree, selectionplot,
threetrees, treeplot
A sample balanced longitudinal dataset with 50 subjects observed at 10 equally spaced time points.

    ltreedata

A data frame with 500 rows and 7 variables:
y: Response variable (continuous).
subject: Subject identifier (integer, 1–50).
time: Time point (integer, 1–10).
Baseline covariate 1 (integer, 1–10).
Baseline covariate 2 (integer, 1–10).
Baseline covariate 3 (integer, 1–6).
Baseline covariate 4 (integer, 1–12).
Plots the cross-validated prediction error against the maximum pairwise
adjusted Rand index (ARI) for candidate three-tree subsets, and selects
a subset based on either prediction performance or tree diversification.
The selected combination is indicated by a red point on the plot, which
corresponds to the three trees used in the subsequent
threetrees step.

    selectionplot(longitrees, metric, nth)

| Argument | Description |
|---|---|
| longitrees | A |
| metric | |
| nth | Rank of the tree subset to select (1 = best). |

An object of class "selectionplot". Pass to
threetrees to refit and evaluate the selected trees.
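The maximum pairwise adjusted Rand index used as the diversification axis can be sketched in base R. The function below follows the standard Hubert and Arabie formula and is written for illustration only; it is not taken from longitree, and the node assignments are invented.

```r
# Adjusted Rand index between two partitions of the same subjects.
# Note: the result is undefined (0/0) if both partitions are a single cluster.
adjusted_rand <- function(a, b) {
  tab   <- table(a, b)
  pairs <- function(v) sum(choose(v, 2))
  index    <- pairs(tab)
  row_sum  <- pairs(rowSums(tab))
  col_sum  <- pairs(colSums(tab))
  expected <- row_sum * col_sum / choose(length(a), 2)
  maximum  <- (row_sum + col_sum) / 2
  (index - expected) / (maximum - expected)
}

# Hypothetical terminal-node assignments of 8 subjects under three trees.
nodes1 <- c(1, 1, 1, 1, 2, 2, 2, 2)
nodes2 <- c(2, 2, 2, 2, 1, 1, 1, 1)   # same partition as nodes1, relabelled
nodes3 <- c(1, 1, 2, 2, 1, 1, 2, 2)   # a different partition

# ARI for each of the three pairs, then the maximum over pairs.
assignments <- list(nodes1, nodes2, nodes3)
pair_idx <- combn(3, 2)
pair_ari <- apply(pair_idx, 2, function(p) {
  adjusted_rand(assignments[[p[1]]], assignments[[p[2]]])
})
max_ari <- max(pair_ari)
```

A maximum pairwise ARI close to 1 indicates that at least two of the three trees partition the subjects almost identically, so a lower value suggests a more diverse triple.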
Obata, R. and Sugimoto, T. (2026). A decision tree analysis for longitudinal measurement data and its applications. Advances in Data Analysis and Classification. doi:10.1007/s11634-025-00665-2
Refits the three trees selected by selectionplot on their
original bootstrap samples.

    threetrees(x, selection)

    ## S3 method for class 'threetrees'
    summary(object, ...)

    ## S3 method for class 'threetrees'
    print(x, ...)

    ## S3 method for class 'threetrees'
    predict(object, tree = 1, ...)

    ## S3 method for class 'threetrees'
    plot(x, tree = 1, ...)

| Argument | Description |
|---|---|
| x | A |
| selection | A |
| object | A |
| ... | Additional arguments passed to |
| tree | Integer 1, 2, or 3 selecting which tree to plot. |

An object of class "threetrees". Use
summary.threetrees, predict.threetrees,
or plot.threetrees to inspect the results.
summary(threetrees): Print a brief summary of a threetrees object.
print(threetrees): Print method (calls summary).
predict(threetrees): Extract predicted values and terminal node
assignments from a threetrees object. Returns a data frame
with columns predict (predicted values) and
terminalnode (terminal node assignments).
plot(threetrees): Plot one of the three trees.
A convenience wrapper around treeplot.
Obata, R. and Sugimoto, T. (2026). A decision tree analysis for longitudinal measurement data and its applications. Advances in Data Analysis and Classification. doi:10.1007/s11634-025-00665-2
longitrees, selectionplot,
treeplot

    data(ltreedata)
    set.seed(10)

    trees_res <- longitrees(y ~ ., time = "time", random = "subject",
                            weight = 0.5, data = ltreedata, alpha = 0.01,
                            bootsize = 50, mins = 40)
    sel <- selectionplot(trees_res, metric = "PE", nth = 1)
    tt <- threetrees(trees_res, selection = sel)

    summary(tt)
    predict(tt, tree = 1)
    predict(tt, tree = 2)
    predict(tt, tree = 3)
    plot(tt, tree = 1)
    plot(tt, tree = 2)
    plot(tt, tree = 3)

Visualises the structure of a decision tree for longitudinal data. Built on ggparty. Each split node displays the node number, split variable, p-value, and weight. Each terminal node displays the node number, sample size, and the intercept and slope from a linear mixed-effects model fitted within that node. Individual longitudinal trajectories are shown as dashed lines; the predicted values (average at each time point) are shown as solid lines, with the response variable on the vertical axis and time on the horizontal axis.

    treeplot(
      x, tree = NULL,
      snsize = 50, spsize = 5, plotsize = 80,
      linesize1 = 0.3, linesize2 = 1, tnsize = 60
    )

| Argument | Description |
|---|---|
| x | A |
| tree | Integer 1, 2, or 3 selecting which tree to plot when |
| snsize | Split-node label size (default 50). |
| spsize | Split-point label size (default 5). |
| plotsize | Overall plot size (default 80). |
| linesize1 | Branch line width (default 0.3). |
| linesize2 | Main line width (default 1). |
| tnsize | Terminal-node label size (default 60). |

A ggplot2/ggparty object.
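The solid lines in each terminal node are described as the average response at each time point. A minimal base R sketch of that calculation is given below, assuming a hypothetical `node` column that holds each subject's terminal node. The data are invented, and the code does not reproduce how treeplot computes its values internally.

```r
# Toy data: 4 subjects, 3 time points each, with a hypothetical terminal-node
# assignment (subjects 1-2 in node 2, subjects 3-4 in node 3).
toy <- data.frame(
  subject = rep(1:4, each = 3),
  time    = rep(1:3, times = 4),
  y       = c(1, 2, 3,  3, 4, 5,  10, 10, 10,  12, 12, 12),
  node    = rep(c(2, 2, 3, 3), each = 3)
)

# Mean response at each time point within each terminal node:
# rows are nodes, columns are time points.
mean_traj <- tapply(toy$y, list(node = toy$node, time = toy$time), mean)
```

Here node 2 has a rising mean trajectory and node 3 a flat one, which is the kind of contrast in level and slope the tree plot is meant to make visible.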
Obata, R. and Sugimoto, T. (2026). A decision tree analysis for longitudinal measurement data and its applications. Advances in Data Analysis and Classification. doi:10.1007/s11634-025-00665-2