LAG {DTVEM}    R Documentation

Differential Time-Varying Effect model (DTVEM)

Description

This is the primary DTVEM function. It combines (1) exploratory lag identification, followed by (2) confirmatory lag identification. This function works with discrete or continuous time data (note that the underlying model is assumed to be discrete even when measurements are taken at random times). It can handle multivariate predictors and multivariate outcomes (or univariate for both). The function also automatically applies the data manipulation required to run the models specified in Jacobson, Chow, and Newman (2019).

Usage

LAG(
  ...,
  differentialtimevaryingpredictors = NULL,
  outcome = NULL,
  controlvariables = NULL,
  data = NULL,
  ID = "ID",
  Time = NULL,
  k = 10,
  k2 = 10,
  k3 = 3,
  k4 = 3,
  controllag = NULL,
  standardized = TRUE,
  predictionstart = NULL,
  predictionsend = NULL,
  predictionsinterval = NULL,
  software = "OpenMx",
  independentpredictors = FALSE,
  minimumpracticalsignificance = NULL,
  gamma = 1,
  minN = 30,
  debug = FALSE,
  OpenMxStartingValues = 0.3,
  ResidualAnalysis = "Group",
  blockdata = FALSE,
  rounddecimals = TRUE,
  maxintermediaterounds = 10,
  differntialtimevaryingpredictors = NULL
)

Arguments

...

A list of variable names used in the function e.g. "X","Y" (REQUIRED)

differentialtimevaryingpredictors

The variables whose coefficients vary as a function of differential time (i.e. the variables whose lags you want to identify: at which time differences they predict the outcome). This must be specified as a vector using c("variables here"), e.g. c("X","Y") (REQUIRED)

outcome

This is each of the outcome variables. Specified as outcome="outcomevariablename" for a single variable or outcome=c("outcomevariablename1","outcomevariablename2") (REQUIRED)

controlvariables

The variables to be controlled for (not lagged). These are traditional covariates in the analysis. These are the variables that will be controlled for in a stationary fashion. To use this use controlvariables = c("list","here") (OPTIONAL)

data

Specify the data frame that contains the data e.g. data=dataframename (REQUIRED)

ID

The name of the ID variable, e.g. ID = "ID". (REQUIRED)

Time

The name of the Time variable, e.g. Time = "Time". (REQUIRED)

k

The number of k selection points used in the model for stage 1 (see ?choose.k in the mgcv package for more details). Note that this applies to the raw data; k2 is the k for the re-blocked data. Default is 10. The ideal k is the maximum number of data points per person, but this slows down DTVEM and is often not required. (OPTIONAL, BUT RECOMMENDED)

k2

The number of k selection points used in the model for stage 1 of the blocked data (see ?choose.k in mgcv package for more details). Default is 10. The ideal k is the maximum number of data points per person, but this slows down DTVEM and is often not required. (OPTIONAL)

k3

The number of k selection points used in the model for the time spline (NOTE THAT THIS CONTROLS FOR TIME TRENDS OF THE POPULATION) (see ?choose.k in mgcv package for more details). Default is 3. (OPTIONAL)

k4

The number of k selection points used in the model for the varying coefficient in the intermediate stage (see ?choose.k in mgcv package for more details). Default is 3. (OPTIONAL, BUT RECOMMENDED)

controllag

The time of the lag at which covariates should be controlled for (NOT CURRENTLY FUNCTIONAL)

standardized

This specifies whether all of the variables (aside from Time) should be standardized. Options are TRUE, FALSE, and "center". TRUE means within-person standardize each variable (aka get the person-centered z-scores), FALSE means use the raw data, "center" means to only within-person mean-center the variables. Default = TRUE. FALSE is not recommended unless you have done these transformations yourself (OPTIONAL)
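
The three settings can be illustrated with base R. This is a minimal sketch of the transformations described above, not DTVEM's internal code. The data frame toy and its values are made up for illustration, and the use of sd() (which divides by n - 1) is an assumption.

```r
# Toy long-format data: two people, three observations each (made-up values).
toy <- data.frame(
  ID   = rep(c(1, 2), each = 3),
  Time = rep(1:3, times = 2),
  X    = c(1, 2, 3, 10, 20, 30)
)

# standardized = FALSE corresponds to using toy$X as it is.

# standardized = "center": subtract each person's own mean.
toy$X_centered <- ave(toy$X, toy$ID, FUN = function(v) v - mean(v))

# standardized = TRUE: person-centered z-scores, i.e. center within person and
# divide by that person's own standard deviation (sd() is an assumption here).
toy$X_z <- ave(toy$X, toy$ID, FUN = function(v) (v - mean(v)) / sd(v))

print(toy)
```

After within-person standardization both people are on the same scale, even though their raw values differ by a factor of ten.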

predictionstart

The differential time value to start with. The default is NULL, in which case the lowest time difference in the time series is used (set a lower value if you are interested in predictions at a smaller interval), e.g. predictionstart = 1. If this is not specified and you are using a continuous time model, make sure to set blockdata = TRUE so that it will be chosen automatically. (OPTIONAL)

predictionsend

The differential time value to end with, i.e. the largest time difference you want to examine in the study (e.g. if you wanted to allow time predictions up to 24 hours and your time intervals were specified in hours, you would set predictionsend = 24). If this is not specified and you are using a continuous time model, make sure to set blockdata = TRUE so that it will be chosen automatically. (OPTIONAL)

predictionsinterval

The interval between predicted differential time points. If you are using discrete time and want a prediction at every discrete interval, set this to 1. If this is not specified and you are using a continuous time model, make sure to set blockdata = TRUE so that it will be chosen automatically. (OPTIONAL)
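
Taken together, predictionstart, predictionsend, and predictionsinterval describe a grid of differential times. The sketch below builds such a grid with seq(). It assumes the grid is regular and includes both end points; how LAG constructs it internally is not stated on this page.

```r
# Hourly data, lags from 1 hour up to 24 hours, one prediction per hour.
predictionstart     <- 1
predictionsend      <- 24
predictionsinterval <- 1

# Assumed regular grid of differential times (illustration only).
lag_grid <- seq(from = predictionstart, to = predictionsend,
                by = predictionsinterval)

length(lag_grid)  # number of differential time points considered
range(lag_grid)   # smallest and largest time difference
```

A larger predictionsend or a smaller predictionsinterval lengthens this grid, which is worth keeping in mind given the note under software about run time with many lags.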

software

This is the software used to run the secondary analysis. State-space models are implemented by the argument "OpenMx". The option "gam" can be used to run a traditional multilevel model with a spline that controls for non-linear time trends at the population level. The option "hybrid" first runs a multilevel model and then runs a state-space model. Note that the state-space approach can be very slow with a large number of lags, and consequently "gam" should be used when a large number of lags is included. However, state-space model estimation is generally marginally superior to the multilevel modeling approach, so if you are using a small number of lags or run time is not an issue, the state-space option is recommended. The default is "OpenMx", which implements state-space models. (OPTIONAL)

independentpredictors

This is whether or not the wide model comparisons should be run independently and combined via stepwise regression with backward selection. This can be useful to reduce the amount of lags included in the confirmatory model. Default is FALSE. (OPTIONAL)

minimumpracticalsignificance

This can be used to set a minimum amount to pass on from DTVEM stage 1 to stage 2, and stage 1.5 to stage 2. This can be useful if too many variables come back as significant, but they would not meet your criteria for practical significance. Set this to a numerical value (e.g. minimumpracticalsignificance=.2). (OPTIONAL, UNCOMMONLY SPECIFIED)

gamma

This can be used to change the wiggliness of the model. This can be useful if the model is too smooth (i.e. flat). The lower the number, the more wiggly the model will be (see ?gam in the mgcv package for more information). The default is 1. (OPTIONAL, UNCOMMONLY SPECIFIED)
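
The effect of gamma can be seen by calling mgcv::gam directly on simulated data. This is only a sketch of how gamma behaves in mgcv. The simulated x and y are made up, and how LAG passes gamma on to mgcv is not shown on this page.

```r
library(mgcv)  # recommended package, shipped with most R installations

# Simulated noisy sine curve (made-up data for illustration).
set.seed(1)
x <- seq(0, 1, length.out = 200)
y <- sin(2 * pi * x) + rnorm(200, sd = 0.3)

# Same smooth term and basis size k, different gamma.
fit_wiggly <- gam(y ~ s(x, k = 10), gamma = 0.5)  # lower gamma: less smoothing
fit_smooth <- gam(y ~ s(x, k = 10), gamma = 2)    # higher gamma: more smoothing

# Effective degrees of freedom summarize how wiggly each fit is.
c(wiggly = sum(fit_wiggly$edf), smooth = sum(fit_smooth$edf))
```

The fit with the lower gamma is expected to use at least as many effective degrees of freedom as the fit with the higher gamma.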

minN

The smallest N that will be considered in the stage 2 model (i.e. this can be important in case you don't have observations at certain differential times, such as overnight observations). Default = 30. (OPTIONAL)
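
To judge whether minN will exclude some differential times, it can help to count how many observation pairs the data contain at each time difference. The sketch below does this in base R on made-up data. Treating N as the number of within-person observation pairs at a given time difference is an assumption; this page does not define how DTVEM counts N.

```r
# Made-up schedule: 40 people observed at hours 1, 2 and 3; only 20 of them
# also have an observation at hour 12.
toy <- rbind(
  expand.grid(Time = c(1, 2, 3), ID = 1:40),
  expand.grid(Time = 12, ID = 1:20)
)

# All within-person time differences (later minus earlier observation).
diffs <- unlist(lapply(split(toy$Time, toy$ID), function(t) {
  d <- outer(t, t, "-")
  d[d > 0]
}))

# Number of observation pairs at each time difference.
pair_counts <- table(diffs)
print(pair_counts)

# Time differences with at least minN pairs (assumed interpretation of minN).
minN <- 30
kept <- as.numeric(names(pair_counts)[pair_counts >= minN])
kept
```

Here the time differences of 9, 10, and 11 hours are supported by only 20 pairs each, so they fall below minN = 30 under this interpretation.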

debug

This prints additional diagnostic information as the function runs. Only useful for troubleshooting problems. (OPTIONAL, UNCOMMONLY SPECIFIED)

OpenMxStartingValues

Only applies when software = "OpenMx". Specify the starting values for OpenMx. Since OpenMx will attempt 10 different runs before giving up on convergence, this does not usually need to be specified. It should mostly be specified only if there is a convergence issue with OpenMx. Default is 0.3. (OPTIONAL, UNCOMMONLY SPECIFIED)

ResidualAnalysis

Only applies when software = "OpenMx". Analyze the residuals of the time series with OpenMx after factoring out the non-linear effect of time (takes time trends into account). This can be run only at the group level (faster), or it can also be run with random-effect splines of time (slower) by setting ResidualAnalysis = "Individual". Default = "Group". (OPTIONAL)

blockdata

This re-organizes the raw data into blocks after an exploratory first stage. Default = FALSE. TRUE = automatic re-organization of the data based on the minimum lag number and the time between two lag peaks/valleys. Supplying a numeric value re-blocks the data into chunks at the specified interval. (OPTIONAL, UNCOMMONLY SPECIFIED)

rounddecimals

The default option is TRUE, which automatically rounds the decimals to the smallest non-zero decimal place in the data. You can also specify a number to round to a specific decimal place (e.g. 1 = the tenths digit, 2 = the hundredths digit, 0 = the nearest whole number, -1 = the nearest ten). (OPTIONAL, UNCOMMONLY SPECIFIED)
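
The numeric settings follow the same convention as the digits argument of base R's round(). The sketch below shows that convention on a single made-up value; whether LAG calls round() internally is an assumption.

```r
x <- 123.456  # made-up value

rounded <- c(
  tenths     = round(x, 1),   # 123.5
  hundredths = round(x, 2),   # 123.46
  whole      = round(x, 0),   # 123
  tens       = round(x, -1)   # 120
)
print(rounded)
```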

maxintermediaterounds

The maximum number of intermediate stages to perform. Default is 10 (OPTIONAL, UNCOMMONLY SPECIFIED)

differntialtimevaryingpredictors

This is deprecated. Only retained for backward compatibility.

Value

The output of this function is: (1) the stage 1 model, i.e. the exploratory stage (stage1out), (2) the stage 2 model (gamstage2out or OpenMxstage2out), and (3) the data used to run the models in the first and second stage (datamanipulationout). Use str() on the saved object to see the information (it is useful to specify max.level so that the information does not get overwhelming).
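
The str() advice can be tried without fitting a model. The list out_mock below is a hypothetical stand-in that only reuses the component names listed above; the contents of a real LAG result will differ.

```r
# Hypothetical stand-in for a LAG result (placeholder contents only).
out_mock <- list(
  stage1out           = list(placeholder = "stage 1 results"),
  OpenMxstage2out     = list(placeholder = "stage 2 results"),
  datamanipulationout = list(placeholder = "processed data")
)

# Show only the top level of the nested object.
str(out_mock, max.level = 1)

# Individual components are reached with $.
names(out_mock)
str(out_mock$stage1out, max.level = 1)
```

Increasing max.level one step at a time reveals deeper levels of the object without printing everything at once.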

Examples


#Load the example data set 1
data(exampledat1)

#Run Univariate DTVEM
#out=LAG("X",differentialtimevaryingpredictors=c("X"),outcome=c("X"),data=exampledat1,Time="Time",k=9,standardized=FALSE,ID="ID",predictionsend=10,predictionstart=1,independentpredictors=FALSE,software="OpenMx")
#Example Bivariate DTVEM (here data stands for your own data frame with anxiety, depression, Time, and ID columns)
#out=LAG("anxiety","depression",differentialtimevaryingpredictors = c("anxiety","depression"),outcome = c("anxiety","depression"),software="OpenMx",data=data,Time = "Time",ID="ID",predictionstart = 1,predictionsend = 24)

[Package DTVEM version 1.0010 Index]