Title: | Stochastic Augmentation of Matched Data Using Restriction Methods |
---|---|
Description: | Augmenting a matched data set by generating multiple stochastic, matched samples from the data using a multi-dimensional histogram constructed from dropping the input matched data into a multi-dimensional grid built on the full data set. The resulting stochastic, matched sets will likely provide a collectively higher coverage of the full data set compared to the single matched set. Each stochastic match is without duplication, thus allowing downstream validation techniques such as cross-validation to be applied to each set without concern for overfitting. |
Authors: | Mansour T.A. Sharabiani, Alireza S. Mahani |
Maintainer: | Alireza S. Mahani <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.1 |
Built: | 2024-10-26 04:55:51 UTC |
Source: | https://github.com/cran/SAMUR |
This function generates multiple subsets of the data in which the distribution of covariates is balanced across treatment groups. It works by binning the output of a base matching algorithm into a multidimensional histogram, and drawing - without replacement - from the full data set according to the histogram. This leads to higher data coverage across multiple matched subsets without duplication of cases within each subset.
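The binning-and-drawing idea described above can be sketched in base R. This is an illustrative approximation only, not the package's implementation: the toy data frame `dat`, the stand-in matched index vector `matched`, and the helper `draw_once()` are all hypothetical names introduced for this example.

```r
set.seed(1)

# Toy full data set (hypothetical): binary treatment, one numeric covariate.
dat <- data.frame(
  treat = rep(c(1, 0), times = c(80, 120)),
  age   = round(runif(200, 18, 60))
)

# Stand-in for the output of a base matching algorithm: 30 treated and
# 30 control row indices, picked at random purely for illustration.
matched <- c(sample(which(dat$treat == 1), 30),
             sample(which(dat$treat == 0), 30))

# Grid built on the full data set: quantile-based bins for the covariate,
# crossed with the treatment indicator.
brk  <- unique(quantile(dat$age, probs = seq(0, 1, length.out = 6)))
cell <- cut(dat$age, breaks = brk, include.lowest = TRUE)
key  <- interaction(dat$treat, cell)

# Histogram: number of matched cases falling in each grid cell.
counts <- tabulate(as.integer(key[matched]), nbins = nlevels(key))

# One stochastic draw: from each cell of the full data, sample as many
# cases as the matched set has in that cell, without replacement.
draw_once <- function() {
  unlist(lapply(seq_along(counts), function(k) {
    if (counts[k] == 0) return(integer(0))
    pool <- which(as.integer(key) == k)
    pool[sample.int(length(pool), counts[k])]
  }))
}

new_subset <- draw_once()
```

Each call to `draw_once()` returns a subset with the same per-cell counts as the matched set, but possibly different cases, which is how repeated draws can cover more of the full data.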
samur(formula, data, matched.subset = 1:nrow(data), nsmp = 100,
  use.quantile = TRUE, breaks = 10,
  replace = length(unique(matched.subset)) < length(matched.subset))

## S3 method for class 'samur'
print(x, ...)
formula |
Formula expression used to describe the treatment variable (lhs) and covariates used during matching (rhs). |
data |
Data frame containing the treatment variables and matched covariates as specified in the |
matched.subset |
An integer vector representing the indexes of a subset of |
nsmp |
Number of stochastically matched subsets to generate. |
use.quantile |
Should numeric covariates be binned using quantiles ( |
breaks |
Number of breaks to use in binning numeric covariates. |
replace |
Boolean flag indicating whether or not to perform sampling with replacement. |
x |
An object of class |
... |
Arguments passed to/from other methods. |
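To illustrate the difference between quantile-based and equal-width binning of a numeric covariate, here is a small base R sketch using `cut()` and `quantile()`. It assumes, for illustration only, that `breaks` is interpreted as the number of bins; how `samur` uses `breaks` internally is not shown here.

```r
set.seed(42)
x <- rexp(1000)   # a skewed numeric covariate (toy data)
breaks <- 10

# Equal-width bins: cut() splits the range of x into `breaks` intervals.
equal_width <- cut(x, breaks = breaks)

# Quantile bins: cut points placed at evenly spaced quantiles of x.
q <- quantile(x, probs = seq(0, 1, length.out = breaks + 1))
quantile_bins <- cut(x, breaks = q, include.lowest = TRUE)

table(equal_width)     # counts are concentrated in the lowest bins
table(quantile_bins)   # counts are spread evenly across bins
```

For skewed covariates, quantile bins avoid sparsely populated cells, which leaves more candidate cases per cell to draw from.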
An object of class samur, a matrix of size length(matched.subset) by nsmp, where each column is a matched subset without case duplication. It also has the following attributes:
call |
Copy of function call. |
formula |
Formula passed to the function. |
mdg |
Multi-dimensional grid used for binning the matched data subsets. |
mdh |
Multi-dimensional histogram resulting from binning
data |
Copy of data frame passed to the function. |
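A short sketch of how a matrix of this shape might be used downstream. It assumes that each column holds row indices into the original data frame, which is not stated explicitly above; the matrix `m` below is a randomly generated stand-in, not the output of `samur()`.

```r
set.seed(7)

# Toy data frame (hypothetical).
dat <- data.frame(
  treat = rep(c(1, 0), times = c(40, 60)),
  age   = rnorm(100, mean = 40, sd = 10)
)

# Stand-in for a samur result: 20 rows by 5 columns, each column a set of
# distinct row indices (generated at random here, not by samur()).
m <- replicate(5, sample(nrow(dat), 20))

# Check that no column contains the same case twice.
dup_free <- apply(m, 2, function(col) anyDuplicated(col) == 0)

# Extract the third subset as a data frame for downstream analysis.
sub3 <- dat[m[, 3], ]
```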
Mansour T.A. Sharabiani, Alireza S. Mahani
## Not run:
library(SAMUR)
library(Matching)
data(lalonde)

myformula <- treat ~ age + educ
myglm <- glm(myformula, lalonde, family = "binomial")
X <- myglm$fitted.values

# using M=1 and replace=FALSE to ensure no duplication
bimatch <- Match(Tr = lalonde$treat, X = X,
  M = 1, replace = FALSE, caliper = 0.25)
idx <- c(bimatch$index.control, bimatch$index.treated)

my.samur <- samur(formula = myformula, data = lalonde,
  matched.subset = idx, nsmp = 100,
  breaks = 10, use.quantile = TRUE)
summary(my.samur, nboots = 500)
## End(Not run)
summary method for class "samur".
## S3 method for class 'samur'
summary(object, ...)

## S3 method for class 'summary.samur'
print(x, ...)
object |
An object of class "samur", usually the result of a call to |
x |
An object of class "summary.samur", usually the result of a call to |
... |
Further arguments to be passed to/from other methods. Current implementation of |
A list with the following elements:
min.pval.new |
A vector of length equal to number of samples ( |
min.pval.orig |
Same number as above, but for original matched subset. |
coverage.new |
Percent of cases from full data set covered among all stochastic, matched samples. |
coverage.orig |
Same as above, calculated for the original matched subset. |
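The coverage figures can be read as simple arithmetic on case indices. The sketch below uses randomly generated stand-ins (`orig`, `m`) rather than real matched output, and assumes coverage means the percentage of distinct cases of the full data set that appear in a subset or collection of subsets.

```r
set.seed(3)
n_full <- 100

# Stand-ins (not real matched output): one original subset of 20 distinct
# cases, and 50 stochastic subsets of 20 distinct cases each.
orig <- sample(n_full, 20)
m    <- replicate(50, sample(n_full, 20))

# Percent of the full data set covered by the original subset.
coverage_orig <- 100 * length(unique(orig)) / n_full

# Percent covered collectively by all stochastic subsets.
coverage_new <- 100 * length(unique(as.vector(m))) / n_full
```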
All t-tests used for p-value calculations are unpaired, since the philosophy of stochastic augmentation relaxes the notion of one-to-one matching.
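The unpaired comparison can be sketched with base R's `t.test()`. This is an assumption-laden illustration: it takes the minimum p-value across covariates as the balance diagnostic for one subset, and `min_pval()`, `dat`, and `idx` are hypothetical names, not part of the package.

```r
set.seed(11)

# Toy data (hypothetical): binary treatment and two numeric covariates.
dat <- data.frame(
  treat = rep(c(1, 0), each = 50),
  age   = rnorm(100, mean = 40, sd = 10),
  educ  = rnorm(100, mean = 12, sd = 2)
)

# Stand-in for one matched subset: 20 treated and 20 control cases.
idx <- c(sample(which(dat$treat == 1), 20),
         sample(which(dat$treat == 0), 20))

# Smallest p-value across covariates, using unpaired t-tests that compare
# treated and control cases within the subset.
min_pval <- function(rows, covariates = c("age", "educ")) {
  sub <- dat[rows, ]
  p <- sapply(covariates, function(v) {
    t.test(sub[[v]][sub$treat == 1],
           sub[[v]][sub$treat == 0],
           paired = FALSE)$p.value
  })
  min(p)
}

p_min <- min_pval(idx)
```

A larger minimum p-value indicates that no covariate shows a pronounced difference in means between the treatment groups within that subset.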
Alireza S. Mahani, Mansour T.A. Sharabiani