Why?
Medical device event data are messy.
Common challenges include:
How?
The mds package provides a standardized framework to address these challenges:
R files for auditability, documentation, and reproducibility
Purpose of This Vignette
mds
mds functions: deviceevent(), exposure(), define_analyses(), time_series()
Note on Statistical Algorithms
mds data and analysis standards allow for seamless application of various statistical trending algorithms via the mdsstat package (under development).
Our example dataset maude was queried from the FDA MAUDE API and contains 535 reported events on bone cement in 2017. In addition, a simulated exposure dataset sales was generated to provide denominator data for our bone cement events.
head(maude, 3)
report_number | event_type | date_received | product_problem_flag | adverse_event_flag | report_source_code | lot_number | model_number | manufacturer_d_name | manufacturer_d_country | brand_name | device_name | medical_specialty_description | device_class | region |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0002249697-2017-00023 | Malfunction | 20170103 | Y | N | Manufacturer report | MHX076 | | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK | Bone Cement | Orthopedic | 2 | Central |
0002249697-2017-00028 | Malfunction | 20170103 | Y | N | Manufacturer report | MHX080 | | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK | Bone Cement | Orthopedic | 2 | West |
0002249697-2017-00025 | Malfunction | 20170103 | Y | N | Manufacturer report | MHX076 | | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK | Bone Cement | Orthopedic | 2 | Central |
head(sales, 3)
device_name | region | sales_month | sales_volume |
---|---|---|---|
Arthroscope | Central | 2017-01-01 | 83 |
Arthroscope | Central | 2017-02-01 | 119 |
Arthroscope | Central | 2017-03-01 | 112 |
The general workflow to go from data to trending over time is as follows:
1. deviceevent() to standardize device-event data.
2. exposure() to standardize exposure data (optional).
3. define_analyses() to enumerate possible analysis combinations.
4. time_series() to generate counts (and/or rates) by time based on your defined analyses.
# Step 1 - Device Events
de <- deviceevent(
  maude,
  time="date_received",
  device_hierarchy=c("device_name", "device_class"),
  event_hierarchy=c("event_type", "medical_specialty_description"),
  key="report_number",
  covariates="region",
  descriptors="_all_")
# Step 2 - Exposures (Optional step)
ex <- exposure(
  sales,
  time="sales_month",
  device_hierarchy="device_name",
  match_levels="region",
  count="sales_volume")
# Step 3 - Define Analyses
da <- define_analyses(
  de,
  device_level="device_name",
  exposure=ex,
  covariates="region")
# Step 4 - Time Series
ts <- time_series(
  da,
  deviceevents=de,
  exposure=ex)
You may:
(de, ex), analyses (da), and time series (ts) for documentation
summary() and define_analyses_dataframe()
plot() your time series (plotting options)
(mdsstat package)
summary(da)
#> $`Analyses Timestamp`
#> [1] "2025-02-04 04:24:17 UTC"
#>
#> $`Analyses Counts`
#> Total Analyses Analyses with Exposure Device Levels
#> 27 27 6
#> Event Levels Covariates
#> 1 2
#>
#> $`Date Ranges`
#> Data Start End
#> 1 Device-Event 2017-01-01 2017-12-01
#> 2 Exposure 2017-01-01 2017-12-01
#> 3 Both 2017-01-01 2017-12-01
head(define_analyses_dataframe(da), 3)
id | device_level_source | device_level | device_1up_source | device_1up | event_level_source | event_level | covariate | covariate_level | invivo | date_range_de_start | date_range_de_end | exp_device_level | exp_covariate_level | date_range_exposure_start | date_range_exposure_end | date_range_de_exp_start | date_range_de_exp_end |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | device_name | Bone Cement | device_class | 2 | event_type | All | region | Central | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | Central | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
2 | device_name | Bone Cement | device_class | 2 | event_type | All | region | West | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | West | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
3 | device_name | Bone Cement | device_class | 2 | event_type | All | region | East | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | East | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
deviceevent() to Standardize Device-Event Data
Basic Usage
de <- deviceevent(maude, "date_received", c("device_name", "device_class"), c("event_type", "medical_specialty_description"))
head(de, 3)
key | time | device_1 | device_2 | event_1 | event_2 |
---|---|---|---|---|---|
1 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic |
2 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic |
3 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic |
Advanced Usage
de <- deviceevent(
  maude,
  time="date_received",
  device_hierarchy=c("device_name", "device_class"),
  event_hierarchy=c("event_type", "medical_specialty_description"),
  key="report_number",
  covariates="region",
  descriptors="_all_")
head(de, 3)
key | time | device_1 | device_2 | event_1 | event_2 | region | product_problem_flag | adverse_event_flag | report_source_code | lot_number | model_number | manufacturer_d_name | manufacturer_d_country | brand_name |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0002249697-2017-00023 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic | Central | Y | N | Manufacturer report | MHX076 | | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK |
0002249697-2017-00028 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic | West | Y | N | Manufacturer report | MHX080 | | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK |
0002249697-2017-00025 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic | Central | Y | N | Manufacturer report | MHX076 | | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK |
data_frame
time
Date format.
device_hierarchy
mds remembers this hierarchy and allows trending at multiple levels as you specify.
event_hierarchy
descriptors argument. The hierarchical concept reflects how events are often nested into progressively more general groups. Set the first variable as the lowest event level that you would like to trend at. mds remembers this hierarchy and allows trending at multiple levels as you specify. If your data do not have an event variable, you will need to create a dummy variable.
key
data_frame. If your data pipeline carries over a key variable, it is recommended to specify it here. The key allows downstream aggregated analyses to "look up" individual constituent events.
covariates
covariates="region" will allow analysis of regions within device. These variables should be categorical in nature.
descriptors
implant_days
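Source data may need light preparation before it is passed to deviceevent(). The sketch below uses base R only and a small made-up data frame (raw_events and its values are hypothetical) to show two of the points above: converting a yyyymmdd value to Date class, and adding a dummy event variable when the data have none. It does not assume anything about conversions that deviceevent() may perform on its own.

```r
# Hypothetical raw data: dates stored as yyyymmdd numbers, no event variable
raw_events <- data.frame(
  report_number = c("A-001", "A-002", "A-003"),
  date_received = c(20170103, 20170215, 20170301),
  device_name   = c("Bone Cement", "Bone Cement", "Bone Cement"),
  stringsAsFactors = FALSE
)

# Convert yyyymmdd values to Date class
raw_events$date_received <- as.Date(
  as.character(raw_events$date_received), format = "%Y%m%d"
)

# Dummy event variable: the same placeholder value for every row
raw_events$event_type <- "All Events"

str(raw_events)
```

The placeholder value "All Events" is arbitrary; any constant label serves the same purpose.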
exposure() to Standardize Exposure Data
Exposure data are meant to support device-event data. As such, the general expectation is that variable values match between exposure and device-event data. For example, 10 exposures for ev3 Solitaire in France will be matched exactly to ev3 Solitaire events in France, and not to events for EV3 SOLITAIRE in FRANCE.
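Because matching is exact, it can help to harmonize values in both sources before standardizing them. The following base R sketch uses made-up vectors (events_dev, exposure_dev, and the helper harmonize() are hypothetical names, not part of mds) to show how differences in case or stray whitespace prevent a match, and one possible way to align them.

```r
# Hypothetical device names as they might appear in two data sources
events_dev   <- c("EV3 SOLITAIRE", "ev3 Solitaire ")
exposure_dev <- c("ev3 Solitaire")

# Exact comparison: neither event value matches the exposure value
events_dev %in% exposure_dev

# One possible harmonization: trim whitespace and upper-case both sides
harmonize <- function(x) toupper(trimws(x))
harmonize(events_dev) %in% harmonize(exposure_dev)
```

Upper-casing is only one convention; what matters is that the same rule is applied to both the device-event and exposure data.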
Basic Usage
head(ex, 3)
key | time | count | device_1 |
---|---|---|---|
1 | 2017-01-01 | 1 | Arthroscope |
2 | 2017-02-01 | 1 | Arthroscope |
3 | 2017-03-01 | 1 | Arthroscope |
Advanced Usage
ex <- exposure(
  sales,
  time="sales_month",
  device_hierarchy="device_name",
  match_levels="region",
  count="sales_volume")
head(ex, 3)
key | time | count | device_1 | region |
---|---|---|---|---|
1 | 2017-01-01 | 83 | Arthroscope | Central |
2 | 2017-02-01 | 119 | Arthroscope | Central |
3 | 2017-03-01 | 112 | Arthroscope | Central |
Note: Although not required, count will commonly be used as well.
data_frame
time
Date format. If exposure will be used, it is critical to have sufficient time granularity. For example, if analysis will be done monthly, exposure data must be no less granular than monthly. mds does not make assumptions about filling in holes in time!
device_hierarchy
device_hierarchy parameter.
event_hierarchy
event_hierarchy parameter. Exposures at an event level are not common.
count
key
data_frame. If your data pipeline carries over a key variable, it is recommended to specify it here. The key allows downstream aggregated analyses to "look up" individual constituent exposure records.
match_levels
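Since holes in time are not filled in, it may be worth checking exposure data for missing periods before standardizing. The base R sketch below uses a made-up vector of monthly exposure dates (exposure_months is hypothetical) and lists the months that have no record.

```r
# Hypothetical monthly exposure records with one month (March) missing
exposure_months <- as.Date(c("2017-01-01", "2017-02-01", "2017-04-01"))

# Every month expected between the first and last record
expected_months <- seq(min(exposure_months), max(exposure_months), by = "month")

# Months with no exposure record at all
missing_months <- expected_months[!expected_months %in% exposure_months]
missing_months
```

How to handle a missing month (for example, entering an explicit zero or excluding the period) is an analytic decision that this check leaves to you.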
define_analyses() to Enumerate Analysis Combinations
After standardizing device-event data using deviceevent() and, optionally, exposure data using exposure(), the next step is to discover what types of analyses are possible. This is separated from actually doing the analysis (counting, calculations, statistics, etc.) because:
Basic Usage
Note that define_analyses() returns a list of individual analyses. Each individual analysis contains a set of instructions. You can view an analysis by submitting da[[1]], da[[2]], etc., but a less cumbersome overview is possible using summary() and define_analyses_dataframe().
summary(da)
#> $`Analyses Timestamp`
#> [1] "2025-02-04 04:24:19 UTC"
#>
#> $`Analyses Counts`
#> Total Analyses Analyses with Exposure Device Levels
#> 7 0 6
#> Event Levels Covariates
#> 1 1
#>
#> $`Date Ranges`
#> Data Start End
#> 1 Device-Event 2017-01-01 2017-12-01
#> 2 Exposure <NA> <NA>
#> 3 Both 2017-01-01 2017-12-01
head(define_analyses_dataframe(da), 3)
id | device_level_source | device_level | device_1up_source | device_1up | event_level_source | event_level | covariate | covariate_level | invivo | date_range_de_start | date_range_de_end | date_range_de_exp_start | date_range_de_exp_end |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | device_name | Bone Cement | device_class | 2 | event_type | All | Data | All | FALSE | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
2 | device_name | Bone Cement, Antibiotic | device_class | 2 | event_type | All | Data | All | FALSE | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
3 | device_name | Cement, Bone, Vertebroplasty | device_class | 2 | event_type | All | Data | All | FALSE | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
Advanced Usage
summary(da)
#> $`Analyses Timestamp`
#> [1] "2025-02-04 04:24:19 UTC"
#>
#> $`Analyses Counts`
#> Total Analyses Analyses with Exposure Device Levels
#> 27 27 6
#> Event Levels Covariates
#> 1 2
#>
#> $`Date Ranges`
#> Data Start End
#> 1 Device-Event 2017-01-01 2017-12-01
#> 2 Exposure 2017-01-01 2017-12-01
#> 3 Both 2017-01-01 2017-12-01
head(define_analyses_dataframe(da), 3)
id | device_level_source | device_level | device_1up_source | device_1up | event_level_source | event_level | covariate | covariate_level | invivo | date_range_de_start | date_range_de_end | exp_device_level | exp_covariate_level | date_range_exposure_start | date_range_exposure_end | date_range_de_exp_start | date_range_de_exp_end |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | device_name | Bone Cement | device_class | 2 | event_type | All | region | Central | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | Central | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
2 | device_name | Bone Cement | device_class | 2 | event_type | All | region | West | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | West | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
3 | device_name | Bone Cement | device_class | 2 | event_type | All | region | East | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | East | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
deviceevents
(class() should contain "mds_de")
device_level
attributes(de)$device_hierarchy.
event_level
attributes(de)$event_hierarchy.
exposure
(class() should contain "mds_e")
date_level and date_level_n
"months" and 1 analyzes by month. Other examples include "months" and 12 for yearly, or "days" and 7 for weekly.
covariates
c("region") analyzes by each level of region within device.
times_to_calc
date_level and date_level_n.
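The pairing of a time unit with a multiplier, as in date_level and date_level_n, resembles the interval specifications used by base R's cut() and seq() for dates. The sketch below is an illustration in base R only, with made-up event dates; it does not call define_analyses() and makes no claim about how mds bins time internally.

```r
# Hypothetical event dates
event_dates <- as.Date(c("2017-01-03", "2017-01-20", "2017-02-14", "2017-04-02"))

# Unit "month" with multiplier 1: monthly periods
monthly <- cut(event_dates, breaks = "1 month")
table(monthly)

# Unit "days" with multiplier 7: seven-day periods starting at the first date
weekly_breaks <- seq(min(event_dates), max(event_dates) + 7, by = "7 days")
weekly <- cut(event_dates, breaks = weekly_breaks, right = FALSE)
table(weekly)
```

Note that the monthly table keeps an empty period for March, which is the kind of gap that matters when counts are later compared against exposure.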
It is always assumed that analyses at aggregated levels are desired (such as analysis of all events for a given device, or analysis of all events across all devices). Aggregated level analysis is easily recognized by the "All" and "Data" values in device_level, event_level, covariate, and covariate_level.
 | id | device_level_source | device_level | device_1up_source | device_1up | event_level_source | event_level | covariate | covariate_level | invivo | date_range_de_start | date_range_de_end | exp_device_level | exp_covariate_level | date_range_exposure_start | date_range_exposure_end | date_range_de_exp_start | date_range_de_exp_end |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
11 | 11 | device_name | Cement, Bone, Vertebroplasty | device_class | 2 | event_type | All | region | East | FALSE | 2017-01-01 | 2017-08-01 | Cement, Bone, Vertebroplasty | East | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-08-01 |
12 | 12 | device_name | Cement, Bone, Vertebroplasty | device_class | 2 | event_type | All | region | Central | FALSE | 2017-02-01 | 2017-12-01 | Cement, Bone, Vertebroplasty | Central | 2017-01-01 | 2017-12-01 | 2017-02-01 | 2017-12-01 |
NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
There are several options:
(da[c(1:5, 24:27)])
define_analyses() with different parameter settings
(da[[1]]$date_range_exposure['start'] <- as.Date("2016-10-01"))
time_series() to Generate Counts, Rates, and More
Once an analysis has been defined using define_analyses(), the analysis instructions can be executed using time_series(), returning by defined time periods:
(key parameter from deviceevent()) for lookup of individual event records.
(key parameter from exposure()) for lookup of individual exposure records.
Basic Usage
Note that time_series() returns, in a list, one time series data frame for every analysis. You can select a time series by submitting ts[[1]], ts[[2]], etc.
head(ts[[1]], 3)
time | nA | ids |
---|---|---|
2017-01-01 | 13 | 0002249697-2017-00023 |
2017-02-01 | 7 | 0002249697-2017-00488 |
2017-03-01 | 5 | 0002249697-2017-00755 |
Advanced Usage
head(ts[[1]], 3)
time | nA | ids | exposure | ids_exposure |
---|---|---|---|---|
2017-01-01 | 13 | 0002249697-2017-00023 | 8597 | 37 |
2017-02-01 | 7 | 0002249697-2017-00488 | 5115 | 38 |
2017-03-01 | 5 | 0002249697-2017-00755 | 10191 | 39 |
analysis
(class() should contain "mds_da") or a list of defined analyses.
deviceevents
(class() contains "mds_de"). It is typically the same data frame used to generate analysis, but can be another "mds_de" data frame, such as a cut of the data at a different time. Note that if, say, an older dataset is being used, the analysis date ranges must correspond.
exposure
(class() contains "mds_e"). It is typically the same data frame used to generate analysis. Like deviceevents, another data frame may be used, but the analysis instructions must correspond.
use_hierarchy
?time_series.mds_da for more details.
It is not uncommon to adjust event and exposure counts, such as with applications of rolling or moving averages. These adjustments should be applied after generating time series data frames from time_series().
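As one illustration of such an adjustment, the base R sketch below applies a trailing 3-month moving average to a column of counts. The data frame ts_df is a hypothetical stand-in for one element of the time_series() output: its first three nA values are taken from the example table above and the rest are made up. The column name nA_ma3 is also an assumption for this example.

```r
# Monthly counts; the first three values are the nA counts shown above,
# the remaining values are made up for illustration
ts_df <- data.frame(
  time = seq(as.Date("2017-01-01"), by = "month", length.out = 6),
  nA   = c(13, 7, 5, 9, 11, 4)
)

# Trailing 3-month moving average (sides = 1 uses the current and prior values)
ts_df$nA_ma3 <- as.numeric(stats::filter(ts_df$nA, rep(1 / 3, 3), sides = 1))

ts_df
```

The first two periods have no moving average because fewer than three months of history are available; whether to keep or drop those periods depends on the downstream analysis.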
plot()ing a Time Series
Plotting an individual time series generated by time_series() is straightforward. Call plot() on the time series object:
There are a few custom parameters, including:
mode
"nA" (representing the device-event of interest), "exposure", and "rate" (simply "nA"/"exposure"). Less common are "nB", "nC", and "nD" representing the cell counts of the disproportionality analysis (DPA) contingency table.
xlab, ylab, main
plot() behavior. By default, axes and title labels are inferred directly from the time series.
All other parameters are from plot.default().
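To make the "rate" calculation concrete, the sketch below recomputes it by hand in base R from the first three rows of the Advanced Usage time series shown earlier. The data frame ts1 is a hand-built stand-in, and the plot uses base graphics directly rather than the plot() method provided by mds, so labels and appearance will differ from the package output.

```r
# Values copied from the first three rows of the Advanced Usage time series
ts1 <- data.frame(
  time     = as.Date(c("2017-01-01", "2017-02-01", "2017-03-01")),
  nA       = c(13, 7, 5),
  exposure = c(8597, 5115, 10191)
)

# Rate as described for mode "rate": events divided by exposure
ts1$rate <- ts1$nA / ts1$exposure

# Base graphics plot of the rate over time, with explicit axis and title labels
plot(ts1$time, ts1$rate, type = "b",
     xlab = "Time", ylab = "Rate (nA / exposure)", main = "Event rate by month")
```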