Introduction to asclepias
This site is the documentation hub for asclepias.
asclepias is a domain-specific language written in Haskell to build and analyze cohorts.
For a deeper dive into the theory underlying asclepias, see Theory and Design.
Information on the Motivation to build asclepias and its Benefits is included below.
If you are contributing to the development of asclepias, see the developer guide for more details.
If you are using asclepias for your project, see the user guide for more details.
See the asclepias API documentation for detailed information on the types and functions exported from asclepias.
asclepias code is hosted on GitLab.
Motivation
The extraction and transformation of source data (such as insurance claims, electronic medical records, or case report forms) into analysis-ready cohort data consumes an inordinate amount of resources in epidemiologic and real world evidence research. Moreover, the translation of scientists' analysis plans into code can be prone to ambiguity and misinterpretation.
The aim of asclepias is to make epidemiologic research less laborious, less error-prone, and more robust by providing a collection of domain-specific languages that both scientists and programmers can reason with and understand.
Benefits
asclepias is a domain-specific language and a collection of tools for defining cohorts.
Benefits of asclepias include:
- Formally defines a cohort
-
asclepias provides a formal definition of a cohort specification, which includes defining one or more index events, a set of criteria that determine how subjects will be included or excluded, and the set of features (i.e., variables) the cohort contains.
- Provides tools for defining and testing features
-
A Feature in asclepias can be thought of as a covariate, outcome, or other variable. asclepias includes a language for defining features of nearly arbitrary shape. The language elegantly handles failures and prevents certain programming bugs.
- Includes a well-tested library of common features
-
Cohorts often include many simple features such as indicators that a subject has some condition during a baseline period. asclepias includes templates for many common features.
- Uses type safety to create better programs
-
Programming languages often used in research, such as SAS, R, and Python, are dynamically, as opposed to statically, typed. These languages are great for many programming tasks, but statically typed languages can prevent many common programming errors by identifying them when the program is compiled. Most of the asclepias tools are written in languages like Haskell or Dhall to leverage type safety.
- Is easily extensible
-
Users can create multiple features and/or cohorts following a common pattern and share these across projects.
- Introduces the event data theory
-
The event data theory is a framework for defining data models where the concept of time, as in epidemiology, is critical. Modeling data as sequences of time-ordered events often makes reasoning about cohorts and features easier.
- Is data model agnostic
-
While using an event-based data model is encouraged, the theories of cohorts and features both work under a "bring your data model" assumption that does not require an event-based model.
TODO: link to example
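To make the ideas in the list above more concrete, here is a minimal Haskell sketch of a cohort specification built from an index event, inclusion criteria, and features that can fail. It is not the asclepias API: every name in it (Event, Feature, CohortSpec, evalSubject, the event labels, and the example subjects) is hypothetical and chosen only for illustration, and days are plain integers standing in for real dates.

```haskell
-- Illustrative sketch only: none of these names come from the asclepias API.
import Data.List (minimumBy)
import Data.Ord (comparing)

-- | A time-stamped event. Days since an arbitrary origin stand in for dates.
data Event = Event
  { eventDay   :: Int
  , eventLabel :: String
  } deriving (Show, Eq)

-- | Why a value could not be computed for a subject.
data MissingReason = NoIndexEvent
  deriving (Show, Eq)

-- | A feature either has a value or records why it is missing.
type Feature a = Either MissingReason a

-- | Whether a subject passed every criterion, or the first one that failed.
data Status = Included | ExcludedBy String
  deriving (Show, Eq)

-- | A cohort specification: an index event, named criteria, named features.
data CohortSpec = CohortSpec
  { indexLabel :: String
  , criteria   :: [(String, Event -> [Event] -> Bool)]
  , features   :: [(String, Event -> [Event] -> Feature Bool)]
  }

-- | Index event: the earliest event carrying the given label.
indexEvent :: String -> [Event] -> Feature Event
indexEvent label events =
  case filter ((== label) . eventLabel) events of
    []      -> Left NoIndexEvent
    matches -> Right (minimumBy (comparing eventDay) matches)

-- | Criterion: some event occurs at least `days` days before the index.
observedBefore :: Int -> Event -> [Event] -> Bool
observedBefore days index = any (\e -> eventDay e <= eventDay index - days)

-- | Indicator feature: `label` occurs in the `window` days before the index.
hadConditionAtBaseline :: Int -> String -> Event -> [Event] -> Feature Bool
hadConditionAtBaseline window label index events = Right (any inBaseline events)
  where
    inBaseline e =
      eventLabel e == label
        && eventDay e < eventDay index
        && eventDay e >= eventDay index - window

-- | Evaluate one subject: find the index, apply criteria, compute features.
-- A missing index event short-circuits the whole evaluation via Either.
evalSubject :: CohortSpec -> [Event] -> Feature (Status, [(String, Feature Bool)])
evalSubject spec events = do
  index <- indexEvent (indexLabel spec) events
  let failed = [name | (name, ok) <- criteria spec, not (ok index events)]
      status = case failed of
        []         -> Included
        (name : _) -> ExcludedBy name
      values = [(name, f index events) | (name, f) <- features spec]
  pure (status, values)

-- Hypothetical specification and subjects, for illustration only.
exampleSpec :: CohortSpec
exampleSpec = CohortSpec
  { indexLabel = "statin_start"
  , criteria   = [("observed_365_days_before_index", observedBefore 365)]
  , features   = [("diabetes_at_baseline", hadConditionAtBaseline 180 "diabetes")]
  }

subjectA, subjectB, subjectC :: [Event]
subjectA = [Event 0 "enrollment", Event 300 "diabetes", Event 400 "statin_start"]
subjectB = [Event 0 "enrollment", Event 50 "diabetes"]
subjectC = [Event 100 "enrollment", Event 200 "statin_start"]
```

Loading this in GHCi, `evalSubject exampleSpec subjectA` evaluates to `Right (Included,[("diabetes_at_baseline",Right True)])`, while `evalSubject exampleSpec subjectB` evaluates to `Left NoIndexEvent` because the subject never has the index event. The compiler rejects a feature that tries to use an index event without first handling the case where none exists, which is the kind of error static typing catches before any data is processed.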