Introduction to asclepias

This site is the documentation hub for asclepias. asclepias is a domain specific language written in Haskell to build and analyze cohorts. For a deeper dive into the theory underlying asclepias, see Theory and Design. Information on the Motivation to build asclepias and its Benefits are included below.

If you are contributing to the development of asclepias, see the developer guide for more details.

If you are using asclepias for your project, see the user guide for more details.

See the asclepias API documentation for detailed information on the types and functions exported from asclepias.

asclepias code is hosted on GitLab.

Motivation

The extraction and transformation of source data (such as insurance claims, electronic medical records, or case report forms) into analysis-ready cohort data consumes an inordinate amount of resources in epidemiologic and real world evidence research. Moreover, the translation of scientists' analysis plans into code can be prone to ambiguity and misinterpretation.

The aim of asclepias is to make epidemiologic research less effort, less error-prone, and more robust by providing a collection of domain specific languages that both scientists and programmers can reason with and understand.

Benefits

asclepias is a domain specific language and a collection of tools for defining cohorts. Benefits of asclepias include:

Formally defines a cohort

asclepias provides a formal definition of a cohort specification, which includes defining one or more index events, a set criteria that determine how subjects will be included or excluded, and the set of features (i.e. variables) the cohort contains.

Provides tools for defining and testing features

A Feature in asclepias can be thought of like covariates, outcomes, or other variables. asclepias includes a language for defining features of nearly arbitrary shape. The language elegantly handles failures and prevents certain programming bugs.

Includes a well-tested library of common features

Cohorts often include many simple features such as indicators that a subject has some condition during a baseline period. asclepias includes templates for many common features.

Uses type safety to create better programs

Programmming languages often used in research such as SAS, R, python are dynamically, as opposed to statically, typed. These languages are great for many programming tasks, but statically typed languages can prevent many common programming errors by identifying them when the program is compiled. Most of the asclepias tools are written in languages like Haskell or Dhall to leverage type safety.

Is easily extensible

Users can create multiple features and/or cohorts following a common pattern and share these across projects.

Introduces the event data theory

The event data theory is a framework for defining data models where the concept of time, as in epidemiology, is critical. Having data models of sequences of time-ordered events often make reasoning about cohorts and features easier.

Is data model agnostic

While using the event-based data model is encouraged, the theory of cohorts and features both work under a "bring your data model" assumption that does not require an event-based model.