Introduction to Asclepias

This site is the documentation hub for Asclepias, a domain-specific language for building and analyzing cohorts.

Where to start

If you’re unfamiliar with asclepias and what it can do, start by reading the Motivation and Benefits sections below. If this is your first time using asclepias, read getting started. If you’re already an asclepias user, check out our available tutorials and how-to guides.

Reference information for hasklepias functions is located here.

Motivation

The extraction and transformation of source data (such as insurance claims, electronic medical records, or case report forms) into analysis-ready cohort data consumes an inordinate amount of resources in epidemiologic and real-world evidence research. Moreover, the translation of scientists' analysis plans into code can be prone to ambiguity and misinterpretation.

The aim of asclepias is to make epidemiologic research less laborious, less error-prone, and more robust by providing a collection of domain-specific languages that both scientists and programmers can reason with and understand.

Benefits

Asclepias is a domain-specific language and a collection of tools for defining cohorts. Benefits of asclepias include:

Formally defines a cohort

Asclepias provides a formal definition of a cohort specification, which includes defining one or more index events, a set of criteria that determine how subjects will be included or excluded, and the set of features (i.e. variables) the cohort contains.
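As a rough illustration of the three parts named above (index events, inclusion/exclusion criteria, and features), here is a minimal Python sketch. It is not the asclepias API: every name in it (Criterion, CohortSpec, evaluate, and the subject fields) is hypothetical, and it simplifies to a single index event per subject.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

# Hypothetical names throughout; this is NOT the asclepias API.
Subject = Dict[str, Any]  # one subject's source data


@dataclass
class Criterion:
    name: str
    passes: Callable[[Subject, int], bool]  # (subject, index day) -> include?


@dataclass
class CohortSpec:
    find_index: Callable[[Subject], Optional[int]]  # index event day, or None
    criteria: List[Criterion]
    features: Dict[str, Callable[[Subject, int], Any]]


def evaluate(spec: CohortSpec, subject: Subject) -> Dict[str, Any]:
    """Apply a cohort specification to one subject."""
    index = spec.find_index(subject)
    if index is None:
        return {"included": False, "reason": "no index event"}
    for criterion in spec.criteria:  # the first failed criterion excludes
        if not criterion.passes(subject, index):
            return {"included": False, "reason": criterion.name}
    values = {name: f(subject, index) for name, f in spec.features.items()}
    return {"included": True, "index": index, "features": values}


spec = CohortSpec(
    find_index=lambda s: s.get("first_rx_day"),
    criteria=[Criterion("age >= 18", lambda s, i: s["age"] >= 18)],
    features={"is_female": lambda s, i: s["sex"] == "F"},
)

result_in = evaluate(spec, {"first_rx_day": 120, "age": 54, "sex": "F"})
result_out = evaluate(spec, {"first_rx_day": 45, "age": 16, "sex": "M"})
```

Because the specification is an ordinary value, the same definition can be reviewed by a scientist, tested on toy subjects, and then applied to real data without being restated.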

Provides tools for defining and testing features

A Feature in asclepias can be thought of as a covariate, an outcome, or any other variable. Asclepias includes a language for defining features of nearly arbitrary shape. The language elegantly handles failures and prevents certain programming bugs.
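One way to picture failure handling in feature definitions is a value that carries either a result or the reason it could not be computed, with failures passed along to any feature derived from it. The Python sketch below illustrates that idea only; Feature, define, and the example features are hypothetical names and do not reflect how asclepias implements it.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")


# Hypothetical illustration; NOT the asclepias Feature type.
@dataclass(frozen=True)
class Feature(Generic[A]):
    """Either a computed value or the reason it could not be computed."""
    value: Optional[A] = None
    error: Optional[str] = None

    @property
    def ok(self) -> bool:
        return self.error is None


def define(f: Callable[..., B]) -> Callable[..., "Feature[B]"]:
    """Lift a plain function so it accepts and returns Feature values."""
    def lifted(*inputs: Feature) -> Feature:
        for feature in inputs:
            if not feature.ok:  # pass the first upstream failure along
                return feature
        try:
            return Feature(value=f(*(x.value for x in inputs)))
        except Exception as exc:  # record the failure instead of crashing
            return Feature(error=f"{type(exc).__name__}: {exc}")
    return lifted


bmi = define(lambda weight_kg, height_m: weight_kg / height_m ** 2)
is_obese = define(lambda b: b >= 30)

good = is_obese(bmi(Feature(value=95.0), Feature(value=1.75)))
missing = is_obese(bmi(Feature(error="weight not recorded"), Feature(value=1.75)))
bad_height = bmi(Feature(value=95.0), Feature(value=0.0))
```

Here `missing` still reports "weight not recorded" after passing through two definitions, and `bad_height` records a division error rather than stopping the whole cohort run.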

Includes a well-tested library of common features

Cohorts often include many simple features such as indicators that a subject has some condition during a baseline period. Asclepias includes templates for many common features.

Uses type safety to create better programs

Programming languages often used in research, such as SAS, R, and Python, are dynamically rather than statically typed. These languages are great for many programming tasks, but statically typed languages can prevent many common programming errors by identifying them when the program is compiled. Most of the asclepias tools are written in languages like Haskell or Dhall to leverage type safety.

Is easily extensible

Users can create multiple features and/or cohorts following a common pattern and share these across projects.

Introduces the event data theory

The event data theory is a framework for defining data models where the concept of time, as in epidemiology, is critical. Having data models built on sequences of time-ordered events often makes reasoning about cohorts and features easier.
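To make the idea of time-ordered events concrete, the following Python sketch models each event as an interval tagged with concepts and defines a common kind of feature: an indicator that a subject had some condition during a baseline period before the index. The Event fields, the day-based time scale, the 180-day lookback, and the function names are all assumptions chosen for illustration, not the event data model itself.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


# Hypothetical event shape; NOT the asclepias event data model.
@dataclass(frozen=True)
class Event:
    begin: int                # first day of the event (days from an origin)
    end: int                  # day after the last day (exclusive)
    concepts: FrozenSet[str]  # tags describing what happened


def timeline(events: List[Event]) -> List[Event]:
    """Order a subject's events by start day, then by end day."""
    return sorted(events, key=lambda e: (e.begin, e.end))


def had_in_baseline(events: List[Event], concept: str,
                    index_day: int, lookback: int = 180) -> bool:
    """True if any event tagged `concept` overlaps [index_day - lookback, index_day)."""
    start = index_day - lookback
    return any(
        concept in e.concepts and e.begin < index_day and e.end > start
        for e in events
    )


events = timeline([
    Event(365, 366, frozenset({"statin"})),
    Event(50, 51, frozenset({"hypertension"})),
    Event(200, 201, frozenset({"diabetes"})),
])

diabetes_at_baseline = had_in_baseline(events, "diabetes", index_day=365)
hypertension_at_baseline = had_in_baseline(events, "hypertension", index_day=365)
statin_at_baseline = had_in_baseline(events, "statin", index_day=365)
```

With events in this shape, the baseline indicator reduces to one interval-overlap check, which is the kind of simplification a time-centered data model is meant to provide.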

Is data model agnostic

While using an event-based data model is encouraged, the theories of cohorts and features both work under a "bring your own data model" assumption that does not require an event-based model.
