The `asclepias` User Guide

The purpose of the user guide is to provide guidance on using asclepias. This guide covers:

How to get started using asclepias
Instructions for using asclepias
References

For background theory on cohort building, see Event Data Theory.

Getting Started

To get started using asclepias, there are a few prerequisite tasks. This section covers instructions on the installation of necessary software tools and prepping data for use by asclepias functions.

Setting Up Software Tools

To use asclepias, you will need to install the Haskell tool chain. To install, follow the directions in the setup guide, with these special instructions:

Use ghc version 9.2.2.
Use cabal version 3.6.2.0.

If you are new to Haskell, review the Haskell usage guide for best practices.

Data Requirements

To access all the functionality of asclepias, the project data have to be in a particular format. The data must be in JSON file, and follow NoviSci’s standard EDM schema where each line in the file is a valid EventLine. See the EventLine type in event-data-model for more details. Any project that uses the standard ETL process will meet these requirements.

If the project data is not in this format, asclepias may still be used, but additional steps will need to be taken by the stat dev team. Please contact them for more details.

Instructions

1. Define Fact-Model

If the project data does not meet the Data Requirements, or the ETL process will not be run on your data, skip this step.

Search the event-data-model project repo for existing models.
If one of the existing models matches your project’s use case, select this model.
Else, create a new model, following the instructions in the event-data-model documentation to construct a new model. Note that, to make a new model, you may also need to make new facts.

2. Set Up ETL Process

TODO - link to Paul’s documentation when ready

3. Initialize Project Repo

Initialize a git repo for your project, using nsProjects::initialize() for example.
In your terminal, navigate to the project folder.

Execute the following commands in the terminal,

PROJID=myProj
cabal init \
    --libandexe \
    --application-dir=apps \
    --source-dir=plans \
    --tests \
    --test-dir=tests \
    --package-name=$PROJID \
    --minimal \
    --homepage https://gitlab.novisci.com/nsResearch/$PROJID \
    --dependency hasklepias-main
cabal update
mv plans/MyLib.hs plans/Cohorts.hs
mv tests/MyLibTest.hs tests/Main.hs

You may see warnings about fields in the .cabal file. You can safely ignore these.

Update <myProj>.cabal, replacing <myProj> with your project name.

Change exposed-modules in the library stanza from MyLib to:

    exposed-modules:
        Cohorts
      , Tests

Add the following section to the library stanza:

    default-extensions:
      NoImplicitPrelude
      OverloadedStrings
      LambdaCase
      FlexibleContexts
      FlexibleInstances
      DeriveGeneric
      MultiParamTypeClasses
      DataKinds
      TypeApplications

Change build-depends in the executable stanza to:
```
    build-depends:
        base,
        hasklepias-main,
        <myProj>
```
Replacing <myProj> with your project name.
Change main-is in the test-suite stanza to:
```
    main-is:          Main.hs
```

Modify plans/Cohorts.hs to:
```
module Cohorts where

import Hasklepias
```

Modify apps/Main.hs to:

module Main where

import  Cohorts

main :: IO ()
main = do
  putStrLn "Hello, Haskell!"

Modify tests/Main.hs to, replacing <myProj> with your project name:

import           Hasklepias
import           Tests

main :: IO ()
main = defaultMain (testGroup "<myProj> Tests" [tests])

Add a file Tests.hs file in the plans directory with the following:

module Tests where

import           Hasklepias
import           Cohorts


tests :: TestTree
tests = testGroup "example" [testCase "true is true" (True @?= True) ]

Add a cabal.project file in the project’s root directory with the following, making sure to set the tag field to the git tag of the asclepias version to use. Replace <myProj> with your project name.

source-repository-package
  type: git
  location: https://github.com/novisci/asclepias.git
  tag: <SET TO DESIRED VERSION TAG, e.g. v0.27.0>
  subdir: hasklepias-core hasklepias-main event-data-theory

packages: ./<myProj>.cabal

Add the following entries to .gitignore, adding a .gitignore file in the project root if necessary:
```
dist-newstyle*
*.tix
```
Run cabal build to ensure the minimal project builds successfully.

4. Create Cohort Application

Cohorts are created in the Cohorts.hs file.

To define cohorts in asclepias, one must create a CohortSpec type. The CohortSpec type takes three inputs:

The shape of the input data
The shape of the output data
an IndexSet

Instructions on creating a cohort follow.

a. Import Fact-Model

If the project data does not meet the Data Requirements, or the ETL process will not be run on your data, skip this step.

Go to the event-data-model repository.
Identify the SHA commit containing your model.

After the last package declaration in cabal.project, add the fact-models dependency using the following code, replacing YourCommitSHA` with the SHA commit identified above.

source-repository-package
   type: git
   location: https://gitlab+deploy-token-ns-projects:glpat-7Z4w2JGrm2692Bshcqhd@gitlab.novisci.com/nsStat/event-data-model.git
   tag: YourCommitSHA
   subdir:
    fact-models

Add Models.YourModelType to your module import declaration. YourModelType is the model type name defined in the Model.dhall file.
Your model type can now be refereced in code using YourModelType.

b. Defining Input Data Shape

TODO - I think we need instructions on how to tag things I’m assuming that the tags will be in place already if using notionate?

If using the Event type, define your project event as follows:

type YourEvent a = Event YourTags YourModel a

where

a is the temporal unit

YourTags is the type of your project’s tags

YourModel is the type of your project’s model

TODO Do you need to have a tag type defined? If so - can it be empty and what would that look like?

If not using the Event type, define your input data type base on your project’s requirements.

In the example below, the input data is a pair of values - one bool and one integer.

type InputData = (Bool, Integer)

TODO add comment on subject identifier construction for both Event and non-Event data

When using non-Event data, the structure of the data still needs to meet certain requirements. When parsed, the data must match the Population d type, where d is the type of your input data. The Population type can be interpreted as the subject identifier. Any type that is coercible to a Text can be used for Population.

The example below defines non-Event data. The Population has type Integer. The input data type d has type MyInputData, which is equivalent to (Bool, Integer).

examplePopulation :: Population MyInputData
examplePopulation =
  from @[(Integer, MyInputData)]
  [ (1, (False, 5))
  , (2, (True, 0))
  , (3, (False, 99))
  , (4, (False, -1))
  , (5, (True, -2))
  , (6, (False, 42))
  , (7, (True, 8))
  , (8, (True, 123))
  , (9, (False, 85))
  , (10, (True, 42))
  ]

Note examplePopulation is defined in code. To use non-Event flat files, additional structure will be needed. Contact the stats dev team for assistance.

c. Defining Output Data Shape

The output data is defined in the Cohorts.hs file, and should match the specifications for your project.

Below is an example of an output type using Event format. The output type for Event data should always be Featureset.

TODO - it’s hard to see the "shape" of the output data based on this What will a Featureset look like? Also, I didn’t think features were needed to create a cohort?

THIS EXAMPLE IS INCOMPLETE/INCORRECT AND WILL NOT COMPILE SEE https://gitlab.novisci.com/nsStat/asclepias/-/merge_requests/254#note_40905 AND https://gitlab.novisci.com/nsStat/asclepias/-/merge_requests/254#note_40906

featureRunner :: Interval Day -> Es Day -> Featureset
featureRunner i es = do
  let ix        = pure i
  let ev        = pure es
  let flwevents = eval defFlwupEvents ix ev

  featureset
    (  packFeature (eval defAge ix ev)
    :| [packFeature (eval defOutcome flwevents)]
    <> statinCovariates ix ev
    )

Below is an example of a custom output type.

type OutputData = (Fanaticism, Maybe Integer)

d. Defining an IndexSet

An IndexSet represents the set of all index dates for an event. asclepias requires a function defining how to create the IndexSet from the input data in order to create a CohortSpec.

An IndexSet can always be made by applying the makeIndexSet function to a list. Further, asclepias also defines an instance of into, which coerces a list into an IndexSet. In fact, makeIndexSet is just a wrapper around into, as seen below:

makeIndexSet :: (Ord i) => [i] -> IndexSet i
makeIndexSet = into

Creating an empty IndexSet

If no index is required for your project, no function need be defined explicitly. But, you could create a function like that below:

emptyIndex :: YourInputDataType -> IndexSet ()
emptyIndex _ = makeIndexSet [()]

Update YourInputDataType with your project-specific input data type.

Instead, you could also use () for the Index type when needed.

Creating an IndexSet from an Event

Define a function with type [Event t m a] → IndexSet (Interval a), where t represents the project tags, m represents the project model, and a represents the type of the project’s temporal unit

The following example index function takes in a list of Event s and creates an IndexSet where the index is defined as the last day that YOUR_TAG occurred for longer than 1 temporal unit (a) during yourStudyPeriod.

TODO - THIS EXAMPLE IS INCOMPLETE/INCORRECT AND WILL NOT COMPILE https://gitlab.novisci.com/nsStat/asclepias/-/merge_requests/254#note_40892 https://gitlab.novisci.com/nsStat/asclepias/-/merge_requests/254#note_40893

index :: [Event t m a] -> IndexSet (Interval a)
index events =
  events
    |> filterEvents (containsTag [YOUR_TAG])
    |> combineIntervals
    |> filter
         (getPredicate
           (Predicate ((> 1) . duration) &&& Predicate (concur yourStudyPeriod))
         )
    |> fmap (endervalMoment . end)
    |> into

Creating an IndexSet from a list

Define a function with type YourInputDataType → IndexSet (Interval a), where YourInputDataType is the name of your input data type.

TODO - what restrictions are there when not using Event type? TODO finish instructions after getting more feedback on restrictions

e. Define Tags

TODO in the asclepias-project-example we needed to do this, but I feel like if we use event-data-model, the tags are already defined in a "real" project, will tags need to be defined in code?

Best Practices

The TagSet type t should be a sum type object. Meaning, each possible tag should be enumerated in the type (data MyProjectTags = Diabetes | BirthDay | InHospital | ...). By defining the tag set as a sum type, type safety is ensured. One cannot misspell a tag or use an undefined tag, for example. This is in contrast to a common alternative to a sum type, a Text type. Text has no such guarantees on type safely. One can misspell a Text tag. One can also define a function checking the values of a Text tag that compiles, but does not produce correct output. Again, the recommendation is to use sum types for all TagSet definitions.

The schema (m) type for an Event must be an instance of Eq, Show, Generic, and FromJSON typeclasses. The DeriveGeneric language extension makes deriving the Generic instance trivial, as in the code above. At this time, users do need to provide the FromJSON instance, and the boilerplate in the example above should work in most cases.

The tag (t) type for an Event must be an instance of Eq, Show, Typeable, and FromJSON typeclasses. Making t Generic will also make it Typeable, so in most cases simply deriving (Eq, Show, Generic) and a stock FromJSON instance is sufficient for the tag type.

The event-data-theory packages provides a few utilities for testing a new model. These can be found in the EventDataTheory.Test module, which is not included in the main set of exported modules.

The eventDecodeTests and eventDecodeFailTests functions, for example, test for successful parsing and successful failed parsing (respectively) of EventLine m t a into the corresponding Event t m a. These functions take a directory path as an argument. Each file ending .jsonl in that directory should contain a single EventLine as JSON to be tested. See the test directory and EventDataTheory.TheoryTest module in this package for examples.

f. Define Criteria

A Criteria is a set of Criterion, paired with a natural number.

newtype Criteria = MkCriteria [ (Natural, Criterion) ]
  deriving (Eq, Show)

Each Criterion is a tuple, (Text, Status), where the Text value represents a label and the Status is either Include or Exclude.

newtype Criterion = MkCriterion ( Text, Status ) deriving (Eq, Show)

excludeIf and includeIf are helper function that take in a Bool and return a Status. exludeIf False returns Include and excludeIf True returns Exclude. includeIf, given the same inputs, returns the opposite Status values.

includeIf :: Bool -> Status
includeIf True  = Include
includeIf False = Exclude

excludeIf :: Bool -> Status
excludeIf True  = Exclude
excludeIf False = Include

CohortSpec requires a function that takes in the IndexSet and the shape of the input data and returns the Criteria. Instructions on creating the criteria-producing function follow.

Review the SAP for the inclusion/exclusion criteria.
(Recommended) Create a function for each inclusion/exclusion in the SAP that returns a Status.
Define your criteria-producing function.
Define your function type, a → YourInputData → Criteria, where a is your IndexSet type and YourInputData is your input data type.
Apply the correct pattern matching for your input data.
Define the function output as criteria [yourCriterionList] where yourCriterionList is constructed as criterion yourLabel (yourCriterionFunction yourInput). If any of these criterion statements result in a Status of Exclude, the subject will be excluded.

In the example below we define an empty index set, YourIndex, and a new type, YourInputData which represents the input data type. We use these types to define yourCriteria, a function that returns the Status Include if the boolean value that is contained in YourInputDataType is False, and the integer in YourInputDataType is greater than 5. Otherwise Exclude is returned.

type YourInputData = (Bool, Integer)

yourIndex :: YourInputDataType -> IndexSet ()
yourIndex _ = makeIndexSet [()]

yourCriterion1 :: Bool -> Status
yourCriterion1 b = incudeIf b

yourCriterion2 :: Integer -> Status
yourCriterion2 n = includeIf (n > 5)

yourCriteria :: () -> YourInputData -> Criteria
yourCriteria myIndex (b, n) =
    criteria
     [ criterion "include if b = True" (yourCriterion1 b)
     , criterion "include if n > 5" (yourCriterion2 n)
     ]

Using the Event type data allows the use of additional asclepias functionality. TODO include an example with event data that does not use the feature-to-criteria coercion

Note that Criteria can also be created from the Feature type. You can use into to cast a Feature n Status. The instance of into is defined below.

instance KnownSymbol n => From (Feature n Status) Criterion where
  from x = MkCriterion
    ( pack $ symbolVal (Proxy @n)
    , case s of
      Left  mr  -> Exclude
      Right sta -> sta
    )
    where s = getData x

The Text component of the Criterion tuple is taken from the n ame of the Feature n Status. The Status component of the Criteria tuple is set to Exclude for all Left values of Feature n Status. Right values return the Status of the of the Feature n Status object, which can either be Exclude or Include. Note that the user defines how the Left values are constructed in the Feature itself.

Below is an example of defining the criteria-producing function using this coercion. The input data have the standard Event type. For inclusion, a subject must be over 50 years old. ageAtIndex is defined separately because it is used for both the criteria and the features.

TODO - THIS EXAMPLE IS INCOMPLETE/INCORRECT AND WILL NOT COMPILE https://gitlab.novisci.com/nsStat/asclepias/-/merge_requests/254#note_40888

type Es a = [Event YourProjectTags YourProjectModel a]

ageAtIndex :: Interval Day -> Es Day -> F "ageAtIndex" Integer
ageAtIndex i events =
  events
    |> viewBirthYears ( getFacts . getContext )
    |> fmap (\y -> computeAgeAt (date y 1 7) (begin i))  -- Use July 1 YEAR as birthdate
    |> headMay
    |> \case
         Nothing  -> makeFeature $ missingBecause InsufficientData
         Just age -> pure age

critOver50 :: Def (F "ageAtIndex" Integer -> F "isOver50" Status)
critOver50 = define (includeIf . (>= 50))

critRunner :: Interval Day -> Es Day -> Criteria
critRunner i es =
  let age = ageAtIndex i es
  in  let crit1 = eval critOver50 (ageAtIndex i es)
      in  criteria [into crit1]

featureRunner :: Interval Day -> Es Day -> Featureset
featureRunner i es = do
  let ix        = pure i
  let ev        = pure es

  featureset
    (  packFeature (eval (defineA ageAtIndex) ix ev)
    )

g. Create Features

TODO write procedures/instructions for creating a feature.

h. Define Cohort Specification

Event Data Example

cohortSpecs :: CohortMapSpec (Es Day) Featureset (Interval Day)
cohortSpecs = makeCohortSpecs [("main", index, critRunner, featureRunner)]

Non-Event Data Example

minimalCohort :: CohortSpec InputData OutputData ()
minimalCohort = specifyCohort emptyIndex sillyCriteria sillyOutput

5. Test Cohort Application

Unit Tests

Unit tests examine minimal, independent pieces of code to ensure correct functionality. The goal is to have 100% "coverage", meaning every line of code is executed during unit testing. Positive test cases - where results are returned - and negative test cases - where errors are expected - should be written. Test cases should be written based on the project’s specifications and requirements.

Guidelines

Pick a minimal, independent piece of code.
Review your project specifications for the expected functionality of that code.
Create negative test cases.
Answer the following questions:
When should your code fail?
What specifically will inputs that cause failure look like?
How should the code fail (error, warning, silent failure)?
For each expected failure, define a set of inputs that should result in this failure.
For each expected failure, define a set of expected results.
Create positive test cases.
Answer the following questions:
When should your code succeed?
What, specifically, will inputs that cause successful results look like?
How do the successful results differ based on input values?/ What are the possible combinations of inputs that result in successful, but different results?
For each expected success, define a set of inputs that should result in this success.
For each expected success, define a set of expected results.
Put all test cases into a test plan.
Run the test plan using cabal test.

Example

Below is the definition for an index function.

TODO - THIS EXAMPLE WILL NOT COMPILE SEE https://gitlab.novisci.com/nsStat/asclepias/-/merge_requests/254#note_40914

index :: Es Day -> IndexSet (Interval Day)
index events =
  events
    |> filterEvents (containsTag [YourTag])
    |> combineIntervals
    |> filter
         (getPredicate
           (Predicate ((> 1) . duration) &&& Predicate (concur studyPeriod))
         )
    |> fmap (endervalMoment . end)
    |> into

The index for this study is specified as the last unit of any time interval that concurs with the study period, and has duration greater than 1, and also contains the tag "YourTag". We wish to test that index meets these specifications.

Below is the unit test for index.

TODO - THIS EXAMPLE WILL NOT COMPILE SEE https://gitlab.novisci.com/nsStat/asclepias/-/merge_requests/254#note_40914

indexTests :: TestTree
indexTests = testGroup
  "Tests of the index function"
  [ testCase "single mi discharge " $ index dummyEvents1 @?= makeIndexSet
    [endervalMoment (date 2015 1 4)]
  , testCase "multiple mi discharge: meeting intervals combine"
  $   index dummyEvents2
  @?= makeIndexSet [endervalMoment (date 2015 1 6)]
  , testCase "multiple mi discharge: two indices "
  $   index dummyEvents3
  @?= makeIndexSet
        [endervalMoment (date 2015 1 6), endervalMoment (date 2015 1 10)]
  ]

The unit test is a TestTree type. The testGroup function creates the TestTree. testGroup takes in a label, in this case "Test of the index function", followed by a list of testCase statements. Each testCase takes in a label, followed by $, followed by your function called with your input parameters, followed by @?=, followed by your expected result.

For reference, the input data is defined below: TODO - THIS EXAMPLE WILL NOT COMPILE SEE https://gitlab.novisci.com/nsStat/asclepias/-/merge_requests/254#note_40914

dummyEvents1 :: Es Day
dummyEvents1 =
  [ event
      (si (date 2015 1 1, date 2015 1 4))
      (context (packConcepts [MI_DISCHARGE])
               (Diagnosis $ mkMedFact "aa" ICD10 Inpatient)
               Nothing
      )
  ]

  dummyEvents2 :: Es Day
dummyEvents2 =
  [ event
    (si (date 2015 1 1, date 2015 1 4))
    (context (packConcepts [MI_DISCHARGE])
             (Diagnosis $ mkMedFact "aa" ICD10 Inpatient)
             Nothing
    )
  , event
    (si (date 2015 1 4, date 2015 1 6))
    (context (packConcepts [MI_DISCHARGE])
             (Diagnosis $ mkMedFact "aa" ICD10 Inpatient)
             Nothing
    )
  ]

dummyEvents3 :: Es Day
dummyEvents3 =
  [ event
    (si (date 2015 1 1, date 2015 1 4))
    (context (packConcepts [MI_DISCHARGE])
             (Diagnosis $ mkMedFact "aa" ICD10 Inpatient)
             Nothing
    )
  , event
    (si (date 2015 1 4, date 2015 1 6))
    (context (packConcepts [MI_DISCHARGE])
             (Diagnosis $ mkMedFact "aa" ICD10 Inpatient)
             Nothing
    )
  , event
    (si (date 2015 1 7, date 2015 1 10))
    (context (packConcepts [MI_DISCHARGE])
             (Diagnosis $ mkMedFact "aa" ICD10 Inpatient)
             Nothing
    )
  ]

Notes on Generate Input Data

Data access is strictly managed, and so generating dummy data is encouraged. Further, marshalling data in to haskell requires parsers, which makes creating dummy data in a separate file difficult. We understand the difficulty of this process, and hope to provide utilities in the future to assist with this process. Feedback is welcome.

Integration Test

An integration test examines the connection between units of code, within a single application - in this case aclepias. Again, positive and negative test cases should be considered. Any integration test using actual data must be done on the stats server. If using VSCode, use of the IDE’s remote development tools is recommended. For instructions on installing these tools, see TODO add link. For ease of viewing the results, installing jq is recommended. TODO add link to jq installation instructions

Log on to the stats server using VSCode’s remote tools if necessary.
Identify the file name of a partition of data to input into the application.
Open a terminal
If using a local file to test use the following command:
PROJECT --dir DIR --file FILE.jsonl | jq
If using S3 to test, use the following command:
TODO
Debug and repeat as needed until results are returned error-free.

Other Tests

TODO - how are we testing the whole pipeline? TODO - Any UAT testing - what would that look like?

6. Run Cohort Application

Examples

This section provides detailed examples of asclepias usage.

Features

This section provides examples of feature creation.

Find the last event that occurs within a time window of other events

This example demonstrates:

the formMeetingSequence function from interval-algebra
handling a failure case
writing a function generic over both the tag and interval types

In this example, the goal is to write a function that, given a list of tags, converts a list of events into a list of interval durations such that:

the events with any of the given tags are combined into a "meeting sequence";
durations of events of the resulting sequence which have all of the given tags are returned;
but an empty result is treated as a failure.

A function like this could be useful if you wanted to find the durations of time when a subject was both hospitalized and on some medication.

durationsOf
  :: forall n m t a b
   . (KnownSymbol n, Eventable t m a, IntervalSizeable a b)
  => [t]
  -> [Event t m a]
  -> Feature n [b]
durationsOf tSet =
  filter (`hasAnyTag` tSet) (1)
    .> fmap (into @(TagSetInterval t a)) (2) (3)
    .> formMeetingSequence (4)
    .> filter (`hasAllTags` tSet) (5)
    .> \x -> if null x (6)
         then makeFeature $ featureDataL $ Other "no cases"
         else makeFeature $ featureDataR (durations x)

Take the case that a subject has the following events, and we want to know the duration that a subject was both hospitalized and on antibiotics. Below, we walk through the function step-by-by using this case.

   --                          <- [Non-medication]
      ----                     <- [Hospitalized]
       --                      <- [Antibiotics]
            ----               <- [Antibiotics]
------------------------------

1	Filter events to those that contain at least one of the given tags. `---- <- [Hospitalized] -- <- [Antibiotics] ---- <- [Antibiotics] ------------------------------`
2	Cast each event into a `TagSetInterval t a`, which is a synonym for `PairedInterval (TagSet t) a`.
3	This step is important for the `formMeetingSequence` function, as it requires the "data" part of the paired interval to be a `Monoid`. `TagSet` are a `Monoid` by unioning the elements of two values.
4	Form a sequence of intervals where one meets the next. The data of the running example would look like: `- <- [Hospitalized] -- <- [Hospitalized, Antibiotics] - <- [Hospitalized] -- <- [] ---- <- [Antibiotics] ------------------------------`
5	Filter to those intervals that have both of the given tags. Note that `hasAllTags` works here because `PairedInterval (TagSet c) a` is defined as an instance of the `HasTag` typeclass in `event-data-theory`. `-- <- [Hospitalized, Antibiotics] ------------------------------`
6	Lastly, if the result of the previous step is empty, we return a failure, i.e. a `Left` value of `FeatureData`. Otherwise, we return the durations of any intervals, as a successful `Right` value of `FeatureData`.

The durationsOf function can be lifted into a Definition using defineA:

def
  :: (KnownSymbol n1, KnownSymbol n2, Eventable t m a, IntervalSizeable a b)
  => [t] (1)
  -> Def (F n1 [Event t m a] -> F n2 [b]) (2)
def tSet = defineA (durationsOf tSet)

1	Create a function which takes a list of tags and
2	Returns a `Definition`

Find durations of time that satisfy multiple conditions

This example demonstrates

reasoning with the interval algebra
manipulating intervals
using a tag set to group events

In this example, the goal is to write a function that, given a pair of lists of tags and an interval of time:

filters an input list of events to those that concur with the given interval. Note that concur, in this context, means that the intervals are not disjoint.
splits the events into those with the first tag and those with the second
returns the start of the last event of the first tag set where it occurs within +/- 3 time units of an event of the second tag set.

A function like this is useful for defining an index event where the index needs to concur with a time window of other events.

examplePairComparison
  :: (Eventable t m a, IntervalSizeable a b)
  => ([t], [t])
  -> Interval a
  -> [Event t m a]
  -> Maybe a
examplePairComparison (t1, t2) i =
  filterConcur i    --  (1)
    .> splitByTags t1 t2 (2)
    .> uncurry allPairs (3)
    .> filter (\pr -> fst pr `concur` expand 3 3 (snd pr)) (4)
    .> lastMay (5)
    .> fmap (begin . fst) (6)

Take the case that a subject has the following events, and we want to know the first time a diagnosis occurred within +/- 3 days of a procedure. Our given interval, called Baseline here, is (6, 15). Below, we walk through the function step-by-by using this case.

      ---------                <- Baseline
    -                          <- [pr]
      -                        <- [pr]
          -                    <- [dx]
            -                  <- [pr]
            ----               <- [foo]
------------------------------

1	Filter events to those concurring with the given interval. --------- <- Baseline - <- [pr] - <- [dx] - <- [pr] ---- <- [foo] ------------------------------
2	Form a pair of lists where the first element has `t1` (dx in our example) event intervals and the second has `t2` (pr in our example) event intervals. Any events without tag `t1` or `t2` are dropped. In the running example, the intervals of the events would make the following pair: ( [(10,11)] -- the dx event interval , [(6,7), (12,13)] -- the pr event intervals )
3	Form a list of all (c1, c2) pairs of event intervals from the previous step. [ ( (10,11), (6,7) ) , ( (10,11), (12,13) ) ]
4	Expand the `c2` (pr) event intervals by +/- 3 units of time. [ ( (10,11), (3,10) ) , ( (10,11), (9,16) ) ] Then, filter this list to include only instances where the `c1` (dx) interval concurs with a 'c2' interval. [ ( (10,11), (9,16) ) ]
5	Take `Just` the last element of the list, if it exists. Otherwise, `Nothing`.
6	If it exists, take the `begin` of the last `c1` interval. In our example, this is `Just 10`.

Lastly, the example function can be lifted into a Definition using the define function:

def
  :: (Eventable t m a, IntervalSizeable a b)
  => ([t], [t])
  -> Def (F n1 (Interval a) -> F n2 [Event t m a] -> F n3 (Maybe a))
def tag = define (examplePairComparison tag)

Create a function for identifying whether a unit has a history of some event

This example demonstrates:

a simple feature
writing a function in order to create multiple Feature definitions

Epidemiologic studies often seek to determine whether and when some event occurred. In general, the event logic can be quite complicated, but this example demonstrates a simple feature. We wish to determine whether an event of some given tag set occurred, relative to a provided assessment interval.

The function is given here:

makeHx
  :: (Ord a)
  => [Text] (1)
  -> AssessmentInterval a
  -> [Event Text ExampleModel a]
  -> Maybe (Interval a) (2)
makeHx t i events =
  events
    |> filterEvents (containsTag t &&& Predicate (encloses i)) (3)
    |> lastMay (4)
    |> fmap getInterval (5)

1	The example events use `Text` as the type of tag set, so the first argument is a list of `Text` values that will be used to filter events.
2	The return type is `Maybe (Interval a)`. A value of `Nothing` indicates that no event of interest occurred. If one or more events occur, a value of `Just < some interval >` is the interval of the last event.
3	The first step in the function is to filter events to those that contain at least one of the given tags and satisfies an interval relation relative to assessment interval. For this example, we use the `encloses` relation, meaning the event must not overlap either end of the assessment interval.
4	The `lastMay` function returns the last element of a list, if the last is not empty.
5	Lastly, `getInterval` gets the interval component from the event. The `fmap` function is necessary to apply the function to a `Maybe (Event Text ExampleModel a)`.

With the makeHx function, we can create feature definitions:

duckHxDef (1)
  :: (Ord a)
  => Definition
       (  Feature "index" (AssessmentInterval a)
       -> Feature "events" [Event Text ExampleModel a]
       -> Feature "duck history" (Maybe (Interval a))
       )
duckHxDef = define (makeHx ["wasBitByDuck", "wasStruckByDuck"])

macawHxDef (2)
  :: (Ord a)
  => Definition
       (  Feature "index" (AssessmentInterval a)
       -> Feature "events" [Event Text ExampleModel a]
       -> Feature "macaw history" (Maybe (Interval a))
       )
macawHxDef = define (makeHx ["wasBitByMacaw", "wasStruckByMacaw"])

1	Defines a feature that identifies whether a unit was hit by a duck or struck by a duck.
2	Defines a feature that identifies whether a unit was hit by a macaw or struck by a macaw.

Creating "Two outpatient or one inpatient"

This example demonstrates:

a common feature used in studies of medical claims data
using a template to define a feature building function

This example defines a feature that indicates either:

at least 1 event during the baseline interval has a tag from the tag1 tag set
there are at least 2 events that have tag set tag2 which have at least 7 days between them during the baseline interval

twoOutOneIn
  :: (IntervalSizeable a b)
  => [Text] -- ^ inpatientTags
  -> [Text] -- ^ outpatientTags
  -> Definition (1)
       (  Feature "index" (Interval a)
       -> Feature "allEvents" [Event Text ExampleModel a]
       -> Feature name Bool
       )
twoOutOneIn inpatientTags outpatientTags = buildNofXOrMofYWithGapBool (2)
  1
  (containsTag inpatientTags) (3)
  1
  7
  (containsTag outpatientTags) (4)
  concur
  (makeBaselineMeetsIndex 10) (5)

1	The `twoOutOneIn` function returns a `Definition`.
2	We use the `buildNofXOrMofYWithGapBool` template function to build our definition. This function takes seven arguments.
3	The first two are passed to the `buildNofX` template. The given arguments say that we’re looking for at least `1` event that contains one or more of the `inpatientTagSet`.
4	The next three arguments are passed to the `buildNofXWithGap` template. The given arguments say that we’re looking for at least `1` gap between any pair of events (and thus at least 2 events) that contains one or more of the `outpatientTagSet`.
5	The last two arguments determine when the events must occur relative to the index event. Here, the events must `concur` with a baseline assessment interval.

Count number of events

This example demonstrates:

using the AssessmentInterval type
using the combineIntervals function
counting the number of events satifying a condition

This example defines a function that takes an AssessmentInterval and a list of ExampleModel events to return a pair: (count of hospitalization events, duration of the last hospitalization).

countOfHospitalEvents
  :: (IntervalSizeable a b)
  => AssessmentInterval a
  -> [Event Text ExampleModel a]
  -> (Int, Maybe b)
countOfHospitalEvents i =
  filterEvents (containsTag ["wasHospitalized"]) (1)
    .> combineIntervals (2)
    .> filterConcur i (3)
    .> (\x -> (length x, duration <$> lastMay x)) (4)

Consider the follow events as a working example:

     **********      <- [assessment]
 ---                 <- [wasHospitalized]
    --               <- [wasHospitalized]
        --           <- [notHospitalized]
          -----      <- [wasHospitalized]
====================

1	As a first step, events are filtered to those satisfying the predicate of interest, In this example, events are filtered to those that contain the tag `wasHospitalized`: `********** <- [assessment] --- <- [wasHospitalized] -- <- [wasHospitalized] ----- <- [wasHospitalized] ====================`
2	The `combineIntervals` function from the `interval-algebra` package combines intervals that are not `before` or `after`. As in our example, this step can be important to combine intervals that we consider to be a single event. In the example, the first and second events would be joined into one event. `********** <- [assessment] ----- <- [wasHospitalized] ----- <- [wasHospitalized] ====================`
3	After combining the intervals, then the intervals are filtered to those not disjoint from the assessment interval. This step includes all hospitalization intervals in our running example. `********** <- [assessment] ----- <- [wasHospitalized] ----- <- [wasHospitalized] ====================`
4	Lastly, the result is derived from remaining hospitalization intervals. The example result is `(2, Just 5)` since there are 2 intervals and the duration of the last one is 5.

The function presented here is one of many ways to filter and count intervals. For example, the current function includes hospitalizations that overlap the assessment interval. If one wanted to filter out such hospitalizations, the filterConcur i could be changed to filter (not . (disjoint <|> overlaps) i).

Another consideration is the duration measurement. The current function measurement the duration of the last hospitalization interval, disregarding the assessment interval. One may instead want to measure the duration that concurs with the assessment.

The countOfHospitalEvents function can be lifted into a Definition using define:

countOfHospitalEventsDef
  :: (IntervalSizeable a b)
  => Definition
       (  Feature "index" (AssessmentInterval a)
       -> Feature "events" [Event Text ExampleModel a]
       -> Feature "count of hospitalizations" (Int, Maybe b)
       )
countOfHospitalEventsDef = define countOfHospitalEvents

Discontinuation from a Drug

This example demonstrates:

complex interval-algebra functionality
use of the bind operator (>>=)

In this example, the goal is to write a function that, given an assessment interval and list of events:

filters to antibiotic events
allows for a gap of 5 days between antibiotic events
only allow for treatment sequences that are started or overlapped by the assessment interval
returns the time discontinuation begins and the time since the beginning of the assessment interval to discontinuation.

For this example, we walkthrough three cases.

Case 1

     **********      <- [assessment]
 ---                 <- [tookAntibiotics]
    --               <- [tookAntibiotics]
        --           <- [wasHopitalized]
          -----      <- [tookAntibiotics]
====================

Case 2

     **********      <- [assessment]
      --             <- [tookAntibiotics]
          -----      <- [tookAntibiotics]
====================

Case 3

     **********      <- [assessment]
     ---             <- [tookAntibiotics]
====================

The logic of the feature is defined in the discontinuation function:

discontinuation
  :: (IntervalSizeable a b)
  => AssessmentInterval a
  -> [Event Text ExampleModel a]
  -> Maybe (a, b)
discontinuation i events =
  events
    |> filterEvents (containsTag ["tookAntibiotics"]) (1)
    |> fmap (expandr 5) (2)
    |> combineIntervals (3)
    |> nothingIfNone (startedBy <|> overlappedBy $ i) (4)
    |> (>>= gapsWithin i) (5)
    |> (>>= headMay) (6)
    |> fmap (\x -> (begin x, diff (begin x) (begin i))) (7)

First, we filter to events that have the tag "tookAntibiotics". In Case 1, the third interval is filtered out:

     **********      <- [assessment]
 ---                 <- [tookAntibiotics]
    --               <- [tookAntibiotics]
          -----      <- [tookAntibiotics]
====================

Cases 2 and 3 are unchanged.

To allow for a grace period of 5 days between antibiotic events, each antibiotic event is extended by 5 units using the expandr function: For Case 1, this results in:

     **********      <- [assessment]
 --------            <- [tookAntibiotics]
    -------          <- [tookAntibiotics]
          ---------- <- [tookAntibiotics]
====================

And similarly for Cases 2 and 3.

Antibiotic intervals that concur are considered one treatment sequence, so combineIntervals is used to collapse these intervals. In all the example cases, this results in one interval; e.g. for Case 2:

     **********      <- [assessment]
      -------------- <- [tookAntibiotics]
====================

With all the treatment intervals transformed to allow for a gap in treatment; now we handle the case where none of the intervals start or overlap the assessment interval. The nothingIfNone function takes a predicate and a list and returns Nothing if none of the list elements satisfy the predicate; otherwise, it returns Just the list.

In Cases 1 and 3, the assessment interval is overlappedBy and startedBy (respectively) the treatment interval. However in Case 2, since antibiotic treatment starts after the assessment interval starts, nothingIfNone yields Nothing. This is final result for Case 2

In interval-algebra terminology, the assessment interval in Case 2 overlaps the treatment interval; which is different than being overlappedBy the treatment interval.

So far, we have the treatment interval in hand. We’re interested, though, in discovering gaps in treatment which is considered discontinuation. The gapsWithin function find gaps in the input intervals clipped to the assessment, yielding Nothing if no such gaps exist and Just the gaps otherwise. (See note about >>= below)

Case 1 has no gaps, hence the final result is Nothing. For Case 3, however, there is a gap between the treatment interval and the end of assessment:

     **********      <- [assessment]
             --      <- [gap]
====================

6 If there are multiple gaps in treatment, the first one is the discontinuation of interest.

Finally, provided that a gap in treatment exists, the time of discontinuation is the begin of that gap. The time from the start of assessment to discontinutation is computed by diff (begin x) (begin i).

For Case 2, the final result is Just (13, 8).

As implemented, a Nothing result from discontinuation could either indicate that a subject did not discontinue or that they simply had no antibiotics records. If such a distinction is important, the function could be modified to disambiguate these case using a sum type for example.

The discontinuation function can be lifted into a Definition using define:

discontinuationDef
  :: (IntervalSizeable a b)
  => Definition
       (  Feature "index" (AssessmentInterval a)
       -> Feature "events" [Event Text ExampleModel a]
       -> Feature "discontinuation" (Maybe (a, b))
       )
discontinuationDef = define discontinuation

Using the >>= operator

The >>= comes from Haskell’s Monad typeclass. Sometimes called the bind operator, it has the following type signature:

(>>=) :: m a -> (a -> m b) -> m b

Consider these lines of discontinuation function:

  |> nothingIfNone ( startedBy <|> overlappedBy $ i)
  |> (>>= gapsWithin i)

The type coming out of the nothingIfNone is Maybe [Interval a].
The type for gapsWithin i is [Interval a] → Maybe [Interval a], and we want that to return a Maybe [Interval a].

If you put those pieces together, you have a concrete signature for >>=:

Maybe [Interval a] -> ([Interval a] -> Maybe [Interval a]) -> Maybe [Interval a]

Cohorts

This section provides examples for defining cohorts.

Defining an Index Set

This example demonstrates:

how to create an index set

In this example, index is defined as the first time that a subject was bitten by an Orca (ICD10 codes W56.21/W56.21XA).

defineIndexSet
  :: Ord a
  => [Event Text ExampleModel a] (1)
  -> IndexSet (Interval a) (2)
defineIndexSet events =
  events
    |> filterEvents (containsTag ["wasBitByOrca"]) (3)
    |> headMay (4)
    |> fmap getInterval (5)
    |> into (6)

1	The input type for this example is a list of events, where the tag set is `Text`, the data model is `ExampleModel`, and the interval time is a generic type `a`.
2	The return type is an `Indexset` of `Interval`. The `IndexSet` type is defined in `hasklepias-core` as either `Nothing` or a set of unique ordered values.
3	To determine whether a subject has an index, we filter to the events tagged with the tag `"wasBitByOrca"`.
4	The `headMay` function gets the first event, if one exists. We’re assuming the input list has already been sorted.
5	Then we get the interval, if it exists.
6	The `into` function casts the output from <4> into a `IndexSet (Interval a)` type.

Defining Assessment Intervals

This example demonstrates:

how to create assessment intervals for baseline and followup

In this example, TODO

bline :: (IntervalSizeable a b) => Interval a -> AssessmentInterval a
bline = makeBaselineMeetsIndex 60

flwup :: (IntervalSizeable a b) => Interval a -> AssessmentInterval a
flwup = makeFollowupStartedByIndex 30

Create a minimal cohort

This examples demonstrates:

a bare bones cohort, created without using features or events.
reading csv formatted data

Create a cohort with calendar-based indices

This examples demonstrates:

specifying cohorts from calendar-based indices
using asclepias' cohort module without using its feature module
using an empty return type for the cohort data to just compute attrition information

Review the cohort building checklist TODO: create such a document

Goal

Tha goal in this example is to create a cohort for each quarter of 2017. The cohort should include subjects if they have an enrollment event concurring with the first day of a quarter. For this example,

Decide on the data model

In this example, we use the following data model in our events: TODO

Here we create a type synonym for the the event type in this example

type Evnt = Event Text ExampleModel Day

Create intervals for dates used for indices

indices :: [Interval Day]
indices = map (\(y, m) -> beginervalMoment (fromGregorian y m 1))
              (allPairs [2017] [1, 4, 7, 10])

Define criteria

isEnrollmentEvent :: Predicate Evnt
isEnrollmentEvent = Predicate
  (\x -> case getFacts (getContext x) of
    Enrollment -> True
    _          -> False
  )

Include the subject if she has an enrollment interval concurring with index.

enrolled :: Interval Day -> [Evnt] -> Status
enrolled i es = includeIf . not . null $ filterEvents
  (Predicate (concur i) &&& isEnrollmentEvent)
  es

Write Cohort Specification

A cohort is TODO: link to cohort definition

makeIndexRunner :: Interval Day -> [Evnt] -> IndexSet (Interval Day)
makeIndexRunner i _ = makeIndexSet [i]

makeCriteriaRunner :: Interval Day -> [Evnt] -> Criteria
makeCriteriaRunner index events = criteria [criterion "isEnrolled" crit1]
  where crit1 = enrolled index events

TODO: we could have done this a different way

Create a cohort with multiple indices

TODO

Collect attrition info across partitions

TODO

Create a cohort application and process in AWS batch

TODO

Templates

Features

This section includes description and usage guides for Definition templates. A Definiton is a function that returns a Feature.

`buildNofXBase`: Basis for N of X pattern

Use this function template to:

write another function that answers a question about some count N of events that satisfy a predicate X.

The buildNofXBase template is used a basis for creating new templates with the following pattern:

Filter events to those satisfying two conditions:
- an interval relation with an AssessmentInterval
- a provided Predicate (such as containing a certain tag set)
Preprocess these events.
Process the events.
Postprocess the events, optionally in conjunction with the AssessmentInterval.

Usage and Examples

example = buildNofXBase combineIntervals (1)
                        (fmap end) (2)
                        (fmap . diff . begin) (3)

The example function above returns another definiton builder that performs this logic:

1	combine the intervals of the input events (collapsing concurring and meeting intervals);
2	get the `end` of each interval;
3	computes the difference from each `end` to the `begin` of the assessment interval.

To then be fully specified as a Definition and used in a project, the example function needs 3 additional inputs:

a function mapping the index interval to an assessment interval.
a predicate function comparing events to the assessment interval.
another predicate function on the events.

For example, the defBaseline180Enrollment below is a Definition that performs the logic of example.

defBaseline180Enrollment = example (makeBaselineMeetsIndex 180) (1)
                                   concur (2)
                                   (containsTag ["enrollment"]) (3)

1 Create a baseline interval from the index to 180 units (e.g. days) back in time.

2 Filter to events that concur with the baseline interval and

contains the tag "enrollment".

This example assumes the Tag type is Text.

Source code

View source code

buildNofXBase
  :: ( Intervallic i0
     , Intervallic i1
     , Witherable container0
     , Witherable container1
     )
  => (container0 (Event t m a) -> container1 (i1 a)) -- ^ function mapping a container of events to a container of intervallic intervals (which could be events!)
  -> (container1 (i1 a) -> t1) -- ^ function mapping the processed events to an intermediate type
  -> (AssessmentInterval a -> t1 -> outputType) -- ^ function casting intermediate type to output type with the option to use the assessment interval
  -> (i0 a -> AssessmentInterval a) -- ^ function which maps index interval to interval in which to assess the feature
  -> ComparativePredicateOf2 (AssessmentInterval a) (Event t m a) -- ^ the interval relation of the input events to the assessment interval
  -> Predicate (Event t m a) -- ^ The predicate to filter to Enrollment events (e.g. 'FeatureEvents.isEnrollment')
  -> Definition
       (  Feature indexName (i0 a)
       -> Feature eventsName (container0 (Event t m a))
       -> Feature varName outputType
       )
buildNofXBase runPreProcess runProcess runPostProcess makeAssessmentInterval relation predicate
  = define
    (\index ->
      -- filter events to those satisfying both
      -- the given relation to the assessment interval
      -- AND the given predicate
      filterEvents
          (Predicate (relation (makeAssessmentInterval index)) &&& predicate)
      -- run the preprocessing function
        .> runPreProcess
      -- run the processing function
        .> runProcess
      -- run the postprocessing function
        .> runPostProcess (makeAssessmentInterval index)
    )

`buildNofX`: Do N events satisfy a predicate X?

Use this template to create a Definition for a Feature that answers the following question:

Do N events relating to the assessment interval in some way satisfy the given predicate?

Usage and Examples

TODO

Specialized Versions

buildNofXBool: specialized to return Bool.
buildNofXBinary: specialized to return a stype Binary value.
buildNofXBinaryConcurBaseline: specialized to filter to events that concur with an assessment interval. created by makeBaselineMeetsIndex of a specified duration and a provided predicate.
buildNofTagSetBinaryConcurBaseline: specialized to filter to events that concur with an assessment interval created by makeBaselineMeetsIndex of a specified duration and that have a given tag set.

Source code

View source code

buildNofX
  :: (Intervallic i, Witherable container)
  => (Bool -> outputType) -- ^ casting function
  -> Int -- ^ minimum number of cases
  -> (i a -> AssessmentInterval a) -- ^ function to transform a 'Cohort.Index' to an 'Cohort.AssessmentInterval'
  -> ComparativePredicateOf2 (AssessmentInterval a) (Event t m a) -- ^ interval predicate
  -> Predicate (Event t m a) -- ^ a predicate on events
  -> Definition
       (  Feature indexName (i a)
       -> Feature eventsName (container (Event t m a))
       -> Feature varName outputType
       )
buildNofX f n = buildNofXBase id (\x -> length x >= n) (const f)

`buildNofUniqueBegins`: Find the begin of all unique N events

Use this template to create a Definition for a Feature that:

filters a list of events to those satisfying both a given predicate and a relate to the assessment interval in the given way;
returns the a list of pairs (b, i) where
- b is the difference between the begin of each unique event and the given assessment interval
- i is a counter starting from 1

Usage and Examples

TODO

Source code

View source code

buildNofUniqueBegins
  :: (Intervallic i, IntervalSizeable a b, Witherable container)
  => (i a -> AssessmentInterval a) -- ^ function to transform a 'Cohort.Index' to an 'Cohort.AssessmentInterval'
  -> ComparativePredicateOf2 (AssessmentInterval a) (Event t m a) -- ^ interval predicate
  -> Predicate (Event t m a) -- ^ a predicate on events
  -> Definition
       (  Feature indexName (i a)
       -> Feature eventsName (container (Event t m a))
       -> Feature varName [(b, Natural)]
       )
buildNofUniqueBegins = buildNofXBase
  (fmap (momentize . getInterval))
  (fmap (, 1 :: Natural) .> F.toList .> M.fromList .> M.toList .> \x ->
    uncurry zip (fmap (scanl1 (+)) (unzip x))
  )
  (\window -> fmap (\i -> (diff (begin (fst i)) (begin window), snd i)))

`buildNofXWithGap`: Do events have a certain gap between them?

Use this template to create a Definition for a Feature that answers:

Are there N gaps of at least the given duration between any pair of events that relate to the assessment interval by the given relation and the satisfy the given predicate?

Find two outpatient events separated by at least 7 days is an example.

Usage and Examples

TODO

Source code

View source code

buildNofXWithGap
  :: ( Intervallic i
     , IntervalSizeable a b
     , IntervalCombinable i a
     , Witherable container
     )
  => (Bool -> outputType)
  -> Int -- ^ the minimum number of gaps
  -> b -- ^ the minimum duration of a gap
  -> (i a -> AssessmentInterval a)
  -> ComparativePredicateOf2 (AssessmentInterval a) (Event t m a)
  -> Predicate (Event t m a)
  -> Definition
       (  Feature indexName (i a)
       -> Feature eventsName (container (Event t m a))
       -> Feature varName outputType
       )
buildNofXWithGap cast nGaps allowableGap = buildNofXBase
  (-- just need the intervals
   fmap getInterval
   -- pairGaps needs List input as the container type
                    .> toList)
  (-- get (Maybe) durations of interval gaps between all pairs
     pairGaps
   -- throw away any non-gaps
  .> catMaybes
   -- keep only those gap durations at least the allowableGap
  .> F.filter (>= allowableGap)
   -- are there at least as many events as desired?
  .> \x -> length x >= nGaps
  )
  (const cast)

`buildNofXOrNofYWithGap`: Is either `buildNofX` or `buildNofXWithGap` satisfied?

Use this template to create a Definition for a Feature that answers:

Do N events satisfy predicate X?
OR are there M gaps of at least the given duration between any pair of events that relate to the assessment interval by the given relation and the satisfy the given predicate Y?

Find two outpatient events separated by at least 7 days or one inpatient event is an example.

Usage and Examples

TODO

Specialized Versions

TODO

Source code

View source code

buildNofXOrMofYWithGap
  :: ( Intervallic i
     , IntervalSizeable a b
     , IntervalCombinable i a
     , Witherable container
     )
  => (outputType -> outputType -> outputType)
  -> (Bool -> outputType)
  -> Int -- ^ count passed to 'buildNofX'
  -> Predicate (Event t m a) -- ^ predicate for 'buildNofX'
  -> Int -- ^ the minimum number of gaps passed to 'buildNofXWithGap'
  -> b -- ^ the minimum duration of a gap passed to 'buildNofXWithGap'
  -> Predicate (Event t m a) -- ^ predicate for 'buildNofXWithGap'
  -> ComparativePredicateOf2
       (AssessmentInterval a)
       (Event t m a)
  -> (i a -> AssessmentInterval a)
  -> Definition
       (  Feature indexName (i a)
       -> Feature
            eventsName
            (container (Event t m a))
       -> Feature varName outputType
       )
buildNofXOrMofYWithGap f cast xCount xPred gapCount gapDuration yPred intervalPred assess
  = D2C f
        (buildNofX cast xCount assess intervalPred xPred)
        (buildNofXWithGap cast gapCount gapDuration assess intervalPred yPred)

`buildIsEnrolled`: Does an enrollment event concur with index?

Use this template to create a Definition of a Feature for a Status where:

you have a predicate for identifying enrollment events;
you want to know whether a subject was enrolled (in a health plan, e.g.) at an index time;
the result will be used as inclusion/exclusion status.

Usage and Examples

TODO

Source code

View source code

buildIsEnrolled
  :: ( Intervallic i0
     , Ord a
     , Monoid (container (Interval a))
     , Applicative container
     , Witherable container
     )
  => Predicate (Event t m a) -- ^ The predicate to filter to Enrollment events (e.g. 'FeatureEvents.isEnrollment')
  -> Definition
       (  Feature indexName (i0 a)
       -> Feature eventsName (container (Event t m a))
       -> Feature varName Status
       )
buildIsEnrolled predicate = define
  (\index ->
    F.filterEvents predicate
      .> combineIntervals
      .> any (concur index)
      .> includeIf
  )

`buildContinuousEnrollment`: Does a sequence of enrollment events continuously occur?

Use this template to create a Definition of a Feature for a "continuous enrollment" Status where:

you have a predicate for identifying enrollment events;
you want to know whether a subject was enrolled (in a health plan, e.g.) during some assessment interval with the possibility for an allowable gap in enrollment;
the result will be used as inclusion/exclusion status.

Usage and Examples

TODO

Source code

View source code

buildContinuousEnrollment
  :: ( Monoid (container (Interval a))
     , Monoid (container (Maybe (Interval a)))
     , Applicative container
     , Witherable container
     , IntervalSizeable a b
     )
  => (i0 a -> AssessmentInterval a) -- ^ function which maps index interval to interval in which to assess enrollment
  -> Predicate (Event t m a)  -- ^ The predicate to filter to events (e.g. 'FeatureEvents.isEnrollment')
  -> b  -- ^ duration of allowable gap between intervals
  ->
    {- tag::templateDefSig0 [] -}
     Definition
       (  Feature indexName (i0 a)
       -> Feature eventsName (container (Event t m a))
       -> Feature prevName Status
       -> Feature varName Status
       )
    {- end::templateDefSig0 [] -}
buildContinuousEnrollment makeAssessmentInterval predicate allowableGap =
  define
    (\index events prevStatus -> case prevStatus of
      Exclude -> Exclude
      Include -> includeIf
        (makeGapsWithinPredicate
          all
          (<)
          allowableGap
          (makeAssessmentInterval index)
          (combineIntervals $ F.filterEvents predicate events)
        )
    )

Cohorts

This section includes description and usage guides for cohort specification templates.

References

Lawvere Theory: https://bartoszmilewski.com/2017/08/26/lawvere-theories/

asclepias GitLab: https://gitlab.novisci.com/nsStat/asclepias

Haskell Setup: nsBuild site

Event Data Model Documentation: xref:event-data:ROOT:index.adoc

The asclepias User Guide

Getting Started

Setting Up Software Tools

Data Requirements

Instructions

1. Define Fact-Model

2. Set Up ETL Process

3. Initialize Project Repo

4. Create Cohort Application

a. Import Fact-Model

b. Defining Input Data Shape

c. Defining Output Data Shape

d. Defining an IndexSet

e. Define Tags

f. Define Criteria

g. Create Features

h. Define Cohort Specification

5. Test Cohort Application

Unit Tests

Guidelines

Example

Notes on Generate Input Data

Integration Test

Other Tests

6. Run Cohort Application

Examples

Features

Find the last event that occurs within a time window of other events

Find durations of time that satisfy multiple conditions

Create a function for identifying whether a unit has a history of some event

Creating "Two outpatient or one inpatient"

Count number of events

Discontinuation from a Drug

Cohorts

Defining an Index Set

Defining Assessment Intervals

Create a minimal cohort

Create a cohort with calendar-based indices

Goal

Decide on the data model

Create intervals for dates used for indices

Define criteria

Write Cohort Specification

Create a cohort with multiple indices

Collect attrition info across partitions

Create a cohort application and process in AWS batch

Templates

Features

buildNofXBase: Basis for N of X pattern

Usage and Examples

Source code

buildNofX: Do N events satisfy a predicate X?

Usage and Examples

Specialized Versions

Source code

buildNofUniqueBegins: Find the begin of all unique N events

Usage and Examples

Source code

buildNofXWithGap: Do events have a certain gap between them?

Usage and Examples

Source code

buildNofXOrNofYWithGap: Is either buildNofX or buildNofXWithGap satisfied?

Usage and Examples

Specialized Versions

Source code

buildIsEnrolled: Does an enrollment event concur with index?

Usage and Examples

Source code

buildContinuousEnrollment: Does a sequence of enrollment events continuously occur?

Usage and Examples

Source code

Cohorts

References

The `asclepias` User Guide

`buildNofXBase`: Basis for N of X pattern

`buildNofX`: Do N events satisfy a predicate X?

`buildNofUniqueBegins`: Find the begin of all unique N events

`buildNofXWithGap`: Do events have a certain gap between them?

`buildNofXOrNofYWithGap`: Is either `buildNofX` or `buildNofXWithGap` satisfied?

`buildIsEnrolled`: Does an enrollment event concur with index?

`buildContinuousEnrollment`: Does a sequence of enrollment events continuously occur?