Defining Features in Asclepias

For more explanation on the theory behind features, see Design of the Features module.

Find durations of time that satisfy multiple conditions

This example demonstrates:

  • the formMeetingSequence function from interval-algebra

  • handling a failure case

  • writing a function generic over both the concept and interval types

In this example, the goal is to write a function that, given a list of concepts, converts a list of events into a list of interval durations such that:

  • the events with any of the given concepts are combined into a "meeting sequence";

  • durations of events of the resulting sequence which have all of the given concepts are returned;

  • but an empty result is treated as a failure.

A function like this could be useful if you wanted to find the durations of time when a subject was both hospitalized and on some medication.
durationsOf
  :: forall n m c a b
   . (KnownSymbol n, Eventable c m a, IntervalSizeable a b)
  => [c]
  -> [Event c m a]
  -> Feature n [b]
durationsOf cpts =
  filter (`hasAnyConcepts` cpts) (1)
    .> fmap (into @(ConceptsInterval c a)) (2) (3)
    .> formMeetingSequence (4)
    .> filter (`hasAllConcepts` cpts) (5)
    .> \x -> if null x (6)
         then makeFeature $ featureDataL $ Other "no cases"
         else makeFeature $ featureDataR (durations x)

Take the case that a subject has the following events, and we want to know the duration that a subject was both hospitalized and on antibiotics. Below, we walk through the function step by step using this case.

   --                          <- [Non-medication]
      ----                     <- [Hospitalized]
       --                      <- [Antibiotics]
            ----               <- [Antibiotics]
------------------------------
1 Filter events to those that contain at least one of the given concepts.
      ----                     <- [Hospitalized]
       --                      <- [Antibiotics]
            ----               <- [Antibiotics]
------------------------------
2 Cast each event into a ConceptsInterval c a, which is a synonym for PairedInterval (Concepts c) a.
3 This step is important for the formMeetingSequence function, as it requires the "data" part of the paired interval to be a Monoid. Concepts are a Monoid by unioning the elements of two values.
4 Form a sequence of intervals where one meets the next. The data of the running example would look like:
      -                        <- [Hospitalized]
       --                      <- [Hospitalized, Antibiotics]
         -                     <- [Hospitalized]
          --                   <- []
            ----               <- [Antibiotics]
------------------------------
5 Filter to those intervals that have both of the given concepts. Note that hasAllConcepts works here because PairedInterval (Concepts c) a is defined as an instance of the HasConcepts typeclass in event-data-theory.
       --                      <- [Hospitalized, Antibiotics]
------------------------------
6 Lastly, if the result of the previous step is empty, we return a failure, i.e. a Left value of FeatureData. Otherwise, we return the durations of any intervals, as a successful Right value of FeatureData.
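Steps 5 and 6 can be sketched without the library, as a rough illustration only. The sketch below assumes a hypothetical Labeled type (a plain (begin, end) pair with a list of concept names) in place of PairedInterval (Concepts c) a, and uses Either String in place of FeatureData. None of these names come from asclepias.

```haskell
-- Hypothetical stand-in for a paired interval: (begin, end) plus concept labels.
type Labeled = ((Int, Int), [String])

-- True when every requested concept appears among the interval's labels.
hasAll :: [String] -> Labeled -> Bool
hasAll cpts (_, labels) = all (`elem` labels) cpts

-- Steps 5 and 6: keep intervals carrying all concepts,
-- then treat an empty result as a failure (Left).
durationsOrFail :: [String] -> [Labeled] -> Either String [Int]
durationsOrFail cpts xs =
  case filter (hasAll cpts) xs of
    [] -> Left "no cases"
    ys -> Right [e - b | ((b, e), _) <- ys]

-- The meeting sequence from step 4 of the running example,
-- with coordinates read off the diagram.
runningExample :: [Labeled]
runningExample =
  [ ((6, 7), ["Hospitalized"])
  , ((7, 9), ["Hospitalized", "Antibiotics"])
  , ((9, 10), ["Hospitalized"])
  , ((10, 12), [])
  , ((12, 16), ["Antibiotics"])
  ]
```

Under these assumptions, durationsOrFail ["Hospitalized", "Antibiotics"] runningExample evaluates to Right [2], while asking for a concept that never appears in the sequence yields Left "no cases".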

The durationsOf function can be lifted into a Definition using defineA:

def
  :: (KnownSymbol n1, KnownSymbol n2, Eventable c m a, IntervalSizeable a b)
  => [c] (1)
  -> Def (F n1 [Event c m a] -> F n2 [b]) (2)
def cpts = defineA (durationsOf cpts)
1 The function takes a list of concepts and
2 returns a Definition.

Find the last event that occurs within a time window of other events

This example demonstrates:

  • reasoning with the interval algebra

  • manipulating intervals

  • using concepts to group events

In this example, the goal is to write a function that, given a pair of lists of concepts and an interval of time:

  • filters an input list of events to those that concur with the given interval. Note that concur, in this context, means that the intervals are not disjoint.

  • splits the events into those with the first concepts and those with the second

  • returns the start of the last event of the first set of concepts where it occurs within +/- 3 time units of an event of the second set of concepts.

A function like this is useful for defining an index event where the index needs to concur with a time window of other events.
examplePairComparison
  :: (Eventable c m a, IntervalSizeable a b)
  => ([c], [c])
  -> Interval a
  -> [Event c m a]
  -> Maybe a
examplePairComparison (c1, c2) i =
  filterConcur i (1)
    .> splitByConcepts c1 c2 (2)
    .> uncurry allPairs (3)
    .> filter (\pr -> fst pr `concur` expand 3 3 (snd pr)) (4)
    .> lastMay (5)
    .> fmap (begin . fst) (6)

Take the case that a subject has the following events, and we want to know the last time a diagnosis occurred within +/- 3 days of a procedure. Our given interval, called Baseline here, is (6, 15). Below, we walk through the function step by step using this case.

      ---------                <- Baseline
    -                          <- [pr]
      -                        <- [pr]
          -                    <- [dx]
            -                  <- [pr]
            ----               <- [foo]
------------------------------
1 Filter events to those concurring with the given interval.
      ---------                <- Baseline
      -                        <- [pr]
          -                    <- [dx]
            -                  <- [pr]
            ----               <- [foo]
------------------------------
2 Form a pair of lists where the first element has c1 (dx in our example) event intervals and the second has c2 (pr in our example) event intervals. Any events without c1 or c2 concepts are dropped. In the running example, the intervals of the events would make the following pair:
( [(10,11)] -- the dx event interval
, [(6,7), (12,13)] -- the pr event intervals
)
3 Form a list of all (c1, c2) pairs of event intervals from the previous step.
[ ( (10,11), (6,7) )
, ( (10,11), (12,13) )
]
4 Expand the c2 (pr) event intervals by +/- 3 units of time.
[ ( (10,11), (3,10) )
, ( (10,11), (9,16) )
]

Then, filter this list to include only the pairs where the c1 (dx) interval concurs with the expanded c2 (pr) interval.

[ ( (10,11), (9,16) ) ]
5 Take the last element of the list as a Just value, if it exists. Otherwise, the result is Nothing.
6 If it exists, take the begin of the last c1 interval. In our example, this is Just 10.
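The walkthrough above can be reproduced with plain tuples. This is only a sketch: Iv, expandIv, concurIv, lastMaybe, and lastWithinWindow are hypothetical stand-ins written for illustration, not the interval-algebra functions themselves, and the input lists are assumed to be already filtered and split as in steps 1 and 2.

```haskell
type Iv = (Int, Int)  -- hypothetical stand-in: (begin, end)

-- Widen an interval by l units on the left and r units on the right.
expandIv :: Int -> Int -> Iv -> Iv
expandIv l r (b, e) = (b - l, e + r)

-- Two intervals concur when they share more than a boundary point
-- (intervals that merely meet are treated as disjoint).
concurIv :: Iv -> Iv -> Bool
concurIv (b1, e1) (b2, e2) = b1 < e2 && b2 < e1

-- Last element of a list, if there is one.
lastMaybe :: [a] -> Maybe a
lastMaybe [] = Nothing
lastMaybe xs = Just (last xs)

-- Steps 3 to 6: pair every dx with every pr, keep the pairs where the dx
-- concurs with the pr widened by 3 units, and return the begin of the
-- dx in the last such pair.
lastWithinWindow :: [Iv] -> [Iv] -> Maybe Int
lastWithinWindow dxs prs =
  fmap (fst . fst) . lastMaybe $
    [ (dx, pr) | dx <- dxs, pr <- prs, dx `concurIv` expandIv 3 3 pr ]
```

With the intervals from step 2, lastWithinWindow [(10, 11)] [(6, 7), (12, 13)] evaluates to Just 10: the pair with (6, 7) is dropped because the expanded interval (3, 10) only meets (10, 11).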

Lastly, the example function can be lifted into a Definition using the define function:

def
  :: (Eventable c m a, IntervalSizeable a b)
  => ([c], [c])
  -> Def (F n1 (Interval a) -> F n2 [Event c m a] -> F n3 (Maybe a))
def cpts = define (examplePairComparison cpts)

Create a function for identifying whether a unit has a history of some event

This example demonstrates:

  • a simple feature

  • writing a function in order to create multiple Feature definitions

Epidemiologic studies often seek to determine whether and when some event occurred. In general, the event logic can be quite complicated, but this example demonstrates a simple feature. We wish to determine whether an event of some given concepts occurred, relative to a provided assessment interval.

The function is given here:

makeHx
  :: (Ord a)
  => [Text] (1)
  -> AssessmentInterval a
  -> [Event Text ExampleModel a]
  -> Maybe (Interval a) (2)
makeHx cpts i events =
  events
    |> filterEvents (containsConcepts cpts &&& Predicate (enclose i)) (3)
    |> lastMay (4)
    |> fmap getInterval (5)
1 The example events use Text as the type of concepts, so the first argument is a list of Text values that will be used to filter events.
2 The return type is Maybe (Interval a). A value of Nothing indicates that no event of interest occurred. If one or more events occur, a value of Just < some interval > is the interval of the last event.
3 The first step in the function is to filter events to those that contain at least one of the given concepts and satisfy an interval relation relative to the assessment interval. For this example, we use the enclose relation, meaning the event must not overlap either end of the assessment interval.
4 The lastMay function returns the last element of a list, if the list is not empty.
5 Lastly, getInterval gets the interval component from the event. The fmap function is necessary to apply the function to a Maybe (Event Text ExampleModel a).
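The same three steps can be sketched with plain tuples. The types Iv and Ev and the functions encloses and history are hypothetical stand-ins for illustration. In particular, encloses assumes that an interval sharing an endpoint with the window still counts as enclosed.

```haskell
type Iv = (Int, Int)      -- hypothetical stand-in: (begin, end)
type Ev = (Iv, [String])  -- hypothetical event: interval plus concept names

-- Assumption: the window encloses an interval lying entirely inside it,
-- shared endpoints included.
encloses :: Iv -> Iv -> Bool
encloses (b1, e1) (b2, e2) = b1 <= b2 && e2 <= e1

-- Filter by concept and by the enclose relation, then take the
-- interval of the last remaining event, if any.
history :: [String] -> Iv -> [Ev] -> Maybe Iv
history cpts window evs =
  case [ iv | (iv, labels) <- evs
            , any (`elem` cpts) labels
            , window `encloses` iv ] of
    [] -> Nothing
    xs -> Just (last xs)
```

For instance, with a window of (0, 10) and duck events at (1, 2), (5, 6), and (9, 12), the result is Just (5, 6), because the event at (9, 12) extends past the end of the window.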

With the makeHx function, we can create feature definitions:

duckHxDef (1)
  :: (Ord a)
  => Definition
       (  Feature "index" (AssessmentInterval a)
       -> Feature "events" [Event Text ExampleModel a]
       -> Feature "duck history" (Maybe (Interval a))
       )
duckHxDef = define (makeHx ["wasBitByDuck", "wasStruckByDuck"])

macawHxDef (2)
  :: (Ord a)
  => Definition
       (  Feature "index" (AssessmentInterval a)
       -> Feature "events" [Event Text ExampleModel a]
       -> Feature "macaw history" (Maybe (Interval a))
       )
macawHxDef = define (makeHx ["wasBitByMacaw", "wasStruckByMacaw"])
1 Defines a feature that identifies whether a unit was bitten by a duck or struck by a duck.
2 Defines a feature that identifies whether a unit was bitten by a macaw or struck by a macaw.

Creating "Two outpatient or one inpatient"

This example demonstrates:

  • a common feature used in studies of medical claims data

  • using a template to define a feature building function

This example defines a feature that indicates either:

  • at least 1 event during the baseline interval has any of the inpatientConcepts, or

  • at least 2 events during the baseline interval have any of the outpatientConcepts, with at least 7 days between them

twoOutOneIn
  :: (IntervalSizeable a b)
  => [Text] -- ^ inpatientConcepts
  -> [Text] -- ^ outpatientConcepts 
  -> Definition (1)
       (  Feature "index" (Interval a)
       -> Feature "allEvents" [Event Text ExampleModel a]
       -> Feature name Bool
       )
twoOutOneIn inpatientConcepts outpatientConcepts = buildNofXOrMofYWithGapBool (2)
  1
  (containsConcepts inpatientConcepts) (3)
  1
  7
  (containsConcepts outpatientConcepts) (4)
  concur
  (makeBaselineMeetsIndex 10) (5)
1 The twoOutOneIn function returns a Definition.
2 We use the buildNofXOrMofYWithGapBool template function to build our definition. This function takes seven arguments.
3 The first two are passed to the buildNofX template. The given arguments say that we’re looking for at least 1 event that contains one or more of the inpatientConcepts.
4 The next three arguments are passed to the buildNofXWithGap template. The given arguments say that we’re looking for at least 1 gap of at least 7 days between any pair of events (and thus at least 2 events) that contain one or more of the outpatientConcepts.
5 The last two arguments determine when the events must occur relative to the index event. Here, the events must concur with a baseline assessment interval.
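The logic that the template encodes can be sketched directly, as a rough illustration. The sketch assumes the events have already been restricted to the baseline interval, and it assumes a gap is measured from the end of the earlier event to the begin of the later one. The real template may measure it differently. All names here are hypothetical.

```haskell
type Iv = (Int, Int)      -- hypothetical stand-in: (begin, end)
type Ev = (Iv, [String])  -- hypothetical event: interval plus concept names

hasAny :: [String] -> Ev -> Bool
hasAny cpts (_, labels) = any (`elem` cpts) labels

-- Assumption: gap = begin of the later interval minus end of the earlier one.
gapAtLeast :: Int -> Iv -> Iv -> Bool
gapAtLeast g (_, e1) (b2, _) = b2 - e1 >= g

-- True with at least 1 inpatient event, or with at least one pair of
-- outpatient events separated by 7 or more units.
twoOutOrOneIn :: [String] -> [String] -> [Ev] -> Bool
twoOutOrOneIn inCpts outCpts evs = oneIn || twoOut
  where
    oneIn  = any (hasAny inCpts) evs
    outs   = [ iv | ev@(iv, _) <- evs, hasAny outCpts ev ]
    twoOut = or [ gapAtLeast 7 x y | x <- outs, y <- outs ]
```

Two outpatient events at (1, 2) and (10, 11) satisfy the condition (the gap is 8), whereas events at (1, 2) and (5, 6) do not.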

Count number of events

This example demonstrates:

  • using the AssessmentInterval type

  • using the combineIntervals function

  • counting the number of events satisfying a condition

This example defines a function that takes an AssessmentInterval and a list of ExampleModel events to return a pair: (count of hospitalization events, duration of the last hospitalization).

countOfHospitalEvents
  :: (IntervalSizeable a b)
  => AssessmentInterval a
  -> [Event Text ExampleModel a]
  -> (Int, Maybe b)
countOfHospitalEvents i =
  filterEvents (containsConcepts ["wasHospitalized"]) (1)
    .> combineIntervals (2)
    .> filterConcur i (3)
    .> (\x -> (length x, duration <$> lastMay x)) (4)

Consider the following events as a working example:

     **********      <- [assessment]
 ---                 <- [wasHospitalized]
    --               <- [wasHospitalized]
        --           <- [notHospitalized]
          -----      <- [wasHospitalized]
====================
1 As a first step, events are filtered to those satisfying the predicate of interest. In this example, events are filtered to those that contain the concept wasHospitalized:
     **********      <- [assessment]
 ---                 <- [wasHospitalized]
    --               <- [wasHospitalized]
          -----      <- [wasHospitalized]
====================
2 The combineIntervals function from the interval-algebra package combines intervals that are not before or after one another. As in our example, this step can be important for combining intervals that we consider to be a single event. In the example, the first and second events are joined into one event.
     **********      <- [assessment]
 -----               <- [wasHospitalized]
          -----      <- [wasHospitalized]
====================
3 After combining the intervals, the intervals are filtered to those not disjoint from the assessment interval. This step includes all hospitalization intervals in our running example.
     **********      <- [assessment]
 -----               <- [wasHospitalized]
          -----      <- [wasHospitalized]
====================
4 Lastly, the result is derived from remaining hospitalization intervals. The example result is (2, Just 5) since there are 2 intervals and the duration of the last one is 5.
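The four steps can be checked with a small tuple-based sketch. The functions combine, concurIv, and countSketch are hypothetical stand-ins written for illustration (combine only approximates combineIntervals by merging intervals that meet or share support), and the input is assumed to contain only the hospitalization intervals from step 1.

```haskell
import Data.List (foldl', sort)

type Iv = (Int, Int)  -- hypothetical stand-in: (begin, end)

-- Merge intervals that meet or share support. After sorting by begin,
-- each interval either extends the most recent merged interval or
-- starts a new one.
combine :: [Iv] -> [Iv]
combine = reverse . foldl' step [] . sort
  where
    step ((b1, e1) : done) (b2, e2)
      | b2 <= e1 = (b1, max e1 e2) : done
    step done iv = iv : done

-- Two intervals concur when they share more than a boundary point.
concurIv :: Iv -> Iv -> Bool
concurIv (b1, e1) (b2, e2) = b1 < e2 && b2 < e1

-- Steps 2 to 4: combine, keep intervals concurring with the window,
-- then report the count and the duration of the last interval.
countSketch :: Iv -> [Iv] -> (Int, Maybe Int)
countSketch window hospitalizations = (length kept, lastDuration)
  where
    kept = filter (concurIv window) (combine hospitalizations)
    lastDuration = case kept of
      [] -> Nothing
      _  -> let (b, e) = last kept in Just (e - b)
```

Reading the coordinates off the diagrams, countSketch (5, 15) [(1, 4), (4, 6), (10, 15)] evaluates to (2, Just 5), matching the result above.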

The function presented here is one of many ways to filter and count intervals. For example, the current function includes hospitalizations that overlap the assessment interval. To filter out such hospitalizations, the filterConcur i step could be changed to filter (not . (disjoint <|> overlaps) i).

Another consideration is the duration measurement. The current function measures the duration of the last hospitalization interval, disregarding the assessment interval. One may instead want to measure the duration that concurs with the assessment.
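One way to measure only the part of an interval that concurs with the assessment is to clip it to the window first. The following is a minimal sketch on plain (begin, end) tuples. The name clippedDuration is hypothetical and not part of interval-algebra.

```haskell
type Iv = (Int, Int)  -- hypothetical stand-in: (begin, end)

-- Duration of the part of an interval that falls inside the window,
-- or Nothing when the two share no support.
clippedDuration :: Iv -> Iv -> Maybe Int
clippedDuration (wb, we) (b, e)
  | lo < hi   = Just (hi - lo)
  | otherwise = Nothing
  where
    lo = max wb b  -- later of the two begins
    hi = min we e  -- earlier of the two ends
```

In the running example, clippedDuration (5, 15) (1, 6) is Just 1 for the first combined hospitalization, while clippedDuration (5, 15) (10, 15) is still Just 5.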

The countOfHospitalEvents function can be lifted into a Definition using define:

countOfHospitalEventsDef
  :: (IntervalSizeable a b)
  => Definition
       (  Feature "index" (AssessmentInterval a)
       -> Feature "events" [Event Text ExampleModel a]
       -> Feature "count of hospitalizations" (Int, Maybe b)
       )
countOfHospitalEventsDef = define countOfHospitalEvents

Discontinuation from a Drug

This example demonstrates:

  • complex interval-algebra functionality

  • use of the bind operator (>>=)

In this example, the goal is to write a function that, given an assessment interval and list of events:

  • filters to antibiotic events

  • allows for a gap of 5 days between antibiotic events

  • only allows for treatment sequences that start or overlap the assessment interval

  • returns the time discontinuation begins and the time since the beginning of the assessment interval to discontinuation.

For this example, we walk through three cases.

Case 1
     **********      <- [assessment]
 ---                 <- [tookAntibiotics]
    --               <- [tookAntibiotics]
        --           <- [wasHospitalized]
          -----      <- [tookAntibiotics]
====================
Case 2
     **********      <- [assessment]
      --             <- [tookAntibiotics]
          -----      <- [tookAntibiotics]
====================
Case 3
     **********      <- [assessment]
     ---             <- [tookAntibiotics]
====================

The logic of the feature is defined in the discontinuation function:

discontinuation
  :: (IntervalSizeable a b)
  => AssessmentInterval a
  -> [Event Text ExampleModel a]
  -> Maybe (a, b)
discontinuation i events =
  events
    |> filterEvents (containsConcepts ["tookAntibiotics"]) (1)
    |> fmap (expandr 5) (2)
    |> combineIntervals (3)
    |> nothingIfNone (startedBy <|> overlappedBy $ i) (4)
    |> (>>= gapsWithin i) (5)
    |> (>>= headMay) (6)
    |> fmap (\x -> (begin x, diff (begin x) (begin i))) (7)
1 First, we filter to events that have the concept "tookAntibiotics". In Case 1, the third interval is filtered out:
     **********      <- [assessment]
 ---                 <- [tookAntibiotics]
    --               <- [tookAntibiotics]
          -----      <- [tookAntibiotics]
====================

Cases 2 and 3 are unchanged.

2 To allow for a grace period of 5 days between antibiotic events, each antibiotic event is extended by 5 units using the expandr function. For Case 1, this results in:
     **********      <- [assessment]
 --------            <- [tookAntibiotics]
    -------          <- [tookAntibiotics]
          ---------- <- [tookAntibiotics]
====================

And similarly for Cases 2 and 3.

3 Antibiotic intervals that concur are considered one treatment sequence, so combineIntervals is used to collapse these intervals. In all the example cases, this results in one interval; e.g. for Case 2:
     **********      <- [assessment]
      -------------- <- [tookAntibiotics]
====================
4 With all the treatment intervals transformed to allow for a gap in treatment, we now handle the case where none of the intervals starts or overlaps the assessment interval. The nothingIfNone function takes a predicate and a list and returns Nothing if none of the list elements satisfy the predicate; otherwise, it returns Just the list.

In Cases 1 and 3, the assessment interval is overlappedBy and startedBy (respectively) the treatment interval. However, in Case 2, since antibiotic treatment starts after the assessment interval starts, nothingIfNone yields Nothing. This is the final result for Case 2.

In interval-algebra terminology, the assessment interval in Case 2 overlaps the treatment interval, which is different from being overlappedBy the treatment interval.
5 So far, we have the treatment interval in hand. We’re interested, though, in discovering gaps in treatment, which are considered discontinuation. The gapsWithin function finds gaps in the input intervals clipped to the assessment, yielding Nothing if no such gaps exist and Just the gaps otherwise. (See note about >>= below)

Case 1 has no gaps, hence the final result is Nothing. For Case 3, however, there is a gap between the treatment interval and the end of assessment:

     **********      <- [assessment]
             --      <- [gap]
====================
6 If there are multiple gaps in treatment, the first one is the discontinuation of interest.
7 Finally, provided that a gap in treatment exists, the time of discontinuation is the begin of that gap. The time from the start of assessment to discontinuation is computed by diff (begin x) (begin i).

For Case 3, the final result is Just (13, 8).
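The whole pipeline can be approximated on plain (begin, end) tuples to check the final results of the three cases. This is a sketch under stated assumptions, not the library implementation: combine, startsOrOverlaps, gapsIn, and discontinuationSketch are hypothetical stand-ins, the input is assumed to contain only antibiotic intervals, and gapsIn assumes that uncovered time at either edge of the window counts as a gap.

```haskell
import Control.Monad (guard)
import Data.List (foldl', sort)
import Data.Maybe (listToMaybe)

type Iv = (Int, Int)  -- hypothetical stand-in: (begin, end)

-- Merge intervals that meet or share support (approximates combineIntervals).
combine :: [Iv] -> [Iv]
combine = reverse . foldl' step [] . sort
  where
    step ((b1, e1) : done) (b2, e2)
      | b2 <= e1 = (b1, max e1 e2) : done
    step done iv = iv : done

-- True when the treatment interval starts or overlaps the window, i.e. it
-- begins at or before the window's begin and ends strictly inside it.
startsOrOverlaps :: Iv -> Iv -> Bool
startsOrOverlaps (wb, we) (b, e) = e < we && (b == wb || (b < wb && wb < e))

-- Uncovered stretches of the window, given sorted, non-overlapping intervals.
gapsIn :: Iv -> [Iv] -> [Iv]
gapsIn (wb, we) ivs = [ (a, b) | (a, b) <- zip ends begins, a < b ]
  where
    clipped = [ (max wb b, min we e) | (b, e) <- ivs, b < we, wb < e ]
    ends    = wb : map snd clipped
    begins  = map fst clipped ++ [we]

discontinuationSketch :: Iv -> [Iv] -> Maybe (Int, Int)
discontinuationSketch w@(wb, _) antibiotics = do
  -- Extend each event by the 5-unit grace period, then merge.
  let treatment = combine [ (b, e + 5) | (b, e) <- antibiotics ]
  -- Fail unless some treatment interval starts or overlaps the window.
  guard (any (startsOrOverlaps w) treatment)
  -- The first gap, if any, marks discontinuation.
  (gapBegin, _) <- listToMaybe (gapsIn w treatment)
  pure (gapBegin, gapBegin - wb)
```

Reading coordinates off the diagrams (assessment at (5, 15)), the sketch gives Nothing for Case 1 with [(1, 4), (4, 6), (10, 15)], Nothing for Case 2 with [(6, 8), (10, 15)], and Just (13, 8) for Case 3 with [(5, 8)].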

As implemented, a Nothing result from discontinuation could indicate either that a subject did not discontinue or that they simply had no antibiotics records. If such a distinction is important, the function could be modified to disambiguate these cases, for example by using a sum type.

The discontinuation function can be lifted into a Definition using define:

discontinuationDef
  :: (IntervalSizeable a b)
  => Definition
       (  Feature "index" (AssessmentInterval a)
       -> Feature "events" [Event Text ExampleModel a]
       -> Feature "discontinuation" (Maybe (a, b))
       )
discontinuationDef = define discontinuation
Using the >>= operator

The >>= operator comes from Haskell’s Monad typeclass. Sometimes called the bind operator, it has the following type signature:

(>>=) :: Monad m => m a -> (a -> m b) -> m b

Consider these lines of the discontinuation function:

  |> nothingIfNone ( startedBy <|> overlappedBy $ i)
  |> (>>= gapsWithin i)
  • The type coming out of the nothingIfNone is Maybe [Interval a].

  • The type of gapsWithin i is [Interval a] -> Maybe [Interval a], and we want the combined result to be a Maybe [Interval a] as well.

If you put those pieces together, you have a concrete signature for >>=:

Maybe [Interval a] -> ([Interval a] -> Maybe [Interval a]) -> Maybe [Interval a]
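The same chaining pattern can be tried with simpler types. In the sketch below, nothingIfNoneSketch is a hypothetical stand-in that follows the description of nothingIfNone given earlier, and listToMaybe from base plays the role of a second step that can fail.

```haskell
import Data.Maybe (listToMaybe)

-- Hypothetical stand-in for nothingIfNone: Nothing when no element
-- satisfies the predicate, otherwise Just the whole list.
nothingIfNoneSketch :: (a -> Bool) -> [a] -> Maybe [a]
nothingIfNoneSketch p xs
  | any p xs  = Just xs
  | otherwise = Nothing

-- Chain two steps that can each fail. The second step runs only when
-- the first produced a Just value; a Nothing passes straight through.
firstAfterCheck :: [Int] -> Maybe Int
firstAfterCheck xs = nothingIfNoneSketch even xs >>= listToMaybe
```

Here firstAfterCheck [3, 4, 5] evaluates to Just 3, while firstAfterCheck [1, 3] evaluates to Nothing because the first step already failed.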