Event Data Theory
The terms "theory" and "model" are borrowed from the notion of a Lawvere theory. [1]
Definitions
Event
An Event is a Context (what happened) with an associated time interval (when it happened).
Concretely, an Event is a wrapper around the interval-algebra package’s PairedInterval type:
newtype Event t m a = MkEvent ( PairedInterval (Context t m) a )
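To make the wrapper structure concrete, here is a minimal sketch that uses only base. The Interval, PairedInterval, and Context types below are simplified stand-ins written for illustration, not the definitions from interval-algebra or this package, and the helpers mkEvent, eventInterval, and eventContext are hypothetical names.

```haskell
-- Simplified stand-ins for illustration only; the real Interval and
-- PairedInterval types come from the interval-algebra package.
data Interval a = Interval a a -- begin and end points
  deriving (Eq, Show)

-- An interval paired with some data b.
data PairedInterval b a = PairedInterval (Interval a) b
  deriving (Eq, Show)

-- Placeholder context: a list of tags plus some facts.
data Context t m = MkContext [t] m
  deriving (Eq, Show)

-- Same shape as the definition above: a newtype around a PairedInterval
-- whose paired data is a Context.
newtype Event t m a = MkEvent (PairedInterval (Context t m) a)
  deriving (Eq, Show)

-- Hypothetical helper: pair "when" with "what".
mkEvent :: Interval a -> Context t m -> Event t m a
mkEvent i c = MkEvent (PairedInterval i c)

-- Hypothetical accessor: when the event happened.
eventInterval :: Event t m a -> Interval a
eventInterval (MkEvent (PairedInterval i _)) = i

-- Hypothetical accessor: what happened.
eventContext :: Event t m a -> Context t m
eventContext (MkEvent (PairedInterval _ c)) = c

-- An event tagged "in hospital" spanning time points 3 to 10.
exampleEvent :: Event String String Int
exampleEvent = mkEvent (Interval 3 10) (MkContext ["in hospital"] "placeholder facts")
```

Because Event is a newtype, the wrapper adds no runtime cost; it only gives the paired interval a distinct type.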
Context
A Context contains up to three types of information:
- A tag set (required)
- Facts about the event (required)
- Metadata on the source of the event (optional)
A tag set is a set of labels (tags) that give meaning to the events of interest. For example, "diabetes diagnosis", "birth day", and "in hospital" are all possible tags that together might define the tag set for a study.
The Context type is defined below.
data Context t m = MkContext
  { -- | the 'TagSet' of a @Context@
    getTagSet :: TagSet t      -- (1)
    -- | the facts of a @Context@.
  , getFacts  :: m             -- (2)
    -- | the 'Source' of @Context@
  , getSource :: Maybe Source  -- (3)
  }
(1) a TagSet: a set of tags, or labels, which can be used to identify events in a collection;
(2) facts about the event, whose shape and possible values are determined by the schema type m;
(3) (optionally) data about the provenance of the event in a Source object.
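As a rough illustration of how the three fields fit together, the sketch below builds a Context value and looks up a tag. It assumes TagSet can be modeled as a plain Data.Set (from the containers package) and uses a hypothetical one-field Source; the library's own TagSet and Source types may differ. The helper hasTag is also a hypothetical name.

```haskell
import qualified Data.Set as Set

-- Assumption: a tag set modeled as a plain Set of tags.
type TagSet t = Set.Set t

-- Hypothetical provenance record; the real Source type may carry more.
newtype Source = MkSource { sourceTable :: String }
  deriving (Eq, Show)

data Context t m = MkContext
  { getTagSet :: TagSet t
  , getFacts  :: m
  , getSource :: Maybe Source
  } deriving (Eq, Show)

-- True when the context carries the given tag.
hasTag :: Ord t => t -> Context t m -> Bool
hasTag tag ctx = Set.member tag (getTagSet ctx)

-- A context with two tags, placeholder facts, and no source metadata.
diagnosis :: Context String String
diagnosis = MkContext
  { getTagSet = Set.fromList ["diabetes diagnosis", "in hospital"]
  , getFacts  = "placeholder facts"
  , getSource = Nothing
  }
```

Here hasTag "in hospital" diagnosis evaluates to True, while hasTag "birth day" diagnosis evaluates to False.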
Facts
Facts are the data of interest for a particular event. The schema of the facts is not fixed by Context; it is supplied as the type parameter m.
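The following sketch shows what it means for the facts schema to be a parameter: the same Context shape is reused with two unrelated fact types. Everything here is illustrative. The Context is simplified (a list of tags, no source), and LabFacts, EnrollmentFacts, tagCount, and mapFacts are hypothetical names that do not come from the library.

```haskell
-- Simplified Context for illustration: a list stands in for the tag set
-- and the optional source is omitted.
data Context t m = MkContext
  { getTags  :: [t]
  , getFacts :: m
  } deriving (Eq, Show)

-- Two hypothetical fact schemas.
newtype LabFacts = LabFacts { labValue :: Double }
  deriving (Eq, Show)

data EnrollmentFacts = EnrollmentFacts
  { plan   :: String
  , active :: Bool
  } deriving (Eq, Show)

-- The same Context type, instantiated with different schemas.
labContext :: Context String LabFacts
labContext = MkContext ["lab result"] (LabFacts 6.8)

enrollmentContext :: Context String EnrollmentFacts
enrollmentContext = MkContext ["enrollment"] (EnrollmentFacts "basic" True)

-- Code that never inspects the facts works for every schema m.
tagCount :: Context t m -> Int
tagCount = length . getTags

-- Changing the facts changes the schema type but keeps the tags.
mapFacts :: (m -> n) -> Context t m -> Context t n
mapFacts f (MkContext ts m) = MkContext ts (f m)
```

One consequence of this design is that functions such as tagCount are written once and apply to every event model, while schema-specific logic is confined to functions over m.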
Event Model
Supplying specific types for the parameters t and m of Context creates a new event model.
An example of an event model is below.
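{-# LANGUAGE DeriveDataTypeable, DeriveGeneric #-}

import Data.Aeson   (FromJSON (..), Options (..), SumEncoding (..),
                     defaultOptions, genericParseJSON)
import Data.Data    (Data)
import Data.Text    (Text)
import GHC.Generics (Generic)
-- 'Event' is assumed to be in scope from this package.

data SillySchema =
    A Int
  | B Text
  | C
  | D
  deriving (Show, Eq, Generic, Data)

-- Parse JSON objects whose "domain" field names the constructor and whose
-- "facts" field holds that constructor's contents.
instance FromJSON SillySchema where
  parseJSON = genericParseJSON
    (defaultOptions
      { sumEncoding = TaggedObject { tagFieldName      = "domain"
                                   , contentsFieldName = "facts"
                                   }
      }
    )

type SillyEvent1 a = Event Text SillySchema a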
The SillyEvent1 type is a synonym for an Event where the tags are of type Text, the facts are of shape SillySchema, and the interval is parameterized by any valid type a.
The type parameter m provides a high degree of flexibility in defining new event models. The m type represents the schema, or shape, of an event’s data and can be a nearly arbitrary type composed of sum and product types. Often, the m type will be a sum type of "domains", where each domain is a group of related facts. The schema of NoviSci’s standard EDM is organized around this idea.
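The sketch below illustrates the "sum type of domains" pattern with a base-only variant of SillySchema. It is an illustration under stated assumptions rather than the library's API: String replaces Text, SimpleEvent is a hypothetical flattened stand-in for Event, and domainName and aValues are hypothetical helpers.

```haskell
-- Base-only variant of the schema above (String stands in for Text).
data SillySchema
  = A Int
  | B String
  | C
  | D
  deriving (Eq, Show)

-- Hypothetical flattened stand-in for Event: an interval given as
-- (begin, end), a list of tags, and the facts.
data SimpleEvent a = SimpleEvent
  { interval :: (a, a)
  , tags     :: [String]
  , facts    :: SillySchema
  } deriving (Eq, Show)

-- Each constructor of the sum type acts as one domain.
domainName :: SillySchema -> String
domainName (A _) = "A"
domainName (B _) = "B"
domainName C     = "C"
domainName D     = "D"

-- Collect the payloads of all events in domain A, in order.
aValues :: [SimpleEvent a] -> [Int]
aValues es = [n | A n <- map facts es]

events :: [SimpleEvent Int]
events =
  [ SimpleEvent (0, 1) ["first"]  (A 1)
  , SimpleEvent (2, 5) ["second"] (B "note")
  , SimpleEvent (6, 6) ["third"]  C
  , SimpleEvent (7, 9) ["fourth"] (A 7)
  ]
```

Pattern matching on the constructor selects events by domain, so each domain's facts can be handled with its own types while all events still share one schema type.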