gofl is built around 3 algebraic operators:
  + is the sum operator
  : is the product operator
  * is the sum of the sum and product
By design, these operators correspond to the same operators in R model formulas. Under the hood, these operators are replaced in the AST by the internal generic operators %s%, %p%, and %ssp%.
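The correspondence with R model formulas can be checked in base R alone. The snippet below is only an illustration of how R itself expands formula operators; it does not call gofl, and the variable names are placeholders.

```r
# Base R expands `a * b` in a model formula into a + b + a:b,
# which is the "sum of the sum and product" reading used here.
labels_plus  <- attr(terms(~ a + b), "term.labels")
labels_colon <- attr(terms(~ a:b), "term.labels")
labels_star  <- attr(terms(~ a * b), "term.labels")

print(labels_plus)
print(labels_colon)
print(labels_star)
```

The term labels of `~ a * b` are the union of those of `~ a + b` and `~ a:b`.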
Additionally, gofl has two functions, tag and .zoom, which have specific use cases.
The work of gofl is done by defining how sum (%s%) and product (%p%) work on different types.
Permutation matrices are used to define variable levels:
X <- gofl:::as_tmatrix("var1", c("level1", "level2", "level3"))
Y <- gofl:::as_tmatrix("var2", LETTERS[1:5])
Each row corresponds to a group:
X@mat
#> 3 x 3 diagonal matrix of class "ddiMatrix"
#>      var1____level1 var1____level2 var1____level3
#> [1,]              1              .              .
#> [2,]              .              1              .
#> [3,]              .              .              1
Y@mat
#> 5 x 5 diagonal matrix of class "ddiMatrix"
#>      var2____A var2____B var2____C var2____D var2____E
#> [1,]         1         .         .         .         .
#> [2,]         .         1         .         .         .
#> [3,]         .         .         1         .         .
#> [4,]         .         .         .         1         .
#> [5,]         .         .         .         .         1
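As a rough picture of what such a level matrix contains, the sketch below builds a dense equivalent in base R. The helper level_matrix is hypothetical and is not gofl's as_tmatrix; only the column naming pattern (variable, four underscores, level) is taken from the output above.

```r
# Hypothetical helper (not part of gofl): one row and one column per level,
# with columns named <variable>____<level> as in the printed output.
level_matrix <- function(var, levels) {
  m <- diag(length(levels))
  colnames(m) <- paste(var, levels, sep = "____")
  m
}

X_sketch <- level_matrix("var1", c("level1", "level2", "level3"))
Y_sketch <- level_matrix("var2", LETTERS[1:5])
print(X_sketch)
```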
The sum of these two matrices is the direct sum, or block diagonal matrix:
gofl:::`%s%`(X@mat, Y@mat)
#> 8 x 8 sparse Matrix of class "dtCMatrix"
#> var1____level1 var1____level2 var1____level3 var2____A var2____B var2____C
#> [1,] 1 . . . . .
#> [2,] . 1 . . . .
#> [3,] . . 1 . . .
#> [4,] . . . 1 . .
#> [5,] . . . . 1 .
#> [6,] . . . . . 1
#> [7,] . . . . . .
#> [8,] . . . . . .
#> var2____D var2____E
#> [1,] . .
#> [2,] . .
#> [3,] . .
#> [4,] . .
#> [5,] . .
#> [6,] . .
#> [7,] 1 .
#> [8,] . 1
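The direct sum shown above can be reproduced on dense matrices with base R. This is a minimal sketch under that assumption, not gofl's method for %s%; direct_sum is a hypothetical name. For sparse matrices, Matrix::bdiag offers the same block-diagonal construction.

```r
# Hypothetical dense direct sum: A in the top-left block, B in the
# bottom-right block, zeros elsewhere.
direct_sum <- function(A, B) {
  top    <- cbind(A, matrix(0, nrow(A), ncol(B)))
  bottom <- cbind(matrix(0, nrow(B), ncol(A)), B)
  rbind(top, bottom)
}

S <- direct_sum(diag(3), diag(5))
print(dim(S))
```

The row count of the result is the sum of the row counts of the inputs.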
The product of these two matrices is the Cartesian product of their rows:
gofl:::`%p%`(X@mat, Y@mat)
#> 15 x 8 sparse Matrix of class "dgCMatrix"
#> var1____level1 var1____level2 var1____level3 var2____A var2____B
#> [1,] 1 . . 1 .
#> [2,] 1 . . . 1
#> [3,] 1 . . . .
#> [4,] 1 . . . .
#> [5,] 1 . . . .
#> [6,] . 1 . 1 .
#> [7,] . 1 . . 1
#> [8,] . 1 . . .
#> [9,] . 1 . . .
#> [10,] . 1 . . .
#> [11,] . . 1 1 .
#> [12,] . . 1 . 1
#> [13,] . . 1 . .
#> [14,] . . 1 . .
#> [15,] . . 1 . .
#> var2____C var2____D var2____E
#> [1,] . . .
#> [2,] . . .
#> [3,] 1 . .
#> [4,] . 1 .
#> [5,] . . 1
#> [6,] . . .
#> [7,] . . .
#> [8,] 1 . .
#> [9,] . 1 .
#> [10,] . . 1
#> [11,] . . .
#> [12,] . . .
#> [13,] 1 . .
#> [14,] . 1 .
#> [15,] . . 1
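The row ordering in the output above (the first variable changes slowest, the second fastest) can be sketched with index vectors in base R. The function row_product is hypothetical and works on dense matrices; it is not gofl's %p% method.

```r
# Hypothetical dense row-wise Cartesian product: every row of A is paired
# with every row of B, and the paired rows are placed side by side.
row_product <- function(A, B) {
  i <- rep(seq_len(nrow(A)), each = nrow(B))   # A's row index, slowest
  j <- rep(seq_len(nrow(B)), times = nrow(A))  # B's row index, fastest
  cbind(A[i, , drop = FALSE], B[j, , drop = FALSE])
}

P <- row_product(diag(3), diag(5))
print(dim(P))
```

Each output row has exactly two ones: one selecting a level of the first variable and one selecting a level of the second.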
The sum of the sum and product performs both of the above operations and stacks the results, aligning columns that share a name:
gofl:::`%ssp%`(X@mat, Y@mat)
#> 23 x 8 sparse Matrix of class "dgCMatrix"
#> var1____level1 var1____level2 var1____level3 var2____A var2____B
#> [1,] 1 . . . .
#> [2,] . 1 . . .
#> [3,] . . 1 . .
#> [4,] . . . 1 .
#> [5,] . . . . 1
#> [6,] . . . . .
#> [7,] . . . . .
#> [8,] . . . . .
#> [9,] 1 . . 1 .
#> [10,] 1 . . . 1
#> [11,] 1 . . . .
#> [12,] 1 . . . .
#> [13,] 1 . . . .
#> [14,] . 1 . 1 .
#> [15,] . 1 . . 1
#> [16,] . 1 . . .
#> [17,] . 1 . . .
#> [18,] . 1 . . .
#> [19,] . . 1 1 .
#> [20,] . . 1 . 1
#> [21,] . . 1 . .
#> [22,] . . 1 . .
#> [23,] . . 1 . .
#> var2____C var2____D var2____E
#> [1,] . . .
#> [2,] . . .
#> [3,] . . .
#> [4,] . . .
#> [5,] . . .
#> [6,] 1 . .
#> [7,] . 1 .
#> [8,] . . 1
#> [9,] . . .
#> [10,] . . .
#> [11,] 1 . .
#> [12,] . 1 .
#> [13,] . . 1
#> [14,] . . .
#> [15,] . . .
#> [16,] 1 . .
#> [17,] . 1 .
#> [18,] . . 1
#> [19,] . . .
#> [20,] . . .
#> [21,] 1 . .
#> [22,] . 1 .
#> [23,] . . 1
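A dense base R sketch of the same stacking is given below, assuming the two inputs have no columns in common so that columns can be aligned by position. It is an illustration only and does not reproduce gofl's name-based column matching.

```r
A <- diag(3)
B <- diag(5)

# Sum part: block-diagonal arrangement of A and B (8 rows)
sum_part <- rbind(
  cbind(A, matrix(0, nrow(A), ncol(B))),
  cbind(matrix(0, nrow(B), ncol(A)), B)
)

# Product part: every row of A paired with every row of B (15 rows)
i <- rep(seq_len(nrow(A)), each = nrow(B))
j <- rep(seq_len(nrow(B)), times = nrow(A))
prod_part <- cbind(A[i, , drop = FALSE], B[j, , drop = FALSE])

# Sum of the sum and product: stack the two parts
SSP <- rbind(sum_part, prod_part)
print(dim(SSP))
```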
These operators are defined analogously for lists. The sum of two lists is concatenation:
gofl:::`%s%`(list(1, 2), list(2, 3, 5))
#> [[1]]
#> [1] 1
#>
#> [[2]]
#> [1] 2
#>
#> [[3]]
#> [1] 2
#>
#> [[4]]
#> [1] 3
#>
#> [[5]]
#> [1] 5
The product of two lists is the Cartesian product:
gofl:::`%p%`(list(1, 2), list(2, 3, 5))
#> [[1]]
#> [1] 1 2
#>
#> [[2]]
#> [1] 1 3
#>
#> [[3]]
#> [1] 1 5
#>
#> [[4]]
#> [1] 2 2
#>
#> [[5]]
#> [1] 2 3
#>
#> [[6]]
#> [1] 2 5
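The two list operations can be approximated in a few lines of base R. The names list_sum and list_product are hypothetical; the sketch mirrors the printed results above and makes no claim about how gofl defines its list methods.

```r
# Hypothetical list sum: plain concatenation
list_sum <- function(x, y) c(x, y)

# Hypothetical list product: pair each element of x with each element of y,
# with x varying slowest, then flatten one level
list_product <- function(x, y) {
  pairs <- lapply(x, function(xi) lapply(y, function(yj) c(xi, yj)))
  unlist(pairs, recursive = FALSE)
}

ls_out <- list_sum(list(1, 2), list(2, 3, 5))
lp_out <- list_product(list(1, 2), list(2, 3, 5))
print(length(ls_out))
print(length(lp_out))
```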
The same operators also work for integers:
gofl:::`%s%`(2L, 3L)
#> [1] 5
gofl:::`%p%`(2L, 3L)
#> [1] 6
gofl:::`%ssp%`(2L, 3L)
#> [1] 11
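On integers the three operators reduce to ordinary arithmetic, which also predicts the row counts of the matrix results shown earlier. The functions below are hypothetical stand-ins written in base R, not gofl's methods.

```r
s   <- function(x, y) x + y               # sum
p   <- function(x, y) x * y               # product
ssp <- function(x, y) s(s(x, y), p(x, y)) # sum of the sum and product

# With 3 and 5 levels, these match the 8, 15, and 23 rows seen above
counts <- c(s(3L, 5L), p(3L, 5L), ssp(3L, 5L))
print(counts)
```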
To make gofl work, a data structure called tagged holds the matrix and the tags:
str(X)
#> Formal class 'tagged' [package "gofl"] with 2 slots
#> ..@ mat :Formal class 'ddiMatrix' [package "Matrix"] with 4 slots
#> .. .. ..@ diag : chr "N"
#> .. .. ..@ Dim : int [1:2] 3 3
#> .. .. ..@ Dimnames:List of 2
#> .. .. .. ..$ : NULL
#> .. .. .. ..$ : chr [1:3] "var1____level1" "var1____level2" "var1____level3"
#> .. .. ..@ x : num [1:3] 1 1 1
#> ..@ tags:List of 3
#> .. ..$ : NULL
#> .. ..$ : NULL
#> .. ..$ : NULL
Then the operators are applied simultaneously to the matrices and the tags:
gofl:::`%s%`(X, Y)
#> An object of class "tagged"
#> Slot "mat":
#> 8 x 8 sparse Matrix of class "dtCMatrix"
#> var1____level1 var1____level2 var1____level3 var2____A var2____B var2____C
#> [1,] 1 . . . . .
#> [2,] . 1 . . . .
#> [3,] . . 1 . . .
#> [4,] . . . 1 . .
#> [5,] . . . . 1 .
#> [6,] . . . . . 1
#> [7,] . . . . . .
#> [8,] . . . . . .
#> var2____D var2____E
#> [1,] . .
#> [2,] . .
#> [3,] . .
#> [4,] . .
#> [5,] . .
#> [6,] . .
#> [7,] 1 .
#> [8,] . 1
#>
#> Slot "tags":
#> [[1]]
#> NULL
#>
#> [[2]]
#> NULL
#>
#> [[3]]
#> NULL
#>
#> [[4]]
#> NULL
#>
#> [[5]]
#> NULL
#>
#> [[6]]
#> NULL
#>
#> [[7]]
#> NULL
#>
#> [[8]]
#> NULL
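The idea of applying one operator to both slots at once can be sketched with a small S4 class. Everything here is hypothetical: tagged_sketch uses a dense matrix instead of a sparse one, and sum_sketch only assumes, based on the output above, that tags are concatenated in the same order as the rows.

```r
library(methods)

# Hypothetical stand-in for the tagged class: a dense matrix plus one tag per row
setClass("tagged_sketch", representation(mat = "matrix", tags = "list"))

# Sum: block-diagonal matrices and concatenated tags
sum_sketch <- function(x, y) {
  top    <- cbind(x@mat, matrix(0, nrow(x@mat), ncol(y@mat)))
  bottom <- cbind(matrix(0, nrow(y@mat), ncol(x@mat)), y@mat)
  new("tagged_sketch", mat = rbind(top, bottom), tags = c(x@tags, y@tags))
}

x <- new("tagged_sketch", mat = diag(3), tags = vector("list", 3))
y <- new("tagged_sketch", mat = diag(5), tags = vector("list", 5))
z <- sum_sketch(x, y)
print(c(nrow(z@mat), length(z@tags)))
```

Keeping one tag per row means the tag list and the matrix stay the same length after every operation.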
The traverse_expr function replaces + with %s% and so on.
ff <- ~ a*b + c:d + e
nf <- gofl:::traverse_expr(ff, f = identity)
# It's not pretty to look at
nf
#> ~(new("standardGeneric", .Data = function (x, y)
#> standardGeneric("%s%"), generic = "%s%", package = "gofl", group = list(),
#> valueClass = character(0), signature = c("x", "y"), default = NULL,
#> skeleton = (function (x, y)
#> stop(gettextf("invalid call in method dispatch to '%s' (no default method)",
#> "%s%"), domain = NA))(x, y)))((new("standardGeneric",
#> .Data = function (x, y)
#> standardGeneric("%s%"), generic = "%s%", package = "gofl",
#> group = list(), valueClass = character(0), signature = c("x",
#> "y"), default = NULL, skeleton = (function (x, y)
#> stop(gettextf("invalid call in method dispatch to '%s' (no default method)",
#> "%s%"), domain = NA))(x, y)))((new("standardGeneric",
#> .Data = function (x, y)
#> standardGeneric("%ssp%"), generic = "%ssp%", package = "gofl",
#> group = list(), valueClass = character(0), signature = c("x",
#> "y"), default = NULL, skeleton = (function (x, y)
#> stop(gettextf("invalid call in method dispatch to '%s' (no default method)",
#> "%ssp%"), domain = NA))(x, y)))(a, b), (new("standardGeneric",
#> .Data = function (x, y)
#> standardGeneric("%p%"), generic = "%p%", package = "gofl",
#> group = list(), valueClass = character(0), signature = c("x",
#> "y"), default = NULL, skeleton = (function (x, y)
#> stop(gettextf("invalid call in method dispatch to '%s' (no default method)",
#> "%p%"), domain = NA))(x, y)))(c, d)), e)
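The operator replacement can be pictured as a recursive walk over the expression tree. The function swap_ops below is a hypothetical base R sketch, not gofl's traverse_expr: it inserts operator names as symbols, whereas the output above shows the generic function objects themselves being inserted.

```r
# Hypothetical walker: swap formula operators for the internal operator names
swap_ops <- function(e) {
  ops <- c("+" = "%s%", ":" = "%p%", "*" = "%ssp%")
  if (!is.call(e)) return(e)  # leaves (symbols, constants) are left unchanged
  fn <- e[[1]]
  if (is.symbol(fn) && as.character(fn) %in% names(ops)) {
    e[[1]] <- as.name(ops[[as.character(fn)]])
  }
  # Recurse into the arguments of the call
  for (i in seq_along(e)[-1]) e[[i]] <- swap_ops(e[[i]])
  e
}

out <- swap_ops(quote(a * b + c:d + e))
```

The nesting follows R's operator precedence: the outermost call is the last +, and the a * b and c:d terms sit inside the first +.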
The traverse_expr function can also take a function as an argument that is applied to each leaf of the AST. In this example, gofl:::replace_by_size finds the nrow of a matrix.
Now the new expression can be evaluated with data. Usually, our data here would be the tagged objects, but plain matrices are used here just for illustration:
rlang::eval_tidy(
expr = gofl:::traverse_expr(ff, f = gofl:::replace_by_size)[[2]],
data = list(
a = matrix(nrow = 1),
b = matrix(nrow = 3),
c = matrix(nrow = 4),
d = matrix(nrow = 2),
e = matrix(nrow = 7)))
#> [1] 22
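The value 22 can be checked by hand: a*b gives 1 + 3 + 1*3 = 7 rows, c:d gives 4*2 = 8, and e gives 7. The base R sketch below takes a shortcut to confirm this, replacing each leaf by its row count and evaluating the original expression with the three operators shadowed by their arithmetic meaning. This is not how gofl evaluates the expression, and the names sizes and arith are placeholders.

```r
data <- list(
  a = matrix(nrow = 1),
  b = matrix(nrow = 3),
  c = matrix(nrow = 4),
  d = matrix(nrow = 2),
  e = matrix(nrow = 7)
)

# Replace every leaf by its number of rows
sizes <- lapply(data, nrow)

# Arithmetic meaning of the three operators on sizes
arith <- list(
  "+" = function(x, y) x + y,
  ":" = function(x, y) x * y,
  "*" = function(x, y) x + y + x * y
)

# Evaluate with the sizes as data and the operators shadowed
total <- eval(quote(a * b + c:d + e), c(sizes, arith))
print(total)
```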
To continue the example from above and demonstrate how it works on the tagged type:
ff <- ~ var1 + var2
rlang::eval_tidy(
expr = gofl:::traverse_expr(ff, identity)[[2]],
data = list(var1 = X, var2 = Y))
#> An object of class "tagged"
#> Slot "mat":
#> 8 x 8 sparse Matrix of class "dtCMatrix"
#> var1____level1 var1____level2 var1____level3 var2____A var2____B var2____C
#> [1,] 1 . . . . .
#> [2,] . 1 . . . .
#> [3,] . . 1 . . .
#> [4,] . . . 1 . .
#> [5,] . . . . 1 .
#> [6,] . . . . . 1
#> [7,] . . . . . .
#> [8,] . . . . . .
#> var2____D var2____E
#> [1,] . .
#> [2,] . .
#> [3,] . .
#> [4,] . .
#> [5,] . .
#> [6,] . .
#> [7,] 1 .
#> [8,] . 1
#>
#> Slot "tags":
#> [[1]]
#> NULL
#>
#> [[2]]
#> NULL
#>
#> [[3]]
#> NULL
#>
#> [[4]]
#> NULL
#>
#> [[5]]
#> NULL
#>
#> [[6]]
#> NULL
#>
#> [[7]]
#> NULL
#>
#> [[8]]
#> NULL