Functions as Data
Mark GaleaProgramming languages have evolved largely through a series of abstractions; these abstractions mostly deal with control (e.g. functions) or data. In the post, Houses, Arrays and Higher Order Functions, we have focused on the use of functions as control elements and have delved into the concept of higher order functions. Today we will revisit functions and focus on their use as data elements. Using functions to represent data might seem pretty counter intuitive at first but this blurred line between functions as control elements and functions as data elements is a powerful concept.
Sets
In order to illustrate how functions can be used as data elements we will create a Set library with the following operations: union
, intersection
and difference
. We will define a set using the characteristic function T -> Boolean
i.e. a function which takes an element T
as input and return a true
or false
depending on whether the element is contained within the set. Using classic OOP we can define a set as follows:
class Set<T>(vararg values: T) {
var elements: List<T> = values.toList()
fun contains(element:T): Boolean = this.elements.contains(element)
}
In this case a set is defined as a class with a generic type T
, we will use the underlying list data structure elements
to implement the contains
function. We can use the OOP representation as follows:
val s1 = Set(1)
val s2 = Set(2, 3, 4)
Using functions as data elements we would implement a Set as follows:
fun <T>setOf(element: T): (T) -> Boolean = { test -> element!!.equals(test) }
fun <T>contains(set: (T)->Boolean, element: T): Boolean = set(element)
The function setOf
is a constructor function (as defined by SICP) which takes in an element
and returns a higher order functions that implements the characteristic function (T) -> Boolean
. contains
just delegates the test to the setOf
higher order function by calling set(element)
. Some observant readers might have noticed that the class Set
can ingest a number of values whilst the setOf
function can only ingest one value. Using the union
operator (defined further on) we can extends the setOf
function to accept a variable number of elements T
.
fun <T>setOf(vararg elements: T): (T) -> Boolean {
return elements.map { setOf(it) } .reduce { s, t -> union(s, t) }
}
Using the setOf
function we can define the set s1
and s2
as follows:
val s1 = setOf(1)
val s2 = setOf(2, 3, 4)
Note that even though the representation is completely different the end user consuming either library will not be exposed to the underlying details of our implementations (OOP vs Functions). To an end user (except for minor syntactic differences) these two Sets are not different from each another.
Union
Now that we have a means of representing a set, let us implement union
in both representations. In the OOP approach we would implement union as follows:
class Set<T>(vararg values: T) {
fun union(other:Set<T>): Set<T> {
val unionSet = Set<T>()
val elements = ArrayList(this.elements)
elements.addAll(other.elements.filter { x -> !this.elements.contains(x) })
unionSet.elements = elements
return unionSet
}
...
}
In this case we are using the backing data stucture elements
to keep track of what is in the set. We can test this implementation as follows:
val s1 = Set(1)
val s2 = Set(2, 3, 4)
println(s1.union(s2).contains(1)) // -> true
println(s1.union(s2).contains(2)) // -> true
println(s1.union(s2).contains(3)) // -> true
println(s1.union(s2).contains(4)) // -> true
println(s1.union(s2).contains(5)) // -> false
println(s1.union(s2).contains(0)) // -> false
Using functions we get a much more pure definition (mathematically speaking):
fun <T>union(s1: (T)->Boolean, s2: (T)->Boolean): (T)->Boolean =
{element -> s1(element) || s2(element) }
An element
is contained in the union of set s1
and s2
if the element
is either in set s1
or s2
. This is clearly represented by the line element -> s1(element) || s2(element)
. Don't be fooled by the fact that Kotlin does not support structural types, had the language supported structural types our definition would be a bit more concise:
fun <T>union(s1: Set<T>, s2: Set<T>):Set<T> =
{element -> s1(element) || s2(element) }
Needless to say this is just semantic sugar; regular functions (with milk) will do just fine! One would use the union operator as follows:
val s1 = setOf(1)
val s2 = setOf(2, 3, 4)
println(contains(union(s1, s2), 1)) // -> true
println(contains(union(s1, s2), 2)) // -> true
println(contains(union(s1, s2), 3)) // -> true
println(contains(union(s1, s2), 4)) // -> true
println(contains(union(s1, s2), 5)) // -> false
println(contains(union(s1, s2), 0)) // -> false
Note again that the usage is very similar and an end user would find it hard to believe that the Set is implemented using functions.
Intersection
Having defined union
, intersection
will only be one synaptic leap away. Let's start off with the OOP version.
class Set<T>(vararg values: T) {
fun intersection(other:Set<T>): Set<T> {
val intersectionSet= Set<T>()
val elements = ArrayList<T>()
elements.addAll(this.elements.filter { x -> other.contains(x) })
intersectionSet.elements = elements
return intersectionSet
}
...
}
One would use the definition above as follows:
val s3 = Set(1, 2)
val s4 = Set(2, 3, 4)
println(s3.intersection(s4).contains(1)) // -> false
println(s3.intersection(s4).contains(2)) // -> true
println(s3.intersection(s4).contains(3)) // -> false
println(s3.intersection(s4).contains(4)) // -> false
println(s3.intersection(s4).contains(5)) // -> false
println(s3.intersection(s4).contains(0)) // -> false
Using functions, we would represent the intersection function as:
fun <T>intersection(s1: (T)->Boolean, s2: (T)->Boolean): (T)->Boolean =
{ element-> s1(element) && s2(element) }
Note how close this function is to the original mathematical definition; an element
is contained in the intersection of Set s1
and s2
if the element
is contained in set s1
and set s2
, hence:
element -> s1(element) && s2(element)
One would use the above definition as follows:
val s3 = setOf(1, 2)
val s4 = setOf(2, 3, 4)
println(contains(intersection(s3, s4), 1)) // -> false
println(contains(intersection(s3, s4), 2)) // -> true
println(contains(intersection(s3, s4), 3)) // -> false
println(contains(intersection(s3, s4), 4)) // -> false
println(contains(intersection(s3, s4), 5)) // -> false
println(contains(intersection(s3, s4), 0)) // -> false
Difference
For completeness let us now implement difference
in both representations. In the OOP approach we would implement difference as follows:
class Set<T>(vararg values: T) {
fun difference(other:Set<T>): Set<T> {
val differenceSet= Set<T>()
val elements = ArrayList<T>()
elements.addAll(this.elements.filter { x -> !other.contains(x) })
differenceSet.elements = elements
return differenceSet
}
...
}
Using functions we can implement this using:
fun <T>difference(s1: (T)->Boolean, s2: (T)->Boolean): (T)->Boolean =
{element-> s1(element) && !s2(element) }
Set of multiple elements
Before we conclude I would like to give an intuition for the function
fun <T>setOf(vararg elements: T): (T) -> Boolean {
return elements.map { setOf(it) } .reduce { s, t -> union(s, t) }
}
Specifically how we used map
and reduce
to create a characteristic function which accepts multiple elements as follows:
val s3 = setOf(2, 3, 4, 7)
What elements.map { setOf(it) }
does is map all elements
to a characteristic function, hence 2 becomes:
test -> 2.equals(test)
3 becomes:
test -> 3.equals(test)
and so on.
The reduce
function will combine the characteristic functions as follows:
union (
union(
union(
{test -> 2.equals(test)},
{test -> 3.equals(test)}
),
{test -> 4.equals(test)}
),
{test -> 7.equals(test)}
)
Each union
function creates a new characteristic function combining the underlying two characteristic functions. For e.g.
union(
{test -> 2.equals(test)},
{test -> 3.equals(test)}
)
would create the following characteristic function:
{test ->
{test1 -> 2.equals(test1)}(test) ||
{test2 -> 3.equals(test2)}(test)
}
If we want to test whether 2 is in the set union, we would reduce the characteristic function as follows:
union({test -> 2.equals(test)}, {test -> 3.equals(test)})(2)
{test ->
{test1 -> 2.equals(test1)}(test) || // 2 here is the element in the set s1
{test2 -> 3.equals(test2)}(test) // 3 here is the element in the set s2
}(2) // 2 is the value to test.
Replace all occurrences of test
with the value 2, we obtain:
{test1 -> 2.equals(test1)}(2) || {test2 -> 3.equals(test2)}(2)
Replace all occurrences of test1
and test2
with the value 2, we obtain:
2.equals(2) || 3.equals(2)
true
Hence the characteristic function for the set union between s1
and s2
when applied to 2 has reduced to true, which is correct.
Conclusion
In this post we have looked at how we can use functions as data elements. Using functions as a data representation might feel unnatural at first but this blurred boundary between functions as elements of control and functions as elements of data is quite powerful. I really hope that you find this technique useful. Stay safe and keep hacking!