Kinds of types in Scala, part 2: take type, return type or type parameters

In the previous post, we laid the foundation for understanding the type system in Scala. But concrete types only would be too little to make the language truly expressive. So, now we’ll try to parametrize it.

Type constructors and parametric types

We can think of a type as a set. But we can also think of a function as a set, a set of parameter-returned value pairs. This lets us also think of a function signature as a type: a set of all possible sets of such pairs where arguments belong to one type and the returned value to another (where the first value from a pair is unique within the set):

type T - a set of values
t: T - value belonging to set T ( $t \in T$ )
type S => T - a set of a subset of $S \times T$ such that each value is a function $S \rightarrow T$ (which means that first element of a pair must be unique)
$\{ f: (f: S \rightarrow T) \} \iff \{ f: f \subset S \times T \land \forall_{(s_1, t_1), (s_2, t_2) \in f} s_1 \neq s_2 \}$
f: S => T: a subset of pairs, where first element of a pair is unique
$f: S \rightarrow T \iff (f \subset S \times T \land \forall_{(s_1, t_1), (s_2, t_2) \in f} s_1 \neq s_2)$

A little bit mind-numbing, but not as much as the next discovery: we could pass a set as an argument of a function and receive another set as a returned value.

Let’s build one such function:

at first, let’s build a set of all possible types:
$T' = \{ T: isType(T) \}$
we can define a list of a type T as a function from an index (a natural number) to a value $t: T$ (which is not necessarily total function - it has no defined values after its last index)
$isList_T(l) = \forall_{(i, t) \in l}( i \in \mathbb{N} \land t: T \land \forall_{(i_2, t_2)} (i = i_2 \implies t = t_2 ))$
or shorter
$isList_T(l) = l: \mathbb{N} \rightarrow T$
now, we can build a set of all possible lists of a type $T$ :
$L_T = \{ l: isList_T(l) \}$
next step would be to build a set of all possible lists of a type $T$ for each possible type $T$ :
$L = \{ L_T: isType(T) \}$
finally, we can define a function which takes one $T \in T'$ and returns another set $L_T \in L$

$List: T' \rightarrow L$ $List(T) = \{ l: isList_T(l) \}$

What we built here is a function that takes a type $T$ and returns… a type $List_T$ or List[T] using Scala’s notation.

Such a function, which takes a type and builds another type, is called a type constructor. In Scala, we could denote $List$ type constructor as List[_].

Type constructors are interesting to us because they are a mathematical way of defining parametric types, known as class templates by C++ programmers and generics by Java programmers (though, to be honest, from a mathematician’s point of view they are crippled implementations of parametric types). (You will also see this term frequently when you start reading the definitions of type classes on Haskell wiki).

Where parametric types differ from what you might know from C++ or Java is in the consequences of their functional definition. You can have a higher-order function: a function which takes and/or returns another function. Same with type constructors: you can have a type constructor, which takes and/or returns another type constructor. Many type class definitions take a type constructor as a parameter:

trait Functor[F[_]] { // <- F is a type constructor
  
  def map[A, B](fa: F[A])(f: A => B): F[B]
}

Kinds and higher-kinded-types

These types of types (a type, a type constructor, etc) are known as kinds. Scala lets us investigate them in REPL using :kind:

scala> :kind String
String's kind is A

scala> :kind List
List's kind is F[+A]

scala> :kind Either
Either's kind is F[+A1,+A2]

scala> import scala.language.higherKinds
scala> trait NeedTC[F[_]]
scala> :kind NeedTC
NeedTC's kind is X[F[A]]

You can add a -v flag to get a detailed description with a type-theory notation ( $*$ read as type):

scala> :kind -v String
String's kind is A
*
This is a proper type.

scala> :kind -v List
List's kind is F[+A]
* -(+)-> *
This is a type constructor: a 1st-order-kinded type.

scala> :kind -v Either
Either's kind is F[+A1,+A2]
* -(+)-> * -(+)-> *
This is a type constructor: a 1st-order-kinded type.

scala> import scala.language.higherKinds
scala> trait NeedTC[F[_]]
scala> :kind -v NeedTC
NeedTC's kind is X[F[A]]
(* -> *) -> *
This is a type constructor that takes type constructor(s): a higher-kinded type.

But functions allow us to do a partial application. Can we do it for types in Scala? Well, we can, but in a messy way.

scala> :kind ({ type T[A] = Map[String, A] })#T
scala.collection.immutable.Map[String,?]'s kind is F[+A]

scala> :kind -v ({ type T[A] = Map[String, A] })#T
scala.collection.immutable.Map[String,?]'s kind is F[+A]
* -(+)-> *
This is a type constructor: a 1st-order-kinded type.

(Such a thing is called a type lambda, but we’ll talk more about it when we learn more about structural and path-dependent types. Let’s skip detailed explanations for now).

Among the detailed descriptions, there are terms like 1st-order-kinds and higher-kinded types. Indeed, the same way function taking/returning functions are called higher-ordered functions, type constructors taking/returning type constructors are called higher-kinded types.

Let’s take a look at how values and functions correspond to types, and how it is reflected in types and type constructors correspondence to kinds:

Value	Type	Notes
`"test"`	`String`	a plain value
`(i: Int) => i*2`	`Int => Int`	a 1st-order function
`(a: Int) => (b: Int) => a+b`	`Int => Int => Int`	a higher-ordered function
`((i: Int) => i*2)(2)`	`Int`	a plain value
`((a: Int) => (b: Int) => a+b)(2)`	`Int => Int`	a 1st-order function

and on a type level:

Type	Kind	Notes
`String`	`*`	a proper type
`Int => Int`	`*`	a proper type a 1st-order function
`Int => Int => Int`	`*`	a proper type a higher-ordered function
`List`	`* -> *`	a type constructor, 1st-order-kinded type
`Either`	`* -> * -> *`	a type constructor, 1st-order-kinded type
`List[A]`	`*`
`({ type T[A] = Either[String, A] })#T`	`* -> *`	a type constructor, 1st-order-kinded type

This table shows us how elegantly values correspond to types, functions correspond to type constructors, and function application corresponds to passing a type to a type constructor. Values can be thought of as nullary (0-argument) functions, while types can be thought of as nullary type constructors, and it still works perfectly! They are exactly the same thing, just one level of abstraction higher.

As far as I know, there are even higher levels in values-types-kinds hierarchy, such as sorts, but hardly ever there would be some practical applications for them.

Type constraints

We already talked about type bounds, and how we might impose restrictions on the inferred type just by giving it values it has to infer against. With parametric types, we might be able to do it more directly.

Let’s say we have an ADT like this:

sealed trait User { val id: Int }
case class Member(id: Int, name: String) extends User
case class Admin(id: Int, access: Set[String]) extends User

We would like to have a function that takes a list of users and returns a map from id to a user… except we would like that map to adjust: if we pass a set of Users it should return a Map[Int, User], if we pass a set of Members it should return a Map[Int, Member].

Our first attempt would probably look like this:

def byId(users: Set[User]): Map[Int, User] =
  users.map { u => u.id -> u }.toMap

It works, but it is not generic. We wanted it to be generic.

def byId[U](users: Set[U])(getId: U => Int): Map[Int, U] =
  users.map { u => getId(u) -> u }.toMap

It also works, but it forces us to pass an extractor every time - not a bad idea if it did something different each time, but it only needs to get the same property all our cases share via the User class. If only there was a way to suggest to the type system that U needs to extend User (or has User as an upper bound), and then make use of its properties…

As a matter of fact, there is.

def byId[U <: User](users: Set[U]): Map[Int, U] =
  users.map { u => u.id -> u }.toMap

byId(users: Set[Member]) // Map[Int, Member]

<: denotes an upper bound in type parameters. It looks like this, so that parser would not confuse it with <, but its meaning is similar - a type on the left is smaller (lies lower in hierarchy) than a type on the right. There is also a notation for lower bound, >::

def recover[E, A, B >: A](
    either: Either[E, A])(f: E => B): Either[E, B] =
  either match {
    case Left(e)  => Right(f(e))
    case Right(a) => Right(a)
  }

recover[String, Admin, User](err: Either[String, Admin]) {
    _ =>
  fallback: Member
}
// Either[String, User]

These are the type constraints provided by the language. Besides them there are also generalized type constraints, which exist as type classes provided by the compiler (if the compiler cannot create them, it means that a type fails the constraints):

<:< - let us require that one type is a subclass of the other.

def upcast[A, B](set: Set[A])(
  implicit ev: A <:< B
): Set[B] = set.map(ev(_))
  
upcast[Member, User](Set(m: Member)) // Set[User]

It is defined inside a scala.Predef.

=:= - let us require that one is equal to the other.

def update[A, B](set: Set[A])(f: A => B)(
  implicit ev: A =:= B
): Set[B] = set.map(f)
  
val members: Set[Member]
  
update[Member, Member](members)(identity) // ok
update[Member, User](members) { member =>
  member: User
} // compilation error!

It is defined inside a scala.Predef as well.

<%< - already removed from Scala, but it used to denote that type A is either a subtype of B or could be implicitly converted to it.
=:!= - provided by shapeless. Proves that types are different.

If you require that there be a type class in scope for your type parameter:

def doSomething[F[_]: Monad]: F[String]

I met people who would also call implicit syntax for type classes a type constraint (I require of a type T to be a monad).

Variance

Type constraints are useful when we want to describe our expectations about a type parameter and/or the returned type. But there are use cases where they are not enough.

Let’s take a look at Option. The first implementation would be something like:

sealed trait Option[T] {}
case class Some[T](value: T) extends Option[T]
case class None[T]() extends Option[T]

It is not something we are used to. We don’t want to create a new None for each type. Actually, we want just one case object, which we could use as a valid Option[T] everywhere. However, if we tried to:

sealed trait Option[T] {}
case class Some[T](value: T) extends Option[T]
case object None extends Option[Nothing]

then, when we try to use None as Option[String] or Option[Int] we would get a type error.

def f(stringOpt: Option[String]): Unit
f(None) // compilation error!

The reason for that is that we defined the Option as an invariant parametric type. Invariance means, that even if B <: A then Option[B] still cannot be substituted for Option[A]. It is a default and a good one!

A long time ago, Java let us do such things with arrays. In Java, you can create Array[Member] and pass it as a valid parameter for a function expecting Array[User]. That function assumes that the array can contain anything as long as it’s a User, so it can insert an Admin into the array. Then you end up with a java.lang.ArrayStoreException. Suddenly you have a reference to an array of Members that contains an Admin - the JVM recognizes such a state as invalid. To avoid repetition of this problem, when Java introduced generics, it made them invariant without an option to change the behavior.

But the issue only appears if you share a mutable state. If you have an immutable object, you can, for example, pass List[Member] as List[User], Option[B] as Option[A], and you would never end up with invalid behavior (if we modified something, we would create a modified copy, so the original would still fulfill its requirements).

A situation in which we explicitly state that if B :> A then F[B] :> F[A] is called covariance. To mark a type parameter as covariant we use +:

sealed trait Option[+T] {}
case class Some[+T](value: T) extends Option[T]
case object None extends Option[Nothing]

Covariance propagates down, so if we made T in Option covariant, we will receive complaints if we don’t do the same for T in Some. Usually, when we mark the parameter as covariant and we extend the type as an object, e.g. an empty container (None, Nil, EmptySet), we use Nothing as a hardcoded type - as the bottom type it subtypes all types, so such an object can always be upcast. Additionally, it is a nice explanation that this container is empty, as it is literally a container of nothing.

(As we can guess, most immutable collections would be covariant, while most, if not all, mutable collections are invariant).

It is quite common to have a situation where we want to pass a value of a specific parametric type and generalize it. What about the opposite situation, when we have a producer of values of a parametric type and we want to specify it:

trait Consumer[T] {
  def consume(value: T): Unit
}

val members: List[Member]
def consumer: Consumer[User]
members.foreach(consumer.consume) // error!

The case where we want to say that A <: B implies F[B] >: F[A] is called contravariance, as it is an opposition of covariance. We denote it with a - sign:

trait Consumer[-T] {
  def consume(value: T): Unit
}

val members: List[Member]
def consumer: Consumer[User]
members.foreach(consumer.consume) // works

By analogy with covariance and the bottom type Nothing, Any (the top type) would be the type parameter of a consumer that consumes any value.

Covariance and contravariance are very important to functional programming in Scala. They allow for intuitive behavior for the central element of the function-as-a-first-class-citizen approach, which is the function itself:

trait Function[-A, +B] {
  def apply(a: A): B
}

It is obvious to us that a function that takes User should also take Member, and a function returning Member could be used as a substitute for a function taking User. This also shows us that invariance, covariance, and contravariance - known together as variance - describe a type parameter and not the whole type. Here A is a contravariant type parameter of Function while B is a covariant type parameter.

(Actually, Scala’s function definition looks a little bit different - and depends on its arity - but the principle holds. It also holds for a type derived from a function, a PartialFunction).

Type theory also allows bivariance - a case when the type is covariant and contravariant at the same time - but Scala doesn’t allow such a situation.

Existential types

We talked a bit about type parameters, how we can put constraints on them, and how they translate to a whole type being a super/subtype. But what should we do when we don’t care?

def count[T](seqs: Seq[T]*): Int = seqs.map(_.size).sum

count(Seq(1,2,3), Seq("test")) // 4

Here, we don’t actually use the type T at all. It should work for all Ts we pass (if we add some type constraints, then for all T that match them). So, we can describe this function with a universal quantifier:

\forall_T count: Seq_{Seq_T} \rightarrow int

Because we can describe it with a universal quantifier, we could call the type Seq[T] (and Seq[Seq[T]]) a universal type.

But what if our function count were part of another function or a class, and it was the only place where we needed T, so we would have to pass it down anyway? Or what if we wanted to store Sequences in a map but didn’t care about the type, because each of them would be of some other type? (ClassTag[T] -> Set[T]).

If this were Java, we could use ? to indicate that we don’t care about the type:

int count(java.util.List<?>... seqs) {
   return Arrays.stream(seqs).mapToInt(seq -> seq.size()).sum();
}

This ? tells us something like we know that there is some concrete T we could put here, but we don’t care about it at the moment. We don’t know any details about this T and, to be honest, we don’t care. All we need is the assumption that there is one. In math, we could express that with an existential qualifier:

seq: Seq_? \iff \exists_{T} seq: Seq_T

count: Seq_{Seq_?} \rightarrow int

By analogy, we could name such a type an existential type. Scala lets us express it in two ways:

def count(seqs: (Seq[T] forSome {type T})*): Int = 
  seqs.map(_.size).sum

forSome is a syntax existing solely for existential types, I believe. The other, shorter version is:

def count(seqs: Seq[_]*): Int = 
  seqs.map(_.size).sum

which looks dangerously similar to type constructor syntax. As a matter of fact, we can distinguish these two different usages merely by the context in which they appear: if F[_] appears in the definition of a type parameter it is a type constructor; if it is used as the type of a variable or parameter it is an existential type.

There are a few things that would be difficult to implement without existential types (fixed-point types in recursion schemes, I guess?), but in our everyday life, we don’t use them that often. During his work on Dotty, Odersky found out that some of their applications were among the things making the language unsound, which is why forSome notation is removed from Dotty.

When a mathematician says that some theory is sound, all formulas in the theory can be logically derived from the axioms and rules of the theory.

In the case of a type system, it means that if our type system decides that some expression has a type T, then if you run it, the returned value will also be of type T.

It might be surprising for some, but (besides null, which breaks most type systems that allow it and everybody is aware of that) there are a lot of cases when a language can be broken and forced to return a value of an invalid type.

F-bound types

The last type-parameter-related types I want to talk about are F-bound types. The name originates from the F-type system, which was the first one that allowed their existence. But what are they and what problem do they solve? Let’s look at an example:

trait Number {
  
  def +(number: Number): Number
  def *(number: Number): Number
}

class Int extends Number {
  
  def +(number: Number): Number
  def *(number: Number): Number
}

class Double extends Number {
  
  def +(number: Number): Number
  def *(number: Number): Number
}

val a: Int
val b: Int
val c = a + b // Number!

In our example, if we defined our interface without any type parameters, we would end up with a situation where any value of a specific Number would lose information about its specific type. Also, without any notice it would allow additions of unrelated implementations:

val a: Char // Char extends Number
val b: Double
val c = a + b // Number, but what Number exactly?

We might make our type Number parametric so that we could control the type of parameters and returned values:

trait Number[N] {
  
  def +(number: N): N
  def *(number: N): N
}

class Int extends Number[Int] {
  
  def +(number: Int): Int
  def *(number: Int): Int
}

class Double extends Number[Double] {
  
  def +(number: Double): Double
  def *(number: Double): Double
}

val a: Int
val b: Int
val c = a + b // Int!

The issue with operations on unrelated implementations also disappears:

val a: Char
val b: Double
val c = a + b // error!

However, now we have a new issue:

class User extends Number[Int] // why is that a thing?

What we need here is some sort of restriction where we require that a type parameter must be the type that extends the parameterized type. In Scala, we can do it in two ways:

using self-types:
```
trait Number[N] { self: N =>
  // ...
}
```
(we’ll talk more about self-types later on),
using type constraints:
```
trait Number[N <: Number[_]] {
  // ...
}
```
(Here Number[_] is an existential type, not a type constructor).

This definition relies on a certain limitation of Scala (and Java) - we cannot implement the same parametrized type with different parameters in one class. Therefore:
```
class Double extends Number[Double]
class X extends Number[Double] // fails - different type params in Numbers
class Y extends Number[String] // fails - string is not a Number
```

Both ways help us ensure that the implementation is type-recursive.

This is quite a popular way of solving the initial problem in languages that don’t support (or encourage) type classes. In Java, it is a built-in method of implementing Enums (enum X becomes abstract class X extends Enum[X]). C++ knows F-bound types under the name Curiously Recurring Template Pattern (CRTP).

This pattern is certainly less encouraged in languages that DO support type classes (and do not support classes and inheritance, which means half the FP languages).

Summary

In this post, we talked about type constructors, parameters, and constraints. As a matter of fact, at this point we have enough knowledge to model any program we like in a nice way.

However, there are some other concepts related to types in Scala that are worth mentioning. They aren’t improvements to concepts you might know from other mainstream languages, but they add a way of encoding some more information about the data in the type.

In next post, we will talk about them to fill the gaps and make sure the type system won’t have much to hide from us!

kubuszok.com