Kinds of types in Scala, part 3: embedding some more info in a type

In the previous post, we understood how parametric types work, which let cover most of the cases we’ll have in our everyday coding. However, there are some interesting concepts, which, while not so heavily used, can come handy at a time.

Structural and refined types

We know, that in math a class is a collection of objects, that fulfill some conditions and programming borrowed the idea. But, what if we wanted to specify these conditions without givinteg them a name? In JavaScript, you can define a type like:

type User = { name: string, surname: string }

By most OOP languages’ standards, it is not a class. But in JavaScript and Go (and in mathematics), this is an allowed method to create a collection of values by a predicate (has name property of type string and surname property of type string) - in other words a type.

Scala also allows us to define a type this way:

type User = {
  val name: String
  val surname: String
}

We don’t have to extends User to make a value conforming to the type - it is enough to fulfill the requirements:

case class Somebody(name: String, surname: String)

def needUser(user: User): Unit

needUser(Somebody("test", "test")) // works!

We can call User a structural type. It is an interesting way of implementing a type, one where compiler checks if object has all the right methods and fields instead of scanning the class hierarchy. A problem with such approach is that in runtime Scala has to use a runtime reflection to access these fields, so it comes with a performance penalty.

Structural types are not restrained to definitions using only vals and defs. We can demand that it extends some class as well:

trait X { val x: String }
type Y = { val y: Int }
val z: X with Y = new X { val x = "test"; val y = 0 }

So a compound type containing a structural type is also a structural type.

The example above shows us another interesting thing. If we asked a REPL about a type of:

new X { val x = "test"; val y = 0 }

the answer would be:

AnyRef with X{val y: Int}

Scala automatically adds the information about new fields and methods to the type! Actually, any time we add something extra to the interface, if don’t upcast the returned value, Scala would reflect these extra properties in a type. We call such types the refined types, but they are just a special case of structural types.

It appears refined types are not unique to Scala. When local type inference arrived in Java, it also started to support them, though in a half-baked way. You cannot declare a structural type, so such types can be only used via vars, which hold an information about refinement. This incomplete implementation can lead some Java programmers to thinking that such feature is dangerous.

Path-dependent types

Let’s say we want to model a card game:

case class Card(color: String, value: String)
case class Player(name: String)
class Game {
  
  def getPlayers: Set[Players] = ???
  
  def getCards: Set[Cards] = ???
  
  def playerPlayCard(player: Player, card: Card): Unit = ???
}

However, in our application there could be several games running at once, so we are at risk of messing up objects from different games:

val game1: Game
val game2: Game
game1.playerPlayCard(game2.getPlayers.head,
                     game2.getCards.head)

Could we make sure that we can only use objects from right Game and do it in compile time (assertions throwing at runtime might be a little too late for our needs)?

Path-dependent types are way of stating that type of one object, depends on another object. It is a limited version of more generic idea of dependent types. Actually, Dotty origins its name from Dependent Object Types calculus that Odersky designed to make future versions of Scala sound.

So, how could we define such type?

class Game {
  
  case class Card(color: String, value: String)
  case class Player(name: String)
  
  def getPlayers: Set[this.Player] = ???
  
  def getCards: Set[this.Card] = ???
  
  def playerPlayCard(player: this.Player, card: this.Card): Unit = ???
}

Let’s check what types will have values returned from game1 and game2:

val game1 = new Game
val game2 = new Game
game1.getPlayers // Set[game1.Player]
game2.getPlayers // Set[game2.Player]

We got different types, so we could expect, that we cannot substitute one for the other:

game1.playerPlayCard(game2.getPlayers.head,
                     game2.getCards.head) // fails!

To create a path-dependent type, all we need is to declare a type inside another type! It doesn’t have to be a class:

class X {
  type Y = String
  val y: Y = "y"
}

val x1 = new X
val x2 = new X

However, if it’s just a type alias, Scala can prove that 2 types are the same, and so it will not help us much:

def y(x: X)(y: x.Y): Unit = ()

y(x1)(x2.y) // no complaints: x1.Y = String = x2.Y

That is, unless we make it a little less obvious, that the types are equal:

trait X {
  type Y = String
  val y: Y = "y"
}

class X2 extends X

val x1 = new X2
val x2 = new X2

y(x1)(x2.y) // fails!

It is not obvious anymore because we could e.g. make type Y abstract:

trait X {
  type Y
  val y: Y
}

val x1 = new X {
  type Y = String
  val y: Y = "y"
}
val x2 = new X {
  type Y = Int
  val y: Y = 1
}

OK, but what if we wanted to lose our requirements at some point?

def takeAnyPlayer(p: ???): Unit

We need to indicate a way of passing path-dependent type without its the context if needed. Such ability is granted to us via #:

def takeAnyPlayer(p: Game#Player): Unit

At this point there is very little we know for certain about our p. Scala most won’t be able to tell what is the exact type even if it would be obvious to you from the code. If there were some type constraints about the type, that would be guaranteed, you can rely on it. But anything that comes with the specification of path-dependent type is lost:

trait X {
  type Y <: AnyVal { def toLong: Long }
  val y: Y
}

val x1 = new X {
  override type Y = Int
  override val y: Y = 1
}
val x2 = new X {
  override type Y = Long
  override val y: Y = 1L
}

Hmm, here our type Y is defined directly inside a X instance, so Scala should be able to guess it’s exact type. We can confirm that indeed it is:

x1.y * x1.y // 1: Int
x2.y * x2.y // 1: Long

Same about #:

scala> trait X { type Y }
scala> :kind (X { type Y = Int })#Y
// Int's kind is A

We can use path-dependent types as method return type (path-dependent methods). However, we are not allowed to create a function which uses path-dependent types. It will be introduced in Dotty under name dependent function types.

Kind projector/type lambda

At this point we have enough information to understand what was happening inside some weird construction we used during partial-application in a type constructor with multiple type parameters:

scala> :kind ({ type T[A] = Either[String, A] })#T
scala.util.Either[String,?]'s kind is F[+A]

We start with Either[_, _] and we want to partially apply String as the first type parameter. In order to do so, we:

  • create a type alias T[A] = Either[String, A], which creates a parametric type T[A] with one type parameter,
  • we put it inside a structural type to create an opportunity for creating a path-dependent type,
  • finally we extract T as a path-dependent type, such that Scala can tell its specific type exactly,
  • since we didn’t applied type parameters, we end up with a single parameter type constructor,
  • we achieved partial-application of a type parameters to a type constructor, aka type lambda or kind projection.

Quite often kind projectors use λ to denote the output type. Because the syntax is quite messy, you might prefer to use a dedicated compiler plugin that would generate it for you. With kind-projector compiler plugin you would need to write just:

Either[String, ?]

In Dotty, type lambdas are supported natively with syntax:

[T] => Either[String, T]

Kind projectors become more useful the more you dive into hardcore functional programming as promoted by Cats and Scalaz.

Self-types

Mixins. Sometimes we want to add some functionality to existing class and traits are a great way of achieving that:

class Producer {
  
  def produce: Int = 1
}

trait DoubleProducer extends Producer {
  
  override def produce: Int = super.produce * 2
}

trait IncreaseProducer extends Producer {
  
  override def produce: Int = super.produce + 1
}

class MyProducer extends Producer with DoubleProducer with IncreaseProducer

(new MyProducer).produce // 3

Thing is, if we do it that way, we cannot make Producer a trait or abstract class. Perhaps we would like to use it only as interface, and then decorate the implementation (one of many).

In Scala sometimes we need to refer to the current instance. Usually, we might use this, but if we nest type definitions this will only point to the one we are currently inside. It makes it hard to access the outer object. We can give a name to this to make it less ambiguous:

trait Outer { self =>
  
  trait Inner {
    self // Outer
    this // self.Inner
  }
}

However, we might use type ascription to put a constraints on this self value (or whatever we decide to name it):

trait Producer {
  
  def produce: Int
}

trait DoubleProducer { self: Producer =>
  
  override def produce: Int = super.produce * 2
}

As we can see, we no longer need to extend decorated type - type ascription already allow us to access the decorated type’s members. Additionally this self-type acts as a type constraint on any type, that would like to mix-in the trait.

class OneProducer extends Producer {
  def produce: Int = 0
}

class MyProducer extends OneProducer with DoubleProducer // OK
class Invalid extends User with DoubleProducer // error!

Self-types, might be used to enforce ordering of low-priority implicits:

object MyType extends MyTypeImplicits

trait MyTypeImplicits extends MyTypeLowPriorityImplicits {
  
  // implicits
}

trait MyTypeLowPriorityImplicits { self: MyTypeImplicits =>
  
  // low-priority implicits
}

That is not the only use case though. Some people use self-types as a way of type-safe dependency injection:

trait FirstServiceComponent {
  
  trait FirstService
  val firstService: FirstService
}

trait SecondServiceComponent {
  
  trait SecondService
  val secondService: SecondService
}
trait FirstServiceComponentImpl extends FirstServiceComponent {
    self: SecondServiceComponent =>
  
  val firstService = new FirstService {
    // use secondService
  }
}

trait SecondServiceComponentImpl extends SecondServiceComponent {
    self: FirstServiceComponent => 
  
  val secondService = new SecondService {
    // use firstService
  }
}

object Application
    extends FirstServiceComponentImpl
    with SecondServiceComponentImpl

Each component of such construct is called layer. When we compose layers together, we are baking a cake. This metaphor is an origin of the name of this patter - a cake pattern. Me and many other programmers discovered, that such bastardized mixin causes more troubles, that it solves. Problems with extensions, maintenance and initialization order I already described in another article.

Creating new types out of existing: tagged and value types

Sometimes we don’t want to create a completely new product with 2 or more fields, nor a sum type. We also don’t want to use type alias, which doesn’t create a distinct type. We might want something like Surname or Name which is like String, but with no risk of accidentally passing values in a wrong way (surname as name, name as surname).

One ways of handling this issues was designed by Miles Sabin for shapeless, the other one was provided by language authors.

Solution proposed by Miles Sabin works by tricking the compiler to believe, that our value is of the type it is not:

object tag {
  def apply[U] = new Tagger[U]

  trait Tagged[U]
  type @@[+T, U] = T with Tagged[U]

  class Tagger[U] {
    def apply[T](t : T) : T @@ U = t.asInstanceOf[T @@ U]
  }
}

To understand this, let us see an example:

sealed trait Name
sealed trait Surname

val name = tag[Name]("John") // String @@ Name
val surname = tag[Surname]("Smith") // String @@ Surname

This @@ construct takes 2 types: our initial type, one of which we provide a value, and another type called tag, which will serve as a way of making a newly created type unique. We can always demand a type String with Tagged[Name] (or String @@ Name if you prefer) - the fact that String is final is checked only during class declaration and the mismatch between String with Tagged[Name] could only be found in runtime.

Meanwhile, the only methods we will have declared will be methods from String - which our value surely is. JVM will never have an opportunity to complain, so as long as compiler is OK about it, it will work. It appears, that .asInstanceOf[String @@ Name] is enough to silence it.

Tagged types, require that we will define a trait/class that will serve as tags. They can quite often be sealed traits that you won’t use anywhere else. As long as you have the tags, you can easily lift any type to a tagged type. In runtime, they will be untagged though, which might be advantage or disadvantage depending on situation (runtime reflection!). In compile time they will create different types, so you have to take that into consideration if you are using any type-level programming - type classes provided for primitives won’t work with tagged types (there is no one-standard for tagging) and you will have to handle it yourself.

To provide a standard solution to the mentioned problem Scala came up with value types. It is basically a class extending AnyVal with one val member.

// for convenience people prefer to use case classes for it
case class Name(value: String) extends AnyVal

// though it is not required
class Surname(val value: String) extends AnyVal
object Surname {
  def apply(value: String): Surname = new Surname(value)
}

Since there are few restrictions imposed on this class, the compiler has a lot of opportunities to optimize the wrapper away. Since it is a standard solution, many libraries handle it out of the box. The disadvantage it that the optimization doesn’t work reliably, so e.g. mocking is almost completely broken (because sometimes the value is boxed and sometimes wrapper is optimized away).

As a some sort of middle-ground a newtype library was designed.

@newtype case class Name(value: String)

It lets you declare your new type as a case class (which helps with IDE support), while macro annotation rewrites it into:

type Name = Name.Type
object Name {
  type Repr = String
  type Base = Any { type Name$newtype }
  trait Tag extends Any
  type Type <: Base with Tag

  def apply(x: String): Name = x.asInstanceOf[Name]

  // here AnyVal is used as extension methods optimization
  implicit final class Ops$newtype(val $this$: Type) extends AnyVal {
    def value: String = $this$.asInstanceOf[String]
  }
}

Phantom and literal types

sealed traits for each tag would pollute the JVM with unused interfaces. Could that be avoided? Well, it is planned.

There is a pattern for implementing something what is basically a type-level state machine.

sealed trait Switch
sealed trait On extends Switch
sealed trait Off extends Switch

class Bulb[S <: Switch] {
  
  def turnOn(implicit ev: S =:!= On): Bulb[On] =
    new Bulb[On]
  
  def turnOff(implicit ev: S =:!= Off): Bulb[Off] =
    new Bulb[Off]
}

val bulb = new Bulb[Off]
bulb.turnOff // error!
bulb.turnOn // Bulb[On]
bulb.turnOn.turnOn // error!
bulb.turnOn.turnOff // Bulb[Off]

Because Switch, On and Off are not used for anything else than indicating the state with type, they are named phantom types. Currently, it is just a pattern, but Dotty will let us mark type as an erased term, which makes it exist only in compile time.

Another type-level specific thing, that will arrive - this time with Scala 2.13 - is a literal type. With libraries like shapeless you can express e.g. case class as a HList. Thing is, it only stores information about types and their positions. There is no information about properties names, so generating things like JSON codecs is impossible. Or would be if not for LabelledGenerics, which return HList of tuples of Witness-types. Wait, what is a Witness? Let’s try to figure out from an example:

case class Test(a: String, b: Int)

def getGeneric[T, U](
  t: T)(implicit gen: Generic.Aux[T, U]): U = gen.to(t) 
def getLabelledGeneric[T, U](
  t: T)(implicit gen: LabelledGeneric.Aux[T, U]): U = gen.to(t)

getGeneric(test) // String :: Int :: HNil = "xyz" :: 1 :: HNil
getLabelledGeneric(test) 
// labelled.FieldType[Symbol @@ a, String] ::
//   labelled.FieldType[Symbol @@ b, Int] ::
//   ops.hlist.ZipWithKeys.hnilZipWithKeys.Out =
//     "xyz" :: 1 :: HNil

Interestingly, the LabelledGeneric created an instance, which stores a name of the field it originated from! (FieldType works the same way as tagged types - no extra information in runtime, but extra data in compile time).

It appears that compiler can create a singleton type of just one value. Since it is defined by its literal it is also known as a literal type. But even if compiler knows such types, it doesn’t let us create one… without some hacking.

Witness is a helper trait which implementation is provided by a shapeless macro:

trait Witness extends Serializable {
  type T
  val value: T {}
}

object Witness extends Dynamic {
  type Aux[T0] = Witness { type T = T0 }
  type Lt[Lub] = Witness { type T <: Lub }

  implicit def apply[T]: Witness.Aux[T] =
    macro SingletonTypeMacros.materializeImpl[T]

  implicit def apply[T](t: T): Witness.Lt[T] =
    macro SingletonTypeMacros.convertImpl
  
  // ...
}

This macros lets us currently access the literal types, which is so far accessible only via compiler internals. But with Scala 2.13 they will become available without costly hacks in the language.

Summary

In this article, I intended to talk a bit about types in Scala from the perspective of type and set theories, but also explain certain idiosyncrasies of Scala when it comes to implementing them.

We can see that Scala has quite powerful type system which is only going to get better. Even on its own, it is enough to give Scala a competitive edge over many other languages. It allowed development of many features which would be otherwise hard to implement, fragile or impossible.

If someone was looking for a compiled version of posts 1-3, it can be found here.