SetsProgramming Practice (CS109)Classes and objects IImmutable and mutable objects, references and the heap

Immutable and mutable objects, references and the heap

An object whose state cannot change after it has been constructed is called immutable (unchangeable). The methods of an immutable object do not modify the state of the object. In Scala, all number types, strings, and tuples are immutable. The classes Point, Date, Student, and Card we defined above are all immutable.

Indeed, if we try to change the coordinates of a Point, we get an error:

scala> p.x = 7
<console>:10: error: reassignment to val
In other words, once a Point object has been created, its fields cannot be modified.
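
Since the fields of an immutable object cannot change, a method that needs a "modified" object returns a new one instead. The following is a minimal sketch of this idea; the Point class here is a simplified stand-in for the one defined earlier, and the method shifted is a hypothetical addition used only for illustration:

```scala
// Simplified stand-in for the Point class of this section.
// The method shifted is hypothetical, added only for this example.
case class Point(x: Int, y: Int) {
  // Returns a new Point; the fields of this object are left untouched.
  def shifted(dx: Int, dy: Int): Point = Point(x + dx, y + dy)
}

object ImmutableDemo {
  val p = Point(3, 5)
  val q = p.shifted(4, 0)   // a new object, p is unchanged

  val s = "hello"
  val t = s.toUpperCase     // strings behave the same way: s is unchanged

  def main(args: Array[String]): Unit = {
    println(p)  // Point(3,5)
    println(q)  // Point(7,5)
    println(s)  // hello
    println(t)  // HELLO
  }
}
```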

It is possible to define a mutable case class: we need to put the var keyword in front of the field names:

scala> case class MPoint(var x: Int, 
                         var y: Int)
defined class MPoint
scala> val p = MPoint(3,5)
p: MPoint = MPoint(3,5)
scala> p.x = 7
p.x: Int = 7
scala> p
res3: MPoint = MPoint(7,5)
Note that we could change the \(x\)-coordinate of the point p even though we defined p as a val-variable. Remember that this only means that the name p will always refer to the same object. It is possible to change the fields inside this object.

Mutable objects can lead to tricky mistakes. Consider the following code:

scala> val p = MPoint(3, 5)
p: MPoint = MPoint(3,5)
scala> val q = p
q: MPoint = MPoint(3,5)
scala> q.x = 7
q.x: Int = 7
scala> q
res10: MPoint = MPoint(7,5)
What is the value of p at this point? Surprisingly, p has changed as well:
 
scala> p
res11: MPoint = MPoint(7,5)

Arrays are of course mutable, and so the same effect can appear for arrays:

scala> val A = Array(1, 2, 3, 4)
A: Array[Int] = Array(1, 2, 3, 4)
scala> val B = A
B: Array[Int] = Array(1, 2, 3, 4)
scala> A(2) = 99
scala> B
res1: Array[Int] = Array(1, 2, 99, 4)
(Note again that even though we have defined A as a val variable, it is possible to change the contents of A.)
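
If an independent array is wanted instead of a second name for the same array, one option is to copy it explicitly. The sketch below uses the standard clone method of arrays; the object name ArrayAliasDemo and the variable names are chosen only for this example:

```scala
object ArrayAliasDemo {
  val a = Array(1, 2, 3, 4)
  val b = a           // b is another name for the same array
  val c = a.clone()   // c is a new array with the same contents

  a(2) = 99           // changes the one array that a and b share

  def main(args: Array[String]): Unit = {
    println(b.mkString(", "))  // 1, 2, 99, 4
    println(c.mkString(", "))  // 1, 2, 3, 4
  }
}
```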

References and the heap

Why does this happen? To understand this, we need to understand how variables store objects.

All Scala objects are stored in an area of the Scala runtime system called the heap. Objects cannot exist anywhere else.

A variable is just a name for an object on the heap. You can think about a variable as a reference to the object on the heap. The reference uniquely indicates the object on the heap. (If you learnt C, you can think about this reference as a pointer. In reality it may not really be a memory address.)

An assignment operation (as in val q = p or val B = A above) creates a new name for an object on the heap. p and q are in fact two different names for the same MPoint object, and A and B are two names for the same array object:

[Figure: Objects with several names]

This problem can never happen for immutable objects, and so it is preferable to use immutable objects whenever that is possible.
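
To check whether two names refer to the same object on the heap, Scala provides the eq method, which compares references rather than contents. The sketch below also uses the copy method that every case class has, to create a separate object. MPoint is repeated from above so that the example is self-contained; the object name AliasDemo is an assumption of this example:

```scala
// The mutable case class from earlier in this section.
case class MPoint(var x: Int, var y: Int)

object AliasDemo {
  val p = MPoint(3, 5)
  val alias = p               // a second name for the same heap object
  val independent = p.copy()  // a new object with the same field values

  alias.x = 7                 // visible through p: alias and p are one object
  independent.y = 0           // changes only the copy

  def main(args: Array[String]): Unit = {
    println(p)                 // MPoint(7,5)
    println(p eq alias)        // true: same object
    println(p eq independent)  // false: different objects
    println(independent)       // MPoint(3,0)
  }
}
```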

null

A variable can also have the value null, which means that it does not reference any object.

Calling a method on a variable whose value is null fails with a NullPointerException: since there is no object, there is nothing to call the method on. It is, however, allowed to compare a null value against any other value.

scala> var m: String = null
m: String = null
scala> m.substring(1,5)
java.lang.NullPointerException
scala> m.toString
java.lang.NullPointerException
scala> m == "CS109"
res0: Boolean = false
scala> m == null
res3: Boolean = true
scala> m != null
res4: Boolean = false
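
Because comparing against null is always allowed, a method call can be guarded by such a comparison. A minimal sketch, where safeLength is a hypothetical helper written only for this example:

```scala
object NullDemo {
  // Hypothetical helper: the length of s, or 0 if s does not reference any object.
  def safeLength(s: String): Int =
    if (s == null) 0 else s.length

  def main(args: Array[String]): Unit = {
    val m: String = null
    println(safeLength(m))        // 0
    println(safeLength("CS109"))  // 5
  }
}
```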

For efficiency reasons, variables of the types Int, Byte, Short, Long, Double, Float, Char, Boolean, and Unit cannot be null.

When you create an array, all the slots of the array are initialized to null. The only exceptions are arrays of the basic types above, whose slots are initialized to zero (or false):

scala> var a = new Array[String](10)
a: Array[String] = Array(null, null, null, null, null, null, null, null, null, null)
scala> var p = new Array[Point](10)
p: Array[Point] = Array(null, null, null, null, null, null, null, null, null, null)
scala> var b = new Array[Int](10)
b: Array[Int] = Array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
scala> var d = new Array[Double](10)
d: Array[Double] = Array(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
scala> var bol = new Array[Boolean](10)
bol: Array[Boolean] = Array(false, false, false, false, false, false, false, false, false, false)
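
One way to avoid null slots is to give every slot an initial value when the array is created. The sketch below uses the standard Array.fill method for this; the object name FillDemo and the variable names are assumptions of this example:

```scala
object FillDemo {
  val blanks = new Array[String](3)    // every slot is null
  val filled = Array.fill(3)("")       // every slot is the empty string

  // Safe: no slot of filled is null, so calling length cannot fail.
  val lengths = filled.map(_.length)

  def main(args: Array[String]): Unit = {
    println(blanks.count(_ == null))   // 3
    println(lengths.mkString(", "))    // 0, 0, 0
  }
}
```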

Local variables

Now that we know that all objects live in the heap, you may wonder where the variable names, that is, the references, are stored.

A reference that is a field of an object (or a slot in an array) is stored inside that object in the heap.

Most other references are local variables of some function or method. They are stored inside a piece of memory called the activation record or stack frame of the function. The activation record is created automatically each time the function is called. For instance, this function

def test(m: Int): Unit = {
  val k = m + 27
  val s = "Hello World"
  val A = Array( s.length, k, m )
}
has four local variables, namely m, k, s, and A. (The parameters of a method are local val variables, with the only difference that the runtime system automatically copies the argument value into the variable when the method is called.)

The following shows the activation record of test and the heap when test(13) is called, just before the function returns:

[Figure: Result of test(13)]
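
One consequence of copying the argument value into the parameter is that, for objects, only the reference is copied. The function and its caller then hold two references to the same object on the heap. A minimal sketch, where setFirst is a hypothetical function written only for this example:

```scala
object ParamDemo {
  // Hypothetical function: arr is a local val in the activation record of
  // setFirst, holding a copy of the caller's reference to the array.
  def setFirst(arr: Array[Int], value: Int): Unit = {
    arr(0) = value   // modifies the array object on the heap
    // arr = new Array[Int](3)   // would not compile: reassignment to val
  }

  def main(args: Array[String]): Unit = {
    val data = Array(1, 2, 3)
    setFirst(data, 42)
    println(data.mkString(", "))  // 42, 2, 3
  }
}
```

The change is visible to the caller because data and arr are two names for the same array, exactly as for p and q above.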

Garbage collection

Scala objects are garbage collected: when the runtime system runs low on memory, it checks the objects on the heap. An object that no longer has any reference pointing to it can no longer be used, and so it is deleted. It is hard to predict when garbage collection will happen. If you only run a small program, probably no garbage collection occurs at all.

Garbage collection frees the programmer from worrying about memory management. Other languages do not provide automatic garbage collection. For example, in C++ the programmer is responsible for memory management. It is common for C or C++ programs to contain mistakes where objects are created but never destroyed, and so more and more unused and unusable objects fill up the heap. Such a program is said to contain a memory leak.

Arrays

Arrays are somewhat special objects: they are the only kind of object that allows you to store an unbounded amount of information in a single object. (Scala provides many other classes to store large amounts of data efficiently and more conveniently than with arrays. But all of those classes are implemented internally using arrays or a large number of small objects.)

Arrays have a fixed size, and so we have to know the size of the array when we create the array object. If we want to put more objects in an array than the original array size allows, we need to create a new array and copy the data from the old array to the new one. The array methods ++ and :+ do this for us. The code below shows how we could do it manually:

  var A = new Array[Int](10)
  // .... many computations ...
  // now we need more space in A
  val B = new Array[Int](20)
  for (i <- 0 until A.length)
    B(i) = A(i)
  A = B
  // old A will be garbage-collected
Note that to do this "trick", we had to declare A as a var variable, otherwise the assignment A = B would not have worked.
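
For comparison, the following sketch shows the array methods :+ and ++ mentioned above, together with the standard Array.copy helper as a shorter form of the manual loop. The object name GrowDemo and the variable names are assumptions of this example:

```scala
object GrowDemo {
  val a = Array(1, 2, 3)
  val b = a :+ 4             // a new array: the elements of a, then 4
  val c = a ++ Array(5, 6)   // a new array: the elements of a, then 5 and 6

  // Manual growth: allocate a larger array, then copy the old elements over.
  val bigger = new Array[Int](6)
  Array.copy(a, 0, bigger, 0, a.length)

  def main(args: Array[String]): Unit = {
    println(a.length)               // 3: a itself has not grown
    println(b.mkString(", "))       // 1, 2, 3, 4
    println(c.mkString(", "))       // 1, 2, 3, 5, 6
    println(bigger.mkString(", "))  // 1, 2, 3, 0, 0, 0
  }
}
```

In each case the original array a keeps its size; what changes is which array a name refers to.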