Sets

# Sets

A Set is a data type used to store a collection of elements. There is no order on the collection, and all the elements must be distinct (so you cannot have several copies of the same element in the set). In other words, a Set is the Kotlin implementation of a set in mathematics.

Sets appear naturally in many problems: Correctly spelled words form a set. Prime numbers form a set.

Of course, we could simply use a list (or an array) to store a collection of items that forms a set. But this is not natural and often not efficient: A list is an indexed sequence, where the elements have an ordering, the same element can appear several times, and searching for an element can only be done by looking at all the elements one by one.

Fortunately, the Java standard library used by Kotlin provides for us a very nice Set data type. More precisely, Set is a parameterized data type, so there is a Set<Int>, Set<String>, and so on.

Let's create some sets:

>>> val s = setOf(2, 3, 5, 7, 9)
>>> s
[2, 3, 5, 7, 9]
>>> val w = setOf("CS109", "is", "wonderful")
>>> w
[CS109, is, wonderful]
>>> val e = setOf<String>()
>>> e
[]
>>> val e1 = emptySet<String>()
>>> e
[]

Note that for the empty set you have to indicate the type of elements, since Kotlin cannot deduce it from the elements themselves. There is also the special function emptySet, which gives a bit mor clarity.

You can also create sets by converting other collections (lists, arrays, ranges) using their toSet() method.

Remember that it doesn't matter in which order you write the elements, and an element cannot appear more than once:

>>> val s2 = setOf(9, 9, 5, 7, 3, 5, 3, 2)
>>> s == s2
true
>>> val w2 = setOf("wonderful", "is", "CS109")
>>> w == w2
true


The + and - operators can be used to add elements to a set, and to remove elements from a set. The result is a new set. It is okay to add elements already in the set, and to remove elements not in the set.

>>> s + 11
[2, 3, 5, 7, 9, 11]
>>> s - 7
[2, 3, 5, 9]
>>> s - 6
[2, 3, 5, 7, 9]
>>> s + 7
[2, 3, 5, 7, 9]
>>> s + 7 == s
true


The standard mathematical operations of union, intersection, difference, and containment can be implemented using Set methods:

• $$|s|$$ is s.size
• $$s \cup t$$ is s + t
• $$s \setminus t$$ is s - t
• $$x \in s$$? is s.contains(x) or x in s
• $$s \subseteq t$$? is t.containsAll(s)
• $$s \cap t$$ is s.intersect(t).

Here are some examples:

>>> val a = (1 .. 10).toSet()
>>> a
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> val b = (1 .. 10 step 2).toSet()
>>> b
[1, 3, 5, 7, 9]
>>> val c = (1 .. 5).toSet()
>>> c
[1, 2, 3, 4, 5]
>>> a.containsAll(b)
true
>>> b.containsAll(a)
false
>>> a.containsAll(c)
true
>>> a - b
[2, 4, 6, 8, 10]
>>> b + c
[1, 3, 5, 7, 9, 2, 4]
>>> b.intersect(c)
[1, 3, 5]


Sets support many other operations, some of which you are already familiar with from lists, for instance:

• s.size is the size of the set;
• s.isEmpty() is the same as s.size == 0;
• s.isNotEmpty() is the same as s.size != 0;
• s.max(), s.min(), s.sum() return the largest, smallest, and sum of elements in the set;
• s.sorted() returns a list with the elements of s in sorted order;
• s.joinToString() returns a string with all elements of s concatenated together (with the same options as for lists);
• s.toList() and s.toMutableList() return a (mutable) list with the same elements as the set.

As a first example of using sets, here is a simple spell checker. It makes use of the file words.txt, containing 113809 English words, one per line.

We read the file and immediately convert it to a set. Then we allow the user to enter words from the terminal. We check if the word is in the set of correctly spelled words, and report this back (spell.kts).

import org.otfried.cs109.readString

val fname = "words.txt"

val words = java.io.File("words.txt").useLines { it.toSet() }

while (true) {
val w = readString("Enter a word> ").trim()
if (w == "")
break
if (w in words)
println("$w is a word") else println("Error:$w is not a word")
}

Here is an example run:
$kts spell.kts Enter a word> lovely lovely is a word Enter a word> lovly Error: lovly is not a word Enter a word> wierd Error: wierd is not a word Enter a word> weird weird is a word Enter a word> supercede Error: supercede is not a word Enter a word> supersede supersede is a word Enter a word>  You could also use sets to measure the similarity between two texts (for instance to classify documents on the web). Consider the set of words of each text, and compare the size of their union with the size of their intersection. #### Mutable sets All the sets we have looked at above are immutable: There is no way to change the contents of a Set. Set operations like union and intersection actually return a new set object. Sometimes it is more efficient or convenient to use a mutable set (but remember that these are more dangerous to use). The Java standard library provides a mutable set data type, called MutableSet in Kotlin. We add elements with add, and remove them with remove: >>> val s = mutableSetOf(1, 2, 3, 4) >>> s [1, 2, 3, 4] >>> s.add(9) true >>> s [1, 2, 3, 4, 9] >>> s.add(13) true >>> s [1, 2, 3, 4, 9, 13] >>> s.remove(2) true >>> s [1, 3, 4, 9, 13] >>> s.remove(12) false >>> s [1, 3, 4, 9, 13]  A classic application of sets is the Sieve of Erathosthenes to compute prime numbers. Here is one implementation (sieve.kts): fun sieve(n: Int): Set<Int> { var s = (2 .. n).toMutableSet() val sqrtn = Math.sqrt(n.toDouble()).toInt() for (i in 2 .. sqrtn) { if (i in s) { var k = i * i while (k <= n) { s.remove(k) k += i } } } return s } val num = if (args.size == 1) args[0].toInt() else 1000 val primes = sieve(num) for (i in primes) print("$i ")

println()

And the output of a run:
\$ kts sieve.kts 500
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181
191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277
281 283 293 307 311 313 317 331 337 347 349 353 359 367 373 379 383
389 397 401 409 419 421 431 433 439 443 449 457 461 463 467 479 487
491 499


There are many other applications of sets. For instance, when finding your way in a maze, you could store the positions already seen in a set. Similarly, when implementing a computer game, you could store the game positions already evaluated in a set.

 Sets