# WARNING: DATA RACE

Nicholas Ng

Monzo

### Go and concurrency

• Goroutines = go + function call
• Why don't we make everything goroutines?

What can possibly go wrong?

2

### Implement withdraw money

```// +build ignore

package main

import (
"fmt"
"math/rand"
"runtime"
"sync"
"time"
)

var wg sync.WaitGroup

func init() {
}

func (acc *bankAccount) doWithdraw(amount int) {
defer wg.Done()
time.Sleep(time.Duration(rand.Intn(10)) * time.Microsecond)
runtime.Gosched()
acc.Withdraw(amount)
}

func (acc *bankAccount) String() string {
if acc.Balance < 0 {
return fmt.Sprintf("-£%d", -acc.Balance)
}
return fmt.Sprintf("£%d", acc.Balance)
}

func (acc *bankAccount) PrintBalance() {
wg.Wait()
fmt.Println()
fmt.Println("Balance:", acc.String())
}

```
```type bankAccount struct{ Balance int }

func (acc *bankAccount) Withdraw(amount int) {
if bal := acc.Balance; bal >= amount { // if has enough money
fmt.Printf("✓ Balance = %d\tWithdrawing %d\n", bal, amount)
acc.Balance -= amount
return
}
fmt.Printf("✗ Balance = %d\tCan't withdraw %d\n", acc.Balance, amount)
}

func main() {
acc := &bankAccount{Balance: 10}
go acc.doWithdraw(11) // impossible
go acc.doWithdraw(5)
go acc.doWithdraw(4)
go acc.doWithdraw(7)

// .. wait for goroutines to complete
acc.PrintBalance() // Print final balance
}
```
3

### What went wrong

```func (acc *bankAccount) Withdraw(amount int) {
if ① acc.Balance >= amount { // if has enough money
③ acc.Balance = ② acc.Balance - amount
}
}```
`go Withdraw(5)` `go Withdraw(4)` `go Withdraw(7)`
③ write balance (5) read balance (5)
(not enough money)
4

### What went wrong

```func (acc *bankAccount) Withdraw(amount int) {
if ① acc.Balance >= amount { // if has enough money
③ acc.Balance = ② acc.Balance - amount
}
}```
`go Withdraw(5)` `go Withdraw(4)` `go Withdraw(7)`
write balance (5)
write balance (1) read balance (1) 🤔
write balance (-6) ❌

Also, output is non-deterministic: different behaviour depending on run

5

### How to fix it

Channels

• One goroutine 'owns' balance
• Other goroutines send message over channel to update balance

Mutex locks

• `Lock()` → read and update balance → `Unlock()`

Don't write concurrent code

• No problem if nothing is shared!

But why?

6

### Meet Leslie

Leslie Lamport

Known for distributed systems: Paxos, TLA+, LaTeX, etc.

7

### Ordering of events

For example, A "happened before" B (written AB)

We use it to talk about the ordering of events (hint: read/write of shared memory)

8

### Sequential Consistency

The concurrent execution is consistent with (or appears to match) sequential execution

A very desirable property (and strong guarantee)

• e.g. You can reason about a concurrent program like a sequential program
9

### 🏁 Data race: a strange kind of race

In a data race, the last thread/goroutine to write "wins" 🥇 (but read is ok)

Visualise with a happens-before graph

W(5): RRW abbreviates:

Within a goroutine, happens-before is sequential (source code order)

10

### ❌🏁 No data race

Synchronising with channels, mutex locks to avoid races is like match fixing

• Events may appear in different order, but always total order
• You can always find an equivalent sequential trace

All possible executions give same result as a sequential -- sequential consistent!

11

### Go Dynamic Race Detector

Run/build with race detector on

```go run -race main.go
go build -race main.go```

Test with race detector on

`go test -race ./...`
12

### Go Dynamic Race Detector

• Analyse happens-before graph as you execute

"go test -race" Under the Hood, Kavya Joshi

13

### Summary

• Using concurrency correctly, as always, is hard
• Detect data races 🐞 by analysing happens-before graph of memory accesses
• Test your code extensively with `go test -race`

Is extensive testing (including fuzzing) the only way? (hint: no)

There are many more interesting related topics (memory model, synchronisation)

14

## Extra slides

15

### The Go Memory Model

Programs that modify data being simultaneously accessed by multiple goroutines must serialize such access.

To serialize access, protect the data with channel operations or other synchronization primitives such as those in the sync and sync/atomic packages.

If you must read the rest of this document to understand the behavior of your program, you are being too clever.

Don't be clever.

• Defined in terms of happens-before relation
• Opportunities for optimisation and Go in relaxed memory hardware architectures
• SC-DRF: Sequential consistent if program is data-race free
16

### Potentials for static analysis

The Go Memory Model definition can be used as blueprint for type-based analysis

Operational semantics of a weak memory model with channel synchronization
Fava, Steffen, Stolz, JLAMP (2019)

A Static Verification Framework for Message Passing in Go using Behavioural Types
Lange, Ng, Toninho, Yoshida, ICSE (2018)

So, maybe?

17

### Benign Data Races

• Code with data races but doesn't matter is a myth
• If data races exist, program can do anything (not sequential consistent)
• Program with data race → anything can happen!
18

### Thank you

Nicholas Ng

Monzo

Use the left and right arrow keys or click the left and right edges of the page to navigate between slides.
(Press 'H' or navigate to hide this message.)