A more optimistic method may be to just enter the room (without locking). If another thread tries to enter the same room, it will likely notice that somebody else is already using the room and will roll back its execution (or return from the room, in the physical analogy). This is more optimistic because you assume that in the common case, it is unlikely that somebody will try to enter the room at the same time. Hence, you have avoided the overhead of having to acquire a lock. In software, this is done by enclosing the act of accessing the shared resource in a transaction. All accesses to the shared resource within a transaction are tentative. If, during these accesses, the thread notices that the same resource is being concurrently accessed by another thread (thus violating atomicity), the transaction is rolled back. Let's look at some examples.
Recall that the transfer() function was used to transfer a unit of money from one account to another. Here is the version of the code which used fine-grained locking.
void transfer(account *a, account *b) {
    /* Acquire the two locks in a fixed global order (by account_id)
       to avoid deadlock. */
    if (a->account_id < b->account_id) {
        acquire(&a->mutex);
        acquire(&b->mutex);
    } else {
        acquire(&b->mutex);
        acquire(&a->mutex);
    }
    if (a->money > 0) {
        a->money--;
        b->money++;
    }
    release(&a->mutex);
    release(&b->mutex);
}
In this code, which uses fine-grained locks, whenever we access a shared account, we always acquire the corresponding lock (mutex) before the access. In the common case, however, it is unlikely (though possible) that another thread will be accessing the same accounts at the same time. Yet we pay the overhead of lock acquisition every time!
It would be nice if we could structure our code as a transaction that can be rolled back if required, as follows:
void transfer(account *a, account *b) {
    int am, bm;
    int success;
    do {
        tx_begin();
        /* tentative reads of the shared variables */
        am = tx_read(a->money);
        bm = tx_read(b->money);
        if (am > 0) {
            am--;
            bm++;
        }
        /* atomically publish the new values, or fail and retry */
        success = tx_commit(am, &a->money, bm, &b->money);
    } while (!success);
}
Here tx_begin indicates the start of a transaction,
and tx_commit indicates the end of a transaction. At the
end, the commit operation atomically tries to update the values
of shared variables a->money (with value stored in local
variable am), and b->money (with value stored
in local variable bm). The commit succeeds if no concurrent transaction accessed these shared variables while the transaction was executing; otherwise it fails, and the loop retries the transaction.
Notice that to implement an atomic commit (and rollback if needed), the runtime system needs to track which locations were read and written. Also notice that this transaction mechanism is only useful if we expect the probability of a commit failure to be small: if rollbacks are frequent, the performance of this scheme may actually be worse than that of coarse-grained locks.
The probability of a commit failure depends on the size of the transaction and on the expected concurrency of this code region.
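To make the interface concrete, here is a minimal sketch of one way tx_begin, tx_read and tx_commit could be implemented in software for this two-variable case. It deliberately avoids per-location tracking by using a single global version counter, so any concurrent commit (even to unrelated accounts) forces a rollback. The names global_version, commit_mutex and start_version are purely illustrative, and the mutex type and acquire/release are the same hypothetical primitives used in the locking example above.

volatile unsigned global_version;        /* bumped on every successful commit */
struct mutex commit_mutex;               /* serializes all commits */
_Thread_local unsigned start_version;    /* version observed at tx_begin (C11 thread-local) */

void tx_begin(void) {
    start_version = global_version;
}

int tx_read(int val) {
    return val;          /* reads are tentative; they are validated only at commit */
}

int tx_commit(int aval, int *a, int bval, int *b) {
    int success = 0;
    acquire(&commit_mutex);
    if (global_version == start_version) {   /* nobody committed since tx_begin */
        *a = aval;
        *b = bval;
        global_version++;
        success = 1;
    }
    release(&commit_mutex);
    return success;
}

Because a single counter is used, this sketch rolls back more often than strictly necessary; a real implementation would track the individual locations read and written, which is exactly the bookkeeping cost discussed above.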
Transactions are very useful for databases, where the read and commit operations are performed on disk blocks. Because disk accesses are slow anyway, it is relatively cheap to maintain this information about which disk blocks were read or written (in memory). In the case of operating systems, however, maintaining such information for individual memory locations is much more expensive. More recently, processors (e.g., Intel's Haswell) have started providing hardware support for implementing transactions over memory accesses, also called transactional memory.
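On such processors, the transactional transfer can be expressed almost directly. The sketch below uses Intel's RTM intrinsics (_xbegin/_xend from <immintrin.h>, compiled with -mrtm); the account type and the fallback function transfer_locked (the fine-grained locking version shown earlier) are assumptions carried over from the examples above.

#include <immintrin.h>   /* _xbegin(), _xend(), _XBEGIN_STARTED; compile with -mrtm */

void transfer_tm(account *a, account *b) {
    if (_xbegin() == _XBEGIN_STARTED) {
        /* All memory accesses below are tentative; the hardware tracks the
           read and write sets and aborts on a conflicting access. */
        if (a->money > 0) {
            a->money--;
            b->money++;
        }
        _xend();          /* atomic commit */
    } else {
        /* Transaction aborted (conflict, capacity limit, interrupt, ...):
           fall back to the fine-grained locking version. */
        transfer_locked(a, b);
    }
}

Even without full transactional memory, most processors provide a much simpler atomic primitive: the compare-and-exchange (cmpxchg) instruction, whose behaviour can be described by the following C function: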
int cmpxchg(int *addr, int old, int new) {
    int was = *addr;
    if (was == old) {
        *addr = new;      /* install the new value only if nothing changed */
    }
    return was;           /* always return the value that was seen */
}
In instruction form, this can be written as:
cmpxchg Rold, Rnew, Mem
This instruction compares the value at memory location Mem with register Rold; if the values are equal, it replaces the value at Mem with Rnew, otherwise it does nothing. In either case, it returns the old value at location Mem in register Rold. This instruction is atomic if prefixed with the lock prefix.
For example, if we want to increment a shared variable hits
atomically using cmpxchg, we can write the code as:
int l_hits, l_new_hits;
retry:
    l_hits = hits;                   /* read the shared variable */
    l_new_hits = l_hits + 1;         /* compute the new value locally */
    if (cmpxchg(&hits, l_hits, l_new_hits) != l_hits) {
        goto retry;                  /* hits changed in between: try again */
    }
Here, we first read the shared hits variable into a local
variable l_hits. We perform some computation on
l_hits to obtain a new value l_new_hits. Now,
we wish to write the new value l_new_hits
to the shared variable hits, but only if the hits
variable has not been modified in between (i.e., between the first
read of hits and this point). To do this, we can use
the cmpxchg instruction
on the memory address of shared variable hits with the
old value as l_hits and the new value as l_new_hits.
If the atomic cmpxchg instruction returns that the current
read value of hits is still equal to its previous read
value (l_hits), the increment operation was performed atomically.
On the other hand, if the current read value of hits differs
from the previous read value, it indicates a concurrent access to the
hits variable, and so the operation needs to be
retried (or rolled back).
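This read-compute-compare-exchange loop is a common idiom, and it can be packaged into a small helper such as the (hypothetical) atomic_increment below, written against the C model of cmpxchg given earlier. On a real machine one would use a lock-prefixed cmpxchg instruction or a compiler builtin instead of the plain C function.

void atomic_increment(int *addr) {
    int old_val, new_val;
    do {
        old_val = *addr;             /* read the current value */
        new_val = old_val + 1;       /* compute the new value locally */
        /* try to install it; this fails if *addr changed in between */
    } while (cmpxchg(addr, old_val, new_val) != old_val);
}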
Similarly, our list insert example can be written using compare-and-swap (CAS) as follows:
void insert(struct list *l, int data) {
    struct list_elem *e;
    e = malloc(sizeof(struct list_elem));
    e->data = data;
retry:
    e->next = l->head;               /* tentatively link to the current head */
    /* a pointer-sized cmpxchg is assumed here */
    if (cmpxchg(&l->head, e->next, e) != e->next) {
        goto retry;                  /* head changed in between: try again */
    }
}
Once again, if two threads try to insert into the list concurrently,
one of them will execute the atomic cmpxchg instruction
first. Whichever thread executes cmpxchg
first will succeed, as it will see
the same current value of l->head as the one that it
previously read. The successful thread will also atomically update
the value of l->head to its new value (pointing to the inserted
element). The thread which executes cmpxchg second will
notice that the value of l->head has changed from
its last read value, and this will cause it to roll back and retry.
In many workloads, a shared data structure like this list is read far more often than it is written, and multiple readers can safely proceed in parallel as long as no writer is active. To capture this programming pattern, an abstraction called read-write locks can be used. Read-write locks are defined as follows:
struct rwlock rwlock;
void read_acquire(struct rwlock *);
void write_acquire(struct rwlock *);
void release(struct rwlock *);
The read_acquire function acquires the rwlock in
read mode, while the write_acquire function acquires
the rwlock in read/write mode.
The semantics are that an rwlock can be acquired in read
mode multiple times simultaneously. However, it can only be acquired
in write mode if it is not currently held in either read or write mode
by any other thread.
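To make these semantics concrete, here is a minimal sketch of one possible spinning implementation, built only from the mutex acquire/release operations used earlier (the struct mutex field stands for whatever mutex type those operate on). The field names, and the name rwlock_release (chosen here to avoid clashing with the mutex release), are illustrative; a real implementation would also avoid busy-waiting and writer starvation.

struct rwlock {
    struct mutex m;       /* protects the two fields below */
    int readers;          /* number of threads holding the lock in read mode */
    int writer;           /* 1 if some thread holds the lock in write mode */
};

void read_acquire(struct rwlock *rw) {
    for (;;) {
        acquire(&rw->m);
        if (!rw->writer) {            /* no writer: join the readers */
            rw->readers++;
            release(&rw->m);
            return;
        }
        release(&rw->m);              /* a writer holds it: spin and retry */
    }
}

void write_acquire(struct rwlock *rw) {
    for (;;) {
        acquire(&rw->m);
        if (!rw->writer && rw->readers == 0) {   /* nobody else holds it */
            rw->writer = 1;
            release(&rw->m);
            return;
        }
        release(&rw->m);
    }
}

void rwlock_release(struct rwlock *rw) {
    acquire(&rw->m);
    if (rw->writer)
        rw->writer = 0;               /* was held in write mode */
    else
        rw->readers--;                /* was held in read mode */
    release(&rw->m);
}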