Work stealing and mostly lock-free access

ParaSail uses work stealing to schedule the picothreads that make up a ParaSail program. With work stealing, there is a relatively small number of heavy-weight worker processes, roughly one per physical core/processor, each serving its own queue of picothreads (in a LIFO manner) and periodically stealing a picothread from some other worker process (using FIFO, so as to pick up a picothread that has been languishing on the other worker's queue). See the following blog entry for more discussion of work stealing:

http://parasail-programming-language.blogspot.com/2010/11/virtual-machine-for-parasail-with.html

What this means is that a worker's queue is referenced mostly by only one process, namely the owner of the queue.
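
As a rough illustration of the two access patterns, here is a minimal Python sketch of a per-worker queue. The class and method names (WorkerQueue, push, pop_own, steal) are hypothetical and are not taken from the ParaSail implementation; the sketch only shows the owner taking the newest picothread (LIFO) while a thief takes the oldest one (FIFO).

```python
from collections import deque


class WorkerQueue:
    """Hypothetical per-worker queue of picothreads (illustration only)."""

    def __init__(self):
        self._items = deque()

    def push(self, picothread):
        # The owning worker adds new picothreads at the right end.
        self._items.append(picothread)

    def pop_own(self):
        # Owner: LIFO, take the most recently added picothread.
        try:
            return self._items.pop()
        except IndexError:
            return None

    def steal(self):
        # Thief: FIFO, take the picothread that has waited longest.
        try:
            return self._items.popleft()
        except IndexError:
            return None


q = WorkerQueue()
for name in ["p1", "p2", "p3"]:
    q.push(name)

newest = q.pop_own()  # the owner gets the newest entry
oldest = q.steal()    # a stealing worker gets the oldest entry
print(newest, oldest)
```

The owner and the thief work at opposite ends of the queue, which is why contention between them is the exception rather than the rule.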

A similar situation arises in the region-based storage management used in ParaSail. A region is created when a new scope is entered, and most of the allocation and deallocation within a region is done by the worker that created the scope. But due to work stealing, some other worker might be executing a picothread that is expanding or shrinking an object associated with the region, so some synchronization is necessary in this case.

So what sort of synchronization should be used for these situations where most of the access to a resource arises from an owning worker process, but some of the access arises from other non-owning workers? We could use a traditional lock-based mutex all of the time, but this slows down the common case where all the access comes from the owner. We could use a general N-way lock-free synchronization, but this generally requires some kind of atomic compare-and-swap and involves busy waiting. Atomic compare-and-swap is not always available in a portable fashion at the high-level language level. Busy waiting, in turn, presumes that there are never more worker processes than there are available physical processors/cores, so that the current holder of the lock-free resource is actually making progress while other workers are busy-waiting.

So for ParaSail we are adopting a middle ground between fully lock-based synchronization and N-way lock-free synchronization, which recognizes the asymmetric nature of the problem, namely that one process, the owner, will be performing most of the references. With the adopted solution, we only need atomic load and store, rather than atomic compare-and-swap, and there is never any busy waiting, so we can run on top of, for example, a time-slicing operating system, where some worker processes might be preempted.

So what is the adopted solution? For a given resource, we have two flags which are atomic variables, one mutex, and a queue, named as follows:

Owner-wants-resource flag
Nonowner-wants-resource flag
Resource mutex
Nonowner-waiting queue

When the owner wants to use the resource:

Owner sets the owner-wants-resource flag atomically;
It then checks the nonowner-wants-resource flag:

  If nonowner-wants-resource flag is set:

    Owner calls the mutex lock operation;
    Owner manipulates the resource;
    Owner clears the owner-wants-resource flag;
    <<Check_Queue>> Owner then checks the nonowner-waiting queue:

      If the queue is empty, it clears the nonowner-wants-resource flag;
      If the queue is not empty, it wakes up one of the waiting nonowners;

    Owner calls the mutex unlock operation (note that this might be combined with the above waking up of one of the nonowners -- e.g. using a lock handoff).

  If nonowner-wants-resource flag is not set:

    Owner manipulates the resource;
    Owner clears the owner-wants-resource flag;
    Owner rechecks the nonowner-wants-resource flag:

      If nonowner-wants-resource flag is now set:

        Owner calls the mutex lock operation;
        Owner does the <<Check_Queue>> operation (see above);
        Owner calls the mutex unlock operation (note that this might be combined with the waking up of one of the nonowners by Check_Queue -- e.g. using a lock handoff).

When a nonowner wants to use the resource:

Nonowner calls the mutex lock operation;
Nonowner sets the nonowner-wants-resource flag atomically;
Nonowner checks the owner-wants-resource flag:

  While owner-wants-resource flag is set:

    Nonowner adds itself to the nonowner-waiting queue;
    When woken up, nonowner reacquires the lock (or receives it automatically via a handoff from the process that woke it);

Nonowner manipulates the resource;
Nonowner does the <<Check_Queue>> operation (see above);
Nonowner calls the mutex unlock operation (note that this might be combined with the waking up of another nonowner by Check_Queue -- e.g. using a lock handoff).
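
The owner and nonowner steps above can be sketched in Python roughly as follows. This is an illustrative sketch under stated assumptions, not ParaSail's implementation. The class name MostlyLockFree and its method names are invented here. Plain attribute reads and writes stand in for the atomic load and store, which leans on CPython's global interpreter lock; in a language with a weaker memory model, sequentially consistent atomics would be needed. The nonowner-waiting queue is modelled by a condition variable plus a count of waiters, and a woken nonowner stays counted until it holds the mutex again, which is one way to approximate the lock handoff.

```python
import threading


class MostlyLockFree:
    """Hypothetical guard for one resource: one owner thread, many nonowners."""

    def __init__(self):
        self.owner_wants = False     # owner-wants-resource flag
        self.nonowner_wants = False  # nonowner-wants-resource flag
        # Resource mutex together with the nonowner-waiting queue.
        self._waiters = threading.Condition(threading.Lock())
        self._num_waiting = 0        # nonowners queued, or woken but not yet running

    def _check_queue(self):
        # <<Check_Queue>>: the caller must hold the mutex.
        if self._num_waiting == 0:
            self.nonowner_wants = False
        else:
            self._waiters.notify()   # wake up one waiting nonowner

    def owner_access(self, manipulate):
        self.owner_wants = True
        if self.nonowner_wants:
            # A nonowner is interested: go through the mutex.
            with self._waiters:
                manipulate()
                self.owner_wants = False
                self._check_queue()
        else:
            # Common case: no mutex at all.
            manipulate()
            self.owner_wants = False
            if self.nonowner_wants:  # recheck, so no waiter is missed
                with self._waiters:
                    self._check_queue()

    def nonowner_access(self, manipulate):
        with self._waiters:
            self.nonowner_wants = True
            while self.owner_wants:
                self._num_waiting += 1
                self._waiters.wait()     # releases the mutex while asleep
                self._num_waiting -= 1   # the mutex is held again here
            manipulate()
            self._check_queue()


# Small stress run: one owner and three nonowners share a counter.
guard = MostlyLockFree()
state = {"busy": False, "count": 0, "overlaps": 0}


def manipulate():
    if state["busy"]:                # someone else is inside: a safety violation
        state["overlaps"] += 1
    state["busy"] = True
    state["count"] += 1
    state["busy"] = False


def run(access, times):
    for _ in range(times):
        access(manipulate)


threads = [threading.Thread(target=run, args=(guard.owner_access, 2000))]
threads += [threading.Thread(target=run, args=(guard.nonowner_access, 500))
            for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(state["count"], state["overlaps"])
```

If the guard behaves as intended, every one of the 3500 manipulations is counted and no overlap is recorded. Note that the owner's fast path touches only the two flags, so its cost in the uncontended case is two stores and two loads.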

How do we know this approach is safe? We need to prove that the resource is never manipulated simultaneously by the owner and a nonowner. This can only happen if the owner decides to not use the mutex, since otherwise the manipulation happens under protection of the mutex lock. We know that the owner sets the owner-wants-resource flag before checking the nonowner-wants-resource flag, and similarly a nonowner sets the nonowner-wants-resource flag before checking the owner-wants-resource flag. Therefore, if the owner decides to bypass the mutex while a nonowner is going after the resource simultaneously, the nonowner must not yet have checked the owner-wants-resource flag: had it already checked, it would also already have set the nonowner-wants-resource flag, and the owner would have seen that flag and taken the mutex. If the nonowner later reaches its check of the owner-wants-resource flag before the owner is done, it will find the flag set and put itself onto the queue rather than immediately manipulating the resource.
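
The heart of this argument can be checked mechanically. The sketch below assumes that flag loads and stores are observed in a single global order (sequential consistency). It enumerates every interleaving of the two handshakes, each of the form "set my flag, then read the other flag", and collects any interleaving in which both sides read the other flag as unset. The function names are hypothetical.

```python
from itertools import combinations


def interleavings():
    # Each side takes two steps; choose which 2 of the 4 slots belong to the owner.
    for owner_slots in combinations(range(4), 2):
        yield ["owner" if i in owner_slots else "nonowner" for i in range(4)]


def run(schedule):
    flags = {"owner": False, "nonowner": False}
    steps_done = {"owner": 0, "nonowner": 0}
    saw_other = {}
    for who in schedule:
        other = "nonowner" if who == "owner" else "owner"
        if steps_done[who] == 0:
            flags[who] = True              # step 1: set my own flag
        else:
            saw_other[who] = flags[other]  # step 2: read the other side's flag
        steps_done[who] += 1
    return saw_other


results = [run(schedule) for schedule in interleavings()]

# Interleavings in which BOTH sides would proceed without noticing the other.
both_bypass = [r for r in results if not r["owner"] and not r["nonowner"]]
print(len(results), len(both_bypass))
```

Under that assumption, no interleaving lets both sides miss each other, which is exactly what the safety argument relies on.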

How do we know this approach does not leave a nonowner waiting on the queue forever? We know the owner rechecks the nonowner-wants-resource flag after clearing the owner-wants-resource flag, so the owner will never miss the possibility that a nonowner queued itself while the owner was manipulating the resource.

So what does this approach accomplish? We see that the owner only uses a lock-based mutex when it bumps into a nonowner that is simultaneously manipulating the resource. On the other hand, a nonowner always uses a lock-based mutex, and in addition it uses a queue if it happens to bump into the owner simultaneously manipulating the resource. As mentioned above, this approach also avoids the need for atomic compare-and-swap, as well as avoiding the need for busy waiting.
