assembly - Spinlock with XCHG
The example spinlock implementation Wikipedia provides for x86, using the XCHG instruction, is:
; Intel syntax

locked:                      ; The lock variable. 1 = locked, 0 = unlocked.
     dd      0

spin_lock:
     mov     eax, 1          ; Set the EAX register to 1.

     xchg    eax, [locked]   ; Atomically swap the EAX register with
                             ;  the lock variable.
                             ; This will always store 1 to the lock, leaving
                             ;  the previous value in the EAX register.

     test    eax, eax        ; Test EAX with itself. Among other things, this will
                             ;  set the processor's Zero Flag if EAX is 0.
                             ; If EAX is 0, then the lock was unlocked and
                             ;  we just locked it.
                             ; Otherwise, EAX is 1 and we didn't acquire the lock.

     jnz     spin_lock       ; Jump back to the MOV instruction if the Zero Flag is
                             ;  not set; the lock was previously locked, and so
                             ;  we need to spin until it becomes unlocked.

     ret                     ; The lock has been acquired, return to the calling
                             ;  function.

spin_unlock:
     mov     eax, 0          ; Set the EAX register to 0.

     xchg    eax, [locked]   ; Atomically swap the EAX register with
                             ;  the lock variable.

     ret                     ; The lock has been released.
From here: https://en.wikipedia.org/wiki/Spinlock#Example_implementation
What I don't understand is why the unlock needs to be atomic. What's wrong with:
spin_unlock:
     mov     [locked], 0
The unlock does need to have release semantics to protect the critical section, but it doesn't need sequential consistency. Atomicity isn't really the issue (see below).
So yes, on x86 a simple store is safe for unlocking, and glibc's pthread_spin_unlock does exactly that (note that glibc's convention is inverted relative to the Wikipedia example: it stores 1 to mean unlocked):
     movl    $1, (%rdi)       ; AT&T syntax: plain store of 1 (= unlocked)
     xorl    %eax, %eax       ; return 0
     retq
Also see the simple, maybe-usable x86 spinlock implementation I wrote in an earlier answer, using a read-only spin loop with the pause instruction; a rough sketch of that pattern follows.
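A minimal sketch of that idea (not the exact code from that answer; it assumes the same locked variable and NASM syntax as above). The point is to spin read-only and only retry the atomic exchange when the lock looks free:

spin_lock:
     mov     eax, 1
     xchg    eax, [locked]     ; attempt to take the lock atomically
     test    eax, eax
     jz      .acquired         ; old value was 0: we now own the lock
.spin_wait:
     pause                     ; spin-loop hint to the CPU
     cmp     dword [locked], 0 ; read-only check: doesn't bounce the cache
                               ;  line between cores the way xchg does
     jnz     .spin_wait
     jmp     spin_lock         ; looks free: retry the atomic exchange
.acquired:
     ret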
Possibly the code was adapted from a bit-field version. Unlocking with btr to zero one flag in a bitfield isn't safe, because it's a non-atomic read-modify-write of the containing byte (or of the containing naturally-aligned 4-byte dword or 2-byte word).
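To illustrate the problem (a hypothetical flags variable, not from the original code):

     ; Hypothetical example: bit 3 of [flags] used as a lock flag.
     btr      dword [flags], 3  ; UNSAFE unlock: btr is a non-atomic
                                ;  load/modify/store of the whole dword, so a
                                ;  concurrent update to another bit of [flags]
                                ;  can be lost

     lock btr dword [flags], 3  ; safe: the LOCK prefix makes the
                                ;  read-modify-write atomic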
So maybe whoever wrote it didn't realize that simple stores to aligned addresses are atomic on x86, as on most ISAs. But what x86 has, that weakly-ordered ISAs don't, is that every store has release semantics. Using xchg to release the lock makes every unlock a full memory barrier, which goes beyond normal locking semantics. (Although on x86, taking the lock is a full barrier anyway, because there's no way to do an atomic RMW or atomic compare-and-swap without xchg or another lock-prefixed instruction, and those are full barriers like mfence.)
The unlocking store doesn't technically even need to be atomic, since we only ever store 0 or 1, and only the lower byte matters. E.g. I think it would still work if the lock was unaligned and split across a cache-line boundary. Tearing can happen but doesn't matter, because what's really happening is that the low byte of the lock is modified atomically, by operations that always put zeros into the upper 3 bytes.
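A minimal sketch of that point: since the upper bytes are always zero, even a byte store would be enough to unlock.

spin_unlock_byte:
     mov     byte [locked], 0  ; zeroing the low byte is sufficient; the
                               ;  upper 3 bytes of the dword are always zero
     ret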
If you wanted to return the old value to catch double-unlocking bugs, a better implementation would separately load and store:
spin_unlock:
     ;; pre-condition: [locked] is non-zero
     mov     eax, [locked]      ; load the old value, for debugging
     mov     dword [locked], 0  ; on x86, an atomic store with "release" semantics

     ;test   eax, eax
     ;jz     double_unlocking_detected  ; or leave this check to the caller

     ret