assembly - Spinlock with XCHG


The example implementation Wikipedia provides for an x86 spinlock using the XCHG instruction is:

    ; Intel syntax

    locked:                      ; The lock variable: 1 = locked, 0 = unlocked.
         dd      0

    spin_lock:
         mov     eax, 1          ; Set the EAX register to 1.
         xchg    eax, [locked]   ; Atomically swap the EAX register with
                                 ;  the lock variable.
                                 ; This will always store 1 to the lock, leaving
                                 ;  the previous value in the EAX register.
         test    eax, eax        ; Test EAX with itself. Among other things, this
                                 ;  sets the processor's Zero Flag if EAX is 0.
                                 ; If EAX is 0, then the lock was unlocked and
                                 ;  we just locked it.
                                 ; Otherwise, EAX is 1 and we didn't acquire the lock.
         jnz     spin_lock       ; Jump back to the MOV instruction if the Zero Flag is
                                 ;  not set; the lock was previously locked, and so
                                 ;  we need to spin until it becomes unlocked.
         ret                     ; The lock has been acquired, return to the calling
                                 ;  function.

    spin_unlock:
         mov     eax, 0          ; Set the EAX register to 0.
         xchg    eax, [locked]   ; Atomically swap the EAX register with
                                 ;  the lock variable.
         ret                     ; The lock has been released.

From here: https://en.wikipedia.org/wiki/spinlock#example_implementation
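For context, a minimal sketch of how these two routines might be called around a critical section (the `counter` label is my own, not part of the Wikipedia example):

    call    spin_lock             ; spins until it observes [locked] == 0, then owns the lock
    inc     dword [counter]       ; critical section: hypothetical shared counter
    call    spin_unlock           ; stores 0 back to [locked], releasing the lock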

What I don't understand is why the unlock needs to be atomic. What's wrong with:

    spin_unlock:
         mov     [locked], 0

The unlock just needs to have release semantics, which is all you need to protect the critical section. It doesn't need sequential consistency, and atomicity isn't the issue either (see below).

So yes, on x86 a simple store is safe, and glibc's pthread_spin_unlock does exactly that:

    movl    $1, (%rdi)
    xorl    %eax, %eax
    retq

See the simple, maybe-usable x86 spinlock implementation I wrote in another answer, using a read-only spin loop with the pause instruction.
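Roughly, such a loop might look like the sketch below (my own illustration, not the code from that answer): spin read-only with pause, and only retry the atomic xchg once the lock looks free, so a waiting core isn't constantly stealing the cache line with failed atomic RMWs.

    spin_lock:
        mov     eax, 1
        xchg    eax, [locked]        ; attempt to take the lock (atomic, full barrier)
        test    eax, eax
        jz      .acquired            ; old value was 0: the lock is now ours
    .wait:
        pause                        ; tell the CPU this is a spin-wait loop
        cmp     dword [locked], 0    ; read-only check: the line can stay shared
        jnz     .wait
        jmp     spin_lock            ; looks unlocked, retry the atomic swap
    .acquired:
        ret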


Possibly this code was adapted from a bit-field version.

Unlocking with btr to zero one flag in a bitfield isn't safe, because it's a non-atomic read-modify-write of the containing byte (or of the containing naturally-aligned 4-byte dword or 2-byte word).
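To make that concrete, a sketch (the `[flags]` label is hypothetical): a plain btr is a non-atomic load/modify/store of the whole containing dword, so a concurrent write to a neighbouring bit can be lost; only the lock-prefixed form is an atomic RMW.

    btr      dword [flags], 0     ; NOT safe for a shared bitfield: non-atomic read-modify-write
    lock btr dword [flags], 0     ; safe: atomic RMW of the containing dword (and a full barrier)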

So maybe whoever wrote it didn't realize that simple stores to aligned addresses are atomic on x86, as they are on most ISAs. But what x86 has that weakly-ordered ISAs don't is that every store has release semantics. Using xchg to release the lock makes every unlock a full memory barrier, which goes beyond normal locking semantics. (Although on x86, taking the lock is a full barrier anyway, because there's no way to do an atomic RMW or atomic compare-and-swap without xchg or another locked instruction, and those are full barriers like mfence.)

Strictly speaking, the unlocking store doesn't even need to be atomic, since we only ever store 0 or 1, so only the lower byte matters. E.g. I think it would still work if the lock were unaligned and split across a cache-line boundary. Tearing can happen but doesn't matter, because what's really happening is that the low byte of the lock is modified atomically, with operations that always put zeros into the upper 3 bytes.


If you wanted to return the old value to catch double-unlocking bugs, a better implementation would separately load and store:

    spin_unlock:
        ;; pre-condition: [locked] is non-zero
        mov     eax, [locked]         ; get the old value, for debugging
        mov     dword [locked], 0     ; on x86, this is an atomic store with "release" semantics.
        ;test    eax, eax
        ;jz      double_unlocking_detected    ; or leave that to the caller
        ret
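The caller could then do the check itself, something like this (`double_unlocking_detected` is a hypothetical error handler, as in the commented-out lines above):

    call    spin_unlock
    test    eax, eax                      ; eax holds the old value of [locked]
    jz      double_unlocking_detected     ; it was already 0: we released a lock we didn't hold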
