D - SSE strangeness with functions


I've been playing around with D's inline assembler and SSE, and found something I don't understand. When I try to add two float4 vectors right after their declaration, the calculation is correct. If I put the calculation in a separate function, I get a series of NaNs.

    // function contents are identical to the code section in the unittest
    float4 add(float4 lhs, float4 rhs)
    {
        float4 res;
        auto lhs_addr = &lhs;
        auto rhs_addr = &rhs;
        asm
        {
            mov rax, lhs_addr;
            mov rbx, rhs_addr;
            movups xmm0, [rax];
            movups xmm1, [rbx];

            addps xmm0, xmm1;
            movups res, xmm0;
        }
        return res;
    }

    unittest
    {
        float4 lhs = {1, 2, 3, 4};
        float4 rhs = {4, 3, 2, 1};

        println(add(lhs, rhs)); // float4(nan, nan, nan, nan)

        // identical code starts here
        float4 res;
        auto lhs_addr = &lhs;
        auto rhs_addr = &rhs;
        asm
        {
            mov rax, lhs_addr;
            mov rbx, rhs_addr;
            movups xmm0, [rax];
            movups xmm1, [rbx];

            addps xmm0, xmm1;
            movups res, xmm0;
        }
        // end identical code
        println(res); // float4(5, 5, 5, 5)
    }

The assembly is functionally identical (as far as I can tell) to the code at this link.

Edit: I'm using a custom float4 struct (for now, it is just an array) because I want to be able to have an add function like float4 add(float4 lhs, float rhs). At the moment, that results in a compiler error like this:

error: floating point constant expression expected instead of rhs

Note: I'm using DMD 2.071.0.

Your code is weird. Which version of DMD do you use? This works as expected:

    import std.stdio;
    import core.simd;

    float4 add(float4 lhs, float4 rhs)
    {
        float4 res;
        auto lhs_addr = &lhs;
        auto rhs_addr = &rhs;
        asm
        {
            mov rax, lhs_addr;
            mov rbx, rhs_addr;
            movups xmm0, [rax];
            movups xmm1, [rbx];

            addps xmm0, xmm1;
            movups res, xmm0;
        }
        return res;
    }

    void main()
    {
        float4 lhs = [1, 2, 3, 4];
        float4 rhs = [4, 3, 2, 1];

        auto r = add(lhs, rhs);
        writeln(r.array); // [5, 5, 5, 5]

        // identical code starts here
        float4 res;
        auto lhs_addr = &lhs;
        auto rhs_addr = &rhs;
        asm
        {
            mov rax, lhs_addr;
            mov rbx, rhs_addr;
            movups xmm0, [rax];
            movups xmm1, [rbx];

            addps xmm0, xmm1;
            movups res, xmm0;
        }
        // end identical code
        writeln(res.array); // [5, 5, 5, 5]
    }
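For comparison, here is a minimal sketch of the same addition without inline assembly. It assumes you can use the float4 type from core.simd rather than a custom struct, and that you are building for a target where DMD supports vector types (x86_64, for example). The name addVectors is only an illustrative helper, not something from the question.

```d
import std.stdio;
import core.simd;

// Hypothetical helper: relies on the compiler's built-in vector
// operators instead of hand-written movups/addps.
float4 addVectors(float4 lhs, float4 rhs)
{
    return lhs + rhs; // element-wise addition of the four lanes
}

void main()
{
    float4 lhs = [1, 2, 3, 4];
    float4 rhs = [4, 3, 2, 1];

    writeln(addVectors(lhs, rhs).array); // prints [5, 5, 5, 5]
}
```

Written this way, the compiler handles register allocation and the loads and stores itself, so there are no addresses of stack variables to get wrong inside an asm block.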
