[FRIAM] Q_rsqrt() vs 1/sqrt()

uǝlƃ ↙↙↙ gepropella at gmail.com
Fri Jan 8 19:03:46 EST 2021


Thanks. I didn't think of trying -O2. That and -O1 give me a sqrtsd instruction. With both -O2 and -march=native, I get a vsqrtsd. And all 3 options together (-march=native -O2 -ffast-math) give me a vrsqrtss. What's hilarious is that -O[12] makes Q_rsqrt() faster than 1/sqrtf(), in spite of the assembler instruction(s).

gepr at cormac:~/lang/c$ ./O1.out 
1/sqrt() took 0.095175 s
Q_rsqrt() took 0.065637 s

gepr at cormac:~/lang/c$ ./O2.out 
1/sqrt() took 0.052231 s
Q_rsqrt() took 0.029407 s


On 1/8/21 3:28 PM, Marcus Daniels wrote:
> I mean I think it is that you may be targeting too low of a common denominator in terms of the processor.   That should work for doubles too.
> 
> -----Original Message-----
> From: Friam <friam-bounces at redfish.com> On Behalf Of Marcus Daniels
> Sent: Friday, January 8, 2021 3:23 PM
> To: The Friday Morning Applied Complexity Coffee Group <friam at redfish.com>
> Subject: Re: [FRIAM] Q_rsqrt() vs 1/sqrt()
> 
> mdaniels at daniels:~$ cat t.c
> #include <math.h>
> #include <stdlib.h>
> #include <stdio.h>
> 
> int main(int argc,const char **argv) {
>   float val = atof (argv[1]);
>   float ret = (1.0f/sqrtf(val));
>   printf("%f\n",(double) ret);
> }
> mdaniels at daniels:~$ gcc -march=native -O2 -ffast-math -S t.c
> mdaniels at daniels:~$ grep sqrt t.s
>         vrsqrtss        %xmm0, %xmm0, %xmm1


-- 
↙↙↙ uǝlƃ


