[FRIAM] Q_rsqrt() vs 1/sqrt()

Marcus Daniels marcus at snoutfarm.com
Fri Jan 8 18:23:04 EST 2021


mdaniels@daniels:~$ cat t.c
#include <math.h>
#include <stdlib.h>
#include <stdio.h>

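int main(int argc,const char **argv) {
  /* Guard against a missing argument before reading argv[1]. */
  if (argc < 2) {
    fprintf(stderr,"usage: %s <number>\n",argv[0]);
    return 1;
  }
  float val = atof (argv[1]);
  /* With -ffast-math, GCC may lower this to a reciprocal-sqrt estimate. */
  float ret = (1.0f/sqrtf(val));
  printf("%f\n",(double) ret);
  return 0;
}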
mdaniels@daniels:~$ gcc -march=native -O2 -ffast-math -S t.c
mdaniels@daniels:~$ grep sqrt t.s
        vrsqrtss        %xmm0, %xmm0, %xmm1
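
For comparison with the vrsqrtss estimate above, here is a minimal sketch of the Q_rsqrt() routine named in the subject line. It assumes 32-bit IEEE 754 floats and uses the widely circulated constant 0x5f3759df. The helper rsqrt_residual() is a hypothetical name added for illustration and is not from this thread; it checks the result without calling sqrtf() at all, since x*y*y equals 1 when y is exactly 1/sqrt(x).

```c
#include <stdint.h>
#include <string.h>

/* Fast inverse square root, written with memcpy instead of pointer
   casts so the float/integer reinterpretation is well defined.
   Assumes float is a 32-bit IEEE 754 value. */
float q_rsqrt(float number)
{
    uint32_t bits;
    float y = number;

    memcpy(&bits, &y, sizeof bits);       /* view the float's bits as an integer */
    bits = 0x5f3759df - (bits >> 1);      /* magic-constant initial guess */
    memcpy(&y, &bits, sizeof y);          /* back to a float */

    y = y * (1.5f - 0.5f * number * y * y);  /* one Newton-Raphson step */
    return y;
}

/* Hypothetical helper: |x*y*y - 1| for y = q_rsqrt(x).
   The value is 0 when y is exactly 1/sqrt(x). */
float rsqrt_residual(float x)
{
    float y = q_rsqrt(x);
    float r = x * y * y - 1.0f;
    return r < 0.0f ? -r : r;
}
```

Calling rsqrt_residual() on a few positive inputs gives a quick feel for how far the single Newton-Raphson step leaves the estimate from the exact value. A second step would tighten it at the cost of a few more multiplies.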

-----Original Message-----
From: Friam <friam-bounces at redfish.com> On Behalf Of uǝlƃ ↙↙↙
Sent: Friday, January 8, 2021 2:49 PM
To: friam at redfish.com
Subject: Re: [FRIAM] Q_rsqrt() vs 1/sqrt()

Would out-of-order execution produce the same out-of-order order over, say, 10 executions?

The clock() results for GCC and TCC are similar, but the generated ASM looks fairly different. I'm still not seeing rsqrt or sqrt instructions, with or without -O0, even after switching to single-precision floats throughout and calling sqrtf(), for whatever that's worth. The 1/sqrtf() version did get quite a bit faster than the 1/sqrt() version, though.

gepr@cormac:~/lang/c$ ./gcc.out
1/sqrt() took 0.076633 s
Q_rsqrt() took 0.473007 s

gepr@cormac:~/lang/c$ ./tcc.out
1/sqrt() took 0.078259 s
Q_rsqrt() took 0.46164 s

On 1/8/21 8:46 AM, Stephen Taylor wrote:
> Because the hardware environment has changed, and the tradeoffs on integer and floating-point arithmetic are different. (Like it says in the Wikipedia article.) Out of order execution might be messing up your measurements, too.

--
↙↙↙ uǝlƃ


