## More weirdness and hackology


### More weirdness and hackology

For reasons of just because, I went looking for fast sines and cosines last night. I think the coolest hack I found is about 1400 years old and comes from an Indian mathematician, Bhaskara I. It calculates a sine over a 180-degree arc with a worst error of about 0.0016 (at around 10°).

In pseudo-code it could be: a := (180 - angle) * angle; sin := (4 * a) / (40500 - a);
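A minimal Python sketch of the pseudo-code above, for anyone who wants to poke at it. The function name `bhaskara_sin` is only an illustrative label, and the angle is assumed to be in degrees between 0 and 180.

```python
def bhaskara_sin(angle_deg):
    """Approximate sin(angle) for an angle in degrees, 0 <= angle <= 180.

    Direct transcription of the pseudo-code; the name is illustrative.
    """
    a = (180.0 - angle_deg) * angle_deg
    return (4.0 * a) / (40500.0 - a)


print(bhaskara_sin(30.0))  # 0.5
print(bhaskara_sin(90.0))  # 1.0
```

At 30° and 90° every intermediate value is a whole number, so those two results come out exact.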

A 1967 article by one R. C. Gupta shows a cosine over the same arc (0–180°) derived from this rule.

Pseudo code: a := angle*angle; cos := (32400 - 4 * a) / (32400 + a);
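Substituting 90° − angle into the sine rule gives exactly this cosine formula, which suggests it should only be trusted where 90° − angle stays inside 0–180°, that is for angles from −90° to +90°. A small Python sketch under that assumption (the names `bhaskara_sin` and `gupta_cos` are illustrative, not taken from the article):

```python
def bhaskara_sin(angle_deg):
    """Sine rule, angle in degrees, 0..180."""
    a = (180.0 - angle_deg) * angle_deg
    return (4.0 * a) / (40500.0 - a)


def gupta_cos(angle_deg):
    """Cosine rule from the pseudo-code, angle in degrees, assumed -90..+90."""
    a = angle_deg * angle_deg
    return (32400.0 - 4.0 * a) / (32400.0 + a)


# cos(x) = sin(90 - x), so both columns should agree.
for deg in (-90.0, -60.0, 0.0, 60.0, 90.0):
    print(deg, gupta_cos(deg), bhaskara_sin(90.0 - deg))
```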

Mind, I am yet to code them up to get some numbers for myself on error and stuff.
Without force fields, power cannot generate.
Té Rowan  Posts: 1162

### Re: More weirdness and hackology

Got around to tabulating some numbers. The cosine is good from -90° to +90°, not 0–180° as I first thought. Ain’t done maths in a long time, so perhaps no wonder I pulled a thinko. Oh, well…

Looks like it will be time to practice some input range reduction, as in:

1. Reduce deg to 0–360°.
2. If deg<180, return sin(deg), else return -sin(deg-180).
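The two steps above could be sketched in Python roughly like this. The names `bhaskara_sin` and `sin_any` are illustrative, and the sketch leans on Python's `%`, which returns a non-negative result for a positive modulus, so negative angles are handled by step 1 as well.

```python
def bhaskara_sin(angle_deg):
    """Sine rule, angle in degrees, 0..180."""
    a = (180.0 - angle_deg) * angle_deg
    return (4.0 * a) / (40500.0 - a)


def sin_any(angle_deg):
    """Approximate sine for any angle in degrees via range reduction."""
    deg = angle_deg % 360.0           # step 1: reduce to 0..360
    if deg < 180.0:                   # step 2: first half-turn is used as-is
        return bhaskara_sin(deg)
    return -bhaskara_sin(deg - 180.0) # second half-turn is the mirrored negative


print(sin_any(210.0))  # -0.5
print(sin_any(-30.0))  # -0.5
print(sin_any(390.0))  # 0.5
```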

Hennyways and milwaukyroads, the sine’s error is 0 at 0, 30, 90, 150 and 180 degrees, 0.0016299 high at 12 and 168 degrees and 0.0013434 low at 51 and 129 degrees. The worst angular errors are 0.095492 high at 12 and 168 degrees and 0.132988 low at 59 and 121 degrees.
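The value errors quoted above can be re-tabulated with a few lines of Python. This is only a sketch: it scans whole degrees, uses `math.sin` as the reference, and the helper names are made up for the purpose.

```python
import math


def bhaskara_sin(angle_deg):
    """Sine rule, angle in degrees, 0..180."""
    a = (180.0 - angle_deg) * angle_deg
    return (4.0 * a) / (40500.0 - a)


def value_errors():
    """Return {degree: approximation - math.sin} for whole degrees 0..180."""
    return {d: bhaskara_sin(float(d)) - math.sin(math.radians(d))
            for d in range(181)}


errors = value_errors()
worst_high = max(errors, key=errors.get)
worst_low = min(errors, key=errors.get)
print(f"highest: {errors[worst_high]:+.7f} at {worst_high} deg")
print(f"lowest:  {errors[worst_low]:+.7f} at {worst_low} deg")
```

The approximation is symmetric about 90°, so the script may report either angle of a mirrored pair (12° or 168°, 51° or 129°), depending on rounding in the reference sine.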

As always, it’s the reader’s decision if this error is a deal breaker.
Té Rowan  Posts: 1162

### Re: More weirdness and hackology

That was a great help!
I'm now researching how to express trig functions (not deterministic) through simple arithmetic (deterministic as per IEEE 754) for my game engine.

It works!

```
d:\chentrah\modules\tests>fpc sinus.pas
Free Pascal Compiler version 2.6.2 [2013/02/12] for i386
Copyright (c) 1993-2012 by Florian Klaempfl and others
Target OS: Win32 for i386
Compiling sinus.pas
Linking sinus.exe
42 lines compiled, 0.2 sec , 60784 bytes code, 12860 bytes data

d:\chentrah\modules\tests>sinus
a=0.0000(0.00 deg); true=0.0000; pseudo=0.0000; delta=0.0000; relat=0.000%
a=0.1047(6.00 deg); true=0.1045; pseudo=0.1058; delta=0.0013; relat=1.254%
a=0.2094(12.00 deg); true=0.2079; pseudo=0.2095; delta=0.0016; relat=0.784%
a=0.3142(18.00 deg); true=0.3090; pseudo=0.3103; delta=0.0013; relat=0.430%
a=0.4189(24.00 deg); true=0.4067; pseudo=0.4074; delta=0.0007; relat=0.174%
a=0.5236(30.00 deg); true=0.5000; pseudo=0.5000; delta=0.0000; relat=0.000%
a=0.6283(36.00 deg); true=0.5878; pseudo=0.5872; delta=-0.0006; relat=0.107%
a=0.7330(42.00 deg); true=0.6691; pseudo=0.6680; delta=-0.0011; relat=0.162%
a=0.8378(48.00 deg); true=0.7431; pseudo=0.7418; delta=-0.0013; relat=0.176%
a=0.9425(54.00 deg); true=0.8090; pseudo=0.8077; delta=-0.0013; relat=0.164%
a=1.0472(60.00 deg); true=0.8660; pseudo=0.8649; delta=-0.0012; relat=0.134%
a=1.1519(66.00 deg); true=0.9135; pseudo=0.9127; delta=-0.0009; relat=0.097%
a=1.2566(72.00 deg); true=0.9511; pseudo=0.9505; delta=-0.0006; relat=0.059%
a=1.3614(78.00 deg); true=0.9781; pseudo=0.9779; delta=-0.0003; relat=0.028%
a=1.4661(84.00 deg); true=0.9945; pseudo=0.9945; delta=-0.0001; relat=0.007%
a=1.5708(90.00 deg); true=1.0000; pseudo=1.0000; delta=0.0000; relat=0.000%
a=1.6755(96.00 deg); true=0.9945; pseudo=0.9945; delta=-0.0001; relat=0.007%
a=1.7802(102.00 deg); true=0.9781; pseudo=0.9779; delta=-0.0003; relat=0.028%
a=1.8850(108.00 deg); true=0.9511; pseudo=0.9505; delta=-0.0006; relat=0.059%
a=1.9897(114.00 deg); true=0.9135; pseudo=0.9127; delta=-0.0009; relat=0.097%
a=2.0944(120.00 deg); true=0.8660; pseudo=0.8649; delta=-0.0012; relat=0.134%
a=2.1991(126.00 deg); true=0.8090; pseudo=0.8077; delta=-0.0013; relat=0.164%
a=2.3038(132.00 deg); true=0.7431; pseudo=0.7418; delta=-0.0013; relat=0.176%
a=2.4086(138.00 deg); true=0.6691; pseudo=0.6680; delta=-0.0011; relat=0.162%
a=2.5133(144.00 deg); true=0.5878; pseudo=0.5872; delta=-0.0006; relat=0.107%
a=2.6180(150.00 deg); true=0.5000; pseudo=0.5000; delta=-0.0000; relat=0.000%
a=2.7227(156.00 deg); true=0.4067; pseudo=0.4074; delta=0.0007; relat=0.174%
a=2.8274(162.00 deg); true=0.3090; pseudo=0.3103; delta=0.0013; relat=0.430%
a=2.9322(168.00 deg); true=0.2079; pseudo=0.2095; delta=0.0016; relat=0.784%
a=3.0369(174.00 deg); true=0.1045; pseudo=0.1058; delta=0.0013; relat=1.254%
a=3.1416(180.00 deg); true=-0.0000; pseudo=0.0000; delta=0.0000; relat=0.087%
Press Enter to close the console.
```

where

```pascal
{$mode objfpc}
{$ifdef windows}
  {$apptype console}
{$endif}
{$longstrings on}
program sinus;

uses sysutils, math;

type float = single;

const
  pi = 3.141592653589793;
  RadToDeg: float = 180 / pi;

// Bhaskara's approximation; takes radians, valid for 0..pi
function pseudosin(a: float): float;
begin
  a:= RadToDeg * a;
  a:= (180.0 - a) * a;
  Result:= (4.0 * a) / (40500 - a);
end;

var
  i: integer;
  x, ytrue, ypseudo: float;

const steps = 30;

begin
  // Compare against the built-in sin() in 6-degree steps
  for i:= 0 to steps do begin
    x:= (pi / steps) * i;
    ytrue:= sin(x);
    ypseudo:= pseudosin(x);
    WriteLn('a=',x:5:4,'(',x*RadToDeg:3:2,' deg); true=',ytrue:5:4,
      '; pseudo=',ypseudo:5:4,'; delta=',ypseudo - ytrue:5:4,
      '; relat=',100 * abs(ypseudo - ytrue) / max(ytrue, 0.0001):5:3,'%');
  end;
{$ifdef windows}
  WriteLn('Press Enter to close the console.');
  ReadLn;
{$endif}
end.
```
Proud owner of 1.5 kilograms of Germanium transistors
Cheb  Posts: 1402

### Re: More weirdness and hackology

Aaand I'm totally lost...
Spica75  Posts: 1994

### Re: More weirdness and hackology

I was kind of hoping this old thing would be of some use to you. Of course, there may be more algorithms on websites and in books that deal with programming embedded systems. Jack Ganssle (of http://www.ganssle.com/) mentions in http://www.ganssle.com/approx-2.htm the book “Computer Approximations” by John Hart and others as a source of some rough-and-(almost)-ready bits.

Spica75 wrote: Aaand I'm totally lost...

The mad world of Klingon Programming does that to one.
Té Rowan  Posts: 1162

### Re: More weirdness and hackology

Spica75 wrote: Aaand I'm totally lost...

Determinism means the same calculations give perfectly matching results, to the last bit, regardless of platform (i386, x86_64, ARM), compiler version and compiler settings.
And there is a standard for that. Most CPUs conform to it, but some may not (because, bugs), which results in owners of some AMD CPUs getting random disconnects from MMO game sessions.

Determinism is vital for achieving reproducible behavior, e.g. the same input signals will always result in the same simulation.
It's how demos in Doom 2 were recorded: just a tiny sequence of key presses and mouse movement.
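A toy illustration of the replay idea in Python. If the update step uses only operations whose rounding IEEE 754 pins down exactly (here addition and multiplication), replaying a recorded input sequence reproduces the final state bit for bit. The `step` and `replay` functions and the one-letter input encoding are invented for this sketch; they are not how Doom or any particular engine stores its demos.

```python
import struct

# Hypothetical input encoding: one character per tick.
ACCEL = {"L": -1.0, "R": 1.0, "-": 0.0}


def step(state, key):
    """Advance a toy one-dimensional simulation by one tick."""
    x, v = state
    v = v * 0.5 + ACCEL[key] * 0.25   # only multiplies and adds
    x = x + v * 0.125
    return (x, v)


def replay(inputs):
    """Run the recorded inputs and return the final state as raw bytes."""
    state = (0.0, 0.0)
    for key in inputs:
        state = step(state, key)
    return struct.pack("<dd", *state)  # byte-exact view of both doubles


demo = "RRR-LL-R"
print(replay(demo) == replay(demo))  # True
```

Comparing the packed bytes rather than the printed numbers is the point: two runs only count as matching if every bit agrees.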

Unfortunately, functions like sine, exponent or square root are not so lucky. They are *not* deterministic.

Hence the quest to find replacements that achieve roughly the same result using only basic math. The one above is very simple, fast and reasonably precise: the largest relative error in the table above is 1.254% (at 6°, where it gives 0.1058 instead of 0.1045).
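The 1.254% figure comes from a table sampled every 6°. A quick Python check at smaller angles (helper names are illustrative, `math.sin` is the reference) suggests the relative error keeps creeping up towards roughly 1.86% as the angle approaches 0°, even though the absolute error shrinks to nothing there.

```python
import math


def bhaskara_sin(angle_deg):
    """Sine rule, angle in degrees, 0..180."""
    a = (180.0 - angle_deg) * angle_deg
    return (4.0 * a) / (40500.0 - a)


def relative_error_percent(angle_deg):
    """Relative error against math.sin, in percent (angle must not be 0)."""
    true = math.sin(math.radians(angle_deg))
    return 100.0 * abs(bhaskara_sin(angle_deg) - true) / true


for deg in (6.0, 3.0, 1.0, 0.1):
    print(f"{deg:5.1f} deg: {relative_error_percent(deg):.3f}%")
```

Whether that matters depends on the use: for a game engine that mostly cares about absolute error, the 0.0016 worst case is the more relevant number.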

On weaksauce CPUs such replacements may be simply used to speed things up.

See also https://en.wikipedia.org/wiki/Fast_inverse_square_root and know that it is now embedded in modern CPUs: they support it in hardware, while Carmack's solution causes a slowdown when you use SSE2. At least, the Free Pascal compiler gets confused and generates a travesty like: unload from an XMM register to RAM, load into a general-purpose register, do the bit-logic shit, unload into memory, load back into an XMM register. So instead of, say,

```pascal
function FastInverseSquareRoot(a: float): float; inline;
var
  i: longint;
begin
  i:= longint(pointer(@a)^);       // reinterpret the float's bits as an integer
  i:= $5f3759df - (i shr 1);       // magic-constant first guess
  Result:= float(pointer(@i)^);    // reinterpret the bits back as a float
  Result*= 1.5 - (a * 0.5 * Result * Result);  // Newton-Raphson step
  Result*= 1.5 - (a * 0.5 * Result * Result);  // second step
end;
```

I have to use
```pascal
function FastInverseSquareRoot(a: float): float; inline; assembler;
asm
  RSQRTSS xmm7, [a]
  MOVSS [Result], xmm7
end['xmm7'];
```
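For readers without a Pascal compiler at hand, the bit-trick version can be mimicked in Python. This is a sketch only: `struct` stands in for the pointer casts, the function name is illustrative, and the Newton-Raphson steps run in Python's double precision rather than single, so the last digits will not match the Pascal routine.

```python
import struct


def fast_inv_sqrt(a):
    """Approximate 1/sqrt(a) for a positive float using the magic constant."""
    # Reinterpret the float32 bit pattern of a as an unsigned 32-bit integer.
    i = struct.unpack("<I", struct.pack("<f", a))[0]
    i = (0x5F3759DF - (i >> 1)) & 0xFFFFFFFF   # magic-constant first guess
    # Reinterpret the integer bits back as a float32.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    y *= 1.5 - (a * 0.5 * y * y)               # Newton-Raphson step
    y *= 1.5 - (a * 0.5 * y * y)               # second step, as in the Pascal code
    return y


for value in (1.0, 4.0, 10.0):
    print(value, fast_inv_sqrt(value), value ** -0.5)
```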
Cheb  Posts: 1402

### Re: More weirdness and hackology

There are also tales of gcc (the Gnu C compiler) doing all integer multiplies with shifts and adds, even when the target had a built-in MUL (multiply) instruction that left shifting-and-adding eating dust.

https://en.wikipedia.org/wiki/Binary_multiplier
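For reference, the shift-and-add scheme itself is short. A Python sketch for non-negative integers (the function name is illustrative, and this is the textbook method, not what any particular gcc version emitted):

```python
def shift_add_mul(a, b):
    """Multiply two non-negative integers using only shifts and adds."""
    result = 0
    while b:
        if b & 1:        # lowest bit of b set: add the current shifted a
            result += a
        a <<= 1          # a doubles for the next bit position
        b >>= 1          # move on to the next bit of b
    return result


print(shift_add_mul(13, 11))  # 143
```

The loop runs once per bit of `b`, which is why a hardware MUL that finishes in a few cycles leaves it eating dust.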
Té Rowan  Posts: 1162

### Re: More weirdness and hackology

Té Rowan wrote: There are also tales of gcc (the Gnu C compiler) doing all integer multiplies with shifts and adds, even when the target had a built-in MUL (multiply) instruction that left shifting-and-adding eating dust.

Did that myself, when I was programming a CDC 3100 in assembly language. I think it was faster than integer multiply/divide. I know it was faster than floating point. And don't even mention sines or cosines.
Visit Big Washuu's Lab of Arcane Knowledge at http://washuu.net
Ellen Kuhfeld  Posts: 1796

### Re: More weirdness and hackology

I got some interesting results, but the Raspberry Pi 3 takes forever to iterate six functions through several billion values each, so I'll tell you tomorrow.
Cheb  Posts: 1402

### Re: More weirdness and hackology

I’m yet to grab the latest of Agner Fog’s optimisation files, but those I do have indicate it can take tens or hundreds of cycles to calculate sines and cosines in hardware. At least that is better than the poor old 8087 and 80287, which lacked the die space and needed software assistance.
Té Rowan  Posts: 1162

### Re: More weirdness and hackology

tens or hundreds of cycles to calculate sines and cosines in hardware

So true!
The horror!

x86: sin() is 12..16 times slower than simple multiplication.
RPi3: sin() is 15 times slower than simple multiplication.
While the fake-sin you provided is only 3..4 times slower.

At least Free Pascal 3.0.4 (3.0.0 on Raspbian), with -O3 -OpPentiumM -CfSSE2 optimizations for x86, says so.

Surprisingly, sqrt() is pretty fast nowadays, 2..3 times slower than multiplication on x86 and barely (1.41 times) slower on RPi3.

Fast inverse square root, on the other hand, using Carmack's method with bit logic tricks... It's two times slower than sqrt() on x86 and five times slower on RPi3! (Or the Pascal compiler isn't that good at understanding what I require of it.) A complete fiasco!
Granted, fast inverse square root using dedicated SSE opcodes is even faster than sqrt(), but it is NOT deterministic (more on that later).
Cheb  Posts: 1402

### Re: More weirdness and hackology

In other news, I looked up the Wikipedia page on GOTOs, and realised that while Sinclair BASICs did not have an explicit computed GOTO (ON … GOTO), they did have a right impressive assigned GOTO. Y’see, the GOTO statement in them takes an expression instead of just a number. Wow, I think FORTRAN would be proud.
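To make the idea concrete, here is a toy Python model of a GOTO whose target is an expression. The three-branch listing and the `run_basic` helper are made up for illustration; nothing beyond "jump to a computed line number" is meant to mirror real Sinclair BASIC behaviour.

```python
def run_basic(n):
    """Mimic a tiny listing:

        10 GO TO 100*n
       100 PRINT "one": STOP
       200 PRINT "two": STOP
       300 PRINT "three": STOP
    """
    output = []
    # Each line returns the next line number, or None to stop.
    lines = {
        10: lambda: 100 * n,                  # GO TO <expression>
        100: lambda: output.append("one"),    # append() returns None, so we stop
        200: lambda: output.append("two"),
        300: lambda: output.append("three"),
    }
    pc = 10
    while pc is not None:
        pc = lines[pc]()
    return output


print(run_basic(2))  # ['two']
```

A value of `n` that computes a missing line number simply raises `KeyError` in this sketch.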
Té Rowan  Posts: 1162

### Re: More weirdness and hackology

Ah, Fortran. It's a tossup between Fortran and French for my second language, though I've used neither lately.
Ellen Kuhfeld  Posts: 1796

### Re: More weirdness and hackology

Then you no doubt know – though fairly few others here would – that “God is real unless declared integer” is a Fortran joke.

For the remainder, FORTRAN had as convention that variables with names beginning with the letters I-N were by default declared INTEGER while names beginning with A-H or O-Z were declared REAL (floating point). So, yes, a variable named GOD would be REAL unless you explicitly declared it INTEGER.
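The naming rule is simple enough to spell out in a few lines of Python. This is only a sketch of the default described above; the helper name `implicit_type` is made up, and explicit declarations are ignored.

```python
def implicit_type(name):
    """Return the default FORTRAN type for a variable name (I-N rule)."""
    first = name[0].upper()
    return "INTEGER" if "I" <= first <= "N" else "REAL"


print(implicit_type("GOD"))    # REAL
print(implicit_type("INDEX"))  # INTEGER
```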

Ah… multi-lingual jokes…
Té Rowan  Posts: 1162

### Re: More weirdness and hackology

I'm better on physics jokes.

Two atoms are walking down the street together when one suddenly stops. "I've lost an electron!" he cries.
"Are you sure?"
"I'm positive."
Ellen Kuhfeld  Posts: 1796
