Half-precision floats handling

LIBMODULES

File contents:

dramtransfer.py - contains main helpers for DRAM access.

bfloat16nncore.py - Neural network core processing

bfloat16processor.py - contains the bfloat16nn processing paths.

A note on the square root logic: I have implemented a variant of Goldschmidt's algorithm that allows for up to ⚠ 3.5% error, but there is simply no replacement for speed! If you need more accuracy, you will have to implement Newton-Raphson in software, or perhaps use doubles with external library calls. Example:

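    // Newton-Raphson square root in software (about 6 significant digits with float)
    #include <math.h>   // fabsf, NAN

    #define MAXITERATION 128
    #define ACCURACY 1E-6f  // relative tolerance; 1E-16 is below what a float can resolve

    float nr_sqrtf(float f) {
        if (f <= 0.0f)
            return (f < 0.0f) ? NAN : 0.0f;  // no real root below zero; avoids 0/0 at zero
        float approx = 0.5f * f;             // 1st approximation
        for (int i = 0; i < MAXITERATION; i++) {
            float betterapprox = 0.5f * (approx + f / approx);
            if (fabsf(betterapprox - approx) < ACCURACY * betterapprox)
                return betterapprox;         // converged
            approx = betterapprox;
        }
        return approx;                       // best estimate after MAXITERATION steps
    }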
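
For comparison, here is a minimal Python sketch of the textbook Goldschmidt square root iteration. It is not the logic in bfloat16processor.py: the function name `goldschmidt_sqrt`, the linear seed `1.3 - 0.3 * m` and the default iteration count are assumptions chosen for illustration only. It uses only multiplies and adds inside the loop, which is what makes this family of algorithms attractive in hardware.

```python
import math


def goldschmidt_sqrt(s: float, iterations: int = 3) -> float:
    """Approximate sqrt(s) with the textbook Goldschmidt iteration (illustrative sketch)."""
    if s < 0.0:
        raise ValueError("square root of a negative number")
    if s == 0.0:
        return 0.0

    # Normalize s = m * 2**e with an even exponent e,
    # so that sqrt(s) = sqrt(m) * 2**(e // 2) and m lies in [0.5, 2).
    m, e = math.frexp(s)      # m in [0.5, 1)
    if e % 2:
        m *= 2.0              # m in [1, 2)
        e -= 1

    # Crude linear seed for 1/sqrt(m) on [0.5, 2) (assumed, not from the repository).
    y = 1.3 - 0.3 * m

    x = m * y                 # converges to sqrt(m)
    h = 0.5 * y               # converges to 1 / (2 * sqrt(m))
    for _ in range(iterations):
        r = 0.5 - x * h       # residual, goes to 0 as x*h approaches 0.5
        x += x * r
        h += h * r

    return math.ldexp(x, e // 2)


if __name__ == "__main__":
    for value in (0.5, 2.0, 10.0, 12345.678):
        approx = goldschmidt_sqrt(value)
        exact = math.sqrt(value)
        print(f"{value}: approx={approx!r} exact={exact!r}")
```

In this sketch the iteration count is the knob that trades accuracy for latency: each pass roughly squares the remaining error, so fewer passes are faster but coarser, while a few more passes approach double precision.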

systime.py - contains system time support