FP32 with fpu enabled

Pito
FP32 with fpu enabled

Hi, has anybody tried to compile eLua as FP32 with FPU support (e.g. for
the STM32F4 board)? P.


_______________________________________________
eLua-dev mailing list
[hidden email]
https://lists.berlios.de/mailman/listinfo/elua-dev
BogdanM

Re: FP32 with fpu enabled



2011/11/20 pito <[hidden email]>
> Hi, did somebody try to compile as FP32 with fpu support (e.g. for
> the stm32f4 board)? P.

It's not that easy unfortunately. eLua needs double precision floating point and the STM32F4 FPU only supports single precision. We still need to figure out a good way to use it.

Best,
Bogdan
 


Pito

Re: FP32 with fpu enabled

I think a 22-bit integer may be enough for some applications. Or is
there any other constraint? P.

Tim Michals

Re: FP32 with fpu enabled

In reply to this post by Pito
GCC provides three basic options for compiling floating-point code:
  • Software floating point emulation, which is the default. In this case, the compiler implements floating-point arithmetic by means of library calls.
  • VFP hardware floating-point support using the soft-float ABI. This is selected by the -mfloat-abi=softfp option. When you select this variant, the compiler generates VFP floating-point instructions, but the resulting code uses the same call and return conventions as code compiled with software floating point.
  • VFP hardware floating-point support using the VFP ABI, which is the VFP variant of the Procedure Call Standard for the ARM® Architecture (AAPCS). This ABI uses VFP registers to pass function arguments and return values, resulting in faster floating-point code. To use this variant, compile with -mfloat-abi=hard.
The CodeSourcery libraries are compiled as soft, so to make use of hard, libc and newlib need to be recompiled using hard. Using softfp might be the easiest, so the standard libraries can still be used.
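
To make the three variants a bit more concrete, here is a small sketch that reports which float ABI a file was built for, using macros that GCC predefines on ARM targets as far as I know (`__SOFTFP__` for soft, `__ARM_PCS_VFP` for hard). The function names and the command line in the comment are assumptions for illustration, not eLua's actual build settings.

```c
/* Example command line for a Cortex-M4 target (an assumption, not eLua's
 * real build flags):
 *   arm-none-eabi-gcc -mcpu=cortex-m4 -mthumb \
 *       -mfloat-abi=softfp -mfpu=fpv4-sp-d16 -c file.c
 */

/* Report the float ABI this file was compiled for, based on GCC's
 * predefined macros. On a non-ARM host it returns "not-arm". */
const char *float_abi_name(void)
{
#if !defined(__arm__)
    return "not-arm";
#elif defined(__SOFTFP__)
    return "soft";      /* floating point done through library calls */
#elif defined(__ARM_PCS_VFP)
    return "hard";      /* VFP registers carry arguments and results */
#else
    return "softfp";    /* VFP instructions, soft-float calling convention */
#endif
}

/* The 'f' suffix keeps the constant single precision. Writing 0.5 instead
 * would promote the multiply to double, which a single-precision FPU
 * cannot execute in hardware, so it would fall back to library code. */
float halve(float x)
{
    return x * 0.5f;
}
```

GCC's `-Wdouble-promotion` warning can help spot places where a `float` expression silently becomes `double`.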

Another idea: maybe add two new types to eLua, long int (64-bit) and float (32-bit), leaving the standard type as int. There is a lot of discussion on the forum about 64-bit timer support etc., so both int-64 and float could be lumped together into one expansion of the types. But that is a lot of work.


BogdanM

Re: FP32 with fpu enabled

Hi,

On Sun, Nov 20, 2011 at 5:06 PM, Tim michals <[hidden email]> wrote:
> The CodeSoucery is compiled as soft, so, to make the most use hard, libc, and newlib need to be compiled using hard. Using the softfp, might be the easiest, so the standard libraries can still be used.

Yes, softfp seems to be the best starting point here.

> Another issue, maybe add two new types to eLua, long int (64) ,

This already happened. Have you missed the recent system timer thread?
http://elua-development.2368040.n2.nabble.com/IMPORTANT-New-feature-on-the-master-branch-system-timer-td6918200.html

> float (32)

We could definitely do this, although I can see eLua breaking in various, wonderfully unexpected ways when the number type won't be able to represent a full 32-bit integer anymore :)

> and leaving the standard as int, there is a lot of discussion on the forum about timer support etc 64 bits, so just lump both int-64, and float into expanding the types.

Yupppp, you definitely missed the system timer thread :)

> But, that is a lot of work.

I think the best option we have with this is LNUM. If LNUM can differentiate between single and double precision operations (like it does for integers and floats) we might be able to benefit from the hardware acceleration. I'll have to take a closer look at that at some point.

Best,
Bogdan
 


Pito

Re: FP32 with fpu enabled

There are a few new DSP instructions (STM32F4) which might speed up
the soft math as well. The marketing materials I saw say e.g.:
single-cycle MUL/MAC: signed/unsigned multiply, signed/unsigned MAC,
signed/unsigned 64-bit MAC.
They claim speed improvements vs. CM3: 4x for 16-bit MAC, 2x for
32-bit MAC, up to 7x for 64-bit MAC.
p.



BogdanM

Re: FP32 with fpu enabled



2011/11/20 pito <[hidden email]>
> There are few new DSP instructions (stm32f4) which might speed-up
> the soft math as well - marketing materials I saw say e.g.:
> Single cycle MUL/MAC: signed/unsigned multiply, signed/unsigned MAC,
> signed/unsigned MAC 64bit.
> They claim speeds improvements vs. CM3: 4x for 16bit MAC, 2x for
> 32bit MAC, up to 7x for 64bit MAC..

MAC is mostly a DSP operation; general-purpose code doesn't use it that much, so I wouldn't hold my breath for too long.

Best,
Bogdan

Pito

Re: FP32 with fpu enabled

I know, but (math) libs may utilise that. There are a lot of "MACs"
done inside the library code :) p.

Dan Debeer

Re: FP32 with fpu enabled

In reply to this post by BogdanM
I believe the bulk of the time in any program is spent in the interpreter, and hardware floating point makes no material difference to the overall execution speed of the program. Now, if you are implementing an FFT, an FIR or IIR filter, or some other special DSP code that depends on dense numerical code, the appropriate way to implement it is as C subroutines that are linked in as such.

I did some testing on a 72MHz STM32 board and found that for floating point add, subtract, multiply and divide operations the interpreter overhead is about 60x the time spent in the floating point routines for a tight loop.  In fact, the overhead is so large that it is difficult to tell when the loop is using floating point or integers for the numerical calculations.
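
The effect of that 60x ratio on any FPU gain can be worked out with a few lines of C. This is only a back-of-the-envelope sketch: the function name is made up, and the speedup factor passed in is a hypothetical figure, not a measurement.

```c
/* Overall speedup of a loop when only its floating-point part gets faster.
 * overhead_ratio: interpreter time divided by floating-point time
 *                 (about 60 in the measurement above).
 * fp_speedup:     how many times faster the floating-point part becomes. */
double overall_speedup(double overhead_ratio, double fp_speedup)
{
    double before = overhead_ratio + 1.0;              /* interpreter + FP */
    double after  = overhead_ratio + 1.0 / fp_speedup; /* interpreter + faster FP */
    return before / after;
}
```

With a ratio of 60, a floating-point part that runs 10 times faster gives 61 / 60.1, roughly 1.015, i.e. about 1.5% overall. Even an infinitely fast FPU could not do better than 61 / 60, roughly 1.017.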

YMMV, but I suggest we feel happy about the large RAM and fast clock, and make sure we document how to call optimized code from within eLua, instead of trying to invoke the floating-point engine for operations where we won't see a difference in performance but will see interesting new bugs.

Regards
Dan


jbsnyder

Re: FP32 with fpu enabled

On Mon, Nov 21, 2011 at 12:53 PM, Dan Debeer <[hidden email]> wrote:
> I believe the bulk of the time in any program is spent in the interpreter
> and hardware floating point makes no material difference to the overall
> execution speed of the program.  Now if you are implementing an FFT,  FIR or
> IIR filter or some other special DSP code that depends on dense numerical
> code the appropriate way to implement the DSP code is as C-subroutines and
> linked in as such.

Indeed, I would not suggest trying to implement DSP code in pure Lua,
especially on an embedded target.  I think there are some valuable
things we could hook in with here for modules like ADC and possibly
others that might deal with arrays of incoming data that need to be
processed in a CPU-efficient manner before delivering lower bandwidth
results to Lua.

>
> I did some testing on a 72MHz STM32 board and found that for floating point
> add, subtract, multiply and divide operations the interpreter overhead is
> about 60x the time spent in the floating point routines for a tight loop.

This sounds close to what one might expect (give or take single digit
scalar quantities):
http://shootout.alioth.debian.org/u32/which-programming-languages-are-fastest.php

> In fact, the overhead is so large that it is difficult to tell when the loop
> is using floating point or integers for the numerical calculations.
> YMMV but I suggest we get feel happy about the large RAM and fast clock and
> make sure we document how to call optimized code from within eLUA instead of
> trying to invoke the floating point engine for operations where we won't see
> the difference in performance but will see interesting new bugs.

There are a couple of good related points here:
1) Relative to optimizing Lua, check out Lua Performance Tips
(http://www.lua.org/gems/sample.pdf)

The biggest, and probably the most valuable, tip I've found is declaring
variables and even references to C functions as local.  This makes an
_immense_ amount of difference, especially when there are large
differences in wait state performance between flash and SRAM.  The
less indirection the interpreter has to go through to look things up on
every iteration of a tight loop, the better.

(that said, "premature optimization is the root of all evil" etc.. :-) )

2) I would encourage the use of functions that do "vectorized" operations
(as it is called in MATLAB, basically meaning operating on more than one
data item outside the interpreter for each call) when you do have
heavy lifting to do in C.  If you're going to do something in a loop
that requires a lot of computation, and there's a fairly low-memory way
to do that in C with one function call rather than calling that
function repeatedly in a loop (10x, 100x, 1000x), it might be worth it
if you're having performance problems.

The latter of these is fairly key in CPU-bound scripting-language code
where the interpreter is going to cost you quite a bit per cycle or
per VM instruction.  Lua might be quite fast for its class of language
and implementation, but there are worthwhile performance gains to be
made by understanding what is costing you time and how you can work
with that in an efficient manner.
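
A minimal sketch of the "vectorized" idea from point 2, assuming a
buffer of 16-bit ADC samples has already been collected on the C side.
The names block_stats and summarize_block are made up for illustration
and are not part of eLua's ADC module; the Lua binding that would
expose the function is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type, not an eLua API. */
typedef struct {
    float    mean;  /* average of the block */
    uint16_t peak;  /* largest sample seen */
} block_stats;

/* Reduce a whole buffer of raw samples in one call, so the interpreter
 * pays its per-call overhead once per block instead of once per sample. */
block_stats summarize_block(const uint16_t *samples, size_t n)
{
    assert(samples != NULL && n > 0);

    uint64_t sum = 0;   /* wide accumulator, cannot overflow in practice */
    uint16_t peak = 0;
    for (size_t i = 0; i < n; i++) {
        sum += samples[i];
        if (samples[i] > peak)
            peak = samples[i];
    }

    block_stats out;
    out.mean = (float)sum / (float)n;
    out.peak = peak;
    return out;
}
```

Lua would then receive two numbers per block rather than looping over
every sample itself.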

All that said, I'd love it if we could appropriately use the FPU when
needed, since it seems like a waste to not take advantage of the
hardware when it's sitting there waiting for instructions :-)  I agree
with Bogdan that something like LNUM's model (which, as I recall,
segregates value storage and operations into integer and floating
point ones based on fairly simple rules) might be a good direction to
look in to integrate functionality like this.  Still, I expect we
might only see major performance differences in specific use cases,
and you're likely to get more mileage out of using "local" variables
and references to C functions to avoid lookups in flash.
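
To make the "integer and floating point ones based on fairly simple
rules" description concrete, here is a rough sketch of such a split
number type. This is NOT LNUM's actual code; the type and function
names are invented for illustration, and the only rule shown is
"stay integer while both operands are integers and the result fits,
otherwise fall back to single precision float".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tagged number: either a 32-bit integer or a float. */
typedef enum { NUM_INT, NUM_FLT } num_kind;

typedef struct {
    num_kind kind;
    union { int32_t i; float f; } v;
} num;

num num_from_int(int32_t i) { num n; n.kind = NUM_INT; n.v.i = i; return n; }
num num_from_flt(float f)   { num n; n.kind = NUM_FLT; n.v.f = f; return n; }

float num_to_flt(num n)
{
    assert(n.kind == NUM_INT || n.kind == NUM_FLT);
    return n.kind == NUM_INT ? (float)n.v.i : n.v.f;
}

/* Integer + integer stays integer unless it overflows 32 bits;
 * every other combination is computed in single precision. */
num num_add(num a, num b)
{
    if (a.kind == NUM_INT && b.kind == NUM_INT) {
        int64_t wide = (int64_t)a.v.i + (int64_t)b.v.i;
        if (wide >= INT32_MIN && wide <= INT32_MAX)
            return num_from_int((int32_t)wide);
        return num_from_flt((float)wide);
    }
    return num_from_flt(num_to_flt(a) + num_to_flt(b));
}
```

With a layout like this, the float branch is the only place a
single-precision FPU would be involved, while 32-bit integers keep
their full range.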

>
> Regards
> Dan
>
> ________________________________
> From: Bogdan Marinescu <[hidden email]>
> To: pito <[hidden email]>
> Cc: [hidden email]
> Sent: Sunday, November 20, 2011 9:26 AM
> Subject: Re: [eLua-dev] FP32 with fpu enabled
>
>
>
> 2011/11/20 pito <[hidden email]>
>
> There are a few new DSP instructions (stm32f4) which might speed up
> the soft math as well - marketing materials I saw say e.g.:
> Single cycle MUL/MAC: signed/unsigned multiply, signed/unsigned MAC,
> signed/unsigned MAC 64bit.
> They claim speed improvements vs. CM3: 4x for 16bit MAC, 2x for
> 32bit MAC, up to 7x for 64bit MAC..
>
> MAC is mostly a DSP operation, general purpose code doesn't use it that
> much, so I wouldn't hold my breath for too long.
> Best,
> Bogdan
>
> p.
>
>
>
> ----- ORIGINAL MESSAGE -----
> From: "Bogdan Marinescu" <[hidden email]>
> To: "Tim michals" <[hidden email]>, "eLua Users and
> Development List (www.eluaproject.net)" <[hidden email]>
> Subject: Re: [eLua-dev] FP32 with fpu enabled
> Date: 20.11.2011 - 17:14:44
>
>> Hi,
>>
>> On Sun, Nov 20, 2011 at 5:06 PM, Tim michals
>> <[hidden email]> wrote:
>>
>> > GCC provides three basic options for compiling floating-point code:
>> >
>> >   - Software floating point emulation, which is the default. In this
>> >     case, the compiler implements floating-point arithmetic by means
>> >     of library calls.
>> >   - VFP hardware floating-point support using the soft-float ABI. This
>> >     is selected by the -mfloat-abi=softfp option. When you select this
>> >     variant, the compiler generates VFP floating-point instructions,
>> >     but the resulting code uses the same call and return conventions
>> >     as code compiled with software floating point.
>> >   - VFP hardware floating-point support using the VFP ABI, which is the
>> >     VFP variant of the Procedure Call Standard for the ARM(R)
>> >     Architecture (AAPCS). This ABI uses VFP registers to pass function
>> >     arguments and return values, resulting in faster floating-point
>> >     code. To use this variant, compile with -mfloat-abi=hard.
>> >
>> > The CodeSourcery toolchain is compiled as soft, so to make the most
>> > of hard, libc and newlib need to be compiled using hard. Using
>> > softfp might be the easiest, so the standard libraries can still be
>> > used.
>> >
>>
>> Yes, softfp seems to be the best starting point
>> here.
>>
>>
>> >
>> > Another issue, maybe add two new types to eLua,
>> > long int (64) ,
>> > >
>>
>> This already happened. Have you missed the recent
>> system timer thread?
>>
>>
>> http://elua-development.2368040.n2.nabble.com/IMPORTANT-New-feature-on-the-master-branch-system-timer-td6918200.html
>> >
>>
>> > float (32)
>> >
>>
>> We could definitely do this, although I can see
>> eLua breaking in various,
>> wonderfully unexpected ways when the number type
>> won't be able to represent
>> a full 32-bit integer anymore :)
>>
>>
>> > and leaving the standard as int, there is a lot of discussion on
>> > the forum about timer support etc 64 bits, so just lump both
>> > int-64, and float into expanding the types.
>> >
>>
>> Yupppp, you definitely missed the system timer
>> thread :)
>>
>>
>> > But, that is a lot of work.
>> >
>>
>> I think the best option we have with this is LNUM.
>> If LNUM can
>> differentiate between single and double precision
>> operations (like it does
>> for integers and floats) we might be able to
>> benefit from the hardware
>> acceleration. I'll have to take a closer look at
>> that at some point.
>>
>> Best,
>> Bogdan
>>
>>
>> >
>> >   ------------------------------
>> > *From:* pito <[hidden email]>
>> > *To:* [hidden email]
>> > *Sent:* Sunday, November 20, 2011 3:38 AM
>> > *Subject:* [eLua-dev] FP32 with fpu enabled
>> >
>> > Hi, did somebody try to compile as FP32 with fpu support (e.g. for
>> > the stm32f4 board)? P.
>> >
Tony-12

Re: FP32 with fpu enabled

I think handling single precision floating point is important, because it's not just a matter of the STM32F4; it's becoming pervasive. Everybody (well, at least ST, TI, Atmel, NXP, and Freescale) is introducing affordable MCUs with it.  Also, IIRC, even the "big" MCUs (like the Cortex A8 in the BeagleBone) are single precision.  Unfortunately, double precision FPUs on MCUs are still scarce...

I'm not sure what is the best way to do it, but it does seem a shame not to take advantage of the FPU.

--Tony

Martin Guy

Re: FP32 with fpu enabled

2011/11/22 Tony <[hidden email]>:
> I think handling single precision floating point is important, because it's
> not just a matter of the STM32F4, it's becoming pervasive; everybody (well,
> at least ST, TI, Atmel, NXP, and Freescale) are introducing affordable MCUs
> with it.  Also, IIRC, even the "big" MCU's (like the Cortex A8 in the
> BeagleBone) are single precision.  Unfortunately, double precision FPU's on
> MCU's are still scarce...
>
> I'm not sure what is the best way to do it, but it does seem a shame not to
> take advantage of the FPU.

Compile a single-precision math library like liboil and make Lua
bindings to it, with the understanding from callers that the math is
performed in single precision. At worst, you
have to do a 64-bit float to 32-bit float conversion on each datum and result.

luaoil?  oil.dfft()...

   M
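
A rough sketch of the conversion cost mentioned above, assuming the
caller (Lua) hands over doubles and expects a double back. The function
name dot_sp is invented for this example and is not a liboil call; it
only shows the narrow, compute in single precision, widen pattern that
such a binding would follow.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product computed entirely in single precision.
 * Inputs and the result cross the boundary as doubles, so the only
 * extra work is one double->float conversion per datum and one
 * float->double conversion for the result. */
double dot_sp(const double *a, const double *b, size_t n)
{
    assert(a != NULL && b != NULL);

    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float x = (float)a[i];  /* narrow each input once */
        float y = (float)b[i];
        acc += x * y;           /* single-precision multiply-accumulate */
    }
    return (double)acc;         /* widen the result for the caller */
}
```

On a part with a single-precision FPU the loop body can run in
hardware, and the caller has to accept roughly 7 significant decimal
digits in the result.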
Pito

Re: FP32 with fpu enabled

I think for "real-world" applications running on an MCU, the
following setup might cover 95% of them:
eLua FP32 (with fpu support)
----------------------------
eLua number: 32bit fp
integer in mantissa: +/- 8,388,607 (24bit signed)
timer tick: 10ms, it overflows in 23h with above integer size

The question is how to handle the situation when you want to tackle
32bit values from e.g. timers or other peripherals when applicable.
This could be done outside eLua, passing 16bit words. This setup may
speed up eLua, lower the overhead and the code will be smaller, I
guess.
P.
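
The numbers above can be checked with a small sketch. The helper names
are made up for this example. Note that the +/- 8,388,607 figure is on
the conservative side: an IEEE 754 single has a 24-bit significand
(23 stored bits plus an implicit leading 1) with a separate sign bit,
so every integer up to 2^24 = 16,777,216 in magnitude is exact.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if v survives a round trip through a 32-bit float. */
int survives_float(int32_t v)
{
    volatile float f = (float)v;  /* volatile forces a real 32-bit store */
    return (int64_t)f == (int64_t)v;
}

/* Hours until a tick counter limited to max_count wraps or loses
 * precision, for a given tick period in seconds. */
double hours_until_overflow(uint32_t max_count, double tick_seconds)
{
    assert(tick_seconds > 0.0);
    return (double)max_count * tick_seconds / 3600.0;
}
```

With max_count = 8388607 and a 10 ms tick this gives about 23.3 hours,
matching the estimate above; using the full 2^24 range would roughly
double that.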


BogdanM

Re: FP32 with fpu enabled



2011/11/24 pito <[hidden email]>
> I think for "real-world" applications running on an MCU, the
> following setup might cover 95% of them:
> eLua FP32 (with fpu support)
> ----------------------------
> eLua number: 32bit fp
> integer in mantissa: +/- 8,388,607 (24bit signed)
> timer tick: 10ms, it overflows in 23h with above integer size

I think 95% is way too optimistic. On some eLua targets you can't even access all the GPIO pins with a number type that doesn't cover 32-bit integers.
 

> The question is how to handle the situation when you want to tackle
> 32bit values from e.g. timers or other peripherals when applicable.
> This could be done outside eLua, passing 16bit words.

I'm not sure I follow. Do you want to pass 32-bit numbers as two separate 16-bit numbers? 
 
> This setup may
> speed up eLua, lower the overhead and the code will be smaller, I
> guess.

I don't really think it would speed it up too much. The code will be smaller, yes. The overhead can't possibly be smaller than what we have now, when we use 32-bit integers directly.

Best,
Bogdan
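
For reference, the "two separate 16-bit numbers" reading of the
proposal would look roughly like the sketch below. The function names
are invented for illustration and nothing like this exists in eLua as
far as this thread shows. Each 16-bit half fits exactly in a float,
but every 32-bit register access then costs two numbers and a
split/join step.

```c
#include <assert.h>
#include <stdint.h>

/* Split a 32-bit value into two halves that a float can hold exactly. */
void split_u32(uint32_t v, float *hi, float *lo)
{
    assert(hi != 0 && lo != 0);
    *hi = (float)(v >> 16);       /* upper 16 bits, 0..65535 */
    *lo = (float)(v & 0xFFFFu);   /* lower 16 bits, 0..65535 */
}

/* Recombine the two halves into the original 32-bit value. */
uint32_t join_u32(float hi, float lo)
{
    assert(hi >= 0.0f && hi <= 65535.0f);
    assert(lo >= 0.0f && lo <= 65535.0f);
    return ((uint32_t)hi << 16) | (uint32_t)lo;
}
```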
 
Roger Critchlow

Re: FP32 with fpu enabled

Something I thought about doing years ago was to use the Quiet NaN coding of IEEE doubles as a type tag.  If you're stuck with keeping 64bit floats as the base numeric type, but you have to emulate the 64bit arithmetic, then it might be a win.  More advantage would come if you folded the rest of the lua type tagging into the same scheme, so you weren't doing too many levels of type dispatch.

An IEEE double is a quiet NaN if all 11 bits of exponent are 1's and the most significant bit of the fraction is a 1.  At that point the sign bit and the remaining 51 bits of fraction can be used to represent anything you like.  Using the top 16 bits of the double as a primary type tag, most of the 64k values represent ordinary double values, but there are 16 different bit patterns that are IEEE Quiet NaN's, 0xfff8 .. 0xffff and 0x7ff8 .. 0x7fff.  (0xfff0 .. 0xfff7 and 0x7ff0 .. 0x7ff7 cover the infinities and the signalling NaN's.)  http://en.wikipedia.org/wiki/NaN for more.

So keep doubles as the base numeric type, but store 32 bit floats as Quiet NaN tagged values inside doubles.  Use the FPU to implement 32 bit floating point if it's available, or simply promote the values to doubles on first touch and use the existing emulation.

Or let the double with the high 16 bit value 0xffff signify a lua object, the second 16 bits be a class identifier, and use the low 32 bits as the object pointer or the immediate value of the object.  

A 52 bit value representation space (sign bit plus 51 fraction bits) is a terrible thing to waste.

-- rec --
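
A minimal sketch of the tagging scheme, assuming a 64-bit IEEE double
and a hypothetical tag value of 0xFFF9 in the top 16 bits for "boxed
32-bit float". The tag and all function names are made up for this
example. 0xFFF9 is chosen so that it is a quiet NaN pattern but differs
from the default NaNs (0x7FF8... or 0xFFF8...) that ordinary arithmetic
tends to produce.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define F32_TAG UINT64_C(0xFFF9)  /* hypothetical tag, top 16 bits */

/* Reinterpret a double as its raw 64-bit pattern. */
uint64_t double_bits(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Store a float in the low 32 bits of a quiet-NaN double. */
double box_f32(float f)
{
    assert(sizeof(double) == sizeof(uint64_t));
    uint32_t payload;
    memcpy(&payload, &f, sizeof payload);
    uint64_t u = (F32_TAG << 48) | (uint64_t)payload;
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}

/* One 16-bit compare tells a boxed float from an ordinary double. */
int is_boxed_f32(double d)
{
    return (double_bits(d) >> 48) == F32_TAG;
}

float unbox_f32(double d)
{
    uint32_t payload = (uint32_t)double_bits(d);  /* low 32 bits */
    float f;
    memcpy(&f, &payload, sizeof f);
    return f;
}
```

Arithmetic code would test is_boxed_f32 first and either use the FPU
on the unboxed floats or fall through to the existing double emulation.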
