Flexible RAM allocation

Pito

Flexible RAM allocation

Hi, as the newest MCUs have several RAM segments (e.g. the STM32F4
has two), it would be interesting to be able to tell the allocator
where to put the stack/system parts. For example, on the STM32F4 the
CCM memory (64 kB) is the fastest (0 wait states), so maybe we should
put the stack, heap, etc. there. I think this would require something
like:

#define RAMSYS_START  (void*)end
#define RAMSYS_END  (void*)(CCMstart+CCMSize-STACKSIZE-1)
#define RAM1_START  (void*)(SRAMStart)
#define RAM1_END  (void*)(SRAMStart + SRAMSize-1)

and forcing the eLua build to use RAMSYS_START/END to place the
"system-like stuff" where we want to have it. Is that possible?
P.



_______________________________________________
eLua-dev mailing list
[hidden email]
https://lists.berlios.de/mailman/listinfo/elua-dev
Pito

Re: Flexible RAM allocation

To be more specific, for example on the STM32F4, we need to have this
in one place:
..
// MCU specific RAM regions defs
#define CCMDATARAM_BASE   0x10000000
#define CCMDATARAM_SIZE   ( 64 * 1024 )
#define SRAM_BASE   0x20000000
#define SRAM_SIZE   ( 128 * 1024 )

// !! here we always place the system specific stuff
#define RAMSYS_BASE   CCMDATARAM_BASE
#define RAMSYS_SIZE   CCMDATARAM_SIZE

// here are optional (slower) RAM regions
#define RAM1_BASE   SRAM_BASE
#define RAM1_SIZE   SRAM_SIZE

// defs for the allocator
#define RAMSYS_START  (void*)end
#define RAMSYS_END  ( void* )( RAMSYS_BASE + RAMSYS_SIZE - STACK_SIZE_TOTAL - 1 )
#define RAM1_START  (void*)(RAM1_BASE)
#define RAM1_END  (void*)(RAM1_BASE + RAM1_SIZE - 1)

and the allocator stuff then:
...
// Allocator data: define your free memory zones here in two arrays
// (start addresses and end addresses)
#define MEM_START_ADDRESS       { RAMSYS_START, RAM1_START }
#define MEM_END_ADDRESS         { RAMSYS_END, RAM1_END }
..

The compiler must always compile the system stuff into RAMSYS_BASE
region - needs to be set somewhere :).
P.

----- ORIGINAL MESSAGE -----
From: "pito" <[hidden email]>
To: [hidden email]
Subject: [eLua-dev] Flexible RAM allocation
Date: 18.11.2011 - 11:59:06

> [...]
BogdanM

Re: Flexible RAM allocation



2011/11/18 pito <[hidden email]>
> [...]
> The compiler must always compile the system stuff into RAMSYS_BASE
> region - needs to be set somewhere :).

That would be the linker script (src/platform/stm32/stm32.ld or whatever is used for STM32F4).

Best,
Bogdan
 
Pito

Re: Flexible RAM allocation

In reply to this post by Pito
> #define RAMSYS_BASE   CCMDATARAM_BASE
> #define RAMSYS_SIZE   CCMDATARAM_SIZE
and of course all references to "SRAM.." in the sources have to be
replaced with "RAMSYS.." :)
p.


Pito

Re: Flexible RAM allocation

Well, I made some effort towards this:

1. In stm32.ld I did the following:
..
MEMORY
{
sram (W!RX) : ORIGIN = 0x10000000, LENGTH = 64K
/* CCMRAM (xrw) : ORIGIN = 0x20000000 , LENGTH = 128K */
flash (RX) : ORIGIN = 0x08000000, LENGTH = 1024K
}
as the linker does not accept RAMSYS_BASE etc. as constants, and
..
end = .;
PROVIDE( _estack = 0x10010000 );

as I assume this is the top of the eLua stack (now in CCM).
I also put the new defs around the allocator.

2. It compiles and it runs smaller programs, but when I run my
biglife I get:
[Hard fault handler]
R0 = 2001fbc0
R1 = 2001fcd7
R2 = 2001fcd7
R3 = 4f
R12 = 0
LR = 800f653
PC = 800f654
PSR = 61000000
BFAR = 2d2d0a35
CFSR = 8600
HFSR = 40000000
DFSR = 8
AFSR = 0

3. when I run my sieve.lua, I get the expected results, BUT IT SHOWS
IT IS SLOWER:

-- STM32F4 Discovery kit (168MHz MCU, eLua 0.8)
-- SYS in SRAM
-- Sieve 256, Count: 54, 1000x Iterations elapsed: 6.261334 secs
-- SYS in CCM:
-- Sieve 256, Count: 54, 1000x Iterations elapsed: 6.396253 secs

This is a big surprise for me, though. Bigger than the neutrino
speeds at Gran Sasso. But I am not sure everything is set up
properly; somebody else has to confirm :) ..
P.


Martin Guy

Re: Flexible RAM allocation

2011/11/18 pito <[hidden email]>:
> 1. in stm32.ld I did following
> ..
> MEMORY
> {
> sram (W!RX) : ORIGIN = 0x10000000, LENGTH = 64K
> /* CCMRAM (xrw) : ORIGIN = 0x20000000 , LENGTH = 128K */
> flash (RX) : ORIGIN = 0x08000000, LENGTH = 1024K
> }

instead of the standard

MEMORY
{
    sram (W!RX) : ORIGIN = 0x20000000, LENGTH = 64k
    flash (RX) : ORIGIN = 0x08000000, LENGTH = 512k
}


> 2. It compiles, it runs smaller code, but when I run my biglife I
> get:
> [Hard fault handler]
> R0 = 2001fbc0
> R1 = 2001fcd7
> R2 = 2001fcd7
> R3 = 4f
> R12 = 0
> LR = 800f653
> PC = 800f654
> PSR = 61000000
> BFAR = 2d2d0a35
> CFSR = 8600
> HFSR = 40000000
> DFSR = 8
> AFSR = 0
>
> 3. when I run my sieve.lua, I get the expected results, BUT IT SHOWS
> IT IS SLOWER:
>
> -- SYS in SRAM
> -- Sieve 256, Count: 54, 1000x Iterations elapsed: 6.261334 secs
> -- SYS in CCM:
> -- Sieve 256, Count: 54, 1000x Iterations elapsed: 6.396253 secs
>
> This is a big surprise for me, though.

Such a small difference could be caused by it having less RAM, if you
are using 64K internal and 128K CCM, and so eLua having to garbage
collect more frequently, which suggests that your RAM speeds are
identical.

By comparison, the difference in speed between the small internal SRAM
and the external SDRAM on the AVR32 platforms was a factor of 6, not
the 2% or so you are seeing - that could just be noise in the
measurement.

   M
jbsnyder

Re: Flexible RAM allocation

On Fri, Nov 18, 2011 at 1:01 PM, Martin Guy <[hidden email]> wrote:

> [...]
> Such a small difference could be caused by it having less RAM, if you
> are using 64K internal and 128K CCM, and so eLua having to garbage
> collect more frequently, which suggests that your RAM speeds are
> identical.

Yeah, if you include something like:
print(collectgarbage("count"))

you can get an idea of how much memory (in KB) the VM is using as
time goes along.

I haven't tried this on STM32F4, but at least on the desktop where it
wasn't doing as much aggressive collection it got up to 263kB by the
end of N=1000 for sieve.

If you want to be fair for the comparison, you might want to force a
full garbage collection every cycle or every N cycles so that you can
be sure that both configurations are doing the same amount of work.

For reference:
http://www.lua.org/manual/5.1/manual.html#pdf-collectgarbage

Pito

Re: Flexible RAM allocation

In reply to this post by Martin Guy
Let me clarify a little bit. What I am trying to do is the following:
to have a simple way to decide where to put the eLua system stuff,
either in SRAM or in CCM. Therefore, I created a set of #defines to
make it easier:

1. in platform_conf.h
..
//$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

// STM32F4 MCU specific
#define CCMDATARAM_BASE   0x10000000
#define CCMDATARAM_SIZE   ( 64 * 1024 )
#define SRAM_BASE    0x20000000
#define SRAM_SIZE   ( 128 * 1024 )

// we always place system stuff in the RAMSYS
// default is SRAM
//#define RAMSYS_BASE   SRAM_BASE
//#define RAMSYS_SIZE   SRAM_SIZE
// now we place it in CCM
#define RAMSYS_BASE    CCMDATARAM_BASE
#define RAMSYS_SIZE    CCMDATARAM_SIZE

// here are optional regions
#define RAM1_BASE   SRAM_BASE
#define RAM1_SIZE   SRAM_SIZE

// defs for the allocator, do not touch
#define RAMSYS_START  ( void* )end
#define RAMSYS_END  ( void* )( RAMSYS_BASE + RAMSYS_SIZE - STACK_SIZE_TOTAL - 1 )
#define RAM1_START  ( void* )(RAM1_BASE)
#define RAM1_END  ( void* )(RAM1_BASE + RAM1_SIZE - 1)

// Allocator data: define your free memory zones here in two arrays
// (start addresses and end addresses)
#define MEM_START_ADDRESS       { RAM1_START, RAMSYS_START  }
#define MEM_END_ADDRESS         { RAM1_END, RAMSYS_END  }

// $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

2. In stm32.ld I made the change manually to point sram to CCM (any
hint how to force the linker to accept e.g. ORIGIN = RAMSYS_BASE ??)

MEMORY
{
sram (W!RX) : ORIGIN = 0x10000000, LENGTH = 64k
flash (RX) : ORIGIN = 0x08000000, LENGTH = 1024K
}

and
..
end = .;
PROVIDE( _estack = 0x10010000 );

to have it in CCM.

3. It compiles, but there is still a bug somewhere. Small sources
run, bigger ones crash.

4. As the sieve runs, I did the simple test - you are right the GC
probably fills the time gap there.. A better test is needed.

5. There is still a bug somewhere I cannot find yet. I am sure it
has to work with elua system compiled into CCM(64k) and with
SRAM(128k) as RAM1.

P.

