Hi,

as the newest MCUs have more than one RAM segment (e.g. the STM32F4 has 2 RAM segments), it would be interesting to tell the allocator where to put the stack/system parts. For example, on the STM32F4 the CCM memory (64 kB) is the fastest (0 wait states), so maybe we should put the stack, heap, etc. there. I think this would require something like:

    #define RAMSYS_START (void*)end
    #define RAMSYS_END   (void*)(CCMstart+CCMSize-STACKSIZE-1)
    #define RAM1_START   (void*)(SRAMStart)
    #define RAM1_END     (void*)(SRAMStart + SRAMSize-1)

and forcing the eLua compiler to use RAMSYS_START/END to place "system-like stuff" where we want to have it. Is that possible?

P.

_______________________________________________
eLua-dev mailing list
[hidden email]
https://lists.berlios.de/mailman/listinfo/elua-dev

To be more specific, for example (stm32f4), we need to have this in one place:

    ..
    // MCU specific RAM regions defs
    #define CCMDATARAM_BASE 0x10000000
    #define CCMDATARAM_SIZE ( 64 * 1024 )
    #define SRAM_BASE 0x20000000
    #define SRAM_SIZE ( 128 * 1024 )

    // !! here we always place the system specific stuff
    #define RAMSYS_BASE CCMDATARAM_BASE
    #define RAMSYS_SIZE CCMDATARAM_SIZE

    // here are optional (slower) RAM regions
    #define RAM1_BASE SRAM_BASE
    #define RAM1_SIZE SRAM_SIZE

    // defs for the allocator
    #define RAMSYS_START (void*)end
    #define RAMSYS_END ( void* )( RAMSYS_BASE + RAMSYS_SIZE - STACK_SIZE_TOTAL - 1 )
    #define RAM1_START (void*)(RAM1_BASE)
    #define RAM1_END (void*)(RAM1_BASE + RAM1_SIZE - 1)

and the allocator stuff then:

    ...
    // Allocator data: define your free memory zones here in two arrays
    // (start addresses and end addresses)
    #define MEM_START_ADDRESS { RAMSYS_START, RAM1_START }
    #define MEM_END_ADDRESS { RAMSYS_END, RAM1_END }
    ..

The compiler must always compile the system stuff into the RAMSYS_BASE region; that needs to be set somewhere :).

P.

----- ORIGINAL MESSAGE -----
From: "pito" <[hidden email]>
To: [hidden email]
Subject: [eLua-dev] Flexible RAM allocation
Date: 18.11.2011 - 11:59:06

> Hi, as the newest mcus have got more ram segments
> (e.g. stm32f4 has 2 ram segments) it would be interesting
> to tell to the allocator where to put the stack/system parts.
> [...]

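The region layout proposed above can be tried out on a desktop machine before touching the target. The following is only a sketch under stated assumptions: `STACK_SIZE_TOTAL` and `END_OF_BSS` (a stand-in for the linker symbol `end`) are made-up values, and `mem_regions`, `region_size` and `region_of` are hypothetical helper names, not part of eLua. It shows how the two start/end arrays translate into usable sizes and how to tell which region an address falls into.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from the post above. */
#define CCMDATARAM_BASE  0x10000000u
#define CCMDATARAM_SIZE  (64u * 1024u)
#define SRAM_BASE        0x20000000u
#define SRAM_SIZE        (128u * 1024u)

/* Hypothetical stand-ins, chosen only for this host-side sketch. */
#define STACK_SIZE_TOTAL 2048u                        /* assumed stack reservation */
#define END_OF_BSS       (CCMDATARAM_BASE + 0x3000u)  /* assumed value of linker symbol `end` */

typedef struct {
    uintptr_t start; /* first usable byte */
    uintptr_t end;   /* last usable byte (inclusive) */
} mem_region_t;

/* Same order as MEM_START_ADDRESS / MEM_END_ADDRESS: RAMSYS first, then RAM1. */
const mem_region_t mem_regions[] = {
    { END_OF_BSS, CCMDATARAM_BASE + CCMDATARAM_SIZE - STACK_SIZE_TOTAL - 1u },
    { SRAM_BASE,  SRAM_BASE + SRAM_SIZE - 1u },
};

#define NUM_REGIONS (sizeof(mem_regions) / sizeof(mem_regions[0]))

/* Number of bytes the allocator may use in region i. */
size_t region_size(size_t i) {
    assert(i < NUM_REGIONS);
    assert(mem_regions[i].end >= mem_regions[i].start);
    return (size_t)(mem_regions[i].end - mem_regions[i].start) + 1u;
}

/* Index of the region containing addr, or -1 if it lies outside all regions. */
int region_of(uintptr_t addr) {
    for (size_t i = 0; i < NUM_REGIONS; i++) {
        if (addr >= mem_regions[i].start && addr <= mem_regions[i].end) {
            return (int)i;
        }
    }
    return -1;
}
```

With these assumed values, the RAMSYS region excludes both the statically allocated data below `end` and the stack at the top of CCM, so an address inside the reserved stack area is reported as belonging to no region.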
2011/11/18 pito <[hidden email]>:
> To be more specific, for example (stm32f4) - we need to have this on
> [...]
> P.

That would be the linker script (src/platform/stm32/stm32.ld or whatever is used for STM32F4).

Best,
Bogdan

In reply to this post by Pito
> #define RAMSYS_BASE CCMDATARAM_BASE
> #define RAMSYS_SIZE CCMDATARAM_SIZE

And of course all references to "SRAM.." in the sources have to be replaced with "RAMSYS.." :)

p.

Well, I made some effort towards this:

1. In stm32.ld I did the following:

    ..
    MEMORY
    {
      sram (W!RX) : ORIGIN = 0x10000000, LENGTH = 64K
      /* CCMRAM (xrw) : ORIGIN = 0x20000000 , LENGTH = 128K */
      flash (RX) : ORIGIN = 0x08000000, LENGTH = 1024K
    }

   as it does not want to take RAMSYS_BASE etc. as a constant, and

    ..
    end = .;
    PROVIDE( _estack = 0x10010000 );

   as I assume it is the top of the eLua stack (now in CCM). I also put the new defs around the allocator.

2. It compiles and it runs smaller code, but when I run my biglife I get:

    [Hard fault handler]
    R0 = 2001fbc0
    R1 = 2001fcd7
    R2 = 2001fcd7
    R3 = 4f
    R12 = 0
    LR = 800f653
    PC = 800f654
    PSR = 61000000
    BFAR = 2d2d0a35
    CFSR = 8600
    HFSR = 40000000
    DFSR = 8
    AFSR = 0

3. When I run my sieve.lua, I get the expected results, BUT IT SHOWS IT IS SLOWER:

    -- STM32F4 Discovery kit (168MHz MCU, eLua 0.8)
    -- SYS in SRAM
    -- Sieve 256, Count: 54, 1000x Iterations elapsed: 6.261334 secs
    -- SYS in CCM:
    -- Sieve 256, Count: 54, 1000x Iterations elapsed: 6.396253 secs

This is a big surprise for me, though. Bigger than the neutrino speeds at Gran Sasso. But I am not sure everything is set up properly; somebody else has to confirm :) ..

P.

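The register dump in item 2 can be read more easily once the status words are decoded. The sketch below uses the Cortex-M fault status bit positions (the BusFault status byte occupies bits 15:8 of CFSR, and bit 30 of HFSR is FORCED); the function names are hypothetical helpers for illustration, not part of eLua or of any vendor library.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* BusFault status bits as they appear inside CFSR (BFSR is CFSR[15:8]). */
#define CFSR_IBUSERR     (1u << 8)   /* bus error on instruction fetch */
#define CFSR_PRECISERR   (1u << 9)   /* precise data bus error */
#define CFSR_IMPRECISERR (1u << 10)  /* imprecise data bus error */
#define CFSR_UNSTKERR    (1u << 11)  /* bus error on exception return unstacking */
#define CFSR_STKERR      (1u << 12)  /* bus error on exception entry stacking */
#define CFSR_BFARVALID   (1u << 15)  /* BFAR holds the faulting address */

/* HardFault status: the fault was escalated from a configurable fault. */
#define HFSR_FORCED      (1u << 30)

int is_precise_bus_fault(uint32_t cfsr) {
    return (cfsr & CFSR_PRECISERR) != 0;
}

int hard_fault_was_forced(uint32_t hfsr) {
    return (hfsr & HFSR_FORCED) != 0;
}

/* Address that caused the bus fault; only meaningful when BFARVALID is set. */
uint32_t faulting_address(uint32_t cfsr, uint32_t bfar) {
    assert(cfsr & CFSR_BFARVALID);
    return bfar;
}

/* Print a short human-readable summary of the bus-fault related bits. */
void print_fault_summary(uint32_t cfsr, uint32_t hfsr, uint32_t bfar) {
    printf("forced hard fault : %s\n", hard_fault_was_forced(hfsr) ? "yes" : "no");
    printf("precise bus error : %s\n", is_precise_bus_fault(cfsr) ? "yes" : "no");
    printf("imprecise bus err : %s\n", (cfsr & CFSR_IMPRECISERR) ? "yes" : "no");
    printf("stacking error    : %s\n", (cfsr & CFSR_STKERR) ? "yes" : "no");
    if (cfsr & CFSR_BFARVALID) {
        printf("faulting address  : 0x%08lx\n", (unsigned long)bfar);
    }
}
```

Applied to the dump above (CFSR = 0x8600, HFSR = 0x40000000), this reports a bus fault escalated to a hard fault, with a valid BFAR of 0x2d2d0a35. That address is in neither CCM nor SRAM, and its bytes happen to be printable ASCII, so one possible (unconfirmed) reading is that a pointer was overwritten with text data.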
2011/11/18 pito <[hidden email]>:
> 1. in stm32.ld I did following
> ..
> MEMORY
> {
> sram (W!RX) : ORIGIN = 0x10000000, LENGTH = 64K
> /* CCMRAM (xrw) : ORIGIN = 0x20000000 , LENGTH = 128K */
> flash (RX) : ORIGIN = 0x08000000, LENGTH = 1024K
> }

instead of the standard

    MEMORY
    {
      sram (W!RX) : ORIGIN = 0x20000000, LENGTH = 64k
      flash (RX) : ORIGIN = 0x08000000, LENGTH = 512k
    }

> 2. It compiles, it runs smaller code, but when I run my biglife I
> get:
> [Hard fault handler]
> [...]
>
> 3. when I run my sieve.lua, I get the expected results, BUT IT SHOWS
> IT IS SLOWER:
>
> -- SYS in SRAM
> -- Sieve 256, Count: 54, 1000x Iterations elapsed: 6.261334 secs
> -- SYS in CCM:
> -- Sieve 256, Count: 54, 1000x Iterations elapsed: 6.396253 secs
>
> This is a big surprise for me, though.

Such a small difference could be caused by it having less RAM, if you are using the 64K CCM rather than the 128K internal SRAM, and so eLua having to garbage collect more frequently, which suggests that your RAM speeds are identical.

By comparison, the difference in speed between the small internal SRAM and the external SDRAM on the AVR32 platforms was a factor of 6, not the roughly 2% you are seeing - that could just be noise in the measurement.

M

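For reference, the size of the gap between the two sieve timings quoted above can be worked out directly. This is just the arithmetic on the two numbers from the post; `relative_slowdown_percent` is a hypothetical helper name.

```c
#include <assert.h>

/* Relative slowdown of `measured_secs` against `baseline_secs`, in percent.
 * A positive result means the measured run was slower than the baseline. */
double relative_slowdown_percent(double baseline_secs, double measured_secs) {
    assert(baseline_secs > 0.0);
    return (measured_secs - baseline_secs) / baseline_secs * 100.0;
}
```

With 6.261334 s (SYS in SRAM) as the baseline and 6.396253 s (SYS in CCM) as the measurement, the slowdown is a little over 2%, which is far from the factor of 6 mentioned for the AVR32 case.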
On Fri, Nov 18, 2011 at 1:01 PM, Martin Guy <[hidden email]> wrote:
> 2011/11/18 pito <[hidden email]>:
>> 3. when I run my sieve.lua, I get the expected results, BUT IT SHOWS
>> IT IS SLOWER:
>>
>> -- SYS in SRAM
>> -- Sieve 256, Count: 54, 1000x Iterations elapsed: 6.261334 secs
>> -- SYS in CCM:
>> -- Sieve 256, Count: 54, 1000x Iterations elapsed: 6.396253 secs
>>
>> This is a big surprise for me, though.
>
> Such a small difference could be caused by it having less RAM, if you
> are using the 64K CCM rather than the 128K internal SRAM, and so eLua
> having to garbage collect more frequently, which suggests that your
> RAM speeds are identical.

Yeah, if you include something like:

    print(collectgarbage("count"))

you can get an idea of how much collectable memory the VM is holding as time goes along. I haven't tried this on the STM32F4, but at least on the desktop, where it wasn't collecting as aggressively, it got up to 263kB by the end of N=1000 for sieve.

If you want the comparison to be fair, you might want to force a full garbage collection every cycle or every N cycles so that you can be sure that both configurations are doing the same amount of work.

For reference: http://www.lua.org/manual/5.1/manual.html#pdf-collectgarbage

> By comparison, the difference in speed between the small internal SRAM
> and the external SDRAM on the AVR32 platforms was a factor of 6, not
> the roughly 2% you are seeing - that could just be noise in the
> measurement.
>
> M

In reply to this post by Martin Guy
Let me clarify a little bit. What I am trying to do is the following: to have a simple way to decide where to put the eLua system stuff, either in SRAM or in CCM. Therefore, I created a set of #defines to make it easier:

1. In platform_conf.h:

    ..
    //$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
    // STM32F4 MCU specific
    #define CCMDATARAM_BASE 0x10000000
    #define CCMDATARAM_SIZE ( 64 * 1024 )
    #define SRAM_BASE 0x20000000
    #define SRAM_SIZE ( 128 * 1024 )

    // we always place system stuff in the RAMSYS
    // default is SRAM
    //#define RAMSYS_BASE SRAM_BASE
    //#define RAMSYS_SIZE SRAM_SIZE
    // now we place it in CCM
    #define RAMSYS_BASE CCMDATARAM_BASE
    #define RAMSYS_SIZE CCMDATARAM_SIZE

    // here are optional regions
    #define RAM1_BASE SRAM_BASE
    #define RAM1_SIZE SRAM_SIZE

    // defs for the allocator, do not touch
    #define RAMSYS_START ( void* )end
    #define RAMSYS_END ( void* )( RAMSYS_BASE + RAMSYS_SIZE - STACK_SIZE_TOTAL - 1 )
    #define RAM1_START ( void* )(RAM1_BASE)
    #define RAM1_END ( void* )(RAM1_BASE + RAM1_SIZE - 1)

    // Allocator data: define your free memory zones here in two arrays
    // (start addresses and end addresses)
    #define MEM_START_ADDRESS { RAM1_START, RAMSYS_START }
    #define MEM_END_ADDRESS { RAM1_END, RAMSYS_END }
    // $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

2. In stm32.ld I made the change manually to point sram to CCM (any hint how to force the linker to accept e.g. ORIGIN = RAMSYS_BASE ??):

    MEMORY
    {
      sram (W!RX) : ORIGIN = 0x10000000, LENGTH = 64k
      flash (RX) : ORIGIN = 0x08000000, LENGTH = 1024K
    }

   and

    ..
    end = .;
    PROVIDE( _estack = 0x10010000 );

   to have it in CCM.

3. It compiles, but there is still a bug somewhere. Small sources run, bigger ones crash.

4. As the sieve runs, I did the simple test. You are right, the GC probably accounts for the time gap there. A better test is needed.

5. There is still a bug somewhere that I cannot find yet. I am sure it has to work with the eLua system compiled into CCM (64k) and with SRAM (128k) as RAM1.

P.
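Because the linker script values in step 2 are typed in by hand, they can silently drift away from the `#define`s in step 1. One way to guard against that, sketched here under assumptions, is to mirror the hand-edited numbers in C and let the compiler compare them. The `LD_*` macros, `ramsys_top` and `regions_overlap` are hypothetical names introduced only for this illustration; they are not part of eLua, and the check needs a C11 compiler for `_Static_assert`.

```c
#include <assert.h>
#include <stdint.h>

/* From platform_conf.h in the post above. */
#define CCMDATARAM_BASE 0x10000000u
#define CCMDATARAM_SIZE (64u * 1024u)
#define SRAM_BASE       0x20000000u
#define SRAM_SIZE       (128u * 1024u)
#define RAMSYS_BASE     CCMDATARAM_BASE
#define RAMSYS_SIZE     CCMDATARAM_SIZE
#define RAM1_BASE       SRAM_BASE
#define RAM1_SIZE       SRAM_SIZE

/* Hypothetical mirror of the values typed by hand into stm32.ld. */
#define LD_SRAM_ORIGIN  0x10000000u
#define LD_SRAM_LENGTH  (64u * 1024u)
#define LD_ESTACK       0x10010000u

/* Compile-time checks: the build fails if the two files disagree. */
_Static_assert(LD_SRAM_ORIGIN == RAMSYS_BASE, "stm32.ld ORIGIN != RAMSYS_BASE");
_Static_assert(LD_SRAM_LENGTH == RAMSYS_SIZE, "stm32.ld LENGTH != RAMSYS_SIZE");
_Static_assert(LD_ESTACK == RAMSYS_BASE + RAMSYS_SIZE, "_estack is not the top of RAMSYS");

/* One past the last byte of the system RAM region (initial stack pointer). */
uint32_t ramsys_top(void) {
    return RAMSYS_BASE + RAMSYS_SIZE;
}

/* Non-zero if [base_a, base_a + size_a) and [base_b, base_b + size_b) share any byte. */
int regions_overlap(uint32_t base_a, uint32_t size_a, uint32_t base_b, uint32_t size_b) {
    assert(size_a > 0u && size_b > 0u);
    uint64_t end_a = (uint64_t)base_a + size_a; /* exclusive end, 64-bit to avoid wrap */
    uint64_t end_b = (uint64_t)base_b + size_b;
    return base_a < end_b && base_b < end_a;
}
```

With the numbers from the post, all three static checks pass: 0x10000000 plus 64 kB is exactly 0x10010000, the `_estack` value used in the linker script, and the CCM and SRAM regions do not overlap. So the constants themselves are consistent, and the remaining bug presumably lies elsewhere.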