Hi,
It seems to me that flash prefetch is currently not ON (in system_stm32f4xx.c, line 396). The line should be:

FLASH->ACR = FLASH_ACR_PRFTEN | FLASH_ACR_ICEN | FLASH_ACR_DCEN | FLASH_ACR_LATENCY_5WS;

A benchmark I ran shows:

1. without FLASH_ACR_PRFTEN set:
..1000x Iterations elapsed: 6.396281 secs

2. with FLASH_ACR_PRFTEN:
..1000x Iterations elapsed: 5.618634 secs

3. with 4ws (do not try this at home; at your own risk, you may brick your board!):
..1000x Iterations elapsed: 5.258016 secs

P.

_______________________________________________
eLua-dev mailing list [hidden email] https://lists.berlios.de/mailman/listinfo/elua-dev
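For anyone following along, here is a minimal sketch of how the value written to FLASH->ACR in the message above is composed. The bit positions are assumed from ST's stm32f4xx.h header (LATENCY in bits 2:0, PRFTEN bit 8, ICEN bit 9, DCEN bit 10). The ACR_* macro names and the flash_acr_value() helper are hypothetical, made up for illustration so the code can be compiled on a host; on the target you would use the header's own FLASH_ACR_* macros.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout of FLASH_ACR, assumed from ST's stm32f4xx.h.
   These ACR_* names are illustrative, not the header's macros. */
#define ACR_LATENCY_MASK 0x00000007u /* LATENCY[2:0]: number of wait states */
#define ACR_PRFTEN       0x00000100u /* bit 8: prefetch enable */
#define ACR_ICEN         0x00000200u /* bit 9: instruction cache enable */
#define ACR_DCEN         0x00000400u /* bit 10: data cache enable */

/* Hypothetical helper: build the value that would be written to FLASH->ACR. */
uint32_t flash_acr_value(unsigned wait_states, int prefetch, int icache, int dcache)
{
    assert(wait_states <= 7u); /* LATENCY is a 3-bit field */

    uint32_t acr = (uint32_t)wait_states & ACR_LATENCY_MASK;
    if (prefetch) acr |= ACR_PRFTEN;
    if (icache)   acr |= ACR_ICEN;
    if (dcache)   acr |= ACR_DCEN;
    return acr;
}
```

With these assumed bit positions, flash_acr_value(5, 1, 1, 1) corresponds to the proposed line (prefetch, both caches, 5 wait states), and flash_acr_value(5, 0, 1, 1) to the same setting without prefetch.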
..it would be a nice option to be able to compile the eLua interpreter into RAM (i.e. into CCM on the stm32f4). Flash seems to be quite slow :)..

p.
2011/11/21 pito <[hidden email]>:
> ..it would be a nice option to have for compiling eLua interpreter
> into ram (ie in CCM for stm32f4). Flash seems to be quite slow :)..

It's certainly possible to force some more things into RAM. One easy thing you can do is partially or fully disable LTR:
http://www.eluaproject.net/doc/v0.8/en_arch_ltr.html

However, you could get some of the performance difference you'd see from this by using "local" variables selectively within the program to reduce flash lookups.

Also note that the CCM is only for data; the MCU won't execute any code from that RAM. You could certainly use it for storing data structures and bytecode, though.

On a related note: I've turned on prefetch by default in the branch, removed a duplicate call to the clock setup code, and double-checked the PLL setup, which I'm now pretty sure is correct after getting a bit confused by the example code that ST ships. I'm not completely sure why they don't enable prefetch by default. There is a related erratum, but I'm not sure exactly what effect it has, since there definitely seems to be a performance benefit to turning prefetch on, and they enable it by default on STM32F2xx parts.

The related erratum is:
"The ART Accelerator prefetch queue instruction is not supported. This limitation does not prevent the ART Accelerator from using the cache enable/disable capability and the selection of the number of wait states according to the system frequency"
2011/11/28 James Snyder <[hidden email]>:

I don't understand. So it doesn't exist, but turning it on makes a difference? :)

Best,
Bogdan
Well, I'm not completely sure. I haven't tried replicating pito's results, but I think I'll try and see how much time variance I get when I run some benchmarks. One might figure that those timing differences could be chalked up to using a low-frequency system timer, but the difference there is many cycles at the current 16 Hz rate that the system timer is set for.

The fact that they have greyed out the prefetch feature in their clock configuration tool, and that they've said here that the "prefetch instruction" isn't supported on RevA parts, would seem to suggest that the setting shouldn't do anything. I'm not completely clear whether there's a difference between a "prefetch queue instruction" and simply enabling prefetch as a feature in the flash access control register.

Without further detail, it might be in our best interests to disable it for RevA parts, even though there's no guidance on whether any unexpected results might occur from attempting to enable it. The errata just says that the prefetch instruction is unsupported, that there's no workaround, and that it will be fixed in the next silicon rev. If they were referring to this issue, it would have been a bit clearer had they said that "instruction prefetch" was unsupported.

The description of "instruction prefetch" in the "flash memory interface" manual (http://www.st.com/internet/com/TECHNICAL_RESOURCES/TECHNICAL_LITERATURE/PROGRAMMING_MANUAL/DM00023388.pdf) is basically what you would expect of such a feature:

"Each Flash memory read operation provides 128 bits from either four instructions of 32 bits or 8 instructions of 16 bits according to the program launched. So, in case of sequential code, at least four CPU cycles are needed to execute the previous read instruction line. Prefetch on the I-Code bus can be used to read the next sequential instruction line from the Flash memory while the current instruction line is being requested by the CPU. Prefetch is enabled by setting the PRFTEN bit in the FLASH_ACR register."

*shrug*
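To make the manual's description a bit more concrete, here is a deliberately crude sketch of straight-line code fetched from 128-bit flash lines. Everything in it is an assumption for illustration: one instruction executes per cycle, a line read takes (wait states + 1) cycles, the caches are ignored, and "prefetch" means the next line is read fully in parallel with executing the current one. The function names are hypothetical, and the real ART Accelerator may behave quite differently.

```c
#include <assert.h>

#define FLASH_LINE_BITS 128u /* one flash read returns 128 bits (from the manual) */

/* Instructions delivered by one flash line: 4 x 32-bit or 8 x 16-bit. */
unsigned instructions_per_line(unsigned instr_bits)
{
    assert(instr_bits == 16u || instr_bits == 32u);
    return FLASH_LINE_BITS / instr_bits;
}

/* Toy model, no prefetch: execute the line, then stall for the
   wait states while the next line is being read. */
unsigned cycles_per_line_no_prefetch(unsigned instr_bits, unsigned wait_states)
{
    return instructions_per_line(instr_bits) + wait_states;
}

/* Toy model, ideal prefetch: the next line is read while the current
   one executes, so the slower of the two activities sets the pace. */
unsigned cycles_per_line_prefetch(unsigned instr_bits, unsigned wait_states)
{
    unsigned exec_cycles = instructions_per_line(instr_bits);
    unsigned read_cycles = wait_states + 1u;
    return exec_cycles > read_cycles ? exec_cycles : read_cycles;
}
```

Under these assumptions, sequential 32-bit code at 5 wait states costs 9 cycles per line without prefetch and 6 with it, so even an ideal prefetcher would not fully hide the wait states in that case. Branchy interpreter code would likely benefit less than this toy model suggests.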
In reply to this post by BogdanM
In RM0090, the 1315-page reference manual, I found only this text related to one of the main marketing gadgets, the ART (p. 57):

...Thanks to the ART Accelerator™, the CPU can operate up to 168 MHz frequency without wait states, thereby increasing the overall system speed and efficiency (see Table 3). To release the processor 210 DMIPS performance at this frequency, the accelerator implements an instruction prefetch queue and branch cache, which enables program execution from Flash memory at up to 168 MHz without wait states.
-----------------

If enabling prefetch does not somehow influence the systick timer speed, then the 13% speedup is real. It may be even bigger, as the benchmark I use is only a simple sieve.

p.
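Since "13%" can mean either run time saved or throughput gained, here is a small sketch that computes both from the elapsed times posted at the start of the thread. The two function names are hypothetical; the formulas are just the usual definitions.

```c
#include <assert.h>

/* Share of run time saved: (before - after) / before, in percent. */
double time_reduction_percent(double before_s, double after_s)
{
    assert(before_s > 0.0);
    return 100.0 * (before_s - after_s) / before_s;
}

/* Throughput gain: before / after - 1, in percent. */
double speedup_percent(double before_s, double after_s)
{
    assert(after_s > 0.0);
    return 100.0 * (before_s / after_s - 1.0);
}
```

For 6.396281 s without prefetch versus 5.618634 s with it, this gives roughly 12.2% less run time, or a speedup of about 13.8%, so the "13%" quoted in the thread sits between the two definitions.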
In reply to this post by jbsnyder
fyi - the st-link utility I've just connected to the disco board shows the following in the Device Information:

Device: STM32F4xx
Device ID: 0x411
Revision ID: Rev B
Flash size: Uknown
2011/11/28 pito <[hidden email]>:
> fyi - the st-link utility I've just connected to the disco board
> shows in the Device Information following:
> Device: STM32F4xx
> Device ID: 0x411
> Revision ID: Rev B
> Flash size: Uknown

I see the same, although from the packaging it looks like it's a RevA. It does appear to report the incorrect Device ID (0x411) described in the errata for Rev A (which says it matches the Dev ID of an STM32F2); it should be 0x413 instead. I'm not sure whether the Rev B is accurate or not.

The relevant rev/dev ID from mine is:

0x20006411 @ 0xE0042000
Rev ID: 0x2000
Dev ID: 0x411
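As a sketch of how the two fields above fall out of the raw word read at 0xE0042000, assuming the DBGMCU_IDCODE layout from ST's reference manual (DEV_ID in bits 11:0, REV_ID in bits 31:16). The helper names are hypothetical, and the raw value is hard-coded so that this runs on a host rather than reading the register.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed DBGMCU_IDCODE layout: DEV_ID = bits 11:0, REV_ID = bits 31:16. */
uint16_t idcode_dev_id(uint32_t idcode)
{
    return (uint16_t)(idcode & 0x0FFFu);
}

uint16_t idcode_rev_id(uint32_t idcode)
{
    return (uint16_t)(idcode >> 16);
}

/* Self-check against the value reported in the message above. */
void idcode_selfcheck(void)
{
    const uint32_t raw = 0x20006411u; /* as read at 0xE0042000 */
    assert(idcode_dev_id(raw) == 0x411u);
    assert(idcode_rev_id(raw) == 0x2000u);
}
```

Note that with this layout the 0x6 nibble of 0x6411 sits in bits 15:12, outside both fields, which is why the Dev ID reads as 0x411 rather than 0x6411.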
2011/11/28 James Snyder <[hidden email]>:

From what I could gather, if turning on the prefetch actually makes a difference, it must be a Rev B. This kind of confusion is pretty normal for new chips.
Best, Bogdan
-- James Snyder Biomedical Engineering Northwestern University ph: (847) 448-0386
I could imagine them wanting to fix the problem ASAP, and perhaps rushing things out, since it's one of their big selling points for the platform.

I can also confirm the ~13% performance difference on my own hardware, using a lightly modified sieve.lua (I think from the programming language shootout), changed to include timing and to only run up to 256, 1000x.
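For readers without the script at hand, the shape of this benchmark can be approximated in C. This is only a sketch of a sieve of Eratosthenes that counts primes up to a limit, not the actual sieve.lua; the name count_primes and the SIEVE_MAX bound are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define SIEVE_MAX 8192  /* hypothetical upper bound for this sketch */

/* Count the primes in [2, limit] with a sieve of Eratosthenes.
 * Returns 0 if limit is below 2 or above SIEVE_MAX. */
unsigned count_primes(unsigned limit)
{
    static bool composite[SIEVE_MAX + 1];
    unsigned count = 0;

    if (limit < 2 || limit > SIEVE_MAX)
        return 0;

    /* Reset the flags so the function can be called repeatedly (e.g. 1000x). */
    for (size_t i = 0; i <= limit; i++)
        composite[i] = false;

    for (unsigned i = 2; i <= limit; i++) {
        if (composite[i])
            continue;
        count++;
        /* Mark every multiple of the prime i as composite. */
        for (unsigned k = 2 * i; k <= limit; k += i)
            composite[k] = true;
    }
    return count;
}
```

Calling count_primes(256) in a loop of 1000 iterations mirrors the test described above; with a limit of 256 the sieve finds 54 primes.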
> Also I can confirm the ~13% performance difference
> on my own hardware,

I am still not sure the prefetch cache works at full speed, as the 13% we measured is not much (though it could be masked by some processes within eLua). When we turned prefetch on for pic32mx (retrobsd), filesystem performance went up by ~30%, even though most of the time there is spent waiting on the SD card's responses.

p.
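How large the "~13%" looks depends on whether it is expressed as a drop in run time or as a gain in throughput. A minimal sketch, meant to be fed the timings from the first message of the thread (6.396281 s without prefetch, 5.618634 s with); both function names are made up for this example.

```c
/* Percentage by which the run time shrinks going from t_before to t_after. */
double time_reduction_pct(double t_before, double t_after)
{
    return (t_before - t_after) / t_before * 100.0;
}

/* Percentage by which throughput rises for the same work done in less time. */
double speedup_pct(double t_before, double t_after)
{
    return (t_before / t_after - 1.0) * 100.0;
}
```

With those two timings the run time drops by about 12.2%, while throughput rises by about 13.8%, so both readings land near the quoted figure.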
2011/11/29 pito <[hidden email]>:
>> Also I can confirm the ~13% performance difference
>> on my own hardware,
>
> I am still not sure the prefetch cache works full speed as the 13%
> we measured is not much, indeed (but it could be masked by some
> processes within eLua however). When we did "prefetch on" on pic32mx
> (retrobsd) the filesystem performance went up by ~30% (where the
> most time we wait on sdcard's responses..)

I think it would be helpful to try one or more of the following:

1) Try this on one of ST's other MCUs that has known working prefetch and toggle it.

2) Simplify the test case to sequential code that definitely doesn't branch and would always pull instructions from flash. I haven't evaluated what's going on at the C level in this code to know how effective prefetch _should_ be.

3) Lower the clock to one where you can run with zero wait states, and compare performance between prefetch enabled with wait states and zero wait states. The performance gain seen in the earlier test, where the wait state setting was reduced, suggests that this would yield a significant difference.

Also note that this thing has an instruction cache and a data cache, which may or may not be helping significantly depending on the code and how many hits and misses we get. I haven't traced through what Lua would be doing at the hardware level in the inner loop of the code example, but a quick test yielded this:

Count: 54, 1000x 5.656045 sec -- prefetch & caches on
Count: 54, 1000x 7.241642 sec -- prefetch on, caches off
Count: 54, 1000x 8.561787 sec -- prefetch off, caches off

There's about a 15% time reduction between w/o and w/ prefetch, and about 21% for caches being enabled.

None of this seems to match the marketing speak that says prefetch should make it behave like zero wait state, but it's difficult to know whether what we see is related to a hardware implementation deficiency or to the nature of the test case.
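The three configurations in the quick test differ only in which bits of FLASH->ACR are set. The sketch below builds the register value on a host PC instead of writing to hardware. The bit masks are assumed to match the STM32F4 CMSIS header and should be checked against stm32f4xx.h and the reference manual; the local macro names and flash_acr_value are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit masks assumed from the STM32F4 CMSIS header (stm32f4xx.h),
 * redefined under local names so the sketch compiles off-target. */
#define ACR_LATENCY_MASK 0x00000007u  /* wait states, bits [2:0] */
#define ACR_PRFTEN       0x00000100u  /* prefetch enable */
#define ACR_ICEN         0x00000200u  /* instruction cache enable */
#define ACR_DCEN         0x00000400u  /* data cache enable */

/* Build a FLASH->ACR value for a given prefetch/cache/wait-state choice. */
uint32_t flash_acr_value(bool prefetch, bool caches, uint32_t wait_states)
{
    uint32_t acr = wait_states & ACR_LATENCY_MASK;

    if (prefetch)
        acr |= ACR_PRFTEN;
    if (caches)
        acr |= ACR_ICEN | ACR_DCEN;
    return acr;
}
```

On the target, the result would be assigned to FLASH->ACR, as in the line from the first message of the thread; flash_acr_value(true, true, 5) corresponds to FLASH_ACR_PRFTEN | FLASH_ACR_ICEN | FLASH_ACR_DCEN | FLASH_ACR_LATENCY_5WS under the assumed masks.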
For the analysis:

This is from the chibios rtos benchmark I ran on the stm32VL discoboard, stm32f100 overclocked to 56MHz, 0 (zero) ws by default:

----------------------------------------------------------------------------
--- Test Case 11.8 (Benchmark, round robin context switching)
--- Score : 429800 ctxswc/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.9 (Benchmark, I/O Queues throughput)
--- Score : 501116 bytes/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.10 (Benchmark, virtual timers set/reset)
--- Score : 689800 timers/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.11 (Benchmark, semaphores wait/signal)
--- Score : 1106436 wait+signal/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.12 (Benchmark, mutexes lock/unlock)
--- Score : 642236 lock+unlock/S
--- Result: SUCCESS

This is from the chibios rtos test I ran on the stm32f4discoboard with default chibios settings (most probably 168MHz, 5ws, prefetch/caches on):

----------------------------------------------------------------------------
--- Test Case 11.8 (Benchmark, round robin context switching)
--- Score : 1286180 ctxswc/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.9 (Benchmark, I/O Queues throughput)
--- Score : 1669036 bytes/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.10 (Benchmark, virtual timers set/reset)
--- Score : 2022172 timers/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.11 (Benchmark, semaphores wait/signal)
--- Score : 3010608 wait+signal/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.12 (Benchmark, mutexes lock/unlock)
--- Score : 1799912 lock+unlock/S
--- Result: SUCCESS

This is what Giovanni has run on his stm32f4discoboard (a different compiler):

----------------------------------------------------------------------------
--- Test Case 11.8 (Benchmark, round robin context switching)
--- Score : 1367420 ctxswc/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.9 (Benchmark, I/O Queues throughput)
--- Score : 1844568 bytes/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.10 (Benchmark, virtual timers set/reset)
--- Score : 2151998 timers/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.11 (Benchmark, semaphores wait/signal)
--- Score : 2685712 wait+signal/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.12 (Benchmark, mutexes lock/unlock)
--- Score : 1886020 lock+unlock/S
--- Result: SUCCESS

Comparing the stm32f100 (@56MHz, 0ws) against the stm32f4 (168MHz, 5ws, ART), the results seem to be ~1:3, which is the same as the clock ratio (56/168), as if both were running at 0ws. You may want to double check.

p.
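One way to double check the ~1:3 observation is to normalize each score by its core clock. A small sketch under the clock figures given in the message (56 MHz for the f100, and 168 MHz for the f4, which the message itself calls only "most probably"); the helper names are invented for this example.

```c
/* Benchmark score divided by the core clock in MHz. */
double score_per_mhz(double score, double clock_mhz)
{
    return score / clock_mhz;
}

/* Ratio of two per-MHz scores. A value near 1.0 means part A does as much
 * work per clock cycle as part B, i.e. it loses little to flash wait states. */
double per_mhz_ratio(double score_a, double mhz_a, double score_b, double mhz_b)
{
    return score_per_mhz(score_a, mhz_a) / score_per_mhz(score_b, mhz_b);
}
```

For the context switching test, per_mhz_ratio(1286180, 168, 429800, 56) comes out just under 1.0, which is what one would expect if the f4 at 5ws with ART performs per clock like the f100 at 0ws on this workload.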
Scores (test cases 11.8 to 11.12):

f100      f4_p      f4_g
429800    1286180   1367420
501116    1669036   1844568
689800    2022172   2151998
1106436   3010608   2685712
642236    1799912   1886020

Ratios to f100 (f4_p, f4_g):

2.993   3.182
3.331   3.681
2.932   3.120
2.721   2.427
2.803   2.937
The average is 1 : 3.012.

I see new benchmarks for the f4 on Giovanni's page:
http://www.chibios.org/dokuwiki/doku.php?id=chibios:metrics
See "Latest Test Reports". You can compare with other MCUs there as well.

p.

> 2.993 3.182
> 3.331 3.681
> 2.932 3.120
> 2.721 2.427
> 2.803 2.937
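The average can be re-derived from the raw scores posted earlier in the thread. A minimal sketch: the arrays hold the five scores per board in test-case order (11.8 to 11.12), and the function names are assumptions for this example.

```c
#include <stddef.h>

#define N_TESTS 5

/* Scores for test cases 11.8 .. 11.12, copied from the posted results. */
static const double f100[N_TESTS] = { 429800, 501116, 689800, 1106436, 642236 };
static const double f4_p[N_TESTS] = { 1286180, 1669036, 2022172, 3010608, 1799912 };
static const double f4_g[N_TESTS] = { 1367420, 1844568, 2151998, 2685712, 1886020 };

/* Mean of score[i] / base[i] over n benchmark rows. */
double mean_ratio(const double *score, const double *base, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += score[i] / base[i];
    return n ? sum / (double)n : 0.0;
}

/* Mean over all ten ratios (both f4 runs against the f100). */
double overall_mean_ratio(void)
{
    return (mean_ratio(f4_p, f100, N_TESTS) +
            mean_ratio(f4_g, f100, N_TESTS)) / 2.0;
}
```

Averaging the two columns separately gives roughly 2.96 for f4_p and 3.07 for f4_g, and overall_mean_ratio() reproduces the 1 : 3.012 figure to within rounding.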