STM32f4DSCY - MCU settings - flash prefetch on

Pito Pito

STM32f4DSCY - MCU settings - flash prefetch on

Hi,
It seems to me that flash prefetch is currently not ON (in
system_stm32f4xx.c, line 396). The line should be:

FLASH->ACR = FLASH_ACR_PRFTEN | FLASH_ACR_ICEN | FLASH_ACR_DCEN |
             FLASH_ACR_LATENCY_5WS;
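
For reference, a minimal host-side sketch of how that register value is composed. The bit positions follow the FLASH_ACR definitions in ST's stm32f4xx.h header; the helper name flash_acr_value and the ACR_* macro names are made up for illustration and are not part of eLua or the ST library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* FLASH_ACR bit layout (same values as the FLASH_ACR_* macros in stm32f4xx.h). */
#define ACR_LATENCY_MASK 0x00000007u /* LATENCY[2:0]: number of wait states */
#define ACR_PRFTEN       0x00000100u /* bit 8: prefetch enable */
#define ACR_ICEN         0x00000200u /* bit 9: instruction cache enable */
#define ACR_DCEN         0x00000400u /* bit 10: data cache enable */

/* Hypothetical helper: build the value to write to FLASH->ACR. */
static uint32_t flash_acr_value(unsigned wait_states, bool prefetch)
{
    assert(wait_states <= ACR_LATENCY_MASK); /* only 3 latency bits exist */
    uint32_t acr = ACR_ICEN | ACR_DCEN | (uint32_t)wait_states;
    if (prefetch)
        acr |= ACR_PRFTEN;
    return acr;
}
```

On the target, the result would be assigned to FLASH->ACR during clock setup; with 5 wait states the only difference between the two variants is bit 8.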

A benchmark I ran shows:

1. without FLASH_ACR_PRFTEN set:
..1000x Iterations elapsed: 6.396281 secs

2. with FLASH_ACR_PRFTEN:
..1000x Iterations elapsed: 5.618634 secs

3. with 4 wait states (don't try this at home; at your own risk, you
may brick your board!):
..1000x Iterations elapsed: 5.258016 secs
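
To put the three timings in relative terms, here is a small sketch that converts a baseline and a measured time into a speedup percentage. The function name speedup_percent is hypothetical; only the timings come from the runs above.

```c
#include <assert.h>

/* Hypothetical helper: how much faster (in percent) a run is than the baseline. */
static double speedup_percent(double baseline_s, double measured_s)
{
    assert(baseline_s > 0.0 && measured_s > 0.0);
    return (baseline_s / measured_s - 1.0) * 100.0;
}
```

Fed with the figures above (baseline 6.396281 s), this gives about 13.8% for the prefetch run (5.618634 s) and about 21.6% for the 4 wait state run (5.258016 s).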

P.





_______________________________________________
eLua-dev mailing list
[hidden email]
https://lists.berlios.de/mailman/listinfo/elua-dev
Pito Pito

Re: STM32f4DSCY - MCU settings - flash prefetch on

..it would be nice to have an option for compiling the eLua interpreter
into RAM (i.e. into the CCM on the STM32F4). Flash seems to be quite slow :)..
p.

jbsnyder jbsnyder

Re: STM32f4DSCY - MCU settings - flash prefetch on

2011/11/21 pito <[hidden email]>:
> ..it would be a nice option to have for compiling eLua interpreter
> into ram (ie in CCM for stm32f4). Flash seems to be quite slow :)..

It would certainly be possible to force some more things into RAM.  One
easy thing you can do is partially or fully disable LTR:
http://www.eluaproject.net/doc/v0.8/en_arch_ltr.html

However, you could get some of the performance difference you'd see
from this by using "local" variables selectively within the program to
reduce flash lookups.

Also note that the CCM is only for data; the MCU won't execute any
code from that RAM.  You could certainly use it for storing data
structures and bytecode though.
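
As a rough sketch of the data-only use: on the STM32F405/407 the CCM data RAM is 64 KB starting at 0x10000000, and with GCC a buffer can be placed there through a section attribute. The section name ".ccmram" is an assumption that depends on the linker script, and bytecode_buf, is_ccm_address and bytecode_at are hypothetical names, not eLua symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CCM_BASE 0x10000000u     /* start of CCM data RAM on STM32F405/407 */
#define CCM_SIZE (64u * 1024u)   /* 64 KB */

/* Assumption: the linker script defines an output section named ".ccmram".
   On a non-ARM host build the attribute is dropped so the sketch still compiles. */
#if defined(__arm__) && defined(__GNUC__)
#define CCM_DATA __attribute__((section(".ccmram")))
#else
#define CCM_DATA
#endif

/* Hypothetical buffer for data or bytecode; never for executable code. */
static uint8_t bytecode_buf[4096] CCM_DATA;

/* True if addr falls inside the CCM window. */
static bool is_ccm_address(uint32_t addr)
{
    return addr >= CCM_BASE && addr - CCM_BASE < CCM_SIZE;
}

/* Bounds-checked access into the buffer. */
static uint8_t *bytecode_at(uint32_t offset)
{
    assert(offset < sizeof bytecode_buf);
    return &bytecode_buf[offset];
}
```

Note that variables placed this way are typically not zero-initialised by the default startup code unless the startup routine is extended to cover the extra section.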

On a related note: I've turned on prefetch by default in the branch,
removed a duplicate call to the clock setup code, and double-checked
the PLL setup, which I'm now pretty sure is correct after getting a
bit confused by the example code that ST ships.  I'm not completely
sure why they don't enable prefetch by default.  There is a related
erratum, but I'm not sure exactly what effect it has, since there
seems to be a definite performance benefit to turning prefetch on, and
they enable it by default on STM32F2xx parts.  The related erratum is:
"The ART Accelerator prefetch queue instruction is not supported.
This limitation does not prevent the ART Accelerator from using the
cache enable/disable capability and the selection of the number of
wait states according to the system frequency"

BogdanM BogdanM

Re: STM32f4DSCY - MCU settings - flash prefetch on




I don't understand. So it doesn't exist, but turning it on makes a difference? :)

Best,
Bogdan
 

jbsnyder jbsnyder

Re: STM32f4DSCY - MCU settings - flash prefetch on

Well, I'm not completely sure.  I haven't tried replicating pito's
results, but I think I'll run some benchmarks and see how much time
variance I get.  One might figure that those timing differences could
be chalked up to using a low-frequency system timer, but the
difference (about 0.78 s between the first two runs) spans many ticks
even at the current 16 Hz rate that the system timer is set for.
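
The timer-resolution argument can be checked with a few lines of arithmetic. This is only a sketch: the helper name ticks_elapsed is made up, and the 16 Hz rate and the two timings are taken from this thread.

```c
#include <assert.h>

/* Hypothetical helper: how many timer ticks fit into a time span. */
static double ticks_elapsed(double seconds, double tick_hz)
{
    assert(tick_hz > 0.0);
    return seconds * tick_hz;
}
```

For the first two runs, ticks_elapsed(6.396281 - 5.618634, 16.0) comes out at roughly 12.4 ticks, while quantisation of a 16 Hz timer accounts for at most about one tick (62.5 ms) at each end of a measurement.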

The fact that they have greyed out the prefetch feature in their clock
configuration tool, and that the errata says the "prefetch
instruction" isn't supported on RevA parts, would seem to suggest that
the setting shouldn't do anything.  I'm not completely clear whether
there's a difference between a "prefetch queue instruction" and simply
enabling prefetch as a feature in the flash access control register.

Without further detail, it might be in our best interests to disable
it for RevA parts, even though there's no guidance on whether any
unexpected results might occur from attempting to enable it.  The
errata just says that the prefetch instruction is unsupported, that
there's no workaround, and that it will be fixed in the next silicon
rev.  If they were referring to this issue, it would have been a bit
clearer had they said that "instruction prefetch" was unsupported.

The description of "instruction prefetch" is basically what you would
expect of such a feature in the "flash memory interface" manual
(http://www.st.com/internet/com/TECHNICAL_RESOURCES/TECHNICAL_LITERATURE/PROGRAMMING_MANUAL/DM00023388.pdf):
"Each Flash memory read operation provides 128 bits from either four
instructions of 32 bits or 8 instructions of 16 bits according to the
program launched. So, in case of sequential code, at least four CPU
cycles are needed to execute the previous read instruction line.
Prefetch on the I-Code bus can be used to read the next sequential
instruction line from the Flash memory while the current instruction
line is being requested by the CPU. Prefetch is enabled by setting the
PRFTEN bit in the FLASH_ACR register."
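
To make the quoted description concrete, here is a deliberately idealised back-of-envelope model of straight-line code. It assumes one cycle per instruction and a line fetch of (wait states + 1) cycles, and it ignores the instruction cache, branches and multi-cycle instructions, so it is not a description of how the ART actually behaves. All names are hypothetical.

```c
#include <assert.h>

#define FLASH_LINE_BITS 128u /* one flash read returns 128 bits */

/* 4 instructions per line for 32-bit encodings, 8 for 16-bit encodings. */
static unsigned instructions_per_line(unsigned insn_bits)
{
    assert(insn_bits == 16u || insn_bits == 32u);
    return FLASH_LINE_BITS / insn_bits;
}

/* Idealised cycles spent per 128-bit line of sequential code. */
static unsigned cycles_per_line(unsigned insn_bits, unsigned wait_states, int prefetch)
{
    unsigned exec = instructions_per_line(insn_bits); /* assumed 1 cycle each */
    unsigned fetch = wait_states + 1u;                /* assumed line fetch time */
    if (prefetch)
        return exec > fetch ? exec : fetch; /* next line is fetched during execution */
    return exec + fetch;                    /* fetch first, then execute */
}
```

Under these assumptions, 32-bit code at 5 wait states costs 10 cycles per line without prefetch and 6 with it. The measured gain is far smaller, which is expected because the caches already hide much of the flash latency.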

*shrug*

Pito Pito

Re: STM32f4DSCY - MCU settings - flash prefetch on

In reply to this post by BogdanM
In RM0090, the 1315-page reference manual, I found only this text
related to one of the main marketing features, the ART (p. 57):

...Thanks to the ART Accelerator™, the CPU can operate up to 168 MHz
frequency without wait states, thereby increasing the overall system
speed and efficiency (see Table 3).
To release the processor 210 DMIPS performance at this frequency,
the accelerator implements an instruction prefetch queue and branch
cache, which enables program execution from Flash memory at up to
168 MHz without wait states.
-----------------
If enabling prefetch does not somehow influence the SysTick timer
speed, then the 13% speedup is real. It could be even bigger
elsewhere, as the benchmark I use is only a simple sieve.
p.

Pito Pito

Re: STM32f4DSCY - MCU settings - flash prefetch on

In reply to this post by jbsnyder
FYI: the ST-Link utility I've just connected to the Discovery board
shows the following under Device Information:
Device:  STM32F4xx
Device ID:  0x411
Revision ID:  Rev B
Flash size:  Uknown

jbsnyder jbsnyder

Re: STM32f4DSCY - MCU settings - flash prefetch on

2011/11/28 pito <[hidden email]>:
> fyi - the st-link utility I've just connected to the disco board
> shows in the Device Information following:
> Device:  STM32F4xx
> Device ID:  0x411
> Revision ID:  Rev B
> Flash size:  Uknown

I see the same, although from the packaging it looks like it's a RevA,
and it does appear to have the incorrect Device ID (0x411) described
in the errata for Rev A (which says the part reports the same Dev ID
as an STM32F2); it should be 0x413 instead.

I'm not sure whether the Rev B is accurate or not.

The relevant rev/dev id from mine is:
0x20006411 @ 0xE0042000

Rev ID: 0x2000
Dev ID: 0x411
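
For anyone wanting to decode the same word in firmware, a minimal sketch of the split: in DBGMCU_IDCODE the device ID is in bits 11:0 and the revision ID in bits 31:16. The function names are hypothetical; on the target the input would be read from address 0xE0042000.

```c
#include <assert.h>
#include <stdint.h>

/* DEV_ID lives in bits 11:0 of DBGMCU_IDCODE. */
static uint16_t idcode_dev_id(uint32_t idcode)
{
    return (uint16_t)(idcode & 0x0FFFu);
}

/* REV_ID lives in bits 31:16 of DBGMCU_IDCODE. */
static uint16_t idcode_rev_id(uint32_t idcode)
{
    return (uint16_t)(idcode >> 16);
}

/* Checks the decoding against the word quoted above. */
static void idcode_selfcheck(void)
{
    const uint32_t idcode = 0x20006411u; /* value read at 0xE0042000 */
    assert(idcode_dev_id(idcode) == 0x411u);
    assert(idcode_rev_id(idcode) == 0x2000u);
}
```

On hardware the call would look like idcode_dev_id(*(volatile uint32_t *)0xE0042000).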

>
> ----- PŮVODNÍ ZPRÁVA -----
> Od: "James Snyder" <[hidden email]>
> Komu: "eLua Users and Development List (www.eluaproject.net)"
> <[hidden email]>
> Předmět: Re: [eLua-dev] STM32f4DSCY - MCU settings - flash prefetch
> on
> Datum: 28.11.2011 - 20:01:43
>
>> Well, I'm not completely sure.  I haven't tried
>> replicating pito's
>> results, but I think I'll try and see how much
>> time variance I get
>> when I run some benchmarks.  One might figure that
>> those timing
>> differences could be chalked up to using a low
>> frequency system timer,
>> but the difference there is many cycles at the
>> current 16 Hz rate that
>> the system timer is set for.
>>
>> The fact that they have greyed out the prefetch
>> feature in their clock
>> configuration tool, and they've said here that the
>> "prefetch
>> instruction" isn't supported on RevA parts would
>> seem to suggest that
>> that setting shouldn't do anything. I'm not
>> completely clear if
>> there's a difference between any meaning of a
>> "prefetch queue
>> instruction" and simply enabling prefetch as a
>> feature in the flash
>> access control register.
>>
>> Without further detail, it might be in our best
>> interests to disable
>> it for RevA parts even though there's no guidance
>> provided on whether
>> any unexpected results might occur from attempting
>> to enable it?  The
>> errata just says that the prefetch instruction is
>> unsupported, that
>> there's no workaround and it will be fixed in the
>> next silicon rev.
>> It would have been a bit clearer, if they were
>> referring to this
>> issue, if they had said that "instruction
>> prefetch" was unsupported.
>>
>> The description of "instruction prefetch" is
>> basically what you would
>> expect of such a feature in the "flash memory
>> interface" manual
>> (http://www.st.com/internet/com/TECHNICAL_RESOURCES/TECHNICAL_LITERATURE/PROGRAMMING_MANUAL/DM00023388.pdf):
>> > "Each Flash memory read operation provides 128
>> bits from either four
>> instructions of 32 bits
>> or 8 instructions of 16 bits according to the
>> program launched. So, in
>> case of sequential
>> code, at least four CPU cycles are needed to
>> execute the previous read
>> instruction line.
>> Prefetch on the I-Code bus can be used to read the
>> next sequential
>> instruction line from the
>> Flash memory while the current instruction line is
>> being requested by
>> the CPU. Prefetch is
>> enabled by setting the PRFTEN bit in the FLASH_ACR
>> register. "
>>
>> *shrug*
>>
>> 2011/11/28 Bogdan Marinescu <[hidden email]>:
>> > 2011/11/28 James Snyder <[hidden email]>
>> >> 2011/11/21 pito <[hidden email]>:
>> >> > ..it would be a nice option to have for compiling eLua interpreter
>> >> > into ram (ie in CCM for stm32f4). Flash seems to be quite slow :)..
>> >>
>> >> It would certainly be possible to force some more things into RAM.
>> >> One easy thing you can do is partially or fully disable LTR:
>> >> http://www.eluaproject.net/doc/v0.8/en_arch_ltr.html
>> >>
>> >> However, some of the performance difference you'd see from this you
>> >> could get by using "local" variables selectively within the program
>> >> to reduce flash lookups.
>> >>
>> >> Also note that the CCM is only for data; the MCU won't execute any
>> >> code in that RAM.  You could certainly use it for storing data
>> >> structures and bytecode though.
>> >>
>> >> On a related note: I've turned on prefetch by default in the branch,
>> >> removed a double call to the clock setup code that was being done,
>> >> and double-checked the PLL setup, which I'm now pretty sure is
>> >> correct after getting a bit confused by the example code that ST
>> >> ships.  I'm not completely sure why they don't enable prefetch by
>> >> default. There is a related erratum, but I'm not sure exactly what
>> >> effect it has, since it seems like there's definitely a performance
>> >> benefit to turning prefetch on, and they enable it by default on
>> >> STM32F2xx parts.  The related erratum is:
>> >> "The ART Accelerator prefetch queue instruction is not supported.
>> >> This limitation does not prevent the ART Accelerator from using the
>> >> cache enable/disable capability and the selection of the number of
>> >> wait states according to the system frequency"
>> >
>> > I don't understand. So it doesn't exist, but turning it on makes a
>> > difference ? :)
>> > Best,
>> > Bogdan
BogdanM

Re: STM32f4DSCY - MCU settings - flash prefetch on



2011/11/28 James Snyder <[hidden email]>
> 2011/11/28 pito <[hidden email]>:
> > fyi - the st-link utility I've just connected to the disco board
> > shows in the Device Information following:
> > Device:  STM32F4xx
> > Device ID:  0x411
> > Revision ID:  Rev B
> > Flash size:  Uknown
>
> I see the same, although from the packaging it looks like it's a RevA,
> and it does appear to have the incorrect Device ID (0x411) matching
> the errata for Rev A (which says it should match the Dev ID for a
> STM32F2), which should be 0x413 instead.
>
> I'm not sure whether the Rev B is accurate or not?

From what I could gather, if turning on the prefetch actually makes a difference, it must be a Rev B. This kind of confusion is pretty normal for new chips.

Best,
Bogdan

> The relevant rev/dev id from mine is:
> 0x20006411 @ 0xE0042000
>
> Rev ID: 0x2000
> Dev ID: 0x411

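The Rev ID and Dev ID quoted above are bit fields of that one 32-bit word. A minimal host-side sketch of the split, assuming the DBGMCU_IDCODE layout given in ST's reference manual (REV_ID in bits 31:16, DEV_ID in bits 11:0). The helper names are made up for illustration, and nothing here reads the real register:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Address quoted above; only meaningful when running on the MCU itself. */
#define DBGMCU_IDCODE_ADDR 0xE0042000u

/* Hypothetical helpers: split a raw IDCODE word into its two fields. */
uint16_t idcode_rev_id(uint32_t idcode) { return (uint16_t)(idcode >> 16); }
uint16_t idcode_dev_id(uint32_t idcode) { return (uint16_t)(idcode & 0x0FFFu); }

/* Print both fields in the same form as the lines quoted above. */
void idcode_print(uint32_t idcode)
{
    printf("Rev ID: 0x%04X\n", (unsigned)idcode_rev_id(idcode));
    printf("Dev ID: 0x%03X\n", (unsigned)idcode_dev_id(idcode));
}

/* Self-check against the value reported in this thread. */
void idcode_self_check(void)
{
    assert(idcode_rev_id(0x20006411u) == 0x2000u);
    assert(idcode_dev_id(0x20006411u) == 0x411u);
}
```

On the board itself the raw word would come from a volatile read of DBGMCU_IDCODE_ADDR instead of a constant.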
jbsnyder

Re: STM32f4DSCY - MCU settings - flash prefetch on



--
James Snyder
Biomedical Engineering
Northwestern University
ph: (847) 448-0386

On Nov 28, 2011, at 14:15, Bogdan Marinescu <[hidden email]> wrote:



> From what I could gather, if turning on the prefetch actually makes a difference, it must be a Rev B. This kind of confusion is pretty normal for new chips.

I could imagine them wanting to fix the problem ASAP, and perhaps rushing things out, since it's one of their big selling points on the platform. I can also confirm the ~13% performance difference on my own hardware, using a lightly modified sieve.lua (I think from the programming language shootout), changed to include timing and to only run up to 256, 1000x.
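For readers without the Lua sources, the shape of that workload can be sketched in C. This assumes the benchmark is a plain sieve of Eratosthenes over 2..256 repeated 1000 times; it is not the sieve.lua that was run, the function names are made up, and host timings will say nothing about the STM32. It only shows what is being counted and timed:

```c
#include <assert.h>
#include <string.h>
#include <time.h>

#define SIEVE_LIMIT 256

/* Count the primes in 2..SIEVE_LIMIT with a plain sieve of Eratosthenes. */
int sieve_count(void)
{
    unsigned char composite[SIEVE_LIMIT + 1];
    int count = 0;

    memset(composite, 0, sizeof composite);
    for (int i = 2; i <= SIEVE_LIMIT; i++) {
        if (composite[i])
            continue;               /* already crossed out */
        count++;                    /* i is prime */
        for (int k = i + i; k <= SIEVE_LIMIT; k += i)
            composite[k] = 1;       /* cross out multiples of i */
    }
    return count;
}

/* Run the sieve `iterations` times; returns elapsed CPU seconds and
 * stores the count from the last run in *last_count. */
double sieve_benchmark(int iterations, int *last_count)
{
    assert(iterations > 0 && last_count != NULL);
    clock_t start = clock();
    for (int n = 0; n < iterations; n++)
        *last_count = sieve_count();
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling sieve_benchmark(1000, &count) leaves count at 54, the number of primes up to 256.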


Pito

Re: STM32f4DSCY - MCU settings - flash prefetch on

> Also I can confirm the ~13% performance difference
> on my own hardware,

I am still not sure the prefetch cache works at full speed, as the 13%
we measured is not much (though it could be masked by other processing
within eLua). When we turned prefetch on for the pic32mx (retrobsd),
filesystem performance went up by ~30%, and that is a workload where
most of the time is spent waiting on the sdcard's responses.
p.


jbsnyder

Re: STM32f4DSCY - MCU settings - flash prefetch on

2011/11/29 pito <[hidden email]>:
>> Also I can confirm the ~13% performance difference
>> on my own hardware,
>
> I am still not sure the prefetch cache works at full speed, as the 13%
> we measured is not much (though it could be masked by other processing
> within eLua). When we turned prefetch on for the pic32mx (retrobsd),
> filesystem performance went up by ~30%, and that is a workload where
> most of the time is spent waiting on the sdcard's responses.

I think it would be helpful to try one or more of the following:
1) Try this on one of ST's other MCUs that has known working prefetch
and toggle it

2) Simplify the test case to sequential code that definitely doesn't
branch and would always pull instructions from flash.  I haven't
evaluated what's going on at the C level in this code to know how
effective prefetch _should_ be.

3) Lower the clock to one where you can run with zero wait states and
compare performance between prefetch enabled with wait states and zero
wait states. The earlier test, in which reducing the wait state setting
gave a performance gain, suggests that this would yield a significant
difference.
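To make option 3 concrete, here is a small sketch of how the wait-state count and the FLASH_ACR value relate to the clock. It assumes the 2.7 V to 3.6 V column of the wait-state table in ST's reference manual (one extra wait state per 30 MHz of HCLK, up to 168 MHz). The bit masks are copied in as plain constants so it builds on a PC, and the function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* FLASH_ACR bit masks, same values as in ST's stm32f4xx.h. */
#define ACR_PRFTEN 0x00000100u /* prefetch enable */
#define ACR_ICEN   0x00000200u /* instruction cache enable */
#define ACR_DCEN   0x00000400u /* data cache enable */

/* Minimum flash wait states for an HCLK given in MHz.
 * Assumption: 2.7-3.6 V supply, where each 30 MHz step adds one wait state. */
uint32_t flash_wait_states(uint32_t hclk_mhz)
{
    assert(hclk_mhz >= 1 && hclk_mhz <= 168);
    return (hclk_mhz - 1) / 30; /* 1..30 -> 0 WS, 31..60 -> 1 WS, ..., 151..168 -> 5 WS */
}

/* Build the FLASH_ACR word for a clock and a choice of prefetch/caches. */
uint32_t flash_acr_value(uint32_t hclk_mhz, int prefetch, int caches)
{
    uint32_t acr = flash_wait_states(hclk_mhz); /* LATENCY field, bits 2:0 */
    if (prefetch)
        acr |= ACR_PRFTEN;
    if (caches)
        acr |= ACR_ICEN | ACR_DCEN;
    return acr;
}
```

Under that assumption 168 MHz needs 5 wait states, so the 4ws run earlier in the thread is out of spec, and anything up to 30 MHz can run with zero wait states for the comparison.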

Also note that this thing has an instruction cache and data cache,
which may or may not be helping significantly depending on the code
and how many hits and misses we might get.  I haven't traced through
what Lua would be doing on the hardware level in the inner loop of the
code example, but a quick test yielded this:

Count: 54, 1000x 5.656045 sec -- prefetch & caches on
Count: 54, 1000x 7.241642 sec -- prefetch on, caches off
Count: 54, 1000x 8.561787 sec -- prefetch off, caches off

There's about a 15% time reduction between w/o and w/ prefetch (caches
off), and about a further 22% from enabling the caches.
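The percentages follow directly from the three timings. A short sketch of the arithmetic, with the timings copied from the test output above and hypothetical function names:

```c
#include <assert.h>

/* Timings from the quick test above, in seconds (Count: 54, 1000x). */
#define T_PREFETCH_AND_CACHES 5.656045
#define T_PREFETCH_ONLY       7.241642
#define T_ALL_OFF             8.561787

/* Percentage of run time saved when going from `before` to `after`. */
double time_reduction_pct(double before, double after)
{
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}

/* Saving from enabling prefetch alone (caches off in both runs). */
double prefetch_saving_pct(void)
{
    return time_reduction_pct(T_ALL_OFF, T_PREFETCH_ONLY);
}

/* Further saving from enabling the caches with prefetch already on. */
double cache_saving_pct(void)
{
    return time_reduction_pct(T_PREFETCH_ONLY, T_PREFETCH_AND_CACHES);
}
```

With these numbers prefetch_saving_pct() comes out at roughly 15.4 and cache_saving_pct() at roughly 21.9.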

None of this seems to match the marketing speak that says prefetch
should make it behave like zero wait state, but it's difficult to know
whether what we see is down to a hardware implementation deficiency or
to the nature of the test case.

Pito

Re: STM32f4DSCY - MCU settings - flash prefetch on

For the analysis:
This is from the chibios rtos benchmark I ran on stm32VL discoboard,
stm32f100 overclocked to 56MHz, 0(zero)ws by default:
----------------------------------------------------------------------------
--- Test Case 11.8 (Benchmark, round robin context switching)
--- Score : 429800 ctxswc/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.9 (Benchmark, I/O Queues throughput)
--- Score : 501116 bytes/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.10 (Benchmark, virtual timers set/reset)
--- Score : 689800 timers/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.11 (Benchmark, semaphores wait/signal)
--- Score : 1106436 wait+signal/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.12 (Benchmark, mutexes lock/unlock)
--- Score : 642236 lock+unlock/S
--- Result: SUCCESS

This is from chibios rtos test I ran on stm32f4discoboard default
chibios settings (settings most probably 168MHz, 5ws, pref/caches
on):
----------------------------------------------------------------------------
--- Test Case 11.8 (Benchmark, round robin context switching)
--- Score : 1286180 ctxswc/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.9 (Benchmark, I/O Queues throughput)
--- Score : 1669036 bytes/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.10 (Benchmark, virtual timers set/reset)
--- Score : 2022172 timers/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.11 (Benchmark, semaphores wait/signal)
--- Score : 3010608 wait+signal/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.12 (Benchmark, mutexes lock/unlock)
--- Score : 1799912 lock+unlock/S
--- Result: SUCCESS

This is what Giovanni has run on his stm32f4discoboard (a different
compiler):
----------------------------------------------------------------------------
--- Test Case 11.8 (Benchmark, round robin context switching)
--- Score : 1367420 ctxswc/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.9 (Benchmark, I/O Queues throughput)
--- Score : 1844568 bytes/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.10 (Benchmark, virtual timers set/reset)
--- Score : 2151998 timers/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.11 (Benchmark, semaphores wait/signal)
--- Score : 2685712 wait+signal/S
--- Result: SUCCESS
----------------------------------------------------------------------------
--- Test Case 11.12 (Benchmark, mutexes lock/unlock)
--- Score : 1886020 lock+unlock/S
--- Result: SUCCESS

Comparing the stm32f100 (@56MHz, 0ws) against the stm32f4 (168MHz, 5ws,
ART), the results are ~1:3, which matches the clock ratio (56/168), as if
both were running at 0ws.
You may double check..
p.

Pito

Re: STM32f4DSCY - MCU settings - flash prefetch on

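ChibiOS scores (f100 = stm32f100 @ 56MHz, f4_p = my stm32f4, f4_g =
Giovanni's stm32f4):

Test    f100      f4_p      f4_g
11.8    429800    1286180   1367420
11.9    501116    1669036   1844568
11.10   689800    2022172   2151998
11.11   1106436   3010608   2685712
11.12   642236    1799912   1886020

Ratio to f100:

Test    f4_p    f4_g
11.8    2.993   3.182
11.9    3.331   3.681
11.10   2.932   3.120
11.11   2.721   2.427
11.12   2.803   2.937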
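The ratio columns are each F4 score divided by the F100 score for the same test. A small sketch that recomputes them, assuming the scores above are copied correctly; the array and function names are made up for illustration:

```c
#include <assert.h>
#include <stdio.h>

#define N_TESTS 5

/* ChibiOS scores quoted in this thread, test cases 11.8 to 11.12. */
static const char *const test_id[N_TESTS] = {"11.8", "11.9", "11.10", "11.11", "11.12"};
static const double f100[N_TESTS] = {429800, 501116, 689800, 1106436, 642236};
static const double f4_p[N_TESTS] = {1286180, 1669036, 2022172, 3010608, 1799912};
static const double f4_g[N_TESTS] = {1367420, 1844568, 2151998, 2685712, 1886020};

/* Ratio of an F4 score to the F100 score for test index i (0..4). */
double ratio_to_f100(const double *f4_scores, int i)
{
    assert(i >= 0 && i < N_TESTS);
    return f4_scores[i] / f100[i];
}

/* Print both ratio columns, three decimals each. */
void print_ratios(void)
{
    for (int i = 0; i < N_TESTS; i++)
        printf("%-6s %.3f %.3f\n", test_id[i],
               ratio_to_f100(f4_p, i), ratio_to_f100(f4_g, i));
}
```

The clock ratio 168/56 is 3.0, so ratios close to 3 mean the F4 is scaling roughly with clock frequency even though it runs from flash with 5 wait states.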


----- PŮVODNÍ ZPRÁVA -----
Od: "pito" <[hidden email]>
Komu: [hidden email], [hidden email],
[hidden email]
Předmět: Re: [eLua-dev] STM32f4DSCY - MCU settings - flash prefetch
on
Datum: 29.11.2011 - 21:09:07

> For the analysis:
> This is from the ChibiOS RTOS benchmark I ran on the stm32VL discovery
> board, stm32f100 overclocked to 56MHz, 0 (zero) ws by default:
> ----------------------------------------------------------------------------
> --- Test Case 11.8 (Benchmark, round robin context switching)
> --- Score : 429800 ctxswc/S
> --- Result: SUCCESS
> ----------------------------------------------------------------------------
> --- Test Case 11.9 (Benchmark, I/O Queues throughput)
> --- Score : 501116 bytes/S
> --- Result: SUCCESS
> ----------------------------------------------------------------------------
> --- Test Case 11.10 (Benchmark, virtual timers set/reset)
> --- Score : 689800 timers/S
> --- Result: SUCCESS
> ----------------------------------------------------------------------------
> --- Test Case 11.11 (Benchmark, semaphores wait/signal)
> --- Score : 1106436 wait+signal/S
> --- Result: SUCCESS
> ----------------------------------------------------------------------------
> --- Test Case 11.12 (Benchmark, mutexes lock/unlock)
> --- Score : 642236 lock+unlock/S
> --- Result: SUCCESS
>
> This is from the ChibiOS RTOS test I ran on the stm32f4 discovery
> board with default ChibiOS settings (most probably 168MHz, 5ws,
> prefetch/caches on):
> ----------------------------------------------------------------------------
> --- Test Case 11.8 (Benchmark, round robin context switching)
> --- Score : 1286180 ctxswc/S
> --- Result: SUCCESS
> ----------------------------------------------------------------------------
> --- Test Case 11.9 (Benchmark, I/O Queues throughput)
> --- Score : 1669036 bytes/S
> --- Result: SUCCESS
> ----------------------------------------------------------------------------
> --- Test Case 11.10 (Benchmark, virtual timers set/reset)
> --- Score : 2022172 timers/S
> --- Result: SUCCESS
> ----------------------------------------------------------------------------
> --- Test Case 11.11 (Benchmark, semaphores wait/signal)
> --- Score : 3010608 wait+signal/S
> --- Result: SUCCESS
> ----------------------------------------------------------------------------
> --- Test Case 11.12 (Benchmark, mutexes lock/unlock)
> --- Score : 1799912 lock+unlock/S
> --- Result: SUCCESS
>
> This is what Giovanni has run on his stm32f4 discovery board (a
> different compiler):
> ----------------------------------------------------------------------------
> --- Test Case 11.8 (Benchmark, round robin context switching)
> --- Score : 1367420 ctxswc/S
> --- Result: SUCCESS
> ----------------------------------------------------------------------------
> --- Test Case 11.9 (Benchmark, I/O Queues throughput)
> --- Score : 1844568 bytes/S
> --- Result: SUCCESS
> ----------------------------------------------------------------------------
> --- Test Case 11.10 (Benchmark, virtual timers set/reset)
> --- Score : 2151998 timers/S
> --- Result: SUCCESS
> ----------------------------------------------------------------------------
> --- Test Case 11.11 (Benchmark, semaphores wait/signal)
> --- Score : 2685712 wait+signal/S
> --- Result: SUCCESS
> ----------------------------------------------------------------------------
> --- Test Case 11.12 (Benchmark, mutexes lock/unlock)
> --- Score : 1886020 lock+unlock/S
> --- Result: SUCCESS
>
> The stm32f100 (@56MHz, 0ws) against the stm32f4 (168MHz, 5ws, ART): it
> seems the results are ~1:3, the same as the clock ratio (56/168), as
> if both were running at 0ws.
> You may double check..
> p.
>
> ----- ORIGINAL MESSAGE -----
> From: "James Snyder" <[hidden email]>
> To: "pito" <[hidden email]>
> Subject: Re: [eLua-dev] STM32f4DSCY - MCU settings - flash prefetch on
> Date: 29.11.2011 - 20:32:55
> > 2011/11/29 pito <[hidden email]>:
> > >> Also I can confirm the ~13% performance difference
> > >> on my own hardware,
> > >
> > > I am still not sure the prefetch cache works at full speed, as the
> > > 13% we measured is not much, indeed (it could be masked by some
> > > processes within eLua, however). When we did "prefetch on" on
> > > pic32mx (retrobsd), the filesystem performance went up by ~30%
> > > (where most of the time we wait on the sdcard's responses..)
> >
> > I think it would be helpful to try one or more of the following:
> > 1) Try this on one of ST's other MCUs that has known working prefetch
> > and toggle it
> >
> > 2) Simplify the test case to sequential code that definitely doesn't
> > branch and would always pull instructions from flash.  I haven't
> > evaluated what's going on at the C level in this code to know how
> > effective prefetch _should_ be.
> >
> > 3) Lower the clock to one where you can run with zero wait states and
> > compare performance between prefetch enabled with wait states and zero
> > wait states. The fact that a previous test in which the wait state
> > setting was reduced resulted in a performance gain might suggest that
> > this would yield a significant difference.
> >
> > Also note that this thing has an instruction cache and data cache,
> > which may or may not be helping significantly depending on the code
> > and how many hits and misses we might get.  I haven't traced through
> > what Lua would be doing on the hardware level in the inner loop of the
> > code example, but a quick test yielded this:
> >
> > Count: 54, 1000x 5.656045 sec -- prefetch & caches on
> > Count: 54, 1000x 7.241642 sec -- prefetch on, caches off
> > Count: 54, 1000x 8.561787 sec -- prefetch off, caches off
> >
> > There's about a 15% time reduction between w/o and w/ prefetch, and
> > about 21% for caches being enabled.
> >
> > None of this seems to meet with the marketing speak that says that
> > prefetch should make it behave like zero wait state, but it's
> > difficult to know if what we see is related to a hardware
> > implementation deficiency or the nature of the test case.
> >


Pito Pito

Re: STM32f4DSCY - MCU settings - flash prefetch on


The average ratio is 1 : 3.012.
I see new benchmarks for the F4 on Giovanni's page:
http://www.chibios.org/dokuwiki/doku.php?id=chibios:metrics
(see "Latest Test Reports").
You may compare with other MCUs there as well.
p.

> 2.993 3.182
> 3.331 3.681
> 2.932 3.120
> 2.721 2.427
> 2.803 2.937
>
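
As a rough cross-check of the 1 : 3.012 average, the sketch below recomputes it in Python from the raw ChibiOS scores posted earlier in the thread. It assumes the average was taken over all ten unrounded ratios (five tests on each of the two F4 boards); that assumption and the names `ROWS`, `ratios_p` and `ratios_g` are mine, not something stated in the posts.

```python
from statistics import mean

# (f100, f4_p, f4_g) scores per ChibiOS test case, copied from this thread.
ROWS = [
    (429800, 1286180, 1367420),   # 11.8  context switching
    (501116, 1669036, 1844568),   # 11.9  I/O queues throughput
    (689800, 2022172, 2151998),   # 11.10 virtual timers set/reset
    (1106436, 3010608, 2685712),  # 11.11 semaphores wait/signal
    (642236, 1799912, 1886020),   # 11.12 mutexes lock/unlock
]

# Unrounded ratio of each F4 score to the F100 score for the same test.
ratios_p = [p / base for base, p, _ in ROWS]
ratios_g = [g / base for base, _, g in ROWS]

# Assumption: the quoted average covers both F4 boards together.
overall = mean(ratios_p + ratios_g)

print(f"f4_p mean: {mean(ratios_p):.3f}")
print(f"f4_g mean: {mean(ratios_g):.3f}")
print(f"overall:   {overall:.3f}")
```

Averaging the already rounded three-decimal ratios instead gives 3.013, so the last digit depends on where the rounding happens.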


