Hello everybody,
I've written a Lua interpreter for a 16-bit system, and I found a big optimisation for reading opcodes.

I looked with obj2asm at the code that compilers generate to read the opcodes (I tested three: Digital Mars, Borland C++, Open Watcom), and it wasn't optimised at all for a 16-bit system.

With my code, the opcodes are read more than 10 times faster on a 16-bit system (unless your compilers are better than mine).
My Lua interpreter runs 2 times faster!

To get optimised code output, in lopcodes.h:

struct f16
{
  // Reading the opcode as one long means two 16-bit registers emulating
  // one 32-bit value. That is not optimised at all, because the shift
  // uses a loop that moves one bit at a time.
  unsigned int word1;
  // Dividing the opcode into two 16-bit registers needs only one shift
  // per register (no loops).
  unsigned int word2;
};

#define GETARG_A(i) (cast(int, ((((struct f16 *)&i)-> word1)>>POS_A) & MASK1(SIZE_A,0)))

#define GETARG_B(i) (cast(int, ((((struct f16 *)&i)-> word2)>>(POS_B-16) & MASK1(SIZE_B,0))))

#define GETARG_C(i) (cast(int, ( ( (((struct f16 *)&i)-> word2)<<(16-POS_C)) | (( ((struct f16 *)&i)-> word1)>>(POS_C)) ) & MASK1(SIZE_C,0)))

#define GETARG_Bx(i) (cast(int, ( ( (((struct f16 *)&i)-> word2)<<(16-POS_C)) | (( ((struct f16 *)&i)-> word1)>>(POS_C)) ) & MASK1(SIZE_Bx,0)))

#define GETARG_sBx(i) (GETARG_Bx(i)-MAXARG_sBx)

I posted this code to let you increase the performance of eLua on 16-bit systems.

If you use the code, I would like to be informed.

vebveb

_______________________________________________
eLua-dev mailing list
[hidden email]
https://lists.berlios.de/mailman/listinfo/elua-dev
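For readers who want to check the split-word idea on a desktop machine, here is a minimal sketch. It assumes the Lua 5.1 field layout (A at bits 6..13, C at bits 14..22, B at bits 23..31) and that word1 is the low half of the instruction, as on a little-endian 16-bit x86 target; neither is stated in the post. The names Halves, split, ref_field, MASK16, join_at_pos_c and arg_* are hypothetical, and the halves are produced arithmetically instead of through the pointer cast so that the code behaves the same on any host.

```c
#include <stdint.h>

/* Assumed Lua 5.1 layout; not stated in the post. */
enum { SIZE_A = 8, SIZE_B = 9, SIZE_C = 9, SIZE_Bx = 18,
       POS_A = 6, POS_C = 14, POS_B = 23 };

/* Hypothetical stand-in for struct f16 with explicit 16-bit halves. */
typedef struct {
    uint16_t lo; /* plays the role of word1: bits 0..15  */
    uint16_t hi; /* plays the role of word2: bits 16..31 */
} Halves;

/* Split arithmetically so the sketch does not depend on host byte order. */
Halves split(uint32_t i) {
    Halves h;
    h.lo = (uint16_t)(i & 0xFFFFu);
    h.hi = (uint16_t)(i >> 16);
    return h;
}

/* Reference: one 32-bit shift and mask, as in the stock macros. */
uint32_t ref_field(uint32_t i, int pos, int size) {
    return (i >> pos) & ((UINT32_C(1) << size) - 1u);
}

/* Low n bits set, for n < 16. */
#define MASK16(n) ((uint16_t)((1u << (n)) - 1u))

/* Join both halves for a field starting at POS_C. The result is truncated
   to 16 bits, as it would be with a 16-bit unsigned int. */
uint16_t join_at_pos_c(Halves h) {
    return (uint16_t)((h.hi << (16 - POS_C)) | (h.lo >> POS_C));
}

/* A lies entirely in the low half: one 16-bit shift. */
uint16_t arg_a(Halves h) {
    return (uint16_t)((h.lo >> POS_A) & MASK16(SIZE_A));
}

/* B lies entirely in the high half: one 16-bit shift. */
uint16_t arg_b(Halves h) {
    return (uint16_t)((h.hi >> (POS_B - 16)) & MASK16(SIZE_B));
}

/* C straddles both halves: one shift per half, then a mask. */
uint16_t arg_c(Halves h) {
    return (uint16_t)(join_at_pos_c(h) & MASK16(SIZE_C));
}

/* Bx is 18 bits wide, so a 16-bit join yields only its low 16 bits. */
uint16_t arg_bx_low16(Halves h) {
    return join_at_pos_c(h);
}
```

Comparing each arg_* result with ref_field for a few instruction words is a cheap sanity check before patching lopcodes.h. Under these assumptions the Bx join can only return the low 16 bits of the field, which is worth keeping in mind if larger Bx values matter on the target.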
Bittencourt |
Hi vebveb!
Can you please explain this "cast" function that you are using?

--Pedro Bittencourt

On Mon, Aug 23, 2010 at 7:38 AM, veb veb <[hidden email]> wrote:
I'm guessing it's just this:
#define cast(type, arg) ((type)(arg))

Thanks for this. I'll try to run your patch on some of our Thumb(2) ports and see how it goes. Unfortunately we don't have a benchmark system in place right now, but we can improvise something.

Best,
Bogdan

On Mon, Aug 23, 2010 at 2:50 PM, Pedro Bittencourt <[hidden email]> wrote:
In reply to this post by Bittencourt
Hello,
To explain how an opcode is read, I will use an example.

On a 16-bit system:

Normal GETARG_A:
#define GETARG_A(i) (cast(int, ((i)>>POS_A) & MASK1(SIZE_A,0)))
-> #define GETARG_A(i) ((int) (((i)>>8) & 0xFF))

So, if we have (in binary):
11100011 11011011 10011001 10001000
it is shifted bit by bit to (this part is slow, because it uses a loop that shifts one bit at a time, 8*2 shifts here):
00000000 11100011 11011011 10011001   ((i)>>8)
and it becomes:
00000000 00000000 00000000 10011001   (& 0xFF)
and it is cast to an int (16 bits):
00000000 10011001

Optimised GETARG_A:
#define GETARG_A(i) (cast(int, ((((struct f16 *)&i)-> word1)>>POS_A) & MASK1(SIZE_A,0)))
-> #define GETARG_A(i) ((int) (((register_right)>>8) & 0xFF))

So, if we have (in binary):
11100011 11011011 10011001 10001000
it is shifted in a single operation (register_right is 16 bits):
00000000 10011001   ((register_right)>>8)
and it becomes:
00000000 10011001   (& 0xFF) (we don't need it for GETARG_A)
and it is cast to an int if it wasn't one already (16 bits):
00000000 10011001   (we don't need it for GETARG_A)

Instead of shifting everything slowly, we shift only what we need, in one step.
The other macros (GETARG_B, ...) are fast in the same way, but a bit more complex because they use both register_left and register_right.

From: [hidden email]
Date: Mon, 23 Aug 2010 08:50:35 -0300
To: [hidden email]
Subject: Re: [eLua-dev] optimisation for 16 bits system
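The GETARG_A walk-through can be reproduced on any machine with the sketch below. It uses the same instruction word (0xE3DB9988 in hexadecimal) and the shift of 8 taken from the example. The function names getarg_a_wide and getarg_a_narrow are hypothetical, and uint16_t stands in for the 16-bit register_right.

```c
#include <stdint.h>

/* Stock-style extraction: shift the whole 32-bit instruction, then mask. */
unsigned getarg_a_wide(uint32_t i) {
    return (unsigned)((i >> 8) & 0xFFu);
}

/* Split-style extraction: only the low 16-bit half is touched.
   After shifting a 16-bit value right by 8, at most 8 bits remain,
   so no mask is needed here (as the walk-through notes). */
unsigned getarg_a_narrow(uint32_t i) {
    uint16_t register_right = (uint16_t)(i & 0xFFFFu); /* low half */
    return (unsigned)(register_right >> 8);
}
```

Both functions return 0x99 (binary 10011001) for the example word. The sketch shows only that the results agree; the speed difference depends on how the target compiler implements 32-bit shifts.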
Bittencourt |
Ok Veb,
Thanks for the explanation =)
But I was only asking about the cast: whether it was just the define Bogdan guessed or something more elaborate.

Thanks,
--Pedro Bittencourt

On Mon, Aug 23, 2010 at 9:24 AM, veb veb <[hidden email]> wrote:
No, the cast( int, ...) is just a cast to an int (it returns an int)
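As an illustration of the macro being discussed, here is a minimal sketch. Lua's llimits.h defines cast in essentially this form; the helper truncate_to_int is hypothetical. One detail matters when writing it by hand: a function-like macro needs its opening parenthesis directly after the name. With a space after the name, the preprocessor treats cast as an object-like macro and the parameter list becomes part of the replacement text.

```c
/* Function-like macro: no space between the name and '('.
   The outer parentheses keep the cast together inside larger expressions. */
#define cast(t, exp) ((t)(exp))

/* Hypothetical helper, only to show the macro in use. */
int truncate_to_int(double x) {
    return cast(int, x); /* expands to ((int)(x)) */
}
```

So cast(int, ...) is an ordinary C cast with a searchable name, which is why the GETARG_* macros can use it to return an int.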
From: [hidden email]
Date: Mon, 23 Aug 2010 09:33:18 -0300
To: [hidden email]
Subject: Re: [eLua-dev] optimisation for 16 bits system