It would be nice if you would give short code examples within a code tag. Thank you in advance
ML Optimization discussion (split from: ROM calls and other tricks)
Moderator: Moderators
ML Optimization discussion (split from: ROM calls and other tricks)
viewtopic.php?t=676&start=30
It would be nice if you would give short code examples within a code tag. Thank you in advance
It would be nice if you would give short code examples within a code tag. Thank you in advance
Last edited by Mike on Sat Oct 24, 2020 2:36 pm, edited 1 time in total.
Valid rule today as earlier: 1 Byte = 8 Bits
-._/classes instead of masses\_.-
-._/classes instead of masses\_.-
- J.E.E.K.
- Vic 20 Drifter
- Posts: 23
- Joined: Wed Jan 25, 2017 12:31 pm
- Website: http://klasek.at/8bit
- Location: AT
Re: ML Optimization discussion (split from: ROM calls and other tricks)
For all points or just the ones I added?
Last edited by Mike on Sat Oct 24, 2020 2:38 pm, edited 1 time in total.
Re: ML Optimization discussion (split from: ROM calls and other tricks)
Maybe for all the points.
Valid rule today as earlier: 1 Byte = 8 Bits
-._/classes instead of masses\_.-
-._/classes instead of masses\_.-
- chysn
- Vic 20 Scientist
- Posts: 1204
- Joined: Tue Oct 22, 2019 12:36 pm
- Website: http://www.beigemaze.com
- Location: Michigan, USA
- Occupation: Software Dev Manager
Re: ML Optimization tips and tricks
It seems like ML optimization and ROM tips could be separate topics.
One solid optimization that I always try to look out for is when a subroutine sets up data and then ends in a subroutine call, like this:
JSR is really just a JMP that puts a return address (-1) on the stack, and RTS is really just a JMP that sets PC to a stack address+1; so you save one byte of code, two bytes of stack space, and three cycles by ending such a subroutine with
One solid optimization that I always try to look out for is when a subroutine sets up data and then ends in a subroutine call, like this:
Code: Select all
sub: ; Subroutine in which something is done
jsr $cb1e ; and then some call is made to act on data
rts
Code: Select all
sub: ; Subroutine in which something is done
jmp $cb1e ; etc.
VIC-20 Projects: wAx Assembler, TRBo: Turtle RescueBot, Helix Colony, Sub Med, Trolley Problem, Dungeon of Dance, ZEPTOPOLIS, MIDI KERNAL, The Archivist, Ed for Prophet-5
WIP: MIDIcast BASIC extension
he/him/his
WIP: MIDIcast BASIC extension
he/him/his
- Mike
- Herr VC
- Posts: 5130
- Joined: Wed Dec 01, 2004 1:57 pm
- Location: Munich, Germany
- Occupation: electrical engineer
Re: ML Optimization discussion (split from: ROM calls and other tricks)
IMO, the list of techniques that wimoos posted (link) is pretty much self-evident. It is you who would have to come up with a specific question about an own coding problem where you don't know how to apply one of the techniques listed, then you stand good chances that someone here will show you how. Conversely I doubt a random collection of code examples explaining these techniques would be helpful in case.Noizer wrote:It seems Wimoos hasn‘t time left to further explain his points? Oh, probably you have to look at his last posting
- chysn
- Vic 20 Scientist
- Posts: 1204
- Joined: Tue Oct 22, 2019 12:36 pm
- Website: http://www.beigemaze.com
- Location: Michigan, USA
- Occupation: Software Dev Manager
Re: ML Optimization tips and tricks
Yes, that's right, thanks!
Last edited by chysn on Thu Nov 05, 2020 7:24 am, edited 1 time in total.
- Mike
- Herr VC
- Posts: 5130
- Joined: Wed Dec 01, 2004 1:57 pm
- Location: Munich, Germany
- Occupation: electrical engineer
Re: ML Optimization discussion (split from: ROM calls and other tricks)
Wimoos already mentioned tail call optimization in his post: viewtopic.php?t=676&start=22 ... and to quote from there:
You'll have to be wary about two cases, where this technique cannot (easily) be applied:
- the RTS may itself be a labeled (jump-)target, or
- the called sub-routine uses the return address on stack. PRIMM is a good example. If you call PRIMM with JMP, wrong data will be printed (likely, from the return address one level higher) and same return address will be advanced somewhat uncontrolled, likely leading to a crash.
Re: ML Optimization tips and tricks
J.E.E.K. wrote: ↑Sat Oct 24, 2020 9:02 amThis list could be extended by
- Hold flag in bit 7 of a location, so you can check the flag by "BIT flag" followed by "BMI/BPL destination" keeping A untouched, clear the flag with "LSR flag". If carry is set by an condition before, use "ROR flag" to set the flag.
- Operations without temporarily save the accumulator to memory:
- To subtract the value in A from location's value ( (mem)-A ) use EOR #$FF; SEC; ADC mem (negate A with two's complement and simply add).
- Merge some bits from A into a memory location using EOR mem; AND #mask_bits_taken_from_acc; EOR mem; STA mem.
Valid rule today as earlier: 1 Byte = 8 Bits
-._/classes instead of masses\_.-
-._/classes instead of masses\_.-
- chysn
- Vic 20 Scientist
- Posts: 1204
- Joined: Tue Oct 22, 2019 12:36 pm
- Website: http://www.beigemaze.com
- Location: Michigan, USA
- Occupation: Software Dev Manager
Re: ML Optimization tips and tricks
Set a flag (3 or 4 bytes)Noizer wrote: ↑Sat Oct 31, 2020 5:40 am[*]Hold flag in bit 7 of a location, so you can check the flag by "BIT flag" followed by "BMI/BPL destination" keeping A untouched, clear the flag with "LSR flag". If carry is set by an condition before, use "ROR flag" to set the flag.I‘m still waiting for some code examples, any chance ?
Code: Select all
sec
ror flag
Code: Select all
bit flag
bmi if_set ; ~or~ bpl if_unset
Code: Select all
lsr flag
For the other bullet point, "Operations without temporarily save the accumulator to memory," the code is right there in the nested bullet points.
VIC-20 Projects: wAx Assembler, TRBo: Turtle RescueBot, Helix Colony, Sub Med, Trolley Problem, Dungeon of Dance, ZEPTOPOLIS, MIDI KERNAL, The Archivist, Ed for Prophet-5
WIP: MIDIcast BASIC extension
he/him/his
WIP: MIDIcast BASIC extension
he/him/his
Re: ML Optimization tips and tricks
Thanx for you efforts. I certainly will use some of the code snipplets as needed.
Now here are my different approaches on ML optimizations of MHO:
- Use of self modifying code -- load everything you have to process from outer loops
- Use of tables for all operations where
z = x operation y — why slowly calculate values if you can simple do a table lookup with at most two bytes indexing (x and y)
- Always starting to code keeping in mind execution time is not a bad choice — if you can do it fast, you can do it slow too
- avoiding of PHA / PLA, if you can use zeropage addresses for storing or indexing — saves cycles and values are still available, this beans counting is more annoying than the necessary and instructive cycle counting
- use of unintended opcodes, very handy and mind freeing, new coding possibilities with almost zero overhead — example Zeropage indirect addressing indexed in X
Valid rule today as earlier: 1 Byte = 8 Bits
-._/classes instead of masses\_.-
-._/classes instead of masses\_.-
- Mike
- Herr VC
- Posts: 5130
- Joined: Wed Dec 01, 2004 1:57 pm
- Location: Munich, Germany
- Occupation: electrical engineer
Re: ML Optimization discussion (split from: ROM calls and other tricks)
Noizer, there is yet another dimension of optimization you left out in your list (BTW - been there, done that), and that is minimizing the time spent on writing the code and getting it right, to specification.
I'd presume quite a lot of fellow coders try to optimize their code for size or speed ... and never finish it.
Just to put things into perspective - I worked roughly 100 hours to implement MINIPAINT (spread over half a year in 2009, mostly on rainy afternoons and weekends), resulting in roughly 7 KB library code 'scripted' by a main program written in MINIGRAFIK-enhanced BASIC.
I optimized only a few routines of said library in any of the dimensions speed or size - the zoom routine of the main edit window and the cursor display are the most notable examples. The rest of the lot, around 50 JSR-callable sub-routines, is just straightforward code to get the job done.
So, what track record are you able to show? Finished programs I mean. No cheap talk.
I'd presume quite a lot of fellow coders try to optimize their code for size or speed ... and never finish it.
Just to put things into perspective - I worked roughly 100 hours to implement MINIPAINT (spread over half a year in 2009, mostly on rainy afternoons and weekends), resulting in roughly 7 KB library code 'scripted' by a main program written in MINIGRAFIK-enhanced BASIC.
I optimized only a few routines of said library in any of the dimensions speed or size - the zoom routine of the main edit window and the cursor display are the most notable examples. The rest of the lot, around 50 JSR-callable sub-routines, is just straightforward code to get the job done.
So, what track record are you able to show? Finished programs I mean. No cheap talk.
- chysn
- Vic 20 Scientist
- Posts: 1204
- Joined: Tue Oct 22, 2019 12:36 pm
- Website: http://www.beigemaze.com
- Location: Michigan, USA
- Occupation: Software Dev Manager
Re: ML Optimization discussion (split from: ROM calls and other tricks)
Most of the time, when we speak of optimization, we're talking about things that we do to improve code when it's already somewhat-functional. But a great deal of optimization is available in the design stage.
Consider a data structure for a game enemy that requires four properties: direction, x position, y position, hit points. It's tempting to put data related to an instance in consecutive memory locations.
The downside to this is that it can be costly to manage the index to the object, because there's math involved:
I've used this sort of data structure, which comes from my background in C, and my love of structs. Unfortunately, it's inefficient for 6502, and I've now switched to storing data with properties together instead of instances together:
And now all you need is an instance index to access properties
Notice that you do need to specify a maximum number of instances (MAX_OBJ) to allocate space. But in VIC-20 practice, this is easy enough.
Consider a data structure for a game enemy that requires four properties: direction, x position, y position, hit points. It's tempting to put data related to an instance in consecutive memory locations.
Code: Select all
OBJ_DATA = $03c0
DIRECTION = OBJ_DATA
X_POS = OBJ_DATA + 1
Y_POS = OBJ_DATA + 2
HP = OBJ_DATA + 3
Code: Select all
GetHP: asl ; Multiply object index by 4 to get the table index
asl
tax
lda HP,x
rts
Code: Select all
MAX_OBJ = 6
DIRECTION = $03c0
X_POS = DIRECTION + MAX_OBJ
Y_POS = X_POS + MAX_OBJ
HP = Y_POS + MAX_OBJ
Code: Select all
lda HP,x
VIC-20 Projects: wAx Assembler, TRBo: Turtle RescueBot, Helix Colony, Sub Med, Trolley Problem, Dungeon of Dance, ZEPTOPOLIS, MIDI KERNAL, The Archivist, Ed for Prophet-5
WIP: MIDIcast BASIC extension
he/him/his
WIP: MIDIcast BASIC extension
he/him/his
Re: ML Optimization discussion (split from: ROM calls and other tricks)
Dear Mike, some things just take time. And since it's a hobby, I alone decide when I invest time in it and when I want to make something public. I like to program cheaply in my quiet little room, I don't live for it and nobody pays me for it. When I feel like it, I write a little bit of my tiny experience in the world where you are and that's it.Mike wrote: ↑Wed Nov 04, 2020 3:56 pm Noizer, there is yet another dimension of optimization you left out in your list (BTW - been there, done that), and that is minimizing the time spent on writing the code and getting it right, to specification.
...
So, what track record are you able to show? Finished programs I mean. No cheap talk.
Valid rule today as earlier: 1 Byte = 8 Bits
-._/classes instead of masses\_.-
-._/classes instead of masses\_.-
- J.E.E.K.
- Vic 20 Drifter
- Posts: 23
- Joined: Wed Jan 25, 2017 12:31 pm
- Website: http://klasek.at/8bit
- Location: AT
Re: ML Optimization tips and tricks
Ad "subtraction":Noizer wrote: ↑Sat Oct 31, 2020 5:40 amJ.E.E.K. wrote: ↑Sat Oct 24, 2020 9:02 am [..]
- [..]
- Operations without temporarily save the accumulator to memory:
- To subtract the value in A from location's value ( (mem)-A ) use EOR #$FF; SEC; ADC mem (negate A with two's complement and simply add).
- Merge some bits from A into a memory location using EOR mem; AND #mask_bits_taken_from_acc; EOR mem; STA mem.
I‘m still waiting for some code examples, any chance ?
Request Memory from String Space (BASIC V2, C64)
The length of the request (bytes to allocate) is given in A. The heap pointer ($33/$34, the result is finally in A/Y) has to be moved to lower addresses:
Code: Select all
B4F7: 49 FF EOR #$FF
B4F9: 38 SEC
B4FA: 65 33 ADC $33 ; Pointer: Bottom of String space
B4FC: A4 34 LDY $34 ; Pointer: Bottom of String space
B4FE: B0 01 BCS $B501
B500: 88 DEY
In my Graphic-Extension code I am using a mode byte (savemo), which consists of bits from different sources:
Code: Select all
; put A's bit 4-7 into savemo
EOR savemo ; ********
AND #%11110000 ; ****0000
EOR savemo ; AAAAmmmm
STA savemo ;