My Favorite Instruction of 2023

chysn
Vic 20 Scientist
Posts: 1205
Joined: Tue Oct 22, 2019 12:36 pm
Website: http://www.beigemaze.com
Location: Michigan, USA
Occupation: Software Dev Manager

Post by chysn »

In my 6502 projects this year, I made liberal use of an instruction that I've heretofore ignored:

Code: Select all

ldy TABLE_IX
ldx Data,y
It's the absolute,Y mode of LDX! It might seem weird to load an index register with an indexed table member. But I started out doing something like this:

Code: Select all

ldy TABLE_IX
lda Data,y
tax
There's no need to involve the Accumulator here.
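
For reference, here is a side-by-side sketch of the two forms, with the standard 6502 sizes and timings in the comments. The labels are the ones used above. Both versions set N and Z from the loaded value.

```asm
; Via the Accumulator: 4 bytes and 6 cycles for the table load, A is clobbered
    ldy TABLE_IX    ; Y = index into Data
    lda Data,y      ; 3 bytes, 4 cycles (+1 if indexing crosses a page)
    tax             ; 1 byte,  2 cycles

; Direct: 3 bytes and 4 cycles for the table load, A is left alone
    ldy TABLE_IX    ; Y = index into Data
    ldx Data,y      ; 3 bytes, 4 cycles (+1 if indexing crosses a page)
```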

The beauty of this instruction (and its LDY counterpart) is that it acts as a translator between data tables. It's like Inception, dropping down into another level of reality. I used it for interfacing with other hardware, specifically a musical instrument that has its own way of organizing tabular data. It's a table of sixty (or thereabouts) data points, of about a dozen different types and ranges.

So when I designed my own software's tables, they needed to reference the instrument's native data layout. And it's LDX absolute,Y that glues the instrument's data structures and my data structures together!

Imagine a representation of a target system's data in memory, with each index in the table representing a specific attribute. Oscillator A frequency at index 0, Oscillator B frequency at index 1, and so on.

I want to create an abstraction for the idea of "Oscillator frequency," so I have a Types table in the EEPROM that stores the attributes for this kind of field (high/low ranges, pointer to a draw subroutine, and other stuff). Then, I have a third level that specifies physical attributes of each field: which page it's on, where it's placed on the screen, AND (this is the key part) a reference index to the target system's data table. If I know which field the user is currently editing, then sending the real data to a type-specific interface routine goes like this:

Code: Select all

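ldy USER_CURRENT_FIELD_INDEX       ; Y = field being edited
ldx TARGET_DATA_INDICES,y          ; X = that field's index in the native table
lda DRAW_SUBROUTINE_POINTERS_HIGH,y
pha                                ; push high byte of the draw routine address
lda DRAW_SUBROUTINE_POINTERS_LOW,y
pha                                ; push low byte (RTS adds 1, so the tables must hold address-1)
lda NATIVE_DATA_TABLE,x            ; A = the real data for this field
rts                                ; "return" into the draw routine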
Updating the native data table works in much the same way, allowing me to check ranges based on the type table and then write to the location gleaned via LDX absolute,Y.
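
A minimal sketch of what such an update could look like; this is an illustration, not the actual project code. It assumes a per-field table of type indices (FIELD_TYPES), per-type limit tables (TYPE_LOW, TYPE_HIGH) and a candidate value in NEW_VALUE. All four names are hypothetical, and the limits are treated as unsigned and inclusive.

```asm
; Hypothetical labels: FIELD_TYPES, TYPE_LOW, TYPE_HIGH, NEW_VALUE
update: ldy USER_CURRENT_FIELD_INDEX
        ldx FIELD_TYPES,y           ; X = type index for this field
        lda NEW_VALUE
        cmp TYPE_LOW,x
        bcc reject                  ; value < low limit
        cmp TYPE_HIGH,x
        beq store                   ; value = high limit is still allowed
        bcs reject                  ; value > high limit
store:  ldx TARGET_DATA_INDICES,y   ; X = index into the native table
        sta NATIVE_DATA_TABLE,x     ; write the checked value
        clc                         ; carry clear = accepted
        rts
reject: sec                         ; carry set = out of range
        rts
```

Both LDX absolute,Y loads leave the Accumulator alone, so the value being checked never has to be saved and restored.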

It's a great instruction when you need to create more-complex-than-usual data structures.
Mike
Herr VC
Posts: 4841
Joined: Wed Dec 01, 2004 1:57 pm
Location: Munich, Germany
Occupation: electrical engineer

Re: My Favorite Instruction of 2023

Post by Mike »

My first thought when I read the topic title was you probably meant the BIT instruction, but then we two had already covered that in 2020. :mrgreen:

These are the kind of instructions I only remember when I actually need them, so to speak. It's a shame, though, that the corresponding store instructions are missing for absolute indexed addressing.
chysn wrote: The beauty of [LDX ABS,Y] (and its LDY counterpart) is that it acts as a translator between data tables.
Yeah, LDY ABS,X features quite prominently in my port of Killer Comet to 'fold' two complicated nested FOR loop constructs in BASIC ...

Code: Select all

25 F=1:REM ERA METEOR
26 FORE=0TO44STEP22
27 FORD=C+ETOC+3+E:POKED,32 :F=F+1:NEXT:NEXT
[...]
29 F=1:REM DRAW METEOR
30 [...]:FORE=0TO44STEP22
35 FORD=C+ETOC+3+E:POKED,B(F):F=F+1:NEXT:NEXT
... into a six-instruction copy loop ...

Code: Select all

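.119A LDX #$0F      ; 15 cells, X counts down from 15 to 1
.119C LDY $101F,X   ; <-- !!! screen offset from the index table
.119F LDA $102F,X   ; character code for this cell
.11A2 STA ($FB),Y   ; store at screen position ($FB/$FC) + Y
.11A4 DEX
.11A5 BNE $119C     ; loop until X reaches 0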
... using these two tables to index the meteor 'contents' into the screen:

Code: Select all

>1010 20 A0 A0 A0 A0
>1015 20 A0 A0 A0 A0
>101A 20 A0 A0 A0 A0

>1020 00 01 02 03 04
>1025 16 17 18 19 1A
>102A 2C 2D 2E 2F 30
...

If you can spare 256 bytes for an identity table (all values $00..$FF stored in that order), you can use LDX table,Y or LDY table,X to synthesize the 'missing' TYX or TXY instructions ... ;)
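
A minimal sketch of the idea, assuming the table lives in RAM and is filled once at startup. IDENTITY is a hypothetical label; placing it on a page boundary means indexing never crosses a page.

```asm
; Fill the table once: IDENTITY[n] = n for n = $00..$FF
        ldx #$00
fill:   txa
        sta IDENTITY,x
        inx
        bne fill            ; X wraps to 0 after 256 stores

; Afterwards:
        ldx IDENTITY,y      ; X = Y, acts as "TYX"
        ldy IDENTITY,x      ; Y = X, acts as "TXY"
```

Compared with TYA/TAX (2 bytes, 4 cycles), each of these costs 3 bytes and 4 cycles on a page-aligned table, but it leaves the Accumulator untouched.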

Re: My Favorite Instruction of 2023

Post by chysn »

Mike wrote: Mon Jan 01, 2024 6:59 am It's a shame, though, that the corresponding store instructions are missing for absolute indexed addressing.
I haven't acutely felt the absence of the STi (i=some index register). Since LDi ABS,i frees up the Accumulator, it's more natural to use either STA (ZP),Y (as in your Comet loop) or STA ABS,X for these table translations. I did vaguely wish for it recently, but I don't remember the context.
Mike wrote: If you can spare 256 bytes for an identity table (all values $00..$FF stored in that order), you can use LDX table,Y or LDY table,X to synthesize the 'missing' TYX or TXY instructions ... ;)
If there was already such a table in ROM for some reason*, that would be cool. I never have a spare 256 bytes!

________________
* Like finding ROM bytes to synthesize BIT #, or using character ROM for bit value positions (1, 2, 4, 8...).

Re: My Favorite Instruction of 2023

Post by Mike »

chysn wrote:I never have a spare 256 bytes!
My reaction was roughly similar when I first read about it on nesdev.org - but one never knows ... perhaps some day this idea might come in handy.

Re: My Favorite Instruction of 2023

Post by chysn »

Mike wrote: Mon Jan 01, 2024 9:59 am
chysn wrote:I never have a spare 256 bytes!
My reaction was roughly similar when I first read about it on nesdev.org - but one never knows ... perhaps some day this idea might come in handy.
The thing that comes to mind is the kind of stuff you do with raster timing, where cycles need to be exact. But you seem to have done just fine without this technique.

Re: My Favorite Instruction of 2023

Post by chysn »

It's fun to think about how long a 6502 program would have to be to make up the 256 bytes and get a savings.

I mean, I'd love CMPX and CMPY, those are near the top of my wish list. But there'd need to be so many of them to make an identity table worthwhile.

Re: My Favorite Instruction of 2023

Post by Mike »

chysn wrote:The thing that comes to mind is the kind of stuff you do with raster timing, where cycles need to be exact. But you seem to have done just fine without this technique.
If you refer to that example in my recent "raster paper", the heavy processing that takes place in the inner loops is somewhat untypical:

Code: Select all

[...]
.Frame1
 LDA &900E                                ;    Load current aux. colour/volume register,
 EOR aux_extbrd_1,Y                       ;    change the aux. colour,
 AND #&0F                                 ;    but keep the volume
 EOR aux_extbrd_1,Y                       ;    and
 STA &FB                                  ;    store in $FB for later use.
 LDA aux_extbrd_1,Y                       ;    Calculate next combination
 EOR bck_intbrd_1,Y                       ;    of exterior border colour
 AND #&0F                                 ;    plus background colour from ...
 STX &900F                                ; ** re-instate exterior border colour at right edge of display window (keeping the current background colour)
 EOR bck_intbrd_1,Y                       ;    ... the table data and 
 TAX                                      ;    keep in X for later use.
 LDA &FB                                  ;    Load $FB and
 STA &900E                                ; ** write $FB as new value of aux. colour/volume register (immediately before horizontal retrace).
 STX &900F                                ; ** Change to new exterior border colour and background colour during horizontal retrace.
 ]
 IF NOT ntsc THEN [OPT pass:CMP (&00,X):] ;    6 cycles extra delay for PAL
[OPT pass
 LDA bck_intbrd_1,Y                       ;    Load combination of background colour and 'interior border colour' for the %01 multicolour pixels and
 STA &900F                                ; ** write background/border register at left edge of display window.
 NOP                                      ;    Not much leeway here.
 INY                                      ;    Count ...
 CPY #&C1                                 ;    ... 192+1 ...
 BCC Frame1                               ;    ... lines.
 [...]
The "**" markers show where the stores to the VIC registers happen.

If anything, that might have called for LD% ABS,% (% := X | Y) ... but what I actually wanted was to keep the colour tables compact. The exterior border colour is stored with the auxiliary colour, and the 'interior border colour' which then actually serves as independent colour source for the %01 multicolour pixels is stored with the background colour. All those EOR and AND instructions merely serve to mask out the necessary bits for the colour registers while still keeping the volume register intact. Indeed I ran out of registers here, which explains the presence of STA $FB and LDA $FB ($FB is preserved on stack during the IRQ).
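
The EOR/AND/EOR sequence is a general bit-merge idiom: it takes the bits selected by the mask from the first operand and all remaining bits from the second. A minimal sketch with hypothetical labels OLD, NEW and RESULT, using the same $0F mask as the listing above:

```asm
        lda OLD             ; A = old
        eor NEW             ; A = old XOR new
        and #$0f            ; keep the difference only in bits 0-3
        eor NEW             ; bits 0-3 from old, bits 4-7 from new
        sta RESULT
```

With OLD = $A7 and NEW = $3C, the intermediate values are $9B and $0B, and RESULT ends up as $37: low nibble 7 from OLD, high nibble 3 from NEW.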

Most other cycle exact raster code that tokra and I had written so far looks more like a data pump, which mostly just does LDA/STA to shuffle data from cartridge memory to either screen RAM, colour RAM or VIC registers. In the extreme cases, that code is fully unrolled, with immediate LDA instructions (instead of reading off tables) to provide the relevant data.
chysn wrote: I'd love CMPX and CMPY, those are near the top of my wish list.
Comparing two registers can also be done with either a zero-page temporary or self-modifying code, at 6 cycles in either version.
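
Both variants might look like the following sketch, here comparing A with X. TEMP is a hypothetical zero-page location and compare is a hypothetical label; the self-modifying version has to run from RAM.

```asm
; Zero-page temporary: 4 bytes, 6 cycles
        stx TEMP            ; 2 bytes, 3 cycles
        cmp TEMP            ; 2 bytes, 3 cycles: compare A with X

; Self-modifying code: 5 bytes, 6 cycles
        stx compare+1       ; 3 bytes, 4 cycles: patch the immediate operand below
compare:
        cmp #$00            ; 2 bytes, 2 cycles: compare A with the patched-in X
```

Comparing X with Y works the same way with STY followed by CPX, at the same cost.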

Re: My Favorite Instruction of 2023

Post by chysn »

Mike wrote: Mon Jan 01, 2024 11:12 am
I'd love CMPX and CMPY, those are near the top of my wish list.
Comparing two registers can also be done with either a zero-page temporary or self-modifying code, at 6 cycles in either version.
ZP is the approach I take, and it's what you'd need to use as the standard for determining whether an identity table is worthwhile. The default ZP usage is four bytes and six cycles, whereas the identity table is three bytes and 4-5 cycles. So you'd need to find 256 uses of the identity table to break even in memory. And that's a tall order. I'd need to have used identity tables in like 10% of instructions in my most recent project.
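
For illustration, the identity-table form of the wished-for CMPX and CMPY would look like this sketch. IDENTITY is a hypothetical label for a 256-byte table with IDENTITY[n] = n.

```asm
        cmp IDENTITY,x      ; "CMPX": compare A with X, 3 bytes, 4 cycles (+1 on a page crossing)
        cmp IDENTITY,y      ; "CMPY": compare A with Y, same cost
```

Each use is one byte shorter than the four-byte zero-page version, which is where the figure of 256 uses to pay back the table comes from.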

I guess the "256 bytes to spare" really comes into play. But I have a strong tendency to go right to the limit to add features, subtle niceties, and clear instructions/labels. If I have any "extra" memory, it's probably going into text.

I'm resolved to leave myself a couple hundred bytes right now to accommodate future firmware updates to the instrument. If I must, I can make the Help Screen more terse.
If you refer to that example in my recent "raster paper", the heavy processing that takes place in the inner loops is somewhat untypical
That's the one!