Denial

Posted: **Sat Feb 13, 2021 6:23 pm**

Is this the best-possible subroutine for flipping a single byte? By "flip" I mean that ABCDEFGH becomes HGFEDCBA and by "best" I mean the smallest-possible code:

Code:

; Flip A and return flipped value in A
Flip:   sta ZP
        ldx #$08
-loop:  lsr ZP
        rol a
        dex
        bne loop
        rts

Posted: **Sun Feb 14, 2021 2:53 am**

How about:

Code:

; Flip A and return flipped value in A
Flip:   sta ZP
        lda #$01
-loop:  lsr ZP
        rol a
        bcc loop
        rts

This is 1 byte shorter, 16 cycles faster, and doesn't touch the X-register.
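For readers new to the trick, here is the same routine again with a bit-by-bit trace in the comments. It is only an annotated sketch: ZP is assumed to be any free zero-page byte, and the label syntax is copied from the listing above, so adjust it for your assembler. The byte counts assume zero-page addressing for ZP.

```
; Input byte is ABCDEFGH (A = bit 7, H = bit 0).
; The lone 1 loaded into A is the guard bit: it marks how far the
; result has been filled in, and it ends the loop when it falls out.
;
;   after pass 1:  A = 0000001H   C = 0
;   after pass 2:  A = 000001HG   C = 0
;   after pass 7:  A = 1HGFEDCB   C = 0
;   after pass 8:  A = HGFEDCBA   C = 1  (guard bit), BCC falls through
;
Flip:   sta ZP          ; 2 bytes  ZP = ABCDEFGH
        lda #$01        ; 2 bytes  A  = 00000001, guard at bit 0
-loop:  lsr ZP          ; 2 bytes  C  = lowest remaining source bit
        rol a           ; 1 byte   shift C into A, old bit 7 of A into C
        bcc loop        ; 2 bytes  C stays 0 until the guard is rotated out
        rts             ; 1 byte   total 10 bytes (11 with LDX/DEX/BNE)
```

The 16 cycles saved come from dropping the DEX, which costs 2 cycles on each of the 8 passes.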

Posted: **Sun Feb 14, 2021 6:14 am**

tlr wrote: ↑Sun Feb 14, 2021 2:53 am This is 1 byte shorter, 16 cycles faster, and doesn't touch the X-register.

That’s awesome, thank you!

And this technique, which is new for me, goes beyond this project; it can be used to constrain accumulator operations for lots of things.

Posted: **Sun Feb 14, 2021 7:26 am**

Wasn't there a compo with this theme a while ago... on F64 perhaps? Hmm.

Posted: **Sun Feb 14, 2021 7:56 am**

chysn wrote: ↑Sun Feb 14, 2021 6:14 am
tlr wrote: ↑Sun Feb 14, 2021 2:53 am This is 1 byte shorter, 16 cycles faster, and doesn't touch the X-register.
That’s awesome, thank you!

And this technique, which is new for me, goes beyond this project; it can be used to constrain accumulator operations for lots of things.

There's also a variant of this:

Code:

; Flip A and return flipped value in A
Flip:   sec
        rol a
-loop:  ror ZP
        asl a
        bne loop
        lda ZP
        rts

This is two cycles slower than the previous version but leaves the result in ZP so the lda can be omitted in cases it would be stored there anyway.

Posted: **Sun Feb 14, 2021 10:33 am**

tlr wrote: ↑Sun Feb 14, 2021 7:56 am This is two cycles slower than the previous version but leaves the result in ZP so the lda can be omitted in cases it would be stored there anyway.

In my game, it's likely that I will work in-place, probably via ROR absolute,X. Basically, the idea is to make a contiguous set of ten custom characters (80 bytes) turn the other direction. This construction lets me use X as a byte index instead of a bit count.
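A minimal sketch of what that in-place loop might look like, built on the SEC/ROL variant quoted above. CHARS is a hypothetical label for the first of the 80 character bytes, FlipAll and the branch labels are made up for illustration, and the label syntax follows the earlier listings.

```
; Flip 80 bytes in place, CHARS+0 .. CHARS+79 (CHARS is an assumed label).
; X is the byte index; the guard bit in A counts the 8 shifts.
FlipAll: ldx #79         ; last byte of the block
-next:   lda CHARS,x     ; A = ABCDEFGH
         sec
         rol a           ; C = A, accumulator = BCDEFGH1 (guard at bit 0)
-loop:   ror CHARS,x     ; push C into the top of the byte in memory
         asl a           ; C = next source bit, guard moves left
         bne loop        ; A = 0 only once the guard has been shifted out
         dex
         bpl next        ; X runs 79..0, then wraps to $FF and exits
         rts
```

Eight RORs push every original bit out of the memory byte, so the location ends up holding only the reversed value and no separate store is needed.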

groepaz wrote: ↑Sun Feb 14, 2021 7:26 am Wasnt there a compo with this theme a while ago.... on F64 perhaps? mmmmh

It would be an interesting read. I did a Google search before posting my code, and didn't find anything smaller than mine. And right now, I can't imagine anything being smaller than @tlr's.

Posted: **Sun Feb 14, 2021 7:43 pm**

Not exactly the same thing, but it has some similar requirements: flipping an 8x8 bit matrix along the diagonal is discussed here: http://forum.6502.org/viewtopic.php?f=2&t=6412

Posted: **Mon Feb 15, 2021 2:24 am**

If flipping a byte formed a time-critical part of preparing a display, I'd still be willing to throw a 256 byte table against it.
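As a rough sketch of the table approach: FLIPTAB is a hypothetical 256-byte buffer, BuildTab is a made-up name, and Flip is assumed to be tlr's routine from earlier in the thread, which leaves X and Y untouched. Label syntax follows the earlier listings.

```
; Build the table once: FLIPTAB[n] = n with its bits reversed.
BuildTab: ldy #$00
-next:    tya             ; A = index 0..255
          jsr Flip        ; A = flipped index (Flip uses only A and ZP)
          sta FLIPTAB,y
          iny
          bne next        ; Y wraps to 0 after 256 entries
          rts

; Flip a byte at run time: value in A, result in A.
          tax
          lda FLIPTAB,x
```

The lookup replaces the whole 8-pass loop with two instructions, at the cost of 256 bytes of memory and the X register.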

chysn wrote:And this technique, which is new for me, goes beyond this project; it can be used to constrain accumulator operations for lots of things.

This 'guard bit' technique is also used in the BASIC interpreter while multiplying two float mantissas, see the loop $DA5E..$DA89.

Posted: **Mon Feb 15, 2021 12:32 pm**

Mike wrote: ↑Mon Feb 15, 2021 2:24 am If flipping a byte formed a time-critical part of preparing a display, I'd still be willing to throw a 256 byte table against it.

I see what you mean. The operation I have in mind will take 0.0168 seconds, which would be rough in some cases. For my project, the switch is happening between half-innings in a baseball game. It's the part of the game where the spectators get up to use the restroom and get a beer. I'll probably even have to add another second of artificial delay. If it were time-critical, I'd probably just store my flipped characters in memory.

Mike wrote: ↑Mon Feb 15, 2021 2:24 am
chysn wrote:And this technique, which is new for me, goes beyond this project; it can be used to constrain accumulator operations for lots of things.
This 'guard bit' technique is also used in the BASIC interpreter while multiplying two float mantissas, see the loop $DA5E..$DA89.

I admire the usage here because it's totally organic. The iterator and the result are one value, and it's very elegant. I don't really understand this usage in the BASIC interpreter. It's storing the iterator accumulator in Y while the accumulator is used for other stuff, and then from Y back to accumulator to do the shift and the carry check. In other words, I don't understand why it's

Code:

        lda #$80
-loop:  tay
        ; yada yada
        tya
        lsr
        bne loop

versus just regular old

Code:

        ldy #$08
-loop:  ; yada yada
        dey
        bne loop

Edit: Well, yeah, I get it on closer examination. Carry is checked first and the accumulator isn't always thrown out like I originally thought it was.
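To illustrate that last point, here is a hypothetical 8x8 multiply written in the same style. It is not the ROM routine, only a sketch of the pattern: the accumulator carries the remaining multiplier bits and the guard bit together, so one LSR both fetches the next bit into carry and counts the loop. MCAND, RESHI and RESLO are assumed zero-page locations, and the label syntax follows the earlier listings.

```
; In:  A = multiplier, MCAND = multiplicand
; Out: RESHI:RESLO = 16-bit product; A and Y are clobbered
Mul8:   ldy #$00
        sty RESHI       ; clear the high byte of the product
        lsr a           ; C = multiplier bit 0
        ora #$80        ; plant the guard above the remaining 7 bits
-loop:  tay             ; park multiplier+guard in Y (C is untouched)
        bcc skip        ; current multiplier bit is 0: nothing to add
        clc
        lda RESHI
        adc MCAND       ; add the multiplicand into the high byte
        sta RESHI
-skip:  ror RESHI       ; shift the product right, catching any carry
        ror RESLO
        tya             ; bring multiplier+guard back
        lsr a           ; C = next multiplier bit, guard moves right
        bne loop        ; A = 0 only after the guard has been shifted out
        rts
```

A plain LDY #$08 counter would need a separate location for the multiplier and a separate shift to test each bit; here the Y register holds both in one byte.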

Flipping a Byte