Yes, as far as I know, the 6510 is basically a 6502 plus this feature (well, and also a 6-bit I/O port, with direction and data registers at addresses 0 and 1). I don't know the modern CMOS parts from WDC that well, but as far as I can see, the 65C816 wouldn't have the problem with the tri-state stuff. Also, it's interesting that although the 65C02 is still available nowadays, it is simply not worth it, in my opinion: the price is almost the same, and it seems the 65C816 (in emulation mode) behaves more like a 6502 than the 65C02 itself does ... The only problem with the 65C816, as far as I remember, is that in every PHI1 phase it emits the bank number on the data bus (meant as the high 8 bits of the address for the next PHI2 access, alongside A0-A15, hence the 16 MB address space). This needs to be latched externally during PHI1 and shouldn't be allowed to "go" onto the main bus, as it would cause a bus conflict with e.g. the VIC-I.
My first idea was to remove the 6502 from the VIC-20 and create a board with pins on its underside, so that it can be plugged in instead of the 6502. It would contain the glue logic, the remapping for the different pinout, etc., and maybe also dedicated SRAM that is private to the CPU. The SRAM on the VIC-20 board would then only be shared between the CPU and the VIC-I. (Btw, VIAs can also be bought in a CMOS version capable of running even at 14 MHz, or higher, since for the 65C816 it's said that 14 MHz is somewhat of a "paper limit" only. So the VIAs can, in theory, also be replaced with modern parts, and the "new CPU" is free to access them in "fast mode" too. They would be fed with the original clock in VIC-20 mode anyway, so this wouldn't introduce an incompatibility from the timers counting at a different rate.)
I am not sure, though, whether the original "famous" bug of the VIAs, which made hardware shifting on the serial IEC bus impossible, is fixed in the CMOS version. It would be nice to be able to use fast/burst IEC between a modified VIC-20 (with newer VIAs) and a 1570/71, which supports that feature with the C128, but it should work with anything that has the bug-free shifting that was the problematic part of the original VIAs. Again and always: as far as I know!
The main issue the developers of the VIC-20 originally faced when putting the 6502 and the VIC-I into one machine was that the 6502 could not tri-state its address bus. The VIC-I, on the other hand, plays nice.
About the RMW instructions: as far as I can see, the major compatibility problem with the 4502/4510 used in the Commodore 65 (well, 65CE02, to use the name of the "CPU core") is that it does not write twice, and this behaviour is often relied on by software that touches I/O registers. I am not sure whether this matters on the VIC-20, or whether the 65C02 or 65C816 also have this "problem", which would create an incompatibility with the original 6502.
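To make the "writes twice" point concrete, here is a toy model in Python. Everything in it is an assumption for illustration: `AckLatch` is a hypothetical write-1-to-clear flag register (not a real VIC-20 register), `inc_double_write` mimics an INC that writes the old value back before the new one (as the original 6502 does), and `inc_single_write` mimics a CPU that only writes the final value.

```python
class AckLatch:
    """Hypothetical I/O register: writing a 1 bit clears that latched flag."""

    def __init__(self, flags):
        self.flags = flags & 0xFF

    def read(self):
        return self.flags

    def write(self, value):
        # Every bit set in the written value acknowledges (clears) that flag.
        self.flags &= ~value & 0xFF


def inc_double_write(reg):
    """INC as on the original 6502: read, write old value, write new value."""
    old = reg.read()
    reg.write(old)                # dummy write of the unmodified value
    reg.write((old + 1) & 0xFF)   # real write of the incremented value


def inc_single_write(reg):
    """INC on a CPU that does not write twice: read, dummy read, write."""
    old = reg.read()
    reg.read()                    # dummy read instead of the dummy write
    reg.write((old + 1) & 0xFF)


a = AckLatch(0x01)
inc_double_write(a)
b = AckLatch(0x01)
inc_single_write(b)
print(f"double write: {a.flags:#04x}, single write: {b.flags:#04x}")
```

In this sketch the double-write version clears the pending flag (the dummy write of `$01` acknowledges bit 0), while the single-write version leaves it set, so code that relies on the first behaviour would hang waiting for the flag to clear.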
By the way, I always feel that the 65xx treats the stack and zero page (ok, rather "base page": if it can be relocated, it's not "zero" anymore) as second-class citizens. What I mean is that if you plan to create a true multitasking OS (well, I have a plan like this ...), it would be important to have a separate stack and zero page per process. With the 65C816 you can breathe a bit, since the stack and zero page are relocatable, how nice. But again, as with the original 6502, these things were apparently not meant to be "important": the CPU can address 16 MB of memory, yet the zero page and stack can still only live in bank 0 (the first 64K), so it's the same problem again, it's kind of restricted.
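A quick back-of-the-envelope sketch of that bank 0 restriction, in Python. The layout is purely hypothetical: I assume each process gets one 256-byte direct page followed by one 256-byte stack page, and that `reserved_pages` pages at the bottom of bank 0 are taken by something else (ROM, I/O, video RAM, whatever).

```python
BANK0_SIZE = 0x10000   # direct page and stack must both live in the first 64K
PAGE = 0x100


def plan_bank0(reserved_pages, dp_pages=1, stack_pages=1):
    """Return (direct_page_base, initial_stack_pointer) for each process slot.

    Slots are packed upwards from the end of the reserved area. The stack
    grows downwards, so the initial stack pointer is the top of the slot.
    """
    pages_per_process = dp_pages + stack_pages
    free_pages = BANK0_SIZE // PAGE - reserved_pages
    base = reserved_pages * PAGE
    slots = []
    for i in range(free_pages // pages_per_process):
        dp_base = base + i * pages_per_process * PAGE
        stack_top = dp_base + pages_per_process * PAGE - 1
        slots.append((dp_base, stack_top))
    return slots


# Hypothetical case: the lower 32K of bank 0 is already in use.
slots = plan_bank0(reserved_pages=128)
print(len(slots), [f"{dp:#06x}/{sp:#06x}" for dp, sp in slots[:2]])
```

Even with nothing else in bank 0 this scheme tops out at 128 processes, and in practice far fewer, no matter how much RAM sits in the other 255 banks.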


I'm not so sure about the original 6502. I guess it won't release the bus (as in a high-Z state) even during PHI1; is that why there are LS buffers in the VIC-20, to isolate it from the bus?
Anyway. I think the first attempt should be *only* to replace the 6502 with a 65C816. Without any tricks, extra memory etc., and let's see if it works. Maybe it needs little more than some wires and not many active components. The bank number during PHI1 wouldn't be a problem, as the buffers isolate the CPU from the bus during PHI1 anyway. With this project alone, we would already be able to use the 65C816's extra features (even 16-bit mode, etc.), though only at the original clock, restricted to a 64 KB address space, etc. Software that doesn't use illegal opcodes, including BASIC and the KERNAL, should work as if nothing had happened ...
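The PHI1 bank byte and the buffer isolation can be modelled in a few lines of Python, just to show the idea. This is only a sketch under my assumptions: `BankLatch` stands for a transparent latch that follows the CPU data pins while PHI2 is low, `bus_buffer` stands for a buffer that only passes data while PHI2 is high, and all names are made up for the example.

```python
class BankLatch:
    """Transparent latch: follows its input while PHI2 is low, holds otherwise."""

    def __init__(self):
        self.q = 0

    def clock(self, phi2, d):
        if phi2 == 0:
            self.q = d & 0xFF
        return self.q


def bus_buffer(phi2, cpu_data_pins):
    """Data buffer towards the shared bus, enabled only while PHI2 is high."""
    return cpu_data_pins if phi2 == 1 else None   # None stands for high-Z


def run_cycle(latch, bank, addr16, data):
    """One CPU cycle: bank byte on the data pins in PHI1, real data in PHI2."""
    seen_on_shared_bus = []
    for phi2, pins in ((0, bank), (1, data)):
        latch.clock(phi2, pins)
        seen_on_shared_bus.append(bus_buffer(phi2, pins))
    full_address = (latch.q << 16) | addr16
    return full_address, seen_on_shared_bus


addr, seen = run_cycle(BankLatch(), bank=0x12, addr16=0x3456, data=0xAB)
print(f"{addr:#08x}", seen)
```

The shared bus never sees the bank byte (it is high-Z in the first half-cycle), while the latch still has it available to form the 24-bit address. For the plain drop-in replacement the latch can simply be left out, which gives the 64 KB restriction mentioned above.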
Hehe, yeah. You see, even with a simple computer like the VIC-20, there's already some engineering involved to get two bus masters working together as a team. A simpler redesign might be doable with today's means: dual-ported RAM. In that case, the VIC could simply access its RAM as it wanted, and the CPU could do the same. Also, the CPU could run from a different clock, and as the VIC only reads, there can't be any write collisions. Only register accesses need extra arbitration logic, i.e. the CPU must stretch its access cycle and allow the VIC to snoop on the address bus for its own address range.






Yes, of course, a SID is wanted too, just in case.


Btw, about the SID: I am not sure how far out of specification (?) it is to put a SID in a VIC-20, as the clock is somewhat higher than in a C64. Maybe it wouldn't be a problem (as in, it works), but the oscillator frequencies would be a bit off from the original values compared to a C64, I think. My other idea was to use the OPL2 chip (e.g. found in the AdLib too; YM3812, if I remember the name correctly). It wouldn't be a great surprise, as on the C64 at least, the SFX Sound Expander cartridge used that (well, OPL1, but many people - ? - replaced it with an OPL2). I've even written a C64 program which can play back OPL2 register events (recorded with DOSBox, e.g. from old PC games ...) on a C64. I could only test it in VICE, though, as I don't have a real SFX cartridge (but it wouldn't be hard to build one, as many AdLib and older SoundBlaster cards include the chip with the right DAC alongside it ...). Here is some kind of "demo", just in case you're interested, btw:
https://www.youtube.com/watch?v=umiL62CPObg
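On the SID clock question above, a rough calculation in Python of how far the pitch would be off. The SID datasheet gives the oscillator frequency as Fout = Fn * Fclk / 16777216, where Fn is the 16-bit frequency register value. The two clock constants are my assumptions (the commonly quoted PAL system clocks), so check them against your own machine; on NTSC the two machines are usually quoted with the same clock, so this mostly concerns PAL.

```python
import math

# Assumed PAL system clocks in Hz (commonly quoted values, not measured).
C64_PAL_HZ = 985_248
VIC20_PAL_HZ = 1_108_405


def sid_freq_hz(reg16, clock_hz):
    """Oscillator output frequency for a 16-bit SID frequency register value."""
    return reg16 * clock_hz / 16_777_216


def sid_reg_for(freq_hz, clock_hz):
    """Nearest register value that produces freq_hz at the given clock."""
    return round(freq_hz * 16_777_216 / clock_hz)


def detune_semitones(clock_from_hz, clock_to_hz):
    """Pitch shift, in semitones, when the same register values run on clock_to."""
    return 12 * math.log2(clock_to_hz / clock_from_hz)


reg = sid_reg_for(440.0, C64_PAL_HZ)          # register value for A4 on a C64
print(reg, sid_freq_hz(reg, VIC20_PAL_HZ))    # same value on the VIC-20 clock
print(detune_semitones(C64_PAL_HZ, VIC20_PAL_HZ))
```

With these assumed clocks, unmodified C64 music data would play about two semitones sharp, and faster filter and envelope timing would come on top of that. A player written for the VIC-20 could of course just use a frequency table recalculated with `sid_reg_for`.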