
Profiling

Posted: Wed Jan 12, 2011 2:35 pm
by Kweepa
I was thinking about adding a simple profiler extension to VICE. But before I do that, does anyone have a pointer to a ready-made solution?

My thought was to have a menu item to turn profiling on and off. When profiling:

static int clocksForInst[256];
static int clocks[NUM_CHUNKS][65536];
static int chunk = 0;
static int chunksize = 0;
#define CHUNK_SIZE 500000

clocks[chunk][pc] += clocksForInst[*pc];
++chunksize;
if (chunksize >= CHUNK_SIZE) { ++chunk; chunksize = 0; }

And then use the cc65 map file to find the functions.
Obviously it wouldn't be hierarchical, but I can live with that.

Posted: Wed Jan 12, 2011 6:34 pm
by Mike
Kweepa wrote:static int clocks[NUM_CHUNKS][65536];
Such a big array should be allocated with malloc() on an 'int **' instead. What value range would be expected for NUM_CHUNKS?

Also, the logic behind your profiler escapes me a bit. Wouldn't it be sufficient to use something like 'bin[*pc]+=1.0;' on a single 64K array of doubles to gather statistical info about which instructions are called how often?

Another viable method for zeroing in on hot spots is to activate the monitor with Alt-M, say, 50 times. If there is a sub-routine which is called often, its address will show up prominently while the program is interrupted. The only case where this method doesn't work that well is when activity is correlated to the raster lines, because VICE enters the monitor only when a whole frame has been processed. But in that case, you could use the border colour to signal what routine is currently running - this would also work on a real VIC-20.

@mods: Could this thread, and this one, and this one please be moved into the 'Emulation and Cross Development' section?

Posted: Wed Jan 12, 2011 6:57 pm
by Boray
Mike wrote: @mods: Could this thread, and this one, and this one please be moved into the 'Emulation and Cross Development' section?
Done.

Posted: Wed Jan 12, 2011 7:00 pm
by Mike
Thanks! :)

Posted: Thu Jan 13, 2011 9:04 am
by Kweepa
For the non-pseudocode I'd probably allocate each chunk separately.

The idea is that each chunk represents about a second of run-time. These could then be combined later if you're not interested in the moment-to-moment timing. So NUM_CHUNKS would not really be constant, but depend on how long you ran the profile.

I have been using the raster timer, but it doesn't work very well for routines that last more than a frame, and it requires a lot of back and forth changing code.

Alt-M*50? I'd rather write the profiler :)

Posted: Thu Jan 13, 2011 1:41 pm
by Mike
Kweepa wrote:I have been using the raster timer, but it doesn't work very well for routines that last more than a frame,
Agreed.
Kweepa wrote:and it requires a lot of back and forth changing code.
Why? That's what the preprocessor in C is for. Not only for defining constants - you can also compile code conditionally.
Kweepa wrote:Alt-M*50? I'd rather write the profiler :)
That might also depend on how often the object file is (or needs to be) rebuilt.

In MINIPAINT, about the only two routines which are explicitly tuned for speed are the cursor display and the zoom routine, the latter of which spends most of its time (~70%) in this tight loop:

Code:

.4EF2  A5 A6     LDA $A6
.4EF4  06 A8     ASL $A8
.4EF6  2A        ROL A
.4EF7  06 A8     ASL $A8
.4EF9  2A        ROL A
.4EFA  06 A7     ASL $A7
.4EFC  2A        ROL A
.4EFD  06 A7     ASL $A7
.4EFF  2A        ROL A
.4F00  91 FB     STA ($FB),Y
.4F02  A5 A5     LDA $A5
.4F04  91 FD     STA ($FD),Y
.4F06  C8        INY
.4F07  C0 04     CPY #$04
.4F09  90 E7     BCC $4EF2
... which plots four high-res or two multi-colour 'chunky' pixels in each of its four iterations. A complete update of the zoomed view only needs 25 milliseconds.

Most other routines in MINIPAINT are just written in a way to get the job done. ;)

Posted: Wed Jan 26, 2011 3:00 pm
by Kananga
Incidentally, I am also working on a profiler for vic20emu that is integrated into the debugger. It shows execution count, cycles and average cycles in a tooltip for each code line:

[Image: screenshot of the debugger window with the profiling tooltip]

(Do you recognise the code in the debugger window, Mike? :) )

Posted: Thu Jan 27, 2011 1:06 am
by Kweepa
Awesome.
I haven't had time to do any work on my profiler. Is vic20emu something I can configure for full memory plus disk drive?

Posted: Thu Jan 27, 2011 1:23 am
by Kananga
Kweepa wrote:I haven't had time to do any work on my profiler. Is vic20emu something I can configure for full memory plus disk drive?
Full memory - yes (even FE3 superram mode works).
Disc support is also work in progress. You can load the directory ("$"), but the IEC emulation is lacking file load/save support. It is not difficult to implement, but sometimes writing to or reading from the IEC bus fails for reasons unknown to me.

Posted: Thu Jan 27, 2011 5:07 pm
by Mike
Kananga wrote:Do you recognise the code in the debugger window, Mike? :)
I do, for sure.

One of my tools includes this code snippet with all files it writes to mass storage. ;)

Posted: Thu Jan 27, 2011 5:22 pm
by Kananga
Mike wrote:
Kananga wrote:Do you recognise the code in the debugger window, Mike? :)
I do, for sure.

One of my tools includes this code snippet with all files it writes to mass storage. ;)
Right you may be!
Actually, I just ran the smurf prg, because it's my favourite.