Assembly BLAS routines for C64 / VIC-20
Posted: Wed Sep 20, 2023 6:38 am
Hello,
i like to use the BLAS routines in my finite elements method.
The listings can be found here.
https://eprints.maths.manchester.ac.uk/2029/
Of course the code is for the C64 Basic, I'll do the translation into the VC20 basic later.
I think, hope, the code is correct typed in, but it shows an error.
Is anyone here willing to watch the program? I just can't find the error.
BR
Sven
i like to use the BLAS routines in my finite elements method.
The listings can be found here.
https://eprints.maths.manchester.ac.uk/2029/
Of course the code is for the C64 Basic, I'll do the translation into the VC20 basic later.
I think, hope, the code is correct typed in, but it shows an error.
Is anyone here willing to watch the program? I just can't find the error.
BR
Sven
Code: Select all
SGEFA AND SGESL WITH BASIC+ML
USES THE 4K extension for the ASSEMBLY CODE
$C000 TO $CFFF
; SAXPY = $C037 = 49207
; ISAMAX = $C072 = 49266
; SSCAL = $C0CC = 49356
5 isamax=49266:sscal=49356:saxpy=49207
10 printchr$(147)
20 print"read data..."
30 gosub1000
40 i%=0:j%=0:k%=0:n%=0:l%=0:kp1%=0:nm1%=0:info%=0:job%=0
50 t=0:s=0
80 input"n: ";n%
120 dim a(n%,n%),b(n%),x(n%),ipvt(n%)
160 for i=1 to n%:for j=1 to n%
162 a(i%,j%)=-1+2*rnd(1)
168 next:next
170 for i=1 to n%:x(i%)=-1+2*rnd(1):next
190 for i=1 to n%:s=0
200 for j=1 to n%:s=s+a(i%,j%)*x(j%):next
210 b(i%)=s:next
220 printchr$(147):print"n = ";n%:print
250 ti$="000000"
260 gosub450
270 print"sgefa:";ti/60;"sec"
290 job%=0:ti$="000000"
300 gosub700
310 print"sgesl:";ti/60;"sec"
340 s=0:for i%=1 to n%
350 s=s+abs(b(i%)-x(i%)):next
360 print:print"one-norm of error = ":prints
380 print"----------------------"
390 end
450 info%=.:nm1%=n%-1
460 if nm1% <1 then 670
480 for k%=1 to nm1%
490 kp1%=k%+1
500 q%=n%-k%+1:sys isamax,q%,a(k%,k%),l%:l%=l%+k%-1
530 ipvt(k%)=l%
540 if a(l%,k%)=0 then info%=k%:goto650
550 if l%<>k% then t=a(l%,k%):a(l%,k%)=a(k%,k%):a(k%,k%)=t
570 t=-1/a(k%,k%)
580 q%=n%-k%:sys sscal,q%,t,a(kp1%,k%)
600 for j%=kp1% to n%
610 t=a(l%,j%):if l%<>k% then a(l%,j%)=a(k%,j%):a(k%,j%)=t
620 q%=n%-k%:sys saxpy,q%,t,a(kp1%,k%),a(kp1%,j%)
650 next
670 ipvt%(n%)=n%
680 ifa(n%,n%)= 0 then info%=n%
690 return
700 nm1%=n%-1
710 if job%<>0 then 900
720 if nm1%<1 then 850
730 for k%=1 to nm1%
740 l%=ipvt%(k%):t=a(l%,0)
750 if l%<>k% then a(l%,0)=a(k%,0):a(k%,0)=t
760 q%=n%-k%:sys saxpy,q%,t,a(k%+1,k%),a(k%+1,0)
770 next
850 for k%=n% to 1 step-1
860 a(k%,0)=a(k%,0)/a(k%,k%):t=-a(k%,0)
870 q%=k%-1:sys saxpy,q%,t,a(1,k%),a(1,0)
880 next
900 return
1000 rem generated ml loader
1010 sa = 49152
1020 for n = 0 to 355
1030 read a% : poke sa+n,a%: next n
1050 return
1060 data 32,250,192,32,58,193,32,8
1070 data 193,165,247,5,248,240,29,165
1080 data 249,164,250,32,162,187,32,88
1090 data 188,169,92,160,0,32,103,184
1100 data 32,199,187,32,22,193,32,91
1110 data 193,32,9,192,32,253,174,32
1120 data 139,176,170,32,212,187,96,32
1130 data 250,192,32,80,193,32,69,193
1140 data 32,58,193,165,247,5,248,240
1150 data 40,165,253,164,254,32,162,187
1160 data 165,251,164,252,32,40,186,165
1170 data 249,164,250,32,103,184,166,249
1180 data 164,250,32,212,187,32,34,193
1190 data 32,22,193,32,91,193,76,67
1200 data 192,96,32,250,192,32,58,193
1210 data 32,8,193,165,247,133,251,133
1220 data 253,165,248,133,252,133,254,230
1230 data 253,208,2,230,254,165,247,5
1240 data 248,240,41,165,249,164,250,32
1250 data 162,187,32,88,188,169,92,160
1260 data 0,32,91,188,48,13,240,11
1270 data 165,247,133,251,165,248,133,252
1280 data 32,199,187,76,22,193,32,91
1290 data 193,76,141,192,56,165,253,229
1300 data 251,168,165,254,229,252,32,145
1310 data 179,76,44,192,32,250,192,32
1320 data 69,193,32,58,193,165,247,5
1330 data 248,240,30,165,249,164,250,32
1340 data 162,187,165,251,164,252,32,40
1350 data 186,166,249,164,250,32,212,187
1360 data 32,22,193,32,91,193,76,213
1370 data 192,96,32,253,174,32,138,173
1380 data 32,247,183,132,247,133,248,96
1390 data 162,4,169,0,149,92,149,97
1400 data 202,16,249,133,102,96,24,165
1410 data 249,105,5,133,249,144,2,230
1420 data 250,96,24,165,251,105,5,133
1430 data 251,144,2,230,252,96,24,165
1440 data 253,105,5,133,253,144,2,230
1450 data 254,96,32,253,174,32,139,176
1460 data 133,249,132,250,96,32,253,174
1470 data 32,139,176,133,251,132,252,96
1480 data 32,253,174,32,139,176,133,253
1490 data 132,254,96,165,247,208,2,198
1500 data 248,198,247,96
Code: Select all
; ########################################################
; ASSEMBLY LANGUAGE BLAS ROUTINES FOR THE COMMODORE 64.
; TO BE CALLED FROM A C64 BASIC PROGRAM.
; LISTING OF SASUM, SAXPY, ISAMAX, SSCAL ONLY.
; ########################################################
; KERNAL ROUTINES FROM VIC-20 AND C64
; VIC-20 $C000 - $EA7A
; C64 $A000 - $BFFF
; $BBFC C64
; $DBFC VIC-20
;
; $AEFD C64
; $CEFD VIC-20
; NOTATION:
; FP1, FP2 = FLOATING POINT ACCUMULATORS 1, 2
; MEM.AY := '(A, Y)' =FL. PT. NUMBER AT ADDRESS A+256*Y
; MEM.XY := '(X, Y)' = FL.PT. NUMBER AT ADDRESS X+256*Y
;
; ADDRESSES OF (ROM) ROUTINES IN THE BASIC INTERPRETER:
;
EVAL = $AD8A ; GETS & EVALUATES NUMERIC EXPRESSION FROM TEXT,
; RESULT PLACED IN FP1.
COMMA = $AEFD ; CHECK FOR COMMA
INTEGER = $B7F7 ; FP1 -> INTEGER AT (Y,A)
INTFLP = $B391 ; FP1 := FLOAT((Y, A))
PTRGET = $B08B ; GETS NAME AND POINTER TO A VARIABLE, RETURNS
; WITH (A,Y) POINTING TO EXPONENT (OF FIRST ELEMENT IN ARRAY),
; FOR NUMERIC VARIABLE.
LOADFP1 = $BBA2 ; FP1 = MEM.AY
SAVEFP1 = $BBD4 ; MEM.XY = FP1
ADD = $B86A ; FP1 := FP1+FP2
MULT = $BA2B ; FP1 := FP1*FP2
ABS = $BC58 ; FP1 := ABS(FP1) = LSR $66
SQRT = $BF71 ; FP1 := SQRT(FP1)
COMPARE = $BC5B ; COMPARE FP1 WITH MEM.AY
; A=0 IF EQUAL, A=1 IF FP1 > MEM.AY, A=$FF if FP1 < MEM.AY
ADDMEM = $B867 ; FP1 = FP1 + MEM.AY
MULTMEM = $BA28 ; FP1 = FP1 * MEM.AY
STORE = $5C ; FP3 : $5C-$60
FP1TOSTORE = $BBC7 ; FP3 := FP1
NLOW = $F7
NHIGH = $F8
PLOW1 = $F9 ; POINTER (PTR1)
PHIGH1 = $FA
PLOW2 = $FB ; POINTER (PTR2)
PHIGH2 = $FC
PLOW3 = $FD ; POINTER (PTR3)
PHIGH3 = $FE
FP1 = $61 ; FP1 : $61-$66
FP2 = $69 ; FP2 : $69-$6E
;--------------------------------------------------------------------------
* = $C000 ; ASSEMBLE CODE IN SPARE 4K BLOCK STARTING AT $C000
; -------------------------------------------------------------------------
; REAL FUNCTION SASUM (N,SX)
;
; SUM OF ABSOLUTE VALUES OF A VECTOR
;
; SYS ASUM,N,SX(),S
;
SASUM
JSR GETN ; EVALUATE 1ST PARAMETER
JSR GET1 ; (PTR) -> SX()
JSR ZEROSTORE ; SUM:=0
LOOPSA
LDA NLOW ; N=0?
ORA NHIGH
BEQ FINSA ; IF SO, FINISHED
LDA PLOW1 ; SET UP ACCUM.
LDY PHIGH1 ; AND Y-REG.
JSR LOADFP1 ; THEN CALL ROM ROUTINE
JSR ABS ; FP1 := ABS(FP1)
LDA #<STORE
LDY #>STORE
JSR ADDMEM ; FP1 := FP1 + SUM
JSR FP1TOSTORE ; SUM := FP1
JSR BUMP1 ; CALL SUBROUTINES AT
JSR NEQNM1 ; END OF LISTING.
JSR LOOPSA ; LOOP BACK
FINSA
JSR COMMA ; STORE RESULT (FP1) IN
JSR PTRGET ; VARIABLE. THIS CODE CALLED BY
TAX ; ISAMAX ALSO.
JSR SAVEFP1
RTS
; -------------------------------------------
; SUBROUTINE SAXPY (N,SA,SX,SY)
;
; VECTOR=VECTOR+CONST*VECTOR: SY() := SY()+SA*SX()
;
; SYS AXPY,N,SA,SX(),SY()
;
SAXPY
JSR GETN
JSR GET3 ; (PTR3) -> SA
JSR GET2 ; (PTR2) -> SX()
JSR GET1 ; (PTR1) -> SY()
LOOPSAX
LDA NLOW ; N=0?
ORA NHIGH
BEQ FINSAX
LDA PLOW3 ; FP1:=SA
LDY PHIGH3
JSR LOADFP1
LDA PLOW2 ; FP1:=FP1*SX()
LDY PHIGH2
JSR MULTMEM
LDA PLOW1 ; FP1:=FP1+SY()
LDY PHIGH1
JSR ADDMEM
LDX PLOW1 ; SY():=FP1
LDY PHIGH1
JSR SAVEFP1
JSR BUMP2 ; MOVE PTRS TO NEXT
JSR BUMP1 ; ELTS OF SX & SY.
JSR NEQNM1
JMP LOOPSAX
FINSAX
RTS
; -------------------------------------------
; INTEGER FUNCTION ISAMAX (N,SX)
;
; FIND INDEX OF ELT WITH LARGEST ABSOLUTE VALUE IN VECTOR X
;
; SYS ISAMAX,N,SX(),K
;
ISAMAX
JSR GETN
JSR GET1 ; (PTR1) -> SX()
JSR ZEROSTORE ; CURRENT MAX := 0
LDA NLOW
STA PLOW2
STA PLOW3
LDA NHIGH ; PTR2 = N+1-INDEX OF
STA PHIGH2 ; CURRENT MAX ELT.
STA PHIGH3 ; PTR3 = SAVED VALUE OF N
INC PLOW3 ; PLUS 1.
BNE LOOPMAX
INC PHIGH3
LOOPMAX
LDA NLOW ; N=0?
ORA NHIGH
BEQ FINMAX
LDA PLOW1
LDY PHIGH1
JSR LOADFP1 ; FP1 := SX()
JSR ABS ; FP1 := ABS(FP1)
LDA #<STORE
LDY #>STORE
JSR COMPARE ; FP1 & BIGGEST SO FAR
BMI LTE ; IF FP1 < ...
BEQ LTE ; IF FP1 = ...
LDA NLOW ; FP1 is BIGGER
STA PLOW2
LDA NHIGH
STA PHIGH2 ; UPDATE INDEX OF MAX ELT
JSR FP1TOSTORE ; SAVE CURRENT MAX ELT
LTE
JMP BUMP1
JSR NEQNM1
JMP LOOPMAX
FINMAX
SEC ; INDEX := N+1-PTR2
LDA PLOW3
SBC PLOW2
TAY
LDA PHIGH3
SBC PHIGH2
JSR INTFLP ; CONVERT RESULT TO
JMP FINSA ; FL-PT. % GOTO SASUM
; -------------------------------------------
; SUBROUTINE SSCAL (N,SA,SX)
;
; SCALE VECTOR BY A CONSTANT: SX():= SA*SX()
;
; SYS SCAL,N,SA,SX()
;
SSCAL
JSR GETN
JSR GET2 ; (PTR2) -> SA
JSR GET1 ; (PTR1) -> SX()
LOOPSC
LDA NLOW ; N=0?
ORA NHIGH
BEQ FINSC
LDA PLOW1
LDY PHIGH1
JSR LOADFP1 ; FP1 := SX()
LDA PLOW2
LDY PHIGH2
JSR MULTMEM ; FP1 := FP1*SA
LDX PLOW1
LDY PHIGH1
JSR SAVEFP1 ; SX() := FP1
JSR BUMP1
JSR NEQNM1
JMP LOOPSC
FINSC
RTS
; -------------------------------------------
; ROUTINE TO EVALUATE THE PARAMETER N AND STORE THE RESULT
; AS A 16-BIT INTEGER IN (NLOW, NHIGH).
;
GETN
JSR COMMA
JSR EVAL
JSR INTEGER
STY NLOW
STA NHIGH
RTS
; -------------------------------------------
; SET TO ZERO FP3 AND FP1, THE LATTER SO THAT SASUM, SDOT AND
; SNRM2 RETURN 0 WHEN N=0.
;
ZEROSTORE
LDX #4 ; 5 ELTS TO ZEROSTORE
LDA #0
LOOPZ1
STA STORE,X
STA FP1,X
DEX
BPL LOOPZ1 ; BRANCH IF X>=0
STA FP1+5
RTS
; -------------------------------------------
; THE FOLLOWING ROUTINES MOVE A POINTER ONTO THE NEXT ARRAY ELEMENT
;
BUMP1
CLC ; BUMP PTR1 BY 5
LDA PLOW1
ADC #5
STA PLOW1
BCC FIN1
INC PHIGH1
FIN1
RTS
BUMP2
CLC ; BUMP PTR2 BY 5
LDA PLOW2
ADC #5
STA PLOW2
BCC FIN2
INC PHIGH2
FIN2
RTS
BUMP3
CLC ; BUMP PTR3 BY 5
LDA PLOW3
ADC #5
STA PLOW3
BCC FIN3
INC PHIGH3
FIN3
RTS
; -------------------------------------------
; THE FOLLOWING ROUTINES SEARCH FOR A
; NUMERIC VARIABLE (SIMPLE VAR, OF ARRAY ELEMENT) AND
; STORE A POINTER TO THE FIRST BYTE OF THE
; FLOATING POINT NUMBER IN (PTR1), (PTR2) OR (PTR3).
;
GET1
JSR COMMA
JSR PTRGET
STA PLOW1
STY PHIGH1
RTS
GET2
JSR COMMA
JSR PTRGET
STA PLOW2
STY PHIGH2
RTS
GET3
JSR COMMA
JSR PTRGET
STA PLOW3
STY PHIGH3
RTS
; -------------------------------------------
NEQNM1
LDA NLOW ; N := N-1
BNE NM1
DEC NHIGH
NM1
DEC NLOW
RTS
; -------------------------------------------