Hacking GCC and BINUTILS
I guess it would be easier to ask 'what already works with gcc for the Amiga?' ... but even then a correct answer is hard. The issue I'm talking about here: with C++ you can end up with non-working executables.
And I am not sure if I'll succeed ...
... so where am I at?
Native Amiga-GCC
A few weeks ago I managed to build a first version of the native Amiga-GCC which really creates assembly, object and executable files.
And it's kind of slow.
Well, it won't get very much faster, but what if all these executables were resident? Then, with enough memory, one could preload the executables and should gain some time...
simple resident executable in C with libnix
The libnix library is the only one I'm using. There is a comparison page for the libraries: https://github.com/bebbo/amiga-gcc/wiki/Libraries Even the old data shows that libnix is not the worst choice. I guess I should run the tests again and update that page, but that's a different topic.
To create a resident program you simply have to specify -resident or -resident32.
No sooner said than done: and say hello to the guru.
What happened? The autostart feature got hit hard, since the code that initializes the libraries references the directly linked library bases:
extern __far struct lib {
struct Library *base;
char const *name;
};
But now the code must refer to the data segment of the current instance.
So instead of referencing the memory, a pointer to a getter which returns the address from the running instance is provided:
struct Library ** (*get)();
The code needs adjustment to work with this change, and also sfdc should generate the auto open code that matches. I also added
for that new behaviour and ensured that libstubs.a is also generated for each multilib variant.
And - hurray - the simple resident program is starting.
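To make the getter idea a bit more concrete, here is a minimal sketch in portable C. It is not the libnix code: `struct instance_data`, `current_instance`, `fake_open_library` and `open_auto_libs` are hypothetical names, and the per-instance data segment is simulated with a plain struct instead of a base register.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the AmigaOS type, only here so the sketch compiles anywhere. */
struct Library { const char *name; };

/* Hypothetical per-instance data segment: every running copy owns one. */
struct instance_data {
    struct Library *DOSBase;
};

/* Hypothetical: on the real target this would be derived from the base register. */
static struct instance_data *current_instance;

/* Getter: returns the address of the base pointer inside the running instance. */
static struct Library **get_DOSBase(void)
{
    return &current_instance->DOSBase;
}

/* Table entry as described in the text: a getter instead of a direct address. */
struct lib {
    struct Library **(*get)(void);
    const char *name;
};

/* The table is constant, so it can be shared by all instances. */
static const struct lib auto_libs[] = {
    { get_DOSBase, "dos.library" },
};

/* Hypothetical stand-in for OpenLibrary(): hands out one object per call. */
static struct Library pool[4];
static size_t pool_used;

static struct Library *fake_open_library(const char *name)
{
    assert(pool_used < sizeof pool / sizeof pool[0]);
    pool[pool_used].name = name;
    return &pool[pool_used++];
}

/* Open every auto library for one instance; the shared table is never written. */
static void open_auto_libs(struct instance_data *inst)
{
    size_t i;

    current_instance = inst;
    for (i = 0; i < sizeof auto_libs / sizeof auto_libs[0]; i++)
        *auto_libs[i].get() = fake_open_library(auto_libs[i].name);
}
```

Calling `open_auto_libs` for two different `struct instance_data` objects leaves each with its own `DOSBase`, while `auto_libs` itself stays read-only. That is the property a resident program needs.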
simple resident executable in C++ with libnix
So how could this be different? Applying g++ to a hello-world now still yields a working program, but other programs fail badly...
DUH
After some experiments I managed to create a smaller program that fails. The important part
An object with constructor and destructor. And after searching for the culprit: the linker is not able to treat the constructors properly. This is handled with the old stab mechanism.
Since there is no quick solution to fix the binutils linker, I decided to switch from stabs to sections. This requires the use of an end file to terminate the list sections with a zero, but the more complex version is working.
The objdump of the generated file proves that linking is proper now:
00002124 ___INIT_LIST__:
.long 0x00000000,0x0000023c,0x00000058,0x0000053c,0x00000031,0x00000598,0x0000007b,0x000005f4
.long 0x00000030,0x00000894,0x0000004e,0x00001c5e,0x00000062
00002158 __term0__INIT_LIST__:
.long 0x00000000
0000215c ___EXIT_LIST__:
.long 0x00000000,0x00000466,0x00000058,0x000005c0,0x0000007b,0x000006b4,0x00000030,0x000008de
.long 0x0000004e,0x00001e74,0x00000062
00002188 __term0__EXIT_LIST__:
.long 0x00000000
0000218c ___CTOR_LIST__:
.long 0x00000598,0x000020f4
00002194 __term0__CTOR_LIST__:
.long 0x00000000
00002198 ___DTOR_LIST__:
.long 0x00000000,0x0000210c
000021a0 __term0__DTOR_LIST__:
.long 0x00000000
000021a4 ___LIB_LIST__:
.long 0x000005f4,0x00000100
000021ac __term0__LIB_LIST__:
.long 0x00000000
Since these lists all moved into the .text segment, the startup code needed some care too, as did the linker script, to put it all together.
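The point of the zero terminator is that the startup code can walk a list without knowing its length. A minimal sketch of such a walk in C follows. The names (`list_fn`, `run_zero_terminated_list`, the demo constructors) are made up for illustration, and the sketch deliberately ignores the extra values visible in the INIT and EXIT lists of the dump above; it only shows the terminator idea.

```c
#include <assert.h>

typedef void (*list_fn)(void);

/* Call every entry until the zero terminator; return how many were called. */
static int run_zero_terminated_list(const list_fn *list)
{
    int called = 0;

    assert(list != 0);
    for (; *list != 0; ++list) {
        (*list)();
        ++called;
    }
    return called;
}

/* Demo entries that record the order in which they ran. */
static int trace[4];
static int trace_len;

static void ctor_a(void) { trace[trace_len++] = 1; }
static void ctor_b(void) { trace[trace_len++] = 2; }

/* In the real build the trailing zero would come from the end file. */
static const list_fn demo_ctor_list[] = { ctor_a, ctor_b, 0 };
```

Running `run_zero_terminated_list(demo_ctor_list)` calls `ctor_a` and then `ctor_b` and stops at the zero. Without the terminator the loop would run into whatever the linker placed next.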
and gcc still fails
The crash happens no longer during startup. It's ... hold the line ... investigating ...
... ok, it's a bsr somewhere.
With resident or baserel, the gcc compiler also decides to emit jbsr, which may result in a pc-relative call. While it's 1 tick slower (I guess), it needs fewer relocations. But why does it fail?
Different architectures have different conventions for what gets stored in the field that is adjusted while linking. The Amiga wants zeroes in there. But sometimes there aren't zeroes. I considered this already fixed (I modified it a while ago), but that workaround does not work reliably. This time I went to hack the linker so that the stored value is cancelled out:
int off = 0;
if (r->howto->type == H_PC32)
off = bfd_getb_signed_32 (((bfd_byte *)data)+r->address);
else if (r->howto->type == H_PC16)
off = bfd_getb_signed_16 (((bfd_byte *)data)+r->address);
...
relocation = ... - off;
Now if there is an offset, it gets eliminated. And
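For readers without the binutils sources at hand, the same idea can be sketched as standalone C. This is not the BFD code: `read_be_s32`, `read_be_s16`, `enum reloc_kind`, `stored_offset` and `cancel_stored_offset` are hypothetical names, and the rest of the relocation computation stays elided just like in the snippet above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian signed 32-bit value without relying on host byte order. */
static int32_t read_be_s32(const uint8_t *p)
{
    uint32_t v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8) | (uint32_t)p[3];

    if (v >= 0x80000000u)
        return (int32_t)(v - 0x80000000u) - 0x7fffffff - 1;
    return (int32_t)v;
}

/* Read a big-endian signed 16-bit value. */
static int32_t read_be_s16(const uint8_t *p)
{
    int32_t v = ((int32_t)p[0] << 8) | (int32_t)p[1];

    return v >= 0x8000 ? v - 0x10000 : v;
}

/* Hypothetical relocation kinds, standing in for the howto types. */
enum reloc_kind { RELOC_PC16, RELOC_PC32, RELOC_OTHER };

/* Value already stored in the section contents at the relocated field. */
static int32_t stored_offset(const uint8_t *data, size_t address,
                             enum reloc_kind kind)
{
    assert(data != NULL);
    switch (kind) {
    case RELOC_PC32: return read_be_s32(data + address);
    case RELOC_PC16: return read_be_s16(data + address);
    default:         return 0;
    }
}

/* Subtract the stored value so that it no longer contributes to the result. */
static int32_t cancel_stored_offset(int32_t relocation, const uint8_t *data,
                                    size_t address, enum reloc_kind kind)
{
    return relocation - stored_offset(data, address, kind);
}
```

For the bytes `61 ff 00 01 36 ee`, `stored_offset(buf, 2, RELOC_PC32)` returns 0x136ee, which is exactly the kind of non-zero leftover this section is about.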
gcc is starting:
vamos -Hdisable -s100 -m32000 -C40 -- gcc/xgcc -v
Using built-in specs.
COLLECT_GCC=xgcc
Target: m68k-amigaos
Configured with: /home/stefan/amiga-gcc/projects/gcc/configure --host=m68k-amigaos --prefix=/opt/amiga68 --enable-languages=c,c++,objc --enable-version-specific-runtime-libs --disable-libssp --disable-shared --disable-lto --disable-host-shared --disable-libstdcxx --disable-nls --with-gmp-include=/home/stefan/amiga-gcc/amiga/gcc/../gmp --with-gmp-lib=/home/stefan/amiga-gcc/amiga/gcc/../gmp/.libs --with-mpfr-include=/home/stefan/amiga-gcc/amiga/gcc/../mpfr-2.4.2 --with-mpfr-lib=/home/stefan/amiga-gcc/amiga/gcc/../mpfr/.libs --with-mpc-include=/home/stefan/amiga-gcc/amiga/gcc/../mpc-0.8.1/src --with-mpc-lib=/home/stefan/amiga-gcc/amiga/gcc/../mpc/src/.libs
Thread model: single
gcc version 6.5.0b 220317174649 (GCC)
Also resident is working as expected: once loaded the startup time is reduced.
and cc1 still fails
I'm into it and I know it's again a linker issue, where accessing some static function in a .text section from a different section (created via template) results in the wrong offset.
After following some wrong paths, the current guess is: This is related to local symbols not visible in .o files...
... another wrong guess...
start over: the object contains
Disassembly of section .text._ZNK10hash_tableI14int_cst_hasher11xcallocatorE13alloc_entriesEj:
...
5a: 2f02 move.l d2,-(sp)
5c: 61ff 0001 36ee bsr.l 1374c 1374c __Z21ggc_cleared_vec_allocIP9tree_nodeEPT_j
and inside of the text section
Sections:
Idx Name Size
0 .text 00019334
...
0001374c 0001374c __Z21ggc_cleared_vec_allocIP9tree_nodeEPT_j:
1374c: 2f0d move.l a5,-(sp)
and the final file
8e2b76: 2f02 move.l d2,-(sp)
8e2b78: 61ff ffc7 79b6 bsr.l 55a530 55a530 __Z13make_pass_vrpPN3gcc7contextE+0x34bc
which points to a wrong location. The object file was placed here:
.text 0x000000000055a530 0x19334 libbackend.a(tree.o)
adding the offset 0001374c yields 56DC7C and the code is really there:
56dc7c: 2f0d move.l a5,-(sp)
=> the emitted jump offset 000136ee plus the own address 5c plus 2 bytes for the bsr opcode yields 1374C, which **is** the correct offset into the .text segment.
Erasing the offset is not always valid.
Next idea: keep the offset, if the symbol is a section symbol...
8e2b78: 61ff ffc8 b102 bsr.l 56dc7c 56dc7c __Z25gt_clear_caches_gt_tree_hv+0x228
that address is now correct - and the rest? I restricted that approach to references within the same object file, since only there the offset is correct.
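The address arithmetic in this section is easy to get wrong by hand, so here is a small C helper to check it. `bsr_l_target` and `bsr_l_disp` are names I made up; the only assumption is the m68k rule that the displacement of bsr.l is relative to the address of the opcode plus 2.

```c
#include <stdint.h>

/* bsr.l: target = address of the opcode + 2 + signed 32-bit displacement.
   Unsigned wrap-around takes care of negative displacements. */
static uint32_t bsr_l_target(uint32_t insn_addr, uint32_t disp)
{
    return (uint32_t)(insn_addr + 2u + disp);
}

/* Inverse: the displacement needed to reach target from insn_addr. */
static uint32_t bsr_l_disp(uint32_t insn_addr, uint32_t target)
{
    return (uint32_t)(target - (insn_addr + 2u));
}
```

With the numbers from above: `bsr_l_target(0x5c, 0x136ee)` is 0x1374c inside the object file, `bsr_l_target(0x8e2b78, 0xffc779b6)` is 0x55a530 (the start of tree.o's .text, so the stored offset got lost), and `bsr_l_target(0x8e2b78, 0xffc8b102)` is 0x56dc7c, which equals 0x55a530 + 0x1374c.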
cc1 is starting now, but not yet working. But it's resident^^
... just a little insight into how I botch my way around...
illegal addresses
Next stop: bogus code...
00011990 00011990 .L4379:
11990: 307c 0005 movea.w #5,a0
11994: d1e8 0008 adda.l 8(a0),a0
11998: e9d0 4001 bfextu (a0),0,1,d4
loading the constant 5 into a0 and then reading the long at 8(a0), that is at address 0000000d, isn't correct. It seems that this code
shorten = (((((orig_op0)->typed.type))->base.u.bits.unsigned_flag)
|| (((enum tree_code) (op1)->base.code) == INTEGER_CST
&& !integer_all_onesp (op1)));
translates into
...
move.w #5,a0
add.l (8,a0),a0
bfextu (a0){#0:#1},d6
cmp.l d6,d6
...
before reload the insn looks like (if this is the right one)
(insn 1128 1127 1129 91
(set (subreg:SI (reg:QI 551) 0)
(zero_extract:SI (mem:QI (plus:SI (mem/f:SI (plus:SI (reg/v/f:SI 470 [ orig_op0 ])
(const_int 8 [0x8])) [0 orig_op0_965(D)->typed.type+0 S4 A16])
(const_int 5 [0x5])) [13 *_535+5 S1 A8])
(const_int 1 [0x1])
(const_int 0 [0]))) gcc/c/c-typeck.c:10753 359 {*extzv_bfextu_mem2}
(nil))
and reload converts it to
(insn 5322 1127 5324 91 (set (reg:SI 9 a1)
(mem/f/c:SI (plus:SI (reg/f:SI 13 a5)
(const_int 16 [0x10])) [22 orig_op0+0 S4 A16])) gcc/c/c-typeck.c:10753 39 {*movsi_m68k}
(nil))
(insn 5324 5322 5325 91 (set (reg:SI 9 a1)
(const_int 5 [0x5])) gcc/c/c-typeck.c:10753 39 {*movsi_m68k}
(nil))
(insn 5325 5324 1128 91 (set (reg:SI 9 a1)
(plus:SI (reg:SI 9 a1)
(mem/f:SI (plus:SI (reg:SI 9 a1)
(const_int 8 [0x8])) [0 orig_op0_965(D)->typed.type+0 S4 A16]))) gcc/c/c-typeck.c:10753 135 {*addsi3_internal}
(expr_list:REG_EQUIV (plus:SI (mem/f:SI (plus:SI (reg:SI 9 a1)
(const_int 8 [0x8])) [0 orig_op0_965(D)->typed.type+0 S4 A16])
(const_int 5 [0x5]))
(nil)))
(insn 1128 5325 1129 91 (set (reg:SI 4 d4 [orig:551+-3 ] [551])
(zero_extract:SI (mem:QI (reg:SI 9 a1) [13 *_535+5 S1 A8])
(const_int 1 [0x1])
(const_int 0 [0]))) gcc/c/c-typeck.c:10753 359 {*extzv_bfextu_mem2}
(nil))
The insns would be ok if a1 weren't used for everything. It's reload again...
(insn 350 349 351 76 (set (reg/v:SI 83 [ unsigned_op1 ])
(zero_extract:SI (mem:QI (plus:SI (mem/f:SI (plus:SI (reg/v/f:SI 302 [ op1 ])
(const_int 8 [0x8])) [0 op1_460(D)->typed.type+0 S4 A16])
(const_int 5 [0x5])) [32 *_123+5 S1 A8])
(const_int 1 [0x1])
(const_int 0 [0]))) /home/stefan/amiga-gcc/projects/gcc/gcc/c/c-typeck.c:4815 374 {*extzv_bfextu_mem2}
(nil))
... change this and that ...
move.l (16,a5),a1
move.l (8,a1),a0
lea (5,a0),a1
bfextu (a1){#0:#1},d4
which looks correct now, but the result should be
move.l (16,a5),a1
bfextu ([8,a1],5){#0:#1},d4
ok, these insns aren't well defined in the machine description...
... extending the usable memory address (plus undoing the reload change) and voilà:
move.l (20,a5),a0
bfextu ([8,a0],5){#0:#1},d6
move.l (28,a5),a0
bfextu ([8,a0],5){#0:#1},d1
cmp.l d6,d1
this asm makes more sense.
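What `bfextu ([8,a0],5){#0:#1},d6` is supposed to compute can be emulated in a few lines of C. This is only a sketch: memory is simulated as a byte array indexed by address, the function names are mine, and the bit field helper handles only non-negative offsets. It relies on two 68020 rules: bit field offsets count from the most significant bit of the byte at the effective address, and the memory indirect mode ([bd,An],od) first loads a long from An+bd and then adds od.

```c
#include <assert.h>
#include <stdint.h>

/* Big-endian 32-bit load from the simulated memory. */
static uint32_t read_be_u32(const uint8_t *mem, uint32_t addr)
{
    return ((uint32_t)mem[addr] << 24) | ((uint32_t)mem[addr + 1] << 16) |
           ((uint32_t)mem[addr + 2] << 8) | (uint32_t)mem[addr + 3];
}

/* Effective address of ([bd,An],od): the pointer stored at An+bd, plus od. */
static uint32_t ea_mem_indirect(const uint8_t *mem, uint32_t an,
                                uint32_t bd, uint32_t od)
{
    return read_be_u32(mem, an + bd) + od;
}

/* bfextu <ea>{offset:width}: unsigned bit field, bit 0 is the MSB of *base. */
static uint32_t bfextu(const uint8_t *base, uint32_t offset, unsigned width)
{
    uint32_t result = 0;
    unsigned i;

    assert(base != 0 && width >= 1 && width <= 32);
    for (i = 0; i < width; i++) {
        uint32_t bit = offset + i;
        uint32_t b = (uint32_t)(base[bit / 8] >> (7 - bit % 8)) & 1u;

        result = (result << 1) | b;
    }
    return result;
}
```

In terms of the C source, `ea_mem_indirect(mem, op, 8, 5)` is the address of byte 5 inside the node that `op->typed.type` points to, and `bfextu(mem + ea, 0, 1)` picks its top bit, which is where the dumps above place `unsigned_flag`.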
rinse and repeat
After rebuilding everything I start to test gcc again and
vamos -Hdisable -s100 -m32000 -C40 -- gcc/xgcc
xgcc: fatal error: no input files
compilation terminated.
18:36:13.500 exec: ERROR: deallocate: block outside of mem header!
18:36:13.503 exec: ERROR: deallocate: block outside of mem header!
18:36:13.511 exec: ERROR: deallocate: block outside of mem header!
18:36:13.551 exec: ERROR: deallocate: block outside of mem header!
18:36:13.554 exec: ERROR: deallocate: block outside of mem header!
18:36:13.563 exec: ERROR: deallocate: block outside of mem header!
and the program hangs.
After searching a while, I tracked that one down as an issue in vamos and fixed it. But the output is the same - only the program exits properly now. Let's use some more debug output:
13:28:54.176 instr: INFO: PC=00001626 SR=----- USP=000af330 ISP=00000700 MSP=00000780
13:28:54.176 instr: INFO: D0=6763632f D1=0010222d D2=000af368 D3=0007b024 D4=0000007b D5=0000007b D6=00000000 D7=00000067
13:28:54.176 instr: INFO: A0=00100574 A1=00102229 A2=00100574 A3=0010222d A4=000b75f2 A5=000af370 A6=0000133c A7=000af330
13:28:54.176 instr: INFO: F0=0 F1=0 F2=0 F3=0 F4=0 F5=0 F6=0 F7=0 N=1 Z=0 I=0 NAN=0
13:28:54.176 instr: INFO: @0015e8 +00003e exec.library(Traps) 001626 PyTrap #$021 ; base_func
13:28:54.176 exec: DEBUG: DEALLOC: mh_addr=100574, blk_addr=102229, num_bytes=1734566704
13:28:54.176 exec: DEBUG: read: [MH@100574:(100594+003fe0=104574):free=001118,MC=100594]
13:28:54.176 exec: ERROR: deallocate: block outside of mem header!
The memory gets trashed somewhere. Note that d0 contains 6763632f, which is the ASCII text 'gcc/' (the start of the program name gcc/xgcc)
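A quick way to spot text in a register dump is to print the value byte by byte. The following C sketch does that and also replays the bounds check with the numbers from the log. All function names are hypothetical, and the rounding to a multiple of 8 is my assumption about why num_bytes differs from d0 by one; I did not verify it in the vamos source.

```c
#include <assert.h>
#include <stdint.h>

/* Render a 32-bit register value as 4 characters, '.' for non-printables. */
static void u32_to_ascii(uint32_t value, char out[5])
{
    int i;

    assert(out != 0);
    for (i = 0; i < 4; i++) {
        unsigned char c = (unsigned char)(value >> (24 - 8 * i));

        out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
    }
    out[4] = '\0';
}

/* Assumption: sizes are rounded up to the next multiple of 8 bytes. */
static uint32_t round_up_8(uint32_t n)
{
    return (n + 7u) & ~(uint32_t)7u;
}

/* Is the block [blk, blk+size) completely inside [lower, upper)? */
static int block_inside(uint32_t lower, uint32_t upper,
                        uint32_t blk, uint32_t size)
{
    return blk >= lower && blk <= upper && size <= upper - blk;
}
```

`u32_to_ascii(0x6763632f, text)` produces "gcc/", `round_up_8(0x6763632f)` is 1734566704, the num_bytes from the log, and `block_inside(0x100594, 0x104574, 0x102229, 1734566704)` is false. So a piece of a string ended up where a block size was expected.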
So switch to the more informative debug malloc...
... and there is more info
18:47:54.430 instr: INFO: b'__ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI9vec_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_ha':
18:47:54.430 instr: INFO: @0020b0 +078ed0 xgcc_0:code 07af88 move.l A5, -(A7)
...
18:47:54.432 instr: INFO: @0020b0 +078ede xgcc_0:code 07af96 move.l (A0), D0
18:47:54.432 instr: INFO: PC=0007af98 SR=--Z-- USP=000af270 ISP=00000700 MSP=00000780
18:47:54.432 instr: INFO: D0=00000000 D1=00000000 D2=000af2ac D3=00000062 D4=0000007b D5=0000007b D6=00000000 D7=00000067
there is 00000000 in d0 which is used as pointer later...
grep "ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI9vec_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_ha" gcc/*.o
grep: gcc/vec.o: binary file matches
the truncated name plus the code leads to
00000000 00000000 __ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI9vec_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE8iterator5slideEv:
0: 2f0d move.l a5,-(sp)
00000002 00000002 .LCFI108:
2: 2a4f movea.l sp,a5
00000004 00000004 .LCFI109:
4: 2f0a move.l a2,-(sp)
00000006 00000006 .LCFI110:
6: 206d 0008 movea.l 8(a5),a0
0000000a 0000000a .LBB2022:
a: 2268 0004 movea.l 4(a0),a1
0000000e 0000000e .L308:
e: 2010 move.l (a0),d0
10: b3c0 cmpa.l d0,a1
which is called shortly after __exitcpp
maybe initialization isn't done properly?
uhm, yes, __CTOR_LIST__ is in the data segment:
m68k-amigaos-objdump -D xgcc | less
...
000178b4 000178b4 ___CTOR_LIST__:
178b4: 0007 0b88 ori.b #-120,d7
178b8: 0000 0000 ori.b #0,d0
... and bogus. Why? Ask the map file:
.data 0x0000000000091fe4 0x8 /opt/amiga/m68k-amigaos/libnix/lib/libb32/libm020/libm881/libstubs.a(__ctor_list__.o)
0x0000000000091fe4 __CTOR_LIST__
ok, had to remove some old files and fix a typo:
vamos -Hdisable -s100 -m32000 -C40 -- gcc/xgcc
xgcc: fatal error: no input files
compilation terminated.
smooth. With cc1 there is some weird output and winuaeenforcer reports hits...
... took some effort but hunted it down as well: It was a use after free in the original gcc code. The same code is still present in the recent gcc versions...
Now cc1 runs and can be made resident which saves some time (I guess)