Tuning the m68k Cost Model for the 68000

A Practical Guide to Optimizing GCC’s Instruction Heuristics

The Motorola 68000 is a beautifully simple CPU — but its instruction timings, addressing modes, and encoding sizes vary enough that GCC’s default heuristics often produce sub‑optimal code. To address this, amiga‑gcc introduces a new cost‑table system that allows precise tuning of the optimizer without recompiling the compiler itself.
While the tuning scripts are running, I update the chart this and then, it can be viewed here.
This article describes:

the new unified cost table
how external cost files work
how mutation‑based search finds better values
what the logs shows
why this matters specifically for the 68000

The New Unified Cost Table

Traditionally, GCC hard‑codes instruction costs inside C functions. For amiga‑gcc, this has been replaced with a single table‑driven cost model:

one table for speed
one table for size

To achieve this, the cost values have been replaced with lookups into the cost table.
This makes the cost model:

easier to tune
easier to compare
easier to export/import

External Cost Files

To avoid recompiling GCC for every experiment, the cost tables can be loaded from a text file at compiler startup. This feature will be removed once a good enough setting was found.
But for now this enables:

rapid iteration
automated search
easy sharing of tuned profiles

A typical file looks like:

cost_shift_bit=5
cost_shift_rest=9
cost_logic_reg_const=14
...

The compiler reads these values and overrides the built‑in defaults.

Why Tuning Matters for the 68000

The 68000 has:

slow multi‑bit shifts
slow memory accesses
cheap quick‑immediates
cheap addq/subq
expensive longword operations
no barrel shifter
no 32‑bit multiply/divide unit

This means the optimizer must be guided differnetly for speed and size:

discourage multi‑bit shifts or force these
discourage memory‑heavy sequences or make them cheap
encourage register arithmetic if speed matters
encourage quick immediates
avoid unnecessary `ext.w` / `ext.l` chains

A tuned cost model can reduce code size by 10% and improve speed by 5% on real hardware.
Right now no setting was found where all benchmarks benefit, but the overall looks quite good.

Mutation‑Based Search

To find optimal values, a mutation‑driven search process is used:
1. Load the current best cost table. 2. Randomly mutate a small number of parameters. 3. Rebuild benchmarks using the new table. 4. Measure total runtime or binary size. 5. If the score improves → keep the new table. 6. Otherwise → discard it. 7. Repeat tens of thousands of times.
This evolutionary approach converges surprisingly well.

Example Log Snippet

Here is a real mutation event from the search where I started with all values at 0:

=== Iteration 749 / 99999 ===
Ändere 6 Parameter: cost_shift_reg_w=5 cost_mem_plus_reg_disp=4 cost_set_clr_reg=0 cost_const_int_l=17 cost_shift_reg_w=2 cost_const_int_q=5
Neuer Score: 1672439452 (beste: 1672439456)
+++ Übernehme (neue bester Score 1672439452) +++

I started this several times, and you get different results which are close together but not the same. Maybe I run it really long and check what happens.

Interpretation:

six parameters mutated
new score is better (lower is better)
the new table becomes the current best

This process repeats continuously.

Stable vs. Unstable Parameters

During tuning, some parameters quickly prove stable:

changing them never improves results
they belong in the “excluded” list
they form the backbone of the cost model

Other parameters are highly sensitive:

shifts
memory costs
addressing modes
logic operations
subregs

These are the ones the search mutates most often.

Current Status

My searches are still running, and the score continues to improve.
Once the score plateaus for several thousand iterations, the search can be stopped and the best table extracted.

Conclusion

The new cost‑table system transforms the way amiga‑gcc optimizes code for the 68000:

no recompilation needed
fast iteration
automated tuning
reproducible results
architecture‑aware heuristics

Combined with mutation‑based search, this approach produces:

smaller binaries
faster execution
more predictable code generation
better use of 68000‑specific strengths

After all the updated costs will be comitted into the code base.

rev: 1.6