Lets actually try Hybrid Emulation

bbond007

dec_alpha wrote: Tue Dec 28, 2021 4:10 pm
> My memory isn’t the best, but watching TBL Tint, TBL GOA and other old AGA demos, I’m pretty sure they ran smoother on a 030@50, but I suppose that could also be related to the speed of the AGA implementation or a number of other things.

Did said 030@50 have a Motorola 68882 FPU?

dec_alpha wrote: Tue Dec 28, 2021 4:10 pm
> I think I recall people saying that the 030 and 040 provides benefits other than just the clock, not as much love for the 060, but I guess it must have more instructions per clock or something, otherwise why would it perform tasks faster :P

For actual hardware, regarding CPU compatibility with most stuff (like WHDLoad), 030 and 060 are both preferred over 040. Also, 040s tended to run hot.
dec_alpha

bbond007 wrote: Wed Dec 29, 2021 12:55 am
> Did said 030@50 have a Motorola 68882 FPU?

Good call, you're totally right. I used to have a Blizzard 1230 with the FPU (I didn't know the FPU was actually optional, had to look it up).

bbond007 wrote: Wed Dec 29, 2021 12:55 am
> For actual hardware, regarding CPU compatibility with most stuff (like WHDLoad), 030 and 060 are both preferred over 040. Also, 040s tended to run hot.

Yeah, looks like the internet agrees with you on that. I do remember the 040 got a bit hot (I had it water cooled for a while...), but the speed compared to a stock 020, oh my god :)
I ran quite a few demos on it without any issues that I can remember, but I guess WHDLoad might do some weird stuff to enable Kickstart 1.3 software to run on 3.1 hardware.
foft

I've just been trying out newer qemu versions. qemu 6.1.1 works fine, but 6.2.0 fails with an alignment trap. They seem to have made a fair number of changes for the Mac M1, so perhaps it's a regression related to that; I'm trying to find out.
foft

Looks to be this revision... I'll see if I can understand why.

git bisect bad
fa947a667fceab02f9f85fc99f54aebcc9ae6b51 is the first bad commit
commit fa947a667fceab02f9f85fc99f54aebcc9ae6b51
Author: Richard Henderson <richard.henderson@linaro.org>
Date: Thu Jul 29 10:45:10 2021 -1000

hw/core: Make do_unaligned_access noreturn

While we may have had some thought of allowing system-mode
to return from this hook, we have no guests that require this.

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

include/hw/core/tcg-cpu-ops.h | 3 ++-
target/alpha/cpu.h | 4 ++--
target/arm/internals.h | 2 +-
target/hppa/cpu.c | 7 ++++---
target/microblaze/cpu.h | 2 +-
target/mips/tcg/tcg-internal.h | 4 ++--
target/nios2/cpu.h | 4 ++--
target/ppc/internal.h | 4 ++--
target/riscv/cpu.h | 2 +-
target/s390x/s390x-internal.h | 4 ++--
target/sh4/cpu.h | 4 ++--
target/xtensa/cpu.h | 4 ++--
12 files changed, 23 insertions(+), 21 deletions(-)
foft

Well, putting that qemu issue aside (reported to the qemu mailing list), it otherwise runs 'as before' in qemu 6.2.0.

I see the kernel has been updated again, I'd better update the modules too.

Then we're at least back to where it was before :-)
LamerDeluxe

Great to hear, looking forward to any developments on this.
mahen

Just wanted to finish the off-topic by adding one more off-topic question: as others stated here, apart from the OK performance of the Minimig core (and the fast D-CACHE must be disabled to retain max compatibility), I feel the core is very solid / reliable.

The only strange issues I have are:
- sometimes, after changing RTG modes, I get a corrupted screen (like, all gray with some lines / plain rectangles), and I have to turn off the MiSTer completely for it to work again
- I recently noticed that my OS 3.2.1 Shell autocompletion would sometimes freeze the shell when using / tweaking RoadShow
(also I noticed global integer scale must be disabled for the minimig core in the .ini and selected in the OSD instead, otherwise some games like SOTB3 or Uridium were extremely tiny, like a 2x scaling factor)

Does this ring a bell for any of you?
Maybe I should open a new thread about those few remaining issues!
foft

I think something really cool could be done with AROS + DE10. Native CPU running most things on AROS, modern optimised amiga chipset, with enhancements, including tg68 for running old binaries transparently. Anyway different project really...

Going back to qemu, I'm doing some testing (on my x64 box) of the qemu overhead of mixing data/code pages, with the goal of adding some caching layer to qemu to bring this down. I sadly never received a reply from the qemu developers on this part (the m68k maintainer was VERY helpful), so I will have a go at this myself...

Running this code with i in the same page as the code vs. in a different page:
void afunc(int * i)
{
    for ((*i)=0;(*i)!=0xfffff;++(*i))
    {
    }
    return;
}

markw@Eraze:~/amiga/test_loop_elf$ ./main_native
Running with page offset to variable
Elapsed:0.004997s
Running with 8 byte offset to variable
Elapsed:0.110974s
Overhead of data in code page: 22x
i.e. quite an overhead even on real x64 hardware

markw@Eraze:~/amiga/test_loop_elf$ ./main_68k_elf
Running with page offset to variable
Elapsed:0.003115s
Running with 8 byte offset to variable
Elapsed:8.505803s
Overhead of data in code page:2731x
i.e. MUCH larger via qemu. Interestingly the loop is faster than native code!
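
Editor's sketch of how such a timing could be taken, for readers who want to try something similar. It is not the actual test program from this post: the helper names time_afunc and overhead_ratio are made up, the volatile qualifier is an addition so that an optimizing compiler keeps every store, and placing the counter inside the code page (the interesting part) is left to the caller because the post does not show how that was arranged.

```c
#include <time.h>

/* Same loop as in the post; volatile is an assumption added here so the
 * compiler cannot collapse the loop into a single store. */
void afunc(volatile int *i)
{
    for (*i = 0; *i != 0xfffff; ++(*i)) {
    }
}

/* Time one run of afunc() on the given counter, in seconds (wall clock). */
double time_afunc(volatile int *i)
{
    struct timespec start, end;

    timespec_get(&start, TIME_UTC);
    afunc(i);
    timespec_get(&end, TIME_UTC);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Ratio reported as "Overhead of data in code page". */
double overhead_ratio(double slow_seconds, double fast_seconds)
{
    return slow_seconds / fast_seconds;
}
```

Feeding the elapsed times quoted above into overhead_ratio gives roughly 22 for the native run (0.110974 / 0.004997) and roughly 2731 for the qemu user-mode run (8.505803 / 0.003115), matching the printed figures.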
kolla

AROS works fine on MiSTer, at least it did 3-4 years ago :)

Would help to have RTG driver for aros.
foft

Bit of an adventure today, learning about _GLOBAL_OFFSET_TABLE_ in m68k elf files and amiga hunks... I was trying to build the dhrystone binary with the non-amiga gcc compiler then memcpy it into an amiga binary. Didn't quite get there, still crashing for some reason. It all started because I was trying to reproduce a simple example on my x64 box since it'll be easier to debug qemu on there, then got carried away:)

The good news: the 'user qemu' overhead of 2700x I mentioned in my last post is more like 20x-40x in the 'machine' mode of qemu. So not too bad vs a real x64 processor. I see this ratio both running linux m68k and with the test binary in amigaos (both on the mister).

So significant (but lower) overhead when sharing code and data pages. Both for stack vs code and data/bss vs code. This is ok for linux, but bad for amiga OS, due to the memory layout.

How to improve:
i) Stack: Running StackAttack massively reduces stack vs code clashes.
ii) Data/bss: We need an OS loader patch ideally to load data/bss hunks to different pages to the code hunks. Any coders willing to tackle this or know of an existing tool?
iii) Patch qemu: Not yet attempted
foft

Last night I found out that the amiga loader allocates memory for each hunk with the standard allocator. So I wrote a patch for the allocator that allocates an extra 4096 bytes and returns page-aligned memory. With this patch I get 188000 (real) dhrystones.

Ideally we'd only apply this to the binary loader, not general allocation, I guess.
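
Editor's sketch of the "allocate an extra 4096 and round up" idea, as a host-side illustration only. It uses malloc rather than the AmigaOS allocator, and the function name alloc_page_aligned and its raw_out parameter are hypothetical; the real patch is the attached allocp archive, which this does not reproduce.

```c
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE_BYTES 4096u

/*
 * Over-allocate by one page and return the first 4096-byte aligned address
 * inside the block. The unaligned pointer is handed back through raw_out
 * because that is the pointer that must eventually be passed to free().
 */
void *alloc_page_aligned(size_t size, void **raw_out)
{
    void *raw = malloc(size + PAGE_SIZE_BYTES);
    uintptr_t addr, aligned;

    *raw_out = raw;
    if (raw == NULL)
        return NULL;

    addr = (uintptr_t)raw;
    /* Round up to the next multiple of the page size (no-op if aligned). */
    aligned = (addr + PAGE_SIZE_BYTES - 1) & ~(uintptr_t)(PAGE_SIZE_BYTES - 1);
    return (void *)aligned;
}
```

The cost is up to one wasted page per allocation, which is one reason to restrict such a patch to the binary loader rather than every allocation.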
foft

4k alignment patch - experimental - attached.
Run it in the startup-sequence along with StackAttack.
Attachment: allocp.tar.gz (4.79 KiB)
Run dhrystone_amiga -l 1 (increase the 1 for more accuracy). This is a Dhrystone binary, unlike SysInfo.
Attachment: dhrystone_amiga.gz (10.5 KiB)
foft

Some qemu benchmarks now, I need to run ssspeed also with the mister plain core to compare.
[Attached images: IMG_9529.JPG, IMG_9528-2.JPG]
foft

The same for vanilla mister (tg68k). Note the faster chip mem/io speed.
[Attached images: IMG_9530-2.JPG, IMG_9531-2.JPG]
foft

Finally for musashi - fair bit slower than TG68:
[Attached images: IMG_9532.JPG, IMG_9533.JPG]
mahen

Hi!
Thanks for your further investigation on the subject :))
Do you think it will eventually be possible to use this soft 68k and still keep the USB stack as responsive as it is now for low-latency gaming?
Or would this be something we only enable for demos & apps...
Cheers :-)
foft

I'd say you are best off using the hard 68k where this isn't needed. The timings are much more consistent. Though let's see where we get to with this.
foft

The next goal with this is to tighten up interrupt handling...
Try this adf for example with tg68k, musashi and qemu. It's funny!
https://github.com/dirkwhoffmann/vAmiga ... /basicint1
I'm surprised it's quite so bad. With qemu I'm using real interrupts from the fpga to linux, which wake a process waiting on an ioctl.
With musashi it's polled every 10 cycles.
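
Editor's sketch of what "polled every 10 cycles" implies for latency, as a toy model only. It assumes the interrupt line is sampled exactly at cycle counts that are multiples of the poll interval; the function names are hypothetical and nothing here is taken from the Musashi integration itself.

```c
#include <stdint.h>

/*
 * Toy model: the interrupt line is only looked at when the cycle counter is
 * a multiple of poll_interval. Returns the cycle at which an interrupt raised
 * at raise_cycle is first noticed.
 */
uint64_t polled_notice_cycle(uint64_t raise_cycle, uint64_t poll_interval)
{
    uint64_t rem;

    if (poll_interval == 0)
        return raise_cycle;  /* treat 0 as "checked every cycle" */

    rem = raise_cycle % poll_interval;
    return rem == 0 ? raise_cycle : raise_cycle + (poll_interval - rem);
}

/* Cycles between the interrupt being raised and being noticed. */
uint64_t polled_latency_cycles(uint64_t raise_cycle, uint64_t poll_interval)
{
    return polled_notice_cycle(raise_cycle, poll_interval) - raise_cycle;
}
```

Under this model the added latency is bounded by poll_interval - 1 emulated cycles, so polling every 10 cycles costs at most 9 cycles, whereas the interrupt-driven qemu path depends on host scheduling and has no such fixed bound.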
foft

Interrupt latency not looking promising for qemu:
See quote below from here:
https://www.mail-archive.com/search?l=q ... newest&f=1

> What I am trying to do is inject an I/O interrupt in the middle of a
> translation block.

You can't. QEMU will only ever check for and take interrupts
at the end of a TB.
foft

Just wired up reset properly, since I got sick of switching back and forth to linux to restart qemu!

Also did some measurements of bus latency and irq latency from the arm, taking qemu out of the picture, to see the baseline.

Now trying to get rtg working. I figure I need the scaler for rtg, but even if I boot with just vga_scaler on I get nothing on screen (i.e. still pal mode, not in rtg yet). Which is odd, since it definitely syncs, as I get the menu! Just no amiga video.

---
/media/fat# ./bustest
chipram:0xb6ca2000 fastram:0x9eca2000 rtg:0x9dca2000 hardware:0x9cfa2000 rom:0x9cea2000 irqs:0xb6fbd000
chipram:2MB (aga mode with data cache on)
chipram:read 1+0:4.633180MB/s <- 1+0 = 1 byte access, 0 byte offset (was going to do misaligned accesses too)
chipram:read 2+0:8.534389MB/s
chipram:read 4+0:10.167923MB/s
chipram:write 1+0:3.665286MB/s
chipram:write 2+0:7.349861MB/s
chipram:write 4+0:10.062994MB/s
fastram:384MB
fastram:read 1+0:281.980354MB/s
fastram:read 2+0:391.526310MB/s
fastram:read 4+0:607.119424MB/s
fastram:write 1+0:404.759721MB/s
fastram:write 2+0:921.528489MB/s
fastram:write 4+0:1342.469585MB/s
rom:1MB
rom:read 1+0:5.481645MB/s
rom:read 2+0:5.645985MB/s
rom:read 4+0:5.693269MB/s
rom:write 1+0:3.380651MB/s
rom:write 2+0:3.343911MB/s
rom:write 4+0:3.484940MB/s
rom_copy:1MB
rom_copy:read 1+0:33.233632MB/s
rom_copy:read 2+0:75.420469MB/s
rom_copy:read 4+0:136.407039MB/s
rom_copy:write 1+0:169.262018MB/s
rom_copy:write 2+0:651.890482MB/s
rom_copy:write 4+0:1053.740780MB/s
rtg:16MB
rtg:read 1+0:280.465573MB/s
rtg:read 2+0:389.218644MB/s
rtg:read 4+0:605.670591MB/s
rtg:write 1+0:402.738623MB/s
rtg:write 2+0:912.981455MB/s
rtg:write 4+0:1340.707223MB/s
---
/media/fat# ./irqtest
chipram:0xb6c57000 fastram:0x9ec57000 rtg:0x9dc57000 hardware:0x9cf57000 rom:0x9ce57000 irqs:0xb6f97000
ioctl thread:14389
Waiting for IRQs:9090909
ENA:0
ENA:4004
0:0.010961 <- what?
1:0.000023 <- 23us, i.e. about 1/3 of a scanline. Though this is when it arrives at qemu, not when it processes it!
2:0.000023
3:0.000018
4:0.000024
5:0.000017
6:0.000025
7:0.000017
8:0.000023
9:0.000017
10:0.000022
11:0.000017
12:0.000020
13:0.000017
14:0.000023
15:0.000035
16:0.000020
17:0.000017
18:0.000020
19:0.000017
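
Editor's sketch of how read figures like the "read 1+0 / 2+0 / 4+0" lines in the bustest output could be measured. The bustest source is not shown in the thread, so everything here is an assumption: the function name, the use of 1 MB = 1,000,000 bytes, and timing a single pass. In the real test the base pointer would be a mapping of the FPGA address ranges; any ordinary buffer works for trying the code.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Wall-clock time in seconds via C11 timespec_get. */
static double now_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Read len bytes starting at base using accesses of width bytes (1, 2 or 4)
 * and return the throughput in MB/s. base must be aligned for width.
 * Returns -1.0 for an unsupported width.
 */
double read_bandwidth_mb_s(const volatile void *base, size_t len, size_t width)
{
    uint32_t sink = 0;  /* accumulates the values so the reads are used */
    size_t n;
    double start, elapsed;

    if (width != 1 && width != 2 && width != 4)
        return -1.0;

    start = now_seconds();
    if (width == 1) {
        const volatile uint8_t *p = base;
        for (n = 0; n < len; ++n) sink += p[n];
    } else if (width == 2) {
        const volatile uint16_t *p = base;
        for (n = 0; n < len / 2; ++n) sink += p[n];
    } else {
        const volatile uint32_t *p = base;
        for (n = 0; n < len / 4; ++n) sink += p[n];
    }
    elapsed = now_seconds() - start;
    (void)sink;

    if (elapsed <= 0.0)
        return 0.0;  /* too fast for the clock resolution */
    return ((double)len / 1e6) / elapsed;
}
```

The volatile pointers force one bus access per element, which is what makes the access width visible in the results.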
foft

The vga_scaler=1 not working was down to how the core is started. I program it then start the MiSTer binary. If I flash it using the 'Core' option in the mister binary then I get video as well as the menu, if I flash it my way I only get the menu.

Anyway RTG works fine with a core I build from the original git repo, just not after the hybrid adjustments. So I know the software side is fine and the scaler is fine (if I start it using the Core menu). So next up: why doesn't RTG itself work?
foft

Aha, need to reset after programming so that vga_scaler works. Like this:
killall MiSTer
cp Minimig.rbf fpga_config_file.rbf
./fpga_rbf_load
pushd /
/media/fat/MiSTer > /media/fat/MiSTer.log 2>&1 &
popd

./fpga_rbf_load reset_fpga <- *add this*
foft

A more reliable order is to load, reset, then start the mister binary (with some delays).

Anyway I did some experiments with RTG. If I start workbench with qemu then select prefs and a mister RTG screen mode then I get a black screen. While in that state I stop qemu and start another binary to check the rtg regs at 0xb80100 - they are all zero! So I poke in some values and I do get a display of sorts (enable, format, stride, hsize, vsize all wrong, but enough). So it seems the hardware side works to that degree, just odd the driver doesn't manage to set the rtg regs even as far as enable.
foft

Since Musashi is so slow I hadn't bothered much with it. I just added the rtg to its memory map - and it works! So this seems to be a qemu-specific issue rather than a core issue - which I guess I'd also confirmed via testing rtg from linux, but this goes further since it's using P96 from amigaos too.

Now the next question is why rtg fails only under qemu. As I mentioned before, it doesn't even get as far as setting the registers at 0xb80100.
foft

So with qemu rtg works in 68020 mode, just not 68040 mode.

So, I guess 68040 mode can wait. Back to trying to figure out an irq latency improvement.
foft

IRQ latency: seeing typically 2 scan lines, with peaks up to 3 frames! This is latency from linux receiving the irq to qemu processing it.
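
Editor's note with a small sketch for converting between the units used in this thread. It assumes nominal PAL timing (64 microseconds per scanline, about 20 ms per frame at 50 Hz); the macro and function names are made up for illustration.

```c
/* Nominal PAL timing (assumed): 64 us per scanline, 20 ms per frame. */
#define PAL_LINE_SECONDS  64e-6
#define PAL_FRAME_SECONDS 0.02

/* Express a latency in seconds as a number of PAL scanlines. */
double seconds_to_scanlines(double seconds)
{
    return seconds / PAL_LINE_SECONDS;
}

/* Express a number of PAL scanlines as seconds. */
double scanlines_to_seconds(double lines)
{
    return lines * PAL_LINE_SECONDS;
}

/* Express a number of PAL frames as seconds. */
double frames_to_seconds(double frames)
{
    return frames * PAL_FRAME_SECONDS;
}
```

With these numbers, the 0.000023 s readings from irqtest earlier in the thread are about 0.36 of a scanline, the typical 2 scanlines quoted here is about 128 microseconds, and a 3-frame peak is about 60 ms.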
Caldor

Good to see all the activity here again :) I am thinking the first goal should be a stable solution, even if it's still the 68020 or even the 68000, just to have a reference point for later stuff.

Or do we already have something that seems stable?
foft

Seems pretty stable.

Actually I do see it hang when left alone for a bit so need to look into that.
lordoftime79

Has this been made simpler to run now? I could never get it working so went away from it, but now I am finding I want it.
Caldor

lordoftime79 wrote: Tue Jan 11, 2022 11:19 am
> Has this been made simpler to run now? I could never get it working so went away from it, but now I am finding I want it.

I got it to work, but I had to use PuTTY to connect to the MiSTer Linux command line to run the CPU emulation after starting the core.

I think BBond was working on some scripts that would launch the core and then the emulation in a way that would make it work on the MiSTer alone. I never got that to work though.

I hope maybe Sorgelig could help with this at some point. He has mentioned being interested in a hybrid solution for this core I think.