Lets actually try Hybrid Emulation

robinsonb5
Posts: 129
Joined: Fri Jun 19, 2020 8:54 pm
Has thanked: 13 times
Been thanked: 57 times

Re: Lets actually try Hybrid Emulation

Unread post by robinsonb5 »

foft wrote: Tue May 18, 2021 6:45 pm Perhaps we can patch this, to add an extra 8k to each stack then grow downwards from there. So we never share code and stack.
http://aminet.net/package/util/boot/StackAttack2
Even unmodified it makes the stack significantly larger if there's loads of available memory - so it might already help. If you have more than 128 meg free the stack will be 128k - so that alone should be enough to make sure there's no page clash, yes?
foft
Posts: 334
Joined: Thu Dec 03, 2020 11:05 am
Has thanked: 29 times
Been thanked: 120 times

Re: Lets actually try Hybrid Emulation

Unread post by foft »

Since the stack grows down, won't any code placed right after the stack allocation still clash, however large the stack is?

Still, I might install it and then put an 8k variable on the stack at the start of the Dhrystone program.
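
The page-clash reasoning above can be checked with a little arithmetic. The following is only a sketch: the 4 KiB page size, the addresses used with it and the names `PAGE_SIZE_BYTES`, `shares_page` and `stack_clashes_with_code` are all assumptions made for illustration, not values taken from the emulator.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size; the real value depends on the emulator's MMU setup. */
#define PAGE_SIZE_BYTES 4096u

/* True if the non-empty byte ranges [a_lo, a_hi) and [b_lo, b_hi) touch a common page. */
static bool shares_page(uint32_t a_lo, uint32_t a_hi, uint32_t b_lo, uint32_t b_hi)
{
    uint32_t a_first = a_lo / PAGE_SIZE_BYTES;
    uint32_t a_last  = (a_hi - 1) / PAGE_SIZE_BYTES;
    uint32_t b_first = b_lo / PAGE_SIZE_BYTES;
    uint32_t b_last  = (b_hi - 1) / PAGE_SIZE_BYTES;
    return a_first <= b_last && b_first <= a_last;
}

/* The live part of a downward-growing stack is [stack_top - used, stack_top). */
static bool stack_clashes_with_code(uint32_t stack_top, uint32_t used,
                                    uint32_t code_lo, uint32_t code_hi)
{
    return shares_page(stack_top - used, stack_top, code_lo, code_hi);
}
```

Note that the size of the stack allocation does not appear anywhere: only the position of the stack top relative to the code matters. Calling `stack_clashes_with_code(top, used, ...)` with `top - 8192` instead of `top` models the 8k pad at the start of the program.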
robinsonb5
Posts: 129
Joined: Fri Jun 19, 2020 8:54 pm
Has thanked: 13 times
Been thanked: 57 times

Re: Lets actually try Hybrid Emulation

Unread post by robinsonb5 »

Or just allocate a new stack in a completely different part of RAM?
foft
Posts: 334
Joined: Thu Dec 03, 2020 11:05 am
Has thanked: 29 times
Been thanked: 120 times

Re: Lets actually try Hybrid Emulation

Unread post by foft »

I just installed StackAttack2 'as-is'. With my simple loop it now worked every time (in about 5-6 tries). I tried (real) dhrystones and get about 55000. That still isn't quite the 400000, so I wonder if something else is going on there. Anyway, it's much, much better than I got before.

Anyway this seems worth improving. Though of course the issue remains for other programs with data near code, so if it's possible to cut the overhead in qemu that'd be good. I guess the same TLB entry is often hit repeatedly, so caching the last one might save a tree lookup, for instance.
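
A rough sketch of the "cache the last hit" idea, in C. This is not QEMU's actual TLB code: the structures, the 4 KiB page size and the binary search standing in for the tree lookup are all assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_BITS 12u  /* assumed 4 KiB pages */

/* One mapping from a guest page number to a host page number (hypothetical layout). */
struct tlb_entry {
    uint32_t guest_page;
    uint32_t host_page;
};

struct tlb {
    const struct tlb_entry *entries;   /* sorted by guest_page */
    size_t count;
    const struct tlb_entry *last_hit;  /* one-entry cache, NULL when empty */
    unsigned slow_lookups;             /* how often the full search ran */
};

/* Full lookup: a binary search stands in for the tree walk. */
static const struct tlb_entry *tlb_search(struct tlb *t, uint32_t guest_page)
{
    size_t lo = 0, hi = t->count;
    t->slow_lookups++;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (t->entries[mid].guest_page == guest_page)
            return &t->entries[mid];
        if (t->entries[mid].guest_page < guest_page)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}

/* Check the last hit first and fall back to the full search only on a miss. */
static const struct tlb_entry *tlb_lookup(struct tlb *t, uint32_t guest_addr)
{
    uint32_t page = guest_addr >> PAGE_SHIFT_BITS;
    const struct tlb_entry *e;

    if (t->last_hit && t->last_hit->guest_page == page)
        return t->last_hit;
    e = tlb_search(t, page);
    if (e)
        t->last_hit = e;
    return e;
}
```

The `slow_lookups` counter is only there to make the saving visible: repeated accesses to the same page cost one full search in total.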
foft
Posts: 334
Joined: Thu Dec 03, 2020 11:05 am
Has thanked: 29 times
Been thanked: 120 times

Re: Lets actually try Hybrid Emulation

Unread post by foft »

What is the memory map for rtg? I’d like to get that working properly with this.
foft
Posts: 334
Joined: Thu Dec 03, 2020 11:05 am
Has thanked: 29 times
Been thanked: 120 times

Re: Lets actually try Hybrid Emulation

Unread post by foft »

foft wrote: Tue May 18, 2021 9:05 pm What is the memory map for rtg? I’d like to get that working properly with this.
Is it just in Z3 fast ram? So if I fix the caching and start using the 'correct' DDR3 area again rather than malloc'ed ram will RTG work again?
robinsonb5
Posts: 129
Joined: Fri Jun 19, 2020 8:54 pm
Has thanked: 13 times
Been thanked: 57 times

Re: Lets actually try Hybrid Emulation

Unread post by robinsonb5 »

foft wrote: Wed May 19, 2021 7:21 am Is it just in Z3 fast ram? So if I fix the caching and start using the 'correct' DDR3 area again rather than malloc'ed ram will RTG work again?
Without actually checking, I think so, yes.
(MiSTer's RTG was based loosely on the rather hackish and area-constrained RTG solution I put together for TC64 and MiST. On those platforms the RTG region is certainly just an allocated chunk of Fast RAM - I believe, though I haven't checked, that it's the same on MiSTer.)
foft
Posts: 334
Joined: Thu Dec 03, 2020 11:05 am
Has thanked: 29 times
Been thanked: 120 times

Re: Lets actually try Hybrid Emulation

Unread post by foft »

Right, so to get that working again I need to work out how to enable the caching.

In U-Boot we pass these kernel options:
mem=511M memmap=513M$511M
i.e. use the first 511M and reserve the 513M that start at 511M.
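
To spell out the arithmetic: the kernel documents `memmap=nn[KMG]$ss[KMG]` as reserving the region from `ss` to `ss+nn`. The small parser below is only a sketch for checking the numbers; `parse_size` and `parse_memmap_reserve` are hypothetical helper names, not kernel functions.

```c
#include <stdint.h>
#include <stdlib.h>

/* Parse a size such as "511M" or "64K"; returns bytes and advances *s past it. */
static uint64_t parse_size(const char **s)
{
    char *end;
    uint64_t v = strtoull(*s, &end, 0);

    switch (*end) {
    case 'G': case 'g': v <<= 10; /* fall through */
    case 'M': case 'm': v <<= 10; /* fall through */
    case 'K': case 'k': v <<= 10; end++; break;
    default: break;
    }
    *s = end;
    return v;
}

/* Parse "size$start" (the part after "memmap="). Returns 0 on success, -1 on error. */
static int parse_memmap_reserve(const char *arg, uint64_t *start, uint64_t *size)
{
    const char *p = arg;

    *size = parse_size(&p);
    if (*p != '$')
        return -1;
    p++;
    *start = parse_size(&p);
    return *p == '\0' ? 0 : -1;
}
```

For `513M$511M` this gives a start of 0x1FF00000 and a size of 0x20100000, so the reserved region ends exactly at 0x40000000 (1 GiB).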

When I started I was mmapping the 384MB of fast RAM from the 513MB region. Even though I didn't use O_SYNC on the open, it seemed to be uncached.
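
For reference, a user-space mapping of this kind usually looks something like the sketch below. `map_region` is a hypothetical helper, and the device path and physical offset passed to it are up to the caller, since the thread does not give the actual values. One plausible explanation for the uncached behaviour, not verified here, is that the /dev/mem driver on 32-bit ARM maps pages the kernel does not manage as non-cached whether or not O_SYNC is set.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map `len` bytes of `path` starting at byte `offset` (must be page aligned).
 * With use_sync != 0 the file is opened with O_SYNC.
 * Returns the mapping, or NULL on failure.
 */
static void *map_region(const char *path, off_t offset, size_t len, int use_sync)
{
    int flags = O_RDWR | (use_sync ? O_SYNC : 0);
    int fd = open(path, flags);
    void *p;

    if (fd < 0)
        return NULL;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}
```

With a made-up base address, mapping the fast RAM would be something like `map_region("/dev/mem", base, 384u << 20, 0)`, which needs root.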

Perhaps we can pass mem=1024M and tell the kernel to reserve this region some other way.

Alternatively I guess I need to find the kernel API to set cache properties on a page. Then I can set up caching for the chip RAM region too.

Also I guess qemu does something sensible to flush the pages when the host CPU cache settings are changed, though I have no idea.

Some light background reading: https://elinux.org/Tims_Notes_on_ARM_memory_allocation
foft
Posts: 334
Joined: Thu Dec 03, 2020 11:05 am
Has thanked: 29 times
Been thanked: 120 times

Re: Lets actually try Hybrid Emulation

Unread post by foft »

I remembered there was a solution for this on the GP2X.

mmuhack kernel module here (by squidge, modified by notaz)
https://notaz.gp2x.de/dev.php

Perhaps this can be adapted.
Grabulosaure
Core Developer
Posts: 78
Joined: Sun May 24, 2020 7:41 pm
Location: Mesozoic
Has thanked: 3 times
Been thanked: 92 times

Re: Lets actually try Hybrid Emulation

Unread post by Grabulosaure »

RTG on MiSTer is just some DDRAM area directly fetched by the scaler.
The framebuffer address is the rtg_base[31:0] signal = 0x27000000

But it is mapped into the 68K memory map at 0x2000000, in an unused memory area outside ZIII space.
(RTG doesn't need to allocate any fastram memory)

For this hybrid version, mapping RTG into ZIII memory could be simpler though.
foft
Posts: 334
Joined: Thu Dec 03, 2020 11:05 am
Has thanked: 29 times
Been thanked: 120 times

Re: Lets actually try Hybrid Emulation

Unread post by foft »

The control regs are in that region too?
robinsonb5
Posts: 129
Joined: Fri Jun 19, 2020 8:54 pm
Has thanked: 13 times
Been thanked: 57 times

Re: Lets actually try Hybrid Emulation

Unread post by robinsonb5 »

foft wrote: Wed May 19, 2021 1:14 pm The control regs are in that region too?
No, they're at 0xb80100 (Following the not-yet-implemented Akiko CD control regs. MiST and TC64 have the ChunkyToPlanar reg but not the rest of Akiko as yet.)
ByteMavericks
Posts: 53
Joined: Tue Oct 27, 2020 4:52 pm
Has thanked: 69 times
Been thanked: 11 times

Re: Lets actually try Hybrid Emulation

Unread post by ByteMavericks »

Robinsonb5, I should pick this up separately, but is it possible to port the blitter (etc) from fpgaarcade for performance?
robinsonb5
Posts: 129
Joined: Fri Jun 19, 2020 8:54 pm
Has thanked: 13 times
Been thanked: 57 times

Re: Lets actually try Hybrid Emulation

Unread post by robinsonb5 »

ByteMavericks wrote: Wed May 19, 2021 8:37 pm Robinsonb5, I should pick this up separately, but is it possible to port the blitter (etc) from fpgaarcade for performance?
Probably - though the current implementation doesn't resemble the fpgaarcade one at all. I just used their driver as a skeleton for mine, and then mine was adapted for MiSTer.
It might also make more sense to subcontract blitter duty to the ARM, since it has more direct access to the DDR than the FPGA does?
foft
Posts: 334
Joined: Thu Dec 03, 2020 11:05 am
Has thanked: 29 times
Been thanked: 120 times

Re: Lets actually try Hybrid Emulation

Unread post by foft »

So I took a look at the mmuhack. It was ARMv6 based, and the Cortex-A9 is ARMv7.

I found this, which dumps the armv7 page tables. (I had to add it to the kernel module since it needs supervisor mode)
https://github.com/yifanlu/ARMv7_MMU_Dumper

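Anyway I ended up using the kernel API. It's easy to add mmap to the fops (for debugfs you need to use debugfs_create_file_unsafe or it ignores the mmap handler), and then in there you can use something this simple:

    #include <linux/fs.h>
    #include <linux/kernel.h>
    #include <linux/mm.h>

    /* mmap handler: map the requested physical range into user space as cached memory. */
    static int minimig_mmap_cached(struct file *filp, struct vm_area_struct *vma)
    {
        /* pgprot_cached was not defined in this kernel, so it has to be supplied locally. */
        vma->vm_page_prot = pgprot_cached(vma->vm_page_prot);
        printk(KERN_INFO "mmap of %lx into %lx, with cached\n",
               vma->vm_pgoff << PAGE_SHIFT, vma->vm_start);

        /* vm_pgoff is already a page frame number, which is what io_remap_pfn_range expects. */
        return io_remap_pfn_range(vma,
                                  vma->vm_start,
                                  vma->vm_pgoff,
                                  vma->vm_end - vma->vm_start,
                                  vma->vm_page_prot);
    }

The same works with pgprot_writecombine, pgprot_noncached and pgprot_dmacoherent. I don't really know what they all map to on the ARMv7 hardware. Also, for some reason pgprot_cached was not defined...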

Anyway it works, I can map the DDR ram reserved for Z3 fastram and it runs at the same speed as using malloc.

I also tried a couple more things:
i) cached chipram mapping: Led to a corrupted screen. I wonder if qemu does anything to host caching in response to CACR etc.
ii) mapped 16MB from 0x27000000 (phys) to 0x2000000 (amiga mem space) for rtg. When I select rtg modes I lose monitor sync. However, I double checked my rtg setup with the original core and in that case I just get a black screen (but keep sync). So I have a setup problem AND another problem I think. I had used the adf from here (viewtopic.php?p=12186#p12186).
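
For clarity, the RTG window in (ii) amounts to the address translation sketched below. The two base addresses and the 16MB size are the ones quoted in this thread; the macro and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define RTG_AMIGA_BASE 0x02000000u  /* 68K address of the RTG window */
#define RTG_PHYS_BASE  0x27000000u  /* rtg_base: DDR address the scaler fetches from */
#define RTG_SIZE       (16u << 20)  /* 16MB window */

/*
 * Translate a 68K address inside the RTG window to the physical DDR address.
 * Returns false (and leaves *phys_addr untouched) if the address is outside the window.
 */
static bool rtg_amiga_to_phys(uint32_t amiga_addr, uint32_t *phys_addr)
{
    if (amiga_addr < RTG_AMIGA_BASE || amiga_addr - RTG_AMIGA_BASE >= RTG_SIZE)
        return false;
    *phys_addr = RTG_PHYS_BASE + (amiga_addr - RTG_AMIGA_BASE);
    return true;
}
```

So the window covers 68K addresses 0x2000000 to 0x2FFFFFF and physical addresses 0x27000000 to 0x27FFFFFF.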

I'm also wondering why disk access speed is 50% of that with TG68. I'm postponing looking at that until I figure out how to do proper chipram caching.
foft
Posts: 334
Joined: Thu Dec 03, 2020 11:05 am
Has thanked: 29 times
Been thanked: 120 times

Re: Lets actually try Hybrid Emulation

Unread post by foft »

qemu seems to have some cache flushing logic in the core. However I don't see any cache action taken on CACR changes or on CINV/CPUSH. I guess the latter aren't used much on the Amiga anyway, since they are 68040+ only.
ByteMavericks
Posts: 53
Joined: Tue Oct 27, 2020 4:52 pm
Has thanked: 69 times
Been thanked: 11 times

Re: Lets actually try Hybrid Emulation

Unread post by ByteMavericks »

I’ve automated downloading and installing the extensions for rtg (and networking): https://github.com/ByteMavericks/MinimigMiSTer
robinsonb5
Posts: 129
Joined: Fri Jun 19, 2020 8:54 pm
Has thanked: 13 times
Been thanked: 57 times

Re: Lets actually try Hybrid Emulation

Unread post by robinsonb5 »

Remember that the CPU isn't the only thing that writes to chip RAM - you're going to need some kind of bus snooping if you want to cache chip RAM.
foft
Posts: 334
Joined: Thu Dec 03, 2020 11:05 am
Has thanked: 29 times
Been thanked: 120 times

Re: Lets actually try Hybrid Emulation

Unread post by foft »

robinsonb5 wrote: Fri May 21, 2021 8:23 pm Remember that the CPU isn't the only thing that writes to chip RAM - you're going to need some kind of bus snooping if you want to cache chip RAM.
Wouldn’t this be the case on the original hardware? So the cpu is unaware that chip ram has changed and the software has to handle it?

I guess there is a cache inhibit signal for hardware regs.
robinsonb5
Posts: 129
Joined: Fri Jun 19, 2020 8:54 pm
Has thanked: 13 times
Been thanked: 57 times

Re: Lets actually try Hybrid Emulation

Unread post by robinsonb5 »

foft wrote: Fri May 21, 2021 9:12 pm Wouldn’t this be the case on the original hardware? So the cpu is unaware that chip ram has changed and the software has to handle it?

I guess there is a cache inhibit signal for hardware regs.
There is - and I believe the system uses a combination of function code and address to disallow the data cache (if present - 68030+ only, of course) for chip RAM and the hardware regs.
I'm not sure of the exact mechanism; I do know that bus snooping was necessary in order to enable full caching on Chip RAM for the turbo mode on MiST's Minimig.
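
To illustrate why snooping matters, here is a toy model in C. It is not how MiST's Minimig implements it; the direct-mapped cache, its sizes and all names are assumptions chosen to keep the example short.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CHIP_WORDS  1024u  /* size of the toy chip RAM, in 16-bit words */
#define CACHE_LINES 64u    /* direct-mapped, one word per line */

struct chip_system {
    uint16_t ram[CHIP_WORDS];
    uint16_t line_data[CACHE_LINES];
    uint32_t line_addr[CACHE_LINES];
    bool     line_valid[CACHE_LINES];
    bool     snoop;  /* invalidate cache lines on DMA writes */
};

static void chip_init(struct chip_system *s, bool snoop)
{
    memset(s, 0, sizeof *s);
    s->snoop = snoop;
}

/* CPU read through the cache. word_addr must be below CHIP_WORDS. */
static uint16_t cpu_read(struct chip_system *s, uint32_t word_addr)
{
    uint32_t line = word_addr % CACHE_LINES;

    if (!s->line_valid[line] || s->line_addr[line] != word_addr) {
        /* Miss: fill the line from chip RAM. */
        s->line_data[line]  = s->ram[word_addr];
        s->line_addr[line]  = word_addr;
        s->line_valid[line] = true;
    }
    return s->line_data[line];
}

/* Write by a DMA master (blitter, disk, ...): goes straight to RAM, bypassing the cache. */
static void dma_write(struct chip_system *s, uint32_t word_addr, uint16_t value)
{
    uint32_t line = word_addr % CACHE_LINES;

    s->ram[word_addr] = value;
    if (s->snoop && s->line_valid[line] && s->line_addr[line] == word_addr)
        s->line_valid[line] = false;  /* snoop hit: drop the stale copy */
}
```

With `snoop` off, a `cpu_read` that follows a `dma_write` to the same address still returns the old cached value; with `snoop` on, the line is invalidated and the next read fetches the new data from RAM.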
foft
Posts: 334
Joined: Thu Dec 03, 2020 11:05 am
Has thanked: 29 times
Been thanked: 120 times

Re: Lets actually try Hybrid Emulation

Unread post by foft »

Interesting. This is an advantage we have in the fpga, we can snoop Amiga dma writes.

On Amiga it sounds like data cache is disabled on chip ram - according to Gunnar here: http://www.apollo-core.com/knowledge.ph ... 5&z=-AGF3K

We already effectively have instruction cache since qemu compiles the instructions and caches them.

So.... back to the other questions, why is disk access slow? Interrupt latency? More data and code sharing pages?
Oh and what did I do wrong with rtg...
foft
Posts: 334
Joined: Thu Dec 03, 2020 11:05 am
Has thanked: 29 times
Been thanked: 120 times

Re: Lets actually try Hybrid Emulation

Unread post by foft »

So rtg... With the official 20210305 it works. With my own unmodified build it failed. So perhaps the git code was bad for a few days, trying a clean version.
foft
Posts: 334
Joined: Thu Dec 03, 2020 11:05 am
Has thanked: 29 times
Been thanked: 120 times

Re: Lets actually try Hybrid Emulation

Unread post by foft »

foft wrote: Sat May 22, 2021 12:19 pm So rtg... With the official 20210305 it works. With my own unmodified build it failed. So perhaps the git code was bad for a few days, trying a clean version.
Confirmed that RTG doesn't work on a clean checkout. Perhaps because I'm using Quartus 20.1.1. So something else to look into before I try the hybrid one.

Actually I realize that I have 17.0 installed too, so will check with that!
Caldor
Top Contributor
Posts: 930
Joined: Sat Jul 25, 2020 11:20 am
Has thanked: 112 times
Been thanked: 111 times

Re: Lets actually try Hybrid Emulation

Unread post by Caldor »

I have Quartus installed as well, so I can help compile a core if needed, but I will have to check what version of Quartus it is I have installed. I think it might be 17.
foft
Posts: 334
Joined: Thu Dec 03, 2020 11:05 am
Has thanked: 29 times
Been thanked: 120 times

Re: Lets actually try Hybrid Emulation

Unread post by foft »

I built it with Quartus 17 and ... it didn't work. Then I went back to the core that worked before and it stopped working! So not sure what is going on, perhaps related to using VGA instead of HDMI. I'll plug in my HDMI monitor next time I take a look (probably a couple of weeks, busy with other stuff...).
Caldor
Top Contributor
Posts: 930
Joined: Sat Jul 25, 2020 11:20 am
Has thanked: 112 times
Been thanked: 111 times

Re: Lets actually try Hybrid Emulation

Unread post by Caldor »

I have been playing around with Test 9 of this and it does seem more stable. It still cannot boot SysInfo 4.3 or 4.4 directly, but I tried booting into AmigaOS 3.1.4 using the HDF you shared earlier, which I suppose has a better supported 040 library, and from there I could run SysInfo 4.4 directly. It shows the same score as SysInfo 3.24, except now it shows MMU and FPU as being in use.

The speed also seems to be pretty much the same as the regular Minimig running with the 020 core. I tried running some games. I ran Dungeon Master, and it had some audio errors, but it ran at a pretty good speed until it eventually crashed. It froze, I think. I tried Scorched Earth 1.8 and it shows some of the first screens, then before reaching the main game screen it encounters an IO error. I tried Eye of the Beholder AGA, and the intro was pretty slow. The music and audio overall seemed to work well though. No missing sounds like in Castle Master. When the game was fully loaded it seemed to work. I could even load and save a game, although I had suspected it might have issues writing data to disk. But before I could exit the game it froze.

Given the Amiga system has pretty good performance in tests, but sometimes runs very slowly compared to even an Amiga 500, I suspect it has a bottleneck somewhere, possibly when sending data between QEMU and the MiSTer core on the FPGA? I hope it's something that can be helped by increasing a cache somewhere or something.

But I am thinking maybe it's best to first focus on an 020 CPU, which seems less likely to run into FPU and MMU problems, and problems with being an 040 CPU needing 040 libraries when the core otherwise seems to expect an 020 CPU? The ultimate goal would be to get support for MMU and FPU of course, but I think it would probably be easier to debug this solution and its performance and stability with an 020 CPU, which would make it more directly comparable with the regular MiSTer core as well. Also, if it's the PiStorm QEMU software that is being used here, I hear they have only gotten a stable 020 core there as well.

Oh yeah, I also tried AmigaOS 3.2, which has a special disk with generic CPU libraries, but it did not seem to work. I have not tried the 040 libraries from the 3.1.4 HDF that can boot, but I figured I might as well wait and see what the next release of this has to offer.
kolla
Posts: 188
Joined: Sat Jun 13, 2020 7:56 am
Has thanked: 17 times
Been thanked: 33 times

Re: Lets actually try Hybrid Emulation

Unread post by kolla »

I read in PiStorm updates about improved Musashi (20-30%)
I suspect the improvements are only in the pistorm git repo, and not gone upstream.

https://github.com/captain-amygdala/pis ... mmits/main

Did anyone build for Minimig_hybrid yet? :)
foft
Posts: 334
Joined: Thu Dec 03, 2020 11:05 am
Has thanked: 29 times
Been thanked: 120 times

Re: Lets actually try Hybrid Emulation

Unread post by foft »

I did try the musashi from pistorm earlier in the thread, are there more recent changes?

I should upload my last changes to the kernel module and stuff.

While I've been doing other projects, has anyone been trying to improve this? I'll be back to it in a few weeks I reckon.
kolla
Posts: 188
Joined: Sat Jun 13, 2020 7:56 am
Has thanked: 17 times
Been thanked: 33 times

Re: Lets actually try Hybrid Emulation

Unread post by kolla »

We’re all eagerly awaiting your return, hehe ;)
LamerDeluxe
Top Contributor
Posts: 1160
Joined: Sun May 24, 2020 10:25 pm
Has thanked: 798 times
Been thanked: 257 times

Re: Lets actually try Hybrid Emulation

Unread post by LamerDeluxe »

Yes, what a cliffhanger :)