Thread Rating:
  • 1 Vote(s) - 5 Average
  • 1
  • 2
  • 3
  • 4
  • 5
[blog] MMU mini-series
#1
Hello PCSX2 followers,

It's been a while since the last developer blog entry. I would like to resume this old tradition.

I will present you a mini series on the MMU (memory management unit) and virtual memory. Jake and ZeroFrog already wrote some posts relating to this topic:
* http://pcsx2.net/developer-blog/218-so-m...-vtlb.html
* http://pcsx2.net/developer-blog/231-virtual-memory.html

This time, I will explain the goal of MMU and why it is mandatory for modern systems. The MMU is a cornerstone for stability and security. I will also explain why it could be costly for the performance on the native system.

On the next article, I will present you the PS2 address space and how the MMU is really used.


Let's begin with the root, the Operating System (OS). Hardware can run without any OS but it is often a necessary evil. The OS has 2 major goals:

1/ The first goal of the OS is to control privilege access of processes.
Maybe you want to keep a resource for the hardware or OS internal, for example a hardware watchdog. Maybe you only want to check the validity of a command. It could be as simple as only allowing a stop command after a start command.

Let's see a basic example: starting some hardware counters

   

As you can see, without any OS you could easily kill your system. The OS allows the separation of privileged and standard resources.

2/ The second goal is to share the resources between processes
You need to understand that processes are independent for coding simplicity and security.
Let's reuse our hardware counter example

   

This time we want to start the same counter from two different threads. Unfortunately process 1 will be corrupted by process 2. It's pretty bad, process 1 will likely crash. Again the OS is the solution. It replaces the corruption with an error that can be handled by the process.


Concluding this mini presentation, the OS is the gatekeeper of the hardware. The OS is a real god but be aware "with great power, comes great responsibility". System stability depends on the quality of the OS. On the contrary the process is isolated. A bad process will die alone, yes life is tough.




Now let's talk about virtualization. In our 2nd example, the OS throws an error because the counter is already used by another process. However our system can support several counters, it is a shame to wait for that first counter to be available. The process wants to start a counter, it doesn't care which one. So it might be possible to handle this situation better. The solution is virtualization.

Here's the idea, let's create a pool of virtual counters by process and then the OS will remap them to a real physical counter. As a bonus, we don't need to limit the number of counters based on the hardware. You can have 10 virtual counters for only 2 physical counters.

Let's redo the previous example with a virtualization layer in the OS.

   

As a side note, virtualization is a common solution for resource management. On the internet you have a logical IP address instead of using the Mac address directly. HDD access can be virtualized too with a tool like LVM http://en.wikipedia.org/wiki/Logical_Vol...28Linux%29 . Sometimes the whole machine is virtual, see VirtualBox or even PCSX2.




The hardware counter was an example but the subject of the blog is the MMU so let's get back to virtual memory. And yes you got it right, virtual memory is the same concept as we discussed earlier. The process will access virtual memory. The OS will translate it to the physical address.

I said that the OS will remap the virtual memory to physical memory but that is not exactly correct. Memory access is very common in a process, let's say every 2 or 5 instructions. Imagine if the kernel did a lookup of the physical address every 4 instructions, it would be slow as hell. Actually it is even worse because the program counter (PC) is also a virtual address. Software is deemed to be slow, it needs to be hardware accelerated.

Here comes the MMU that allows the CPU to do the conversion on the fly. The MMU uses a dedicated cache named TLB (Translation Lookup-aside Buffer). However a hardware cache is very costly, you can't store a physical address for each logical address, besides remember that the virtual memory could be bigger than the physical memory. So the memory address space is split in pages, often 4K but various CPUs can support several page sizes. The PS2 supports 7 sizes of pages from 4K to 16MB.

A TLB miss will cost you hundreds of CPU cycles and for good reason, a 4GB virtual space requires 1 million table entries of 4K-pages. A common trick to reduce cache misses is to increase the page size so you can fit "more memory" in the TLB cache. However there is a terrible issue, the waste of resources. One page is an atomic resource, that means either the page is free or the page is locked by the process. If the process allocates a single byte on a 16MB page, nearly 16MB is lost for other processes. It's not a real solution for the PC which uses hundreds of processes but that is very different for the PS2. We will discuss this next time.

See below a typical use model of the MMU:

   


The MMU is a nice and complex solution to handle memory access. But there is more. You can attach some flags to each page. For example on a modern PC, you have readable/writable/executable. The PS2 doesn't support all those flags, yes it's not modern anymore, however it has another class of flag that controls cache behavior: cacheable/uncached accelerated/uncached. Note for completeness sake that the PS2 also has read only flags. It is a powerful mechanism but we will discuss the implications on the next chapter.


Mini quiz:
* The first 512KB of RAM is not virtually mapped because it is at the location of the OS, do you know how the kernel will access this location?
* A hardware component needs to access the memory without the CPUs intervention. Do you know which one? Do you know how the addresses are handled?
Reply

Sponsored links

#2
Nice.
[Image: vwah44]
Gaming: Intel i7 3770k @ 4.2Ghz | R9 290 | 16GB RAM | 480GB(240GB+240GB RAID0) SSD | 3 TB HDD | 1 TB HDD | 500GB HDD
Server: AMD FX 6300 @ 4.4Ghz | GTX 670 | 16GB RAM | 240GB SSD | 320GB HDD
PCSX2 General Troubleshooting FAQ
Reply
#3
Thanks. Let's hope I found some time to do next parts Tongue2

A warm thanks to Bosit for his English review Smile
Reply
#4
Nice article, I'd really love to see the dev blogs updated once again.

I've worked with virtual memory in a few projects recently and it's no fun. As for your mini quiz...

Quote:The first 512KB of RAM is not virtually mapped because it is at the location of the OS, do you know how the kernel will access this location?

The PS2 (and MIPS in general) uses bits 31-29 of the address space to differentiate between user/superuser/kernel modes. PS2 games run everything in kernel mode, which allows for the low half of the address space (0x00000000 - 0x7fffffff) to be TLB mapped. The next 512 megabytes (0x80000000 - 0x9fffffff) are directly mapped to the low 512 megabytes of physical address space. The next 512 megabytes (0xa0000000 - 0xbfffffff) also map to the low 512 megabytes of physical address space and have slightly different usage (which I can't recall at the moment).

If your kernel is located at physical address 0x00000000, you would have to access it via 0x80000000 or 0xa0000000. In the case of the PS2, the kernel (BIOS) is located at physical address 0x1fc00000, meaning you access it at 0xbfc00000 in the address space. That also happens to be what the PS2's program counter (both EE and IOP) is initialized to upon start up.

Quote: A hardware component needs to access the memory without the CPUs intervention. Do you know which one? Do you know how the addresses are handled?
Are you talking about Direct Memory Access (DMA)?
Main Rig: i5 4670k, 16GB RAM, Nvidia 770 GTX, Windows 8.1/Arch
Main Laptop: Toshiba Kirabook. i5 4200U, 8 GB RAM, windows 8.1/Arch
Reply
#5
Quote:The PS2 (and MIPS in general) uses bits 31-29 of the address space to differentiate between user/superuser/kernel modes
That not exactly correct Tongue2 Address doesn't change the mode of the CPU.
Reply
#6
"11-25-2014"

I hope this isn't considered thread resurrection.
//But I have no idea how to answer your quiz questions.

ok I found it in one of the blogs (Reserved!)
Reply




Users browsing this thread: 1 Guest(s)