Aros/Developer/KernelBasics
Booting
Where is serial debug configuration handled now, since most of what had been fixed on x86-64 appears to have been removed? In arch/all-native/bootconsole. I moved this code into a shared library because it is used by both bootstrap and kickstart.
If GRUB (for instance) is now loading modules at memory address 0, as you mentioned in another post, it will trash the RSDP pointer, so we won't be able to locate the ACPI tables in memory. Loading modules from address 0 doesn't have anything to do with my changes. This is how GRUB works: it places modules into the best-fitting memory region. Of course, it should exclude all BIOS information from the available address ranges. If your kickstart overwrites BIOS information, it's a bug in GRUB's memory management. Previously the kickstart wasn't split into small parts, and the single big module simply didn't fit into the 640KB of conventional memory.
Also, if the memory pages are protected before this, then we can't even scan for it. Only two regions are protected:
- One page at address 0.
- Kickstart's read-only regions (.code + .rodata + struct KrnBootPrivate).
There are parts of the ACPI/SMP configuration that must be done in kernel.resource (mostly table parsing and the like, to configure the APs, since they need access to the structures and code used in kernel.resource). I know this. The ACPI parser seems to be called before the exec/kernel bases are created.
Your problem with reboot seems to be associated with the serial output code. I copied it from the old x86-64 code with as few changes as possible. To prove that it causes the bug, you can try to boot without serial debug. It does appear to be the cause: I can now boot a little further (it still crashes, likely due to my own changes) and see more debug output on screen. However, adding the debug=serial:0@115200 line makes it lock up again at the same point (jumping into the kernel is the last displayed output).
Text mode output is also untested. In fact, I run fine in VESA (framebuffer) mode set up by EFI GRUB. The difference from the PC: the Multiboot information contains no VESA mode and controller info, only generic framebuffer data. The bootstrap makes up the VESA mode info from it, since the AROS kickstart expects this. Perhaps there are some bugs in the mode setup. First I would make sure that all output (text mode screen, VBE mode screen, and serial port) works in the bootstrap. If you reach "Leaving 32-bit environment" in all modes, it's okay and libbootconsole works. The kickstart reinitializes the output; it uses the same code but another init routine, which parses the boot taglist instead of the Multiboot info. You can also find something useful in test/boot: a small program to test libbootconsole. If it doesn't work, something is wrong there. Bugs in the taglist parser can be debugged by initializing libbootconsole manually with hardcoded parameters, then dumping the taglist and related structures. This is how I debugged it.
And one more hint: serial output may generate interrupts. Perhaps this is the problem.
OK. After some slow debugging on native x86_64 (repeatedly adding a halt and rebuilding to see the gfx debug output...), it seems the following problems exist so far. In kernel/kernel_startup.c, kernel_cstart():
/*
 * Fill zero page with garbage in order to detect accesses to it in supervisor mode.
 * For user mode we will disable all access to it.
 */
MUNGE_BLOCK(NULL, 0xABADCAFE, PAGE_SIZE);
Causes a GPF, and after ..
core_ProtKernelArea(0, PAGE_SIZE, 0, 0, 0);
Pretty much any access to the memory list causes a page fault. I haven't looked into it much yet, but either the memlist is located in the first page, or the memheader for the low memory is in the first page, so everything that tries to allocate memory ends up hitting it and crashing.
OK, most of these issues seem to stem from using the first page of memory (or registering it when initialising the memlist from the mmap). I think it is best to just adjust the start address of the regions being registered to make sure they are > PAGE_SIZE.
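The adjustment suggested above could look roughly like the following. This is only a minimal sketch, not the actual AROS code: struct mem_region and clamp_region_above_zero_page() are hypothetical names, and a 4 KiB page is assumed. It keeps every registered region at or above PAGE_SIZE, i.e. outside the protected first page.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Hypothetical descriptor for one usable range taken from the memory map. */
struct mem_region {
    uintptr_t start;  /* first byte of the range */
    size_t    length; /* size of the range in bytes */
};

/*
 * Move the start of a region up so that it never covers the first page.
 * Returns 1 if something is left to register, 0 if the region lay
 * entirely inside the first page (or was empty) and should be skipped.
 */
int clamp_region_above_zero_page(struct mem_region *r)
{
    size_t skip;

    if (r->start >= PAGE_SIZE)
        return r->length != 0;

    skip = PAGE_SIZE - r->start;
    if (r->length <= skip) {
        r->length = 0;
        return 0;
    }
    r->start  += skip;
    r->length -= skip;
    return 1;
}
```

The code that walks the mmap would call this for every entry before adding it to the memory list, and skip entries for which it returns 0.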
As for the debug problem: serial debug doesn't stop only when we leave the bootstrap. It also stops if there is more than 4k of output from the bootstrap. It appears the memory list that is passed from GRUB -> Bootloader -> Kernel gets corrupted before it is passed into kernel.resource. I will try to dig a bit deeper to see where this is occurring.
If I try in VMware, booting with VESA ends up crashing in kernel.resource. However, if I boot in VGA mode, it displays only as much info as is output by serial debug (that is, the first 4k of debug, or up to the point of leaving 32-bit mode) and nothing else seems to happen (i.e. VESA causes the virtual CPU to die or halt, while VGA just seems to sit in limbo).
Do you have any custom configure or compiler/linker flags that might upset the flags needed to achieve 64->32-bit cross-compilation? Does anyone know what "i386:x86-64 architecture" means? AFAIK, it means it's 64-bit. I suspect the smpboot file has been compiled as 64-bit while it is being linked into 32-bit code. The only real difference I can see in the macros is that rule_compile defaults to the target compiler, while rule_link_binary uses the kernel linker.
Soft reboot is currently broken on pc-i386. When I do a soft reset before a graphics mode is entered, I can see that no modules are found and booting finishes at a SAD prompt. I added a debug statement locally that prints out the address that exec_RomTagScanner() starts searching at: on QEMU, it's 0x07e70000 on initial boot, but 0x07fec170 on soft reboot. Is this normal? No, it's not normal. I need to add some code to i386 native exec, which saves a copy of boot message taglist from bootstrap. Without that, soft reboot will remain broken...
Scheduling
Exec's quantum value is the amount of time a task may run? And this value is based on the VBlank rate? Yes, it is. On every VBlank the quantum is decremented. When it reaches zero, tasks are switched.
So the smallest quantum a task can have is 1/60th of a second (if it does not relinquish control itself)? In fact 1/50. And at most 60 "tasks" will get a chance to run if none relinquish control (and the quantum = 1)? 60 per second, not a maximum of 60 tasks in the system. Yes, exactly: 60 times per second "the next waiting" task will get a chance to run. It doesn't depend on the number of tasks. Just one task runs for 4 VBlanks. AFAIK it is even worse, since tasks get quantum = 4 to start with, so it's 15 task changes per second. Yes and no. If a task doesn't make any OS calls that make it sleep or wait, it'll be low. For comparison, though, Linux used to default to 100 timer interrupts a second years ago; now it defaults to 250 and is configurable up to 1000 timer interrupts a second. Tasks can get longer timeslices than one quantum (one jiffy in Linux-speak...) though.
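The arithmetic in this exchange can be written down as a small sketch. The helper names are made up for illustration; only the numbers (50 or 60 Hz VBlank, quantum of 1 or 4) come from the discussion above.

```c
/*
 * Task switches per second when no task gives up the CPU voluntarily:
 * the quantum is decremented once per VBlank, so a switch happens every
 * `quantum` VBlanks.
 */
unsigned switches_per_second(unsigned vblank_hz, unsigned quantum)
{
    return quantum ? vblank_hz / quantum : 0;
}

/* Length of one full timeslice in whole milliseconds (rounded down). */
unsigned timeslice_ms(unsigned vblank_hz, unsigned quantum)
{
    return vblank_hz ? (1000u * quantum) / vblank_hz : 0;
}
```

With a 60 Hz VBlank and a quantum of 4 this gives the 15 task changes per second mentioned above; with a 50 Hz VBlank and a quantum of 1, a timeslice is 20 ms.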
I have not had a chance to look into how the vblank interrupt timer is implemented, but I would suspect we should have some means for it to register "A" timer source (recording its precision somehow), and let exec choose which to use during boot from the available ones.
Additionally - it should be possible for other timer sources to register after exec has chosen one - and they should be able to replace the currently used one.
I was thinking along the lines of the timer in use having a hook that it stores (which basically sets up the next "event"), and each time the event triggers it, it calls this hook to setup again. If another timer (with better precision) loads it could then just forbid multitasking, replace the quantum value with one relevant to it, and replace the pointer to the existing hook with one used to setup its own events.
The next event by the previous timer source will then cause it to come to life..
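A rough sketch of the hook-swapping idea described above follows. Every name here is hypothetical and nothing like it is claimed to exist in exec or kernel.resource; forbid_multitasking() and permit_multitasking() are empty stand-ins for Forbid()/Permit(), and count_events() is only a sample hook.

```c
#include <stddef.h>

/* All names below are hypothetical; none of them are AROS APIs. */
struct timer_source {
    const char   *name;
    unsigned long frequency_hz;          /* precision: ticks per second */
    void        (*arm_next)(void *data); /* hook: sets up the next event */
    void         *data;
};

struct timer_source *active_source; /* source that drives scheduling */
unsigned             active_quantum; /* quantum, in ticks of that source */

/* Stand-ins for Forbid()/Permit(); a real kernel would stop task switching. */
static void forbid_multitasking(void) {}
static void permit_multitasking(void) {}

/* Sample hook: just counts how often it was asked to arm the next event. */
void count_events(void *data)
{
    ++*(unsigned *)data;
}

/*
 * Offer a timer source. It replaces the current one only if it is more
 * precise. quantum_ms is the wanted timeslice, converted here into ticks
 * of the new source. Returns 1 if the source was taken, 0 otherwise.
 */
int register_timer_source(struct timer_source *src, unsigned quantum_ms)
{
    if (active_source && src->frequency_hz <= active_source->frequency_hz)
        return 0;

    forbid_multitasking();
    active_quantum = (unsigned)((src->frequency_hz * quantum_ms) / 1000ul);
    if (active_quantum == 0)
        active_quantum = 1;
    active_source = src;
    permit_multitasking();
    return 1;
}

/* Called on every timer event, whichever source fired it. */
void timer_event(void)
{
    if (active_source && active_source->arm_next)
        active_source->arm_next(active_source->data);
}
```

Because timer_event() always calls the hook of the currently active source, the last event delivered by the old source is what arms the new one, which is the "come to life" step described above.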
I also think we would need to use some value (à la BogoMIPS) as well as the granularity of the quantum to decide on a suitable real value for the quantum itself.
If my deductions are correct, the values used (and the granularity of the quantum) don't sound particularly suited to modern processors, so I am wondering if there is a way to set a finer granularity for the quantum. That would be really cool. What happens now with GL or SDL demos is that while they run, the system becomes unresponsive (especially for tasks that involve I/O, like opening drawers in Wanderer or loading other software); even moving windows is slower. However, when I set the priority to -1, the frame rate of those demos is only slightly affected but the system becomes fully usable again. You would need a timer of higher frequency. Note two things:
- You can't raise frequency of VBlank, otherwise you'll break delays in many software.
- Some architectures (currently only hosted) may have a high-frequency periodic timer. It is used to measure intervals by timer.device and also to emulate VBlank. However, for compatibility reasons, Quantum is still measured in VBlanks.
Native architectures (except Amiga) don't have such a timer because their hardware (at least what is supported by AROS) has only a single timer, which is controlled by timer.device, and timer.device emulates VBlank itself. Amiga features a separate hardware VBlank timer, which can serve as an example of a separate periodic timer; however, its frequency can't be raised. The i386-pc port has a problem (because code was not merged). It still uses the old INTB_TIMERTICK hack, which is actually non-periodic (because of how timer.device works), and Elapsed time is decremented in its handler, not in the VBlank handler. This way, the quantum is actually floating on this port, depending on active timerequests. I would say this needs to be fixed. The quantum should be measured in some deterministic units. The standard Amiga scheduler measures it in VBlanks, so my common code uses the same units.
This means you'll have to emulate your high-frequency timer via timer.device in a manner similar to VBlank. I have heard the word "HPET" (high-precision event timer); however, I don't know what hardware features it. And again, at least on UNIX-hosted you WILL have to emulate it, because UNIX has only one timer (SIGALRM). You are free to experiment with this; it's an undiscovered area. Remember that kernel.resource allows different schedulers (KrnSetScheduler() function), so you are welcome to implement your own scheduler. HPET is the standard high-precision timer on x86 and x86_64, and is an evolution of the traditional PC architecture's 8254-based timer.
Wikipedia: "[http://software.intel.com/en-us/forums/showthread.php?t=52108 HPET] is meant to supplement and replace the 8254 programmable interval timer and the RTC's periodic interrupt function. Compared to these older timer circuits, the HPET has higher frequency (at least 10 MHz) and wider 64-bit counters (although they can be driven in 32-bit mode)." There is also a specification for high-resolution POSIX timers, the IEEE 1003.1b REALTIME spec.
An alternative way of improving this is to add an option to temporarily boost a task that is returning from a wait on certain types of events. E.g. a task that is waiting on a port and wakes up to process a message from a high pri task could get assigned a larger quantum as a one off to handle it (such as processing messages from intuition / input handlers). Linux sort-of halfway tries to do this, but they're allergic to "special cases" and since in Linux the GUI is "just another process" it's extremely hard for them to do it in a way that doesn't have ugly corner cases, but if using knowledge of the source of an event you could make the system always (to some extent) prioritize user initiated events. It's still a good idea to reduce the length of the quantum on higher speed CPU's, though. And/or making it configurable.
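The one-off boost idea could be sketched as below. This is purely illustrative: the structure, the function names and the boosted value of 8 are assumptions, and only the default quantum of 4 comes from the discussion above.

```c
#define DEFAULT_QUANTUM 4u /* from the discussion above */
#define BOOST_QUANTUM   8u /* hypothetical one-off boost */

/* Hypothetical per-task state: timer ticks left in the current timeslice. */
struct task_slice {
    unsigned remaining;
};

/* Called when a task wakes up; user-initiated events get a longer slice. */
void wake_task(struct task_slice *t, int woken_by_user_event)
{
    t->remaining = woken_by_user_event ? BOOST_QUANTUM : DEFAULT_QUANTUM;
}

/*
 * Called once per scheduler tick for the running task. Returns 1 when the
 * timeslice is used up and the task must be switched out. The next slice
 * is always the default one, so the boost applies only once.
 */
int slice_tick(struct task_slice *t)
{
    if (t->remaining > 0)
        t->remaining--;
    if (t->remaining == 0) {
        t->remaining = DEFAULT_QUANTUM;
        return 1;
    }
    return 0;
}
```

Resetting to DEFAULT_QUANTUM on expiry is what keeps a boosted task from holding the longer slice permanently.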
I am trying to clean up the LAPIC etc. code in the x86_64 port at the moment (just in my local tree) so that I can start playing with it. My first "goal" is to finish configuring the APs, so that the APICs, GDT, TLBs, PTEs, etc. are configured and interrupts are handled on the relevant cores. Please update your tree. I rewrote the x86-64 kernel and merged the code. BTW, I remember there was an acpi.resource in development, IIRC in some branch. What about moving the ACPI stuff there from kernel.resource? It's not necessary to run secondary cores from the same point as the bootstrap one.
- Use XT timer for timer.device.
- Use HPET as high-frequency periodic timer in kernel.resource. VBlank will be emulated by kernel then.
Current support for kernel's periodic timer is in kernel_timer.c. I already told you about scheduler switching functions. Nothing more. If something more is needed, it should be developed by someone who will be implementing the thing. :)
Perhaps. However, this particular task can be solved in a simpler way. You just need a periodic timer independent of timer.device. Assuming its frequency is a multiple of VBlank, you can use the existing kernel.resource code. In hardware implementations of timer.device, VBlank is emulated by repeatedly queuing a timerequest. kernel.resource's emulation is not implemented on those architectures. Additionally, the x86-64 timer.device ensures that it is off by setting KAttr_VBlankEnable to FALSE. In fact, all implementations should do it. By the way, you can try to experiment on hosted by raising its default periodic timer frequency (currently it is equal to VBlank by default) and using it as a source for scheduling. Invent some algorithm for exec.library to select it. You can ask kernel.resource about this interrupt using the KrnGetSystemAttr() function with the KAttr_TimerIRQ attribute. The frequency of this timer is put by the kernel into SysBase->ex_EClockFrequency; you can rely on this fact. This is also used by the generic timer.device implementation, which doesn't drive any hardware and simply counts timer interrupts.
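Deriving VBlank from a faster periodic timer by counting interrupts can be sketched as follows. This is not the code in kernel_timer.c; the structure and function names are hypothetical, and the sketch assumes the timer frequency is an exact multiple of the VBlank frequency, as stated above.

```c
/* Hypothetical state for deriving VBlank from a faster periodic timer. */
struct vblank_divider {
    unsigned timer_hz;  /* frequency of the periodic timer */
    unsigned vblank_hz; /* emulated VBlank frequency, e.g. 50 */
    unsigned count;     /* timer ticks seen since the last VBlank */
};

/*
 * Call once per timer interrupt. Returns 1 when an emulated VBlank is due.
 * Assumes timer_hz is an exact, non-zero multiple of vblank_hz.
 */
int vblank_divider_tick(struct vblank_divider *d)
{
    unsigned ticks_per_vblank = d->timer_hz / d->vblank_hz;

    if (++d->count < ticks_per_vblank)
        return 0;
    d->count = 0;
    return 1;
}
```

With a 1000 Hz timer and a 50 Hz VBlank, every 20th interrupt produces a VBlank. If both frequencies are equal, every interrupt does, which matches the current hosted default described above.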
Building pc-i386 on X86
Well, I was using the Gimmearos script, which does:
export CC=$gccversion" -m32"
"../$srcdir/configure" --target=pc-i386 --with-portssources="$curdir/$portsdir"
The first line shouldn't be necessary. The standard configure script sets "-m32" if appropriate. While I can't see how this would cause your error, maybe there are other variables that are clobbered by gimmearos.
Have you tried changing ..
%rule_compile basename=smpbootstrap targetdir=$(OBJDIR)
to ..
%rule_compile basename=smpbootstrap targetdir=$(OBJDIR) compiler=kernel
?
Should -melf_i386 be declared in config/make.tmpl for %rule_link_binary? Not in the macro at least, but possibly passed in.