WWDC Sessions: Day 5
* Writing AltiVec Code *
Motorola gave a presentation on the details of AltiVec development. Vectorizing your code is a lot different from your traditional notion of scalar code. Sim G4 is the simulation tool to use to develop code before hardware arrives. They stressed that it would be good to get up to speed on AltiVec, since it will be an important part of Apple's strategy. They went into a lot of detail on the issues with vectorizing code, as well as on managing the AltiVec unit (making sure it is fully utilized, consistent, ...).
The SDK on Apple's site is a good place to start. Given all of the sessions provided, Apple wants to make AltiVec more compelling than MMX has been on the Wintel side. It is still a problem that no one knows for sure when the hardware is coming out, but it appears it will be sooner rather than later, with the 3rd party CPU card vendors ready to go. Software that takes advantage of it will be a different story.
One thing they are looking into is a vectorized type library, which would make it easy to provide vectorized versions of applications that could also work straight out of the box on non-AltiVec machines. It would make it a lot easier to have a common code base across both G3s and G4s. The issue with this is that one runs into some of the problems that existed with the move from m68k to PowerPC. There was a lot of stuff on the floating point side that just made no sense to emulate on the m68k (i.e. SoftFPU kind of things). Going the other way, m68k FPU binaries interpreted on the PowerPC would have had the same disparity in speed. So there are issues, but they are willing to look into possible solutions, given the advantages this could provide for programmers who want to support both the AltiVec and past worlds.
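To make the common code base idea a bit more concrete, here is a minimal sketch in Python. It is not AltiVec code and not the library Motorola described; the function names and the `has_vector_unit` flag are made up for illustration. The only AltiVec fact it leans on is that a vector register holds four 32-bit floats, so the stand-in "vector" path works on groups of four and handles the leftovers one at a time.

```python
LANES = 4  # an AltiVec register holds four 32-bit floats


def add_scalar(a, b):
    """Plain element-by-element add: the path a non-AltiVec machine would take."""
    return [x + y for x, y in zip(a, b)]


def add_vector(a, b):
    """Stand-in for the vector path: full groups of LANES, then a scalar tail."""
    out = []
    full = len(a) - len(a) % LANES
    for i in range(0, full, LANES):
        # On real hardware this whole group would be one vector add.
        out.extend(x + y for x, y in zip(a[i:i + LANES], b[i:i + LANES]))
    # Leftover elements that do not fill a whole vector.
    out.extend(x + y for x, y in zip(a[full:], b[full:]))
    return out


def add_arrays(a, b, has_vector_unit):
    """Single entry point for callers (hypothetical); picks the path at run time."""
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    impl = add_vector if has_vector_unit else add_scalar
    return impl(a, b)
```

The application only ever calls `add_arrays`, so the same source runs on both kinds of machine and both paths give the same answer. The speed disparity mentioned above lives entirely behind that one entry point.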
Other than that, get busy with Sim G4.
* What's New: Nanokernel *
This was the most interesting session of the day. Before the release of MacOS 8.6, I believed that nanokernel was just an Apple moniker for the microkernel in Mach. I was wrong. Nanokernel is still a fuzzy thing in MacOS, but it primarily has to do with the low level management of the various MacOS managers and the scheduling of it all. Nanokernel is an OS kernel in the same sense that Mach is in the MacOS X world: Nanokernel is the kernel for MacOS 8.6+ and Mach is the kernel for MacOS X. Nanokernel functionality will be provided by the Mach kernel for the BlueBox running in MacOS X Server. The MacOS cooperative world (I think) goes by the name of Blue Task.
Nanokernel deals with multitasking, tasks and their address spaces, shared resources (memory primarily), and the tasking architecture. A task in MacOS 8.x is a point of control, or a thread of execution. Unfortunately, the naming conventions on the Mach side are a little different. There, a task is a repository of resources, and threads exist inside of a task. So a Mach task has virtual memory, ports and threads associated with it. Nanokernel tasks can be thought of as threads in the sense that they are points of execution and not resource bundles.
Multiprocessing got a major overhaul in the MacOS 8.6 nanokernel. They still fully support the MP API from Apple/Daystar. There is a nanokernel running on each CPU in an MP configuration. It runs below the stack and each one provides its own scheduler. Cooperative tasks occupy a single preemptive task. The Blue Task is controlled by the Process Manager, which manages the scheduling and behavior of these apps. As a programmer, one cannot assume tasks have access to the same memory space, as has been the case in MacOS. That shared address space paradigm is slowly going away in the MacOS 8.x world and will be gone with MacOS X, where each task/threads bundle gets its own virtual address space. So they are doing everything they can to make it easier to migrate forward as well as make the current model more robust. There are explicit methods of sharing state between MacOS tasks. The MacOS common address space cannot be relied on in the future.
The difference between Blue Task and MP tasks is still a little unclear to me. I am unsure whether all running cooperative tasks live in Blue Task and are scheduled cooperatively (from a preempt safe task "wrapper"), with MP tasks existing separately. I think this makes sense: the nanokernel can schedule Blue Task and the MP tasks separately. The synchronization methods provided between MP tasks are your normal semaphores, message queues, and event groups. Event groups are a new one, where sets of flags (bits) can be sent.
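For anyone who has not run into event groups before, here is a rough Python sketch of the idea. It is an analog built on Python's `threading` module, not the MP API, and the class and method names are made up. It assumes the usual event-flag semantics (senders OR bits in, a waiter gets every pending bit back and clears them), which the session did not spell out.

```python
import threading


class EventGroup:
    """Hypothetical event-flag group: bits are OR-ed in, then read and cleared."""

    def __init__(self):
        self._flags = 0
        self._cond = threading.Condition()

    def set_flags(self, bits):
        """Merge new bits into the pending set and wake any waiter."""
        with self._cond:
            self._flags |= bits
            self._cond.notify_all()

    def wait(self, timeout=None):
        """Block until at least one bit is pending; return the bits and clear them.

        Returns 0 if the timeout expires with nothing pending.
        """
        with self._cond:
            if not self._cond.wait_for(lambda: self._flags != 0, timeout):
                return 0
            flags, self._flags = self._flags, 0
            return flags
```

Two sends before the receiver wakes up collapse into one wakeup carrying both bits, which is what makes this cheaper than a message queue when all you need to say is "something of kind X happened".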
Two MP task architectures are possible. You can have a lot of parallel input and output queues, one pair feeding each of the MP tasks. Or you can have one input queue feeding all of the MP tasks, which output to either a single output queue or parallel output queues. Which is more beneficial likely depends on your application's specific usage patterns.
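Here is a minimal sketch of the second layout (one shared input queue, one shared output queue), written with Python threads and `queue.Queue` as stand-ins for MP tasks and MP queues. The function name and the `None` shutdown sentinel are my own choices for illustration and are not part of the MP API.

```python
import queue
import threading


def run_workers(items, work, num_workers=2):
    """Feed items through one shared input queue to several workers.

    Items must be hashable and must not be None (None is the stop sentinel).
    Returns a dict mapping each item to work(item).
    """
    inbox = queue.Queue()
    outbox = queue.Queue()

    def worker():
        while True:
            item = inbox.get()
            if item is None:  # sentinel: no more work for this worker
                break
            outbox.put((item, work(item)))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in items:
        inbox.put(item)
    for _ in threads:
        inbox.put(None)  # one sentinel per worker
    for t in threads:
        t.join()

    # All workers have exited, so the output queue is complete.
    results = {}
    while not outbox.empty():
        item, value = outbox.get()
        results[item] = value
    return results
```

With a shared input queue, whichever worker is free takes the next item, so uneven work balances itself. The parallel-queues layout gives up that balancing in exchange for control over which task sees which data. Python threads will not show a real multiprocessor speedup here; the sketch only shows the shape.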
There are big additions with MacOS 8.6. MP now works with virtual memory enabled, and page faults from MP tasks can be handled by the VM backing provider. It should be fully compatible with existing MP code. The API is fully supported and unchanged. The only potential issue is that some of the MP API debugging calls are gone, replaced with a better, more flexible mechanism. I am not sure if this has been seen as a problem by people with MP machines. The MP footprint in RAM has been reduced from 2 MB to 100k. That is a nice big change. The MP API is fully supported in Carbon.
A lot of the power management gains come from what was done at the nanokernel level. The BlueTask now blocks when it is idle, so the CPU can move to a lower power state. This is low hanging fruit and, as stated in the keynote, battery life can improve by around 37% now. They believe they can go after a lot more, so look for continued improvement in the future, probably not as drastic, but there is definitely more room (BlueTask itself is still relatively heavy and can be trimmed). As developers, there is also a lot that can be done. The nanokernel keys off of the sleep interval the app provides in its event loop. A lot of apps specify zero. This makes it impossible for the nanokernel to do much with the scheduling of these apps. Case in point: of 3 browsers tested, iCab was the only one that behaved nicely toward the rest of the system (guess what the other two are ...). This can be seen with the Peek-A-Boo shareware product. So a lot of developers can help Apple improve the savings by checking their code for busy loops that accomplish nothing. Apple is working to improve the rest of the MacOS code to reduce busy loops on their side. They are really pushing this for portables, which is a good sign. There is only so much that can be expected from new battery technologies and hand cranks :)
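To see why a zero sleep interval hurts, here is a toy simulation in Python. It is not MacOS code: time is counted in abstract ticks, and the function name and numbers are made up. The sleep interval is presumably the sleep parameter an app passes to WaitNextEvent, though the session notes only say "sleep intervals in the event loop".

```python
def count_wakeups(event_ticks, total_ticks, sleep_ticks):
    """Count how often an app's event loop runs over total_ticks of simulated time.

    event_ticks: set of ticks at which a real event arrives.
    sleep_ticks: how long the app says it can sleep when idle;
                 0 means "call me back right away" (a busy loop).
    """
    wakeups = 0
    now = 0
    while now < total_ticks:
        wakeups += 1  # the app runs once: handles an event or does idle work
        deadline = now + max(sleep_ticks, 1)
        # The scheduler can leave the app blocked until the next event
        # or until the sleep interval runs out, whichever comes first.
        upcoming = [t for t in event_ticks if now < t < deadline]
        now = min(upcoming) if upcoming else deadline
    return wakeups
```

With events at ticks 10 and 50 over 100 ticks, `count_wakeups({10, 50}, 100, 0)` returns 100 while `count_wakeups({10, 50}, 100, 30)` returns 5. Both apps handle the same two events; the first one just never gives the scheduler a chance to idle the CPU.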
Full SMP capability. This is a move away from the current master/slave paradigm where one CPU runs the show and the other CPUs get dished out work. There is bidirectional processor signalling in hardware. The implementation will prevent the dreaded priority inversion scheduling problem, which can cause blocking etc. All CPUs can address all of physical memory.
Single CPU improvement is seen as well, given the changes in the underlying scheduling mechanisms. There is a 20us interprocess switching time. Can't get much better. Scheduling overhead is less than 1% of CPU cycles. Throughput control is dynamic. This means scheduling priority can be changed on the fly. They had a nice demo of some bouncing balls in panes (each pane a different task) implemented with the MP API. You could change the priority of each one of the panes and observe the change in performance relative to the other panes. They also had a nice task timing statistics tool. It was uStat or uTool or something. Not sure if it is public, but it graphically displayed the amount of CPU time being taken up by running tasks (MP tasks; Peek-A-Boo does the same for cooperative tasks and is similar in appearance and info). Another nice demo showed two windows with graphics changing. If the mouse was clicked on a menu item, one of the windows stopped updating and the MP task window continued to update. Very nice stuff.
Suspend on exception task model. More debugging support for MP tasks, but far from complete. High performance preemptive safe memory allocation. This is a big deal. It is 15x faster than MacOS 8.5, and the preemptive part is really key for moving the MacOS memory model forward. It also allows MP tasks better access to memory. The Toolbox is still not completely preemptive safe in MacOS 8.x (not until Carbon). AltiVec support is also there.
To the future. File I/O from preemptive tasks. Addressing of more than 1 GB of RAM. Areas support (the sparse address space and guard pages idea; the advantages are seen somewhat, I think, in BlueBox on MacOS X Server). Better VM flexibility and speed. Better Time Manager - the current accuracy of the Time Manager is questionable at times and improvements are in the works; the MP API timing facilities are much finer grained and more accurate. And more Power Management improvements.
Takeaways. MP hardware is coming (when is a different story). Reduce or eliminate busy loops to conserve power and allow the scheduler to do a better job with the resources available. More preemptive capabilities will be included.
A lot of the demos saw super linear speed-ups from the nanokernel. Of course, this is theoretically impossible, but given the nature of machine architectures, sometimes it makes sense. The demo improvements came primarily from decomposing the problem into smaller chunks across the CPUs. In some cases, this meant each chunk of the problem could fit in cache memory, speeding up the whole process across CPUs. In other cases, graphics calls were reduced to a smaller number of calls because the work was done in parallel and more could be drawn to the screen at one time. So "super linear" speedups were observed thanks to the more efficient operations that problem decomposition allows.
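The cache effect is easy to see with a toy model. The Python sketch below uses made-up costs (1 time unit per item when the working set fits in cache, 4 when it does not) and assumes each CPU has its own cache. None of these numbers come from the session.

```python
def run_time(items, cache_items, fast=1.0, slow=4.0):
    """Time to process `items` on one CPU in a toy cache model.

    If the working set fits in cache every item costs `fast`,
    otherwise every item costs `slow`. Costs are illustrative only.
    """
    per_item = fast if items <= cache_items else slow
    return items * per_item


def speedup(items, cpus, cache_items):
    """Speedup of an even split across `cpus` CPUs, each with its own cache."""
    one_cpu = run_time(items, cache_items)
    per_cpu = run_time(items / cpus, cache_items)  # all slices run in parallel
    return one_cpu / per_cpu
```

With 1000 items and a cache that holds 600, `speedup(1000, 2, 600)` gives 8.0 on two CPUs, because each half now fits in cache. If the cache already held everything (`speedup(1000, 2, 2000)`) or still holds neither half (`speedup(1000, 2, 100)`), the result is the plain linear 2.0.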
Threads manager has not been touched and still has the same limitations.
This was a very good session about a pretty low level component of the MacOS. They are really doing some great work getting around a lot of the nasties that have accumulated around the MacOS subsystems over the years. The MP tasks can really provide some nice speed improvements today (with smart problem decomposition) and with the soon to be arriving MP hardware ...
* Java Feedback Session *
Everyone proclaimed the desire and need for Java 2. The MRJ team continued to not commit to even shipping a Java 2 runtime. I think Apple is going in the right direction with their efforts. They have talked to their major clients and have gotten the sense that Java 2 development will not be significant for the next six months to a year. Given this customer feedback, they believe the best service to the MacOS community right now is to get MRJ 2.2 out with bug fixes and performance improvements (to close the 30% gap), and to provide a solid Java SDK 1.1.x base for people to use. That is what they are announcing and what they are focusing on for the short term.
Now there is also the common belief that Apple has to be all over Java 2 since "everyone" is moving there. If they want Java 2 apps developed and ready for people to deploy in a year's time, they need to get the tools out there. Apple knows this. I get the feeling it is more of a marketing/political decision that they haven't announced any Java 2 compatible MRJ yet. They have demoed the 2D graphics stuff, which is one of the major additions, and from the MRJ lists and WWDC sessions you get the sense that more is there behind the scenes, waiting for MRJ 2.2 to get out and for more focused development.
The Sun Java drops are not a trivial recompile on either MacOS or MacOS X Server (even with the BSD roots). Most of the problems revolve around the windowing (AWT, Swing) stuff, especially with MacOS X Server's Display PostScript and the emerging Quartz model.
Apple asks for developer feedback on this one. But it does not help to just say "I need Java 2". They want to know why Java 2 features are compelling above and beyond what is available in MRJ 2.x and Java 1.1. It will likely be the Java wave of the future. But there are still a lot of performance and language issues associated with getting Java 2 up and running, even from the Sun side, ignoring significant platform specific issues. They are aware of the demand and I am pretty sure they will offer something in this space. Keep in mind MRJ was nothing a year ago and the MRJ engineers have made some significant progress in the last year. I don't see them slowing down any.
Parallel to the Java 2 2D graphics stuff, it was mentioned that QuickTime for Java might be a really nice solution to fit into MRJ 2.x and for Windows. QT has better drawing, sound, video, multimedia, ... everything than Java right now. For Mac and Windows, there are some really neat features that can be leveraged from the QuickTime space onto Java. For the graphics part of Java 2, that might provide a vastly superior alternative. Now, Java 2 has a lot more than just that, so it is a piecemeal solution for one part.
* MacOS X Network and Communication Feedback Session *
This was after the Open Transport feedback session, so I think a lot of the dirty laundry had already been aired. Most of the non-OT specific questions revolved around NetInfo and the bugs/limitations that have existed in it since the NeXT days. Scalability and single user and repairing and backing up and ... were all issues that were brought up. Apple runs Apple on NetInfo, so they are aware of a lot of the difficulties currently there. Their goal is to make the computer networking experience as easy and intuitive as the Mac experience. That will mean automatically configuring a lot of things without user intervention. IOKit will provide a lot of the plug and play capabilities out of the box.
As for Open Transport, they are working on making it as easy as possible for OT code to just work in Carbon.
IOKit will provide a lot of the tools necessary to write drivers for serial, ethernet and the like. ATM will be supported on the hardware driver side from the FreeBSD 3 code. But IOKit will have to be used (maybe with the addition of an ATM driver family) to allow for ATM usage by the networking services.
Apple took a lot of the feedback and made people aware that some design decisions still have not been made, and some functionality has not been implemented, in the current DP 1 MacOS X Client release. DP2 is a big target for getting everything in there, to the extent that APIs and functionality will just be there.
Eric