Tuesday, April 24. 2007
Shadows for everyone (well, not really)
So, everyone loves bling on their desktops right? Here's a screenshot of a composited desktop with a drop-shadow (and a fade-in effect, but that's obviously not visible in the image):
Oh wait! It's not a desktop machine, it's my N800!

There's actually nothing fancy about Composite: it just draws somewhere else and then puts the displayed screen together from the pieces (either automatically or by a manager application). Normally this translates to shadows and transparency in people's minds, but that's only a part of it. Compositing can be used for various other purposes, like task switchers showing real window contents, smoother window transitions and so on.

All of this of course translates to "draw more power from the battery", as the usual tradeoff. A more unusual tradeoff (and a bug somewhere) is the constantly increasing X server memory consumption with xcompmgr. It's normal for the server to take more memory (normally the window contents are not preserved if the window is offscreen), but not in a constantly increasing manner ;)

Unfortunately the xcompmgr performance on the N800 is not as good as one would like. The fade-in effect works for smaller windows (the statusbar menus, home screen menu), but for the regular-sized others-menu it doesn't render the whole sequence. I'm not sure whether it's really the notoriously bad graphics output bandwidth limiting it or whether it's just using too much CPU to get the rendering done. I've done some testing, and it seems that I can get "live" thumbnails of constantly updating apps without troubling the CPU too much. Relatively speaking of course: it does eat a bunch of resources, which is usually not what you want on an embedded device, especially if you plan on running on battery for comfortable periods of time.

Want to try?

To make my life easier, but also to encourage experimenting with the Composite extension on the N800, I went through the trouble of building a few composite-enabling debs for IT2007. What I did was take the xserver-xomap sources and build them with Composite support. I also took the Composite library from Debian and built a deb for that.
Since those two don't really offer anything by themselves, I packaged xcompmgr to accompany them, with an entry in the others-menu and even a single-click-install file!

WARNING: DON'T DO THIS WITHOUT ACCEPTING THAT YOU MIGHT HAVE TO FLASH YOUR DEVICE

Also, I don't take any responsibility for what that .install might do to your device.

Legal mumbo-jumbo out of the way, there are two things you need to do. First, make sure you have the repository from repository.maemo.org in your appmanager. Although this is usually meant only for the SDK, it has the Damage extension library we'll need. The easiest way to add it is to click on "Maemo repository" at the Maemo-Hackers repository page.

Secondly, download this .install file for xcompmgr: Xcompmgr.install (File->Save as.. in the web browser seems to work), and launch it from the file manager. There are two reasons for this detour: the Google server doesn't know anything about the MIME type used for .install files and, perhaps more importantly, this will update your X server, possibly resulting in a bootloop, so providing a single-click way to do that probably isn't that wise :). You can manually reverse it by installing the older X back, but I'm not sure whether the composite-enabled X does anything differently from the normal server when no manager is running...

Then just reboot (to start using the newer X) and launch xcompmgr to start enjoying some shadows and fades (well, as much as you can ;). There's no UI to tweak the settings, but you can modify them in /usr/bin/xcompmgr.sh if my selection is not to your liking.

Thanks to Tuomas for hosting the debs. The repository also includes -dev packages so start hacking! :)

Wednesday, February 14. 2007
N800 - VFP or not to VFP?
We started on a quest with Tuomas to find out just how much performance we could squeeze out of the N800 by compiling the whole thing with VFP support (in essence, use the hardware floating point unit on the device).
You might be surprised to learn that most if not all of the software on the N800 is compiled without support for the FPU. I guess the main motivation is library size, and Tommi Komulainen mentioned on IRC that when they enabled VFP in GTK+ and Pango the boot time increased without real benefits, so it was quickly reverted. The reasons for this boot time growth most likely include the loss of Thumb instruction support that VFP implies, which means growth in library size (I'll get back to that below), as well as the fact that Thumb code and VFP code do not mix well at all. Let's look at these hypotheses a bit closer:

Library size and memory consumption

Tuomas already had figures on the rootfs size growth as well as per-library notes in his blog, and that much is obvious. Usually the number one reason to use Thumb code in the first place is to reduce the code size. I don't consider eating the flash a problem (after all, you can now get 8 GB of space with the two SD slots...), but unfortunately this carries over to RAM usage as well. We didn't think about this too much while making the tests, but I tried a quick 'cat /proc/meminfo' at the end of the boot cycle and got the following results:
So you get an initial 7MB penalty for wanting more speed. This will most likely increase when more libraries are loaded, although most of the common stuff should already be in memory by the time the desktop is up. My gut feeling is that this will become a problem only if you tend to run multiple apps or keep many browser windows open, but of course I have nothing concrete to offer as proof...

The boot speed is also discussed by Tuomas, but here are a few notes about that as well:
In short, mixing Thumb code and VFP code is not a good idea. This is due to the nature of the Thumb instructions: they are 16 bits wide versus the 32 bits of "normal" ARM instructions. When the processor executes Thumb code, it needs to be in a different mode than when executing normal code. If the processor has to execute the two types of code in sequence, it needs to switch back and forth between the modes, which is relatively slow.

Now, picture a scenario where GLib is Thumb code and GTK+ is VFP. Every single time any GObject traffic is happening, say signals, the processor jumps from mode to mode like crazy: it executes GObject code in Thumb, then signal handlers in VFP, which are more than likely to use stuff like g_strdup() etc., which again means going back to Thumb... You can see it's a lost cause to find any performance there.

This is pretty well illustrated by the following table, which shows the gtkperf timings step by step when working towards a fully VFP system. The test was executed so that I took the debs that Tuomas had built and divided them up in chunks, installing one chunk at a time, rebooting and running gtkperf.
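To make the mode-switching argument a bit more concrete, here is a toy Python sketch that simply counts how often a call sequence crosses between Thumb and ARM+VFP code. The call trace, the handler name and the library-to-mode mappings are made-up illustrations, not measurements from the device, and the actual cost of a single switch is not modelled at all.

```python
# Toy model: count processor mode switches along a call trace.
# The trace and the mode mappings below are illustrative assumptions only.

def count_mode_switches(trace, mode_of):
    """Return how many consecutive calls in `trace` run in different modes."""
    modes = [mode_of[lib] for lib, _func in trace]
    return sum(1 for prev, cur in zip(modes, modes[1:]) if prev != cur)

# Hypothetical slice of a signal emission: GObject dispatches to a GTK+
# handler, which calls a GLib helper and then returns.
trace = [
    ("gobject", "g_signal_emit"),
    ("gtk", "on_clicked_handler"),   # hypothetical handler name
    ("glib", "g_strdup"),
    ("gtk", "on_clicked_handler"),
    ("gobject", "g_signal_emit"),
]

# Mixed system: low-level libs still Thumb, GTK+ already rebuilt with VFP.
mixed = {"glib": "thumb", "gobject": "thumb", "gtk": "arm+vfp"}
# "Pure" system: everything rebuilt as ARM+VFP.
pure_vfp = {"glib": "arm+vfp", "gobject": "arm+vfp", "gtk": "arm+vfp"}

print(count_mode_switches(trace, mixed))     # 4
print(count_mode_switches(trace, pure_vfp))  # 0
```

In this toy trace every single call boundary is a mode switch on the mixed system, and none are left once all three libraries share the same mode.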
Gtkperf is not the most robust test in the world, and we observed big variance between runs. In particular the Thumb-only number is one of the bigger ones I got, but it never was much under 900s and the 930ish figure kept repeating for me at that time... Thus the numbers above should not be taken at face value, but as an indication of a trend. And that trend is very clear in these numbers: the difference between the Thumb system and the VFP system is almost 200s, so that cannot possibly be a fluke ;)

The most notable drops in the time come at the end, when we start to get a "pure" VFP system. This is to me a clear indication that any benefit gained from VFP in the higher libs is swallowed by the constant mode switching (and probably by other, more complex issues like caches). When you don't need to switch modes anymore, it's all pure benefit from hardware floats.

So what about Cairo, my very favourite graphics library? Yes indeed, it too gets a hefty boost from using hardware floats. The results split pretty clearly into two sets: the radials (2x-7x boost) and the rest (<2x boost or nothing). The mosaic_tesselate_curves test also gets a good boost (almost 2.5x), so it's not only radial patterns that benefit.

As a conclusion, I'm not shy to waste my flash and RAM on speed, so I'll definitely be running a VFP-enabled system on my N800 in the future :)

Wednesday, February 7. 2007
Toyz Toyz Toyz
The most rewarding thing about working for a company in the embedded field is that you get to fondle lots of fun toys. Being a Cairo-lover, usually the first instinct for me is to run cairo-perf on them ;)
Here are some of the toys I've been and will be playing with:

From the left: Nokia 770, Trolltech Greenphone and Nokia N800 (that's actually my discount-code-acquired personal device in the picture). The big thingamajig underneath the compactly packaged ones is a Marvell PXA-320 development board. We have some other thingies here too, like the Samsung Q1 and some older Texas Instruments OMAP 1510 boards etc., but those in the picture are the devices I've been involved with (and that are not under NDA ;). The others I've got covered, but the Greenphone still needs to have Cairo compiled for it. I don't really know how easily that's doable, but I intend to find out...

Pictures © Mika Yrjölä

Monday, January 22. 2007
Want to write a CPA applet for Maemo but writing C gives a rash?
No problem, use cpa-launcher.
What's this?

It's a small C project stub that can be used as a base for a Control Panel applet (CPA). The funny thing about it is that it doesn't show any GUI: it just creates a hidden window that is modal to the Control Panel (CP), runs a binary (the name is compiled in, see below) and blocks the CP until the launched process finishes. This means that the applet doesn't have to be written in C anymore, but can be C++, Python, Mono or whatever.

Why?

I wrote this after finding out that Gtkmm wasn't an option for writing a CPA. With some hacking it would run once, but not twice. The module is never really unloaded, so Gtkmm happily tries to re-register all the types it has already registered and goes boom.

How do I use it?

Grab the package, rummage around in the Makefile.am's, the configure.ac's and the debian/ dir, and switch "cpa-launcher" to your applet's name (there are comments where you need to change it). The src/Makefile.am defines the name of the binary that will be launched; the cpa-launcher.cc file should be left as-is (don't even change its name, though it won't hurt if you change it in the Makefiles too...). Then 'dpkg-buildpackage -rfakeroot' should give you a .deb to install.

The reason for this not being a generic binary is that when CPA calls the execute() function from the module, it doesn't give any useful indication of the applet name or title or anything. Thus, you always have to compile a version for your binary/script. You'd normally want your plugin to be a single package anyway, so now you just get the (C part of the) autotools stuff for free.

There needs to be a catch buried here...

True, and it's a nasty one too. Launching a dialog and launching a binary are not comparable in speed. The maemo-launcher hack helps a little with that, but it won't ever be equal, especially considering that the launchee might be a Python script. There isn't a banner for the launching yet, but I think it needs one. But at least you can write the plugin in Python now ;)

Saturday, January 13. 2007
Cairo performance on small devices
Inspired by Carl's mail to the maemo-developers mailing list, I took a few gadgets from the office and ran cairo-perf on them with the latest cairo snapshot (1.3.10).
The contenders:
I included my old laptop to get a view of how far off devices like the 770 are from a "desktop" machine. While the hardware is oldish, it runs a GNOME desktop decently when there is enough memory (64MB internal + 256MB SO-DIMM in the extension slot).

The test setup was to have Cairo snapshot 1.3.10 and X running on each machine. For the image backend the setup is pretty comparable, as it doesn't depend on too much other software, but X is a bit of a different story. The 770 is using xserver-xomap_6.6.3 (I guess; the version from the maemo2.0 repository) and the N800 is using xserver-xomap_1.1.99.3-0osso21 (i.e. from the X.org 7.x releases, with modifications). These have the advantage of being customised for the hardware, but both devices lack serious GPU power (the OMAP 2420 does have some graphics stuff, but I don't know if it is being used on the N800).

I'm not sure about graphics acceleration on the PXA-320 board, but at least we simply use the generic framebuffer driver for Xorg 7.x on it. It's possible that the kernel side has acceleration, but my knowledge in that area is pretty slim. Also, the toolchain used is the freely available mainstone one, so not too much optimization there either.

My poor old laptop is not the hottest piece of GPU power by today's standards either (an integrated Trident Cyberblade), but at least it has one! :) It is running a self-compiled Xorg 7.x too.

The performance matrix

So here are the results of the runs and the comparison. Each name links to the full log of the run, and each item in the table links to the comparison of the intersecting devices. Highlights below.

So, what can we see from the results?
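For anyone wanting to redo this kind of pairwise comparison from their own logs, one plausible way to summarize it is a per-test time ratio between two devices. The Python sketch below assumes the timings have already been extracted into dictionaries; the test names and the millisecond values are placeholders and are NOT taken from the actual runs.

```python
# Sketch: per-test speedup between two sets of benchmark timings.
# All names and numbers below are placeholders, not real cairo-perf results.

def speedups(baseline_ms, candidate_ms):
    """Map each test present in both runs to baseline_time / candidate_time."""
    common = sorted(set(baseline_ms) & set(candidate_ms))
    return {name: baseline_ms[name] / candidate_ms[name] for name in common}

# Hypothetical timings in milliseconds for two devices.
device_a = {"paint_solid": 40.0, "fill_radial": 900.0, "stroke_curves": 120.0}
device_b = {"paint_solid": 20.0, "fill_radial": 300.0}

# Tests missing from either run (here "stroke_curves") are skipped.
for name, ratio in speedups(device_a, device_b).items():
    print(f"{name}: {ratio:.1f}x")
```

A ratio above 1 means the second device finished that test faster than the first; tests that exist in only one log are left out rather than guessed at.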