Troubleshooting kded4 Bugs

Not just recently, but for as long as KDE has existed, we have been getting bug reports for the KDE process “kded4”. These include:

  • kded4 crashes
  • kded4 leaks memory
  • kded4 eats 100% CPU
  • kded4 <defunct> process

Before I dive into troubleshooting those problems, let me explain what kded4 does and how it works.

Little Demons Everywhere

In the Plasma Workspace there is a widget to show notifications when the battery goes low. But how does the widget know when this happens? It could just check the battery level every couple of seconds, but this would waste CPU cycles. Instead, many ACPI BIOS implementations can trigger hardware interrupts to notify the operating system about the battery state.

Now Plasma could connect to the OS directly, but then it would not run on other operating systems. The solution is a daemon with OS-specific backends. The backends watch those OS signals, and Plasma can connect to the daemon’s interface (usually via D-Bus) to learn about the battery’s health.

Thus a daemon is a small process that often just sits there, waits for OS events, and triggers simple actions. KDE needs daemons to watch the network, the printer ports, the screen brightness hotkeys, the mounted disks, etc.

Given the number of daemons we need and how little work each of them does, it makes sense to use a single process for all of them. The key advantage is lower memory usage, and this is exactly what kded4, the KDE daemon process, exploits: to enable the various daemon functionalities, it loads specialized kded modules instead of needing a separate process for each of them.

The Crux of kded4 Bugs

The downside of this approach is that bugs in any of those modules can be fatal for the complete process; in other words, every module must function correctly for kded4 to operate successfully. Which brings us back to the bugs: Some modules still have issues.
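To make the trade-off concrete, here is a toy model in Python. It is only a sketch: DaemonHost, battery_module and broken_module are hypothetical names, and this is not kded4's real plugin interface. It merely shows how one misbehaving module takes everything else in the same process down with it.

```python
from typing import Callable, Dict, List


class DaemonHost:
    """Toy model of one process hosting several modules (not kded4's real API)."""

    def __init__(self) -> None:
        self._modules: Dict[str, Callable[[str], str]] = {}

    def load(self, name: str, handler: Callable[[str], str]) -> None:
        """Register a module's event handler under its name."""
        self._modules[name] = handler

    def dispatch(self, event: str) -> List[str]:
        # All handlers run inside the same process, so an unhandled error
        # in any one of them aborts the whole dispatch.
        return [handler(event) for handler in self._modules.values()]


def battery_module(event: str) -> str:
    """A well-behaved hypothetical module."""
    return f"battery saw {event}"


def broken_module(event: str) -> str:
    """A hypothetical module with a bug."""
    raise RuntimeError("bug in module")


if __name__ == "__main__":
    host = DaemonHost()
    host.load("battery", battery_module)
    print(host.dispatch("resume"))
    host.load("broken", broken_module)
    try:
        host.dispatch("resume")
    except RuntimeError as error:
        print(f"whole host failed: {error}")
```

In this toy the failure is a Python exception that the caller can catch. In a real shared process, a crash or an endless loop in one module affects the whole daemon.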

Let’s say someone reports a problem about growing kded4 memory usage. With the above description in mind, it is easy to understand that, without further analysis, the bug could be in any module, or even in several of them. The set of modules that developers use might differ from what the reporter uses, and the daemons often handle only rare events, so many issues cannot be reproduced easily. Without some help from the bug reporter, those problems often cannot be fixed.

The following checklist should guide you through the steps required to troubleshoot kded4 bugs and help you collect the information needed to fix them.

Make Sure the Bug is Reproducible

Some bugs only show once. You installed an update, and on next reboot, kded4 crashes, maybe because some configuration files were not correctly updated yet. On subsequent reboots, it works fine without a crash. What to do? Most people do not save previous configuration files, so even when going back to the old version before the update, the crash might never happen again.

You plugged in a headset, and kded4 starts consuming 100% CPU cycles, maybe because a new device triggers a bug in underlying Phonon modules. If you restart, this bug never happens again, because Phonon might “remember” the headset, and not trigger the bug on this code path. Worth a report?

You log out of KDE, noticing you have hundreds of hanging (defunct) kded4 processes lying around. After killing them all and logging in again, there is only a single kded4 process, and that correctly terminates on logout, so you might think the problem was only temporary. The reason is that the process might only hang when a module performs a certain action; it is never the logout per se that makes kded4 hang.

Related are problems that show when kded4 hangs or does not run at all. The symptoms are often completely unrelated. For example, people notice cookies in Konqueror not working, and the usual cause is kded4 not running. Some module might prevent it from starting, or cause it to hang, e.g. waiting for an event that never happens. Of course, these issues should also be fixed.

If you want to report such a bug, make sure you can reproduce it. Try finding old configuration files, try reverting to previous versions, and try to find out which steps or actions are required to trigger it. Use the top console command to verify whether kded4 really consumes CPU cycles or eats memory. If you cannot reproduce it, there is little chance we can. Bugs that never go away are easier, but that does not imply we can see them, too.
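If you would rather log numbers than watch top, the following Python sketch may help. It assumes a Linux system, where /proc/<pid>/status contains a VmRSS line. The function names and the 10 MiB growth threshold are my own arbitrary choices, not anything kded4 provides.

```python
import re
from typing import List, Optional


def parse_vm_rss_kib(status_text: str) -> Optional[int]:
    """Extract resident memory (VmRSS, in KiB) from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vm_rss_kib(pid: int) -> Optional[int]:
    """Read VmRSS of a live process; None if it is gone or unreadable."""
    try:
        with open(f"/proc/{pid}/status", encoding="ascii", errors="replace") as f:
            return parse_vm_rss_kib(f.read())
    except OSError:
        return None


def grows_steadily(samples_kib: List[int], min_total_growth_kib: int = 10240) -> bool:
    """True if memory never shrank between samples and grew by the threshold.

    The default threshold (10 MiB) is an arbitrary assumption.
    """
    if len(samples_kib) < 2:
        return False
    never_shrinks = all(b >= a for a, b in zip(samples_kib, samples_kib[1:]))
    total_growth = samples_kib[-1] - samples_kib[0]
    return never_shrinks and total_growth >= min_total_growth_kib
```

Call read_vm_rss_kib with the pid of kded4 every few minutes, collect the values in a list, and pass the list to grows_steadily. A series that only ever goes up is a much stronger hint of a leak than a single large reading.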

Isolate the Offender

The next step is to find out which module is responsible. Without this information, there is often little we can do.

Disable kded4 modules in System Settings > Startup and Shutdown > Service Manager. Some modules cannot be disabled there, or start automatically even if you disabled them. To disable such a module, move its .desktop file out of /usr/share/kde4/services/kded/ to a place where you can easily restore it (this requires root privileges), then run kbuildsycoca4 and restart kded4.
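Moving the files by hand works fine. If you toggle modules often, a small helper keeps the backup location consistent. The Python sketch below is one possible way to do it: park_module and restore_module are hypothetical helper names, and the demo at the bottom runs in a throwaway directory so nothing under /usr is touched.

```python
import shutil
import tempfile
from pathlib import Path

# Directory named in the text above; pass it explicitly when running for real.
KDED_SERVICES_DIR = Path("/usr/share/kde4/services/kded")


def park_module(module: str, services_dir: Path, backup_dir: Path) -> Path:
    """Move <module>.desktop out of services_dir into backup_dir."""
    source = services_dir / f"{module}.desktop"
    if not source.is_file():
        raise FileNotFoundError(f"no such module file: {source}")
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.move(str(source), str(target))
    return target


def restore_module(module: str, services_dir: Path, backup_dir: Path) -> Path:
    """Move a parked <module>.desktop back into services_dir."""
    source = backup_dir / f"{module}.desktop"
    if not source.is_file():
        raise FileNotFoundError(f"no parked module file: {source}")
    target = services_dir / source.name
    shutil.move(str(source), str(target))
    return target


if __name__ == "__main__":
    # Dry run in a temporary directory with a dummy module file.
    with tempfile.TemporaryDirectory() as tmp:
        services = Path(tmp) / "kded"
        backup = Path(tmp) / "kded-disabled"
        services.mkdir()
        (services / "networkstatus.desktop").write_text("[Desktop Entry]\n")
        park_module("networkstatus", services, backup)
        print(sorted(p.name for p in services.iterdir()))
        restore_module("networkstatus", services, backup)
        print(sorted(p.name for p in services.iterdir()))
```

For real use you would pass KDED_SERVICES_DIR as services_dir and run the script as root. After each change you still have to run kbuildsycoca4 and restart kded4 yourself.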

Repeatedly trying to reproduce the bug with some modules disabled should eventually lead to the offender. Of course you should start with the “obvious” candidates. For example, if a problem shows when plugging in an audio device, the fault is very likely in the phononserver module. For crashes, the offending module can often be found in the complete backtrace.

But sometimes the bug is only triggered by specific combinations of modules, so some patience in trying to find them certainly helps. If the problem is reproducible regardless of which modules are loaded, the bug might really be in kded4 itself, but judging from the last few years’ bug reports this is very unlikely.
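If you have many modules, halving the set of enabled modules on each attempt is usually faster than disabling them one by one. The Python sketch below shows the idea. It assumes that exactly one module triggers the bug on its own, so it will not find bugs that need a combination of modules. The bug_reproduces callback is hypothetical: in practice it stands for you enabling only the listed modules and trying to trigger the bug by hand.

```python
from typing import Callable, List, Optional, Sequence


def find_offender(
    modules: Sequence[str],
    bug_reproduces: Callable[[List[str]], bool],
) -> Optional[str]:
    """Halve the candidate set until a single module remains.

    bug_reproduces(enabled) must report whether the bug shows when only
    the modules in `enabled` are loaded. Assumes a single offender.
    """
    candidates = list(modules)
    if not candidates or not bug_reproduces(candidates):
        # Not reproducible even with everything enabled: nothing to bisect.
        return None
    while len(candidates) > 1:
        first_half = candidates[: len(candidates) // 2]
        if bug_reproduces(first_half):
            candidates = first_half
        else:
            candidates = candidates[len(first_half):]
    return candidates[0]


if __name__ == "__main__":
    # Simulated run: pretend that phononserver is the faulty module.
    # The other module names are placeholders.
    installed = ["module_a", "module_b", "phononserver", "module_d", "module_e"]
    print(find_offender(installed, lambda enabled: "phononserver" in enabled))
```

With 33 installed modules, this needs about six attempts after the initial check, instead of up to 33.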

Check Existing Bugs

Before filing a new bug report, you should check existing reports for similar or identical problems. If you add a comment with your findings, it may be the determining factor for fixing the bug. If you find no similar report, file a new one. Ideally, you report it against the module that is responsible.

Note that some distributions might add distribution-specific modules, in particular to check repositories for software updates. Bugs for those modules should be reported to the bug tracker of your distribution.

In any case, if you are not sure about any of the above steps, you can ask for help on the KDE forums. Even if you cannot provide all the required information, it is better to write a report for a new bug than to hope someone else does. Only if you want to see it fixed, of course.

20 Comments »

  1. jbernardo said

    How do you suggest I go about tracking the offending module for the 100% cpu bug discussed here – http://kubuntuforums.net/forums/index.php?topic=3116537.0 ? I’d disable the NetworkManager service, but stopping it just makes me unable to test the bug, as I can’t connect to the broadband network (or any other network) without it…

  2. Ami said

    Thanks for the very important post. The methods you describe here are post-mortem. Most users could use this to report bugs, but not to work around them. The only way to solve a problem today is to wait for a new KDE release to fix the bug. If I knew that the crash was caused by a faulty screen brightness module (for example), I would probably choose to disable it until a fix is out.

    How much memory is actually saved by this approach? Most of KDE and Qt is in shared libraries anyway. By aggregating the modules into a single process you forfeit all the OS facilities like scheduling, memory usage reports and process restart.

    I would very much like to see more fine-grained control over kded. It is an important facility that sometimes gets a bad name for being unstable, all because of its modules…

  3. gnumdk said

    Will try to find which module makes kded4 eat 100% of my CPU on every logout…

  4. Axel said

    If I understand correctly, a problem in any module used by kded can cause many problems, the most common being 100% CPU. It would be very good to have some sort of safety measure.

    In my case, I sometimes do get 100% CPU, and killing the kded process works fine and I can recover a functional desktop. Perhaps that should be done more or less automatically.

    What you suggest is very good for helping to debug, but not everybody can do it, and the problem affects a lot of people.

  5. kdepepo said

    @jbernardo, 100% CPU kded4 bugs that are caused when disconnecting from a network are tracked at bug 272527. It appears that they are related to “ntrack” (bug 268038), but more help finding the exact cause is appreciated.

    @Ami, I have 33 kded modules in my installation. I cannot say how much memory each process would need, but I am sure the savings exceed 50 MB.

  6. Alex Merry said

    The NTrack module is actually from kde-runtime, not any specific distribution.

  7. footsal said

    thank you for your great contribution on hunting down KDE bugs!

    a kde user

  8. kdepepo said

    @Alex, thanks for the correction, I fixed the text :)

  9. Karellen said

    I imagine it would be a lot of work to create a “debug” mode/configuration switch where modules are loaded in a separate process. If you’re experiencing semi-occasional but hard-to-reproduce bugs, setting this would allow you to make the trade off *if you wanted* in order to be able to find out which module is causing a problem when it does happen.

  10. jbernardo said

    @kdepepo: If ntrack is the one shown in start-up services as “Network status” (or something similar, I’m using the Portuguese translation so it has “Estado da Rede”), then disabling it seems to fix the 100% cpu in kded4 when I disconnect 3G. I think I’ll check now bug #268038…

  11. Ami said

    @kdepepo, 50MB is a lot of memory, nevertheless for a desktop/laptop system with > 2GB RAM, 50MB is a worthy tradeoff for a separate process mode like @Karellen suggested. I prefer having less free memory but being able to hunt down and stop problematic kded components.

    Do you think it would be possible to have a module-per-process model (even for debug/testing purposes)?

  12. anon said

    @Ami

    I guess on a desktop it’s a “worthy tradeoff”. But remember, KDE does *not* run on simply one platform.

    And having “modules per process” would be a shitload of work for what payoff? Then those codepaths would have to be maintained, no less…

    Doesn’t make sense, imo.

  13. jbernardo said

    @kdepepo In the end I had to edit /usr/share/kde4/services/kded/networkstatus.desktop and set X-KDE-Kded-autoload=false and X-KDE-Kded-load-on-demand=false to get kded4 not to hang with 100% cpu on vpn or broadband disconnection, just stopping and disabling the startup services wasn’t enough, the module would still get loaded making it impossible to pinpoint the “culprit”. Seems like this is one of the modules that keeps auto-loading even if disabled in the startup services.
    Thanks for the guide, let’s see now if I can add anything to the bug reports.

  14. Achim Bohnet said

    Thx for the description of kded4 bug hunting.

    But it unfortunately confirms my impression and experience that kded4 bug hunting is ‘too time consuming’ :( I’ve more or less given up (shame on me) and just try whether logout/in or a reboot fixes kded4. This has the bad taste of ‘like-windows’ but it works :(

    IMHO it’s necessary to improve the robustness of the kded4 system (e.g. kdeinit+forking) and/or add debug helpers, e.g. a top-like interface that shows wake-ups/sec and % CPU per module, or maybe a SIGUSR1 handler that prints out the currently running module.

  15. anon too said

    Small trick when you have a process using CPU for no good reason:

    Enable core dumps, and then “kill -SEGV” the offender. Save the dump, and analyse it later. At the very least, you’re likely to get a backtrace going through the problematic code.

  16. Karellen said

    Actually @anon, you might be better off switching to a real console and attaching to kded4 with gdb. That way you could pause it and check the stack a number of times, and also inspect the loop that it’s in, and dump memory if required.

  17. hamelg said

    Here, kdebugdialog helped me to find out a workaround.
    When logging out my kde session, kded4 never terminates.
    I enabled debug messages for kded with kdebugdialog and saw this message at logging out :
    startkde: Done.
    kded(25301) ObexFtpDaemon::offlineMode: Offline mode
    kded4: Fatal IO error: client killed
    kded(25301) LircClient::~LircClient: deleting theSocket

    To fix that, I put a “sleep 1” in my shutdown script in .kde4/shutdown directory.

  18. David Howells said

    Can you add a bit on how to valgrind kded4? I’m seeing kded4 gradually eating all my RAM, and if I could start it under valgrind, I could probably locate the problem fairly quickly.

  19. kdepepo said

    David, you probably need to run kded4 with the “--nofork” argument to make it appear in valgrind. Make sure it is not already running. Thanks for help with debugging it, but see also bugs 294497 and 306206.

  20. David Howells said

    I’ve seen those, thanks, though I’m not sure how applicable they are since this is my desktop and I don’t suspend/resume it and the power management settings are all off.
