Power Management

21 August 2005

Summary

Sleep, a.k.a. suspend to RAM, is working, but it's not entirely reliable -- every so often it fails, for reasons I've not been able to track down. I got it working both from the Linux console and from x-windows. Whether various modules will suspend is not yet tested -- so far I'm suspending without modules. By default, suspend to RAM is resumed when the lid is opened; with acpid running, it is triggered when the lid is closed (though the pin is too short and needs fixing -- some superglue on the tip might do it). It is a joy to wake up to a fully ready, cool machine.

Hibernation, a.k.a. software suspend or suspend to disk, is working reliably -- including from x-windows (using both nv and nvidia). One effect of suspend working is that far less heat is generated from a resume than from a reboot, so the machine runs cooler even in KDE. However, it heats up fast, so icewm and indeed console is still recommended. Note that suspend messages go to the kernel log if it's running, which it might as well be.

The fan turns on around 48°C as the machine warms, and off around 42°C as the machine cools. With no special cooling, it takes the machine half an hour to warm from room temperature to the first fan speed. When you suspend the machine, the temperature falls as much as five degrees every ten minutes, so in half an hour the machine is mostly back to room temperature. I've experimented with an icepack to maintain a more congenial (congealed) work environment, and temperatures can be kept just above 40°C for long periods, provided you work only in console.

Battery is detected properly, and hdparm is set in scripts to spin down the drive every ten minutes. Most daemons are turned off with the cool script, and most modules similarly unloaded. Frequency scaling is set to 1.2GHz. Monitor CPU frequency and temperatures with temp.

In console and x-windows, the rest modes have been mapped to the vprMatrix key.
Suspend to disk by pressing vpr (or typing sus) -- this is the normal way of leaving the machine. Suspend to RAM by pressing shift-vpr (or typing sov) -- use it when you can afford to lose context, as it frequently fails. To run the cool script, press ctrl-vpr. For keyboard mappings, see Keyboard.

After a dozen or so suspends, the system may go slightly off -- notably, KDE wouldn't start once. This isn't thoroughly tested.

Note that you can suspend Linux (even running KDE) and then boot into WinXP -- just change lilo.conf first, as it currently doesn't offer a choice, and make sure the XP-readable partitions aren't mounted when you suspend (they won't be if you run cool first). I could do the same in WinXP -- it has both sleep and hibernation.

Software
Commands
13-18 August 2005: suspend to RAM (Rome)

I solved the problem of reinitializing the graphics card! Wow. I had to use "vbetool post" from the vbetool package. While suspended to RAM, the machine cools completely -- to 28 degrees, as cold as it ever gets here. You can leave x-windows running, at least with the nv driver, but the suspend command must be issued in a Linux console. The /usr/local/bin/sov script now handles suspend to RAM from both console and x-windows -- it uses chvt and adds a short but possibly critical pause before switching back to x-windows.

There is still a residual problem, though you can get long sequences of successful sleep cycles. However, there are unexplained failed suspensions:
Since resume is currently not failing, I'm starting to think possibility three is likely not the case, but it's still early days. For now the priority is to keep the machine working, even if that means using unnecessary safety margins. Having resume fail is a bother -- the machine must be reset and rebooted, the drives aren't cleanly unmounted, and your whole workflow is interrupted. So policy is to stay on the safe side -- it is such a pleasure to have things reliably working, so knock wood. I've added date (and temperature) stamps to the sleep script (sov), so I'll have a detailed record of how many suspensions work.

In spite of all my own warnings, I tested sov2, which borrowed elements from the kernel's contributed script, and it failed miserably. There seems to be a problem with the drive freezing, because the failure occurs before the command to wake up the graphics card is issued. A likely culprit: the cool script, which spins the drive down after one minute of inactivity. Some idiosyncratic event like that is not a bad candidate for the problems I'm seeing. Once suspend to RAM succeeds, I typically don't run the cool script again -- and after running nvx once, suspend tends to work. After reverting to a spin-down after ten minutes of inactivity (hdparm -S 120; the value counts in units of five seconds, so 120 means 600 seconds), I've not had problems.

Well, the problems keep returning -- every so often, the hard drive doesn't start up. The problem is not the screen; it wakes up every time the drive wakes up. In the sov1 script I used "echo 3 > /proc/acpi/sleep" rather than "echo mem > /sys/power/state", which works fine, but I have so far no reason to assume it's any better. Using the sov script, I ran tests of 13 cycles without finding problems, but every so often the problem does recur. I still don't see a pattern in the occasional failures.

I tested sleep mode in WinXP and it works flawlessly, even with very brief pauses between sleep and awake. I tested a dozen times and had no problems.
The LEDs light up in just the same patterns as they do in Linux. Of course afterwards Linux passed the same test.

I guess I could do the following: if it is critical that I not lose the material, suspend to disk. If you can afford a loss, suspend to RAM. So far the latter is not entirely reliable -- but the failure rate is likely somewhat higher than needed because I've forgotten to unload modules, or some other factor listed above matters. In brief, keep using sleep (sov; try sov1 next time you get problems) but use sus if it's critical that you keep the session data.

Warnings and suggestions
The miracle of suspending to RAM is that you get an ice-cold
machine in x-windows. No disk activity, just cold and silent
applications, like a Grecian urn.

Suspending modules

Once it's reasonably clear that suspending to RAM from a minimal system, with no modules loaded, works reliably, I may begin to experiment with some loaded modules. The sov script currently assumes the cool script has already been run, which removes all but a few USB modules for mouse support. The sov script itself removes USB modules before suspending and adds them back in after, though this isn't absolutely necessary -- it's just that if you unplug a mouse while it's sleeping (which I might well do), that could apparently cause a major upset, possibly leading to a failed resume (not tested). Anyway, the current system is working, so keep it for now.

Other modules haven't been tested -- for instance, will nvidia with native agp suspend to RAM? It now suspends to disk, so it might suspend to RAM too. However, the kernel documentation's power/video.txt says about another laptop that it needs the open-source nv driver to resume. In any case, there are downsides to using the nVidia module; it's very large, and significantly lengthens the time it takes to resume. For normal work, there's no advantage and some disadvantages in terms of resource consumption.

What was needed to make this work

There's a known problem with suspend to RAM: the video card doesn't initialize properly on resume. See /usr/src/linux/Documentation/power/video.txt for suggestions; here's what worked. I installed vbetool and put each of the six options in scripts called 1 through 6:
I then ran the sleep script: sudo chmod 664 /sys/power/state and pressed the lid sensor to resume. Once resumed, I tried each of the six scripts in turn -- and number 5 worked! All the others failed to reactivate the monitor. The "post" command runs the BIOS code located at c000:0003, which is the code run by the system BIOS at boot in order to initialize the video hardware, so it's pretty direct and intrusive stuff, but it works beautifully. The command must be run from a text console, as it will otherwise interfere with the operation of X (man vbetool).

Wow -- I got it back. I think the first time I tried this was a year ago; I had no idea what the problem was, and it didn't occur to me it could be just the monitor that needed waking up! I also needed vbetool, as noted in /usr/src/linux/Documentation/power/video.txt.

A week later, I noticed the kernel's power/video.txt documentation has this recipe for using vbetool, which I really should try at some point (my comments in the script):
As you can see, this is a suspend-to-RAM script that is actually very similar to my own -- with some sophisticated elaborations that of course may or may not be genuine improvements. It even has the extra switch to a console and back that I discovered was needed. I incorporated most of the suggestions into the sov2 script, on the assumption this is more robust. I can also cut the ten-second wait in x-windows this way. However, the machine froze solid -- no hard drive activity even. So something is seriously wrong with sov2.

On the other hand, something is seriously right with my own sov script. By mistake I forgot to run the cool script first, and ended up suspending with oodles of modules loaded! My script handled sleep and resume extremely robustly -- that is to say, it ran into a loop, suspending fine, but on resuming it would attempt to suspend again at once. So that's not good, but its problem had nothing to do with waking up the console -- that worked every time! So for the moment, just live with it and see if it has weaknesses or not.

Various suggestions from others (there may still be some good ideas here).

First, one I tried that failed:

1. Edit /etc/default/acpi-support and uncomment the ACPI_SLEEP=true line

In brief, I added "pci=noacpi acpi_sleep=s3_bios" to the options in lilo, but this didn't do the trick. I don't have an /etc/default/acpi-support file, and don't know if there's a Debian equivalent.

Or these suggestions for a Sony Vaio (the instructions may be Radeon-specific):
This particular method of switching from one virtual terminal to another didn't work. However, once I had been made aware of the command, I was able to make it work by timing when it was OK to use it -- essentially, you can do it at the beginning, before you go to sleep, and at the end, after you have woken up. Or try this one:
Note that this is essentially the method that I discovered works on my machine, though not using the RadeonTool of course, and I've had no luck suspending directly from x-windows. I could ask the author of vbetool perhaps? Or just live with having to switch to a console first. Actually I think I see a way around this: before suspending, issue chvt 2, and after resuming, issue chvt 4. Put in a delay if you need to. OK -- that worked -- but you need a fairly long delay, around 5 seconds, or you get nothing and have to shut down. In retrospect, the problem may not have been the length of the delay in the script, after resuming, but rather that you have to wait to press the lid sensor. If you don't wait at least ten seconds or so, resume will fail. This effect in fact may have compromised some of my earlier tests, rendering the results unpredictable -- the curious thing is that I developed a gut-level instinctual expectation that something would go horribly wrong that was quite reliable -- and this intuition in turn, when validated and reflected on, made me realize that the script failed when I pressed resume too soon. A bayesian network had apparently been built -- or is this simple Hebbian learning? No, doesn't Hebbian learning simply mean that what's done once is easier the second time? A bayesian network creates expectations based on inferences about the probability of something happening that hasn't happened yet -- perhaps especially something bad? Are bayesian networks particularly easy to build to detect patterns of aversive stimuli? The reset pinhole, incidentally, only turns power off, but it does so also when the power button fails. Finally, I figured out how to use fgconsole to check if we're in windows (vt > 3) and created a reasonably elegant script, even if I still don't know how to ask if a variable has been assigned or not. 
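The console-switching logic described here can be sketched roughly as follows. This is not the actual sov script, only an illustration built on what these notes report: text consoles are assumed to be vt 1 to 3, X is assumed to sit on a higher vt, and a pause of about five seconds is assumed before switching back. The function names (is_x_vt, suspend_plan, is_assigned) are made up for the example. The steps are printed rather than run, so the plan can be read before anything touches /sys/power/state. The last function shows one POSIX way of asking whether a variable has been assigned.

```shell
#!/bin/sh
# Illustrative sketch only; function names are hypothetical.

# Return 0 if the given vt number belongs to X.
# Assumption taken from the notes: text consoles are vt 1-3.
is_x_vt() {
    [ "$1" -gt 3 ]
}

# Print the commands for one suspend-to-RAM cycle started from vt $1.
# Nothing is executed here; the caller decides what to do with the plan.
suspend_plan() {
    vt=$1
    if is_x_vt "$vt"; then
        echo "chvt 2"                  # leave X before suspending
    fi
    echo "echo mem > /sys/power/state" # suspend to RAM
    echo "vbetool post"                # re-initialize the video card on resume
    if is_x_vt "$vt"; then
        echo "sleep 5"                 # the pause the notes found necessary
        echo "chvt $vt"                # return to the original vt
    fi
}

# Return 0 if the variable named in $1 has been assigned, even to "".
# ${NAME+set} expands to "set" only when NAME is assigned.
is_assigned() {
    eval "[ -n \"\${$1+set}\" ]"
}
```

Running suspend_plan "$(fgconsole)" prints the steps for the current terminal; feeding that output to a root shell would carry them out.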
I found that switching virtual terminal to vt 8 (not used) and back again helped restore the right horizontal alignment in the focus terminal and it doesn't seem to create any problems. On the whole I'm pleased with the script; it handles things quite intelligently. Now what I still don't know how to do is this: I want the commands in the script to go straight to the kernel, not be shown on the command prompt (when I press the programmed vpr key). Now I cannot use that key while I'm in pico, for instance, since it will just output the name of the script in my text rather than execute it. I could ask Shinn Wu. 6 August 2005 Update: the nvidia driver (Rome) Does the nvidia driver support software suspend (swsusp), or will it soon do so? Here's what the app-n of nVidia's readme says: DPMS modes "suspend" and "standby" do not work correctly on a flat panel display, or on a second CRT when using TwinView. The screen becomes blank instead of the flat panel being set to the requested DPMS state. More generally, app-s of nVidia's readme says that the driver supports suspend and resume based on APM, but does not support ACPI. The instructions also say that Linux 2.6 AGPGART supports suspend, but only for some chipsets -- so does it for my ALi1541? The instructions also mistakenly claim that Linux 2.6 does not support suspend to disk. I see a SuSE instruction page claiming that the nvidia module will suspend fine as long as it's using its own internal agp and not ali-agp. I discovered on 14 August 2005 that I can use nVidia's internal agp with no performance loss -- around 1300 FPS. Check this with "cat /proc/driver/nvidia/agp/status". If there is no line "Status: Enabled", then AGP support is not available. I checked and found it was enabled. And in fact suspend works! Both suspend and resume work without problems. 
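The AGP check just described could be wrapped in a small function so that a suspend script can test it before going ahead. This is only a sketch: the function name nvidia_agp_enabled is made up, the default path is the one quoted in these notes, and the exact spacing of the "Status: Enabled" line is an assumption (the pattern allows any amount of whitespace).

```shell
#!/bin/sh
# Hypothetical helper: succeed if the nvidia AGP status file says "Enabled".
# $1 is optional and lets you point the check at another file for testing.
nvidia_agp_enabled() {
    status_file=${1:-/proc/driver/nvidia/agp/status}
    # A missing or unreadable file counts as "not enabled".
    [ -r "$status_file" ] &&
        grep -q 'Status:[[:space:]]*Enabled' "$status_file"
}
```

For example, nvidia_agp_enabled || echo "AGP support is not available" reports the failure case.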
Maybe I'll have problems down the line, but for the moment it looks like even the nVidia driver supports suspend to disk, provided it is using its own internal AGP. This means suspend to disk works fine with these modules loaded: Module Size Used by If nvidia suspends, it may well be that most other modules also suspend fine now -- note for instance that agpgart has no problems. Suspend and resume both take a fair bit longer with the nvidia module -- maybe twice or three times as long. 1 August 2005 Update (still Cambridge) When I press the lid sensor, a buzzing sound stops and the screen goes blank. That buzzing is more or less the only sound the vpr is making now, in cool mode, but I imagine I can't fix it. I could run acpid now that I've verified acpi is working, at least
for suspend to disk. It appears to rely on laptop-mode-tools. I guess
it would be nice if the suspend script was triggered simply by closing
the lid. Well -- that worked! I just modified
/etc/acpi/actions/lm_lid.sh to run /usr/local/bin/sus and now the
system suspends when the lid-sensor is pressed -- of course acpid has
to be running. However, the lid sensor is too short to actually sense
that the lid is closed (unless you press on the closed hinge), so I'll
look to have that fixed during the next repair and in the meantime use
the vpr key. That's generally better anyway, since I don't generally
close the lid. 30 July 2005 Update: suspend to disk working and mapped to vpr key (Cambridge) On 26 July 2005 I tested suspend to disk (echo 4, or
hibernate), and it worked! I had issued "cool" and "sync" first to keep
things as simple as possible. The machine turns off, but resumes when
turned on again. The idea is to save time booting, and return to where
you were working.

Through laborious experimentation I also succeeded in mapping the vpr matrix key to the suspend script; see Keyboard.

From /usr/src/linux/Documentation/power/swsusp.txt: You need to append resume=/dev/your_swap_partition to kernel command line. Then you suspend by

Note the content of /sys/power/state:

I have some leftover files from an earlier swsusp
installation, including /usr/local/sbin/suspend and /etc/suspend.conf.
These provide information about what is needed to make suspend work,
but since I got a customized solution you should perhaps just remove
these files. In the Sigillo directory I left some swsuspend2 files, and
in /usr/src I left some packages. Things are already working so I'm not
very motivated to do anything more. 10 December 2004 Update Sleep and software suspend is slowly being sorted out in the kernel -- check out if this works:
20 Aug 2004 -- new version Installing new version of config file /etc/laptop-mode/laptop-mode.conf ...Also installed and started acpid. 17 August 2004
Package: laptop-mode-tools in Debian Maintainer: Bart Samwel <bart@samwel.tk> I've never seen a kernel option -- I just built the 2.6.7 kernel -- but read Documentation/laptop-mode.txt 2 January 2004 Update The 2.6 kernel branch is finally getting Jens Axboe's "laptop-mode" patch, ported to 2.6 by Bart Samwel.
The discussion started in late December and version 5 was posted on 2
January 2004. Version 6 was merged in Linux 2.6.2-rc1-mm2 and includes a kernel configuration option
and Documentation/laptop-mode.txt. Bart also wrote a script I made a copy of in
sigillo:/root/scripts/spindown.sh that requires laptop-mode -- make
sure to write him about it before you use it. Frequency scaling 17 August 2004 cpufreqd
Aug 19 13:55:25 sigillo cpufreqd: scan_system_info(): battery present - 100 - on-line 27 July 2004 powernowd
cpufreq: CPU0 - ACPI performance management activated.
cpufreq: *P0: 2000 MHz, 20000 mW, 250 uS
cpufreq: P1: 1200 MHz, 10000 mW, 250 uS

Note, however, that only two steps are handled, although the wattage is halved. When I use the powernowd package with this setting, I get only those two values. If, in contrast, I activate P4_CLOCKMOD, I get eight possible values in powernowd. The net result, however, still appears to be superior with CONFIG_X86_ACPI_CPUFREQ -- I'm basing this judgment on lower fan speeds after leaving the machine alone for a while -- perhaps because the voltage is also decreased? I guess I could ask the guy who wrote powernowd about this. You can also use the cpufreqd package, but I don't have any reason to think it's better than powernowd.

I installed powernowd on 27 July 2004 and it is working fine.

Config file /etc/init.d/powernowd (use sysv-rc-conf or sysvconfig to activate/deactivate)

To see the current CPU speed, issue

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed

Here's what I get with the CONFIG_X86_P4_CLOCKMOD setting -- 8 possible speeds:

# powernowd -h for options
# man powernowd
# powernowd -vvv
powernowd: PowerNow Daemon v0.90, (c) 2003-2004 John Clemens
powernowd: Settings:
powernowd: verbosity: 3
powernowd: mode: 1 (AGGRESSIVE)
powernowd: step: 100 MHz (100000 kHz)
powernowd: lowwater: 20 %
powernowd: highwater: 80 %
powernowd: poll interval: 1000 ms
powernowd: Found 1 cpu:
powernowd: cpu0: 250Mhz - 2000Mhz
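Since the sysfs file reports the speed in kHz, a small wrapper can turn it into MHz, which is easier to compare with the powernowd output. This is a sketch under assumptions: the function name cpu_mhz is made up, the default directory is the one used in these notes, and the file is assumed to hold a plain number only while a userspace governor (as used by powernowd) is active; anything non-numeric is treated as "no answer".

```shell
#!/bin/sh
# Hypothetical helper: print the speed from scaling_setspeed in MHz.
# $1 is optional and names the cpufreq directory (useful for testing).
cpu_mhz() {
    dir=${1:-/sys/devices/system/cpu/cpu0/cpufreq}
    khz=$(cat "$dir/scaling_setspeed" 2>/dev/null) || return 1
    # Reject empty or non-numeric content instead of doing bad arithmetic.
    case $khz in
        ''|*[!0-9]*) return 1 ;;
    esac
    echo "$((khz / 1000))"
}
```

With the 1.2GHz setting mentioned in the summary, the file would hold 1200000 and the function would print 1200.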
Suspend to RAM ACPI suspend to ram (untested):
I discovered that the noflushd utility has just been updated for the
2.6 kernel -- it's maintained by Daniel Kobras at sourceforge. I got
the cvs version of version 2.7 dated January, 16th 2004, configured
with ./configure --with-scheme=debian and issued make from sigillo; it
built fine. I then did a checkinstall -D on gubbio and built the debian
package. Daniel writes, "on a Debian system, you can simply run
'dpkg-buildpackage -r -us -uc -b' and get a Debian package containing a
simpler init script with less compatibility cruft in it." On 21 January 2003 Daniel released 2.7.1 in Debian and I installed it. The control script is /etc/init.d/noflushd and the
configuration file /etc/default/noflushd. I set the timeout from 30
minutes. You can start and stop it using wajig: just start noflushd

/etc/fstab has noatime:

/dev/hda5 / ext3 defaults,noatime,errors=remount-ro 0 1

I changed ext3 to ext2, as journaled file systems bypass the buffers used by noflushd. To make sure you don't lose data, issue sync before leaving the machine; that should flush the hard drive memory buffers to disk.

From the README file: Make sure to tell syslogd not to sync logfiles after each write. (Go to /etc/syslog.conf and prepend all the less important logfiles with a '-'.) Also look out for cron jobs.

Early notes

In late April 2003 I installed the 2.5.69 kernel on Sigillo and it
handles battery detection perfectly.

ACPI

I removed apmd.

"Notice that you may mount your ext3 partitions with the noatime ..."

I added noatime to /etc/fstab for the ext3 partitions:

/dev/hda5 / ext3 defaults,noatime,errors=remount-ro 0 1

I then ran this command to activate the new value:

root@sigillo:/home/steen# mount -oremount /

Verify this worked with cat /proc/mounts.
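Reading /proc/mounts by eye works, but the check can also be scripted. The sketch below assumes the usual field layout of that file (device, mount point, type, comma-separated options); the function name has_noatime is made up for the example, and the second argument exists only so the function can be tried on a sample file.

```shell
#!/bin/sh
# Hypothetical helper: succeed if mount point $1 is mounted with noatime.
# $2 is optional and names the mounts file to read.
has_noatime() {
    mountpoint=$1
    mounts=${2:-/proc/mounts}
    awk -v mp="$mountpoint" '
        # Field 2 is the mount point, field 4 the option list.
        $2 == mp {
            found = 0               # the last matching entry wins
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "noatime") found = 1
        }
        END { exit found ? 0 : 1 }
    ' "$mounts"
}
```

After the remount, has_noatime / && echo "noatime is active" confirms the setting for the root partition.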
The noflushd daemon watches the activity of drives and spins them down after a certain time without access. The timeout can be defined by the user. At the same time it watches the kernel update daemon (kupdated), which periodically flushes all buffers to disk. So noflushd prevents kupdated from writing the buffers to disk if the disk is spun down and there are no other disk accesses.

On 19 January 2003 I installed noflushd -- version 2.6.3.1 from unstable. This is the recommended command line for the most common laptop setup:

noflushd -n 60 /dev/hda

I issued this -- but the noatime may be just as important.

Installing noflushd makes no sense if no additional steps are taken to reduce hard disk activity. One thing to do is to set the noatime attribute for mount. Usually Linux logs the most recent access time for all files, which leads to many brief disk accesses. That logging is not essential, either for Linux or for the user. Setting the attribute noatime for a partition in /etc/fstab prevents Linux from logging the access time of files -- I already did this (see above).

The noatime option seems to have made a big difference, but I may have to issue noflushd -n 60 /dev/hda every time I start up -- where can I add it at startup? It turns out there's a /etc/default/noflushd file that is read by /etc/init.d/noflushd. I set timeout to 15 and moved noflushd into init 2 (actually it was already there, after the installation).

So that's some progress -- the ACPI may have to wait, but support actually looks pretty good, so you might want to start thinking about what it is you want to happen.