48 Commits

Author SHA1 Message Date
Jeff Brown
7dd2d19725 Emit a more meaningful cause for watchdog restarts.
Print the real thread name and a better indication of where the
hang was detected.

Bug: 10646480
Change-Id: Ic94742d0db08b8531cfd1429bb0026d6c30b779d
2013-09-06 16:23:54 -07:00
Dianne Hackborn
fa012b35b8 Improve watchdog error reporting.
We now keep track of all the threads that are stopped, and
print stacks for all of them.  Also more threads are now adding
themselves to the watchdog.

Unfortunately the stack we get from threads is far less useful
than the stacks from the ANR report, because these don't include
any information about the lock the thread is blocked on and what
thread is holding that lock.  For example, here is a test of the
log output from causing a hang in the system process:

W/Watchdog( 5205): *** WATCHDOG KILLING SYSTEM PROCESS: com.android.server.am.ActivityManagerService, main thread
W/Watchdog( 5205): foreground thread stack trace:
W/Watchdog( 5205):     at com.android.server.am.ActivityManagerService.monitor(ActivityManagerService.java:14333)
W/Watchdog( 5205):     at com.android.server.Watchdog$HandlerChecker.run(Watchdog.java:142)
W/Watchdog( 5205):     at android.os.Handler.handleCallback(Handler.java:730)
W/Watchdog( 5205):     at android.os.Handler.dispatchMessage(Handler.java:92)
W/Watchdog( 5205):     at android.os.Looper.loop(Looper.java:137)
W/Watchdog( 5205):     at android.os.HandlerThread.run(HandlerThread.java:61)
W/Watchdog( 5205): main thread stack trace:
W/Watchdog( 5205):     at com.android.server.am.ActivityManagerService.broadcastIntent(ActivityManagerService.java:12252)
W/Watchdog( 5205):     at android.app.ContextImpl.sendBroadcastAsUser(ContextImpl.java:1158)
W/Watchdog( 5205):     at com.android.server.DropBoxManagerService$3.handleMessage(DropBoxManagerService.java:161)
W/Watchdog( 5205):     at android.os.Handler.dispatchMessage(Handler.java:99)
W/Watchdog( 5205):     at android.os.Looper.loop(Looper.java:137)
W/Watchdog( 5205):     at com.android.server.ServerThread.initAndLoop(SystemServer.java:1050)
W/Watchdog( 5205):     at com.android.server.SystemServer.init2(SystemServer.java:1125)
W/Watchdog( 5205):     at com.android.server.SystemServer.init1(Native Method)
W/Watchdog( 5205):     at com.android.server.SystemServer.main(SystemServer.java:1116)
W/Watchdog( 5205):     at java.lang.reflect.Method.invokeNative(Native Method)
W/Watchdog( 5205):     at java.lang.reflect.Method.invoke(Method.java:525)
W/Watchdog( 5205):     at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:774)
W/Watchdog( 5205):     at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:590)
W/Watchdog( 5205):     at dalvik.system.NativeStart.main(Native Method)
I/Process ( 5205): Sending signal. PID: 5205 SIG: 9

Change-Id: I8ff9892d8d072d8dc599a73de4bdb75e3b1a6e97
2013-05-13 18:01:54 -07:00
Dianne Hackborn
f6438b16ba Eradicate basically all of the system reboot stuff in the watchdog.
This was implemented for 1.0 out of paranoia about the possibility
of needing to schedule reboots of the system after it went out in
the field, which I am happy to say was never ever used.

Let's get rid of it.

A small stub is left, to still service the API that allows people
to send a reboot broadcast to have the system immediately reboot.

Change-Id: I6731b24a28340e50c8015f8cb28e48f74f69f9b7
2013-05-09 18:53:48 -07:00
Dianne Hackborn
efa92b2182 Cleanup some of the thread merging.
Adds an optimization for checking whether a looper is stuck,
with a new Looper method to see if its thread is currently
idle.  This will allow us to put a large number of loopers
in the monitor efficiently, since we generally won't have to
do a context switch on each of them (since most looper threads
spend most of their time idle waiting for work).

Also change things so the system process's main thread
is actually running on the main thread.  Because Jeff
asked for this, and who am I to argue? :)

Change-Id: I12999e6f9c4b056c22dd652cb78c2453c391061f
2013-05-07 15:33:26 -07:00
Kenny Root
1d69bad411 resolved conflicts for merge of a98b0ff8 to master
Change-Id: I1f4a952d360c48426e22a7772726b6867cc19771
2013-05-07 10:14:46 -07:00
Kenny Root
add582122d resolved conflicts for merge of 485d7a31 to master
Change-Id: I058e19af8732df44457bdc614ee810a642dc25e4
2013-05-07 09:51:31 -07:00
Dianne Hackborn
8bd64df2ad Help for the debugging help for issue #8734824.
Add a new "hang" am command that lets you hang the system
process.  Useful for testing.

Change-Id: Ice0fc52b49d80e5189f016108b03f9fd549b58a7
2013-05-06 16:07:26 -07:00
Dianne Hackborn
5b88a2fd7b Debugging help for issue #8734824: WATCHDOG KILLING SYSTEM PROCESS
IActivityController has a new callback which the Watchdog calls
when it detects that the system process is hung.  This may be
use full monkey.  All hail the monkey!

Also add a new private feature to Binder to be able to turn off
all incoming dump() calls to a process.  The watchdog uses this
when it reports it is hung, so that if someone, say, wants to
collect a bug report at this point they won't get stuck waiting
for things that are all busted.

Change-Id: Ib514d97451cf3b93f29e194c1954e29f948c13b1
2013-05-06 11:16:18 -07:00
Dianne Hackborn
98eb06a12e Fix build.
Change-Id: Ib8f99e5137ace23ba4bfa764e81cce1f9f7d1aa8
2013-05-02 19:50:00 -07:00
Dianne Hackborn
c9dc93e5ca Merge "Start combining threads in system process." 2013-05-03 01:48:55 +00:00
Dianne Hackborn
8d044e8bc2 Start combining threads in system process.
This introduces four generic thread that services can
use in the system process:

- Background: part of the framework for all processes, for
work that is purely background (no timing constraint).
- UI: for time-critical display of UI.
- Foreground: normal foreground work.
- IO: performing IO operations.

I went through and moved services into these threads in the
places I felt relatively comfortable about understanding what
they are doing.  There are still a bunch more we need to look
at -- lots of networking stuff left, 3 or so different native
daemon connectors which I didn't know how much would block,
audio stuff, etc.

Also updated Watchdog to be aware of and check these new
threads, with a new API for other threads to also participate
in this checking.

Change-Id: Ie2f11061cebde5f018d7383b3a910fbbd11d5e11
2013-05-02 17:42:40 -07:00
Michael Wright
56a6c66158 Dump system server stack trace on watchdog failure
Change-Id: Ia70b116b789a51c4fbdd31348f127685f20f7500
2013-05-01 17:15:56 -07:00
Michael Wright
4da44d144a Improve watchdog explanation when system server is blocked
Change-Id: I5965a2b01c474cbe2a1ab342c3520b3d403d92e8
2013-04-25 14:44:55 -07:00
Jean-Christophe PINCE
1d08e599fe Watchdog race condition when sending message out of synchronized
A race condition leading to false positive detections might occur
when the monitoring thread executes very fast and terminates before
the sending thread entered the synchronized section.

Change-Id: I6fe686f8f12393e11fa18326508af5b73738f9d7
Author: Jean-Christophe PINCE <jean-christophe.pince@intel.com>
Signed-off-by: Jean-Christophe PINCE <jean-christophe.pince@intel.com>
Signed-off-by: Shuo Gao <shuo.gao@intel.com>
Signed-off-by: Bruce Beare <bruce.j.beare@intel.com>
Signed-off-by: Jack Ren <jack.ren@intel.com>
Author-tracking-BZ: 81644
2013-04-09 15:02:40 -07:00
Michael Wright
8fa56f60a7 Lock when obtaining the current monitor name.
Also, remove some dead code.

Change-Id: I0e65671f9ca43addd8fc44dcd878bcff2f588e42
2013-04-01 16:46:36 -07:00
John Michelau
116415271b Fix Watchdog HeartbeatHandler to run on correct thread
The HeartbeatHandler for the System Server Watchdog has been running
on the wrong thread due to a race condition in initialization.  It's
designed to run on ServerThread, so that it can catch lockups in the
main looper of the System Server.  It has been running on
ActivityManagerThread instead, so it does not detect lockups on the
ServerThread as it should.

ActivityManagerService is calling Watchdog.getInstance() before
ServerThread calls Watchdog.getInstance().init(), so the handler is
being bound to the ActivityManagerThread instead of the ServerThread.

Explicitly bind HeartbeatHandler to ServerThread, so that the Watchdog
catches lockups on this critical thread.

Change-Id: Iccb184ac3adb817feb86ed4ee0e50e443bf74636
2013-03-19 14:39:33 -05:00
Colin Cross
5df1d871fe trigger kernel blocked stack trace on system server watchdog
Bug b/7638530 may be caused by a kernel deadlock when killing
processes under low memory conditions.  Write to /proc/sysrq-trigger
to get a kernel log of blocked tasks before killing the system server.

Bug: 7638530
Change-Id: I60df324ad4affdadbf13650099dc4dfb38722420
2012-11-29 12:44:09 -08:00
Dianne Hackborn
c428aae642 Fix issue #7267494, issue #7212347
7267494 Calendar is not syncing
Check for whether a content provider is dead before returning
it.  This is kind-of a band-aid, but probably the right thing
to do; I'm just not sure exactly the full details of why this
problem is happening.  Hopefully this "fixes" it, though I don't
have a way to repro to tell.

7212347 System power off dialog is only visible to user 0
Make it visible.  Also turn on some battery debugging stuff and
clean it up so we can just keep it.

Change-Id: I5add25bf2a763c8dfe1df23bc5c753a9ea5d157a
2012-10-03 18:07:23 -07:00
Jeff Brown
a4d8204e30 Fix some synchronization issues in BatteryService.
Some of the BatteryService state was being locked
sometimes and it wasn't at all consistent.

Bug: 7158734
Change-Id: I46e75f66fde92c5a577a80a6bd99c9573066f3c1
2012-10-02 19:11:19 -07:00
Jeff Sharkey
4de9936e85 Remove unused Secure settings.
Carefully leave default values intact in Watchdog for now.

Bug: 7232007, 7232230
Change-Id: Id944181109305aed41e0766fdd39625b43cb1d19
2012-09-26 17:58:19 -07:00
Jean-Baptiste Queru
0b5a4a1513 am 11626a91: am 9eb3bd88: am 42a58ecd: Merge "Revert "Watchdog: Improvement of debuggability""
* commit '11626a91b6e695e7a8fa9e9a9f1a37df11cfb4e2':
  Revert "Watchdog: Improvement of debuggability"
2012-09-05 11:07:21 -07:00
Jean-Baptiste Queru
784827b27c Revert "Watchdog: Improvement of debuggability"
This reverts commit 9211b13c3268035b0da0c51ed2d6d5a578d45ff3.
2012-09-04 13:35:12 -07:00
Jean-Baptiste Queru
4698e36db6 am 6ab3ea5f: am 147ef944: am 60d1e1a0: Merge "Watchdog: Improvement of debuggability"
* commit '6ab3ea5f48abfd777d5bd18d92acc3bc766f78ce':
  Watchdog: Improvement of debuggability
2012-08-30 13:54:18 -07:00
rikard dahlman
9211b13c32 Watchdog: Improvement of debuggability
If the watchdog detects a problem the system server process
is killed, that is followed by a crash. Because the crash is
done after the system server process is killed, the crash
don't contain info about the system server.
This improvement will make sure that the system is crashed
before the system server process is killed.
Behavior is only changed for eng and userdebug builds.

Change-Id: I9f1c8fd8b03d0114032ed44fb582705ad0b49733
2012-08-30 12:27:50 +02:00
Jeff Brown
9630704ed3 Power manager rewrite.
The major goal of this rewrite is to make it easier to implement
power management policies correctly.  According, the new
implementation primarily uses state-based rather than event-based
triggers for applying changes to the current power state.

For example, when an application requests that the proximity
sensor be used to manage the screen state (by way of a wake lock),
the power manager makes note of the fact that the set of
wake locks changed.  Then it executes a common update function
that recalculates the entire state, first looking at wake locks,
then considering user activity, and eventually determining whether
the screen should be turned on or off.  At this point it may
make a request to a component called the DisplayPowerController
to asynchronously update the display's powe state.  Likewise,
DisplayPowerController makes note of the updated power request
and schedules its own update function to figure out what needs
to be changed.

The big benefit of this approach is that it's easy to mutate
multiple properties of the power state simultaneously then
apply their joint effects together all at once.  Transitions
between states are detected and resolved by the update in
a consistent manner.

The new power manager service has is implemented as a set of
loosely coupled components.  For the most part, information
only flows one way through these components (by issuing a
request to that component) although some components support
sending a message back to indicate when the work has been
completed.  For example, the DisplayPowerController posts
a callback runnable asynchronously to tell the PowerManagerService
when the display is ready.  An important feature of this
approach is that each component neatly encapsulates its
state and maintains its own invariants.  Moreover, we do
not need to worry about deadlocks or awkward mutual exclusion
semantics because most of the requests are asynchronous.

The benefits of this design are especially apparent in
the implementation of the screen on / off and brightness
control animations which are able to take advantage of
framework features like properties, ObjectAnimator
and Choreographer.

The screen on / off animation is now the responsibility
of the power manager (instead of surface flinger).  This change
makes it much easier to ensure that the animation is properly
coordinated with other power state changes and eliminates
the cause of race conditions in the older implementation.

The because of the userActivity() function has been changed
so that it never wakes the device from sleep.  This change
removes ambiguity around forcing or disabling user activity
for various purposes.  To wake the device, use wakeUp().
To put it to sleep, use goToSleep().  Simple.

The power manager service interface and API has been significantly
simplified and consolidated.  Also fixed some inconsistencies
related to how the minimum and maximum screen brightness setting
was presented in brightness control widgets and enforced behind
the scenes.

At present the following features are implemented:

- Wake locks.
- User activity.
- Wake up / go to sleep.
- Power state broadcasts.
- Battery stats and event log notifications.
- Dreams.
- Proximity screen off.
- Animated screen on / off transitions.
- Auto-dimming.
- Auto-brightness control for the screen backlight with
  different timeouts for ramping up versus ramping down.
- Auto-on when plugged or unplugged.
- Stay on when plugged.
- Device administration maximum user activity timeout.
- Application controlled brightness via window manager.

The following features are not yet implemented:

- Reduced user activity timeout for the key guard.
- Reduced user activity timeout for the phone application.
- Coordinating screen on barriers with the window manager.
- Preventing auto-rotation during power state changes.
- Auto-brightness adjustment setting (feature was disabled
  in previous version of the power manager service pending
  an improved UI design so leaving it out for now).
- Interpolated brightness control (a proposed new scheme
  for more compactly specifying auto-brightness levels
  in config.xml).
- Button / keyboard backlight control.
- Change window manager to associated WorkSource with
  KEEP_SCREEN_ON_FLAG wake lock instead of talking
  directly to the battery stats service.
- Optionally support animating screen brightness when
  turning on/off instead of playing electron beam animation
  (config_animateScreenLights).

Change-Id: I1d7a52e98f0449f76d70bf421f6a7f245957d1d7
2012-08-15 03:06:24 -07:00
Jeff Brown
4f8ecd8029 Move power manager to a new package.
Change-Id: I5f5a6435e64354b7d6535e8e9a63934ba7a3f448
2012-06-18 19:43:44 -07:00
Dianne Hackborn
f72467ad98 Include important native processes in watchdog stacks.
Helps us track down deadlocks involving native service processes.

Bug: 6615693
Change-Id: I580047550772e29586195a8cf440141574e3f40c
2012-06-08 18:36:48 -07:00
Jeff Sharkey
a353d2654a Differentiate between system_server and unknown.
Bug: 5531966
Change-Id: I2b64b04f3f5a8760a2314729e8b90e9dd6699cb4
2011-10-28 11:13:27 -07:00
Mathias Agopian
cf2317ef13 put the watchdog values back to what they should be
Change-Id: I4f394248c2f4c514f74b66fde3cb69bbed9ec796
2011-08-25 17:12:37 -07:00
Mathias Agopian
2370d0a14f make sure to re-initialize SurfaceTexture to its default state on disconnect
this caused problems where the NavigationBar would disapear or be
drawn in the wrong orientation.

Change-Id: I083c41338db83a4afd14f427caec2f31c180d734
2011-08-25 17:03:30 -07:00
Joe Onorato
43a17654cf Remove the deprecated things from Config.java. These haven't been working since before 1.0.
Change-Id: Ic2e8fa68797ea9d486f4117f3d82c98233cdab1e
2011-04-07 19:23:05 -07:00
Brad Fitzpatrick
9765c72eea Watchdog can get deadlocked on activity manager
Bug: 3351719
Change-Id: Ie5bb39e5ff92f41c14ae59240173fab9c2491a91
2011-01-14 11:47:01 -08:00
Dianne Hackborn
6b1afebdac Improve debug output when an ANR happens.
- Collect data at better times.
- Collect per-thread CPU usage as soon as possible after the ANR, and print
  in log.
- Based on new per-thread CPU usage, limit the number of processes we
  collect stacks from to not include inactive not interesting procs.
- Improve the way ProcessStats compute and reports its data.

Change-Id: I12b17fb47d593d175be69bb792c1f57179bf4fdf
2010-08-31 18:51:27 -07:00
Christopher Tate
c27181c7f3 Remove memory monitoring from the system watchdog
This was originally written as an in-case-we-need-it facility, but was
never actually used in production.  It also soaked up a surprising amount
of cpu on occasion, as well as doing sketchy things like demoting the
system_server's primary looper thread to the background cgroup at times.

Change-Id: I9a81a8d1e9caea9e0a1277d97785fe96add438d7
2010-06-30 14:53:39 -07:00
Christopher Tate
ecaa7b41ca Watchdog now records kernel stacks when it fires
The kernel threads are appended to the usual /data/anr/traces.txt file
and dropboxed along with the usual Dalvik stack dumps.

Change-Id: I120f1f5ee54c965efe9ac0c7f40fdef56385f1fa
NOTE: this change depends on the kernel publishing /proc/$PID/stack
2010-06-04 14:55:02 -07:00
Christopher Tate
6ee412d51d Also dump system process threads halfway through the watchdog interval
This gives us a snapshot of what the system process was doing after 30 seconds
of apparent inactivity as well as after 1 minute, to help distinguishing actual
deadlocks from too-slow progress, livelock, etc.

Change-Id: I19758861d1b25f298e88788e8f1c7ec7bf828823
2010-05-28 12:23:16 -07:00
Dan Egnor
4bded0744a Dump the phone process stack (as well as the system process) on watchdog reset.
Change-Id: I3c47086f9cc010f524da7de539942ea30d0338e3
2010-03-11 22:27:59 -08:00
Suchi Amalapurapu
6ffce2e9a3 Add new shutdown observer for MountService.
Use new observer before rebooting and shutting down.
Add some unit tests for unmount and shutdown code paths
Fix registering/unregistering part in MountService
Use ShutdownThread in PowerManager.reboot()
Add reboot support to ShutdownThread.
Remove MountService code from PowerManagerService.java and Power.java.
Clean shutdown/reboot is handled exclusively by ShutdownThread now.

Change-Id: Iefb157451d3d9c426cb431707b870a873c09123d
2010-03-09 17:00:18 -05:00
Dan Egnor
9bdc94b7a4 Improve watchdog diagnostics.
Capture stack traces from the system process using the same
mechanism as ANRs (which will initialize traces.txt, etc).
Also record the watchdog reset in the dropbox for uploading.

Bug: 2475557
2010-03-04 17:31:27 -08:00
Joe Onorato
8a9b22056b Switch the services library to using the new Slog 2010-03-01 13:06:50 -08:00
Doug Zongker
f68888951a move Watchdog's settings from Gservices to Secure
Change-Id: Iac1146dafa12f9247874514c9aeefa5f8f83933d
2010-01-06 16:38:14 -08:00
Doug Zongker
ab5c49c7e7 move event log tags used by system server into this package
We can now locate event log tag definitions in individual packages
(and java constants for the tag numbers get auto-generated), so move
all the tags used by the system server into the package.
2009-12-04 10:31:43 -08:00
Dianne Hackborn
723738cfae Expand support for different screen sizes.
Applications can now declare that they support small, normal, or
large screens.  Resource selection can also be done based on these
sizes.  By default, pre-Donut apps are false for small and large,
and Donut or later apps are assumed to support all sizes.  In either
case they can use <supports-screens> in their manifest to declare
what they actually support.
2009-06-26 13:37:05 -07:00
The Android Open Source Project
9066cfe988 auto import from //depot/cupcake/@135843 2009-03-03 19:31:44 -08:00
The Android Open Source Project
d83a98f4ce auto import from //depot/cupcake/@135843 2009-03-03 18:28:45 -08:00
The Android Open Source Project
b798689749 auto import from //branches/cupcake/...@125939 2009-01-09 17:51:23 -08:00
The Android Open Source Project
f013e1afd1 Code drop from //branches/cupcake/...@124589 2008-12-17 18:05:43 -08:00
The Android Open Source Project
54b6cfa9a9 Initial Contribution 2008-10-21 07:00:00 -07:00