194 Commits

Author SHA1 Message Date
Chenjie Yu
e1361ed422 Adjust 1st bucket start time
adjust 1st bucket start time for a partial bucket
also make valuemetric and gauge metric pull on first bucket

Bug: 111607838
Bug: 111660710
Bug: 111842941

Test: unit test
Change-Id: I5932c2258f8deac57e7abbf26f3214f87914a964
2018-07-27 10:53:38 -07:00
Chenjie Yu
a0f0224906 ValueMetric supports multiple aggregation types
1. Add support for MIN, MAX, AVG
2. ValueMetric also allow floats now, in addition to long data type.
AnomalyDetection still takes long only. I am not sure if it makes
sense to do anomaly on AVG. I will leave that for later.
3. ValueMetric supports sliced condition change for pushed events.
I don't think it makes sense for pulled events to have sliced condition
changes so leave it for now.

Test: unit test
Change-Id: I8bc510d98ea9b8a6eb16d04ff99dce6b574249cd
2018-07-13 10:24:41 -07:00
Yao Chen
5bfffb54da Clean up TODOs in statsd
+ Created bugs for those TODOs that are still relevant.
+ Remove obsolete TODOs.

Test: no code change.
Change-Id: I41c2a89a882f087817ee6cbc3f095e1d80e1928e
2018-06-25 11:08:04 -07:00
Chenjie Yu
e22192071d StatsPullerManager not use singleton
This is to be consistent with other patterns such as UidMap.
This also makes unit test simpler.

Change-Id: I1558cd609e470481f269ecf2ae616277a95cfbf0
Bug: 72722120
Test: unit test
2018-06-14 15:46:54 -07:00
Bookatz
d27ab45ad3 Remove TODO in statsd AnomalyTracker_test
The underlying item the TODO is referencing had already been resolved
so the test line can be properly added, per the TODO.

Change-Id: I5c16e7ea319bd16e37475381def656b38f39d17f
Fixes: 80095149
Test: make statsd_test && adb sync data && adb shell data/nativetest64/statsd_test/statsd_test
2018-05-24 10:35:02 -07:00
Yangster-mac
1c58f04cd3 Add a field in config to disable/enable the string hashing in metric report.
Statsd hashes (using its own hashing function) raw strings to reduce the
upload data size when there are duplicate strings in the report. And in cloud,
the clearcut translator would backfill the strings.

In a few droidfood users, we find the translator was unable to do that. While
debugging the root cause, we first decided to provide an option to disable
the hashing from the cloud.

Test: statsd unit test, CTS test, tested manually

BUG: b/79943763
Change-Id: If0359c8cf3f3cf83a2938db9ebf95ea7906f0b0c
2018-05-18 10:39:50 -07:00
Chenjie Yu
021e25307d ValueMetric pushed events should check condition
+ fix unit test flakiness

Bug: 79873404
Change-Id: I15b52a79b18c05603640781e4450e7b62fac24ba
Fix: 79873404
Test: unit test
2018-05-16 14:50:11 -07:00
David Chen
092a5a9b85 Fixes Value metrics in statsd and app upgrades.
Pulled value metrics with conditions had a subtle bug that caused
us to leave the condition on even if it should've been false.

Bug: 79778783
Test: Added unit-test and verified on marlin-eng.
Change-Id: I31f34791118319b3471f7a6ea8a024e2d511cfe7
2018-05-15 17:51:47 -07:00
David Chen
56ae0d9a48 Fixes statsd reports missing strings and SCS.
Reports written to disk don't contain the strings used, which will
make this report unusable if there are strings that don't show up
again. We should always include the strings, so this option is
removed entirely.

Also, we hard-coded the wrong number of fields when pulling
ModemActivityInfo. There are actually 10 fields, not 6.

Bug: 79601503
Test: Tested unit-tests pass on marlin-eng.
Change-Id: I6834b096ced77418a9cc2ddd79b08d1c9c447fae
2018-05-11 17:04:56 -07:00
TreeHugger Robot
a159842161 Merge "Fix the flaky gauge/value e2e test due to cached events." into pi-dev 2018-05-10 01:02:10 +00:00
android-build-team Robot
c3d0798455 Merge "Fix partial bucket unit tests." into pi-dev 2018-05-09 18:07:24 +00:00
Yangster-mac
58e609e339 Fix the flaky gauge/value e2e test due to cached events.
Test: statsd test
BUG: b/79265262
Change-Id: I4d67f1c2edb6215a3cea23f8c7b2e8d5099c4aac
2018-05-08 16:19:48 -07:00
David Chen
9e6dbbdadf Fix statsd returning uidmap with empty reports.
We notice devices uploading a bunch of bytes for the uidmap even if
the device is running an empty config, so there are no actual metrics
to report. This hardcodes some logic to skip the inclusion of the
uidmap if there are exactly 0 metrics.

Bug: 79381210
Test: Tested unit-tests on marlin-eng
Change-Id: I96348235341a7faf15ff57d4d1eccac635a3a999
2018-05-07 18:07:19 -07:00
Yao Chen
cc884dfc94 Fix partial bucket unit tests.
Bug: 79347749
Test: statsd_test
Change-Id: I69eee7172d6fe4ce895530f089193eb08653e269
2018-05-07 10:34:31 -07:00
David Chen
48944901f7 Fixes statsd returning too much data at once.
We observe a single ConfigMetricsReportList can be greater than the
safe size for the binder transaction buffer since we only check the
size of the current metrics in progress, but we also return the
previous reports stored on disk.

This change will attempt to send another ConfigMetricsReportList
as soon as possible if there's already a report on disk.

Also fixes a bug when trying to trigger data fetch before the client
has registered the corresponding dataFetchOperation.

Bug: 79201869
Test: Tested manually on marlin-eng
Change-Id: I2d3677162804a27e7a7a95d482d80c46bd994a67
2018-05-04 17:09:16 -07:00
Yangster-mac
892f3d3229 Reset statsd and correctly record the dump reason when system
server restarts/crashes.

Test: statsd test
BUG: b/79161505
Change-Id: I0646c764964f6eafde91f9ae0179a1c837af320d
2018-05-03 17:05:24 -07:00
Yangster-mac
9def8e3995 Reduce statsd log data size.
1. Hash the strings in metric dimensions.
2. Optimize the timestamp encoding in bucket.
   Use bucket num for full bucket and millis for
   partial bucket.
3. Encode the dimension path per metric and avoid
   deduping it across dimensons.

Test: statsd test
Change-Id: I18f69654de85edb21a9c835c73edead756295e05
BUG: b/77813755
2018-04-26 04:30:18 -07:00
Chenjie Yu
e36018b272 add dump report reason to reports
+ also change uidmapping version numbers to int64_t

Bug: 78132855
Change-Id: Iac7ea93e4bf651bd65bd03383e7ab4971af4fc29
Fix: 78132855
Test: gts test
2018-04-18 20:19:21 +00:00
David Chen
81245fd53a Adds option to drop small buckets for statsd.
We notice that some of the pulled metrics have a ton of data, and
during app upgrades, we're forming partial buckets that represent
small periods of time but require many bytes of data. We now have an
option to drop these buckets that are too short. Note that we still
have to pull the data to keep the metrics for the next bucket
correct. We include a new field in the value and gauge metric outputs
so that it's easy to tell when a bucket was dropped.

We drop the partial buckets also from anomaly detection since we
should be computing anomalies from the same data that is reported.

Test: Added unit-tests for value and gauge metrics.
Bug: 77925710
Change-Id: Ic370496377c6afd380e02278a6c1ed8b521a2731
2018-04-16 18:42:14 -07:00
Jeff Sharkey
6b64925737 Protect usage data with OP_GET_USAGE_STATS.
APIs that return package usage data (such as the new StatsManager)
must ensure that callers hold both the PACKAGE_USAGE_STATS permission
and the OP_GET_USAGE_STATS app-op.

Add noteOp() method that can be called from native code.

Also add missing security checks on command interface.

Bug: 77662908, 78121728
Test: builds, boots
Change-Id: Ie0d51e4baaacd9d7d36ba0c587ec91a870b9df17
2018-04-16 12:44:32 -06:00
TreeHugger Robot
6b317915e8 Merge "StatsManager throws exceptions" into pi-dev 2018-04-11 17:02:06 +00:00
Yao Chen
163d2602db Handle logd reconnect.
When statsd reconnects to logd, statsd will read all logs from buffer again. To prevent us from
reprocessing old events, we do the following:

1. At any given moment, record the largest timestamp(T_max) and last timestamp (check point) that
   we've seen before.
2. When reconnection happens, we look for the check point until we see a new log with a timestamp
   larger than T_max.
   -> If we found the CP, resume after the CP. Success
   -> If we can't find CP, there is definitely log loss. We reset all configs.

Note:
1. Logd has an API to read logs after a certain timestamp. But this api is vulnerable to
time changes from Settings. So we cannot rely on it.

2. If logd inserts a new log (with older timestamp) before CP, we cannot detect it. It's not
   possible to detect it without record all timestamps we have seen.

Test: statsd_test
Bug: 77813113

Change-Id: Ic3fdb47230807606ab11dc994cb162194adb8448
2018-04-10 22:06:03 -07:00
Yangster-mac
15f6bbc24f Flush the bucket when creating the metric producer.
Use int64 for value field.
E2e test for gauge/value metric.

BUG: b/74445671

Test: statsd test.
Change-Id: I823a0bade8f89834bdfb9cf48864852a47d7b63b
2018-04-10 20:25:13 -07:00
Bookatz
4f71629002 StatsManager throws exceptions
When StatsManager fails to connect to statsd, it now throws an exception
for the caller to catch. It also throws an exception of the config being
added is of an unreadable format.

Due to backwards compatibility issues, the old APIs could not be
changed, so new ones were made to replace the old ones. The old ones are
now temporary and will be removed when the compatibility issue is
resolved.

Bug: 77648233
Test: gts-tradefed run gts-dev --module GtsStatsdHostTestCases
Change-Id: Ibea05883a29b9b3ef9927d2f8fe295eb99832ab7
2018-04-10 19:07:32 -07:00
Chenjie Yu
ae63b0af94 Drop value if the bucket is totally tainted
Bug: 77870358
Change-Id: Ia96970a3254de08f94b91ad53be2fdb9f4db7eb4
Fix: 77870358
Test: unit test
2018-04-10 14:59:31 -07:00
Yangster-mac
e68f3a5811 Flush the partial bucket when startd shuts down or config updated.
Test: statsd test

BUG: b/77556036
Change-Id: Ie4a04ace55e07c4529cdff5906ba874f8815f620
2018-04-05 18:05:57 -07:00
David Chen
bd12527c90 Fix uid map to be simpler and fix partial bucket.
The previous scheme captured periodic snapshots for each config with
complex logic that's unnecessary and wasted memory. We actually don't
need to store any snapshots since we just convert the current state
into a snapshot and also include the deltas (change events) since the
previous report until now.

To make the system more robust, we also include up to 100 of the
deleted apps in the uid map.

Also, fix the wiring of the partial buckets so the metric producers
form partial buckets on both app upgrade and removal, but not on
installation of a new app.

Also, we update StatsCompanionService to also include disabled apps.

Bug: 77607583
Test: Verified unit-tests pass and added new e2e tests.
Change-Id: I98e1f544d6e6571545ae1581c4cebab807596f51
2018-04-05 16:15:01 -07:00
Yangster-mac
b142cc8add Statsd config TTL
Roughly check the config every hour to see whether the ttl expired.
If so, read the config from disk and recreate the metric manager.

Test: statsd test

BUG: b/77274363

Change-Id: I16838afe5bbe966c3a0f638869751f9b59a5a259
2018-04-04 15:59:43 +00:00
David Chen
faa1af535b Includes annotations with statsd reports.
It's tricky to determine the source of the metrics on a device
currently since we can take the union of multiple configs and send
only one giant statsd_config into statsd. We will use the int64 field
to track the sub config id's and the int32 field to track the version
for each sub config, but the fields are named more generically as
annotations.

The annotations are available in both the reports and metadata.

Test: Check that all unit-tests pass on marlin-eng
Bug: 77327261
Change-Id: Ic37c549c8b2991676f69948c515156765c9f5108
2018-04-03 18:20:40 -07:00
Yangster-mac
c04feba805 Move forward the alarm timestamp when config is added to statsd.
Test: statsd test
BUG: b/77344187

Change-Id: Ieacffaa29422829b8956f2b3fcb2c647c8c3eed9
2018-04-02 18:12:36 -07:00
TreeHugger Robot
46eef8d049 Merge "E2e test for periodic alarm." into pi-dev 2018-03-31 03:04:59 +00:00
TreeHugger Robot
3b826c6075 Merge "Statsd MAX now obeys refractory periods too" into pi-dev 2018-03-31 01:10:57 +00:00
Bookatz
0dbc7a4343 Statsd MAX now obeys refractory periods too
The logic of where refractory period enforcement was moved out of the
anomaly tracker and into the metric's predictAnomalyTimestamp. It was
done for ORING, but not for MAX. This fixes MAX.

Bug: 74446029
Test: adb sync data && adb shell data/nativetest64/statsd_test/statsd_test
Change-Id: I51e43c7c132f424af8fe20a37f2ad10cc55b5989
2018-03-30 16:21:17 -07:00
TreeHugger Robot
9fef2594f3 Merge "Clean up atoms.proto" into pi-dev 2018-03-30 20:48:09 +00:00
Chenjie Yu
5caaa9d854 Clean up atoms.proto
changes are:
1) for pushed atoms, use attribution node in place of uid when
appropriate
2) name changes to be more consistent

Bug: 73823969
Test: manual test
Change-Id: Iacf7186dbd7a2282f7fe481f43dbbf92e1165b47
2018-03-30 10:11:03 -07:00
Chenjie Yu
6d370f40fe Add unit test ValueMetricProducer on boundary
Mostly to add test to assure the corner cases are covered.
One minor logic change is if two true conditions happen, in the case
when following happen:
(bucket boundary1) -> (condition false) -> (condition true) -> (pull
triggered for the boundary1)
Previously we take the latest. Now we skip the late boundary pull.

Bug: 76384731
Test: unit test
Change-Id: I345c2210a58bf03eb91d65742573073d2668358b
2018-03-29 21:54:54 -07:00
Chenjie Yu
1a0a941c20 Fix StatsCompanionService pull on bucket ends
+ change StatsPullerManager internal time units to be consistent
+ use series of alarms for pullers, instead of use setRepeating

Bug: 76223345
Bug: 75970648
Test: cts test
Change-Id: I9e6ac0ce06541f5ceabd2a8fa444e13d40e36983
2018-03-29 00:11:13 -07:00
Yangster-mac
684d195227 E2e test for periodic alarm.
Test: new test

BUG: b/76281156
Change-Id: I60cb28baaeec6996e946a7cb3358ec8e0aca80e5
2018-03-27 13:26:20 -07:00
TreeHugger Robot
d9afdee26f Merge "Support sliced condition change in GaugeMetric" into pi-dev 2018-03-27 18:27:27 +00:00
Yao Chen
427d372552 Support sliced condition change in GaugeMetric
TODO: We need CTS to verify the behavior.

Bug: 73958484
Test: statsd_test
Change-Id: I56406983ddede12bc6a2e12188693a0c51ccae5c
2018-03-27 09:21:19 -07:00
TreeHugger Robot
cdc9d9008b Merge "Flush the past buckets in anomaly tracker when time jumps forward" into pi-dev 2018-03-27 07:21:21 +00:00
Yangster-mac
be10ddfe46 Flush the past buckets in anomaly tracker when time jumps forward
E2e test for count/duration anomaly trackers.

Test: new statsd tests.

BUG: b/74446029

Change-Id: Ia9be0240ba5021d44c1e1f096d67563e9138bb59
2018-03-26 18:44:01 -07:00
David Chen
35045cbc34 Fix uidmap in statsd.
Previously tried an optimization that results in corrupted proto
output. This changes to a safer approach of storing the snapshot data
in memory and only converting to proto output when the
ProtoOutputStream is provided.

Also fixes a security issue when trying to invoke triggerUidSnapshot
since we forgot to use SCS' permissions.

Test: Added a unit-test to verify output of StatsLogProcessor.
Bug: 76231867
Change-Id: Id410ce3505fda9d71caa71942ef3068b55872c66
2018-03-24 15:01:04 -07:00
Yao Chen
d50f2ae487 rename neq_all_string to neq_any_string in statsd_config
Bug: 73897465
Test: statsd_test
Change-Id: I1d020de873fc26fbb502f0b3487b85fdb3896753
2018-03-23 11:10:13 -07:00
TreeHugger Robot
8512630f8b Merge "Allow statsd to be given empty config." into pi-dev 2018-03-21 05:34:16 +00:00
Howard Ro
e51af37475 Merge "Fix recovery of stats data from previous input while using ProtoOutputStream" into pi-dev 2018-03-21 00:31:23 +00:00
David Chen
9fdd40302e Allow statsd to be given empty config.
Statsd clients may want to set an empty config temporarily, so it's
more convenient to allow them to set an empty config instead of
having to use the removeConfig and then having to remember to call
StatsManager#setDataFetchOperation.

Test: Added unit-tests and check they pass on marlin-eng.
Bug: 74997752
Change-Id: I2e762e5ec01e5a2c9a3469fb330b53fefbd734d6
2018-03-20 15:56:11 -07:00
yro
4beccbe3de Fix recovery of stats data from previous input while using
ProtoOutputStream

- Specify the length of message to avoid libprotoutil from thinking that
we are trying to write bool
- We only attach the previous dump file to the upload file where config
key matches
- Store ConfigMetricsReport (instead of ConfigMetricsReportList) onto
disk
- Stop use stack after scope in StorageManager
- Migrate UidMap to use ProtoOutputStream and renaming variables to
prevent confusion

Bug: 74021554
Bug: 75968524
Test: manual test, statsd_test, CTS tests
Change-Id: Iedf52633d7f5b985f5a934a3fb5a0c3c3b2e7fd1
2018-03-20 15:00:59 -07:00
TreeHugger Robot
f5de606f51 Merge "Deletes default allowed_log_sources in statsd." into pi-dev 2018-03-19 18:14:18 +00:00
TreeHugger Robot
ad121b9fdc Merge changes I3ffb6e97,I689df136,Ia67a8eb6 into pi-dev
* changes:
  Statsd: remove DurationAnomalyTracker.resetStorage
  Statsd AnomalyDetection stopAlarm also checks old
  Statsd AnomalyDetection improvements
2018-03-19 05:49:27 +00:00