317 Commits

Author SHA1 Message Date
Yangster-mac
849dfdc25b metric activation and TTL.
Currently, once a metric config is pushed to statsD, it will
start to collect metrics immediately. This CL introduces the metric
activation logic. When a metric needs activation, the metric producer
will hold until the activation event is detected. Then the metric producer
starts metric generation until the TTL expires (timebomb).

This supports Mainline, which wants to collect a few metrics for
a few hours when a binary push starts or a flag flips.
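
A minimal sketch of the activation and TTL behavior described above. The class and method names are hypothetical, not statsd's actual API, and restarting the TTL window on a repeated activation event is an assumption.

```python
class ActivatedMetric:
    """Hypothetical model of a metric that stays idle until activated."""

    def __init__(self, ttl_ns):
        self.ttl_ns = ttl_ns          # how long the metric stays active
        self.activated_at_ns = None   # None means "not activated yet"

    def activate(self, now_ns):
        # Assumption: each activation event (re)starts the TTL window.
        self.activated_at_ns = now_ns

    def is_active(self, now_ns):
        if self.activated_at_ns is None:
            return False
        return now_ns - self.activated_at_ns < self.ttl_ns


metric = ActivatedMetric(ttl_ns=100)
before = metric.is_active(now_ns=10)    # no activation event seen yet
metric.activate(now_ns=20)
during = metric.is_active(now_ns=50)    # inside the TTL window
after = metric.is_active(now_ns=120)    # the TTL has expired
```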

Test: statsd test
BUG: b/117858835
Change-Id: I992ae98f4303d5b79932eb94eddf6c19ded3727e
2018-10-18 10:45:31 -07:00
Yangster-mac
32f07af29c Match pulled events in gauge metric.
Bug: b/117703265

Test: statsd unit tests
Change-Id: Ia4c67ebfdb4f9647d4135c6879f04faa4efdd471
2018-10-17 17:47:11 -07:00
TreeHugger Robot
dfa4c24aaf Merge "Add pulled atom subscription for shell." 2018-10-16 21:46:49 +00:00
Yangster-mac
a8a304458b Fix the typos and naming convention in atoms.proto
Test: cts,statsd
BUG: b/117708491

Change-Id: Ib381ef66ae9925938e1f70b9a8869ef008e3d335
2018-10-14 22:04:52 -07:00
Yao Chen
41e606c1fc Add pulled atom subscription for shell.
+ Changed the output format from Atom to ShellData, which is a wrapper for repeated Atom
  This is useful because pulled atoms are usually a list of atoms.

Test: statsd_test added
Bug: 110536553

Change-Id: I0e2f55bdd9015c9bc95b87a630297c6f13e39636
2018-10-12 09:23:25 -07:00
TreeHugger Robot
c1dea6f02e Merge "Fix statsd_test unit tests for TestKeyValuePairsEvent" 2018-10-04 22:41:36 +00:00
Howard Ro
a5b0f1edf1 Fix statsd_test unit tests for TestKeyValuePairsEvent
Test: tested in marlin with 100% passing
Change-Id: I4f3ac89061ec126356a5cc42e1a575599f3302fc
2018-10-02 16:20:43 -07:00
TreeHugger Robot
0d52450ca6 Merge "Add unit tests for ShellSubscriber and fix a bug" 2018-10-01 23:43:40 +00:00
Yao Chen
398dd19f66 Add unit tests for ShellSubscriber and fix a bug
Test: statsd_test
Change-Id: Iaf0558ec2a2dc190bedb240da8019868266ec8f5
2018-10-01 14:49:03 -07:00
TreeHugger Robot
908bd8dea2 Merge "Fix a bug in DurationMetric's dimensions from condition and make unit tests expect the right answer." 2018-09-28 23:38:29 +00:00
Howard Ro
4078dd4e15 Support int32_t (Java Integer) in KeyValuePair atom
Bug: 116826451
Test: statsd_test + manual verification through logcat
Change-Id: I0157c22033907fea46e26ee4262c723fa8c0b518
2018-09-27 17:51:40 -07:00
TreeHugger Robot
07d7baf36c Merge "statsd events/gauge: remove WallClockTime" 2018-09-12 21:43:03 +00:00
TreeHugger Robot
b299070edf Merge "Remove dimension fields in GaugeMetric output" 2018-09-10 17:47:30 +00:00
Yangster-mac
e124e42582 Interface of writing key value pair atom to socket and parsing from statsd.
Test: statsd unit test
BUG: b/114231161

Change-Id: I3543900934b5e8e0677bf1e7cc454d61064a2475
2018-09-07 11:09:37 -07:00
Bookatz
fe2dde8184 statsd events/gauge: remove WallClockTime
EventMetricData stores wall_clock_timestamp_nanos.
It is expensive, costing 10 bytes per event and evidently not needed.
Similar for GaugeMetricData.

Bug: 113072343
Test: make -j8 statsd_test && adb sync data && adb shell data/nativetest64/statsd_test/statsd_test
Test: run cts-dev -m CtsStatsdHostTestCases
Test: Manually confirm that events/gauges don't have wallclock
Change-Id: Iae978a434354c049e1fa61d42536be981c862b4f
2018-09-05 10:53:33 -07:00
Chenjie Yu
4c31f67965 Remove dimension fields in GaugeMetric output
GaugeMetric output is in atom format and can contain all fields, including
those already included in dimensions.
This is unnecessary duplication.
Also, if we do set dimensions, we can use this to reduce data size,
similar to repeated fields, where the same dimension value only appears once.

Bug: 113061955
Test: unit test
Change-Id: I299bd1cb1b9b90ea7426ef182df78d2ffc091910
2018-08-22 14:14:02 -07:00
Yangster-mac
48b3d62bfe Create log event from key value maps.
BUG: b/112816333
Test: statsd test.
Change-Id: Ib66f06186abfacd77807436379e1e142a5b87c99
2018-08-19 22:37:59 +00:00
TreeHugger Robot
a5f28237e4 Merge "allow statsd pull based on event trigger" 2018-08-11 20:09:22 +00:00
Chenjie Yu
8858897205 allow statsd pull based on event trigger
Several restrictions:
1) This is only with GaugeMetric. We don't have a use case for ValueMetric
to pull on event trigger.
2) trigger_event is set in the config. It can be a generic atom matcher, but
we limit the number of atoms referenced in it to 1, so we don't allow
multiple atoms to form a complex trigger.
3) This has to go with ALL_CONDITION_CHANGES sampling type.

+ also specify atom id of GaugeMetric output.

Bug: 111937835
Test: unit test
Change-Id: Ia15b1f209945f022edffb9ec5d673317d55d9e4f
2018-08-10 20:49:52 -07:00
TreeHugger Robot
632b39288d Merge "Remove the obsolete code for logd and add statsd socket log loss detection." 2018-08-11 00:30:47 +00:00
Yangster
8a34384797 Return unknown for combination condition eval when operation is NOT and
there is no child.
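
A minimal sketch of the rule in this change: a NOT combination with no child evaluates to unknown. The enum and function names are hypothetical, and propagating an unknown child as unknown is an assumption.

```python
from enum import Enum


class ConditionState(Enum):
    UNKNOWN = -1
    FALSE = 0
    TRUE = 1


def evaluate_not(children):
    """Evaluate a NOT combination over a list of child condition states."""
    if not children:
        # No child to negate, so the result cannot be decided.
        return ConditionState.UNKNOWN
    child = children[0]
    if child is ConditionState.UNKNOWN:
        # Assumption: an unknown child stays unknown when negated.
        return ConditionState.UNKNOWN
    if child is ConditionState.TRUE:
        return ConditionState.FALSE
    return ConditionState.TRUE
```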

Test: added unit test and rerun the statsd tests.

BUG: b/112311529

Change-Id: I0c5829e3cb26474b7dbcc05f20c4311e9f801d97
2018-08-10 04:30:11 +00:00
Yao Chen
3ff3a490e4 Remove the obsolete code for logd and add statsd socket log loss detection.
+ Remove dead code
+ Add a simple log loss detection as a starter to see if there is any log loss
  detected at all.

TODO: If we do see log loss, we can add more sophisticated logging and reset mechanism.

Bug: 80538532
Test: statsd_test
Change-Id: Iff150c9d8f9f936dbd4586161a3484bef7035c28
2018-08-06 16:24:49 -07:00
Chenjie Yu
e1361ed422 Adjust 1st bucket start time
adjust 1st bucket start time for a partial bucket
also make valuemetric and gauge metric pull on first bucket

Bug: 111607838
Bug: 111660710
Bug: 111842941

Test: unit test
Change-Id: I5932c2258f8deac57e7abbf26f3214f87914a964
2018-07-27 10:53:38 -07:00
Chenjie Yu
a0f0224906 ValueMetric supports multiple aggregation types
1. Add support for MIN, MAX, AVG
2. ValueMetric also allows floats now, in addition to the long data type.
AnomalyDetection still takes long only. I am not sure if it makes
sense to do anomaly on AVG. I will leave that for later.
3. ValueMetric supports sliced condition change for pushed events.
I don't think it makes sense for pulled events to have sliced condition
changes, so I leave it for now.
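
A minimal sketch of per-bucket aggregation with the types listed above. The function name and the string constants are hypothetical, and treating SUM as the pre-existing aggregation is an assumption.

```python
def aggregate(values, aggregation_type):
    """Reduce the values collected in one bucket to a single number."""
    if not values:
        raise ValueError("cannot aggregate an empty bucket")
    if aggregation_type == "SUM":
        return sum(values)
    if aggregation_type == "MIN":
        return min(values)
    if aggregation_type == "MAX":
        return max(values)
    if aggregation_type == "AVG":
        return sum(values) / len(values)
    raise ValueError("unknown aggregation type: " + aggregation_type)


# Longs and floats can be mixed in the same bucket.
bucket_values = [1, 2.5, 4]
```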

Test: unit test
Change-Id: I8bc510d98ea9b8a6eb16d04ff99dce6b574249cd
2018-07-13 10:24:41 -07:00
Yao Chen
0aff90329b Fix a bug in DurationMetric's dimensions from condition and make unit tests expect the right answer.
Bug: 111119889
Test: statsd_test
Change-Id: Ie8379031c93641c011bf27694b47ae21fe8f8d7a
2018-07-03 10:51:05 -07:00
Yao Chen
5bfffb54da Clean up TODOs in statsd
+ Created bugs for those TODOs that are still relevant.
+ Remove obsolete TODOs.

Test: no code change.
Change-Id: I41c2a89a882f087817ee6cbc3f095e1d80e1928e
2018-06-25 11:08:04 -07:00
Chenjie Yu
e22192071d StatsPullerManager does not use singleton
This is to be consistent with other patterns such as UidMap.
This also makes unit test simpler.

Change-Id: I1558cd609e470481f269ecf2ae616277a95cfbf0
Bug: 72722120
Test: unit test
2018-06-14 15:46:54 -07:00
Bookatz
d27ab45ad3 Remove TODO in statsd AnomalyTracker_test
The underlying item the TODO is referencing had already been resolved
so the test line can be properly added, per the TODO.

Change-Id: I5c16e7ea319bd16e37475381def656b38f39d17f
Fixes: 80095149
Test: make statsd_test && adb sync data && adb shell data/nativetest64/statsd_test/statsd_test
2018-05-24 10:35:02 -07:00
Yangster-mac
1c58f04cd3 Add a field in config to disable/enable the string hashing in metric report.
Statsd hashes (using its own hashing function) raw strings to reduce the
upload data size when there are duplicate strings in the report. In the cloud,
the clearcut translator backfills the strings.

For a few droidfood users, we found the translator was unable to do that. While
debugging the root cause, we first decided to provide an option to disable
the hashing from the cloud.
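
A minimal sketch of optional string hashing with a backfill step. All names are hypothetical, `zlib.crc32` is only a stand-in because statsd uses its own hashing function, and shipping a hash-to-string table alongside the report is an assumption. Hash collisions are ignored here.

```python
import zlib


def encode_strings(strings, hash_strings=True):
    """Return (encoded values, hash-to-string table) for a report."""
    if not hash_strings:
        # Hashing disabled by the config: keep the raw strings.
        return list(strings), {}
    table = {}
    encoded = []
    for s in strings:
        h = zlib.crc32(s.encode("utf-8"))  # stand-in hash function
        table[h] = s                       # each distinct string stored once
        encoded.append(h)
    return encoded, table


def backfill(encoded, table):
    """Translator side: replace each hash with its original string."""
    return [table.get(value, value) for value in encoded]


encoded, table = encode_strings(["a_tag", "a_tag", "b_tag"])
restored = backfill(encoded, table)
```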

Test: statsd unit test, CTS test, tested manually

BUG: b/79943763
Change-Id: If0359c8cf3f3cf83a2938db9ebf95ea7906f0b0c
2018-05-18 10:39:50 -07:00
Chenjie Yu
021e25307d ValueMetric pushed events should check condition
+ fix unit test flakiness

Bug: 79873404
Change-Id: I15b52a79b18c05603640781e4450e7b62fac24ba
Fix: 79873404
Test: unit test
2018-05-16 14:50:11 -07:00
David Chen
092a5a9b85 Fixes Value metrics in statsd and app upgrades.
Pulled value metrics with conditions had a subtle bug that caused
us to leave the condition on even if it should've been false.

Bug: 79778783
Test: Added unit-test and verified on marlin-eng.
Change-Id: I31f34791118319b3471f7a6ea8a024e2d511cfe7
2018-05-15 17:51:47 -07:00
David Chen
56ae0d9a48 Fixes statsd reports missing strings and SCS.
Reports written to disk don't contain the strings used, which will
make this report unusable if there are strings that don't show up
again. We should always include the strings, so this option is
removed entirely.

Also, we hard-coded the wrong number of fields when pulling
ModemActivityInfo. There are actually 10 fields, not 6.

Bug: 79601503
Test: Tested unit-tests pass on marlin-eng.
Change-Id: I6834b096ced77418a9cc2ddd79b08d1c9c447fae
2018-05-11 17:04:56 -07:00
TreeHugger Robot
a159842161 Merge "Fix the flaky gauge/value e2e test due to cached events." into pi-dev 2018-05-10 01:02:10 +00:00
android-build-team Robot
c3d0798455 Merge "Fix partial bucket unit tests." into pi-dev 2018-05-09 18:07:24 +00:00
Yangster-mac
58e609e339 Fix the flaky gauge/value e2e test due to cached events.
Test: statsd test
BUG: b/79265262
Change-Id: I4d67f1c2edb6215a3cea23f8c7b2e8d5099c4aac
2018-05-08 16:19:48 -07:00
David Chen
9e6dbbdadf Fix statsd returning uidmap with empty reports.
We notice devices uploading a bunch of bytes for the uidmap even if
the device is running an empty config, so there are no actual metrics
to report. This hardcodes some logic to skip the inclusion of the
uidmap if there are exactly 0 metrics.

Bug: 79381210
Test: Tested unit-tests on marlin-eng
Change-Id: I96348235341a7faf15ff57d4d1eccac635a3a999
2018-05-07 18:07:19 -07:00
Yao Chen
cc884dfc94 Fix partial bucket unit tests.
Bug: 79347749
Test: statsd_test
Change-Id: I69eee7172d6fe4ce895530f089193eb08653e269
2018-05-07 10:34:31 -07:00
David Chen
48944901f7 Fixes statsd returning too much data at once.
We observe a single ConfigMetricsReportList can be greater than the
safe size for the binder transaction buffer since we only check the
size of the current metrics in progress, but we also return the
previous reports stored on disk.

This change will attempt to send another ConfigMetricsReportList
as soon as possible if there's already a report on disk.

Also fixes a bug when trying to trigger data fetch before the client
has registered the corresponding dataFetchOperation.

Bug: 79201869
Test: Tested manually on marlin-eng
Change-Id: I2d3677162804a27e7a7a95d482d80c46bd994a67
2018-05-04 17:09:16 -07:00
Yangster-mac
892f3d3229 Reset statsd and correctly record the dump reason when system
server restarts/crashes.

Test: statsd test
BUG: b/79161505
Change-Id: I0646c764964f6eafde91f9ae0179a1c837af320d
2018-05-03 17:05:24 -07:00
Yangster-mac
9def8e3995 Reduce statsd log data size.
1. Hash the strings in metric dimensions.
2. Optimize the timestamp encoding in bucket.
   Use bucket num for full bucket and millis for
   partial bucket.
3. Encode the dimension path per metric and avoid
   deduping it across dimensions.
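
A minimal sketch of item 2, assuming buckets are aligned to a time base. The function and field names are hypothetical and do not reflect the actual report proto.

```python
NS_PER_MS = 1_000_000


def encode_bucket(start_ns, end_ns, time_base_ns, bucket_size_ns):
    """Encode a bucket's time range as compactly as possible."""
    offset_ns = start_ns - time_base_ns
    is_full = (offset_ns % bucket_size_ns == 0
               and end_ns - start_ns == bucket_size_ns)
    if is_full:
        # A full bucket is identified by its index alone.
        return {"bucket_num": offset_ns // bucket_size_ns}
    # A partial bucket needs explicit boundaries, in milliseconds.
    return {"start_millis": start_ns // NS_PER_MS,
            "end_millis": end_ns // NS_PER_MS}


minute_ns = 60 * 1_000_000_000
full = encode_bucket(2 * minute_ns, 3 * minute_ns, 0, minute_ns)
partial = encode_bucket(2 * minute_ns, 2 * minute_ns + minute_ns // 2, 0, minute_ns)
```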

Test: statsd test
Change-Id: I18f69654de85edb21a9c835c73edead756295e05
BUG: b/77813755
2018-04-26 04:30:18 -07:00
Chenjie Yu
e36018b272 add dump report reason to reports
+ also change uidmapping version numbers to int64_t

Bug: 78132855
Change-Id: Iac7ea93e4bf651bd65bd03383e7ab4971af4fc29
Fix: 78132855
Test: gts test
2018-04-18 20:19:21 +00:00
David Chen
81245fd53a Adds option to drop small buckets for statsd.
We notice that some of the pulled metrics have a ton of data, and
during app upgrades, we're forming partial buckets that represent
small periods of time but require many bytes of data. We now have an
option to drop these buckets that are too short. Note that we still
have to pull the data to keep the metrics for the next bucket
correct. We include a new field in the value and gauge metric outputs
so that it's easy to tell when a bucket was dropped.

We drop the partial buckets also from anomaly detection since we
should be computing anomalies from the same data that is reported.
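
A minimal sketch of the drop rule, assuming a minimum bucket size from the config. The names are hypothetical; the list of skipped intervals stands in for the new output field that marks dropped buckets.

```python
def build_report(buckets, min_bucket_size_ns):
    """Split (start_ns, end_ns, value) buckets into reported and skipped."""
    reported = []
    skipped = []
    for start_ns, end_ns, value in buckets:
        if end_ns - start_ns < min_bucket_size_ns:
            # Too short: drop the data but record that a bucket existed.
            skipped.append((start_ns, end_ns))
        else:
            reported.append((start_ns, end_ns, value))
    return reported, skipped


buckets = [(0, 100, 7), (100, 110, 3), (110, 210, 9)]
reported, skipped = build_report(buckets, min_bucket_size_ns=50)
```

Anomaly detection would then consume only `reported`, so it sees the same data as the report.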

Test: Added unit-tests for value and gauge metrics.
Bug: 77925710
Change-Id: Ic370496377c6afd380e02278a6c1ed8b521a2731
2018-04-16 18:42:14 -07:00
Jeff Sharkey
6b64925737 Protect usage data with OP_GET_USAGE_STATS.
APIs that return package usage data (such as the new StatsManager)
must ensure that callers hold both the PACKAGE_USAGE_STATS permission
and the OP_GET_USAGE_STATS app-op.

Add noteOp() method that can be called from native code.

Also add missing security checks on command interface.

Bug: 77662908, 78121728
Test: builds, boots
Change-Id: Ie0d51e4baaacd9d7d36ba0c587ec91a870b9df17
2018-04-16 12:44:32 -06:00
TreeHugger Robot
6b317915e8 Merge "StatsManager throws exceptions" into pi-dev 2018-04-11 17:02:06 +00:00
Yao Chen
163d2602db Handle logd reconnect.
When statsd reconnects to logd, statsd will read all logs from buffer again. To prevent us from
reprocessing old events, we do the following:

1. At any given moment, record the largest timestamp (T_max) and last timestamp (check point) that
   we've seen before.
2. When reconnection happens, we look for the check point until we see a new log with a timestamp
   larger than T_max.
   -> If we find the CP, resume after the CP. Success.
   -> If we can't find the CP, there is definitely log loss. We reset all configs.

Note:
1. Logd has an API to read logs after a certain timestamp. But this API is vulnerable to
time changes from Settings. So we cannot rely on it.

2. If logd inserts a new log (with older timestamp) before CP, we cannot detect it. It's not
   possible to detect it without recording all the timestamps we have seen.
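
A minimal sketch of the checkpoint scan described above, modeling the logd buffer as a list of timestamps. The function name is hypothetical, and the resume index returned when log loss is detected is an assumption, since the source only says that all configs are reset.

```python
def resume_after_reconnect(buffered_timestamps, checkpoint, t_max):
    """Return (resume_index, log_loss_detected) after reconnecting to logd."""
    for i, ts in enumerate(buffered_timestamps):
        if ts == checkpoint:
            # Found the check point: continue with the log right after it.
            return i + 1, False
        if ts > t_max:
            # Reached a new log without seeing the check point: log loss.
            return i, True
    # The buffer ended without the check point: treat it as log loss too.
    return len(buffered_timestamps), True


found = resume_after_reconnect([10, 35, 30, 40], checkpoint=30, t_max=35)
lost = resume_after_reconnect([40, 50], checkpoint=30, t_max=30)
```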

Test: statsd_test
Bug: 77813113

Change-Id: Ic3fdb47230807606ab11dc994cb162194adb8448
2018-04-10 22:06:03 -07:00
Yangster-mac
15f6bbc24f Flush the bucket when creating the metric producer.
Use int64 for value field.
E2e test for gauge/value metric.

BUG: b/74445671

Test: statsd test.
Change-Id: I823a0bade8f89834bdfb9cf48864852a47d7b63b
2018-04-10 20:25:13 -07:00
Bookatz
4f71629002 StatsManager throws exceptions
When StatsManager fails to connect to statsd, it now throws an exception
for the caller to catch. It also throws an exception if the config being
added is of an unreadable format.

Due to backwards compatibility issues, the old APIs could not be
changed, so new ones were made to replace the old ones. The old ones are
now temporary and will be removed when the compatibility issue is
resolved.

Bug: 77648233
Test: gts-tradefed run gts-dev --module GtsStatsdHostTestCases
Change-Id: Ibea05883a29b9b3ef9927d2f8fe295eb99832ab7
2018-04-10 19:07:32 -07:00
Chenjie Yu
ae63b0af94 Drop value if the bucket is totally tainted
Bug: 77870358
Change-Id: Ia96970a3254de08f94b91ad53be2fdb9f4db7eb4
Fix: 77870358
Test: unit test
2018-04-10 14:59:31 -07:00
Yangster-mac
e68f3a5811 Flush the partial bucket when statsd shuts down or config updated.
Test: statsd test

BUG: b/77556036
Change-Id: Ie4a04ace55e07c4529cdff5906ba874f8815f620
2018-04-05 18:05:57 -07:00
David Chen
bd12527c90 Fix uid map to be simpler and fix partial bucket.
The previous scheme captured periodic snapshots for each config with
complex logic that's unnecessary and wasted memory. We actually don't
need to store any snapshots since we just convert the current state
into a snapshot and also include the deltas (change events) since the
previous report until now.

To make the system more robust, we also include up to 100 of the
deleted apps in the uid map.
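
A minimal sketch of the simplified scheme: no stored snapshots, only the current state plus the change events since the previous report. The class, method, and field names are hypothetical, and evicting the oldest deleted app once the cap is reached is an assumption.

```python
class UidMapSketch:
    """Hypothetical model: current state plus change events."""

    def __init__(self, max_deleted=100):
        self.installed = {}   # (uid, package) -> version
        self.deleted = []     # (uid, package, version), oldest first
        self.changes = []     # change events since the previous report
        self.max_deleted = max_deleted

    def _record(self, ts, uid, package, version, deleted):
        self.changes.append({"ts": ts, "uid": uid, "package": package,
                             "version": version, "deleted": deleted})

    def update_app(self, ts, uid, package, version):
        self.installed[(uid, package)] = version
        self._record(ts, uid, package, version, False)

    def remove_app(self, ts, uid, package):
        version = self.installed.pop((uid, package), None)
        self.deleted.append((uid, package, version))
        if len(self.deleted) > self.max_deleted:
            self.deleted.pop(0)  # assumption: forget the oldest deleted app
        self._record(ts, uid, package, version, True)

    def build_report(self):
        # The snapshot is derived from the current state at report time.
        snapshot = [{"uid": u, "package": p, "version": v, "deleted": False}
                    for (u, p), v in self.installed.items()]
        snapshot += [{"uid": u, "package": p, "version": v, "deleted": True}
                     for u, p, v in self.deleted]
        # Hand over the deltas and start collecting new ones.
        changes, self.changes = self.changes, []
        return {"snapshot": snapshot, "changes": changes}
```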

Also, fix the wiring of the partial buckets so the metric producers
form partial buckets on both app upgrade and removal, but not on
installation of a new app.

Also, we update StatsCompanionService to also include disabled apps.

Bug: 77607583
Test: Verified unit-tests pass and added new e2e tests.
Change-Id: I98e1f544d6e6571545ae1581c4cebab807596f51
2018-04-05 16:15:01 -07:00