Currently, once a metric config is pushed to statsD, it will
start to collect metrics immediately. This CL introduces the metric
activation logic. When metric needs an activation, the metric producer
will hold until the activation event is detected. Then the metric producer
starts metric generation until the TTL expires (timebomb).
This is to support Mainline where it wants to collect a few metrics for
a few hours when the binary push starts or flag flips.
Test: statsd test
BUG: b/117858835
Change-Id: I992ae98f4303d5b79932eb94eddf6c19ded3727e
+ Changed the output format from Atom to ShellData, which is a wrapper for repeated Atom
This is useful because pulled atoms are usually a list of atoms.
Test: statsd_test added
Bug: 110536553
Change-Id: I0e2f55bdd9015c9bc95b87a630297c6f13e39636
EventMetricData stores wall_clock_timestamp_nanos.
It is expensive, costing 10 bytes per event and evidently not needed.
Similar for GaugeMetricData.
Bug: 113072343
Test: make -j8 statsd_test && adb sync data && adb shell data/nativetest64/statsd_test/statsd_test
Test: run cts-dev -m CtsStatsdHostTestCases
Test: Manually confirm that events/gauges don't have wallclock
Change-Id: Iae978a434354c049e1fa61d42536be981c862b4f
GaugeMetric output is atom format and can contain all fields including
those already included in dimensions.
This is not necessary and duplicate.
And, if we do set dimensions, we can use this to reduce data size
similar to repeated fields, where same dimension value only appear once.
Bug: 113061955
Test: unit test
Change-Id: I299bd1cb1b9b90ea7426ef182df78d2ffc091910
Several restrictions:
1) This is only with GaugeMetric. We don't have use case for ValueMetric
to pull on event trigger.
2) trigger_event is set in the config. It can be generic atom matcher. But
we limit the number of atoms referenced in it to be 1. So we don't allow
multiple atoms to form a complex trigger.
3) This has to go with ALL_CONDITION_CHANGES sampling type.
+ also specify atom id of GaugeMetric output.
Bug: 111937835
Test: unit test
Change-Id: Ia15b1f209945f022edffb9ec5d673317d55d9e4f
+ Remove dead code
+ Add a simple log loss detection as a starter to see if there is any log loss
detected at all.
TODO: If we do see log loss, we can add more sophisticated logging and reset mechanism.
Bug: 80538532
Test: statsd_test
Change-Id: Iff150c9d8f9f936dbd4586161a3484bef7035c28
adjust 1st bucket start time for a partial bucket
also make valuemetric and gauge metric pull on first bucket
Bug: 111607838
Bug: 111660710
Bug: 111842941
Test: unit test
Change-Id: I5932c2258f8deac57e7abbf26f3214f87914a964
1. Add support for MIN, MAX, AVG
2. ValueMetric also allow floats now, in addition to long data type.
AnomalyDetection still takes long only. I am not sure if it makes
sense to do anomaly on AVG. I will leave that for later.
3. ValueMetric supports sliced condition change for pushed events.
I don't think it makes sense for pulled events to have sliced condition
changes so leave it for now.
Test: unit test
Change-Id: I8bc510d98ea9b8a6eb16d04ff99dce6b574249cd
+ Created bugs for those TODOs that are still relevant.
+ Remove obsolete TODOs.
Test: no code change.
Change-Id: I41c2a89a882f087817ee6cbc3f095e1d80e1928e
This is to be consistent with other patterns such as UidMap.
This also makes unit test simpler.
Change-Id: I1558cd609e470481f269ecf2ae616277a95cfbf0
Bug: 72722120
Test: unit test
The underlying item the TODO is referencing had already been resolved
so the test line can be properly added, per the TODO.
Change-Id: I5c16e7ea319bd16e37475381def656b38f39d17f
Fixes: 80095149
Test: make statsd_test && adb sync data && adb shell data/nativetest64/statsd_test/statsd_test
Statsd hashes (using its own hashing function) raw strings to reduce the
upload data size when there are duplicate strings in the report. And in cloud,
the clearcut translator would backfill the strings.
In a few droidfood users, we find the translator was unable to do that. While
debugging the root cause, we first decided to provide an option to disable
the hashing from the cloud.
Test: statsd unit test, CTS test, tested manually
BUG: b/79943763
Change-Id: If0359c8cf3f3cf83a2938db9ebf95ea7906f0b0c
Pulled value metrics with conditions had a subtle bug that caused
us to leave the condition on even if it should've been false.
Bug: 79778783
Test: Added unit-test and verified on marlin-eng.
Change-Id: I31f34791118319b3471f7a6ea8a024e2d511cfe7
Reports written to disk don't contain the strings used, which will
make this report unusable if there are strings that don't show up
again. We should always include the strings, so this option is
removed entirely.
Also, we hard-coded the wrong number of fields when pulling
ModemActivityInfo. There are actually 10 fields, not 6.
Bug: 79601503
Test: Tested unit-tests pass on marlin-eng.
Change-Id: I6834b096ced77418a9cc2ddd79b08d1c9c447fae
We notice devices uploading a bunch of bytes for the uidmap even if
the device is running an empty config, so there are no actual metrics
to report. This hardcodes some logic to skip the inclusion of the
uidmap if there are exactly 0 metrics.
Bug: 79381210
Test: Tested unit-tests on marlin-eng
Change-Id: I96348235341a7faf15ff57d4d1eccac635a3a999
We observe a single ConfigMetricsReportList can be greater than the
safe size for the binder transaction buffer since we only check the
size of the current metrics in progress, but we also return the
previous reports stored on disk.
This change will attempt to send another ConfigMetricsReportList
as soon as possible if there's already a report on disk.
Also fixes a bug when trying to trigger data fetch before the client
has registered the corresponding dataFetchOperation.
Bug: 79201869
Test: Tested manually on marlin-eng
Change-Id: I2d3677162804a27e7a7a95d482d80c46bd994a67
1. Hash the strings in metric dimensions.
2. Optimize the timestamp encoding in bucket.
Use bucket num for full bucket and millis for
partial bucket.
3. Encode the dimension path per metric and avoid
deduping it across dimensons.
Test: statsd test
Change-Id: I18f69654de85edb21a9c835c73edead756295e05
BUG: b/77813755
We notice that some of the pulled metrics have a ton of data, and
during app upgrades, we're forming partial buckets that represent
small periods of time but require many bytes of data. We now have an
option to drop these buckets that are too short. Note that we still
have to pull the data to keep the metrics for the next bucket
correct. We include a new field in the value and gauge metric outputs
so that it's easy to tell when a bucket was dropped.
We drop the partial buckets also from anomaly detection since we
should be computing anomalies from the same data that is reported.
Test: Added unit-tests for value and gauge metrics.
Bug: 77925710
Change-Id: Ic370496377c6afd380e02278a6c1ed8b521a2731
APIs that return package usage data (such as the new StatsManager)
must ensure that callers hold both the PACKAGE_USAGE_STATS permission
and the OP_GET_USAGE_STATS app-op.
Add noteOp() method that can be called from native code.
Also add missing security checks on command interface.
Bug: 77662908, 78121728
Test: builds, boots
Change-Id: Ie0d51e4baaacd9d7d36ba0c587ec91a870b9df17
When statsd reconnects to logd, statsd will read all logs from buffer again. To prevent us from
reprocessing old events, we do the following:
1. At any given moment, record the largest timestamp(T_max) and last timestamp (check point) that
we've seen before.
2. When reconnection happens, we look for the check point until we see a new log with a timestamp
larger than T_max.
-> If we found the CP, resume after the CP. Success
-> If we can't find CP, there is definitely log loss. We reset all configs.
Note:
1. Logd has an API to read logs after a certain timestamp. But this api is vulnerable to
time changes from Settings. So we cannot rely on it.
2. If logd inserts a new log (with older timestamp) before CP, we cannot detect it. It's not
possible to detect it without record all timestamps we have seen.
Test: statsd_test
Bug: 77813113
Change-Id: Ic3fdb47230807606ab11dc994cb162194adb8448
When StatsManager fails to connect to statsd, it now throws an exception
for the caller to catch. It also throws an exception of the config being
added is of an unreadable format.
Due to backwards compatibility issues, the old APIs could not be
changed, so new ones were made to replace the old ones. The old ones are
now temporary and will be removed when the compatibility issue is
resolved.
Bug: 77648233
Test: gts-tradefed run gts-dev --module GtsStatsdHostTestCases
Change-Id: Ibea05883a29b9b3ef9927d2f8fe295eb99832ab7
The previous scheme captured periodic snapshots for each config with
complex logic that's unnecessary and wasted memory. We actually don't
need to store any snapshots since we just convert the current state
into a snapshot and also include the deltas (change events) since the
previous report until now.
To make the system more robust, we also include up to 100 of the
deleted apps in the uid map.
Also, fix the wiring of the partial buckets so the metric producers
form partial buckets on both app upgrade and removal, but not on
installation of a new app.
Also, we update StatsCompanionService to also include disabled apps.
Bug: 77607583
Test: Verified unit-tests pass and added new e2e tests.
Change-Id: I98e1f544d6e6571545ae1581c4cebab807596f51