Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pm: Incorrect usage count in power domain when using put_async() on device #87051

Open
simengangstad opened this issue Mar 13, 2025 · 1 comment
Assignees
Labels
area: Power Management bug The issue is a bug, or the PR is fixing a bug

Comments

@simengangstad
Copy link

simengangstad commented Mar 13, 2025

Describe the bug

There seems to be a bug where the usage count of a power domain is not decremented when the usage count surpasses 1 and pm_device_runtime_put_async() is used on a device under the power domain.

To Reproduce

I've tested with the following modified tests in tests/subsys/pm/power_domain/src/main.c using native_sim:

west build -d build_native_sim zephyr/tests/subsys/pm/power_domain -p auto && ./build_native_sim/zephyr/zephyr.exe

ZTEST(power_domain_1cpu, test_power_domain_device_balanced)
{
	const struct device *balanced_domain = DEVICE_DT_GET(TEST_DOMAIN_BALANCED);
	const struct device *dev = DEVICE_DT_GET(TEST_DEV_BALANCED);
	enum pm_device_state state;
	int ret;

	/* Init domain */
	pm_device_init_suspended(balanced_domain);
	pm_device_runtime_enable(balanced_domain);

	/* At this point domain should be SUSPENDED */
	pm_device_state_get(balanced_domain, &state);
	zassert_equal(state, PM_DEVICE_STATE_SUSPENDED);

	/* Get and put the device without PM enabled should not change the domain */
	ret = pm_device_runtime_get(dev);
	zassert_equal(ret, 0);
	ret = pm_device_runtime_put(dev);
	zassert_equal(ret, 0);

	pm_device_state_get(balanced_domain, &state);
	zassert_equal(state, PM_DEVICE_STATE_SUSPENDED);

	/* Same thing with the domain in active state */
	ret = pm_device_runtime_get(balanced_domain);
	zassert_equal(ret, 0);
	pm_device_state_get(balanced_domain, &state);
	zassert_equal(state, PM_DEVICE_STATE_ACTIVE);

	ret = pm_device_runtime_get(dev);
	zassert_equal(ret, 0);
	ret = pm_device_runtime_put(dev);
	zassert_equal(ret, 0);

	pm_device_state_get(balanced_domain, &state);
	zassert_equal(state, PM_DEVICE_STATE_ACTIVE);

	// NOTE: Added this to reset the state for the next test
	pm_device_runtime_put(balanced_domain);
	pm_device_runtime_disable(balanced_domain);
}

ZTEST(power_domain_1cpu, test_power_domain_put_async_balanced)
{
	const struct device *balanced_domain = DEVICE_DT_GET(TEST_DOMAIN_BALANCED);
	const struct device *dev = DEVICE_DT_GET(TEST_DEV_BALANCED);
	enum pm_device_state state;
	int ret;

	/* Init domain */
	pm_device_init_suspended(balanced_domain);
	pm_device_runtime_enable(balanced_domain);
	pm_device_runtime_enable(dev);

	/* At this point domain should be SUSPENDED */
	pm_device_state_get(balanced_domain, &state);
	zassert_equal(state, PM_DEVICE_STATE_SUSPENDED);

	pm_device_state_get(dev, &state);
	zassert_equal(state, PM_DEVICE_STATE_SUSPENDED);

	ret = pm_device_runtime_get(dev);
	zassert_equal(ret, 0);
	zassert_equal(pm_device_runtime_usage(dev), 1);
	zassert_equal(pm_device_runtime_usage(balanced_domain), 1);

	// --- Removing this part makes the test pass

	ret = pm_device_runtime_get(dev);
	zassert_equal(ret, 0);
	zassert_equal(pm_device_runtime_usage(dev), 2);
	zassert_equal(pm_device_runtime_usage(balanced_domain), 2);

	ret = pm_device_runtime_put_async(dev, K_NO_WAIT);
	zassert_equal(ret, 0);
	k_yield();
	zassert_equal(pm_device_runtime_usage(dev), 1);
	zassert_equal(pm_device_runtime_usage(balanced_domain), 1);

	// ---

	ret = pm_device_runtime_put_async(dev, K_NO_WAIT);
	zassert_equal(ret, 0);
	k_yield();
	zassert_equal(pm_device_runtime_usage(dev), 0);
	zassert_equal(pm_device_runtime_usage(balanced_domain), 0);

	/* Clean up */
	pm_device_runtime_disable(balanced_domain);
	pm_device_runtime_disable(dev);
}

Expected behavior

I expect that the usage count would be equal in this case for both the device and the power domain. I assume that k_yield() and using K_NO_WAIT should rule out any race conditions here (?).

I've tested with the following changes, which alleviates the problem, but feels like a hacky solution. Essentially what is missing if I'm not incorrect is that the usage count of the power domain should be decremented as long as it is > 1 and we are calling put_async(). The only other place this is done is in the work submitted if I'm not mistaken here.

diff --git a/subsys/pm/device_runtime.c b/subsys/pm/device_runtime.c
index 4533886dd2a..b61ceef4b8f 100644
--- a/subsys/pm/device_runtime.c
+++ b/subsys/pm/device_runtime.c
@@ -364,6 +364,14 @@ int pm_device_runtime_put_async(const struct device *dev, k_timeout_t delay)
 		k_spin_unlock(&pm_sync->lock, k);
 	} else {
 		ret = runtime_suspend(dev, true, delay);
+
+		const struct device *domain = PM_DOMAIN(&pm->base);
+
+		if ((ret == 0) && domain != NULL &&
+		    atomic_test_bit(&dev->pm_base->flags, PM_DEVICE_FLAG_PD_CLAIMED) &&
+		    domain->pm_base->usage > 1) {
+			ret = pm_device_runtime_put(domain);
+		}
 	}
 	SYS_PORT_TRACING_FUNC_EXIT(pm, device_runtime_put_async, dev, delay, ret);
 

Environment:

  • OS: Linux (Ubuntu 24.04)
  • Toolchain: Zephyr SDK 16.8.0
  • Version: v3.7.0
@simengangstad simengangstad added the bug The issue is a bug, or the PR is fixing a bug label Mar 13, 2025
Copy link

Hi @simengangstad! We appreciate you submitting your first issue for our open-source project. 🌟

Even though I'm a bot, I can assure you that the whole community is genuinely grateful for your time and effort. 🤖💙

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Power Management bug The issue is a bug, or the PR is fixing a bug
Projects
None yet
Development

No branches or pull requests

4 participants