Question 1

Is PWM-Bench a model?

Accepted Answer

No. It is a benchmark framework: a task definition, a scoring methodology, baselines, leakage controls, and a governance protocol. It does not ship a model.

Question 2

Is PWM-Bench a dataset?

Accepted Answer

No. Many datasets may instantiate it. PWM-Bench specifies how forecasts are made, sealed, resolved, and scored; the underlying participant data stays under participant control.

Question 3

Does PWM-Bench claim to solve person understanding?

Accepted Answer

No. It proposes a way to measure progress toward it. The benchmark makes the claim falsifiable, not settled.

Question 4

Why forecasting?

Accepted Answer

Understanding is latent and cannot be observed directly. Forecasting is observable: if a system understands an individual, it should predict that individual's future better than baselines built only from population knowledge and the individual's personal routine.
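To make "better than population knowledge and personal routine" concrete, here is a minimal sketch of a skill score measured against two baselines. It assumes binary outcomes and the Brier score; this section does not state which scoring rule PWM-Bench actually uses, and the names brier_score, skill_score, and the example numbers are illustrative only.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def skill_score(model_probs: Sequence[float],
                baseline_probs: Sequence[float],
                outcomes: Sequence[int]) -> float:
    """1 - BS_model / BS_baseline. Positive means the model beats the baseline."""
    baseline = brier_score(baseline_probs, outcomes)
    if baseline == 0.0:
        raise ValueError("baseline is perfect; skill score is undefined")
    return 1.0 - brier_score(model_probs, outcomes) / baseline


# Hypothetical data for one individual (not PWM-Bench results).
outcomes = [1, 0, 1, 1]
population = [0.5, 0.5, 0.5, 0.5]      # assumed population-rate baseline
routine = [0.75, 0.25, 0.75, 0.75]     # assumed personal-routine baseline
model = [0.9, 0.1, 0.9, 0.9]           # assumed system forecasts

print("skill vs population:", round(skill_score(model, population, outcomes), 2))
print("skill vs routine:", round(skill_score(model, routine, outcomes), 2))
```

Under this reading, a system only counts as showing person-level understanding if both skill scores are positive, that is, if it beats the population baseline and the routine baseline on the same sealed forecasts.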

Question 5

Why not self-report?

Accepted Answer

Self-report is valuable but retrospective and incomplete. PWM-Bench tests whether evidence beyond self-report improves forecasts that can be checked against future outcomes, a property self-report alone cannot establish.

Question 6

Why is this not just personalization?

Accepted Answer

Personalization predicts outputs. PWM-Bench tests whether a system can forecast the evolving state that generates those outputs (attention, goals, and goal-state transitions) under sealed, prospective conditions.

Question 7

Are there results yet?

Accepted Answer

No. The current release is pre-pilot. Scores will be reported in PWM-Pilot. No empirical results are currently on the leaderboard.

Question 8

Will raw participant data be public?

Accepted Answer

No. PWM-Bench is designed for federated execution and aggregate-only reporting. Raw data stays under participant control; only resolved outcomes and aggregate metrics leave the client.
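A minimal sketch of what aggregate-only reporting could look like, assuming each client scores its own forecasts locally and shares only a count and a summed error. The ClientReport type, its field names, and the use of squared error are assumptions for illustration, not part of any published PWM-Bench interface.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass(frozen=True)
class ClientReport:
    """The only thing that leaves the client in this sketch."""
    n_forecasts: int
    sum_squared_error: float


def local_report(probs: Sequence[float], outcomes: Sequence[int]) -> ClientReport:
    """Runs on the participant's device; raw probs and outcomes stay local."""
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have equal length")
    sse = sum((p - o) ** 2 for p, o in zip(probs, outcomes))
    return ClientReport(n_forecasts=len(probs), sum_squared_error=sse)


def pooled_brier(reports: Iterable[ClientReport]) -> float:
    """Runs on the coordinator; sees only per-client aggregates."""
    reports = list(reports)
    total = sum(r.n_forecasts for r in reports)
    if total == 0:
        raise ValueError("no forecasts reported")
    return sum(r.sum_squared_error for r in reports) / total


# Hypothetical clients.
report_a = local_report([1.0, 0.0], [1, 0])
report_b = local_report([0.5, 0.5], [1, 0])
print(pooled_brier([report_a, report_b]))
```

Sharing sums and counts rather than per-forecast errors lets the coordinator compute a correctly weighted pooled score without ever receiving an individual forecast.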

Question 9

What stops a model from just memorizing a person?

Accepted Answer

The identity-permutation test. If a system's apparent skill survives when its forecasts are scored against a different individual's outcomes, that skill was not person-specific. PWM-Bench requires skill to collapse under permutation.
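The idea can be sketched as follows, assuming binary outcomes, a Brier-style error, and outcome series of equal length across individuals. For determinism this sketch uses a single rotation of identities, so nobody is scored against themselves; a real test would presumably average over many random permutations. The function names and data are hypothetical.

```python
from typing import Dict, List, Tuple


def brier(probs: List[float], outcomes: List[int]) -> float:
    """Mean squared error between probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def permutation_gap(forecasts: Dict[str, List[float]],
                    outcomes: Dict[str, List[int]]) -> Tuple[float, float]:
    """Return (matched_error, permuted_error) averaged over individuals."""
    ids = sorted(forecasts)
    if len(ids) < 2:
        raise ValueError("need at least two individuals to permute identities")
    # Rotate identities by one so every forecast meets someone else's outcomes.
    rotated = ids[1:] + ids[:1]
    matched = sum(brier(forecasts[i], outcomes[i]) for i in ids) / len(ids)
    permuted = sum(brier(forecasts[i], outcomes[j])
                   for i, j in zip(ids, rotated)) / len(ids)
    return matched, permuted


# Hypothetical forecasts and resolved outcomes.
forecasts = {"alice": [0.9, 0.1, 0.9], "bob": [0.1, 0.9, 0.1]}
outcomes = {"alice": [1, 0, 1], "bob": [0, 1, 0]}

matched, permuted = permutation_gap(forecasts, outcomes)
print("matched error:", round(matched, 2), "permuted error:", round(permuted, 2))
```

Person-specific skill shows up as a matched error well below the permuted error. If the two are close, the system is leaning on patterns that hold for anyone, not on the individual.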

Question 10

How do you prevent leakage from the future?

Accepted Answer

Forecasts are sealed and timestamped before outcomes occur, evaluation is strict walk-forward with no random cross-validation, and no system may access evidence dated after its forecast time.
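One way these rules could be checked is sketched below. It assumes sealing is done with a SHA-256 hash commitment over the forecast and its timestamp; this section does not specify the actual sealing mechanism, so seal, verify, and admissible are hypothetical names.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Iterable


def seal(forecast: dict, forecast_time: datetime) -> str:
    """Commit to a forecast and its timestamp by hashing a canonical encoding."""
    record = {"forecast": forecast, "forecast_time": forecast_time.isoformat()}
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(forecast: dict, forecast_time: datetime, digest: str) -> bool:
    """True only if the revealed forecast matches the digest published earlier."""
    return seal(forecast, forecast_time) == digest


def admissible(evidence_times: Iterable[datetime],
               forecast_time: datetime,
               outcome_time: datetime) -> bool:
    """Forecast must precede the outcome and use no evidence dated after it."""
    return forecast_time < outcome_time and all(
        t <= forecast_time for t in evidence_times
    )


# Hypothetical forecast sealed at noon UTC for an outcome that evening.
t_forecast = datetime(2025, 1, 10, 12, 0, tzinfo=timezone.utc)
t_outcome = datetime(2025, 1, 10, 20, 0, tzinfo=timezone.utc)
forecast = {"event": "goal_state_transition", "probability": 0.7}
digest = seal(forecast, t_forecast)

print(verify(forecast, t_forecast, digest))
print(admissible([datetime(2025, 1, 9, tzinfo=timezone.utc)], t_forecast, t_outcome))
```

Publishing the digest before the outcome resolves lets anyone later confirm that the forecast was not edited after the fact, without revealing its contents early.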