Advisory Benchmark Pilot

Compare a classical baseline with a future hybrid-style case-priority ranking.

This internal pilot stays downstream of the governed reviewer layer. It only ranks de-identified retrospective cases and never writes back into reviewer ownership, GP sign-off, or final case export truth.

Cases analysed

0

Completion-gated retrospective cases included in this pilot.

Bundle window

500

Recent bundles requested for retrospective evaluation.

Highest priority target

0

Cases where divergence or a signed-off override should surface first.

Elevated target

0

Cases with testing-led caution or missing comparable benchmark snapshots.

Median completion time

Not yet calculable

Used here only as a retrospective process-friction signal.

Boundary

Completion-gated accepted-pathway cases are compiled into an advisory benchmark set used to rank likely divergence and review-priority signals across a configurable retrospective bundle window. The priority target is expressed through explicit reviewer-defined benchmark labels derived from saved pathway direction, testing prompts, signed-off overrides, partial alignment, and missing comparable snapshots. It is a benchmark label, not a clinical truth label.

Current window: 0 adjudicated cases and 0 derived-label fallback cases.

If you want the pilot to prefer explicit reviewer labels over derived workflow semantics, use the benchmark adjudication view for the same retrospective window first.
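The preference for explicit reviewer labels over derived workflow semantics amounts to a simple fallback rule per case. A minimal sketch, assuming hypothetical field names (`adjudicated_label`, `derived_label`) rather than the pilot's actual schema:

```python
def resolve_benchmark_label(case: dict) -> tuple:
    """Prefer an explicit reviewer-adjudicated label; otherwise fall back
    to the label derived from workflow semantics.

    Returns (label, provenance) so the ranking layer can report how many
    cases in the window are adjudicated vs. derived-label fallbacks.
    """
    if case.get("adjudicated_label") is not None:
        return case["adjudicated_label"], "adjudicated"
    return case.get("derived_label", "unlabelled"), "derived"
```

Tracking the provenance alongside the label is what lets the window summary report adjudicated and derived-label fallback counts separately.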

Retrospective only

The pilot only reads completion-gated retrospective cases and never touches the live reviewer queue.

Advisory only

The output is a ranking and score explanation, not a diagnosis, treatment decision, or workflow action.

Governance preserving

Nothing in this benchmark writes back into reviewer status, GP sign-off, handoff state, or export truth.

De-identified benchmark

The experiment works from the de-identified retrospective/benchmark layer rather than raw intake text or identifiable patient views.

Reviewer-defined benchmark labels

The pilot now uses explicit benchmark labels derived from reviewer and clinician workflow semantics, making the target layer easier to inspect and less ambiguous than a single broad priority bucket.

Classical baseline

Ranking view

A deterministic weighted score over explicit governance, uncertainty, and duration signals already captured in the benchmark layer.

This is the baseline comparator and remains fully classical, transparent, and auditable.
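The classical baseline described above can be sketched as a transparent weighted sum. Signal names and weights below are illustrative assumptions, not the pilot's actual configuration:

```python
# Hypothetical signal names grouped by the three families the baseline
# scores over: governance, uncertainty, and duration. Weights are
# illustrative only.
WEIGHTS = {
    "signed_off_override": 3.0,        # governance
    "pathway_divergence": 2.5,         # governance
    "testing_prompted_caution": 1.5,   # uncertainty
    "missing_benchmark_snapshot": 1.0, # uncertainty
    "completion_time_zscore": 0.5,     # duration / process friction
}

def classical_priority_score(signals: dict) -> float:
    """Deterministic weighted sum over explicit benchmark-layer signals.
    Missing signals contribute zero, so the score is fully auditable
    from the case record alone."""
    return sum(WEIGHTS[name] * float(signals.get(name, 0.0)) for name in WEIGHTS)

def rank_cases(cases: list) -> list:
    # Python's sort is stable, so equal scores preserve input order,
    # which keeps the ranking reproducible and auditable.
    return sorted(cases, key=lambda c: classical_priority_score(c["signals"]), reverse=True)
```

Because the score is a fixed linear function of recorded signals, any rank position can be explained by listing each signal's weighted contribution.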

Precision@5

0%

nDCG@10

0%

Highest-priority recall@10

0%

Precision@20

0%

nDCG@20

0%

Highest-priority recall@20

0%

Coverage

0%

No completion-gated benchmark cases are available yet.
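The panel metrics above (Precision@k, nDCG@k, highest-priority recall@k) are standard ranking measures. A minimal sketch of how they would be computed over graded benchmark labels, assuming integer grades where 2 = highest priority, 1 = elevated, 0 = neither:

```python
import math

def precision_at_k(ranked_labels: list, k: int) -> float:
    """Fraction of the top-k ranked cases with any priority label (> 0)."""
    return sum(1 for g in ranked_labels[:k] if g > 0) / k

def ndcg_at_k(ranked_labels: list, k: int) -> float:
    """Normalised discounted cumulative gain over graded labels:
    DCG of the produced ranking divided by DCG of the ideal ranking."""
    def dcg(labels):
        return sum(g / math.log2(i + 2) for i, g in enumerate(labels[:k]))
    ideal = dcg(sorted(ranked_labels, reverse=True))
    return dcg(ranked_labels) / ideal if ideal > 0 else 0.0

def highest_priority_recall_at_k(ranked_labels: list, k: int, highest: int = 2) -> float:
    """Share of all highest-priority cases that surface in the top-k."""
    total = sum(1 for g in ranked_labels if g == highest)
    found = sum(1 for g in ranked_labels[:k] if g == highest)
    return found / total if total else 0.0
```

Coverage, by contrast, is simply the share of window cases that had enough benchmark-layer data to be scored at all.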

Future hybrid-style comparator

Ranking view

A deterministic orchestration-style score that emphasizes governance tension, benchmark uncertainty, process friction, and limited pathway balancing, modelled on a future hybrid-service pattern.

This models a future hybrid orchestration boundary but still runs offline and never writes into live reviewer workflow.
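One way the comparator could differ from a flat weighted sum while staying deterministic is to let governance tension and uncertainty interact, then apply a greedy pathway-balancing pass over the ranked list. Everything below is an illustrative assumption about the orchestration pattern, not the pilot's actual scoring:

```python
def hybrid_style_score(signals: dict) -> float:
    """Hypothetical orchestration-style score: the same benchmark-layer
    signals as the baseline, but governance tension and uncertainty
    reinforce each other multiplicatively instead of adding independently."""
    governance = signals.get("signed_off_override", 0.0) + signals.get("pathway_divergence", 0.0)
    uncertainty = signals.get("testing_prompted_caution", 0.0) + signals.get("missing_benchmark_snapshot", 0.0)
    friction = signals.get("completion_time_zscore", 0.0)
    return governance + uncertainty + 0.5 * friction + governance * uncertainty

def rank_with_pathway_balancing(cases: list, max_per_pathway: int = 3) -> list:
    """Greedy re-rank: cap how many cases from any one pathway occupy the
    head of the list, deferring the overflow to the tail. Still offline,
    still deterministic, no writes to live reviewer workflow."""
    ranked = sorted(cases, key=lambda c: hybrid_style_score(c["signals"]), reverse=True)
    head, counts, deferred = [], {}, []
    for case in ranked:
        pathway = case.get("pathway", "unknown")
        if counts.get(pathway, 0) < max_per_pathway:
            head.append(case)
            counts[pathway] = counts.get(pathway, 0) + 1
        else:
            deferred.append(case)
    return head + deferred
```

The balancing pass is what "limited pathway-balancing" could mean in practice: a hard cap per pathway near the top of the ranking, so one pathway cannot crowd out every review slot.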

Precision@5

0%

nDCG@10

0%

Highest-priority recall@10

0%

Precision@20

0%

nDCG@20

0%

Highest-priority recall@20

0%

Coverage

0%

No completion-gated benchmark cases are available yet.