Introducing Critique, a new multi-model deep research sys…

Microsoft Researcher adds multi-model intelligence with Critique and Council. Critique separates generation and review across models to boost factual accuracy, breadth, and presentation quality, enforcing rubric-based evaluation and stricter citation grounding. Council presents parallel model reports plus a judge synthesis highlighting divergences.

Today, Researcher in Microsoft 365 Copilot adds multi-model intelligence. This release introduces Critique and Council for deeper, more reliable research outputs. The change moves Researcher from a single-model design to a generator-plus-reviewer architecture (Critique) and a model-comparison architecture (Council).

Main change and impact

Critique separates generation and evaluation into distinct model roles to improve quality. One model handles planning, retrieval, and drafting, while a second model reviews and refines the draft. Evaluations on the DRACO benchmark show a +7.0 point improvement in aggregate score over prior single-model systems. The architecture improves factual accuracy, analytical depth, presentation, and citation quality across most domains.
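The generator-plus-reviewer split can be illustrated with a small Python sketch. This is not Microsoft's implementation: the names `Claim`, `Review`, `generate`, `review`, `refine`, and `critique_pipeline` are hypothetical, the two "models" are hard-coded stand-ins, and the only rubric rule shown is "every claim needs at least one source".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Claim:
    text: str
    sources: List[str] = field(default_factory=list)  # citations backing the claim


@dataclass
class Review:
    approved: bool
    ungrounded: List[str]  # texts of claims that carry no citation


def generate(question: str) -> List[Claim]:
    """Hypothetical stand-in for the generator model (plan, retrieve, draft)."""
    return [
        Claim("Claim backed by a retrieved document.", ["doc-1"]),
        Claim("Claim with no supporting source."),
    ]


def review(draft: List[Claim]) -> Review:
    """Hypothetical stand-in for the reviewer model: flag uncited claims."""
    ungrounded = [claim.text for claim in draft if not claim.sources]
    return Review(approved=not ungrounded, ungrounded=ungrounded)


def refine(draft: List[Claim], feedback: Review) -> List[Claim]:
    """Drop flagged claims (assumption: a real system would more likely
    revise them or retrieve further evidence)."""
    return [claim for claim in draft if claim.text not in feedback.ungrounded]


def critique_pipeline(question: str, max_rounds: int = 2) -> List[Claim]:
    """Draft once, then alternate review and refinement until approved."""
    draft = generate(question)
    for _ in range(max_rounds):
        feedback = review(draft)
        if feedback.approved:
            break
        draft = refine(draft, feedback)
    return draft


if __name__ == "__main__":
    for claim in critique_pipeline("example question"):
        print(claim.text, claim.sources)
```

The point of the sketch is the separation of roles: the drafting step never approves its own output, and the review step only evaluates and reports, so each can be swapped for a different model independently.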

Practical implications

Teams will get stronger, more defensible research artifacts with evidence-grounded claims. Council enables side-by-side model reports plus a synthesized cover letter highlighting agreements and divergences. Administrators can choose Auto for Critique or Model Council for comparisons in the model picker. Organizations should plan for slightly different token usage and validation workflows when adopting these features.

“Critique will be the default experience in Researcher, available when Auto is selected in the model picker.”

This update matters for high-stakes research and regulated domains requiring verifiable sourcing. Next steps are enabling Critique or Council in pilot projects, tracking DRACO-like metrics, and updating review checklists. Monitor token usage and evaluation outcomes as you integrate the new Researcher capabilities.
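The Council pattern described above (parallel reports plus a judge summary of agreements and divergences) can be sketched in the same spirit. Everything here is an assumption made for illustration: the names `CouncilResult` and `council` are hypothetical, each "model" is a plain function returning a set of finding strings, and the "judge" is a simple set comparison, whereas the real judge is itself a model producing a written synthesis.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class CouncilResult:
    reports: Dict[str, Set[str]]      # one report per model, shown side by side
    agreements: Set[str]              # findings every model reported
    divergences: Dict[str, Set[str]]  # findings not shared by all models, per model


def council(question: str,
            models: Dict[str, Callable[[str], Set[str]]]) -> CouncilResult:
    """Run each hypothetical model on the question, then compare the reports."""
    if not models:
        raise ValueError("council needs at least one model")
    reports = {name: model(question) for name, model in models.items()}
    # Toy judge step: shared findings versus the remainder for each model.
    agreements = set.intersection(*reports.values())
    divergences = {name: findings - agreements
                   for name, findings in reports.items()}
    return CouncilResult(reports, agreements, divergences)


if __name__ == "__main__":
    # Stand-in models with made-up findings, for illustration only.
    stand_ins = {
        "model_a": lambda q: {"finding 1", "finding 2"},
        "model_b": lambda q: {"finding 2", "finding 3"},
    }
    result = council("example question", stand_ins)
    print("Agreements:", sorted(result.agreements))
    for name, extra in result.divergences.items():
        print(name, "diverges on:", sorted(extra))
```

Keeping the individual reports in the result, and not only the synthesis, mirrors the side-by-side presentation: a reviewer can check the judge's summary against what each model actually said.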

Key points from the article:

  • Critique splits generation and evaluation for higher research quality.
  • Critique improved DRACO aggregate score by +7.0 points.
  • Council shows side-by-side model reports with judge summary.
  • Reviewer enforces strict evidence grounding and citation quality.
  • Available in the Frontier program; Critique is the default when Auto is selected.