Natural-Language Data Querying Products: Current State and Selection Notes

<main>
  <header class="hero" id="top">
    <div>
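      <h1>Natural-Language Data Querying Products: Current State and Selection Notes</h1>
      <p>From "AI writes SQL" to "trusted data agents driven by a semantic layer": mainstream products are connecting natural-language querying, charts, permissions, metric definitions, insight reports, and agent workflows into one system.</p>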
      <div class="hero-meta">
        <span class="pill">ChatBI</span>
        <span class="pill">Text-to-SQL</span>
        <span class="pill">Semantic Layer</span>
        <span class="pill">Agentic BI</span>
      </div>
    </div>
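    <div class="hero-panel" aria-label="Key observations">
      <div class="stat-grid">
        <div class="stat"><b>4 categories</b><span>Commercial BI, cloud data platforms, Chinese vendors, open-source self-build</span></div>
        <div class="stat"><b>1 main thread</b><span>From natural language to SQL toward trusted semantics and governance</span></div>
        <div class="stat"><b>30-50 questions</b><span>For an MVP, validate accuracy first on a set of high-frequency questions</span></div>
        <div class="stat"><b>Not the model</b><span>Success hinges on metric definitions, permissions, and ongoing operations</span></div>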
      </div>
    </div>
  </header>

  <section id="overview">
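    <h2>At a glance: where products are heading</h2>
    <p class="lead">Natural-language data querying is no longer just a matter of translating a sentence into SQL. The more mature the product, the more it emphasizes semantic models, metric governance, permissions, safe execution, explainable answers, and feedback-driven evaluation.</p>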
    <div class="grid cols-4">
      <article class="card">
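        <span class="tag">Commercial BI adds AI</span>
        <h3>From dashboards to conversation</h3>
        <p>Power BI, Tableau, ThoughtSpot, FineBI, and others place a natural-language entry point inside an existing BI stack. Their strength is mature permissions, reports, charts, and delivery.</p>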
      </article>
      <article class="card">
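        <span class="tag green">Cloud-platform native</span>
        <h3>Starting from the data foundation</h3>
        <p>Snowflake, Databricks, Looker, and QuickSight put more weight on integration with warehouses, lakehouses, catalogs, permissions, and compute resources.</p>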
      </article>
      <article class="card">
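        <span class="tag amber">Chinese market</span>
        <h3>Chinese language and private deployment</h3>
        <p>Quick BI 小Q, FineChatBI, Smartbi, 观远 (Guandata), 衡石 (HENGSHI), and others put more emphasis on Chinese business terminology, private deployment, Xinchuang (domestic IT stack) requirements, and industry-specific delivery.</p>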
      </article>
      <article class="card">
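        <span class="tag cyan">Open-source self-build</span>
        <h3>More control, more work</h3>
        <p>SQLBot, SuperSonic, WrenAI, DB-GPT, and others suit PoCs and custom development, but enterprise-grade governance, evaluation, and operations are left for you to build.</p>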
      </article>
    </div>
  </section>

  <section class="alt" id="map">
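    <h2>Market map: the trade-off between maturity and control</h2>
    <p class="lead">Further right means higher enterprise maturity; further up means more in-house control. What most teams really need to choose is not the "strongest product" but the "route that best fits their own data foundation and delivery constraints."</p>
    <div class="matrix" role="img" aria-label="Market quadrant of natural-language data querying products">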
      <svg viewBox="0 0 980 460" xmlns="http://www.w3.org/2000/svg">
        <rect width="980" height="460" fill="#fff"/>
        <line x1="100" y1="380" x2="900" y2="380" stroke="#8ea0b3" stroke-width="2"/>
        <line x1="100" y1="380" x2="100" y2="60" stroke="#8ea0b3" stroke-width="2"/>
        <line x1="500" y1="60" x2="500" y2="380" stroke="#d9e0e8" stroke-dasharray="6 8"/>
        <line x1="100" y1="220" x2="900" y2="220" stroke="#d9e0e8" stroke-dasharray="6 8"/>
        <text x="430" y="425" fill="#5d6876" font-size="16">Product maturity / enterprise delivery</text>
        <text x="28" y="245" fill="#5d6876" font-size="16" transform="rotate(-90 28 245)">Control / customizability</text>

        <g font-size="13" font-weight="700">
          <rect x="665" y="92" width="150" height="34" rx="8" fill="#e9f7f0" stroke="#b8dec9"/><text x="683" y="114" fill="#12724c">FineBI / Smartbi</text>
          <rect x="695" y="142" width="166" height="34" rx="8" fill="#e8f8fb" stroke="#b6e1e8"/><text x="714" y="164" fill="#08798a">Power BI / Tableau</text>
          <rect x="595" y="194" width="172" height="34" rx="8" fill="#eef4ff" stroke="#bfd0ee"/><text x="613" y="216" fill="#2057b7">ThoughtSpot / Looker</text>
          <rect x="605" y="252" width="180" height="34" rx="8" fill="#fff5df" stroke="#ead29b"/><text x="623" y="274" fill="#8a5400">Snowflake / Databricks</text>
          <rect x="380" y="118" width="154" height="34" rx="8" fill="#f2effc" stroke="#d0c7ef"/><text x="398" y="140" fill="#6045ad">SQLBot / DB-GPT</text>
          <rect x="320" y="174" width="176" height="34" rx="8" fill="#f2effc" stroke="#d0c7ef"/><text x="338" y="196" fill="#6045ad">SuperSonic / WrenAI</text>
          <rect x="228" y="264" width="160" height="34" rx="8" fill="#fff0ee" stroke="#ecc1bd"/><text x="246" y="286" fill="#9b2b27">Chat2DB / SQLChat</text>
        </g>

        <text x="118" y="82" fill="#334155" font-size="14" font-weight="700">High control, build your own governance</text>
        <text x="665" y="82" fill="#334155" font-size="14" font-weight="700">Mature products, ecosystem lock-in</text>
        <text x="116" y="363" fill="#334155" font-size="14" font-weight="700">Tool-style exploration</text>
        <text x="668" y="363" fill="#334155" font-size="14" font-weight="700">Platform-native capabilities</text>
      </svg>
    </div>
  </section>

  <section id="global">
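    <h2>Overseas products: from Copilot to Agentic Analytics</h2>
    <p class="lead">Leading overseas products generally no longer pitch standalone NLQ. They tie the natural-language entry point to semantic models, data permissions, report authoring, embedded analytics, and agent workflows.</p>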
    <div class="grid cols-2">
      <article class="card product">
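        <div class="product-head"><h3>Power BI Copilot</h3><span class="tag">Microsoft ecosystem</span></div>
        <p>Suits organizations already on Power BI/Fabric. Its strengths are report summarization, DAX assistance, Q&amp;A driven by the semantic model, and integration with the Office ecosystem.</p>
        <div class="pros-cons">
          <div class="mini"><b>Strengths</b><ul><li>Mature BI foundation</li><li>Unified management of permissions and capacity</li><li>Clear productivity gains for report authors</li></ul></div>
          <div class="mini"><b>Limitations</b><ul><li>Requires paid capacity</li><li>Non-English and regional restrictions need verification</li><li>Ties you to the Microsoft ecosystem</li></ul></div>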
        </div>
      </article>
      <article class="card product">
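        <div class="product-head"><h3>Tableau Agent</h3><span class="tag cyan">Visual analytics</span></div>
        <p>Works more like an analyst's assistant for chart building and exploration. Suited to using natural language to generate charts and calculated fields, apply filters and sorts, and get exploration suggestions.</p>
        <div class="pros-cons">
          <div class="mini"><b>Strengths</b><ul><li>Strong visual expression</li><li>Good for analytical authoring</li><li>Works well with the Tableau UI</li></ul></div>
          <div class="mini"><b>Limitations</b><ul><li>Not an open-ended chatbot</li><li>Limited with complex data blending</li><li>Still depends on data modeling</li></ul></div>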
        </div>
      </article>
      <article class="card product">
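        <div class="product-head"><h3>ThoughtSpot Spotter</h3><span class="tag green">AI Analyst</span></div>
        <p>A representative search-driven analytics product, emphasizing natural-language Q&amp;A, visualization, self-service analysis, and an enterprise trust layer.</p>
        <div class="pros-cons">
          <div class="mini"><b>Strengths</b><ul><li>Mature experience for business users</li><li>Deep track record in search-driven analytics</li><li>Strong embedded analytics</li></ul></div>
          <div class="mini"><b>Limitations</b><ul><li>Relatively high commercial cost</li><li>Depends on model and field governance</li><li>Deployment in China needs evaluation</li></ul></div>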
        </div>
      </article>
      <article class="card product">
        <div class="product-head"><h3>Looker Conversational Analytics</h3><span class="tag violet">LookML semantic layer</span></div>
        <p>Built on Gemini and LookML, it runs natural-language questions inside governed Explores and Data Agents.</p>
        <div class="pros-cons">
          <div class="mini"><b>Strengths</b><ul><li>Strong semantic layer</li><li>Consistent metric definitions</li><li>Can be used to build data agents</li></ul></div>
          <div class="mini"><b>Limitations</b><ul><li>Steep LookML learning curve</li><li>Ties you to Google/Looker</li><li>Compliance boundaries need review</li></ul></div>
        </div>
      </article>
      <article class="card product">
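        <div class="product-head"><h3>Snowflake Cortex Analyst</h3><span class="tag amber">API-first</span></div>
        <p>Closer to a Snowflake-native querying API. It uses a semantic model or semantic views to turn business questions into trusted SQL.</p>
        <div class="pros-cons">
          <div class="mini"><b>Strengths</b><ul><li>Integrates with Snowflake permissions</li><li>Flexible API integration</li><li>Clear semantic-model path</li></ul></div>
          <div class="mini"><b>Limitations</b><ul><li>You build the front-end experience yourself</li><li>Ties you to Snowflake</li><li>Multi-turn conversation still has limits</li></ul></div>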
        </div>
      </article>
      <article class="card product">
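        <div class="product-head"><h3>Databricks Genie</h3><span class="tag green">Lakehouse query space</span></div>
        <p>Creates domain-specific querying entry points around a Genie Space, in which analysts configure datasets, example SQL, business instructions, and evaluations.</p>
        <div class="pros-cons">
          <div class="mini"><b>Strengths</b><ul><li>Unity Catalog governance</li><li>Closed loop across SQL, tables, and visualization</li><li>Supports APIs and external embedding</li></ul></div>
          <div class="mini"><b>Limitations</b><ul><li>Ties you to Databricks</li><li>Spaces need ongoing curation</li><li>Compute cost needs to be controlled</li></ul></div>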
        </div>
      </article>
    </div>
  </section>

  <section class="alt" id="domestic">
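    <h2>Chinese products: Chinese business language, private deployment, and industry delivery matter more</h2>
    <p class="lead">Chinese vendors often deliver natural-language querying as part of BI, data portals, business analysis, mobile office, and private-deployment projects. Their advantage lies less in the model than in Chinese business terminology, permissions, a closed loop with reports, and on-site implementation.</p>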
    <table>
      <thead>
        <tr>
          <th>Product</th>
          <th>Main features</th>
          <th>Strengths</th>
          <th>Points to watch</th>
        </tr>
      </thead>
      <tbody>
        <tr>
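          <td>Quick BI 小Q (natural-language querying)</td>
          <td>Natural-language querying on PC and mobile, dataset Q&amp;A, dashboard Q&amp;A, multi-turn dialogue</td>
          <td>Fits well with the Alibaba Cloud and Quick BI ecosystem; friendly for Chinese-language use and mobile office</td>
          <td>Sold as an add-on module with edition and region restrictions; tied to Quick BI datasets</td>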
        </tr>
        <tr>
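          <td>FineBI / FineChatBI</td>
          <td>Trusted data lookup, smart mode and fast mode, integration with FineBI reports and the data portal</td>
          <td>Mature delivery in China; strong private deployment and complex reporting</td>
          <td>Closed-source commercial product with limited room for custom development; depends on upfront data governance</td>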
        </tr>
        <tr>
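          <td>Smartbi AIChat</td>
          <td>Natural-language querying, complex calculations, chart generation, drill-down guidance, ambiguity clarification</td>
          <td>Focuses on complex metric calculations such as year-over-year, period-over-period, and cumulative values; suits government, enterprise, and finance</td>
          <td>Advertised accuracy should be verified in a PoC on real business data</td>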
        </tr>
        <tr>
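          <td>观远 (Guandata) Q&amp;A Agent</td>
          <td>Intent recognition, knowledge retrieval, question understanding, data querying, visualization, and insight suggestions</td>
          <td>Strongly scenario-oriented; close to business review meetings and business analysis</td>
          <td>Depends on ongoing maintenance of the business knowledge base and metric definitions</td>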
        </tr>
        <tr>
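          <td>衡石 (HENGSHI) Data Agent</td>
          <td>Text2Metrics, metric management, Agentic BI, embedded BI</td>
          <td>Constrains querying through a metric layer, which reduces pure-SQL hallucination; suits ISVs</td>
          <td>Building the metric system is costly; not a fit for teams that only need lightweight table lookups</td>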
        </tr>
        <tr>
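          <td>网易知数 / 有数 ChatBI (NetEase)</td>
          <td>Metric system, knowledge base, data-analysis agent</td>
          <td>Built on data-analysis practice from the internet industry; suits operations and growth scenarios</td>
          <td>Relatively little public material; confirm product boundaries in a PoC</td>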
        </tr>
      </tbody>
    </table>
  </section>

  <section id="opensource">
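    <h2>Open-source options: good for validation, but production needs extra work</h2>
    <p class="lead">The value of open-source products is control and the chance to learn from them. Before going live, pay particular attention to licensing, the permission model, SQL safety, the evaluation setup, and long-term maintenance.</p>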
    <div class="grid cols-3">
      <article class="card product">
        <div class="product-head"><h3>SQLBot</h3><span class="tag red">Chinese PoC</span></div>
        <p>A natural-language querying system built on LLMs and RAG that can quickly get a ChatBI, SQL, and chart loop running.</p>
        <div class="score">
          <div class="score-row"><span>Onboarding</span><div class="bar green"><i style="width: 86%"></i></div></div>
          <div class="score-row"><span>Control</span><div class="bar amber"><i style="width: 68%"></i></div></div>
          <div class="score-row"><span>Governance</span><div class="bar"><i style="width: 56%"></i></div></div>
        </div>
      </article>
      <article class="card product">
        <div class="product-head"><h3>SuperSonic</h3><span class="tag green">Semantic layer</span></div>
        <p>Unifies ChatBI and Headless BI, with a built-in semantic model, auto-completion, multi-turn dialogue, and three-level permissions.</p>
        <div class="score">
          <div class="score-row"><span>Onboarding</span><div class="bar amber"><i style="width: 54%"></i></div></div>
          <div class="score-row"><span>Control</span><div class="bar green"><i style="width: 78%"></i></div></div>
          <div class="score-row"><span>Governance</span><div class="bar green"><i style="width: 82%"></i></div></div>
        </div>
      </article>
      <article class="card product">
        <div class="product-head"><h3>WrenAI</h3><span class="tag violet">Agent GenBI</span></div>
        <p>An open context layer for AI agents that uses MDL, business definitions, examples, and memory to generate trusted BI.</p>
        <div class="score">
          <div class="score-row"><span>Onboarding</span><div class="bar amber"><i style="width: 50%"></i></div></div>
          <div class="score-row"><span>Control</span><div class="bar violet"><i style="width: 88%"></i></div></div>
          <div class="score-row"><span>Governance</span><div class="bar green"><i style="width: 80%"></i></div></div>
        </div>
      </article>
      <article class="card product">
        <div class="product-head"><h3>DB-GPT</h3><span class="tag cyan">Data assistant</span></div>
        <p>Covers SQL, Python analysis, skills, RAG, sandboxing, charts, and HTML reports; suited to building data-analysis agents.</p>
        <div class="score">
          <div class="score-row"><span>Onboarding</span><div class="bar cyan"><i style="width: 66%"></i></div></div>
          <div class="score-row"><span>Control</span><div class="bar green"><i style="width: 82%"></i></div></div>
          <div class="score-row"><span>Governance</span><div class="bar amber"><i style="width: 58%"></i></div></div>
        </div>
      </article>
      <article class="card product">
        <div class="product-head"><h3>QueryWeaver</h3><span class="tag amber">Graph Text2SQL</span></div>
        <p>Uses a graph structure to understand the schema and provides a REST API, MCP, and a front end; suited to research on complex table relationships.</p>
        <div class="score">
          <div class="score-row"><span>Onboarding</span><div class="bar amber"><i style="width: 62%"></i></div></div>
          <div class="score-row"><span>Control</span><div class="bar green"><i style="width: 76%"></i></div></div>
          <div class="score-row"><span>Governance</span><div class="bar"><i style="width: 46%"></i></div></div>
        </div>
      </article>
      <article class="card product">
        <div class="product-head"><h3>Chat2DB / SQLChat / PandasAI</h3><span class="tag">Tooling</span></div>
        <p>Better suited to developers, DBAs, analysts, or notebook scenarios; not recommended as the official querying entry point for large numbers of business users.</p>
        <div class="score">
          <div class="score-row"><span>Onboarding</span><div class="bar green"><i style="width: 80%"></i></div></div>
          <div class="score-row"><span>Control</span><div class="bar amber"><i style="width: 64%"></i></div></div>
          <div class="score-row"><span>Governance</span><div class="bar"><i style="width: 36%"></i></div></div>
        </div>
      </article>
    </div>
  </section>

  <section class="alt" id="compare">
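    <h2>Three things to check when selecting</h2>
    <p class="lead">Do not compare only whether a product "can answer." Long-term value is decided by semantic governance, how the product integrates, and a closed loop of operations and evaluation.</p>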
    <div class="grid cols-3">
      <article class="card">
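        <span class="tag green">Trustworthy</span>
        <h3>Is there a semantic layer?</h3>
        <p>Can you define metrics, dimensions, synonyms, calculation definitions, table relationships, and row- and column-level permissions? Without a semantic layer, complex questions are hard to answer reliably.</p>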
      </article>
      <article class="card">
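        <span class="tag cyan">Usable</span>
        <h3>Does it reach the business entry points?</h3>
        <p>Can it plug into portals, dashboards, IM, mobile, reports, and subscriptions? Users will not open an unfamiliar tool just to ask a data question.</p>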
      </article>
      <article class="card">
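        <span class="tag amber">Operable</span>
        <h3>Can it be evaluated and corrected?</h3>
        <p>Can you accumulate standard questions, standard SQL, failure cases, user feedback, and regression tests? Natural-language querying gets more accurate with use only through sustained operations.</p>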
      </article>
    </div>
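    <p>As a rough illustration of the semantic-layer check, the sketch below maps business terms and synonyms to a single governed metric and only allows declared dimensions. It is a minimal sketch, not any vendor's API: the <code>Metric</code> and <code>SemanticLayer</code> classes, the <code>enrolled_students</code> metric, and the <code>enrollment</code> table are all hypothetical names introduced for this example.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One governed metric: name, SQL expression, synonyms, allowed dimensions."""
    name: str
    expression: str              # aggregate expression over the base table
    table: str
    synonyms: tuple = ()
    allowed_dimensions: tuple = ()


class SemanticLayer:
    def __init__(self, metrics):
        # Index every name and synonym (case-insensitive) to its metric.
        self._by_term = {}
        for metric in metrics:
            for term in (metric.name, *metric.synonyms):
                self._by_term[term.lower()] = metric

    def resolve(self, term):
        """Map a business term to its governed metric, or None if unknown."""
        return self._by_term.get(term.strip().lower())

    def to_sql(self, term, dimension):
        """Build SQL only from governed definitions, never from free text."""
        metric = self.resolve(term)
        if metric is None:
            raise KeyError(f"unknown metric term: {term!r}")
        if dimension not in metric.allowed_dimensions:
            raise ValueError(f"{dimension!r} is not a governed dimension of {metric.name}")
        return (f"SELECT {dimension}, {metric.expression} AS {metric.name} "
                f"FROM {metric.table} GROUP BY {dimension}")


# Hypothetical metric definition, for illustration only.
layer = SemanticLayer([
    Metric(
        name="enrolled_students",
        expression="COUNT(DISTINCT student_id)",
        table="enrollment",
        synonyms=("headcount", "number of students"),
        allowed_dimensions=("department", "term"),
    )
])

print(layer.to_sql("Headcount", "department"))
```

    <p>Because every synonym resolves to one definition, two departments asking for "headcount" get the same calculation, and an undeclared dimension is rejected instead of guessed.</p>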
  </section>

  <section id="pitfalls">
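    <h2>Where deployment gets hard: the problem is not that the model lacks intelligence</h2>
    <p class="lead">LLMs can improve the interaction, but they cannot automatically fix inconsistent metric definitions, messy field naming, missing permissions, or legacy data-quality problems.</p>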
    <div class="flow">
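      <div class="step"><b>1</b><h3>Tables and fields carry no business meaning</h3><p>When table names, field names, and comments are insufficient, the model can only guess.</p></div>
      <div class="step"><b>2</b><h3>Metric definitions are inconsistent</h3><p>The same "headcount" can mean different things in different departments.</p></div>
      <div class="step"><b>3</b><h3>Multi-table relationships are complex</h3><p>Join paths, time granularity, and aggregation levels are the most error-prone.</p></div>
      <div class="step"><b>4</b><h3>Permissions must come first</h3><p>Every query must first pass role checks, row- and column-level permissions, and masking rules.</p></div>
      <div class="step"><b>5</b><h3>Without evaluation, quality degrades</h3><p>After the model, prompts, or table schemas change, you need to be able to run regression tests.</p></div>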
    </div>
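    <p>To make "permissions must come first" concrete, here is a minimal sketch in which the row filter and column masking are applied when the query is built, before anything executes. It uses Python's standard <code>sqlite3</code> module with an in-memory database; the <code>POLICIES</code> mapping, the role names, and the <code>staff</code> table are assumptions made up for this example, and a real system would load policies from its own permission service.</p>

```python
import sqlite3

# Hypothetical policies: role -> row filter (at most one bound parameter)
# and the set of columns that must be masked for that role.
POLICIES = {
    "dept_admin": {"row_filter": "department = ?", "masked": {"salary"}},
    "hr_manager": {"row_filter": "1 = 1", "masked": set()},
}

COLUMNS = ("name", "department", "salary")


def build_query(role, user_department):
    """Return (sql, params) with row filter and masking already applied."""
    policy = POLICIES.get(role)
    if policy is None:
        raise PermissionError(f"role {role!r} has no access")
    select_list = ", ".join(
        f"'***' AS {col}" if col in policy["masked"] else col for col in COLUMNS
    )
    sql = f"SELECT {select_list} FROM staff WHERE {policy['row_filter']} ORDER BY name"
    params = (user_department,) if "?" in policy["row_filter"] else ()
    return sql, params


# Toy data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("Ann", "finance", 100), ("Bo", "hr", 90), ("Cy", "finance", 80)],
)


def run_as(role, user_department):
    sql, params = build_query(role, user_department)
    return conn.execute(sql, params).fetchall()


print(run_as("dept_admin", "finance"))
print(run_as("hr_manager", "hr"))
```

    <p>The point of the design is ordering: the model never sees or produces an unfiltered query, so a wrong answer cannot also be a data leak.</p>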
    <div style="height: 18px"></div>
    <div class="callout">
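      <strong>Key judgment</strong>
      Pure Text-to-SQL is fine for demos, but a production system should follow the route "natural language → business semantic layer → query plan → SQL → validated execution → explainable answer."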
    </div>
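    <p>The later stages of that route (query plan → SQL → validated execution → explainable answer) can be sketched as follows. This is an illustration under stated assumptions, not a reference design: it assumes the natural-language and semantic-layer steps have already produced a structured <code>QueryPlan</code>, and the plan fields, the <code>ALLOWED_TABLES</code> allowlist, and the <code>orders</code> table are hypothetical. The keyword check is a coarse guard for demonstration, not a SQL parser.</p>

```python
import re
import sqlite3
from dataclasses import dataclass

ALLOWED_TABLES = {"orders"}   # hypothetical allowlist
MAX_ROWS = 1000               # cap on returned rows


@dataclass(frozen=True)
class QueryPlan:
    metric_sql: str       # e.g. "SUM(amount)"
    metric_label: str
    table: str
    group_by: str


def plan_to_sql(plan):
    """Compile a structured plan into one SELECT statement."""
    return (f"SELECT {plan.group_by}, {plan.metric_sql} AS {plan.metric_label} "
            f"FROM {plan.table} GROUP BY {plan.group_by} "
            f"ORDER BY {plan.group_by} LIMIT {MAX_ROWS}")


def validate(sql, plan):
    """Reject anything other than a single read-only SELECT on an allowed table."""
    if plan.table not in ALLOWED_TABLES:
        raise ValueError(f"table {plan.table!r} is not on the allowlist")
    if ";" in sql or not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("only a single SELECT statement is allowed")
    if re.search(r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|ATTACH|PRAGMA)\b", sql, re.I):
        raise ValueError("write or admin keywords are not allowed")


def answer(conn, plan):
    """Validate first, then execute, then return rows with SQL and explanation."""
    sql = plan_to_sql(plan)
    validate(sql, plan)
    rows = conn.execute(sql).fetchall()
    explanation = (f"{plan.metric_label} = {plan.metric_sql} from {plan.table}, "
                   f"grouped by {plan.group_by}")
    return {"sql": sql, "rows": rows, "explanation": explanation}


# Toy data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("east", 10), ("west", 5), ("east", 7)])

result = answer(conn, QueryPlan("SUM(amount)", "total_amount", "orders", "region"))
print(result["rows"])
print(result["explanation"])
```

    <p>Returning the SQL and a plain-language explanation alongside the rows is what lets a user or auditor check how a number was produced.</p>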
  </section>

  <section class="alt" id="roadmap">
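    <h2>In-house roadmap: small and accurate first, then expand business domains</h2>
    <p class="lead">Start by closing the loop in a single business domain rather than connecting the whole database at the outset. For example, pick one of admissions, finance, HR, teaching, or research, and get 30-50 high-frequency questions right first.</p>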
    <div class="timeline">
      <div class="phase">
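        <b>Phase 1<br>2-4 weeks</b>
        <ul>
          <li>Pick one business domain and 5-10 core tables.</li>
          <li>Compile 30-50 high-frequency questions with standard SQL and metric definitions.</li>
          <li>Get Q&amp;A, SQL, safe execution, charts, and explanations working end to end.</li>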
        </ul>
      </div>
      <div class="phase">
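        <b>Phase 2<br>1-2 months</b>
        <ul>
          <li>Add permissions, data masking, a SQL allowlist, timeouts, and auditing.</li>
          <li>Support multi-turn follow-ups, year-over-year and period-over-period comparisons, drill-down, and chart switching.</li>
          <li>Integrate with the portal or IM, and set up a process for feedback and fixing failure cases.</li>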
        </ul>
      </div>
      <div class="phase">
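        <b>Phase 3<br>3-6 months</b>
        <ul>
          <li>Expand to multiple business domains and build a unified metric and data catalog.</li>
          <li>Add automated evaluation, regression tests, and question recommendations.</li>
          <li>Wrap the service as an API/MCP so that report generation, the portal, and other agents can reuse it.</li>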
        </ul>
      </div>
    </div>
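    <p>The question set with standard SQL from phase 1 can double as the regression suite for phase 3. The sketch below shows one possible shape for such a harness: it runs each reviewed standard SQL and the system's generated SQL against the same database and compares result sets. The <code>GOLDEN</code> list, the <code>students</code> table, and <code>fake_generate_sql</code> (a stand-in for the real natural-language-to-SQL call) are assumptions for illustration only.</p>

```python
import sqlite3

# Hypothetical golden set: (question, reviewed standard SQL).
GOLDEN = [
    ("How many students are enrolled?",
     "SELECT COUNT(*) FROM students"),
    ("How many students per department?",
     "SELECT department, COUNT(*) FROM students GROUP BY department"),
]


def fake_generate_sql(question):
    """Stand-in for the system under test; swap in the real generator."""
    canned = {
        "How many students are enrolled?":
            "SELECT COUNT(name) FROM students",
        # Deliberately wrong: silently filters to one department.
        "How many students per department?":
            "SELECT department, COUNT(*) FROM students "
            "WHERE department = 'cs' GROUP BY department",
    }
    return canned[question]


def result_set(conn, sql):
    """Execute and sort rows so row order does not affect the comparison."""
    try:
        return sorted(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None   # invalid SQL counts as a failure


def evaluate(conn, generate_sql):
    """Return (accuracy, failed questions) over the golden set."""
    failures = []
    for question, standard_sql in GOLDEN:
        expected = result_set(conn, standard_sql)
        actual = result_set(conn, generate_sql(question))
        if actual is None or actual != expected:
            failures.append(question)
    accuracy = 1 - len(failures) / len(GOLDEN)
    return accuracy, failures


# Toy data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, department TEXT)")
conn.executemany("INSERT INTO students VALUES (?, ?)",
                 [("a", "cs"), ("b", "cs"), ("c", "math")])

accuracy, failures = evaluate(conn, fake_generate_sql)
print(accuracy, failures)
```

    <p>Comparing result sets rather than SQL text accepts equivalent queries written differently, and rerunning the same set after a model, prompt, or schema change exposes regressions before users do.</p>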
  </section>

  <section id="refs">
    <h2>Sources</h2>
    <p class="lead">The following are the main reference entry points; for product details, rely on official documentation and an actual PoC.</p>
    <div class="refs">
      <a href="https://learn.microsoft.com/en-us/power-bi/create-reports/copilot-introduction">Microsoft Power BI Copilot overview</a>
      <a href="https://help.tableau.com/current/online/en-us/web_author_einstein.htm">Tableau Agent official help</a>
      <a href="https://docs.thoughtspot.com/cloud/26.6.0.cl/spotter">ThoughtSpot Spotter documentation</a>
      <a href="https://docs.cloud.google.com/looker/docs/conversational-analytics-overview">Looker Conversational Analytics</a>
      <a href="https://docs.snowflake.com/en/user-guide/snowflake-cortex/cortex-analyst">Snowflake Cortex Analyst</a>
      <a href="https://docs.databricks.com/aws/en/genie/">Databricks Genie Spaces</a>
      <a href="https://aws.amazon.com/blogs/business-intelligence/best-practices-for-enabling-business-users-to-answer-questions-about-data-using-natural-language-in-amazon-quicksight/">Amazon QuickSight Q</a>
      <a href="https://help.aliyun.com/zh/quick-bi/user-guide/chat-bi-overview">Alibaba Cloud Quick BI 小Q</a>
      <a href="https://help.fanruan.com/finebi/doc-view-259.html">FineBI / FineChatBI</a>
      <a href="https://www.smartbi.com.cn/aichat_agentbi">Smartbi AIChat</a>
      <a href="https://docs.guandata.com/product/chatbi/ChatBI-product-introduction">观远 (Guandata) Q&amp;A Agent</a>
      <a href="https://www.hengshi.com/blog/hengshi-data-agent-selection-guide.html">衡石 (HENGSHI) Data Agent</a>
      <a href="https://github.com/dataease/SQLBot">SQLBot GitHub</a>
      <a href="https://github.com/tencentmusic/supersonic">SuperSonic GitHub</a>
      <a href="https://github.com/Canner/WrenAI">WrenAI GitHub</a>
      <a href="https://github.com/eosphoros-ai/DB-GPT">DB-GPT GitHub</a>
    </div>
  </section>

  <div class="footer">
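    Core takeaway: the value of natural-language data querying lies not in "whether it can generate SQL" but in whether it can deliver trusted, explainable, auditable, and continuously improvable data answers grounded in business semantics.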
  </div>
</main>