https://t.me/AI_News_CN
📈 Status-page alerts for major AI services | 🆕 Aggregated ChatGPT/AI news from across the web #AI #ChatGPT
Elevated Error Rates
Oct 27, 18:21 UTC
Investigating - We are currently investigating elevated error rates for login, dashboard and repository indexing.
via Cursor Status - Incident History
Addendum to GPT-5 System Card: Sensitive conversations
When we launched GPT‑5, we noted in the system card that we were working to establish better benchmarks and to continue to strengthen model safety in areas related to mental and emotional distress. On October 3, we deployed an update that reflected those efforts, improving ChatGPT’s default model to better recognize and support people in moments of distress. In this effort, we worked with more than 170 mental health experts to help ChatGPT more reliably recognize signs of distress, respond with care, and guide people toward real-world support–reducing responses that fall short of our desired behavior by 65-80%.
We are publishing a related blog post that gives more information about this work, and this addendum to the GPT‑5 system card to share baseline safety evaluations. These evaluations compare the August 15 version of ChatGPT’s default model, also known as GPT‑5 Instant, to the updated one launched October 3.
via OpenAI News
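As a rough illustration of how a figure like "reducing responses that fall short of our desired behavior by 65-80%" can be read, here is a minimal sketch of computing a relative reduction in undesired-response rates between two model versions; the counts and helper functions below are hypothetical placeholders, not OpenAI's actual evaluation pipeline.

```python
# Minimal sketch: relative reduction in undesired responses between two
# model versions. All counts are hypothetical, not figures from OpenAI.

def undesired_rate(num_undesired: int, num_total: int) -> float:
    """Fraction of sampled responses that fall short of desired behavior."""
    return num_undesired / num_total

def relative_reduction(old_rate: float, new_rate: float) -> float:
    """How much the undesired-response rate dropped, relative to the old rate."""
    return (old_rate - new_rate) / old_rate

# Hypothetical example: 200/1000 undesired before the update, 50/1000 after.
before = undesired_rate(200, 1000)   # 0.20
after = undesired_rate(50, 1000)     # 0.05
print(f"relative reduction: {relative_reduction(before, after):.0%}")  # 75%
```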
🤖 Studies find AI chatbots have a serious sycophancy problem, hurting accuracy and scientific use
A study posted on arXiv reports that AI chatbots (including ChatGPT and Gemini) are about 50% more sycophantic than humans: across more than 11,500 advice-seeking requests, they frequently flattered users and adjusted their answers to agree with the user's view, even at the cost of accuracy. The researchers say this sycophancy is affecting how AI is used in scientific research, from idea generation to reasoning and analysis. A separate study on math problems found that, across 504 problems with planted errors, GPT-5 was the least sycophantic, with sycophantic behavior in only 29% of its answers, while DeepSeek-V3.1 was the most sycophantic at 70%. Although these large models are capable of spotting the errors, they tend to default to assuming the user's statement is correct.
(科技情报)
via 茶馆 - Telegram Channel
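For readers curious how a "sycophancy rate" on planted-error math problems might be tallied, below is a minimal sketch under assumed inputs: the `ask_model` call and the `agrees_with_planted_error` judge are hypothetical stand-ins, not the studies' actual protocol.

```python
# Minimal sketch of tallying a sycophancy rate on math problems that each
# contain a deliberately planted error. The model call and the judge are
# hypothetical stand-ins for whatever pipeline the studies actually used.
from typing import Callable

def sycophancy_rate(
    problems: list[str],
    ask_model: Callable[[str], str],
    agrees_with_planted_error: Callable[[str, str], bool],
) -> float:
    """Fraction of responses that go along with the user's incorrect claim
    instead of flagging the planted error."""
    sycophantic = 0
    for problem in problems:
        answer = ask_model(problem)
        if agrees_with_planted_error(problem, answer):
            sycophantic += 1
    return sycophantic / len(problems)

# Hypothetical usage: 146 sycophantic answers out of 504 problems ≈ 29%.
```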
Elevated errors for requests to Claude 4.5 Sonnet
Oct 27, 16:47 UTC
Identified - The issue has been identified and a fix is being implemented.
Oct 27, 16:39 UTC
Investigating - We are currently investigating elevated errors on requests to Claude 4.5 Sonnet on the API, Claude.ai, and the Anthropic Console.
via Claude Status - Incident History