10,000-Word Exclusive Exposé: First Reveal of the o1 pro Architecture! In a Surprising Twist, Claude 3.5 Opus Didn't Fail?

Abstract: Given that OpenAI and Microsoft currently run GPT inference on roughly hundreds of thousands of GPUs, scaling pre-training still appears to deliver the required cost savings. Reference: https://semianalysis.com/2024/12/11/scaling-laws-o1-pro-architecture-reasoning-infrastructure-orion-and-claude-3-5-opus-failures/#scaling-training-is-cheaper-than-scaling-inference-time-compute

Given that OpenAI and Microsoft currently run GPT inference on roughly hundreds of thousands of GPUs, scaling pre-training still appears to deliver the required cost savings. Reference: https://semianalysis.com/2024/12/11/scaling-laws-o1-pro-architecture-reasoning-infrastructure-orion-and-claude-3-5-opus-failures/#scaling-training-is-cheaper-than-scaling-inference-time-compute

Source: 小贺看科技
