yostos on Nostr:
Currently evaluating OpenAI’s o1-preview. It is reportedly not always more accurate than GPT-4o, but when tested against “World Model”, a collection of problems that LLMs struggle with, o1-preview correctly solves questions like the following that GPT-4o gets wrong.
Q1. What happens if you push...
blog.yostos.org
https://blog.yostos.org/2024/09/17/currently-evaluating-openais.html
Published at 2024-09-17 10:21:42
Event JSON
{
"id": "99f9b564908af86a476960732b4e990257149c73e6308ecc8c5d7bd9d5f0956e",
"pubkey": "23e0b7c18ce50c8d359071bcd1536a0f9efadb6c81b53529d524831fe93d8ed9",
"created_at": 1726561302,
"kind": 1,
"tags": [],
"content": "Currently evaluating OpenAI’s o1-preview. Although it is said that it is not always more accurate than GPT-4o, when checked with the “World Model”, which collects problems that LLMs struggle with, o1-preview correctly solves questions like the following that GPT-4o gets wrong.\n\nQ1. What happens if you push... blog.yostos.org https://blog.yostos.org/2024/09/17/currently-evaluating-openais.html",
"sig": "ec9bcec7b0f1f8f5da45a0fd4eba56f875dcb08bee03da0c37ae438e25af5e03fce2b134d51e09e75f9fdfc090d94a7c4318882ec11bb31cb5057604982226d5"
}