Traditional CoT techniques involve prompting the model to respond with its reasoning step by step. o1 is still doing that. What's different here is that OpenAI seems to have figured out a much better way to chain reasoning steps together, and the model does it without relying on complicated, inefficient RAG databases to optimize the response. Effectively, the model is trained to "reason" using the most effective (read: predictive) strategy for CoT.
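
For anyone unfamiliar with what "traditional" CoT prompting looks like in practice, here's a minimal sketch. It only builds prompt strings and doesn't call any model; the function names and the exact cue phrase are my own illustrative choices, not anything from OpenAI, and it says nothing about how o1 works internally.

```python
# Minimal sketch of prompt-level chain-of-thought (CoT).
# All names here are hypothetical, for illustration only.

COT_CUE = "Let's think step by step."  # a commonly used zero-shot CoT cue


def build_direct_prompt(question: str) -> str:
    """Ask for the answer with no reasoning cue."""
    return f"Q: {question}\nA:"


def build_cot_prompt(question: str) -> str:
    """Same question, but nudge the model to write out its reasoning first."""
    return f"Q: {question}\nA: {COT_CUE}"


if __name__ == "__main__":
    question = "A bat and a ball cost $1.10 total. The bat costs $1.00 more than the ball. How much is the ball?"
    print(build_direct_prompt(question))
    print()
    print(build_cot_prompt(question))
```

The point is that classic CoT lives entirely in the prompt, whereas the claim above is that o1 has the step-by-step behavior trained in.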

Does any normal person consider RationalWiki a reliable source?

Yes. RationalWiki ranks high (or used to) in Google search results. Most people who visit RationalWiki are directed there from Google. Same deal with Wikipedia. The average user doesn't know or care about any online drama.