DeepSeek seems to have taken the world by storm, including AI specialists. I kind of knew something like this was coming, because I had heard from the CEO of a small Chinese startup (formerly at Google, if I remember correctly) that they were working on optimizing the costs of AI models, something he was surprised none of the American corporations were doing. That is a different company from DeepSeek, by the way; I checked who DeepSeek's CEO is.
So what has DeepSeek accomplished? Many things, to be honest.
(Image: Ideogram's magic prompt did most of the work here...)
It is said they trained their top AI model with only 6 million US dollars and a limited number of chips compared to their Silicon Valley AI giant counterparts. Both the amount of money and the number of chips seem to be disputed, or perhaps the dispute is an attempt to allow the big players in the AI race to save face, for the time being.
Their top model, DeepSeek R1, rivals OpenAI's o1, until now the top LLM, and even beats it on certain benchmarks.
On the technical side, DeepSeek also doesn't rate limit its API. That may cause delays in API responses, but it brings a different approach to the market of APIs for AI models. Their server does close the connection if the request isn't given a proper answer within 30 minutes. All the technical and not-so-technical details can be found in DeepSeek R1's GitHub repository.
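As an aside, DeepSeek's API is, as far as I know, OpenAI-compatible, so dealing with that behavior on the client side mostly comes down to setting a generous timeout instead of juggling rate limits. Below is a hedged sketch using the openai Python package; the base URL and model name are taken from DeepSeek's documentation as I recall it, so treat them as assumptions rather than gospel.

```python
# Hedged sketch: calling DeepSeek's OpenAI-compatible API with a long
# client-side timeout, since responses may be slow under load and the
# server closes connections that run past roughly 30 minutes.
# The base_url and model name below are assumptions based on DeepSeek's docs.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",         # placeholder
    base_url="https://api.deepseek.com",     # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-reasoner",               # assumed name for the R1 model
    messages=[{"role": "user", "content": "Explain MoE routing in one paragraph."}],
    timeout=1800,                            # allow up to 30 minutes client-side
)
print(response.choices[0].message.content)
```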
They open sourced all their models (including the parameters), and the license they use permits commercial use and distillation of their models (we'll get to what that means below).
On the other side of the world, Meta also open sourced Llama's code - which seemed like a bold move in the corporate AI race - but its license restricts certain commercial uses, and the training data is not disclosed.
It's interesting that DeepSeek has a few distilled models based on Llama 3 and a few others based on Qwen, the AI models built by Alibaba Cloud.
I've mentioned distilled AI models a few times already. Let's see what that means.
What Is LLM Distillation?
LLM distillation is a technique of transferring knowledge from a larger pre-trained model (the "teacher") to a smaller model (the "student"). (Source)
The advantage is that you can train a large model on huge datasets, then extract specific knowledge from it and have the smaller model improve its predictions by mimicking the larger one. An example probably everyone knows is OpenAI's o1-mini, widely considered a distillation of the larger o1 model.
With open source LLMs, though, scientists and other categories of users can train smaller models tailored to their needs, starting from the top model.
Where DeepSeek goes further than most competitors is that it allows anyone to distill its models, with no restrictions.
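To make the idea concrete, here is a minimal sketch of how distillation training is commonly done, written in PyTorch. The teacher, student, and batch layout are hypothetical placeholders, not DeepSeek's actual recipe: the student is trained on a blend of the usual hard-label loss and a soft loss that pushes its output distribution toward the teacher's.

```python
# Minimal knowledge-distillation sketch (PyTorch). The teacher/student models
# and batch layout are placeholders, not any vendor's actual training code.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target loss (mimic the teacher) with a hard-label loss."""
    # Soften both distributions with the temperature, then match them with KL.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Standard cross-entropy on the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss

def train_step(student, teacher, batch, optimizer):
    tokens, labels = batch                  # hypothetical batch layout
    with torch.no_grad():                   # the teacher is frozen
        teacher_logits = teacher(tokens)
    student_logits = student(tokens)
    loss = distillation_loss(student_logits, teacher_logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```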
What is the Mixture of Experts (MoE) Architecture for LLMs?
Newer LLMs use what is called a Mixture of Experts (MoE) architecture. That means the full neural network is segmented into different subnetworks with various specializations (called experts).
The MoE architecture consists of two components: a gating network and the experts themselves.
How does this architecture work?
A prompt from the user is processed by the gating network, which decides in real time which expert the request is for and routes it there. As a result, the subnetwork associated with the selected expert becomes active while the rest stay inactive, reducing compute needs.
Also, experts can be trained individually on top of the whole network, which should make them more accurate for their domain of expertise.
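Here is a minimal sketch of an MoE layer with top-1 routing, matching the description above. It's a toy illustration in PyTorch, not how DeepSeek implements it; real systems add load balancing, more elaborate routing, and much larger experts.

```python
# Toy Mixture-of-Experts layer (PyTorch) with top-1 routing: the gating
# network scores the experts and each token is sent to its best-scoring one.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, dim, num_experts=4, hidden=256):
        super().__init__()
        # Gating network: a small linear layer scoring each expert per token.
        self.gate = nn.Linear(dim, num_experts)
        # Experts: independent feed-forward subnetworks.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                            # x: [tokens, dim]
        scores = F.softmax(self.gate(x), dim=-1)     # gating probabilities
        best = scores.argmax(dim=-1)                 # top-1 expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = best == i                         # tokens routed to expert i
            if mask.any():
                # Only the selected expert runs for these tokens; the other
                # subnetworks stay inactive for them. The output is weighted
                # by the gate's confidence in that expert.
                out[mask] = scores[mask, i].unsqueeze(-1) * expert(x[mask])
        return out

# Usage: route a batch of 8 token vectors through the layer.
layer = MoELayer(dim=64)
print(layer(torch.randn(8, 64)).shape)               # torch.Size([8, 64])
```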
My question here would be: what happens if a request needs multi- or interdisciplinary expertise? Is only one expert still chosen, or are all of the needed ones activated?
DeepSeek V3 and DeepSeek R1 both have an MoE architecture. None of the OpenAI or Claude models they were compared against are publicly confirmed to use such an architecture at this time.
Final Considerations
If anything, DeepSeek showed that the dominant AI giants are not so dominant, and opening things up almost completely is a remarkable strategy: an attempt to use decentralization as a way to fight the existing giants. This is what people should think about, especially those who like gated platforms, wherever they may be.
Of course, it's been quite a while since the last OpenAI model was released. I don't remember when the last Claude model came out either. But OpenAI should be relatively close to releasing a new model. That doesn't diminish in any way what DeepSeek has accomplished, and it doesn't really matter where they are from. What matters is what they are doing and the route they've chosen, sure, probably forced by US export bans on top-of-the-line Nvidia chips to China.
Let's see who moves next...
Posted Using INLEO