The Great Image Race

in #hive-167922, 2 months ago

Generative AI is all the rage.

There is good reason for this. When we look at the advancement, it is mind-boggling.

This may come as a surprise to many, but I do not think we are even at the knee of the exponential curve. As I have stated elsewhere, my view is that we hit it with the next generation of LLMs. That means Grok3, Llama4, Claude4, and ChatGPT 5.0 will far outpace where we are now.

Of course, for the most part, these are text-based systems, although they are also being used for images. While Grok did get a bit of attention with Grok 2.0, it doesn't really compete with the dedicated image models.

For that realm, we have to look at Ideogram and, most recently, Flux. They are the ones leading that race.


Image generated by Ideogram

The Image Race Is Heating Up

Just like with text, we are seeing images advancing at an amazing pace. Midjourney was one of the early leaders, followed by Ideogram.

Flux, the engine behind Grok's image generation, has gotten a lot of attention of late. This means we are seeing a changing of the guard.

Or are we?

Ideogram just released version 2.0. This is what it had to say.

"Ideogram 2.0 significantly outperforms other text-to-image models across many quality metrics, including image-text alignment, overall subjective preference, and text rendering accuracy," the company said in its official announcement.

Personally, I don't put a lot of stock in metric-based rankings. I understand the need to quantify, and I guess it is better to be leading than behind. However, being ranked behind doesn't mean a model is worse than what is ranked ahead of it.

This is why I watch a lot of videos done by people who actually test this stuff out. They tell you the areas where the models fall short.

That said, Ideogram coming out with an update was a long time coming.

So far, it is one of the better applications for text-to-image. I have only played with the second version a little, so I cannot accurately report on it.

The key here is it is moving ahead.

I did notice the speed is a bit better on the couple of images I did generate.

18 Months From Now

It is easy to fall victim to looking only at where things are at this moment.

With the pace of change in all things related to generative AI, just wait a little bit. I keep saying we are very early in this race and it is far from over.

Things are improving by orders of magnitude instead of percentages. That is how much change we are seeing. Between the amount of compute, algorithmic improvements, and what some call unhobbling, the improvements are measured in multiples (3X, 5X).

As we stretch the time frame out, we see orders of magnitude (OOMs) emerging. For reference, a 3X is roughly 0.5 OOM and a 5X is roughly 0.7 OOM.
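For anyone who wants to check the conversion, here is a minimal Python sketch. An order of magnitude is just the base-10 logarithm of the multiplier. The function names are my own, made up for illustration, and are not taken from any chart or library.

```python
import math


def multiplier_to_oom(multiplier: float) -> float:
    """Convert an improvement factor (e.g. 5 for 5X) into orders of magnitude."""
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return math.log10(multiplier)


def oom_to_multiplier(oom: float) -> float:
    """Convert orders of magnitude back into a plain improvement factor."""
    return 10.0 ** oom


# Multipliers expressed as OOMs.
for factor in (3, 5, 10):
    print(f"{factor}X is about {multiplier_to_oom(factor):.2f} OOM")

# OOMs expressed as multipliers.
for oom in (3, 6):
    print(f"{oom} OOM is a {oom_to_multiplier(oom):,.0f}X improvement")
```

Because OOMs are logarithms, successive gains add up instead of multiplying: two separate 0.5 OOM gains make 1 OOM, which is a 10X.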

Here is a chart of the projected growth by an ex-OpenAI employee.


Source

This is saying that, from compute and algorithmic efficiency, we are looking at 3-6 OOMs. Basically, this equates to a 1,000 to 1 million times improvement from ChatGPT 4o.

And that is before we take the unhobbling into consideration.

When we put all of this together, it is easy to see how we could be looking at a great deal more advancement over the next few years.

Text-to-image is going to follow a similar path. Even if the numbers are slightly lower, we will be impressed with what is possible.

This is a race that is just getting started.




I couldn't test Dall-E the last couple of months, as you now need a ChatGPT pro account, but leonardo.ai is my favorite image generator. I'm sure this feature is implemented in every AI image generator by now, but Leonardo finally has consistent characters, so you can tell a real story with multiple images showing the same character.
That was a game changer.

I haven't tried that one.

Like most of these, if one gets it, wait a generation or two, and the others will have it.

Yeah, you're right about that. I'm pretty excited about what Grok will offer us in a few months for verified users.

I am not a big X user and don't have premium, so the capabilities of Grok, other than from an industry standpoint, are not in my sphere. But it is good to see the advancement, knowing that most others are going to follow closely.

The AI image race is at its peak, and I feel that somehow Midjourney dominates this market, especially since Canva acquired it.

Hard to tell who is leading the race. They switch so quickly. The latest news is Ideogram is at the forefront but others say Flux.

The next update will change that too.

The competition is really getting tougher by the day, and it is so evident that it cannot be denied.

Basically when dealing with unlimited, there is no competition.

Image generation is pretty cool. The quality that they produce now is so much better compared to a year ago. Nowadays, realistic image prompts are already hard to identify as AI generated. What I find interesting though is that the big names are no longer at the top. I think this is because they are no longer focusing on images, but rather on video. Since videos are just slightly different images shown quickly, that is the next logical step.

Imagine what they will be one year from now. That is what most do not realize about generative #ai. It is improving rapidly.

We are not talking percentages but orders of magnitude. That is a much different world.

To be honest, I don't really like the word "fashion". It is usually very fleeting and you never know when something else will replace it; it suggests a short-term investment. !BEER


Hey @taskmaster4450le, here is a little bit of BEER from @barski for you. Enjoy it!

We love your support by voting @detlev.witness on HIVE .