We spoke a fair bit about data over the last few months. This is something that is becoming increasingly important. It is a shift that people involved in Web 3.0 need to be aware of.
Within the technology realm, many are wondering about the value of synthetic data. This is data that is machine generated. We touched upon it a few times but it is worthy of some deeper exploration.
The challenge is, right now, the synthetic data that is generated resides on closed servers. Each time we enter a prompt into ChatGPT or Gemini, that data is generated for that company.
This is simply another twist on the old social media model. Users are generating more data for Big Tech. The front end is different yet it produces the same results.
It gets compounded when people actually have "chats". A conversation provides human data mixed with the synthetic. To me, this is the Holy Grail as the knowledge graphs have to be strengthened.
Which brings us to the value of synthetic data. We have to consider there might be enormous value there and, if so, what is required of Web 3.0.
Image generated by Ideogram
Web 3.0 And Synthetic Data
The present debate within technology circles is whether synthetic data, meaning data generated by machines, is valuable.
One of the theories is that the quality of synthetic data will degrade over time. For this reason, many conclude that the gains from model training will plateau.
Here is a short video with Mark Zuckerberg discussing what his team learned while training Llama.
https://inleo.io/threads/view/taskmaster4450le/re-leothreads-28v81tnmd?referral=taskmaster4450le
Another important idea coming from Zuckerberg is his view that, in the future, synthetic data generated during inference will feed back into model training.
This means that people, through their use of these AI features, are going to provide the data for the development of more advanced models.
It does appear the more things change, the more they stay the same.
We are once again back to users providing Big Tech with the fundamental components to generate billions.
This is where Web 3.0 needs to change things.
One of the core principles of Web 3.0 is decentralization. That is the exact opposite of enormous centralized entities that experience powerful network effects. If left unchallenged, we will end up with a handful of companies that run the digital world.
In other words, a replication of Web 2.0.
If synthetic data is so important to the future, which Zuckerberg believes, then it is something we should pay attention to.
Sadly, most simply add to Big Tech's databases without a second thought. The idea of doing something to open up the data generated by a ChatGPT prompt never enters most people's minds.
AI Services
Web 3.0 is lacking in services overall. This is becoming an even bigger problem in the era of rapidly advancing AI.
Big Tech is all over this. They are progressing at a pace never seen before.
Jensen Huang, the CEO of NVIDIA, was speaking at the recent Salesforce conference. He said that AI advancement was "Moore's Law squared".
His estimate is that Moore's Law produces roughly 100x over a decade. AI, in his view, is coming in at around 100,000x.
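To put those two figures on the same footing, here is a minimal sketch that converts each decade-long multiple into the constant yearly factor and doubling time it implies. The 100x and 100,000x inputs are the numbers quoted above. The function names are hypothetical, and smooth compound growth is an assumption made purely for illustration.

```python
import math


def annual_factor(total_growth: float, years: float) -> float:
    """Constant yearly multiplier that compounds to total_growth over the given years."""
    return total_growth ** (1.0 / years)


def doubling_time_years(total_growth: float, years: float) -> float:
    """Years needed to double, assuming the same constant compound rate."""
    return math.log(2) / math.log(annual_factor(total_growth, years))


# Figures as quoted in the post: 100x (Moore's Law) vs 100,000x (AI) per decade.
for label, multiple in [("Moore's Law", 100), ("AI (Huang's estimate)", 100_000)]:
    factor = annual_factor(multiple, 10)
    doubling = doubling_time_years(multiple, 10)
    print(f"{label}: {factor:.2f}x per year, doubling every {doubling:.2f} years")
```

Under those assumptions, 100x per decade works out to roughly 1.58x per year, or a doubling about every 18 months. The 100,000x figure is roughly 3.16x per year, or a doubling about every 7 months.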
If he is even close to being accurate, we have to pay attention to what is going on here. This is crucial for the future. Corporations such as Google and OpenAI are all over this. So are the Microsofts and Amazons of the world.
Where is Web 3.0?
If we want decentralization, that means building AI services that can generate the data. Everything, it appears, is boiling down to data. If Zuckerberg is correct, the amount available to these technology companies will be enormous.
After all, OpenAI has tens of millions of people feeding it every day.
What does Web 3.0 have?
Outside of people chasing the next memecoin, what are people doing? While they are distracted by hopes of green candles, even though most are going to end up with nothing, the true Great Data Race is on.
Human generated data is crucial. However, it just might be that synthetic data is also of the utmost importance.