Big Tech and the domination of data.
This is something we already see taking place. Fortunately, when it comes to online data, there is a solution. Blockchain, for one, can provide an alternative that helps with the democratization of data.
Of course, many conversations center on the idea of a data deficit. This is real, as AI models require more data to train each ensuing iteration. In my view, this is a temporary situation since the answer is already before us.
Unfortunately, it is one that does not bode well for democratization and actually empowers Big Tech even more.
This is what we are going to dive into.
Big Tech Will Dominate Real World Data
At this point, much of the training of AI systems is done utilizing online (or digital) data. Most of it centers around what is on the Internet. This is especially true for LLMs. Some smaller models built by individual companies might be trained on in-house data such as customer information.
The challenge with the Internet is the silo effect. There are basically a handful of platforms that control most of the traffic. When it comes to data, the same companies are seeing a massive spike as compared to everyone else.
It becomes problematic when startups and other smaller entities want to get their hands on it. Many sites are starting to lock things down, blocking the crawlers that scrape the data. This has led to agreements where companies pay for access. Naturally, this is prohibitive for everyone but the largest of companies.
Data democratization is crucial. It is not really discussed, but it is a basis for how our future will look. This is multiplied when we consider what comes next: real world data.
Here we are getting into the realm where Big Tech will once again dominate. It is also a realm where individuals and smaller entities really cannot play.
Real World Data To Explode
Real world data is basically cameras and sensors on steroids. These move around our world, recording everything the host comes across.
Even without the numbers, we can see how this will dwarf what humans do.
So what are we talking about? Essentially, it boils down to robots in all the upcoming forms. These are massive data collectors, compounding the total as they expand.
To illustrate, I will use Tesla as the example. This company now has over 7 million vehicles on the road, each containing at least 6 cameras. It started releasing the metric of miles driven on FSD. This is one way it collects data. Of course, the cameras are always in operation, allowing the company to garner whatever it needs.
This is from the most recent shareholder deck for the Q3 earnings call.
Cameras and sensors.
Notice the exponential nature of this chart. The data increases, for this metric, with the increase in vehicles on the road along with the time each car spends driving on FSD. Since the technology is not authorized in all countries, it will also expand as other nations come online.
Autonomous vehicles are really nothing more than robots on wheels.
Which brings us to the crux of the situation.
Humanoid robots are going to be the epitome of data gatherers. Each will be equipped with cameras and sensors, collecting information for every action taken. From the prototypes I saw, each has a minimum of 5 cameras, recording everything in the environment. Whatever sensors are installed are doing the same. We can only guess as to what sensors are in the hands and how much they can collect. In other words, every motion a robot makes and whatever it touches is recorded.
This is going to dwarf whatever automobiles provide. With robots, once they start production, the numbers will far outpace automobiles.
At present, the automotive industry builds around 80 million new vehicles each year. With robots, we could reach the point where billions are being built. It will take a while to scale once production starts, but that is a level that is attainable by the early 2030s if we see it starting to roll out in the next couple of years.
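To put rough numbers on the comparison, here is a minimal back-of-the-envelope sketch in Python. It only uses the figures quoted above (over 7 million Tesla vehicles with at least 6 cameras each, and a minimum of 5 cameras per humanoid robot). The robot fleet sizes are hypothetical values picked purely for illustration, not forecasts, and the function name is my own.

```python
# Back-of-the-envelope camera counts using the figures quoted in this post.
TESLA_VEHICLES = 7_000_000   # "over 7 million vehicles" (from the post)
CAMERAS_PER_VEHICLE = 6      # "at least 6 cameras" (from the post)
CAMERAS_PER_ROBOT = 5        # "a minimum of 5 cameras" (from the post)


def total_cameras(units: int, cameras_per_unit: int) -> int:
    """Return the number of cameras in a fleet of identical units."""
    return units * cameras_per_unit


vehicle_cameras = total_cameras(TESLA_VEHICLES, CAMERAS_PER_VEHICLE)

# Hypothetical robot fleet sizes, for illustration only (an assumption).
robot_fleets = [1_000_000, 100_000_000, 1_000_000_000]

for fleet in robot_fleets:
    robot_cameras = total_cameras(fleet, CAMERAS_PER_ROBOT)
    ratio = robot_cameras / vehicle_cameras
    print(f"{fleet:,} robots -> {robot_cameras:,} cameras "
          f"({ratio:.1f}x the vehicle fleet)")
```

Under these assumptions, a fleet of one billion robots would carry roughly 119 times as many cameras as the current vehicle fleet, and that is before counting any other sensors.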
Blockchain's Importance
The democratization of data is crucial going forward.
As we can see, real world data is going to dwarf the Internet. The challenge is that small entities are not going to be mass producing robots. Even the startups that are involved are receiving enormous funding from venture capital firms. Once the process starts, it is hard to catch up.
This means smaller companies building models will be basically excluded from this data. Hence, it is back to the online world.
Fortunately, there is still a lot of data generated, and it is growing. The issue is where it is located. Since the Internet is siloed, the same is true for data.
Google has what is generated on its platform. X has secured its own data. Meta is not sharing, as its applications are some of the most widely used. The Chinese tech firms are doing the same thing.
Rinse and repeat.
When dealing with the most powerful technology humanity has ever created, one that is dependent upon data, we are seeing how a siloed system could be catastrophic. This is what occurred with the Internet, to the detriment of users. However, since the reach was limited, i.e. online activities, the damage was also isolated.
What happens when we are dealing with technologies that are spatial? In other words, no longer is it just online activities. Instead, we are talking about everything we do and all that takes place around us.
As always, planning the future starts with data. It is, indeed, the new oil. Those who are in control of it have great power.
Since this is the game, it is time for people to decide where they stand. It is unfolding before our eyes.
Posted Using InLeo Alpha