If you want to glimpse the future of AI, follow the data. Generative AI's recent leaps came from training on vast troves of data scraped from the web. So what other enormous datasets lie waiting to be harnessed?
Recently, a pivotal new clue has come to light.
Niantic, the gaming company behind Pokémon Go, is training a new AI on millions of real-world images gathered through Pokémon Go and its Scaniverse scanning app. Taking a cue from large language models, the company calls its algorithm a “large geospatial model.” The team hopes the model will come to grips with the physical world the way ChatGPT has come to grips with language.
Follow the Data
Today's generative AI algorithms produce novel text, images, and, increasingly, video. With OpenAI's DALL-E and ChatGPT, anyone can use everyday language to generate photorealistic images or get fluent explanations of concepts like quantum physics. Now video is getting the same treatment. Several players are competing with OpenAI, including Google DeepMind, Microsoft-backed Nuance Communications, and Meta AI.
The key insight behind these advances is that the rapid digitization of recent decades did double duty: it entertained and informed us, and it also produced the fuel for AI. At the internet's inception, few could have predicted this outcome, but in hindsight humanity was unwittingly building a vast, ever-growing repository of language, images, code, and video. AI companies scraped that repository to train high-performing models, a practice now contested in numerous lawsuits over copyrighted material.
With the basic recipe now proven, companies and researchers are hunting for new sources of data to push performance further.
In biotech, labs are training AI on collections of molecular structures that took years to build, aiming to speed the analysis of complex biological data. In robotics, researchers are building enormous models to give robots the ability to understand instructions and navigate complex environments.
Whereas chatbots live in the digital realm, robots also have to operate in the physical world, where even seemingly simple tasks involve countless variables and demand adaptability. Robot brains coded by hand can be impressive, but they can't cover the sheer variety of real-world situations. That's why researchers are now assembling large datasets for robotics. Yet even these, for all their size, are nowhere near the scale of the web, which billions of people have built in parallel over decades.
Could something similar exist for the physical world? Niantic thinks so. It's called Pokémon Go. But the hit game is just one example of a much bigger trend. Tech companies have been building detailed digital maps of the world for years, and it seems likely those maps will find their way into AI systems.
Pokémon Trainers
Launched in 2016, Pokémon Go became a global phenomenon and a pioneering force in the world of augmented reality gaming.
In the game, players encounter virtual characters, the Pokémon, scattered at real-world locations around the globe. Using their smartphones, they see those characters overlaid on their actual surroundings, say, perched on a park bench or loitering outside a movie theater. A newer feature, Pokémon Playground, lets players pin characters at specific spots for other players to find, a capability made possible by the company's detailed digital maps.
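To give a flavor of the bookkeeping a feature like that implies, here is a purely hypothetical sketch of a geo-anchored virtual object and a proximity check. The fields, coordinates, and radius below are invented for illustration and say nothing about how Niantic actually implements it.

```python
# Toy sketch (not Niantic's data model): a virtual character anchored to a
# real-world coordinate, plus a check for whether a player is close enough
# to see it in their AR view. All names and values are illustrative.
from dataclasses import dataclass
from math import radians, sin, cos, asin, sqrt

@dataclass
class GeoAnchor:
    object_id: str   # which virtual character or object
    lat: float       # latitude in degrees
    lon: float       # longitude in degrees
    heading: float   # which way it faces, in degrees

def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6_371_000 * asin(sqrt(a))

def visible_anchors(player_lat, player_lon, anchors, radius_m=40.0):
    """Return the anchors close enough to render for this player."""
    return [a for a in anchors if distance_m(player_lat, player_lon, a.lat, a.lon) <= radius_m]

bench_pikachu = GeoAnchor("pikachu", 37.8085, -122.4098, heading=90.0)
print(visible_anchors(37.8086, -122.4097, [bench_pikachu]))  # within ~15 m, so visible
```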
Niantic's Visual Positioning System (VPS) can pinpoint a smartphone's location to within centimeters from a single image of its surroundings. VPS builds 3D maps of places partly with classical techniques, but it also relies on a sprawling network of machine learning models, several per location, trained on years of player images and scans captured at different angles, times of day, and seasons, each tagged with its position in the real world.
Niantic says that, as part of VPS, it has trained more than 50 million neural networks comprising over 150 trillion parameters in total, enabling operation at more than a million locations.
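Those figures imply a fleet of many small, place-specific models rather than one big one: on average roughly 3 million parameters per network and about 50 networks per mapped location. The sketch below is an illustrative, hypothetical rendering of what that kind of per-location lookup could look like; none of the classes or numbers come from Niantic.

```python
# Back-of-the-envelope sketch of a per-location visual positioning lookup.
# Implied averages from the published stats:
#   150 trillion params / 50 million networks ≈ 3 million params per network
#   50 million networks / 1 million locations ≈ 50 networks per location
from dataclasses import dataclass

@dataclass
class LocalizerNet:
    """Stand-in for one small location-specific network (hypothetical)."""
    net_id: str

    def confidence(self, photo) -> float:
        return 0.9  # placeholder score for how well this net matches the photo

    def estimate_pose(self, photo):
        return {"xyz_m": (1.20, 0.35, 1.55), "yaw_deg": 87.0}  # fake cm-level pose

def grid_cell(lat, lon, step=0.001):
    """Quantize a rough GPS fix (roughly 100 m) to a coarse grid-cell key."""
    return (round(lat / step), round(lon / step))

class ModelIndex:
    """Maps a coarse grid cell to the handful of networks trained for that spot."""
    def __init__(self, nets_by_cell):
        self.nets_by_cell = nets_by_cell

    def nearby(self, lat, lon):
        return self.nets_by_cell.get(grid_cell(lat, lon), [])

def localize(photo, lat, lon, index):
    """Refine a coarse GPS fix into a precise pose using one nearby network."""
    candidates = index.nearby(lat, lon)
    if not candidates:
        return None
    best = max(candidates, key=lambda net: net.confidence(photo))
    return best.estimate_pose(photo)

index = ModelIndex({grid_cell(37.8085, -122.4098): [LocalizerNet("pier_entrance_0")]})
print(localize(photo=b"...", lat=37.8085, lon=-122.4098, index=index))
```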
Now, Niantic seeks further expansion.
Using the massive datasets from Pokémon Go and Scaniverse, Niantic wants to train a single foundation model that does the work of those thousands or millions of individual networks. Whereas each individual network is limited to the place it has seen, a single large model could generalize across all of them. Approaching the front of a church, it could draw on the many churches it has already digested to fill in the parts of this particular building it hasn't yet observed.
This is something we humans do all the time. We can't see around a corner, but we can make a solid guess about what's there, a hallway, a building facade, a room, and plan accordingly.
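As a toy illustration of that contrast, here is a hedged sketch, not Niantic's architecture, of the difference between one small network per place and a single shared model. The class names, feature sizes, and use of a location embedding are assumptions for demonstration; a real large geospatial model would presumably condition on imagery and geometry rather than location IDs, so it could generalize to places it has never seen.

```python
# Toy contrast: many per-location networks vs. one shared model whose trunk
# lets knowledge learned at one place inform predictions elsewhere.
import torch
import torch.nn as nn

class PerLocationNet(nn.Module):
    """One small head per mapped place: image features -> pose (x, y, z, yaw)."""
    def __init__(self, feat_dim=512):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, 4))

    def forward(self, feats):
        return self.head(feats)

class SharedGeoModel(nn.Module):
    """One trunk shared by every place, conditioned on a learned location embedding."""
    def __init__(self, feat_dim=512, num_locations=10_000, loc_dim=64):
        super().__init__()
        self.loc = nn.Embedding(num_locations, loc_dim)
        self.trunk = nn.Sequential(nn.Linear(feat_dim + loc_dim, 512), nn.ReLU(),
                                   nn.Linear(512, 4))

    def forward(self, feats, loc_id):
        return self.trunk(torch.cat([feats, self.loc(loc_id)], dim=-1))

feats = torch.randn(2, 512)              # pretend features from two photos
per_place = PerLocationNet()             # one of these per mapped location today
shared = SharedGeoModel()                # one model covering every location
print(per_place(feats).shape, shared(feats, torch.tensor([7, 7])).shape)
```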
Niantic says a large geospatial model would enable more immersive and realistic augmented reality experiences. But it could also power other applications, including robotics and autonomous systems.
Getting Physical
Niantic is in a unique position thanks to its dedicated community, which submits roughly a million new scans every week. Those scans also come from a pedestrian's point of view, unlike much of the imagery in Google Maps or the data gathered by self-driving cars. Still, they aren't perfect.
If the web is any indication, the most valuable new datasets may be the ones gathered in parallel by millions or even billions of people.
At the same time, Pokémon Go's coverage is incomplete. Though its locations span continents, they're sparse in any given area, and whole regions are dark. Companies like Google, with their sprawling mapping efforts, have been charting the globe for years, but unlike the web, those datasets are proprietary and fragmented.
It isn't yet clear whether an internet-sized dataset is even necessary to build a general AI that navigates the physical world as fluently as language models handle text.
But it's possible such a dataset could emerge from something like Pokémon Go, scaled up. Smartphones already carry sensors for photos, video, and 3D scans. As people adopt augmented reality apps, or pair their phone cameras with AI, say, snapping a photo of the fridge and asking a chatbot what to make for dinner, they'll lean on those sensors more and more. New devices could accelerate the trend, yielding a trove of fresh data about the physical world.
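To make the fridge example concrete, here is a minimal sketch using an OpenAI-style multimodal chat endpoint. It assumes the openai Python package and an API key are available; the model name, file path, and prompt are assumptions, and other providers expose similar capabilities through different APIs.

```python
# Hedged sketch of the "photo of the fridge -> dinner advice" loop.
import base64
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

with open("fridge.jpg", "rb") as f:  # hypothetical photo taken with a phone camera
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",  # assumed vision-capable model; swap in whatever is available
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "What could I cook for dinner with what's in this fridge?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```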
Of course, data gathering online is already controversial, and privacy is a major hurdle. Extending those problems into the physical world is hardly an appealing prospect.
For its part, Niantic notes that scanning is entirely opt-in: players have to travel to a designated public place and actively choose to scan it, and the company says the data goes toward building new AR experiences people can enjoy. But not every company collecting data about the physical world will be as transparent about what it gathers and how it gets used.
Making new algorithms in the mold of large language models won't necessarily be straightforward, either. Researchers at the Massachusetts Institute of Technology (MIT) recently developed an approach for pretraining robot AI on heterogeneous data. “In the language domain, the data comprise straightforward sentences,” notes Lirui Wang, lead author of a paper detailing the research. “In robotics, where heterogeneity is prevalent in the data, we require a tailored architecture to enable pretraining in a similar manner.”
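To make that idea concrete, here is a hedged toy sketch of the general pattern: per-robot encoders map different observation and action spaces into a shared token space that feeds one transformer trunk. It is an illustration under assumptions, not the MIT team's actual architecture, and every name and dimension in it is made up.

```python
# Toy sketch of heterogeneous pretraining: each robot type gets its own small
# "stem" encoder, while a single transformer trunk is shared across all of them.
import torch
import torch.nn as nn

class Stem(nn.Module):
    """Maps one robot's observation vector (whatever its size) into shared tokens."""
    def __init__(self, obs_dim, d_model=256, n_tokens=8):
        super().__init__()
        self.proj = nn.Linear(obs_dim, d_model * n_tokens)
        self.n_tokens, self.d_model = n_tokens, d_model

    def forward(self, obs):
        return self.proj(obs).view(obs.shape[0], self.n_tokens, self.d_model)

class SharedTrunk(nn.Module):
    """One transformer shared by every robot type, plus per-robot action heads."""
    def __init__(self, action_dims, d_model=256):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.heads = nn.ModuleDict({name: nn.Linear(d_model, dim)
                                    for name, dim in action_dims.items()})

    def forward(self, tokens, robot_name):
        h = self.encoder(tokens).mean(dim=1)   # pool the shared tokens
        return self.heads[robot_name](h)       # robot-specific action output

# Two hypothetical robots with different observation and action spaces.
stems = {"arm": Stem(obs_dim=37), "quadruped": Stem(obs_dim=112)}
trunk = SharedTrunk(action_dims={"arm": 7, "quadruped": 12})

obs = torch.randn(4, 112)                                   # a batch from the quadruped
print(trunk(stems["quadruped"](obs), "quadruped").shape)    # torch.Size([4, 12])
```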
Despite the open questions, researchers and companies are likely to press ahead. If the new approaches mature, future AI may not only think, converse, and write, but also navigate and act in the physical world with something like our own ease.