G’day. Here’s what I discovered this week in 3D…
Source:
Photo: https://www.youtube.com/watch?app=desktop&v=80GrFXFOayE
Masterpiece X will use generative AI for a new "game-ready" 3D creation platform available on Quest 2 in early access.
Masterpiece Studio is launching Masterpiece X to develop "game-ready 3D" with a mesh alongside textures and animation. Pitched as compatible with the Unity engine and "other popular apps," the store page stresses this platform is not for developers who want "to start from scratch".
Described as a "3D remixing platform," Masterpiece X lets you start by remixing existing models from a 'Community Library' to change their shape, style or how they move, though importing or generating models isn't available just yet. Once done, you can edit your 3D model and share it with Masterpiece X's wider community.
The app is described as "exclusively" for the Quest platform. It's still in "early access" with a waitlist for the generative AI features. Masterpiece X is available now for free on Meta Quest 2 and Quest Pro.
"Although we haven't released details about the 'secret sauce' of how we train our AI, what we can say is that our machine learning is done in a socially and legally responsible way," a support page for the app explains. "None of our existing AI has been trained off of copyrighted 3D models."
Source:
https://www.uploadvr.com/masterpiece-x-generative-ai-quest/
PwC drives innovative business and New Work approaches to working in virtual space / Trusted B2B solution for virtual collaboration with sensitive data / Developed in cooperation with the digital agency Demodern
With Virtual Spaces, the audit and consulting firm PwC Germany is launching the first business metaverse platform for everyday work built on its own software. The solution enables clients, partners and employees to enter virtual rooms as avatars and collaborate across company boundaries: holding meetings, running events, sharing sensitive data or realising other innovative forms of collaboration. In this way, Virtual Spaces connects the real business world with the digital one and offers a first step into the metaverse.
The solution, created in close technical collaboration with the digital agency Demodern, was developed consistently from PwC's brand values and thus fosters strong identification with the brand. As an audit firm, security and trust have the highest priority for PwC, so Virtual Spaces was implemented in a way that guarantees compliance and allows work with sensitive data.
"The metaverse is a large, fast-growing market. We are still at the very beginning, but German business has already understood that it is time to act. At PwC, Virtual Spaces eases us gradually into working in virtual space. We can then pass our experience on to our clients." Clemens Koch, Member of the Management Board, Head of Markets & Financial Services at PwC Germany
Virtual Spaces opens up many new possibilities for employees and clients. HR departments, for example, can extend many of their initiatives into the digital space. This covers everyday processes such as job interviews and feedback sessions, recruiting events, as well as training and other professional development measures. With Virtual Spaces, companies can also extend activities such as roadshows into the digital realm and thereby reach new target groups beyond planned tour stops. Users can customise the content of the virtual environment to their requirements, so the right setup is quickly found for every format.
"With Virtual Spaces, we can collaborate digitally with clients on a secure infrastructure at a whole new level and create entirely new kinds of brand experiences." Holger Kern, Head of Metaverse at PwC Germany
As a web-based platform, Virtual Spaces offers maximum accessibility with no barriers to entry. The platform provides numerous features for a wide range of use cases. At its core is the ability to establish a new, more direct and more personal form of digital collaboration through avatar-based interaction. Users can make use of the entire virtual room for their projects, for example to visualise data, combine different media, strengthen team spirit and create natural conversation settings. Virtual Spaces is fully integrated into PwC's IT infrastructure and can be used as a hub for everyday business.
"We are delighted to have developed such an innovative and fully functional New Work and B2B metaverse platform together with PwC, and to now be bringing it online." Kristian Kerkhoff, Co-Founder and Managing Director of Demodern
PwC Germany and Demodern are currently developing a VR version of Virtual Spaces, which is due to go live in the coming months.
The versatile communication platform for an immersive experience
Virtual Spaces is the versatile and secure solution for a unique employee and customer experience in the metaverse, developed in close cooperation between PwC and the digital agency Demodern. With its modular design, the browser-based platform offers the perfect entry point into virtual worlds: from meetings, workshops and project rooms to training courses, job fairs, customer events and showrooms, everything is possible.
With a single click, employees, applicants and clients gain access to a variety of digital rooms in which they interact via personal avatars, intuitively and authentically, regardless of where they are in the real world. Secure handling of sensitive data is also ensured.
"Immersing yourself in a virtual business world, collaborating seamlessly and securely, and living the company's DNA while doing so: that is exactly what Virtual Spaces makes possible." Carsten Lukas, Senior Manager, Product Lead Virtual Spaces at PwC Germany
Versatile in use: PwC's Virtual Spaces
From daily business including video calls and chats, through workshops and project rooms, to events including live streaming, everything can be realised with Virtual Spaces.
Virtual Spaces is ideal for representing your company authentically in virtual space: your customers can experience everything from events to showrooms online or in hybrid form.
A touchpoint for job openings, recruiting events, applicant interviews, onboarding programmes and training, Virtual Spaces is the authentic solution for HR.
A strong presence in the business metaverse
Appearing authentically in the virtual world, being inviting to everyone involved and collaborating efficiently: that is exactly what Virtual Spaces enables. Whether it is purely business meetings, inspiring events or HR matters: position yourself as an innovative pioneer and make your company an immersive experience. This strengthens the loyalty of customers and employees to the company while creating a personal environment.
Technically convincing: PwC's Virtual Spaces
Start the working day with the good feeling that everything runs smoothly, whether in the office, at home or on site with a client. Anyone who uses Virtual Spaces moves through the world of digital possibilities with a personal avatar while remaining closely connected to real life.
Reaching colleagues quickly via the "short digital channel", having all the information and tools for a joint presentation at hand later in the project room, and exchanging a bit of news in the virtual corridor on the way to an applicant interview: with Virtual Spaces, all of this is possible.
Source:
https://www.pwc.de/de/metaverse/virtual-spaces.html
Volograms, a provider of tools for creating 3D holograms using a single camera, has announced “Vologram Messages,” a new creation system powered by AI that can turn standard video into professional quality 3D holograms.
Now, anyone looking for a better way to engage people — from marketing announcements to corporate communications to wedding invitations — can simply record a standard video of a person, upload their clip, and receive a 3D version of that person that can be viewed anywhere, according to the company.
“From the moment Volograms was founded, our goal was fairly simple: we wanted to create a way to make volumetric holograms accessible to anyone and everyone,” said Rafa Pagés, CEO at Volograms. “With the introduction of Vologram Messages, we can now offer businesses and organizations a way to simplify the 3D hologram creation process and make it accessible to everyone. For a business looking to announce, promote, sell or even just engage with their own teams, this is an easy and affordable way to stand out.”
Volograms stated that where most comparable technologies require multiple cameras or depth sensors working in unison to create a 3D video, its own proprietary single-camera solution instead uses advanced AI algorithms to analyze a normal recording of a person and extrapolate a 3D volumetric hologram, viewable from any angle.
In 2021, this breakthrough led to the introduction of Volograms’ Volu, a personal 3D hologram creation tool for mobile devices that allowed Apple iOS users to simply record a five-second video, which was converted into a 3D hologram to be viewed and shared within the app. Vologram Messages now takes that technology and goes further, offering companies and organizations a professional holographic solution using any camera, viewable anywhere.
Vologram Messages users can record normal videos with audio, select the length — currently up to two minutes, but longer clips are available upon request — and upload the clip directly to Volograms’ cloud servers. The video is then converted automatically from 2D to 3D, and from there can be integrated into an interactive web-based augmented reality (webAR) template of their choice. Users can customize the template to include specific calls to action, including initiating user sign-ups, steering audiences to specific web pages and more. Additional visual elements and personalized copy can be added as well.
Volograms noted that for customers looking to go even further, advanced interactive options can also be requested. Working with a handful of established partners, the company stated that it can deliver tailored user flows, interactive product demos, customized analytics, gamified experiences and more to create elaborate and unique campaigns.
Once complete, interactive experiences are stored in Volograms’ cloud servers, ensuring that experiences can be accessed on any web browser or smart device. Through a partnership with volumetric video editing and streaming provider Arcturus, the clip is streamed from the cloud using an adaptive bitrate that compresses the file, ensuring a high-quality stream regardless of broadband speeds. Users then receive a link that can be integrated into websites, sent via email or even shared via social media, as well as a personalized QR code, which can be included on any official materials, product labels, business cards and more.
Volograms added that since its founding in 2018 it has raised over USD $3 million in venture funding to help build more accessible volumetric holograms. According to the company, the release of Vologram Messages marks the first in a series of major releases planned for 2023, all focused on increasing the accessibility of 3D volumetric holograms through a comprehensive platform.
For more information on Volograms and its new Vologram Messages platform, along with information on a free trial of the solution, please visit the company’s website.
Source:
Katmai has raised $22 million for its new take on remote work with a new 3D virtual office platform.
Katmai is emerging from stealth today with its mission to revolutionize the remote work landscape with a new kind of online workplace that uses a combination of 3D virtual offices, video cameras and avatars.
The company integrates its video conferencing with a real-time 3D engine, bringing people together within immersive, customized, photorealistic environments — all accessible through a browser with no virtual reality headset needed.
By incorporating real video instead of cartoon avatars, Katmai fosters authentic human interactions in keeping with its vision of changing remote work, said Erik Braund, Katmai CEO, in an interview with GamesBeat. I tried out a demo, and it worked pretty well at making you feel like you're both in an office with other people and still taking advantage of being remote.
Recent studies show that 97% of workers desire some form of remote work. Business leaders are grappling with the challenge of formulating post-pandemic work strategies that maintain the advantages of a physical office while optimizing productivity, employee well-being, and company culture.
Katmai’s virtual office product merges the benefits of a shared in-office experience with the convenience of working remotely, paving the way for a work paradigm that is truly inclusive. The 3D virtual office boasts a blend of private and co-working spaces, spatial audio, spontaneous interactions, and personalized environments that vividly represent your brand and products in the virtual realm.
“Our vision for the future of work encompasses the flexibility employees have come to expect while giving leaders instant access to their teams and boosting productivity. In a world where companies are striving to do more with less, our product is designed to offer businesses a competitive edge,” said Braund.
I joined a demo inside Katmai’s own virtual office, which is in a beta state. The company decided early on that video works better than 3D-animated avatars for speaking with colleagues.
“Using actual videos can convey emotion and help people build a rapport in a way that is more natural, more genuine,” Braund said.
You can move your avatar into a virtual glass office. When you close the virtual door, you can no longer hear someone who is on the other side of the door, but you can see that they are waiting outside to talk to you. When you open the door, they can come in and talk with you around a virtual table.
Katmai is pioneering the future of virtual experiences and hybrid work. The platform brings people together inside an easy-to-navigate, photo-realistic 3D environment, enabling natural communication, spontaneous interactions, and a sense of place that’s been missing from the digital world, Braund said.
The simplicity of the user experience means no proprietary software or hardware is required — Katmai runs in-browser on any webcam-enabled computer, he said. The company was founded in 2020, and is partnering with the world’s leading brands to create everything from virtual offices to one-off interactive experiences to digital twins of real-world locations.
“This is where we all work,” Braund said, appearing as a video speaker inside an avatar bubble.
It takes place inside a first-person environment, where you participate in the video-3D hybrid world using your own remote video rather than a 3D-animated avatar.
“Everybody that you see around is actually a person who’s here,” he said. “In the lower right of your screen where it says the participants, that’s who can hear you. But when I back out of this room, my name will disappear from there, and you’ll no longer be able to hear me. When I come back in there, you get a little bit of a doorbell.”
Rewinding to the beginning of the pandemic, the company was focused on audio and video production. It then pivoted to solving the challenge of meeting and interacting virtually online. Braund spent a couple of years self-funding the company with a staff of about 10 people. They refined the technology and got it to work pretty well, always operating in stealth as a “submarine company.”
“Coming up in a month it will be three years of being a submarine,” Braund said. “We’re testing and learning in a few different ways as a B2B product. So we’re not focused on the consumer at the time.”
It’s all about being a digital third place, in addition to your physical home and physical office.
Since launching its closed beta in 2021, Katmai has been working with dozens of companies across both Fortune 500 firms and startups with a diverse range of use cases. TMS, a technology, marketing, and sourcing company driving transformational change for the world’s leading brands, tapped Katmai to create a digital twin of its new Chicago Headquarters as a catalyst to support a flexible return to office policy.
“As a globally distributed team, working in Katmai has enabled us to cultivate community and company culture,” said Jim Eby, chief creative officer at TMS, in a statement. “Being together in the virtual office allows for deeper connections, real-time collaboration, spontaneity and fun that wasn’t previously achievable in our remote workflow.”
The company isn’t trying to do large events with a hundred or more people; for those occasions, companies can meet in real life. Day-to-day meetings are more its focus. Around 30 people in a meeting is where it maxes out now.
“We want to scale our release thoughtfully and carefully,” Braund said.
The Series A round was led by Starr Insurance Companies, with funding from additional investors including NFL Super Bowl Champion Sidney Rice. Katmai is actively expanding its product offerings and partnering with industry-leading companies to develop the next generation of digital engagement solutions.
Katmai believes its technology facilitates human connection in a way that is practical, delightful, and meaningful. The company has about 40 people, and they all meet inside Katmai.
“We really benchmark ourselves against IRL, against in real life,” he said.
It uses web-based technologies such as glTF, a 3D file format for lightweight 3D and ecommerce. The aim is to have a download happen in 10 seconds or less.
“There is an element of a digital twin,” he said.
Source:
Photo: Katmai wants to change how people work online.
Virbela’s new system brings more customizable avatars and fashions.
Virbela has grown a lot over time, but it has had the same avatar system for nearly a decade. If you open the app today, you’ll see a whole new avatar system. As impressive as it is, it might still have some growing to do.
Virbela is a platform for remote work, education, and events. The platform consists of an open campus that anyone can download and use for free, and private campuses co-created with clients.
There was nothing wrong with the old campus, but it got a whole lot of new features as well as a beautiful graphics upgrade showcased at the Hands In Enterprise Metaverse Summit last year. An upcoming avatar system was teased at the summit, but no release date was given.
“Avatars are important to the virtual experience because they add fidelity to the world,” Virbela Art Team Manager Nicole Galinato said at the event. “Our users love the playfulness of the current avatars, but they want more features that they can identify with.”
Since the summit, the old avatars roamed the new world. It wasn’t a glaring mismatch, but the avatars were definitely from a different generation. The new avatars certainly fit into the graphically updated virtual world a lot better.
The next time that you boot up Virbela, whether you’re a first-time user or just returning after a while, you will be greeted by the first page of the new avatar generator. Just like with the old system, you can join immediately with a default avatar and personalize it later if you want. If you’re not in that much of a hurry, you have a lot of playing to do.
You select one of three “body types” rather than gender, so all clothing and cosmetic options are open to all users. There’s also a custom gradient for specific skin tones and a number of features have an “advanced settings” button that opens up menus of highly customizable sliders. The update also brings several more hair and facial hair options.
“What really pushed us to create this new avatar system was more about this idea of inclusion and equity,” Virbela co-founder and President Alex Howland told me on the XR Talks podcast. “We are working with a very global population of users and we know the importance of the avatar for people to express themselves and explore their identity through their avatar.”
The update also brings new clothing options and customizations. Many outfits consist of a “top” and a “bottom,” with the top consisting of several layers, each with their own color combinations, similar to the system that AltspaceVR used (RIP Altspace). I went with the three-piece suit, which means color options for the jacket, vest, and shirt. (Neckties are under “Accessories”.)
“We also wanted much more variability in terms of the ability to customize the avatar because we sometimes have populations of many thousands in the same space and you’d find too many avatars that looked too similar to one another,” said Howland.
Even after you’ve toured your new avatar through the campus, you can change it at any time by selecting the gear icon in the upper right corner to open the settings dropdown menu and selecting the “change avatar” item at the top. And do keep checking back. According to Howland, more is coming.
“This is what I’ll call the [minimal viable product] of this new system. It’s a system that we can build upon and continue to add assets to, whether that be more hairstyles, more clothing options, more cultural garb, that folks can use over time – eventually leading to things like more facial expressions,” said Howland.
To return to Galinato’s concept that the avatars contribute to the immersion of the world itself, the more detailed and more personal avatars do seem more at home in the more detailed and responsive Virbela campus. I haven’t yet had the opportunity to attend a large event with the new avatars, but I’m sure that they’ll be a lot more colorful now.
Source:
XR Today explores the science behind the real-time 3D alter ego
The avatar-based metaverse is adding a new dimension to the world of social media. Following Facebook’s big reveal at its Connect 2021 and 2022 events, the company announced a series of technological updates.
Along with its Meta Quest Pro and Project Aria smart glasses, Meta has recently pledged to advance its avatar development projects. Meta promises to deliver avatars with greater realism, depth, and expression using its latest face-tracking tools on the Quest Pro.
Meta’s metaverse will likely feature hyper-realistic 3D avatars that use artificial intelligence (AI), sophisticated modelling techniques, and electromyography to render human features and movements in a virtual space accurately.
Meta’s Codec Avatars 2.0 and its Avatar software development kit (SDK) generate photorealistic avatars for users. Once content turnaround times shorten over the next few years, people will be able to use them for everyday communication.
Meanwhile, Meta’s big tech teammate Microsoft has launched a significant partnership on cross-compatibility between the former’s Quest product lineup and the latter’s Microsoft 365 solutions.
Microsoft is already vying for the metaverse after its massive $67 billion deal with Activision Blizzard. The major acquisition aims to build future metaverse infrastructure based on serious gaming technologies. Additional players in the immersive market, such as ByteDance’s Pico Interactive, will also feature fresh avatar development offerings.
Etymologically, the word avatar comes from the Sanskrit word for ‘descent,’ referring to deities descending to Earth and taking on human form.
In computing, avatars were popularised in the ’80s as an on-screen representation of internet users and gamers. The 1985 game Ultima IV: Quest of the Avatar firmly established the need for an on-screen representation of users that would bring a degree of realism.
Developers believed users should depict themselves accurately on-screen in the first person. Doing so would deepen immersion as they interacted with characters in the story and gameplay.
The same principle now applies to social media, as avatars are, quite literally, visualised representatives of users in virtual and gaming worlds. The avatar’s actions and decisions are identical to those of its operator.
This approach was first presented in Neal Stephenson’s 1992 science fiction novel Snow Crash, which debuted the metaverse concept.
Fast forward three decades and tech companies like Meta Platforms and Microsoft aim to realise the vision of a rich metaverse populated by lifelike avatars.
Avatars are not necessarily hyperrealistic but are on-screen or virtual depictions of the user. They can have any shape or form, usually with moveable limbs, a torso, and facial expressions.
Avatars may look similar to or entirely different from a user’s appearance in the real world. VR applications have their own take on avatars, depending on specific use cases.
Aside from simple avatars, Meta Platforms’ metaverse will soon feature bespoke, hyper-realistic avatars that closely resemble users’ facial and physical features. They will also support customisation of wearables like hair and clothing, as well as potential blockchain-linked non-fungible tokens (NFTs).
There are several ways software systems can create avatars for a virtual environment in both 3D and 2D.
Over the last few years, real-time 3D (RT3D) avatars have become standard for VR solutions, replicating real-world movements using sensors. There are two primary forms of avatars:
Simply put, the metaverse relies heavily on avatars to represent the psychology and actions of a user in virtual spaces. They can also enable the necessary interoperability between the metaverse’s features.
Estonia’s Ready Player Me has led this technology with its interoperable, highly bespoke avatars. People who design their own can port them across dozens of virtual space platforms, leaving one avatar to “rule them all.”
Users can complete gaming challenges in metaverse spaces like Timberland’s Fortnite or Izumi World, earn tokens, visit virtual marketplaces, and purchase digital assets.
Conversely, Microsoft Teams users can now create personalised avatars to use as meeting participants, using Microsoft Mesh technologies.
Essentially, the avatar behaves as a ‘passport’ that people can use to represent their identity. They will soon link to blockchain technologies to save data, become digital wallets, and travel between worlds.
Source:
The power of new file formats in the metaverse
Building engaging metaverse environments isn’t without its challenges. New digital platforms come with their own unique demands, particularly in regard to how content is processed and shared. As consumers and companies transition into metaverse landscapes, developers and designers must find ways of delivering high-quality immersive content in lightweight, agile forms.
Just as the JPEG influenced the evolution of the visual internet we know today, glTF (Graphics Language Transmission Format) could have a similar impact on the metaverse. Championed by the Khronos Group, one of the key members of the Metaverse Standards Forum, glTF promises a more scalable, efficient, and cost-effective way to bring visuals into the metaverse.
Used correctly, glTF could easily become the standard file format for those investing in metaverse environments and digital twins.
The Graphics Language Transmission Format, or glTF, is a standard file format used for three-dimensional models and scenes in metaverse environments. It’s a royalty-free specification designed to enhance the delivery of 3D models and content. The file format supports static models, moving scenes, and animated content alike, giving developers plenty of scope to work with.
Perhaps the most attractive aspect of glTF is that it allows developers to convey complicated, high-quality images, models, and videos in XR environments while adding minimal weight to the system. The technology compresses 3D objects and their textures to preserve the quality of content while reducing the pressure on devices and systems.
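To make the format concrete, here is a minimal, hypothetical sketch of a glTF 2.0 asset skeleton written out from Python. The top-level structure (an "asset" block plus scenes and nodes) follows the public glTF 2.0 specification; the file name and node name are made up for illustration, and a real model would also carry meshes, accessors, buffer views and binary geometry data.

import json

# Minimal glTF 2.0 skeleton: valid JSON per the spec, but with no geometry yet.
# A production asset would add "meshes", "accessors", "bufferViews" and "buffers"
# that reference the actual vertex and texture data.
gltf_asset = {
    "asset": {"version": "2.0"},
    "scene": 0,
    "scenes": [{"nodes": [0]}],
    "nodes": [{"name": "DemoNode"}],
}

with open("demo.gltf", "w") as f:  # hypothetical output file
    json.dump(gltf_asset, f, indent=2)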
According to Neil Trevett at Khronos, the glTF file format is also highly complementary to USD (Universal Scene Description), a common tool used for 3D asset development. Companies like NVIDIA have invested heavily in the USD landscape for their Omniverse environment, which has been built specifically with a focus on metaverse creation.
For many developers in the metaverse landscape, glTF stands to become the most popular file format for content development. Not only does it work well with the existing tools and technologies most content creators are using, but it’s an easy-to-use and efficient format too.
Built specifically for 3D content, glTF is lightweight and easy to process on any device or platform, including mobile phones and web browsers. glTF textures allow creators to take a JPEG-sized file and unpack it directly into a native GPU texture format, reducing both the memory required and data transmission times by a factor of 5 to 10. This can be crucial for delivering content to consumers as quickly as possible in the metaverse.
It even complements the majority of file formats already used in authoring tools. NVIDIA is currently working on a glTF connector for Omniverse to ensure assets can be easily imported and exported from the metaverse. Going forward, innovative developers will also look for ways to bring additional properties into the file format, such as sound and interactions.
The open-source nature of USD and the consistent evolution of glTF will ensure the format continues to evolve and grow to match the changing needs of the metaverse.
Source:
Developing custom AI tools for 3D workflows is easy in NVIDIA Omniverse
Demand for 3D worlds and virtual environments is growing exponentially across the world’s industries. 3D workflows are core to industrial digitalization, developing real-time simulations to test and validate autonomous vehicles and robots, operating digital twins to optimize industrial manufacturing, and paving new paths for scientific discovery.
Today, 3D design and world building is still highly manual. While 2D artists and designers have been graced with assistant tools, 3D workflows remain filled with repetitive, tedious tasks.
Creating or finding objects for a scene is a time-intensive process that requires specialized 3D skills honed over time like modeling and texturing. Placing objects correctly and art directing a 3D environment to perfection requires hours of fine tuning.
To reduce manual, repetitive tasks and help creators and designers focus on the creative, enjoyable aspects of their work, NVIDIA has launched numerous AI projects like generative AI tools for virtual worlds.
With ChatGPT, we are now experiencing the iPhone moment of AI, where individuals of all technical levels can interact with an advanced computing platform using everyday language. Large language models (LLMs) had been growing increasingly sophisticated, and when a user-friendly interface like ChatGPT made them accessible to everyone, it became the fastest-growing consumer application in history, surpassing 100 million users just two months after launching. Now, every industry is planning to harness the power of AI for a wide range of applications like drug discovery, autonomous machines, and avatar virtual assistants.
Recently, we experimented with OpenAI’s viral ChatGPT and new GPT-4 large multimodal model to show how easy it is to develop custom tools that can rapidly generate 3D objects for virtual worlds in NVIDIA Omniverse. Compared to ChatGPT, GPT-4 marks a “pretty substantial improvement across many dimensions,” said OpenAI co-founder Ilya Sutskever in a fireside chat with NVIDIA founder and CEO Jensen Huang at GTC 2023.
By combining GPT-4 with Omniverse DeepSearch, a smart AI librarian that’s able to search through massive databases of untagged 3D assets, we were able to quickly develop a custom extension that retrieves 3D objects with simple, text-based prompts and automatically adds them to a 3D scene.
This fun experiment in NVIDIA Omniverse, a development platform for 3D applications, shows developers and technical artists how easy it is to quickly develop custom tools that leverage generative AI to populate realistic environments. End users can simply enter text-based prompts to automatically generate and place high-fidelity objects, saving hours of time that would typically be required to create a complex scene.
Objects generated from the extension are based on Universal Scene Description (USD) SimReady assets. SimReady assets are physically-accurate 3D objects that can be used in any simulation and behave as they would in the real world.
Everything starts with the USD scene in Omniverse. Users can easily circle an area using the Pencil tool in Omniverse, type in the kind of room/environment they want to generate — for example, a warehouse, or a reception room — and with one click that area is created.
The ChatGPT prompt is composed of four pieces: system input, user input example, assistant output example, and user prompt.
Let’s start with the aspects of the prompt that are tailored to the user’s scenario. This includes the text that the user inputs plus data from the scene.
For example, if the user wants to create a reception room, they specify something like “This is the room where we meet our customers. Make sure there is a set of comfortable armchairs, a sofa and a coffee table.” Or, if they want to add a certain number of items they could add “make sure to include a minimum of 10 items.”
This text is combined with scene information like the size and name of the area where we will place items as the User Prompt.
“Reception room, 7x10m, origin at (0.0,0.0,0.0). This is the room where we meet
our customers. Make sure there is a set of comfortable armchairs, a sofa and a
coffee table”
This notion of combining the user’s text with details from the scene is powerful. It’s much simpler to select an object in the scene and programmatically access its details than to require the user to write a prompt describing all of those details. I suspect we’ll see a lot of Omniverse extensions that make use of this Text + Scene to Scene pattern.
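As a rough illustration of that pattern, a minimal sketch of how the user prompt above could be assembled in Python is shown below. The variable names and the way the area size is obtained are assumptions made for the example; in the real extension these values would come from the area selected with the Pencil tool.

# Hypothetical sketch of the "Text + Scene" pattern: combine what the user typed
# with measurements read from the selected area to build the user prompt string.
area_name = "Reception room"            # assumed: taken from the selected area
size_x_m, size_z_m = 7, 10              # assumed: area size read from the scene
origin = (0.0, 0.0, 0.0)                # assumed: origin of the selected area
user_text = ("This is the room where we meet our customers. Make sure there is "
             "a set of comfortable armchairs, a sofa and a coffee table")

my_prompt = f"{area_name}, {size_x_m}x{size_z_m}m, origin at {origin}. {user_text}"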
Beyond the user prompt, we also need to prime ChatGPT with a system prompt and a shot or two for training.
In order to create predictable, deterministic results, the AI is instructed by the system prompt and examples to specifically return a JSON with all the information formatted in a well-defined way, so it can then be used in Omniverse.
Here are the four pieces of the prompt that we will send.
System input: This sets the constraints and instructions for the AI.
You are an area generator expert. Given an area of a certain size, you can generate a list of items that are appropriate to that area, in the right place.
You operate in a 3D Space. You work in a X,Y,Z coordinate system. X denotes width, Y denotes height, Z denotes depth. 0.0,0.0,0.0 is the default space origin.
You receive from the user the name of the area, the size of the area on X and Z axis in centimeters, the origin point of the area (which is at the center of the area).
You answer by only generating JSON files that contain the following information:
- area_name: name of the area
- X: coordinate of the area on X axis
- Y: coordinate of the area on Y axis
- Z: coordinate of the area on Z axis
- area_size_X: dimension in cm of the area on X axis
- area_size_Z: dimension in cm of the area on Z axis
- area_objects_list: list of all the objects in the area
For each object you need to store:
- object_name: name of the object
- X: coordinate of the object on X axis
- Y: coordinate of the object on Y axis
- Z: coordinate of the object on Z axis
Each object name should include an appropriate adjective.
Keep in mind, objects should be placed in the area to create the most meaningful layout possible, and they shouldn't overlap.
All objects must be within the bounds of the area size; Never place objects further than 1/2 the length or 1/2 the depth of the area from the origin.
Also keep in mind that the objects should be disposed all over the area in respect to the origin point of the area, and you can use negative values as well to display items correctly, since the origin of the area is always at the center of the area.
Remember, you only generate JSON code, nothing else. It's very important.
User input example: This is an example of what a user might submit. Notice that it’s a combination of data from the scene and the text prompt.
"Reception room, 7x10m, origin at (0.0,0.0,0.0). This is the room where we meet
our customers. Make sure there is a set of comfortable armchairs, a sofa and a
coffee table"
Assistant output example: This provides a template that the AI must use. Notice how we’re describing the exact JSON we expect.
{
    "area_name": "Reception",
    "X": 0.0,
    "Y": 0.0,
    "Z": 0.0,
    "area_size_X": 700,
    "area_size_Z": 1000,
    "area_objects_list": [
        {
            "object_name": "White_Round_Coffee_Table",
            "X": -120,
            "Y": 0.0,
            "Z": 130
        },
        {
            "object_name": "Leather_Sofa",
            "X": 250,
            "Y": 0.0,
            "Z": -90
        },
        {
            "object_name": "Comfortable_Armchair_1",
            "X": -150,
            "Y": 0.0,
            "Z": 50
        },
        {
            "object_name": "Comfortable_Armchair_2",
            "X": -150,
            "Y": 0.0,
            "Z": -50
        }
    ]
}
This prompt is sent to the AI from the extension via Python code. This is quite easy in Omniverse Kit and can be done with just a couple of commands using the latest OpenAI Python library. Notice that we are passing to the OpenAI API the system input, the sample user input, and the sample expected assistant output we have just outlined. The variable “response” will contain the expected response from ChatGPT.
import openai  # the OpenAI Python library used by the extension

# Create a completion using the ChatGPT model
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    # if you have access, you can swap to model="gpt-4"
    messages=[
        {"role": "system", "content": system_input},
        {"role": "user", "content": user_input},
        {"role": "assistant", "content": assistant_input},
        {"role": "user", "content": my_prompt},
    ]
)

# parse the response and extract the generated text
text = response["choices"][0]["message"]["content"]
The items from the ChatGPT JSON response are then parsed by the extension and passed to the Omniverse DeepSearch API. DeepSearch allows users to search 3D models stored within an Omniverse Nucleus server using natural language queries.
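A minimal sketch of that parsing step is shown below. It assumes the `text` variable from the listing above and the JSON keys from the example assistant output; the variable names are illustrative rather than taken from the actual extension.

import json

# Turn the model's raw JSON string into a Python structure so the object list
# can be handed to DeepSearch and, later, to the placement function.
try:
    area = json.loads(text)
except json.JSONDecodeError:
    # The model occasionally returns extra text around the JSON; a retry or a
    # stricter system prompt is one way to handle that case.
    raise

gpt_results = area["area_objects_list"]
object_names = [obj["object_name"] for obj in gpt_results]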
This means that even if we don’t know the exact file name of a model of a sofa, for example, we can retrieve it just by searching for “Comfortable Sofa” which is exactly what we got from ChatGPT.
DeepSearch understands natural language, and by asking it for a “Comfortable Sofa” we get a list of items that our helpful AI librarian has decided are best suited from the selection of assets in our current asset library. It is surprisingly good at this, so we can often use the first item it returns, but of course we build in choice in case the user wants to select something else from the list.
From there, we simply add the object to the stage.
Now that DeepSearch has returned results, we just need to place the objects in Omniverse. In our extension, we created a function called place_deepsearch_results() that processes all the items and places them in the scene.
# imports required by this snippet (Omniverse Kit / USD Python API)
import omni.usd
from pxr import Usd, UsdGeom, Sdf

def place_deepsearch_results(gpt_results, query_result, root_prim_path):
    index = 0
    for item in query_result:
        # Define Prim
        stage = omni.usd.get_context().get_stage()
        prim_parent_path = root_prim_path + item['object_name'].replace(" ", "_")
        parent_xForm = UsdGeom.Xform.Define(stage, prim_parent_path)
        prim_path = prim_parent_path + "/" + item['object_name'].replace(" ", "_")
        next_prim = stage.DefinePrim(prim_path, 'Xform')

        # Add reference to USD Asset
        references: Usd.References = next_prim.GetReferences()
        references.AddReference(
            assetPath="your_server://your_asset_folder" + item['asset_path'])

        # Add reference for future search refinement
        config = next_prim.CreateAttribute("DeepSearch:Query", Sdf.ValueTypeNames.String)
        config.Set(item['object_name'])

        # translate prim
        next_object = gpt_results[index]
        index = index + 1
        x = next_object['X']
        y = next_object['Y']
        z = next_object['Z']
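The listing stops right after the coordinates are read. A minimal sketch of the final step that applies the translation is shown below; it assumes `from pxr import Gf` alongside the imports above and uses UsdGeom.XformCommonAPI from the standard USD Python API, so the exact placement code in the original extension may differ.

        # Hedged sketch: apply the GPT-provided position to the newly created xform.
        UsdGeom.XformCommonAPI(parent_xForm.GetPrim()).SetTranslate(Gf.Vec3d(x, y, z))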
This method iterates over the query_result items returned by DeepSearch for the objects GPT listed, creating and defining new prims using the USD API and setting their transforms and attributes based on the data in gpt_results. We also save the DeepSearch query in an attribute on the USD prim, so it can be used afterwards in case we want to run DeepSearch again. Note that the assetPath “your_server://your_asset_folder” is a placeholder and should be substituted with the real path of the folder where DeepSearch is performed.
And voila! We have our AI-generated scene in Omniverse!
However, we might not like all the items that are retrieved the first time. So, we built a small companion extension to allow users to browse for similar objects and swap them in with just a click. With Omniverse, it is very easy to build in a modular way so you can easily extend your workflows with additional extensions.
This companion extension is quite simple. It takes as argument an object generated via DeepSearch, and offers two buttons to get the next or previous object from the related DeepSearch query. For example, if the USD file contained the Attribute “DeepSearch:Query = Modern Sofa”, it would run this search again via DeepSearch and get the next best result. You could of course extend this to be a visual UI with pictures of all the search results similar to the window we use for general DeepSearch queries. To keep this example simple, we just opted for two simple buttons.
See the code below, which shows the function that increments the index and the function replace_reference(self) that actually swaps the object based on that index.
def increment_prim_index(self):
    if self._query_results is None:
        return
    self._index = self._index + 1
    if self._index >= len(self._query_results.paths):
        self._index = 0
    self.replace_reference()

def replace_reference(self):
    references: Usd.References = self._selected_prim.GetReferences()
    references.ClearReferences()
    references.AddReference(
        assetPath="your_server://your_asset_folder" + self._query_results.paths[self._index].uri)
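For the “previous” button, the handler is symmetric. The following is a hypothetical sketch mirroring increment_prim_index, not code from the original extension:

def decrement_prim_index(self):
    # Hypothetical counterpart: step backwards through the DeepSearch results
    # and wrap around to the last item when the index goes below zero.
    if self._query_results is None:
        return
    self._index = self._index - 1
    if self._index < 0:
        self._index = len(self._query_results.paths) - 1
    self.replace_reference()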
Note that, as above, the path “your_server://your_asset_folder” is just a placeholder, and you should replace it with the Nucleus folder where your DeepSearch query gets performed.
This shows how, by combining the power of LLMs and Omniverse APIs, it is possible to create tools that power creativity and speed up processes.
One of the main advancements in OpenAI’s new GPT-4 is its improved spatial awareness compared with earlier large language models.
We initially used the ChatGPT API, which is based on GPT-3.5-turbo. While it offered good spatial awareness, GPT-4 offers much better results. The version you see in the video above uses GPT-4.
GPT-4 is vastly improved with respect to GPT-3.5 at solving complex tasks and comprehending complex instructions. We could therefore be much more descriptive and use natural language when engineering the text prompt to “instruct the AI”.
We could give the AI very explicit instructions, like the layout rules in the system prompt above (for example, that objects must not overlap and must stay within the bounds of the area).
The fact that these system prompts are followed appropriately by the AI when generating the response is particularly impressive, as the AI demonstrates a good understanding of spatial awareness and how to place items properly. One of the challenges of using GPT-3.5 for this task was that objects were sometimes spawned outside the room, or at odd placements.
GPT-4 not only places objects within the right boundaries of the room, but also places objects logically: a bedside table will actually show up on the side of a bed, a coffee table will be placed in between two sofas, and so on.
With this, we’re likely just scratching the surface of what LLMs can do in 3D spaces!
While this is just a small demo of what AI can do once it’s connected to a 3D space, we believe it will open doors to a wide range of tools beyond scene building. Developers can build AI-powered extensions in Omniverse for lighting, cameras, animations, character dialog, and other elements that optimize creator workflows. They can even develop tools to attach physics to scenes and run entire simulations.
We are working on making this and other experimental generative AI examples available to Omniverse creators and developers soon. You can check out some of our initial AI research projects in Omniverse AI ToyBox.
You can start integrating AI into your extensions today using Omniverse Kit and Python. Download Omniverse today to get started.
Source:
https://medium.com/@nvidiaomniverse/chatgpt-and-gpt-4-for-3d-content-generation-9cbe5d17ec15
Spatial, a 3D social and co-experience platform for creatives to build and share interactive online worlds, has today announced the graduation to beta of its Creator Toolkit powered by Unity, enabling the platform to now support gamification & interactive exhibitions.
With the Spatial Creator Toolkit, creators now have the flexibility to design, build, and publish their own games or immersive stories across Web, virtual reality (VR) and mobile. Spatial stated that this will help to save significant time for developers looking to bring their experiences quickly to mainstream audiences. The latest gaming components of the platform include visual scripting, custom avatars, a world linking system, and the ability to set up quests and rewards.
According to Spatial, gamified, immersive experiences on the web are gaining momentum, but friction still remains for developers, who must learn and stay tied to a single system, weighed down by downloads and load times. To address this issue, Spatial’s universal toolkit allows anyone to become a Unity developer in a matter of clicks and enables users to build experiences straight from a browser, with no app or platform downloads needed.
Spatial’s alpha toolkit launched in December with a ‘zero infrastructure’ approach, designed to make designing and distribution easier, and marked a first step by the company in expanding the capabilities for major brands on the platform such as Vogue, Hublot, and McDonald’s.
“… Unity is the software unlocking 3D games and the new medium of the internet. Spatial is like the YouTube for these games.”
– Anand Agarawala, CEO and Co-founder, Spatial
Spatial’s Creator Toolkit has allowed developers and artists to benefit from enhanced visual quality through real-time lighting, triggers and animations, as well as shaders and textures, and has seen over 5,000 Unity developers join in a matter of weeks, according to the company. The latest gamification features will now open doors to the next level of developers looking to benefit from professional-grade features, zero learning curve and instant scalability.
“This evolution to gamified and interactive co-experiences is a natural expansion for the platform and the internet,” said Jinha Lee, CPO and Co-founder, Spatial. “With more than 1 million registered creators on the platform today, and almost 2 million worlds, we are committed to empowering all creators. Preserving art and culture on the internet has made it a must-see platform for gamers, developers, storytellers and artists alike.”
“As Adobe is for 2D video, Unity is the software unlocking 3D games and the new medium of the internet. Spatial is like the YouTube for these games, enabling instant publishing to the mass market. Anyone can build, the key is unlocking the capabilities to allow the magic to happen,” said Anand Agarawala, CEO and Co-founder, Spatial.
Spatial stated that later this year it will be rolling out an open marketplace with a competitive economic offering that makes building on Spatial an attractive business for any 3D creator. This will complete Spatial’s initial push to empower builders to not only create but also channel their creativity into sustainable businesses.
Spatial will be at the Game Developers Conference in San Francisco this week, from March 20-24 (booth #S1450), where attendees will be able to experience previews of unique Unity-built experiences, demos of interconnected worlds, and a look at marketplace prototypes. For more information on Spatial and its solutions for the metaverse and zero-infrastructure gaming experiences, please visit the company’s website.
Source:
Image / video credit: Spatial / YouTube