Technology Landscape Overview

The AI-generated virtual environments landscape underwent a paradigm shift in 2024-2025 with the emergence of “world models” as distinct from traditional 3D asset generators. Where previous years focused on single-object text-to-3D (Meshy, Tripo, Luma Genie), the field now includes systems capable of generating complete interactive environments from text, image, or video inputs.

The Fundamental Bifurcation

The landscape is bifurcated between two fundamentally different approaches:

Video-Based World Models

  • Examples: Google DeepMind’s Genie 3, Decart’s Oasis, Odyssey
  • Method: Generate interactive video streams frame-by-frame using next-token prediction
  • Strength: Real-time interactivity and infinite variation
  • Limitation: Produce no exportable 3D geometry
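The frame-by-frame method can be illustrated with a toy autoregressive rollout. The predictor below is a stand-in invented for this sketch, not any real model; actual systems such as Genie 3 use large learned models over tokenized video.

```python
import numpy as np

def predict_next_frame(history, action):
    """Stand-in for a learned world model: blends the previous frame
    with action-seeded noise. Real systems use large autoregressive
    models over tokenized video, not this toy rule."""
    last = history[-1]
    rng = np.random.default_rng(action)
    return 0.9 * last + 0.1 * rng.random(last.shape)

# Autoregressive rollout: each new frame is conditioned on the frames
# so far plus a user action. Note that no 3D scene ever exists --
# the "world" is only the stream of pixels.
frames = [np.zeros((72, 128, 3))]  # start frame (downscaled for the sketch)
for step in range(24):             # one second at 24 fps
    action = step % 4              # placeholder user input
    frames.append(predict_next_frame(frames, action))

print(len(frames))  # prints 25
```

The loop makes the limitation concrete: interactivity comes from conditioning the next frame on user input, but the only artifact produced is pixels, so there is nothing to export to a game engine.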

Geometry-Based Systems

  • Examples: World Labs Marble, Meta WorldGen, traditional 3D generators
  • Method: Output meshes, Gaussian splats, or USD files
  • Strength: Compatible with game engines
  • Limitation: Require minutes to hours of generation time
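By contrast, the artifact a geometry-based system emits is an ordinary scene file. The following minimal sketch builds a valid glTF 2.0 asset containing a single triangle, purely to show the kind of engine-compatible output these systems target (the geometry here is hand-written, not generated):

```python
import base64, json, struct

# Three vertices of a triangle, packed as little-endian float32,
# which is glTF's required binary layout for POSITION data.
positions = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
blob = b"".join(struct.pack("<3f", *p) for p in positions)

gltf = {
    "asset": {"version": "2.0"},
    "buffers": [{
        "uri": "data:application/octet-stream;base64,"
               + base64.b64encode(blob).decode(),
        "byteLength": len(blob),
    }],
    "bufferViews": [{"buffer": 0, "byteOffset": 0, "byteLength": len(blob)}],
    "accessors": [{
        "bufferView": 0,
        "componentType": 5126,  # 5126 = FLOAT in the glTF spec
        "count": 3, "type": "VEC3",
        "min": [0, 0, 0], "max": [1, 1, 0],
    }],
    "meshes": [{"primitives": [{"attributes": {"POSITION": 0}}]}],
    "nodes": [{"mesh": 0}],
    "scenes": [{"nodes": [0]}],
    "scene": 0,
}

print(json.dumps(gltf, indent=2)[:40])
```

Because the output is standard glTF, any conforming engine importer can load it directly, which is exactly the interoperability advantage this branch of the landscape trades speed for.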

Platform Integration

Major metaverse platforms are rapidly integrating AI generation capabilities:

| Platform | Status | Key Feature |
| --- | --- | --- |
| Meta Horizon Worlds | Production since April 2024 | Text-to-mesh generation; 31% increase in world publishes |
| Roblox | Open-sourced Cube 3D (GDC, March 2025) | 1.8B-parameter foundation model |
| NVIDIA Omniverse | Production | “Physical AI” with Cosmos World Foundation Models |

The Quality-Speed Tradeoff

The quality-vs-speed tradeoff remains fundamental:

| System Type | Speed | Exportability | Example |
| --- | --- | --- | --- |
| Real-time | 720p/24fps | No 3D export | Genie 3 |
| Batch | 5 min – hours | Engine compatible | Marble, WorldGen |

No system currently bridges both requirements.


Paradigm Map

| Paradigm | Description | Maturity | Trajectory |
| --- | --- | --- | --- |
| Video-Based World Models | Generate interactive video streams frame-by-frame; no 3D export; real-time capable | SRL-4 Pilot | Growing (Google, Decart, Odyssey investing) |
| Geometry-Native Generators | Output meshes/USD compatible with game engines; batch generation | SRL-5 Early | Growing (World Labs, Meta, NVIDIA) |
| Neural Representations (3DGS) | Gaussian splat output; photorealistic but requires custom renderers | SRL-4 Pilot | Growing (Luma AI, World Labs) |
| Platform-Embedded AI | AI tools built into metaverse platforms (Roblox, Meta Horizon) | SRL-5 Early | Mainstream trajectory |
| Hybrid PCG+AI | Combines procedural generation with AI enhancement | SRL-5 Early | Stable (Unreal PCG, Unity Muse) |
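The "requires custom renderers" caveat for 3DGS follows from its data model: a scene is millions of anisotropic Gaussians rather than triangles. The sketch below shows the commonly used per-splat fields (position, rotation, scale, opacity, spherical-harmonic color); exact on-disk serializations vary by implementation, so treat this layout as illustrative.

```python
from dataclasses import dataclass

@dataclass
class Gaussian:
    """One splat in a 3D Gaussian Splatting scene. Field set follows
    the common convention; real formats differ in layout and units."""
    position: tuple   # (x, y, z) center of the Gaussian
    rotation: tuple   # unit quaternion (w, x, y, z) orienting it
    scale: tuple      # per-axis extent (anisotropic)
    opacity: float    # 0..1, typically after a sigmoid activation
    sh_coeffs: list   # spherical-harmonic color coefficients

splat = Gaussian(
    position=(0.0, 1.5, -2.0),
    rotation=(1.0, 0.0, 0.0, 0.0),
    scale=(0.05, 0.05, 0.02),
    opacity=0.8,
    sh_coeffs=[0.6, 0.4, 0.3],  # degree-0 (base RGB) terms only
)
print(splat.opacity)  # prints 0.8
```

Because none of these fields map onto vertices, normals, or UVs, a stock mesh pipeline cannot rasterize them, which is why 3DGS support hinges on new renderers and new interchange standards.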

Technology Readiness Summary

Standards Readiness Levels by Domain

| Domain | Status | SRL |
| --- | --- | --- |
| Video World Models | Research/Pilot | SRL-4 |
| Geometry Generators | Early Adoption | SRL-5 |
| Neural Formats (3DGS) | Prototype/Draft | SRL-3 |
| Platform AI Tools | Early Production | SRL-5 |
| Generation Metadata | Research Only | SRL-2 |
| Interoperability | Fragmented | SRL-3 |

Key Observations

  1. Generation Metadata is the least mature area (SRL-2) - representing a greenfield opportunity for MSF standards work

  2. Interoperability remains fragmented (SRL-3) - each major system uses proprietary formats

  3. Platform AI Tools have reached early production (SRL-5) - demonstrating market readiness

  4. Neural Formats lag behind traditional geometry (SRL-3 vs SRL-5) - KHR_gaussian_splatting still in draft


Implications for Standards

The bifurcation between video-based and geometry-based paradigms means:

  • Video-based systems cannot be addressed by traditional 3D standards - they output pixel streams, not geometry
  • Geometry-based systems can leverage existing standards (glTF, USD) but need AI-specific extensions
  • Neural representations (3DGS) require new standards that are only now being drafted
  • Generation metadata has no standards coverage - the highest priority gap
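Absent a standard, generation metadata can only ride along in ad-hoc fields. The sketch below tucks a hypothetical provenance record into glTF's free-form `asset.extras`; every key name here is invented for illustration, not a proposed schema.

```python
import json

# Hypothetical provenance record in glTF's "extras" field. Conforming
# glTF tooling ignores extras entirely, so the data survives transport
# but is invisible to validators and importers -- which is the gap a
# generation-metadata standard would close.
gltf = {
    "asset": {
        "version": "2.0",
        "extras": {
            "generator_model": "example-text-to-3d-v1",  # illustrative name
            "prompt": "a mossy stone archway",           # illustrative value
            "seed": 12345,
            "created": "2025-06-01T12:00:00Z",
        },
    },
    "scenes": [],
    "nodes": [],
    "meshes": [],
}

# Round-trip through JSON, as an exporter/importer pair would.
doc = json.loads(json.dumps(gltf))
print(doc["asset"]["extras"]["seed"])  # prints 12345
```

A standardized extension would replace these free-form keys with an agreed vocabulary, letting engines and marketplaces query provenance without per-vendor parsing.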