Introducing Spell: Spline's exciting new tool for creating realistic 3D scenes from images
Jan 23, 2025 at 10:30 AM

Popular 3D platform Spline has unveiled Spell, a new model designed to generate 3D worlds directly from images. This browser-based tool can create entire 3D scenes, referred to as “Worlds”, within minutes, maintaining consistency with the original image input. The generated scenes are represented as volumes that can be rendered using Gaussian Splatting or alternative methods like Neural Radiance Fields (NeRFs).
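Spline has not published implementation details, but the volume rendering the article alludes to can be illustrated in miniature. The sketch below, using hypothetical names, alpha-composites anisotropic 3D Gaussians along a camera ray — the core idea behind splatting-style and NeRF-style volume rendering, not Spell's actual pipeline:

```python
import numpy as np

def gaussian_density(p, mean, cov_inv):
    """Unnormalized density of an anisotropic 3D Gaussian at point p."""
    d = p - mean
    return np.exp(-0.5 * d @ cov_inv @ d)

def render_ray(origin, direction, splats, t_near=0.1, t_far=5.0, steps=64):
    """March along one camera ray, alpha-compositing Gaussian splats
    front to back, as in volume rendering. `splats` is a list of
    (mean, inverse covariance, RGB color, opacity) tuples."""
    ts = np.linspace(t_near, t_far, steps)
    dt = ts[1] - ts[0]
    color = np.zeros(3)
    transmittance = 1.0  # fraction of light not yet absorbed
    for t in ts:
        p = origin + t * direction
        for mean, cov_inv, rgb, opacity in splats:
            alpha = 1.0 - np.exp(-opacity * gaussian_density(p, mean, cov_inv) * dt)
            color += transmittance * alpha * rgb
            transmittance *= 1.0 - alpha
    return color

# One red splat at the origin, viewed from z = 2 looking down the -z axis.
splats = [(np.zeros(3), np.eye(3) * 25.0, np.array([1.0, 0.0, 0.0]), 4.0)]
pixel = render_ray(np.array([0.0, 0.0, 2.0]), np.array([0.0, 0.0, -1.0]), splats)
```

Real Gaussian Splatting rasterizes splats directly onto the image plane for speed rather than ray marching, but the front-to-back compositing is the same.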

Spell operates as a diffusion model, capable of producing 3D worlds with realistic multi-view consistency across diverse categories, including people, objects, environments, and 3D characters. It can render images from various angles with high precision and detail, generate controlled camera paths, and simulate physical material properties such as reflections, refractions, and surface roughness. Additionally, Spell can simulate camera properties like depth of field and manage camera/object intersections for realistic internal surface views.
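The "controlled camera paths" mentioned above are, in general, just sequences of camera poses. As an illustration (not Spell's API — all names here are invented), here is a minimal orbit path generator that yields positions and look-at orientations circling a subject:

```python
import numpy as np

def look_at(eye, target, up=np.array([0.0, 1.0, 0.0])):
    """Build a world-to-camera rotation matrix from an eye point and target."""
    forward = target - eye
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, up)
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)
    # Rows are the camera's basis vectors; camera looks down its -z axis.
    return np.stack([right, true_up, -forward])

def orbit_path(radius, height, n_frames):
    """Camera poses (position, rotation) for a smooth orbit around the origin."""
    poses = []
    for theta in np.linspace(0.0, 2.0 * np.pi, n_frames, endpoint=False):
        eye = np.array([radius * np.cos(theta), height, radius * np.sin(theta)])
        poses.append((eye, look_at(eye, np.zeros(3))))
    return poses

poses = orbit_path(radius=3.0, height=1.0, n_frames=120)
```

Feeding such a pose sequence to any volume renderer produces the kind of consistent multi-view flythrough the announcement describes.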

The model emphasizes physical consistency, simulating camera intersections rather than relying on interpolation, to maintain a realistic visual flow. Currently, Spline is offering Spell in an early access phase with limited availability and a high price point, targeting early adopters to gather user interaction insights while managing GPU costs.

Jan 23, 2025 by Paul

Spline is a web-based 3D modeling tool that lets users create 3D objects, edit materials, and add interactivity without writing code. It provides a streamlined platform for controlling the outcome of 3D design work. Rated 5 stars, Spline is recognized for its ease of use in 3D design. Notable alternatives include Blender, SketchUp, and Autodesk Maya.

Comments

UserPower
Jan 23, 2025
1

It generates 3D models from videos (generated from the input image), so depending on how well the video is generated from the input image (and, by extension, on the training set), it can give weird results from some angles (e.g. eyes that aren't perfectly spherical, distorted ears, etc.). It seems to generate pretty decent models for oversimplified items (like toy houses) and for abstract or furry objects, especially when they're round or cubic. Only one object is generated per render, so there's no way to create a whole forest or a city. Since it doesn't separate surfaces (e.g. skin from hair) and adds far too many vertices, the model can be very tough to adjust manually once generated (because of the UV mapping and the associated textures). And of course there's no control over how the final model should look; it's based only on the input image.

Gu