Dynamic objects are still a largely unsolved problem; I just tried to approach it in this demo. Also, this particular place doesn't have reflective surfaces, but the technology supports them - check for example this splat https://superspl.at/scene/ff1d0393 or this one https://superspl.at/scene/6c822f84
Plays decently smooth on my M4 Max. It's probably still a long way from being a production-ready replacement for meshed environments, but I could imagine a hybrid mode where certain elements like grass and shrubbery are drawn with gaussians, perhaps with support for basic procedural animation. Great work with the playable demo!
Endless fields of grass and other things where you can make copies of a single base object and just vary some parameters like position, color, type, etc. are cheap to render. Making them sway or react to a body also isn't a problem.
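A minimal sketch of that idea (all names here are hypothetical, assuming a renderer that takes one shared base asset plus a flat per-instance parameter buffer):

```python
import math
import random
import struct

def build_grass_instances(count, field_size, seed=0):
    """Generate compact per-instance records for one shared grass asset.

    Each instance is just a handful of parameters (position, tint, type id);
    the expensive base geometry/splat is stored once and reused by all of them.
    """
    rng = random.Random(seed)
    instances = []
    for _ in range(count):
        instances.append({
            "x": rng.uniform(0, field_size),
            "z": rng.uniform(0, field_size),
            "tint": rng.uniform(0.8, 1.2),         # slight color variation
            "type": rng.randrange(4),              # which of a few base blades
            "phase": rng.uniform(0, 2 * math.pi),  # offset for sway animation
        })
    return instances

def sway_offset(instance, t, amplitude=0.05):
    """Cheap procedural sway: a per-instance phase keeps the field from
    moving in lockstep, with no per-blade simulation needed."""
    return amplitude * math.sin(t + instance["phase"])

def pack_instance_buffer(instances):
    """Pack instances into a flat binary buffer (e.g. for GPU upload):
    20 bytes per blade instead of a full copy of the geometry."""
    buf = bytearray()
    for inst in instances:
        buf += struct.pack("<ffffI", inst["x"], inst["z"],
                           inst["tint"], inst["phase"], inst["type"])
    return bytes(buf)
```

Packed like this, 100,000 blades cost around 2 MB of instance data, which is why such fields are cheap compared to storing each blade outright.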
> I think it won't be long before the whole world is mapped, and "playable".
People already don't want to use VR, why would they get into/allow scanning with even less immediate value?
I agree with the spirit though, I just think rendering the world is gonna happen after a few generations of iteration on world-modeling tech like World Labs/Marble.
tbh I haven't yet approached optimisation for this; I'm pretty sure it's possible to improve it further. It runs on my 2020 iPhone, though not super smoothly.
For me, the biggest issue this solves is the blank canvas paralysis problem. Artists are visual thinkers and need a little nudge in the right (art) direction. This is a great way to fill that blank sheet of paper with something that they can take and run with.
Editing Gaussian splats is still a pain in the ass from an artist's perspective. Even if you can create a good-enough first try using scanned data or generative AI, you just end up with a rough draft that you cannot polish in any way. Existing mesh-based tools let you edit the geometry relatively easily, since meshes are a higher-level, discrete representation rather than just a point-cloud data structure.
It seems to me that there's some overheated rhetoric that reminds me of the tech-specific spin on the Appeal to Novelty fallacy [1], where people think a new tech is going to uniformly improve on an old tech, that if it isn't an improvement on every front it is somehow a "failure", and therefore if we like the new tech and we are on Team New Tech that we must defend how the new tech is an improvement on every aspect.
Gaussian splats are definitely interesting and do some things older tech is not very good at, but at the same time, they're going to end up being a tool in the tool chest rather than completely murdering mesh-based tech, because they have a lot of other weaknesses, like editability. Or dynamic animation.
What I think some people may not realize is, that's not particularly uncommon. There's a really, really long line of graphical techs that do something particularly well but their weaknesses have kept them in a limited use. It's not a problem for Gaussian splats to become a tool in the toolchest; they aren't a "failure" if we're still using meshes for a lot of things in 10 years.
Mesh-type techs are the "default" for some good reasons.
There’s some movement in this area toward surface-quantizing the splats, but you're right: right now it's just a visual language and isn't useful in the pipeline.
Extracting a surface mesh is possible, but the result is going to be really ugly (like the high-poly meshes from generative AI that are useless to artists)!
Mesh processing is a very difficult research domain in computer graphics that has been iterated on for several decades, and we still don't have a good automated solution for retopology (partly because the problem is hard to define mathematically, but also because it's not a problem you can solve just by throwing data and compute at AI).
I used voxelization of the splats in the past, so I appreciate the notes on the difficulty, but this is sort of what PlayCanvas is doing here: taking the splats, making voxels, meshing.
It's a novel approach and worked well in BIM a few years ago, though not anything real-time.
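The voxelization step itself is conceptually simple - quantize each splat center to a grid cell. A rough sketch (a real pipeline would also account for each splat's extent and opacity, not just its center):

```python
def voxelize(points, voxel_size):
    """Map each 3D point to the integer grid cell containing it.

    Returns the set of occupied cells, which a mesher (marching cubes,
    dual contouring, ...) can then turn into a surface.
    """
    occupied = set()
    for x, y, z in points:
        cell = (int(x // voxel_size),   # floor division handles negatives
                int(y // voxel_size),
                int(z // voxel_size))
        occupied.add(cell)
    return occupied
```

The hard part is everything downstream: turning the occupied cells into a mesh an artist can actually work with.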
With things like the latest DLSS (extremely high-quality runtime... reinterpretation), I wonder how precise the mesh etc. has to be now. For example:
1. extract even a super approximate (meaning, like square edges, with some visual details) mesh from gen ai or a scan as a starting point,
2. move things around and define volumes for gameplay needs,
3. name things ("this is a Victorian house in a surprisingly good condition compared to the neighborhood it's in"), have human guided gen ai polish the things a bit more from the labels within the bounds of the gameplay required volumes,
4. let run time dlss fix the lighting etc from the rough geometry
Yeah, but this seems to be just a 3D GS video (captured from several different camera angles), similar to how an ordinary 2D video is just a series of still frames. For 3D games this would be unsuitable since animations often have to be generated on the fly based on game physics. Even for pre-baked animations the memory cost of loading each frame individually would be too inefficient. For polygon meshes you have just a single static mesh that is deformed over time.
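A back-of-the-envelope sketch of that memory argument (all byte counts here are illustrative assumptions, not measurements):

```python
def per_frame_splat_cost(num_splats, num_frames, bytes_per_splat=60):
    """Storing the full splat set for every animation frame, like a 3D video.

    ~60 bytes/splat assumes position, rotation, scale, opacity and color
    coefficients are all stored per frame.
    """
    return num_splats * bytes_per_splat * num_frames

def skinned_mesh_cost(num_verts, num_bones, num_frames,
                      bytes_per_vert=32, bytes_per_bone_pose=48):
    """Storing one static mesh plus per-frame bone poses; the mesh is
    deformed at runtime, so only the tiny skeleton is animated."""
    return (num_verts * bytes_per_vert
            + num_bones * bytes_per_bone_pose * num_frames)

# A 10-second clip at 30 fps for a character-sized asset:
splat_bytes = per_frame_splat_cost(200_000, 300)   # ~3.6 GB
mesh_bytes = skinned_mesh_cost(30_000, 60, 300)    # ~1.8 MB
```

Under these assumptions the per-frame splat representation is roughly three orders of magnitude larger, which is the gap any practical splat-animation scheme has to close.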
> Dreams managed to animate splats on the PS4. Admittedly, not quite the same type of splats, but there is probably a middle ground here where it can be made to work
I'm pretty sure Dreams only allowed animations as translations and rotations, not something that approximates soft skeletal animations. And even translations and rotations would be problematic since 3D GS scenes rely on baked lighting which would then result in objects no longer fitting the scene.
Dreams managed to animate splats on the PS4. Admittedly, not quite the same type of splats, but there is probably a middle ground here where it can be made to work
Question for those making Splats...how do you get such large environments? I've been playing around with them a bit and I'm finding I'm running out of memory with surprisingly little built even on an RTX6000. Any tips or ideas would be awesome!
DonHopkins on July 12, 2021 | on: I Stopped Using Emojis
>What we saw was, if you go too far in that [representational] direction because you want to be inclusive, people don’t see themselves represented and they’re not going to use it. You have to have enough specificity to represent you enough, but not so inclusive that your emoji palette is hundreds of thousands of emoji.
Scott McCloud wrote a whole book about this: "Understanding Comics".
>One of the book's key concepts is that of "masking," a visual style, dramatic convention, and literary technique described in the chapter on realism. It is the use of simplistic, archetypal, narrative characters, even if juxtaposed with detailed, photographic, verisimilar, spectacular backgrounds. This may function, McCloud infers, as a mask, a form of projective identification. His explanation is that a familiar and minimally detailed character allows for a stronger emotional connection and for viewers to identify more easily.
Scott McCloud and Will Wright discussed masking and other issues in their 2002 GDC discussion, "When Maps Collide":
Understanding Comics and masking influenced The Sims 1 graphics architecture and design (using detailed pre-rendered 2D+Z sprites for the environment and simplistic real-time 3D graphics for the people). That approach fortunately ran fast on the common un-accelerated 3D graphics hardware of the time, greatly expanding the user base, and synergistically enabled user-created content, which was essential to the game's success and which I described in this earlier post:
>Going 3D at that time in history meant that the quality of the graphic would take a huge hit, as well as the rendering speed, and fewer people would be able to run it because it would require a high end computer, so it was just not worth it.
>Using 2D pre-rendered sprites means that the artists can use as many polygons, rich textures and lighting techniques as they want in 3D Studio Max, and tweak them until the sprites look perfect, and that's exactly what the user sees. You just could not approach anywhere near that quality with 3D graphics at the time. Of course things are a lot different now!
>That was during the time that The Sims was also in development. One reason The Sims was successful is that it did not try to be full 3D, and ran well on low-end computers (the old computer that little sister inherits from big brother when he upgrades to a gaming machine). It used a hybrid 2D/3D system of z-buffered sprites, with an orthographic projection constrained to four rotations, three zooms, and only the characters were rendered with polygons into the pre-rendered z-buffered scene, using DirectX's software renderer.
>I developed the character animation system and content creation tools for The Sims, and when the EA executives were reviewing the technology to decide if they should buy Maxis, to justify our approach I bought them a copy of Scott McCloud's book Understanding Comics, which explained a concept called "masking" --
>Hergé's Tintin comics are a great example of how that works: The idea is that by making the background environment very realistic (i.e. rich pre-rendered sprites from high poly models), and the characters themselves more abstract (i.e. efficient real time 3d texture mapped low poly models), the readers (players) can more easily project themselves into the scene and identify with the characters. Much in the same way an abstract happy face can represent everyone, while a photograph of a person's face only represents that person.
>The other fortunate consequence was that it was easy for players to create their own characters and objects by editing the textures and sprites with 2D tools like Photoshop, without requiring difficult 3D modeling tools like 3D Studio Max, so that enabled a lot of user created content by kids instead of professional artists, which was essential to the success of the game.
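The hybrid 2D/3D compositing described in the quote above can be sketched per pixel: the pre-rendered sprite carries a baked depth value, and the real-time character sample wins only where it is nearer to the camera. A toy version with hypothetical color/depth lists:

```python
def composite(sprite_color, sprite_depth, char_color, char_depth):
    """Composite a real-time rendered character into a pre-rendered,
    z-buffered background sprite.

    Per pixel, whichever sample is closer to the camera (smaller depth)
    ends up on screen - so the character can walk behind a pre-rendered
    couch without the couch ever being rendered in real time.
    """
    out = []
    for sc, sd, cc, cd in zip(sprite_color, sprite_depth,
                              char_color, char_depth):
        out.append(cc if cd < sd else sc)
    return out
```

This is why the baked sprites needed a depth channel, not just color: without per-pixel z, the character could only ever be drawn fully in front of or fully behind the background.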
"Back in the day" people were afraid that pupils would create CS (beta 6.5) maps of their schools. Gaussian Splatting would have been very convenient for that :-)
Playing in Brave on my Moto G Power 2024 was low-fps, but as soon as I pressed the shoot button my whole screen went to purple pixel-distortion lines and I had to restart my phone.
Lowest settings? I've only just started on mobile devices; it's a lot of data, and it's running in a browser... it's a miracle it works there in its current state (on my 6-year-old iPhone it at least lets me walk and shoot). The streaming LOD is tweakable tbh; it may need to lower settings for mobiles.
Really cool. Out of curiosity, what's the per-frame cost of rendering the splat scene compared to an equivalent triangle-mesh approximation? I've been wondering when (if ever) splatting becomes the default for web-delivered 3D content vs. just a research/SIGGRAPH-paper toy. Browser support and file size feel like the two big walls.
I think splats are good for browsers since it's just one render call with a relatively simple shader - compare that to a modern traditional rendering pipeline with thousands of different passes plus post-processing, shadows, etc. Efficient splat sorting is a different story in browsers, though.
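The sorting issue, for context: splats are alpha-blended, so every frame they have to be drawn back-to-front relative to the camera. A minimal CPU sketch (real renderers use GPU radix/counting sorts to keep this fast at millions of splats):

```python
def sort_back_to_front(splat_positions, cam_pos, view_dir):
    """Return splat indices ordered far-to-near along the view direction,
    the order required for correct alpha blending.

    Depth is the projection of each splat onto the view axis, which is
    what most splat renderers sort by (cheaper than Euclidean distance).
    """
    def depth(i):
        p = splat_positions[i]
        return sum((p[k] - cam_pos[k]) * view_dir[k] for k in range(3))
    return sorted(range(len(splat_positions)), key=depth, reverse=True)
```

Doing this every frame for millions of splats, inside a browser's JS/WASM sandbox, is exactly why the sorting ends up being the hard part.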
This is a really neat bridge between “looks cool” and “feels like you’re there”. Inferring real life properties like lighting is a cool trick and just the beginning I’m sure. I’m excited to explore new and dynamic worlds and bring the AAA experience closer to something you can build yourself.
How practical would it be to include LIDAR in the initial real-world environmental scan to get (or at least seed/constrain with real data samples) an even better collision mesh?
I'm looking forward to seeing what will happen when gaussian splatting can be combined with DLSS 5. Gaussian splatting has a lot of potential in video games yet to be realised.
This for some reason reminded me of the "Killerspiele" debate [1] we had in Germany after a dramatic school shooting. The shooter had previously built a map of the school in Counter-Strike. With this tech it's not a long stretch to having a realistic map of a school... which would have earned him a better rating than the one his map actually received: "I'd like to see the school that actually has lighting like this." [2]
Hopefully this tech will never be used for something like this.
It has no dynamic lighting or effects, which makes the video look like a high quality game from 2006.