Abusing DuckDB-WASM by making SQL draw 3D graphics (Sort Of)

183 points by tanelpoder 19 hours ago

pjot 16 hours ago

Author here, wild to see this at the top of HN!

You can play it here: https://patricktrainer.github.io/duckdb-doom/

Pressing “L” enables (very) verbose logging in the dev console and prints much of the sql being executed.

adornKey 18 hours ago

Finally somebody did it! Back in the day my attempts to write a game in SQL were thwarted by buggy query-optimizers. They cached my calls to rand() way too often although documentation promised not to do that.

dspillett 17 hours ago
> They cached my calls to rand() way too often although documentation promised not to do that.
For some DBs (SQL Server definitely), RAND() and similar are handled as if they are deterministic and so are called once per use. For instance:
```
    SELECT TOP 10 RAND() FROM sys.objects
    SELECT TOP 10 RAND() FROM sys.objects
```
just returned ten lots of 0.680862566387624 and ten lots of 0.157039657790194.
```
    SELECT TOP 10 RAND(), RAND(), RAND()-RAND() FROM sys.objects
```
returns a different value for each column (0.451758385842036 & 0.0652620609942665, -0.536618123021777), so the optimisation is per use not per statement or even per column (if it were per column that last value would be 0, or as close to as floating point arithmetic oddities allow).
This surprises a lot of people when they try “… ORDER BY RAND()” and get the same order on each run.
One workaround for this is to use a non-deterministic function like NEWID(), though you need some extra jiggery-pokery to get a 0≤v<1 value to mimic rand:
```
    SELECT TOP 10 CAST(CAST(CAST(NEWID() AS VARBINARY(4)) AS BIGINT) AS FLOAT)/(4.0*1024*1024*1024) FROM sys.objects
```
For the example of sorting, the outer cast is not needed. You might think just using “ORDER BY NEWID()” would be sufficient, but that is an undefined behaviour so you shouldn't rely upon it. It might work now, a quick test has just worked as expected here, but at any point the optimiser could decide it is more efficient to consider all UUIDs as having the same weight for sorting purposes.

enescakir 18 hours ago

Like running Doom on a printer, but now it’s in the same engine powering your BI dashboards. Peak 2025 energy.

dkga 18 hours ago

Very interesting!

You know it gets wild when you read "... Here's the core of the raycasting algorithm in SQL"!

nonethewiser 18 hours ago

Given the first post in the blog says "not made by a [ROBOT EMOJI]", should I assume this one which does not have this message, is made by a [ROBOT EMOJI]?

https://www.hey.earth/posts

NitpickLawyer 18 hours ago

I swear we're gonna start seeing disclaimers like "100% handcrafted code, our devs eat only grass-fed beef, free-range devops teams, specialty coffee sustained QA department, no IDEs, no intelisense, we code in notepad++" soon...
- andhuman 16 hours ago
  
  100% organic!
pjot 16 hours ago

Ha! I made this. I’m not a robot either :)
bstsb 17 hours ago

it's the footer at the bottom of all pages. it's also present on the blog pages
marcellus23 17 hours ago

This one also has that same footer.

zxilly 6 hours ago

You can use requestAnimationFrame to prevent race in the renderer

robertsdionne 16 hours ago

https://x.com/geocucu_t/status/1909291486367166717

r3tr0 16 hours ago

We use duck db wasm to make live system performance dashboards based on eBPF.

It really is magic!

You can check it out here.

https://yeet.cx/play

dndn1 16 hours ago

Neat UI, are you using a library for that?
- r3tr0 13 hours ago
  
  nope. everything off shelf was too slow.

robertclaus 18 hours ago

This is great! I did a similar project a while back to do image processing in a SQL database with pixels being individual records. It's amazing what SQL can do with the right table structures.

xnx 18 hours ago

Impressive project, the subhead might attract even more attention: "Building a SQL-Powered Doom Clone in the Browser"

karmakaze 16 hours ago

I'd like to see something like this done in SpacetimeDB which was made specifically for game backends. I haven't looked into it yet, only seen the 1.0 announcement on HN and in my YT feed, and curious how its feature set makes this sort of thing easier or more natural.

datadrivenangel 18 hours ago

Abusing databases is the way. I'm building a SQLite agent using triggers.

intalentive 17 hours ago

I’ve been using triggers as FTS boilerplate for so long I didn’t pause to think that they’re just event listeners. Say more about your project..?

vd2287 10 hours ago

I'm a bit new to this stuff, but SQL and 3D GRAPHICS???

gitroom 17 hours ago

i think this is super wild honestly, cant believe sql is doing graphics now

cess11 18 hours ago

Nice project. Reminds me of one of my favourite demos, a MySQL raytracer:

https://www.pouet.net/prod.php?which=83222

kevingadd 13 hours ago

> But because tick() and render() involved async database calls, sometimes a new interval would fire before the previous one finished.

This is a tricky one when writing games using async APIs. The game I've been working on is written in C# but I occasionally hit the same issue when game code ends up needing async, where I have to carefully ensure that I don't kick off two asynchronous operations at once if they're going to interact with the same game state. In the old days all the APIs you're using would have been synchronous, but these days lots of libraries use async/await and/or promises and it kind of infects all the code around it.

It does depend on the sort of game you're building though. Some games end up naturally having a single 'main loop' you spend most of your time in, i.e. Doom where you spend all your time either navigating around the game world or looking at a pause/end-of-stage menu - in that case you can basically just have an is_menu_open bool in your update and draw routines, and if you load all your assets during your loading screen(s), nothing ever needs to be async.

Other games are more modal, and might have a dozen different menus/scenes (if not hundreds), i.e. something like Skyrim. And sometimes you have modals that can appear in multiple scenarios, like a settings menu, so you need to be able to start a modal loop in different contexts. You might have the player in a conversation with an NPC, and then during the conversation you show a popup menu asking them to choose what to say to the NPC, and they decide while the conversation menu is open they want to consult the conversation log, so you're opening a modal on top of a modal, and any modal might need to load some assets asynchronously before it appears...

In the old days you could solve a lot of this by starting a new main loop inside of the current one that would exit when the modal went away. Win32 modal dialogs work this way, for example (which can cause unpleasant re-entrant execution surprises if you trigger a modal in the wrong place). I'm still uncertain whether async/await is a good modern replacement for it.

mritchie712 18 hours ago

This is a DuckDB feature that's incredibly hard for Snowflake (or anyone else) to copy. Running the same database client-side (WASM) and server-side can make for a pretty magical experience.

Queries that normally take 1s to 2s can run in 25ms, so you get under the "100ms rule" which is very uncommon in analytics applications.

We DuckDB server side and have experimental support for DuckDB WASM on the client-side at https://www.definite.app/ and sometimes I don't trust that a query ran because of how fast it can happen (we need some UX work there).

esafak 18 hours ago

How does that work? Does the client clone the database at the beginning of the session and work with a shapshot? If so, do you automatically and periodically sync it?
- randomtoast 18 hours ago
  
  With HTTP Range Requests, which is typically used for pausing and resuming large file downloads, to request specific byte ranges from the file. This allows you to retrieve only the data you need. With SQL indexes, the data returned will be minimal because the lookup is optimized. However, if you select *, you will still end up downloading the entire database.
  
  jasonjmcghee 17 hours ago
  
  Parent comment isn't asking how data is requested from the back-end.
  GP comment is (seemingly) describing keeping an entirely client side instance (data stored locally / in memory) snapshot of the back-end database.
  Parent comment is asking how the two are kept in sync.
  It's hard to believe it would be the method you're describing and take 25ms.
  If you're doing http range requests, that suggests you're reading from a file which means object storage or disk.
  I have to assume there is something getting triggered when back end is updating to tell the client to update their instance. (Which very well could just be telling it to execute some sql to get the new / updated information it needs)
  Or the data is entirely in memory on the back end in an in memory duckdb instance with the latest data and just needs to retrieve it / return it from memory.
  
  immibis 15 hours ago
  
  Doesn't that mean you have way more round-trips than necessary? Instead of asking for the row, you ask for the file header, the list of tables and indices, an index page, another index page, another index page, and a table page?
- mritchie712 15 hours ago
  
  Yes, we're still fine-tuning exactly what we cache, but a simple example would be:
  1. user writes a `select` statement that return 20k records. We cache the 20k.
  2. user can now query the results of #1
  we're also working on more complex cases (e.g. caching frequently used tables).