The T5 encoder is ~5B parameters, so a back-of-the-envelope estimate is ~10 GB of VRAM (it's in bfloat16). So 360p should take ~15 GB of RAM (+/- a few GB depending on the duration of the video generated).
We can update the code over the next day or two to provide the option to delete the VAE after the text encoding is computed (to save on RAM), and then report back the GB consumed for 360p and 720p at 2-5 seconds on GitHub so there are more accurate numbers.
Beyond the 10 GB from the T5, there's just a lot of VRAM taken up by the context window of 720p video (even though the model itself is 2B parameters).
The 5B text encoder feels disproportionate for a 2B video model. If the text portion is dominating your VRAM usage, it really hurts the inference economics.
Have you tried quantizing the T5? In my experience you can usually run these encoders in 8-bit or even 4-bit with negligible quality loss. Dropping that memory footprint would make this much more viable for consumer hardware.
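As a rough sketch of the arithmetic (counting weights only and assuming a ~5B-parameter encoder; activations and context memory come on top, so these are lower bounds, not measurements):

```python
# Back-of-the-envelope weight memory for a ~5B-parameter encoder.
PARAMS = 5e9

def weight_gb(bits_per_param: float) -> float:
    """Memory in GB for the model weights alone at a given precision."""
    return PARAMS * bits_per_param / 8 / 1e9

print(weight_gb(16))  # bfloat16: ~10 GB (the figure quoted above)
print(weight_gb(8))   # int8:     ~5 GB
print(weight_gb(4))   # 4-bit:    ~2.5 GB
```

Even if 8-bit costs a point or two of text fidelity, halving the encoder's footprint changes what consumer cards can run.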
Nice! I built a similar system in the past using a servo-controlled Traxxas buggy with an LTE hat, which let us do open-space driving. Latency (over the internet) was still a challenge, and finding cameras and lenses that performed well across varying lighting conditions turned out to be a bit of a pain, but it was pretty fun stuff.
I've been using raylib for years to power generative digital paintings on embedded systems (RPi and the like). I've been really impressed with its performance and accessible API. Plus, it's a very active and welcoming open-source project; kudos to the maintainer.
I've been using raylib for years now to implement digital signage art, and it's been a pleasure to work with, especially thanks to its excellent multi-platform support (I've used many Raspberry Pis). Really well-thought-out, intuitive API; kudos to the author.
Amazing to think they were actually spawning 150k headless browsers to simulate the traffic. That sounds like throwing money at the problem and it probably worked (for a while anyway).
Having built a load-test tool as well, I can say that making it realistic enough, and keeping it that way, is possibly the hardest challenge. The maintenance cost is high, especially in a feature-focused environment.
The new tool seems like an early version as well, with pretty basic functionality.
In the example where it is supposed to be "viewing a message, marking the message as read, and finally calling reactions.add", it doesn't really do those things as a real chain. There's just a 5-second delay after "view a message", then "mark message as read" runs, then a 60-second delay, then a call to reactions.add. I'm not sure that mimics real end-user behavior terribly well.
It seems like they could have used JMeter rather than building a home-grown WebSocket test client. Perhaps there's some requirement that existing tools don't handle well.
For those who haven't read the article yet: this story is about stopping the money-throwing and switching to a more scalable (i.e., cheaper) solution.
It's kind of interesting to see them choose a rather "declarative" (that is, JSON-centric) approach instead of adopting a small language like Lua for scenario-based scripting.
Maybe the declarative approach is better suited to auto-generation from the user stats data they described? After all, there are often fewer people who like writing stress tests than writing the features that should be stress-tested.
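A declarative scenario in that style might look something like the following. This is a hypothetical sketch, not the actual schema of their tool; the action and field names are made up:

```json
{
  "scenario": "read-and-react",
  "steps": [
    { "action": "view_message" },
    { "wait_seconds": 5 },
    { "action": "mark_message_read" },
    { "wait_seconds": 60 },
    { "action": "reactions.add" }
  ]
}
```

The appeal is that a structure like this can be emitted mechanically from aggregated usage stats, whereas Lua scripts would need a human in the loop for every behavior change.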
This is great and overdue. Hopefully all major browsers will add some support for open source/royalty free codecs.
Emscripten/WebAssembly actually worked rather well for audio (Opus is just awesome), but when it comes to video it's just infeasible, especially if you are looking at doing low-latency streaming. That said, I cannot fail to mention the incredible effort by ogv.js [1] to make A/V decoding possible almost anywhere.
At Mattermost we went for the do-it-yourself option and wrote a custom tool for the job [1]. After a lot of research into all the existing open-source frameworks, we couldn't really find anything that fit our use case. We are quite happy with the result, although, as the OP mentioned, there's a significant maintenance cost attached: as new features get implemented and more API calls are added, you need to go back and make sure the logic defining your user behaviour stays in sync with the real world.
If I were to do it all over again, I'd probably give k6 [2] a chance, but I'm still convinced a tailored solution was the best choice.
Yes, they are. Technically, though, it's not mixed: all arriving audio data is dumped into a single buffer. Mixing would now be a great way to get rid of the stream's choppiness.
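The difference is roughly this (a sketch assuming signed 16-bit PCM streams of equal length, which may not match the actual buffer format):

```python
def mix_streams(streams):
    """Mix several equal-length int16 PCM streams by summing them
    sample-wise and clamping to the int16 range, instead of dumping
    each stream into the output buffer one after another."""
    mixed = []
    for samples in zip(*streams):
        total = sum(samples)
        # Clamp to avoid integer wrap-around when streams add up loud.
        mixed.append(max(-32768, min(32767, total)))
    return mixed

# Two one-sample streams that would overflow int16 if summed naively:
print(mix_streams([[30000], [10000]]))  # -> [32767]
```

In practice you'd probably also attenuate or soft-clip rather than hard-clamp, or many simultaneous players will distort badly.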
Now that so many users are playing simultaneously, I'm a bit annoyed that some leave the website open and stream silence.
I'm wondering how I could give each user a fair slot to perform now...
Anyways, the stream has become a great source of entropy now :D
Lack of ambient light and atmospheric attenuation. Significantly more direct light vs indirect light.
If you fly at 35,000 ft, the horizon is about 221.3 miles away, and most of that sight line passes through dense air. If you look straight down from the ISS, there is less than ten miles of thick atmosphere between the camera and the target.
If you ray trace a scene with a single light source, few objects, and no effects that simulate the atmosphere, you are effectively simulating how the scene looks in a vacuum.
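The horizon figures above can be sanity-checked with the geometric horizon formula (ignoring atmospheric refraction, and assuming a spherical Earth and approximate altitudes):

```python
import math

EARTH_RADIUS_M = 6_371_000

def horizon_distance_m(height_m):
    """Straight-line distance to the geometric horizon for an observer
    at height_m above a spherical Earth, with no refraction."""
    return math.sqrt(2 * EARTH_RADIUS_M * height_m + height_m ** 2)

cruise = horizon_distance_m(35_000 * 0.3048)  # ~369 km (~229 mi)
iss = horizon_distance_m(400_000)             # ~2,290 km
```

This gives roughly 229 miles at cruise altitude, the same ballpark as the ~221-mile figure above, and shows how much more of the ISS view is empty space rather than thick air.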
I suspect another contributing factor is that the setting here looks more like what you usually see in computer graphics than in real life: very few moving parts.
In real life there are insects and birds moving around, wind blowing all sorts of things (leaves, blades of grass, trash, etc.), individual strands of hair, and so on. All things we can't really reproduce with graphics.
Here there is just a sphere with a surface texture and some volumetric effects.