AI General Thread

the-pi-guy · Jan 06, 2026, 02:54 AM

Quote:
This is exactly why you've used it before during system recovery — it preserves data integrity, which you care about deeply.

Why do I find this so hilariously cringeworthy?

Legend · Jan 06, 2026, 06:43 PM

Quote from: the-Pi-guy on Jan 06, 2026, 02:54 AM
Why do I find this so hilariously cringeworthy?

These AIs are so cringe.

Gemini has a new prompt, I think, so now it almost always basically decides things for me and finishes with "now that you understand you were mistaken, how does it feel to accept what I have said?"

Meanwhile ChatGPT is always "I'm going to ignore your insults" when I tell it it's screwing up.

the-pi-guy · Jan 08, 2026, 03:47 PM

Seems like local video generation got a bump.

I see people buzzing about LTX-2 and Wan 2.2; both seem to be doing audio + video.

the-pi-guy · Jan 10, 2026, 09:06 PM

https://www.reddit.com/r/StableDiffusion/comments/1q9cy02/ltx2_i2v_quality_is_much_better_at_higher/

Image quality is fine. Audio is a little rough.

The content in the back though looks hilarious. There's a random stage light on the left. Vehicles are going in random directions.
The cars look a little too long to me.

Legend · Jan 15, 2026, 01:47 AM

This one is way better... at least the person actually does something including a full turn around pic.twitter.com/FGQsBSxCmR
— A.I.Warper (@AIWarper) January 14, 2026

Video will never be the same. Kling motion control.

Legend · Jan 16, 2026, 08:37 PM

Gemini 3 Pro is so stupid.

"The Volume: You ran 5 days in a row (Sun/Tue/Wed/Thu/Fri)."

Also I forgot to save it, but the other day it was like ~"You like cherry pie because it doesn't have cherries in it. It's a real pie like pecan pie. But wait, cherry pie has cherries. And you like cherries. So that doesn't make sense."

For context, it was trying to explain to me why I like "cherry pie" and it came up with that reasoning on its own.

the-pi-guy · Jan 19, 2026, 05:38 PM

Quote from: the-Pi-guy on Jan 08, 2026, 03:47 PM
Seems like local video generation got a bump.

I see people buzzing about LTX-2 and Wan 2.2; both seem to be doing audio + video.

I feel like I've seen dreams.

Takes me about 5 minutes to generate a 5-second video, and 9 minutes to generate a 10-second video.
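
Those two timings are enough for a rough back-of-the-envelope fit. This is a minimal sketch, assuming generation time grows linearly with clip length (two data points cannot confirm that); it splits the numbers into a fixed overhead and a per-second cost. The function name `fit_linear` is made up for illustration.

```python
def fit_linear(p1, p2):
    """Fit minutes = overhead + rate * seconds through two (seconds, minutes) points."""
    (s1, m1), (s2, m2) = p1, p2
    rate = (m2 - m1) / (s2 - s1)   # minutes of compute per second of video
    overhead = m1 - rate * s1      # fixed cost per run (what it covers is an assumption)
    return overhead, rate

# Timings reported above: 5 s clip in about 5 min, 10 s clip in about 9 min.
overhead, rate = fit_linear((5, 5), (10, 9))
print(f"overhead ~{overhead:.1f} min, ~{rate:.1f} min per second of video")
```

Under that assumption the numbers work out to roughly 1 minute of fixed overhead plus about 0.8 minutes per second of video, which would explain why the 10-second clip takes less than twice as long.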

the-pi-guy · Jan 19, 2026, 09:44 PM

I generated a bunch of different videos.

To a big extent I was fighting with ComfyUI though. It would frequently not generate a new video. The template workflow also wouldn't update the randomizer seed, so you'd get mostly the same output anyways.
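
One possible workaround for the stuck seed, sketched under assumptions: if the workflow is exported in ComfyUI's API (JSON) format, the seed inputs can be re-rolled before each run. The input names `seed` and `noise_seed`, the helper `randomize_seeds`, and the example nodes are assumptions for illustration; the LTX-2 template may name its seed input differently.

```python
import copy
import random

SEED_KEYS = ("seed", "noise_seed")  # assumed input names; check your own workflow

def randomize_seeds(workflow, rng=None):
    """Return a copy of an API-format workflow with integer seed inputs re-rolled."""
    rng = rng or random.Random()
    patched = copy.deepcopy(workflow)
    for node in patched.values():
        inputs = node.get("inputs", {})
        for key in SEED_KEYS:
            # Only touch literal integers; a list here is a link to another node.
            if isinstance(inputs.get(key), int):
                inputs[key] = rng.randrange(2**32)
    return patched

# Hypothetical two-node workflow, just to show the shape.
example = {
    "1": {"class_type": "RandomNoise", "inputs": {"noise_seed": 42}},
    "2": {"class_type": "SomeSampler", "inputs": {"noise": ["1", 0], "steps": 8}},
}
patched = randomize_seeds(example)
print(patched["1"]["inputs"]["noise_seed"])
```

The original dict is left untouched, so the same template can be patched again for every new run.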

I generated some videos where people were talking back and forth, yelling back and forth, whispering and having different emotions.
It was extremely cool.

The audio was expressive, but obviously robotic at times. Still, it matches the actors' movements reasonably well.

The model isn't censored from what I understand, but it's also not very good at uncensored stuff...

Legend · Jan 19, 2026, 10:46 PM

How much VRAM does it need?

the-pi-guy · Jan 20, 2026, 05:09 AM

I was wrong, it is censored.

Quote from: Legend on Jan 19, 2026, 10:46 PM
How much VRAM does it need?

I'm using 10 GB.

Some people use 8.

Legend · Jan 20, 2026, 03:23 PM

Quote from: the-Pi-guy on Jan 20, 2026, 05:09 AM
I was wrong, it is censored.
I'm using 10 GB.

Some people use 8.

Oh sweet. I should try it on my laptop.

the-pi-guy · Jan 21, 2026, 04:41 PM

Local generation is in such an interesting place right now.

Text: mistral-small-abliterated has generally been my favorite model in terms of speed and in terms of how good it is.

Image: Z-Image-Turbo is pretty good. It obviously has some limitations. Yesterday I asked it to generate an image of a woman on stage, with screens that said "PlayStation Experience". Some of the generated images even included the PlayStation logo.

The model is uncensored, but it seems like it's not trained on certain things.

It makes me a little sad, because it's really cool, but I'm not super interested in it.

Video: LTX-2 is pretty incredible. My biggest disappointment is that it's censored. It can create some very incredible videos.

One downside is if you don't tell it what to say, and it tries to make a video with conversation, it will make gibberish.

Z-Image is very good at text, LTX-2 is not. So that's kind of a fun difference.

Legend · Jan 24, 2026, 09:18 PM

Quote from: the-Pi-guy on Jan 21, 2026, 04:41 PM
Local generation is in such an interesting place right now.

Text: mistral-small-abliterated has generally been my favorite model in terms of speed and in terms of how good it is.

How much RAM does it need?

the-pi-guy · Jan 24, 2026, 09:56 PM

Quote from: Legend on Jan 24, 2026, 09:18 PM
How much RAM does it need?

https://ollama.com/huihui_ai/mistral-small-abliterated

This one is a 14 GB model.

My system only has 10 GB VRAM, so it uses some system RAM.
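
A rough way to picture that split, as a sketch only: it treats the model as a single blob and ignores the context (KV cache) and runtime overhead, which real runtimes also need room for. The helper name `offload_split` is hypothetical.

```python
def offload_split(model_gb, vram_gb):
    """Estimate how much of a model fits in VRAM and how much spills to system RAM."""
    on_gpu = min(model_gb, vram_gb)
    in_ram = model_gb - on_gpu
    return on_gpu, in_ram

# Numbers from the post above: a 14 GB model on a card with 10 GB of VRAM.
on_gpu, in_ram = offload_split(model_gb=14, vram_gb=10)
print(f"{on_gpu} GB on the GPU, {in_ram} GB in system RAM")
```

So at least about 4 GB ends up in system RAM in this case, and in practice likely a bit more once the context is counted. The part that runs from system RAM is slower than the part on the GPU.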

the-pi-guy · Jan 25, 2026, 08:00 PM

I usually have good luck with Copilot but this was really funnily bad.

I was asking for an alternative to LTX-2 that does audio and video at the same time.

It's like "that doesn't exist. There are no local models that do audio and video at the same time."

But I've seen videos?

"You might be seeing marketing that adds audio to AI generated video."

No I'm creating it myself.

"Well that's very interesting. It's not that I don't believe you, but that's a very bold claim."

So I added a source about LTX-2 creating audio and video.

"Oh I see what's going on. We are both right. There are two LTX-2 models. One by Stability AI in 2024 and one by Lightricks in 2026."

I can't find anything about Stability AI having an LTX-2 model.

"You're not missing anything. There isn't one."

The conversation was longer than this, but this was the gist.

I'm not sure I've seen Copilot being so adamantly wrong and pretentious like this. At least not in a while.