AI General Thread

Started by Legend, Dec 05, 2022, 04:35 AM


Legend

Quote from: the-Pi-guy on Today at 05:35 PMFor some reason, running local models feels goofier than it should be. And I'm not sure why. Like for some reason, prompt adherence is worse when I'm using a different Android app.

Not sure if they're formatting the requests differently, or passing different default values for the model.
Different sampling method/temperature?
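For anyone following along, those knobs all act on the model's next-token distribution before a token is drawn. A minimal sketch of how temperature, top-k, and top-p interact (the default values here are just illustrative, not what any particular app ships with):

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_k=40, top_p=0.95, rng=random):
    # 1. Temperature: scale logits (lower T -> sharper, more deterministic).
    scaled = [l / temperature for l in logits]
    # 2. Top-k: keep only the k highest-scoring token indices.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the survivors (subtract max for numerical stability).
    m = max(scaled[i] for i in order)
    exps = [(i, math.exp(scaled[i] - m)) for i in order]
    total = sum(e for _, e in exps)
    probs = [(i, e / total) for i, e in exps]
    # 3. Top-p (nucleus): keep the smallest prefix whose mass reaches top_p.
    kept, mass = [], 0.0
    for i, p in probs:
        kept.append((i, p))
        mass += p
        if mass >= top_p:
            break
    # Renormalize over what's left and draw one token index.
    total = sum(p for _, p in kept)
    r = rng.random() * total
    for i, p in kept:
        r -= p
        if r <= 0:
            return i
    return kept[-1][0]
```

So two apps passing different defaults for any of these three will sample visibly different text from the same model and prompt.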

the-pi-guy

Quote from: Legend on Today at 05:52 PMDifferent sampling method/temperature?
Most of the apps let you change the temperature, top-k, and top-p.

I'm leaning towards formatting being different, which would be harder to fix as an end user.
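To make the formatting theory concrete: instruct models are trained on one specific chat template, and an app that wraps your message differently (or not at all) will tank adherence. A toy sketch of two hypothetical wrappers (the tag style is ChatML-like; the function names are made up for illustration):

```python
def wrap_chatml(user_msg: str) -> str:
    # ChatML-style role tags, as used by several instruct-tuned models.
    return f"<|im_start|>user\n{user_msg}<|im_end|>\n<|im_start|>assistant\n"

def wrap_plain(user_msg: str) -> str:
    # Bare prompt with no role tags at all -- the model sees raw text
    # and may just continue it instead of answering.
    return user_msg + "\n"

prompt = "Summarize this article in two sentences."
print(wrap_chatml(prompt))
print(wrap_plain(prompt))
```

If one Android app sends the first form and another sends the second, the same model behaves like two different models, and there's no sampler setting an end user can tweak to fix that.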

the-pi-guy

LM Studio has so many options you can mess with. 

You can even set how the context window is managed (truncate middle, rolling).  

Legend

Quote from: the-Pi-guy on Today at 08:45 PMLM Studio has so many options you can mess with.

You can even set how the context window is managed (truncate middle, rolling). 
Is there a way to run two models in parallel so that the next token can be sampled using both their outputs? Like averaged or multiplied or whatever?
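Conceptually that's ensemble decoding: average the two per-token distributions (a mixture) or multiply them and renormalize (a product of experts). A minimal sketch, assuming both models share the same tokenizer so their vocab indices line up (that's the hard requirement in practice):

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def ensemble_sample(logits_a, logits_b, mode="average", rng=random):
    # logits_a / logits_b: next-token logits from the two models,
    # over the SAME vocabulary.
    pa, pb = softmax(logits_a), softmax(logits_b)
    if mode == "average":
        # Mixture: average the two probability distributions.
        combined = [(x + y) / 2 for x, y in zip(pa, pb)]
    else:
        # Product of experts: multiply, then renormalize.
        raw = [x * y for x, y in zip(pa, pb)]
        z = sum(raw)
        combined = [r / z for r in raw]
    # Draw one token index from the combined distribution.
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(combined):
        acc += p
        if r <= acc:
            return i
    return len(combined) - 1
```

I don't think LM Studio exposes anything like this, though; you'd need something that gives you raw logits from both models at each step, and both runs eat VRAM at the same time.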