Replies: 2 comments
- I just noticed that when I use "chat" mode in textgen-ui I get the same poor response as when I prompt the model with llama-cpp-python. I guess now I need to find out why textgen's 'default' mode works better than both 'chat' mode and native llama-cpp-python.
- Never mind... thank you for participating in my rubber ducky moment. The difference was due to the prompting. I changed the default prompt structure from my formerly trusty 'alpaca' template to just asking the question, and the model gained 50 IQ points.
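
  For anyone comparing the two, here is a minimal sketch of what that change looks like with llama-cpp-python. The model path, generation settings, and the exact Alpaca template text below are placeholders, not the values from my actual setup:

  ```python
  from llama_cpp import Llama

  # Placeholder path -- point this at whatever local model you are testing with.
  llm = Llama(model_path="./models/model.gguf")

  question = "What is the capital of France?"

  # Alpaca-style instruction template (what I had been sending by default).
  alpaca_prompt = (
      "Below is an instruction that describes a task. "
      "Write a response that appropriately completes the request.\n\n"
      f"### Instruction:\n{question}\n\n### Response:\n"
  )

  # Plain prompt: just the question, no template wrapper.
  plain_prompt = question

  # Same model, same settings -- only the prompt format differs.
  for prompt in (alpaca_prompt, plain_prompt):
      out = llm(prompt, max_tokens=128, stop=["###"])
      print(out["choices"][0]["text"].strip())
  ```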
- Well, this sucks, because my code relies on llama-cpp-python directly, not through an API, and I have an entire roadmap for building an agent system with llama-cpp-python as the provider.
  But textgen uses llama-cpp-python for some of its functionality (it's listed as a requirement), so I am confused and kind of exhausted. I am hoping there is an easy answer that would let me get the same performance from llama-cpp-python as I get through what is essentially its GUI.