Both alpaca and llama working on your computer!

Dalai runs on all of the following operating systems:

Powered by llama.cpp, llama-dl CDN, and alpaca.cpp.

Unless your computer is very, very old, it should work. According to a llama.cpp discussion thread, here are the memory requirements:

Currently the 7B and 13B models are available via alpaca.cpp.

7B: Alpaca comes fully quantized (compressed), and the only space you need for the 7B model is 4.21GB.

13B: Alpaca comes fully quantized (compressed), and the only space you need for the 13B model is 8.14GB.

You need a lot of space for storing the models. The model name must be one of: 7B, 13B, 30B, and 65B. You do NOT have to install all models; you can install them one by one.

Let's take a look at how much space each model takes up. The following numbers assume that you DO NOT touch the original model files and keep BOTH the original model files AND the quantized versions. You can optimize this if you delete the original models (which are much larger) after installation and keep only the quantized versions.

Requires that you have docker installed and running.

model: (required) The model type + model name to query.

url: only needed if connecting to a remote dalai server. If unspecified, it uses the node.js API to directly run dalai locally. If specified (for example ws://localhost:3000), it looks for a socket.io endpoint at the URL and connects to it.

threads: The number of threads to use (the default is 8 if unspecified).

n_predict: The number of tokens to return (the default is 128 if unspecified).

skip_end: By default, every session ends with \n\n, which can be used as a marker to know when the full response has returned. However, sometimes you may not want this suffix. Set skip_end: true and the response will no longer end with \n\n.
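The request options above (model, url, threads, n_predict, skip_end) can be sketched as a small helper that fills in the documented defaults and uses the \n\n suffix as an end-of-response marker. This is a hedged illustration, not the library's actual API: the helper names buildRequest and isComplete are assumptions, and only the field names and defaults come from the text.

```javascript
// Sketch of the documented request options; buildRequest and isComplete
// are hypothetical helpers, not part of dalai itself.
function buildRequest(opts) {
  if (!opts.model) {
    // model is the only required field (model type + model name, e.g. "alpaca.7B")
    throw new Error("model is required");
  }
  return {
    model: opts.model,
    url: opts.url,                    // e.g. "ws://localhost:3000"; undefined => run dalai locally
    threads: opts.threads ?? 8,       // documented default: 8
    n_predict: opts.n_predict ?? 128, // documented default: 128
    skip_end: opts.skip_end ?? false, // by default the response ends with "\n\n"
  };
}

// Unless skip_end is set, the trailing "\n\n" marks a complete response.
function isComplete(response, skip_end) {
  return skip_end ? true : response.endsWith("\n\n");
}

const req = buildRequest({ model: "alpaca.7B" });
console.log(req.threads, req.n_predict); // 8 128
```

With skip_end: true you would skip the suffix check entirely and rely on the connection closing (or some other signal) to know the response is done.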