💻-coders 2024-10-31

Summary

In the technical discussions, Ophiuchus cloned the okai repository from GitHub and installed the necessary dependencies, including onnxruntime-node@1.19.0, setting up ollama instead of llama-cpp for running local models. The community considered using a smaller 1 billion parameter model to avoid overloading their hardware, since larger models like the 70 billion parameter one had caused issues in the past. Ophiuchus reported having less trouble with ollama and suggested setting an environment variable pointing to a remote instance so models can run on RunPod. Big Dookie shared that they were using a llama2 7b model but planned to upgrade it, aiming to improve audiovisual input prompt responses by scraping relevant sentences from transcripts. The community acknowledged the innovative approach and agreed to start with text replies before expanding into more complex functionality.

FAQ

  • What is the process to install OKai with CUDA support?

    • Ophiuchus: To install OKai with CUDA support, clone the repository from GitHub with git clone https://github.com/okcashpro/okai.git, navigate into the cloned directory, run npm install --include=optional sharp, then npm install onnxruntime-node@1.19.0, and finally start OKai with npm start. This setup lets OKai use CUDA for GPU acceleration when running models (the full command sequence is collected after this list).

  • Is there a significant advantage to using Ollama over Llama-cpp?

    • ferric | stakeware.xyz: The main advantage of Ollama over Llama-cpp appears to be better performance and compatibility, especially on M1 chips. Running larger models like the 70b model can freeze an M1 machine, which suggests preferring smaller models or alternatives such as running them in a RunPod environment.
  • How can I set up Ollama to use remote URLs and start it?

    • Ophiuchus: To use a remote Ollama instance, first make sure your environment variable points to the desired URL by setting remote_ollama_url in your configuration, then start Ollama on that host. This setup lets your projects or characters use a remote Ollama rather than a local one (a sketch of this setup follows the list).
  • What are some considerations when using larger models like Llama 7b?

    • big dookie: When working with larger models such as Llama 2 7b, evaluate the model's performance and compatibility with your current setup. For instance, you might need to adjust how relevant sentences are scraped from transcripts, or experiment with audiovisual input prompts so responses stay relevant to the conversation. It's also important to manage scope creep by starting with text replies before moving on to more complex functionality.
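
For reference, the installation steps from the first answer above, collected into one command sequence. The cd okai step is an assumption based on the repository URL; the other commands are as given in the discussion.

    git clone https://github.com/okcashpro/okai.git
    cd okai                                # directory name assumed from the repository URL
    npm install --include=optional sharp   # dependencies, including the optional sharp package
    npm install onnxruntime-node@1.19.0    # version pinned as mentioned in the discussion
    npm start                              # start OKai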

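A minimal sketch of the remote Ollama setup described above. The variable name remote_ollama_url is taken from the discussion; the exact key your configuration expects, the host URL, and the port are assumptions to verify against your environment (11434 is Ollama's default port, and ollama serve is the standard command to start the Ollama server).

    # On the remote host (e.g. a RunPod instance), start the Ollama server
    ollama serve

    # Locally, point the agent at that host, then start OKai as usual
    export remote_ollama_url="http://your-runpod-host:11434"   # hypothetical URL; substitute your pod's address
    npm start
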
Who Helped Who

  • Ophiuchus helped ferric | stakeware.xyz set up a local model by suggesting ollama instead of llama-cpp and providing instructions on how to clone the repository, install dependencies, and run it in RunPod for better performance.
  • Whobody encouraged big dookie's experiments with different models (7b) for transcript scraping and audiovisual input prompt responses, acknowledging the potential scope creep but advising them to focus on text replies first.

Action Items

  • Technical Tasks

    • Clone the okai repository and install dependencies, including optional sharp package (mentioned by Ophiuchus)

    • Install onnxruntime-node@1.19.0 (mentioned by Ophiuchus)

    • Experiment with running models in RunPod and setting the remote ollama URL environment variable (suggested by Ophiuchus)

  • Documentation Needs

    • Update the README installation instructions to use ollama instead of llama-cpp (requested by Ophiuchus)
  • Feature Requests

    • Explore using audiovisual input prompts and scraped transcript sentences to drive responses, with larger models like 7b or higher (suggested by big dookie)
  • Community Tasks

    • Share experiences and progress on using ollama for local model implementations within the community (led by Ophiuchus, whobody, and big dookie)