1. A case study of designing and fine-tuning the Llama 3 model to support function calling

    By Michael Hu
    July 6, 2024 11:25 pm
    20 min read

    This post discusses how we designed and fine-tuned the Llama 3 model to support function calling, a capability with immense potential for adapting LLMs to real-world business cases. We also provide Llama3-FunctionCalling, an open-source project that anyone can access.
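
    As a hypothetical illustration (the exact schema used by Llama3-FunctionCalling may differ), a function-calling model emits a structured call instead of free-form text, which the application then parses and executes:

    ```python
    import json

    # Hypothetical tool schema; the real project's format may differ.
    get_weather_tool = {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }

    # Suppose the fine-tuned model responds with a JSON function call.
    model_output = '{"function": "get_weather", "arguments": {"city": "Berlin"}}'

    call = json.loads(model_output)
    assert call["function"] == get_weather_tool["name"]
    print(f"Calling {call['function']} with {call['arguments']}")
    ```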

  2. A simple and clean implementation of Retrieval Augmented Generation (RAG) with open-source embedding models and LLaMA

    By Michael Hu
    March 28, 2024 2:43 pm
    15 min read

    In this post, we delve into how Retrieval Augmented Generation (RAG) works and introduce RAG-LLaMA, an open-source project that showcases a straightforward RAG implementation built on open-source embedding models and the LLaMA chat model. We build a simple chatbot that answers questions about Tesla cars, demonstrating RAG's potential for private knowledge-base tasks.
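
    To make the retrieval step concrete, here is a minimal, self-contained sketch of the RAG loop. The `embed` function is a placeholder standing in for an open-source embedding model; this is illustrative, not the actual RAG-LLaMA API:

    ```python
    import numpy as np

    # Placeholder embedding function; a real pipeline would call an
    # open-source embedding model here (this is not the RAG-LLaMA API).
    def embed(texts: list[str]) -> np.ndarray:
        return np.stack([
            np.random.default_rng(abs(hash(t)) % 2**32).normal(size=384)
            for t in texts
        ])

    documents = [
        "The Tesla Model 3 supports DC fast charging.",
        "Tire pressure should be checked monthly.",
    ]
    doc_vecs = embed(documents)
    doc_vecs /= np.linalg.norm(doc_vecs, axis=1, keepdims=True)

    def retrieve(question: str, k: int = 1) -> list[str]:
        q = embed([question])[0]
        q /= np.linalg.norm(q)
        scores = doc_vecs @ q  # cosine similarity against every document
        return [documents[i] for i in np.argsort(scores)[::-1][:k]]

    question = "How often should I check tire pressure?"
    context = "\n".join(retrieve(question))
    # The assembled prompt is then passed to the LLaMA chat model to
    # generate an answer grounded in the retrieved context.
    prompt = f"Use the context to answer.\nContext:\n{context}\nQuestion: {question}"
    print(prompt)
    ```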

  3. A clean PyTorch implementation of Direct Preference Optimization (DPO) to fine-tune LLaMA models

    By Michael Hu
    March 11, 2024 4:10 pm
    2 min read

    Introducing DPO-LLaMA. This open-source project features a clean PyTorch implementation of Direct Preference Optimization (DPO) to fine-tune LLaMA models to follow human preferences.
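
    The core of DPO fits in a few lines. Below is a minimal sketch of the DPO loss from the Rafailov et al. paper, assuming per-sequence log-probabilities summed over response tokens; it is illustrative, not the exact DPO-LLaMA code:

    ```python
    import torch
    import torch.nn.functional as F

    def dpo_loss(policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps, beta=0.1):
        # Log-ratios of the trainable policy vs. the frozen reference model.
        chosen_ratio = policy_chosen_logps - ref_chosen_logps
        rejected_ratio = policy_rejected_logps - ref_rejected_logps
        # Maximize the margin between preferred and rejected responses.
        logits = beta * (chosen_ratio - rejected_ratio)
        return -F.logsigmoid(logits).mean()

    # Toy usage with random per-sequence log-probabilities.
    lp = [torch.randn(4) for _ in range(4)]
    print(dpo_loss(*lp))
    ```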

  4. 4-bit Quantized Low-Rank Adaptation (LoRA) LLM fine-tuning decoupled from Hugging Face

    By Michael Hu
    December 28, 2023 9:30 pm
    10 min read

    In this post, we discuss quantized LoRA and introduce QLoRA-LLM, my new open-source project. It showcases a straightforward custom implementation of Quantized LoRA (QLoRA) for fine-tuning a large language model (LLM), using fundamental tools like PyTorch and Bitsandbytes, independent of Hugging Face. Such a tailored implementation of QLoRA proves highly beneficial for fine-tuning a personalized LLM.
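
    To give a flavor of the LoRA side, here is a minimal sketch of a LoRA wrapper around a frozen linear layer; in QLoRA, the frozen base weights would additionally be stored in 4-bit via Bitsandbytes. This is illustrative, not the actual QLoRA-LLM code:

    ```python
    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False  # freeze the pre-trained weights
            # B is zero-initialized so the update starts as a no-op.
            self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
            self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
            self.scale = alpha / rank

        def forward(self, x):
            # Frozen base output plus the trainable low-rank update B @ A.
            return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scale

    layer = LoRALinear(nn.Linear(512, 512))
    print(layer(torch.randn(2, 512)).shape)  # torch.Size([2, 512])
    ```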

  5. A PyTorch implementation of OpenAI's InstructGPT paper to train and fine-tune LLaMA models to align with human preferences, with support for k-bit quantization and Low-Rank Adaptation (LoRA) fine-tuning

    By Michael Hu
    September 17, 2023 8:00 pm
    10 min read

    In this post, we discuss how reinforcement learning from human feedback (RLHF) works, and we introduce InstructLLaMA. This open-source project showcases a PyTorch implementation of OpenAI's InstructGPT paper, completely independent of third-party tools like Hugging Face, with Meta's LLaMA pre-trained model substituted for GPT. InstructLLaMA covers every stage, from dataset preparation and pre-training to supervised fine-tuning (SFT) and RLHF, training and fine-tuning the LLaMA 2 model to follow human instructions, akin to InstructGPT or ChatGPT but on a smaller scale.
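
    As one concrete piece of the RLHF pipeline, below is a minimal sketch of the pairwise reward-model loss from the InstructGPT paper: the reward model is trained to score the human-preferred response above the rejected one. Shapes are illustrative, not InstructLLaMA's exact code:

    ```python
    import torch
    import torch.nn.functional as F

    def reward_model_loss(chosen_rewards: torch.Tensor,
                          rejected_rewards: torch.Tensor) -> torch.Tensor:
        # Pairwise ranking loss: push the preferred reward above the rejected one.
        return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

    # Toy usage: scalar rewards for a batch of preference pairs.
    chosen = torch.tensor([1.2, 0.7, 0.3])
    rejected = torch.tensor([0.1, 0.9, -0.4])
    print(reward_model_loss(chosen, rejected))
    ```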

  6. A PyTorch implementation of pre-training and fine-tuning scripts to train and fine-tune GPT-2 models

    By Michael Hu
    July 17, 2023 10:00 pm
    2 min read

    Introducing MiniGPT. This open-source project features a PyTorch implementation of OpenAI's GPT-2 model. It supports dataset preparation, pre-training, fine-tuning, and distributed training with PyTorch FSDP.
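
    As a minimal sketch of the FSDP piece (assuming launch via torchrun, which sets the process-group environment variables; this is illustrative, not MiniGPT's actual training script):

    ```python
    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Assumes `torchrun --nproc_per_node=N train.py` so that rank/world-size
    # environment variables are already set.
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

    model = torch.nn.TransformerEncoderLayer(d_model=256, nhead=4).cuda()
    model = FSDP(model)  # shards parameters, gradients, and optimizer state

    # Build the optimizer after wrapping, so it sees the sharded parameters.
    optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
    x = torch.randn(8, 16, 256, device="cuda")
    loss = model(x).pow(2).mean()  # dummy loss for illustration
    loss.backward()
    optimizer.step()
    dist.destroy_process_group()
    ```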

  7. A PyTorch implementation of DeepMind's MuZero agent to do planning with a learned model

    By Michael Hu
    July 6, 2022 10:00 pm
    2 min read

    Introducing MuZero. This open-source project features a PyTorch implementation of DeepMind's MuZero agent. The agent supports turn-based, two-player, zero-sum games, as well as single-player games such as Atari.

  8. A PyTorch implementation of DeepMind's AlphaZero agent to play two-player, zero-sum strategy board games like Go and Gomoku

    By Michael Hu
    June 14, 2022 10:30 am
    2 min read

    Introducing AlphaZero. This open-source project features a PyTorch implementation of DeepMind's famous AlphaZero agent to play Go and free-style Gomoku board games.

  9. A collection of Deep Reinforcement Learning algorithms implemented with PyTorch to solve Atari games and classic control tasks like CartPole, LunarLander, and MountainCar

    By Michael Hu
    May 3, 2022 9:00 pm
    2 min read

    Introducing Deep-RL-Zoo. This open-source project features a collection of Deep Reinforcement Learning algorithms implemented with PyTorch to solve Atari games and classic control tasks like CartPole, LunarLander, and MountainCar. It includes state-of-the-art algorithms such as DQN, Rainbow, PPO, RND, R2D2, and Agent57.