Text to Minecraft Week 5
Welcome to Week 5 of our Text-to-Minecraft project! This week, we’ll focus on Retrieval-Augmented Generation, also known as RAG. You may remember hearing about this at Workshop 3, which was held a couple of weeks ago. Don't worry if you were not present; the material here should be enough to get you started! This week is going to be pretty conceptual, so it will be less coding intensive. If you are a bit behind on the code, this would be a good time to catch up!
1. Watch the video lectures
2. Download the files
- What is RAG?
- How is RAG used in the real world and in this project?
RAG (Retrieval-Augmented Generation) is a technique that combines the power of retrieval-based methods and large language models to enhance the accuracy and relevance of generated responses. It's designed to deal with the challenge that LLMs, while powerful, may not have all the required information stored in their internal parameters, especially for specific or rare knowledge.
Here's a breakdown of how RAG works in simple terms:
1. Retriever: This component searches a large dataset or knowledge base (like a document collection or database) to find the most relevant information based on a query. Think of it as looking things up on the web or in a library. The dataset is usually data that the LLM was not trained on. For example, businesses use RAG to give an LLM access to their private data without retraining it.
2. Generator (LLM): This is the language model, like GPT, that generates natural language responses. However, instead of relying only on what it already knows, it uses the information retrieved by the retriever to craft a more accurate response.
1. Query Input: The user asks a question or inputs a prompt.
2. Retrieval: The retriever looks through external sources, like documents or databases, to find relevant pieces of information related to the query.
3. Augmented Generation: The generator (LLM) takes both the user's query and the retrieved information to generate a final, informed response.
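The three steps above can be sketched in a few lines of Python. This is only a toy illustration, not the implementation we will build in part 2: the retriever here ranks documents by simple word overlap (real systems usually use embeddings), the `DOCUMENTS` list is made-up sample data, and `llm` is a hypothetical placeholder for whatever function you use to call a language model.

```python
import re
from collections import Counter

# Made-up sample knowledge base: stands in for data the LLM was not trained on.
DOCUMENTS = [
    "oak_house: small house made of oak planks on cobblestone base",
    "stone_bridge: arched bridge built from stone bricks",
    "wheat_farm: fenced farm plot with water channels",
]

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(query, document):
    """Count how many query words also appear in the document."""
    query_counts = Counter(tokenize(query))
    doc_counts = Counter(tokenize(document))
    return sum((query_counts & doc_counts).values())

def retrieve(query, documents, k=2):
    """Step 2 (Retrieval): return the k documents sharing the most words with the query."""
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]

def build_prompt(query, retrieved):
    """Combine the retrieved text with the user's query into one prompt."""
    context = "\n".join(f"- {doc}" for doc in retrieved)
    return f"Use the context to answer.\nContext:\n{context}\nQuestion: {query}"

def answer(query, documents, llm, k=2):
    """Step 3 (Augmented Generation): pass the augmented prompt to the LLM.

    `llm` is a hypothetical callable that takes a prompt string and returns text.
    """
    return llm(build_prompt(query, retrieve(query, documents, k)))
```

Step 1 is simply the `query` string the user types in. Notice that the LLM itself is never retrained; the only thing that changes is the prompt it receives.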
Let’s say you ask, "Which day of the week am I most available?" A standard LLM has no way to answer that because it has no information about your private life. However, if you use RAG to give an LLM access to your calendar or to-do list, it will be able to use that data to answer the question. Think of it as having a personalized LLM without having to train it from scratch! This saves a lot of time and cost.
For this project, RAG is a neat way to combine an LLM's ability to generate language with the ability to fetch Minecraft builds that it might not know about. We'll be using Minecraft schematics as our knowledge base. LLMs are not heavily trained on niche subjects like Minecraft blueprints. Additionally, the schematic files we are using are hard to parse, which adds another layer of complexity. To get around this, we'll provide the files converted to JSON, which is a simpler format for the LLM to understand.
We have provided some more resources to learn about RAG and other LLM optimization methods. You do not have to watch them in full because some of them are quite lengthy, but watch until you have a good grasp of what RAG is. Some of the videos are not specifically about RAG but are really insightful. You may not need this information now, but it might come in handy in the following weeks.
Again, for our knowledge base we will be using Minecraft schematics in JSON format. We have provided a folder of schematics here. The folder is large, so make sure you have enough space to download it. Additionally, if you are using git, add the Minecraft schematic folder to your .gitignore file so that git does not track it. You do not need all of that data in your repository.
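Once the folder is downloaded, a first step toward using it as a knowledge base might look like the sketch below. It makes several assumptions: that each schematic is a separate `.json` file, that the file names describe the build (for example `small_oak_house.json`, a made-up name), and that matching the query against file names is good enough for a first pass. The contents of each file are loaded as-is, since the exact JSON layout is not described here.

```python
import json
import re
from pathlib import Path

def load_schematics(folder):
    """Read every .json file in the folder into a {file_stem: parsed_json} dict."""
    schematics = {}
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            schematics[path.stem] = json.load(f)
    return schematics

def words(text):
    """Split text into a set of lowercase words (underscores act as separators)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def find_schematics(query, schematics, k=3):
    """Return up to k schematic names that share at least one word with the query."""
    query_words = words(query)
    scored = [(len(query_words & words(name)), name) for name in schematics]
    matches = [(count, name) for count, name in scored if count > 0]
    # Most shared words first; ties broken alphabetically for stable output.
    matches.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in matches[:k]]
```

The names returned by `find_schematics` can then be used to look up the parsed JSON in the dictionary and include it in the prompt sent to the LLM.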
1. Try to implement RAG by yourself! We will be doing this in part 2, but if you have extra time, you can try using the resources provided to do it yourself.
2. Do more research into how companies use RAG. It is a simple but powerful tool that many companies are using to improve their LLM integrations.