Text generation web UI - Deep Reason Extension
An extension for text-generation-webui, inspired by OpenAI's o1 model, that makes LLMs analyze your input in detail before responding, with the goal of improving response quality.
Like DeepSeek's R1 model, it adds a "thinking step" to any LLM you use.
Compatibility
- Any model you already use
- All backends (Transformers, llama.cpp, ExLlamaV2)
- Both instruct and chat-instruct modes
How it works
The extension automatically activates for every message sent in the Chat tab, with its own settings menu for customization. It also works through the API if you use the /chat/completions endpoint.
When you send a message, Deep Reason creates an intermediate reply where the model analyzes your input in detail before generating the main reply.
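For API use, a request looks like the minimal sketch below. It assumes a default local text-generation-webui install with the OpenAI-compatible API enabled (the --api flag) on port 5000; adjust the URL to match your setup. Deep Reason's analysis step happens server-side, so the call itself is an ordinary chat completion:

```python
# Minimal sketch of sending a chat message through text-generation-webui's
# OpenAI-compatible API; Deep Reason's analysis pass runs on the server.
# Assumes a default local install started with the --api flag (port 5000).
import requests

url = "http://127.0.0.1:5000/v1/chat/completions"
payload = {
    "mode": "instruct",
    "messages": [
        {"role": "user", "content": "PostgreSQL or MongoDB for my app?"}
    ],
}

response = requests.post(url, json=payload, timeout=300)
print(response.json()["choices"][0]["message"]["content"])
```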
See it in action
Here's a simple example:
Your input:
"I need to choose between PostgreSQL and MongoDB for my application's database. The app needs to handle user profiles, activity logs, and real-time analytics."
Behind the scenes, the model generates this analysis:
ANALYSIS:
Alright, let's try to figure this out. The message is about choosing between PostgreSQL and MongoDB for a database, and it’s for an application that needs to handle user profiles, activity logs, and real-time analytics.
First, let's consider the requirements mentioned: user profiles, activity logs, and real-time analytics. Each of these has different characteristics and might benefit from different types of databases.
Starting with user profiles, these typically involve structured data like names, email addresses, dates of birth, and so on. This kind of data fits well with relational databases like PostgreSQL, which are designed for structured, tabular data and can enforce data integrity and relationships between different pieces of information.
Next, there are activity logs. These can get quite large over time and might benefit from a database that can handle large volumes of data efficiently. Both PostgreSQL and MongoDB can handle large datasets, but MongoDB’s schema-less nature might be advantageous here, allowing for more flexible data structures as logging needs evolve.
Now, real-time analytics is another important requirement. This is where things get a bit more complex. Real-time analytics often requires fast read and write operations, and depending on the scale and complexity, a NoSQL database like MongoDB might offer better performance and scalability. However, PostgreSQL also has powerful indexing and querying capabilities that can make it very effective for analytical queries.
One thing to consider is the nature of the analytics. If the analytics involve complex joins and transactions, PostgreSQL’s ACID compliance and strong relational capabilities could be a huge advantage. On the other hand, if the analytics are more about handling massive amounts of data with less rigid structures, MongoDB’s flexibility and speed could be more suitable.
There are also considerations beyond just the data requirements. For example, what is the development team’s familiarity with each database? If the team is more comfortable with SQL and relational databases, PostgreSQL might be the better choice. Conversely, if the team has experience with NoSQL databases and prefers the flexibility they offer, MongoDB could be more appropriate.
Another consideration is the ecosystem and tools available for each database. PostgreSQL has a rich ecosystem of tools for management, monitoring, and analytics, which could be crucial for maintaining and optimizing the database. MongoDB also has its own set of tools and integrations that might be more aligned with certain types of applications and use cases.
Finally, it’s important to think about future scalability and maintenance. How easy will it be to scale the chosen database as the application grows? Will the database be easy to maintain and troubleshoot? These are questions that need to be answered based on the specifics of the project and the resources available.
So, there are a lot of factors to consider here. It’s not just about the current needs but also about future growth and the team’s capabilities. Maybe it would help to create a list of criteria and score each database against those criteria to make a more informed decision.
The result: a more comprehensive and thoughtful response, because the model has taken time to consider different aspects of your question.
For this example, Mistral-Large-Instruct-2407 was used.
Note: While the analysis step isn't visible in the UI, you can see it in your terminal window if you're curious about how the extension processes your inputs.
These analysis steps are temporary and don't get added to your permanent chat history — they're used once and then discarded. When you ask follow-up questions in a conversation, each new question receives its own fresh analysis, independent of previous ones.
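Conceptually, the flow works roughly like the sketch below. This is an illustration of the two-pass idea, not the extension's actual source code; generate() is a stand-in for a completion call to whichever backend you have loaded:

```python
# Illustrative two-pass flow only; not Deep Reason's actual implementation.

def generate(prompt: str) -> str:
    """Stand-in for a completion call to the loaded model."""
    raise NotImplementedError  # replace with a call to your backend

def deep_reason_reply(user_message: str, history: list[dict]) -> str:
    # Pass 1: intermediate reply in which the model analyzes the input.
    analysis = generate(
        f"Analyze the following message in detail before answering:\n{user_message}"
    )

    # Pass 2: the visible reply, generated with the analysis as extra context.
    reply = generate(
        f"{user_message}\n\nANALYSIS:\n{analysis}\n\nNow write the final answer."
    )

    # Only the user message and the reply enter the chat history;
    # the analysis is used once and then discarded.
    history.append({"role": "user", "content": user_message})
    history.append({"role": "assistant", "content": reply})
    return reply
```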
Frequently Asked Questions
- I'm using DeepSeek R1 or another model that "thinks" before responding. Can I use this extension?
Yes. The extension's analysis step is separate from, and stacks on top of, any reasoning the model performs on its own. You can use it with any model to add an extra layer of analysis to its responses.
- Does this work with models that don't have built-in thinking steps?
Yes. The extension works with models like Llama 3.3, Phi-4, Gemma 2, and Mistral Large, making them analyze inputs before responding.
- Will this make my model slower?
Yes, somewhat: the extension adds one extra generation step per message for the analysis pass, but the improved response quality is often worth the additional time.
- Can I control or customize how the extension analyzes inputs?
Yes, the extension includes optional settings that let you adjust how it processes your inputs if you want to fine-tune its behavior. You can also enable or disable it through the UI.
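For context, text-generation-webui extensions generally expose their settings through a params dictionary and an optional Gradio ui() function. The sketch below shows only that general shape; the option names in it are hypothetical placeholders, not Deep Reason's actual settings:

```python
# Hypothetical settings sketch in the standard text-generation-webui
# extension shape (a params dict plus a Gradio ui() function).
# The option names below are placeholders, not Deep Reason's real settings.
import gradio as gr

params = {
    "activate": True,             # enable/disable the extension from the UI
    "analysis_instructions": "",  # hypothetical: custom analysis prompt
}

def ui():
    activate = gr.Checkbox(value=params["activate"], label="Activate Deep Reason")
    activate.change(lambda x: params.update({"activate": x}), activate, None)
```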
What's included
- Complete source code for the extension
- Simple installation instructions (extract and enable)
- EULA