
Keep It Local, Keep It Yours: Build a Private AI Assistant with Khadas Mind

More and more people are trying to run large AI models locally, eager to enjoy the power of AI privately on their own devices. But once they dive in, reality hits—frequent connection timeouts due to network issues, and models that often spit out irrelevant or nonsensical answers, especially when asked anything remotely complex. In the end, editing the AI's response feels more exhausting than just writing it yourself!


So, is there a local AI setup that's truly right for you?




The smartest AI is the one you control.


Imagine AI really becoming your personal assistant: what capabilities would it need?






Can you see the difference?


A Large AI Model + Your Data = Your Own Private AI Assistant!



When it comes to local deployment, mini PCs are the way to go, and Khadas Mind leads the pack.


With the goal of running AI privately on personal machines, many people are exploring ways to deploy large models either in the cloud or on their desktops and laptops at home. However, after speaking with users, we've found that mini modular PCs may actually be the most effective way to run large models locally. Here's why:


  1. Energy-Efficient, High-Performance, Always-On

Unlike desktops or servers that can consume hundreds of watts, mini modular PCs typically operate at just 20W to 50W, keeping your electricity costs low. With 24/7 stable operation, your personal AI assistant stays ready whenever you need it.
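To put that in numbers, here's a rough back-of-the-envelope estimate for an always-on machine. The wattage is the mid-range of the figure above; the electricity price is an assumption you should adjust for your region:

```python
# Rough running-cost estimate for an always-on mini PC (assumed figures).
watts = 40            # mid-range of the 20W-50W figure above
hours_per_month = 24 * 30
kwh_per_month = watts * hours_per_month / 1000
price_per_kwh = 0.15  # assumed electricity price in USD; adjust for your region

print(f"{kwh_per_month:.1f} kWh/month, ~${kwh_per_month * price_per_kwh:.2f}/month")
# -> 28.8 kWh/month, ~$4.32/month
```

A desktop idling at 200W under the same always-on schedule would cost roughly five times as much.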


  2. Compact and Space-Saving

Unlike traditional PCs with bulky towers or laptops that can clutter your workspace, a mini modular PC like Khadas Mind fits neatly into any setup—it can sit quietly in a corner or even tuck away in a drawer without getting in the way. It also runs whisper-quiet and stays cool, outperforming typical desktops in noise control and laptops in thermal efficiency—perfect for long-term, uninterrupted use.


  3. Local AI, No Cloud? No Problem.

Cloud-based AI may seem convenient, but it comes with serious limitations that make it less ideal for everyday use:


Subscriptions and API call fees add up quickly. For developers or power users, interacting with large models through the cloud can quietly rack up a hefty bill—especially when APIs are pay-per-use and increasingly restricted.


Every word you type gets sent to the cloud. Whether it's personal chats, sensitive work documents, or proprietary code, your data is exposed the moment it leaves your device—leaving you vulnerable to leaks, breaches, or unauthorized usage.


No internet? No AI. Even a weak or unstable connection can make cloud AI sluggish or completely unresponsive. Your assistant turns from "super smart" to "super slow"—or worse, just doesn't respond at all.


The combination of a mini PC and a locally distilled model offers the perfect balance of performance, privacy, and reliability. It delivers sufficient computing power while keeping your data secure and your workflow stable—making it the ideal solution for individual users and small teams looking to deploy AI efficiently and affordably.





Now, we’ve unlocked another powerful way to use Khadas Mind—turning it into a compact, always-on AI hub right in your home or office.

Place It Anywhere
As one of the smallest high-performance PCs on the market, Khadas Mind fits effortlessly on your desk, shelf, or even inside a drawer. Its ultra-slim, lightweight design makes it easy to move around without hassle.
Instantly Scalable
Need more power? Just connect Mind Graphics via the Mind Link interface to supercharge your AI performance in seconds.
Always Ready
Thanks to the built-in battery, Khadas Mind stays powered during short moves or power interruptions. Your AI assistant remains on standby—no shutdowns, no reboots, just seamless performance.

 


Locally deploy large models + RAGFlow to create your own private AI assistant.


Today, we’ll walk you through how to build your very own AI assistant using the Khadas Mind. Paired with RAGFlow, your AI won’t just answer questions; it will understand your study materials, work documents, and personal notes, evolving into a truly intelligent companion that knows you best.

 

For this demonstration, we’ll be using the Khadas Mind 2s along with the 16GB Mind Graphics eGPU module. This powerful setup offers more than enough performance for most local deployment needs, making it ideal for individual users and small teams alike.



Hardware Environment:

  • Khadas Mind Maker Kit, equipped with an Intel® Core™ Ultra 7 processor 258V, 32GB of memory, and 1TB of storage.

  • 16GB Mind Graphics, equipped with a desktop-class RTX 4060 Ti graphics card with 16GB of VRAM.

  • Power Consumption: 30W standby, 235W at max load.

  • Total Physical Volume: 2.5L



Installation Process


Prerequisites

  1. Operating System: Windows 11 Home (English)

  2. From within Windows 11: Control Panel -> Programs -> Turn Windows features on or off, and activate the following:

    –Virtual Machine Platform

    –Windows Hypervisor Platform

    –Windows Subsystem for Linux

  3. After changing the above settings, restart your computer.



We provide two installation methods for you:



1. Quick Deployment

We have created a software package that includes Ollama, Docker, and RAGFlow, with the DeepSeek-R1 14B model included by default. After the installation is complete, you can use it with minimal configuration.

 

Download the installation package: https://dl.khadas.com/development/llm/ragflow.zip

 

Install Docker & Ollama:

  • Install the Docker application in the Installers folder and restart your device after the installation is complete.

  • Install the Ollama application in the Installers folder and restart your device after the installation is complete.

 

Run the script: After the above two items are installed, run the !_install.bat script (please ensure Khadas Mind is connected to the internet).


After the installation is complete, run RAGFlow according to section 2.6 below, add the Ollama model, and continue configuration.



2. Custom Deployment

You can adjust it according to your needs, including modifying the model version, etc.

 

2.1 Install & Run Ollama

 

Download Ollama from the official website (ollama.com). After downloading, install Ollama and restart the system.

 

Press Win + R, then type “cmd” into the dialog box and press Enter, and then input the following command to download the chat model:

ollama pull deepseek-r1:14b

 

Then enter the following command to download the embed model:

ollama pull nomic-embed-text
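What does an embedding model like nomic-embed-text actually do? It maps text to a vector of numbers so that semantically similar texts get similar vectors, which is how RAGFlow later matches your question to the right document chunks. A minimal sketch of the idea with hand-made toy vectors (real embeddings have hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: 1.0 for identical directions, near 0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (illustrative values, not real model output).
vec_cat    = [0.9, 0.1, 0.0]  # "cat"
vec_kitten = [0.8, 0.2, 0.1]  # "kitten" -- close to "cat"
vec_invoice = [0.0, 0.1, 0.9] # "invoice" -- unrelated

# "kitten" is far more similar to "cat" than "invoice" is.
print(cosine_similarity(vec_cat, vec_kitten) > cosine_similarity(vec_cat, vec_invoice))  # True
```

This similarity search over your documents is exactly what the embedding model powers inside RAGFlow.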

 

Set environment variables: go to Settings -> System -> About -> Advanced system settings -> Environment Variables... In the System variables section, click New...,

Input “OLLAMA_HOST” as the variable name,

Input “0.0.0.0:11434” as the variable value.

 

Restart required for changes to take effect.
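Why 0.0.0.0 rather than 127.0.0.1? Binding to 0.0.0.0 makes Ollama listen on all network interfaces, so RAGFlow's Docker containers (and other machines on your LAN) can reach it, whereas 127.0.0.1 only accepts connections from the machine itself. A quick Python illustration of the two bind addresses (port 0 asks the OS for any free port):

```python
import socket

# Bind to 0.0.0.0 (all interfaces): reachable from other machines on the LAN.
all_ifaces = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
all_ifaces.bind(("0.0.0.0", 0))  # port 0 = let the OS pick a free port
all_name = all_ifaces.getsockname()

# Bind to 127.0.0.1 (loopback): reachable only from this machine.
loopback = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
loopback.bind(("127.0.0.1", 0))
loop_name = loopback.getsockname()

print("all interfaces:", all_name)  # e.g. ('0.0.0.0', 54321)
print("loopback only:", loop_name)  # e.g. ('127.0.0.1', 54322)

all_ifaces.close()
loopback.close()
```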

 

2.2 Install Docker

Download Docker Desktop from the official Docker website.

 

After the download is complete, launch the installer and follow the instructions to install Docker.

 

2.3 Install git

Press Win + R, then type “cmd” into the dialog box and press Enter, then input the following command:

winget install --id Git.Git -e --source winget

 

2.4 Install WSL

Press Win + R, then type “cmd” into the dialog box and press Enter, then input the following command:

wsl --update

 

Wait for the updating process to complete and proceed to the next step.

 

2.5 Clone the RAGFlow repository

Create a folder named ragflow, right-click the folder, select Open in Terminal, and input the following commands to clone the repository and enter its docker directory:

git clone https://github.com/infiniflow/ragflow.git

cd ragflow/docker


Do not close the Terminal. Start Docker Desktop and then continue this tutorial. If the following error message appears, execute wsl --update again in the Terminal:


After starting Docker Desktop, click Skip, and the following screen will appear, indicating that everything is normal:

 



Return to Terminal and continue to execute the following command:

docker compose -f docker-compose.yml up -d

 

Under normal circumstances, Docker will pull the required image files as follows:




Wait for the system to download the required files. Once the download finishes, it will be shown as below. By default, the RAGFlow containers will start automatically whenever Docker is running.





2.6 Run RAGFlow and add the Ollama model

After restarting the computer and entering the system, run Docker Desktop.

 

Determine the server IP: Press Win + R, type “cmd” into the dialog box and press Enter, then input “ipconfig”, and check your IPv4 Address, such as 192.168.1.45.

 

Access RAGFlow through the browser: in the local browser, enter 127.0.0.1. Machines on the same network can also access it through your IPv4 address, e.g. 192.168.1.45.







Click on Ollama and add the chat model first: Add LLM -> Model Type, select chat.

 

The model name is deepseek-r1:14b.

 

Base URL is http://&lt;your IPv4 address&gt;:11434, for example http://192.168.1.45:11434.

Fill in 32768 for Max Tokens.
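For the curious: behind that Base URL sits Ollama's plain HTTP API, which RAGFlow calls on your behalf. The sketch below only builds the request URL and JSON payload so you can see their shape; it deliberately makes no network call, and the IP address is the example one from above:

```python
import json

# Example values -- substitute your own IPv4 address.
base_url = "http://192.168.1.45:11434"

# Ollama's chat endpoint accepts a JSON body of this shape.
payload = {
    "model": "deepseek-r1:14b",
    "messages": [{"role": "user", "content": "Summarize my notes on Docker."}],
    "stream": False,
}

endpoint = f"{base_url}/api/chat"
print(endpoint)
print(json.dumps(payload, indent=2))
```

Any machine that can reach this endpoint can use your model, which is why the OLLAMA_HOST setting from section 2.1 matters.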




Then add the embedding model: Add LLM -> Model Type, select embedding.

 

Model name is nomic-embed-text:latest.

 

Base URL and Max Tokens are the same as above.

 






Finally, from System Model Settings, add the chat model and embedding model.






So far, we have built both the local large language model and RAGFlow. Now let's inject some "soul", or personality, into your AI assistant.



 


Build Its Brain: Custom Local Data for a Truly Personal AI


Next, join us in using the deployed large language model to build your own personalized AI assistant.


Setting up the knowledge base

 

Open the RAGFlow page via the IPv4 address (make sure Docker Desktop is already running).


Go to Knowledge Base > Create knowledge base > Input a name.





From Knowledge Base > Configuration, select nomic-embed-text:latest for Embedding model and General for Chunk method. For first-time setup, it is recommended to select General for a better user experience, then save the settings.
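The Chunk method setting controls how RAGFlow splits documents before embedding them; at question time it embeds your query and pulls back the most similar chunks for the chat model to read. Here's a toy sketch of that split-and-retrieve loop, with simple word overlap standing in for real embedding similarity:

```python
def chunk(text, size=8):
    """Split text into fixed-size word chunks (a crude stand-in for General chunking)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def tokens(text):
    """Lowercased words with punctuation stripped."""
    return {w.strip(".,?!").lower() for w in text.split()}

def score(query, chunk_text):
    """Toy relevance score: shared words (RAGFlow uses embedding similarity instead)."""
    return len(tokens(query) & tokens(chunk_text))

document = (
    "Khadas Mind is a compact modular mini PC. "
    "It can connect to Mind Graphics for extra GPU power. "
    "The built-in battery keeps it running during short power interruptions."
)

chunks = chunk(document)
query = "What happens during power interruptions?"
best = max(chunks, key=lambda c: score(query, c))
print(best)  # -> "during short power interruptions."
```

The retrieved chunks are then handed to the chat model as context, which is why chunking quality directly affects answer quality.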




Create New Knowledge Base File


Click Knowledge Base and select the knowledge base created in the previous step.

Click Add file and select the file you want to upload from local files.





After successful uploading, click the ▶️ button to parse the file. Wait for a moment until the parsing is complete before using it.




How to Use


Click Chat > Create an Assistant, enter a custom name in the Assistant Setting page, and select the custom knowledge base name you just created in Knowledge bases.

 



In the Model Setting page, select the deepseek-r1:14b model and click OK.




Click Chat to start a new conversation and interact with your newly born AI assistant via the chat box.




As shown above, you have now finished building your personalized, private AI assistant. Upload additional personal documents to its knowledge base to make it ever more tailored to you.



 


Final Thoughts


AI is becoming deeply integrated into the hardware ecosystem, and Khadas is redefining the innovation model for modular, multi-scenario smart devices—designed to meet real-world user needs across diverse environments. We're building a closed-loop system driven by user insights, scenario-based development, and bold, iterative breakthroughs. Our focus remains on delivering high performance, high integration, and strong expandability. We’d love to hear your thoughts—every idea and suggestion fuels the evolution of smarter, more adaptable hardware as we shape the future of intelligent, cross-scenario computing together.



Best regards,   

Khadas Team 


 
 
 
