Aider + Ollama: Running an AI Pair Programmer Entirely Offline
Stop Aider calling OpenAI—lock to local Ollama. 16 GB VRAM runs 30B+8B models at 22 tok/s, but 8 GB cards OOM without architect-only mode.
End-to-end local AI setups that actually work — coding assistants, RAG pipelines, document chat, and automation stacks. Step-by-step, hardware requirements included.
PDF RAG returns garbage? Fix embedding, chunking, GPU passthrough — 0.89 top-5 accuracy possible, but 8 GB cards force CPU fallback.
Alexa sends everything to the cloud. This Home Assistant + Ollama pipeline runs 100% local — 2.3s response time, but requires a 7B model minimum.
8 GB GPUs hit the wall at 14B, 24 GB runs 32B at 18 tok/s — but 70B needs 2 cards or 48 GB unified. Exact VRAM math per quant inside.
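The per-quant math boils down to parameter count times bits per weight, plus a cushion for KV cache and runtime buffers. A back-of-envelope sketch (the ~4.5 bits/weight figure for Q4_K_M and the flat 1.5 GB overhead are rough assumptions, not numbers from the post):

```python
def vram_gb(params_billion: float, bits_per_weight: float,
            overhead_gb: float = 1.5) -> float:
    """Rough VRAM estimate: quantized weights plus a flat allowance
    for KV cache and runtime buffers (assumed, not measured)."""
    weights_gb = params_billion * bits_per_weight / 8  # bits -> bytes
    return weights_gb + overhead_gb

Q4_K_M = 4.5  # approximate average bits/weight for this quant

for size in (14, 32, 70):
    print(f"{size}B @ ~{Q4_K_M} bpw ≈ {vram_gb(size, Q4_K_M):.1f} GB")
# 14B ≈ 9.4 GB (spills past an 8 GB card), 32B ≈ 19.5 GB (fits in 24 GB),
# 70B ≈ 40.9 GB (needs 2 cards or 48 GB unified)
```

Even this crude estimate reproduces the thresholds above: 14B overruns 8 GB, 32B fits a 24 GB card, and 70B forces multi-GPU or unified memory.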
OpenAI SDK fails on local endpoints—fix 3 lines for 35 tok/s inference, but watch the 4K context trap that silently truncates prompts.
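The root cause is that Ollama exposes an OpenAI-compatible API under `/v1`, so the SDK fix is just overriding `base_url` (plus a dummy `api_key`, which the SDK requires but Ollama ignores). A stdlib-only sketch of the same request; the port and model tag are assumptions (Ollama's defaults):

```python
import json
import urllib.request

# With the official SDK the equivalent fix is:
#   OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
URL = "http://localhost:11434/v1/chat/completions"

payload = {
    "model": "qwen3:14b",  # any tag you've pulled with `ollama pull`
    "messages": [{"role": "user", "content": "Say hello."}],
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},  # no Authorization needed
)

# Uncomment with a running Ollama instance:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```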
Tried 2 backends? LM Studio wins for beginners, Ollama owns Linux, Open WebUI needs a backend — here's the 6-criteria matrix to pick once and skip the rewrite.
Your local agent ignores tools or loops forever? Qwen3 14B runs 28 tok/s with 91% tool success — but only if you pull the right Ollama tag. Here's the fix.
Build a private Copilot in 30 min — Qwen3 14B at 28 tok/s locally with 32K context, but only if you disable Continue.dev's hidden cloud fallback first.
Stop sending notes to OpenAI—build a private Obsidian AI with local embeddings. 10K notes indexed in 90 min on RTX 3060, but 8 GB GPUs hit the wall.
Your local LLM is stuck offline. Add web search with this 3-container stack—2.3 GB RAM, 4.2s latency. Gotchas: version pins matter, and CORS breaks silently.
Cloud automation bills stacking up? Build self-hosted n8n + Ollama workflows for $0 per task — but 8 GB GPUs hit the wall at 7B models. Here's the fix.
Docker shows "No models found" and the RAG upload spins forever—this guide delivers 45 tok/s local document chat once you fix the 0.0.0.0 bind.
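The bind fix itself is two commands; a sketch assuming a bare-metal Ollama serving a Dockerized Open WebUI (ports, image, and env names follow each project's defaults):

```shell
# Ollama listens on 127.0.0.1 by default, so containers can't reach it.
# Bind all interfaces instead:
OLLAMA_HOST=0.0.0.0 ollama serve

# Point the container at the host; host-gateway maps host.docker.internal
# to the Docker host even on Linux:
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  ghcr.io/open-webui/open-webui:main
```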
Stop switching models manually. Auto-route 7B for chat, 70B for code—34 tok/s vs 11 tok/s. Needs 22 GB VRAM, 3 Ollama instances. Here's the YAML.
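The auto-routing can live in a small dispatch layer in front of the Ollama instances. A hypothetical sketch (the keyword heuristic, model tags, and ports are illustrative assumptions, not the post's YAML):

```python
# Route cheap chat traffic to a small model, code requests to the big one.
CODE_HINTS = ("```", "def ", "class ", "import ", "traceback", "refactor")

ROUTES = {
    "chat": {"model": "qwen2.5:7b", "url": "http://localhost:11434"},
    "code": {"model": "llama3.3:70b", "url": "http://localhost:11435"},
}

def route(prompt: str) -> dict:
    """Pick a backend by scanning the prompt for code-ish markers."""
    kind = "code" if any(h in prompt.lower() for h in CODE_HINTS) else "chat"
    return ROUTES[kind]
```

The keyword scan is a deliberately crude stand-in; a real router would also fall back to the small model when the 70B instance is saturated.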
AnythingLLM hiding retrieval failures? Build RAG with nomic-embed-text + ChromaDB in 150 lines. 23ms latency, 16 GB VRAM — but naive chunking breaks precision.
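The precision pitfall is easiest to see in the splitter itself. A minimal fixed-size chunker with overlap, as a sketch (the 400/50 character sizes are illustrative, not the post's settings):

```python
def chunk(text: str, size: int = 400, overlap: int = 50) -> list[str]:
    """Fixed-size character chunks with overlap, so a sentence that
    straddles a boundary still appears whole in at least one chunk.
    Zero overlap is the classic precision killer: boundary sentences
    get split and neither half embeds well."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size]
            for i in range(0, max(len(text) - overlap, 1), step)]
```

Oversized chunks hurt for the opposite reason: the embedding averages over unrelated text and the relevant passage gets diluted, so retrieval degrades without any visible error.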
24 GB GPU crashes with 3+ users? vLLM production setup serves 4 clients at 28 tok/s — only with correct --max-num-seqs. Config inside.