Z-Bot: Discord's Most Advanced AI Chatbot

Z-Bot started as a weekend experiment in my private Discord server. It became something I run for thousands of users.

What is it?

Z-Bot is an AI assistant for Discord that goes beyond a simple ChatGPT wrapper. It has:

Persistent memory — remembers context across conversations per user
Custom personalitites — not just “helpful assistant”, users can pick from dozens of premade personalities or develop their own, for their server, or personal use
Server-specific context — adapts to the server it’s in
Tool use — can search the web, fetch URLs, run code snippets, and most importantly, manage the Discord server it’s in, including editing, creating, deleting, and moving rolls, channels, categories, while also handling moderation

The tech stack

# Core loop (simplified)
async def on_message(message):
    context = await get_user_context(message.author.id)
    response = await llm.generate(
        prompt=message.content,
        context=context,
        tools=AVAILABLE_TOOLS,
    )
    await message.channel.send(response.text)
    await save_context(message.author.id, context)

The backend is Python with discord.py. LLM calls go through the OpenAI API. Memory is stored in a lightweight SQLite database with periodic summarization to keep context windows manageable.

The interesting problems

Memory that doesn’t explode — You can’t just append every message to a context window forever. I built a sliding window with periodic summarization: every N messages, the oldest chunk gets summarized and the summary replaces the raw messages.

Keeping Context — Discord makes it hard to interact with bots of this nature. Through a series of reply checks, @mentions, and in rare cases, message content validations, Z-Bot can keep a conversaion in a very natural way that doesn’t eat as many tokens as other alternatives.

Personality consistency — System prompts alone aren’t enough. The bot’s “character” needed to be consistent even when tools are involved, even in long conversations. This required careful prompt engineering and some fine-tuning experiments.

Rate limiting at scale — Discord servers can be spammy. Built a per-user token bucket rate limiter so one user can’t blow through API budget for a whole server.