Discord Ai Knowledge Bot, Open Source

Discord AI Knowledge Bot, Open Source
The Discord AI Knowledge Bot is an open-source tool designed to enhance communication and knowledge sharing within Discord servers. This bot indexes server content and provides AI-powered chat functionality, all while keeping your data local and secure. By leveraging local vector databases and embeddings, along with OpenAI for final chat completion, the bot offers a powerful yet privacy-conscious solution for managing and retrieving information within your Discord community.
Benefits
- Local Data Storage: All Discord message data, embeddings, and the vector database reside on your machine, ensuring your information stays private and secure.
- AI-Powered Chat: Engage in natural language conversations with the bot, which uses context from server content to provide relevant and accurate responses.
- Context-Aware Search: Perform channel-specific or server-wide searches to quickly find the information you need.
- Easy Indexing: Index all text channels or specific channels with pagination support, making it simple to organize and retrieve server content.
- Self-Contained Operation: The bot runs entirely locally on your host machine, with the only external dependency being the OpenAI API for final chat completion.
Use Cases
- Community Support: Use the bot to quickly find past discussions and provide accurate information to community members.
- Knowledge Management: Keep track of important discussions, decisions, and announcements within your server.
- Educational Purposes: Ideal for educational servers where students and teachers can easily access and share information.
- Project Collaboration: Facilitate better collaboration by allowing team members to quickly find and reference past conversations and project updates.
Requirements
- Python 3.8 or higher
- Discord Bot Token
- OpenAI API Key (required for response generation)
Installation
- Clone the repository:
git clone <repository-url>cd discord_knowledge_bot
- Install dependencies:
pip install -r requirements.txt
- Set up environment variables:
cp env.example .env
Edit the.env
file and add your actual values:
DISCORD_TOKEN=your_discord_bot_token_hereDISCORD_APP_ID=your_discord_application_id_hereOPENAI_API_KEY=your_openai_api_key_here
- Configure the bot by editing the
config.yaml
file to customize settings.
Usage
Starting the Bot
python main.py
Bot Commands
All commands are slash commands (/
) that appear in Discord's command interface.
Indexing Commands (Admin Only)
/index-server
: Index all text channels in the server/index-channel [channel]
: Index a specific channel (or current channel)/reindex-server
: Clear and reindex all text channels/reindex-channel [channel]
: Clear and reindex a specific channel
Management Commands
/help
: Show help information and available commands/status
: Show bot and indexing status/stats
: Show detailed statistics/clear
: Clear all indexed data (Admin only)
Chat Commands
/ask <question>
: Ask a question about this channel's content/ask-server <question>
: Ask a question about the entire server's content
Direct Mentions
Mention the bot with@bot_name
to chat directly! The bot will search through indexed content to provide relevant answers.
Context-Aware Search
The bot provides two types of search:1.Channel-Specific Search(/ask
): Searches only within the current channel's indexed content for more focused, relevant responses.2.Server-Wide Search(/ask-server
): Searches across all indexed channels in the server for broader context from the entire server.
Permissions
The bot needs "Read Message History" permission to index channels.
Configuration
The bot can be configured through theconfig.yaml
file, allowing you to customize settings such as the bot prefix, OpenAI model, and indexing parameters.
Architecture
The bot consists of several key components:1.Discord Bot Core: Main bot functionality, slash command handling, and event processing.2.Indexing System: Message collection with pagination, text processing and chunking, and ChromaDB storage management.3.AI Chat System: OpenAI integration, context building with channel filtering, and response generation.4.Utilities: Configuration management and text processing helpers.
Data Flow & Privacy
- Stays local: Discord message data, embeddings, and the vector database all reside on your machine.
- Sent to OpenAI (required today): For each response, the bot sends the user's prompt plus a small, selected set of retrieved context snippets to OpenAI for final response generation.
- Planned: Support for selecting alternative model providers via LlamaIndex so responses can be generated without OpenAI.
Technical Details
- Local Vector Database: Provides local vector storage with no external database required.
- Local Embeddings: Uses for local embedding generation with no external embedding API needed.
- OpenAI Integration: Only external dependency is OpenAI API for final chat completion (required today).
- Text Processing: MVP functionality with basic cleaning and chunking.
- Vector Search: ChromaDB with LlamaIndex for semantic search.
- Context Filtering: Channel-specific search with metadata filtering.
- Slash Commands: Modern Discord slash command interface.
- Permission System: Admin-only commands for destructive operations.
Model Providers (Planned)
We plan to support selecting among multiple LLM providers supported by LlamaIndex for response generation. Once implemented, you will be able to choose a provider in configuration; until then, OpenAI is required.
FAQ
- Do you send our data to OpenAI?
- Yes, today the bot sends the user prompt and selected retrieved context to OpenAI to generate the final response. Raw server data and the full vector database remain local.
- Does OpenAI train on our API data?
- No. Per OpenAI's API data usage policy, data submitted via the API is not used to train OpenAI models unless you explicitly opt in.
- What is "final chat completion"?
- The last step where an LLM takes your prompt plus retrieved context and generates the natural-language reply.
- Can I avoid OpenAI?
- Not yet. OpenAI is required today. Planned support will allow selecting other providers via LlamaIndex.
- How long does indexing take?
- Depends on server size and rate limits. For a mid-size server, initial indexing can take hours.
Command Examples
Indexing
/index-server/index-channel #general/reindex-server
Chat
/ask What was discussed about the new feature?/ask-server Who mentioned the meeting yesterday?
Management
/status/stats
Security
- Admin-Only Commands: Indexing and destructive operations require Administrator permissions.
- Permission Checks: All commands verify user permissions before execution.
- Confirmation Dialogs: Destructive operations require explicit confirmation.
- Error Handling: Comprehensive error handling and logging.
About
A RAG knowledge bot for Discord.
Comments
Please log in to post a comment.