AI-Discord-Bot/DATABASE_MIGRATION.md
milo 5f8c93ff69 🗄️ Add SQLite database system with JSON fallback and memory controls
Implement configurable database backends (SQLite/JSON) with unified memory
management, automated migration, Docker support, and privacy controls.
Maintains full backward compatibility while enabling future PostgreSQL/ChromaDB.
2025-10-10 13:04:48 -04:00

# Database System Migration Guide
## Overview
The Discord bot can now store user profiles and conversation memory in either of two database backends, and memory can be switched off entirely:
- **SQLite**: Fast, reliable, file-based database (recommended)
- **JSON**: Original file-based storage (backward compatible)
- **Memory Toggle**: Option to completely disable memory features
## Configuration
Edit `src/settings.yml` to configure the database system:
```yaml
database:
  backend: "sqlite" # Options: "sqlite" or "json"
  sqlite_path: "data/bot_database.db"
  json_user_profiles: "user_profiles.json"
  json_memory_data: "memory.json"
  memory_enabled: true # Set to false to disable memory completely
```
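As a rough illustration of how these settings might be consumed, the sketch below merges a parsed `database` section with defaults and rejects unknown backends. The function name `resolve_database_config`, the `DEFAULTS` table, and the assumption that `memory_enabled` lives inside the `database` section are all hypothetical; the bot's actual loader may differ.

```python
from typing import Any, Dict

# Hypothetical defaults that mirror the example settings above.
DEFAULTS: Dict[str, Any] = {
    "backend": "sqlite",
    "sqlite_path": "data/bot_database.db",
    "json_user_profiles": "user_profiles.json",
    "json_memory_data": "memory.json",
    "memory_enabled": True,
}
VALID_BACKENDS = {"sqlite", "json"}


def resolve_database_config(settings: Dict[str, Any]) -> Dict[str, Any]:
    """Merge the `database` section of already-parsed settings with defaults."""
    section = settings.get("database") or {}
    config = {**DEFAULTS, **section}
    if config["backend"] not in VALID_BACKENDS:
        raise ValueError(f"Unknown database backend: {config['backend']!r}")
    if not isinstance(config["memory_enabled"], bool):
        raise ValueError("memory_enabled must be true or false")
    return config


# Example: a settings dict as a YAML parser might produce it.
config = resolve_database_config({"database": {"backend": "json", "memory_enabled": False}})
print(config["backend"], config["memory_enabled"])
```

Validating the backend name up front means a typo in `settings.yml` fails at startup rather than silently falling back to another backend.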
## Migration from JSON
If you're upgrading from the old JSON-based system:
1. **Run the migration script:**
   ```bash
   python migrate_to_database.py
   ```
2. **What the script does:**
   - Migrates existing `user_profiles.json` to the database
   - Migrates existing `memory.json` to the database
   - Creates backups of original files
   - Verifies the migration was successful
3. **After migration:**
   - Your old JSON files are safely backed up
   - The bot will use the new database system
   - All existing data is preserved
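The back-up, copy, and verify sequence described above can be sketched roughly as follows. This is not the code in `migrate_to_database.py`: the function name `migrate_profiles`, the `user_profiles` table name, the backup file suffix, and the assumed JSON shape (`{user_id: profile}`) are illustrative guesses.

```python
import json
import shutil
import sqlite3
import time
from pathlib import Path


def migrate_profiles(json_path: Path, db_path: Path) -> int:
    """Copy user profiles from a JSON file into SQLite; return the row count."""
    # 1. Back up the original file with a timestamp suffix before anything else.
    backup_path = json_path.with_name(f"{json_path.name}.backup_{int(time.time())}")
    shutil.copy2(json_path, backup_path)

    # 2. Load the old data (assumed shape: {user_id: profile_dict}).
    profiles = json.loads(json_path.read_text(encoding="utf-8"))

    conn = sqlite3.connect(db_path)
    try:
        # 3. Insert inside one transaction so a failure rolls everything back.
        with conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS user_profiles ("
                "user_id TEXT PRIMARY KEY, profile_data TEXT)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO user_profiles (user_id, profile_data) "
                "VALUES (?, ?)",
                [(str(uid), json.dumps(data)) for uid, data in profiles.items()],
            )
        # 4. Verify that the table holds at least every migrated profile.
        (count,) = conn.execute("SELECT COUNT(*) FROM user_profiles").fetchone()
    finally:
        conn.close()

    if count < len(profiles):
        raise RuntimeError(f"Expected {len(profiles)} rows, found {count}")
    return count
```

Taking the backup before opening the database keeps the original file recoverable even if a later step fails.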
## Backend Comparison
### SQLite Backend (Recommended)
- **Pros:** Fast, reliable, concurrent access, data integrity
- **Cons:** Requires SQLite, which ships with Python's standard library as the `sqlite3` module
- **Use case:** Production bots, multiple users, long-term storage
### JSON Backend
- **Pros:** Human-readable, easy to back up or edit manually
- **Cons:** Slower, potential data loss on concurrent access
- **Use case:** Development, single-user bots, debugging
## Database Schema
### User Profiles Table
- `user_id` (TEXT PRIMARY KEY)
- `profile_data` (JSON)
- `created_at` (TIMESTAMP)
- `updated_at` (TIMESTAMP)
### Conversation Memory Table
- `id` (INTEGER PRIMARY KEY)
- `channel_id` (TEXT)
- `user_id` (TEXT)
- `content` (TEXT)
- `context` (TEXT)
- `importance_score` (REAL)
- `timestamp` (TIMESTAMP)
### User Memory Table
- `id` (INTEGER PRIMARY KEY)
- `user_id` (TEXT)
- `memory_type` (TEXT)
- `content` (TEXT)
- `importance_score` (REAL)
- `timestamp` (TIMESTAMP)
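For orientation, the three tables could be created with DDL along these lines. The column names and types follow the lists above, but the table names (`user_profiles`, `conversation_memory`, `user_memory`) and the `CURRENT_TIMESTAMP` defaults are assumptions; the statements in `src/database.py` may differ.

```python
import sqlite3

# Table names and timestamp defaults are hypothetical; columns follow the schema lists.
# SQLite has no dedicated JSON storage class, so profile_data typically holds JSON text.
SCHEMA = """
CREATE TABLE IF NOT EXISTS user_profiles (
    user_id      TEXT PRIMARY KEY,
    profile_data JSON,
    created_at   TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at   TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS conversation_memory (
    id               INTEGER PRIMARY KEY,
    channel_id       TEXT,
    user_id          TEXT,
    content          TEXT,
    context          TEXT,
    importance_score REAL,
    timestamp        TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS user_memory (
    id               INTEGER PRIMARY KEY,
    user_id          TEXT,
    memory_type      TEXT,
    content          TEXT,
    importance_score REAL,
    timestamp        TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create all three tables if they do not exist yet."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```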
## Code Changes
### New Files
- `src/database.py` - Database abstraction layer
- `src/memory_manager.py` - Unified memory management
- `src/user_profiles_new.py` - Modern user profile management
- `migrate_to_database.py` - Migration script
### Updated Files
- `src/enhanced_ai.py` - Uses new memory manager
- `src/bot.py` - Updated memory command imports
- `src/settings.yml` - Added database configuration
- `src/memory.py` - Marked as deprecated
## API Reference
### Memory Manager
```python
from memory_manager import memory_manager
# Store a message in memory
memory_manager.analyze_and_store_message(message, context_messages)
# Get conversation context
context = memory_manager.get_conversation_context(channel_id, hours=24)
# Get user context
user_info = memory_manager.get_user_context(user_id)
# Format memory for AI prompts
memory_text = memory_manager.format_memory_for_prompt(user_id, channel_id)
# Check if memory is enabled
if memory_manager.is_enabled():
    # Memory operations go here
    pass
```
### Database Manager
```python
from database import db_manager
# User profiles
profile = db_manager.get_user_profile(user_id)
db_manager.store_user_profile(user_id, profile_data)
# Memory storage (if enabled)
db_manager.store_conversation_memory(channel_id, user_id, content, context, score)
db_manager.store_user_memory(user_id, memory_type, content, score)
# Retrieval
conversations = db_manager.get_conversation_context(channel_id, hours=24)
user_memories = db_manager.get_user_context(user_id)
# Cleanup
db_manager.cleanup_old_memories(days=30)
```
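To make the cleanup call more concrete, here is a minimal sketch of what an age-based cleanup could look like on the SQLite backend. It is an assumption, not the project's implementation: the table name `conversation_memory`, the UTC text timestamp format, and the extra `now` parameter (added only to make the example testable) are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional


def cleanup_old_memories(
    conn: sqlite3.Connection, days: int = 30, now: Optional[datetime] = None
) -> int:
    """Delete conversation memories older than `days`; return rows removed."""
    now = now or datetime.now(timezone.utc)
    # Timestamps are assumed to be UTC text in SQLite's "YYYY-MM-DD HH:MM:SS"
    # format, which sorts chronologically under plain string comparison.
    cutoff = (now - timedelta(days=days)).strftime("%Y-%m-%d %H:%M:%S")
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "DELETE FROM conversation_memory WHERE timestamp < ?", (cutoff,)
        )
    return cursor.rowcount
```

Passing the cutoff as a bound parameter rather than formatting it into the SQL string keeps the query safe and lets SQLite reuse the prepared statement.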
## Troubleshooting
### Migration Issues
- **File not found errors:** Ensure you're running from the bot root directory
- **Permission errors:** Check file permissions and disk space
- **Data corruption:** Restore from backup and try again
### Runtime Issues
- **SQLite locked:** Another process may be using the database
- **Memory disabled:** Check `memory_enabled` setting in `settings.yml`
- **Import errors:** Ensure all new files are in the `src/` directory
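If "database is locked" errors keep appearing, one common mitigation is to open connections with a busy timeout and enable write-ahead logging, so readers do not block the writer. The helper below is a sketch using standard `sqlite3` features; the function name `open_connection` and the 10-second default are hypothetical, and whether the bot's own connection code already does this is not stated here.

```python
import sqlite3


def open_connection(db_path: str, busy_timeout_s: float = 10.0) -> sqlite3.Connection:
    """Open a connection that waits for locks and uses write-ahead logging."""
    # `timeout` makes SQLite wait for a held lock to clear for up to this
    # many seconds before raising "database is locked".
    conn = sqlite3.connect(db_path, timeout=busy_timeout_s)
    # WAL mode lets readers proceed while a single writer is active.
    conn.execute("PRAGMA journal_mode=WAL")
    return conn
```

WAL mode is stored in the database file, so it stays in effect for later connections once set.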
### Performance
- **Slow queries:** If queries slow down as data grows, switch to the SQLite backend; it performs much better than JSON on large datasets
- **Memory usage:** SQLite is more memory-efficient than loading entire JSON files
- **Concurrent access:** Only SQLite supports safe concurrent access
## Backup and Recovery
### Automatic Backups
- Migration script creates timestamped backups
- Original JSON files are preserved
### Manual Backup
```bash
# SQLite database
cp src/data/bot_database.db src/data/bot_database.db.backup
# JSON files (if using JSON backend)
cp src/user_profiles.json src/user_profiles.json.backup
cp src/memory.json src/memory.json.backup
```
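Copying the database file with `cp` is simplest when the bot is stopped. If a backup is needed while the bot is running, Python's `sqlite3.Connection.backup` method can take a consistent snapshot instead. The sketch below assumes Python 3.7 or later; the function name `backup_database` is hypothetical.

```python
import sqlite3


def backup_database(source_path: str, backup_path: str) -> None:
    """Write a consistent snapshot of the source database to backup_path."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(backup_path)
    try:
        # Copies every page of the source database, even while other
        # connections are using it.
        source.backup(target)
    finally:
        target.close()
        source.close()


# Example call, using the paths from the shell commands above:
# backup_database("src/data/bot_database.db", "src/data/bot_database.db.backup")
```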
### Recovery
1. Stop the bot
2. Replace corrupted database with backup
3. Restart the bot
4. Run migration again if needed
## Future Extensions
The database abstraction layer is designed to support additional backends:
- **PostgreSQL**: For large-scale deployments
- **ChromaDB**: For advanced semantic memory search
- **Redis**: For high-performance caching
These can be added by implementing the `DatabaseBackend` interface in `database.py`.
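As a sketch of what implementing that interface might involve, the example below defines a cut-down abstract base class and a toy in-memory backend. The method names are borrowed from the `db_manager` calls in the API Reference, but the real `DatabaseBackend` in `database.py` may declare different or additional methods; `InMemoryBackend` is purely illustrative.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class DatabaseBackend(ABC):
    """Hypothetical subset of the interface; the real one may differ."""

    @abstractmethod
    def get_user_profile(self, user_id: str) -> Optional[Dict[str, Any]]:
        """Return the stored profile, or None if the user is unknown."""

    @abstractmethod
    def store_user_profile(self, user_id: str, profile_data: Dict[str, Any]) -> None:
        """Create or overwrite the profile for user_id."""


class InMemoryBackend(DatabaseBackend):
    """Toy backend that keeps profiles in a dict (stand-in for a new backend)."""

    def __init__(self) -> None:
        self._profiles: Dict[str, Dict[str, Any]] = {}

    def get_user_profile(self, user_id: str) -> Optional[Dict[str, Any]]:
        return self._profiles.get(user_id)

    def store_user_profile(self, user_id: str, profile_data: Dict[str, Any]) -> None:
        # Store a copy so later changes by the caller do not leak in.
        self._profiles[user_id] = dict(profile_data)


backend: DatabaseBackend = InMemoryBackend()
backend.store_user_profile("42", {"nickname": "milo"})
print(backend.get_user_profile("42"))
```

Because callers depend only on the abstract class, a new backend can be swapped in without changing the code that uses it.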