# Installation Guide

## Overview
This guide covers installing the File Organizer system for deployment and administration.
## System Requirements

### Minimum Requirements
- Python 3.11 or higher
- 4GB RAM
- 10GB disk space (for models and application)
- Docker (optional, but recommended)
- Docker Compose 1.29+ (if using Docker)
### Recommended Requirements
- Python 3.11 or higher
- 8GB+ RAM
- 20GB+ disk space
- Modern Linux distribution (Ubuntu 20.04+) or macOS
- Docker and Docker Compose
## Installation Methods

### Method 1: Docker (Recommended)

#### Prerequisites
- Docker 20.10+
- Docker Compose 1.29+
#### Steps
1. Clone the repository:
2. Configure environment (see Configuration Guide)
3. Start services:
4. Access the web UI:
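The steps above can be sketched as follows; the repository URL, web UI port, and presence of a `.env.example` template are assumptions, not confirmed details of this project:

```bash
# 1. Clone the repository (URL is an assumption)
git clone https://github.com/example/file-organizer.git
cd file-organizer

# 2. Configure environment from a template (file name is an assumption;
#    see the Configuration Guide for the actual settings)
cp .env.example .env

# 3. Build and start all services in the background
#    (use `docker-compose` instead of `docker compose` with Compose v1)
docker compose up -d --build

# 4. The web UI is then available in a browser, e.g. http://localhost:8000
#    (port is an assumption)
```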
### Method 2: Manual Installation

#### Prerequisites
- Python 3.11+
- pip package manager
- Virtual environment tool (venv or poetry)
#### Steps
1. Clone the repository:
2. Create virtual environment:
3. Install dependencies:
4. Install Ollama (for AI models):

    ```bash
    # macOS/Linux
    curl -fsSL https://ollama.ai/install.sh | sh

    # Pull required models
    ollama pull qwen2.5:3b-instruct-q4_K_M
    ollama pull qwen2.5vl:7b-q4_K_M
    ```

5. Start the application:
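The non-Ollama steps above might look like the following; the repository URL, requirements file name, and application entry point are assumptions:

```bash
# 1. Clone the repository (URL is an assumption)
git clone https://github.com/example/file-organizer.git
cd file-organizer

# 2. Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate

# 3. Install dependencies (file name is an assumption; use
#    `poetry install` instead if the project is poetry-managed)
pip install -r requirements.txt

# 5. Start the application (entry point is an assumption)
python -m file_organizer
```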
## Audio Processing Prerequisites
Audio transcription and metadata extraction require additional system dependencies beyond the Python packages.
### FFmpeg
FFmpeg is required for audio format conversion (e.g. .m4a to .wav) and preprocessing before transcription.
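FFmpeg is typically available from the system package manager:

```bash
# Debian/Ubuntu
sudo apt-get update && sudo apt-get install -y ffmpeg

# macOS (Homebrew)
brew install ffmpeg

# Confirm it is on your PATH
ffmpeg -version
```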
Alternatively, download from ffmpeg.org and add to your PATH.
> **Note:** FFmpeg is required for any audio format other than raw .wav. Without it, audio files in formats like .mp3, .m4a, .flac, and .ogg cannot be processed.
### GPU Acceleration (Optional)

Audio transcription uses faster-whisper, which benefits from GPU acceleration. CPU inference works but is significantly slower.
#### NVIDIA CUDA
For NVIDIA GPUs, install the CUDA Toolkit and cuDNN:
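A sketch for Ubuntu; the CUDA wheel index (`cu121`) is an example, and the right choice depends on your driver and CUDA version:

```bash
# Ubuntu: install the CUDA toolkit from the distribution repositories
sudo apt-get install -y nvidia-cuda-toolkit

# cuDNN is distributed by NVIDIA; follow the download instructions at
# https://developer.nvidia.com/cudnn for your CUDA version

# Install a CUDA-enabled PyTorch build (cu121 index is an example)
pip install torch --index-url https://download.pytorch.org/whl/cu121
```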
#### Verifying GPU Support in PyTorch

```bash
python3 -c "import torch; print('CUDA:', torch.cuda.is_available()); print('cuDNN:', torch.backends.cudnn.version())"
```
> **Tip:** CPU-only inference works out of the box; GPU acceleration is optional. Apple Silicon users get hardware acceleration via MPS automatically.
### Installing the Audio Pack
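Assuming the project exposes the audio dependencies as an optional extra (the extra name `audio` is an assumption), installation would look like:

```bash
# Install the project with its optional audio dependencies
pip install ".[audio]"

# Or, if the project is poetry-managed:
poetry install --extras audio
```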
This installs the following packages:
| Package | Version | Purpose |
|---|---|---|
| faster-whisper | >= 1.0.0 | Speech-to-text transcription |
| torch | >= 2.1.0 | GPU acceleration for transcription |
| mutagen | >= 1.47.0 | Audio metadata extraction |
| tinytag | >= 1.10.0 | Lightweight metadata fallback |
| pydub | >= 0.25.0 | Audio format manipulation |
| ffmpeg-python | >= 0.2.0 | FFmpeg Python bindings |
### Fallback Behavior

> **Warning:** The torch package is approximately 2 GB. For CPU-only environments where download size is a concern, install the CPU-only variant:
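The CPU-only wheel is available from PyTorch's CPU package index:

```bash
# CPU-only PyTorch build (much smaller download than the default CUDA build)
pip install torch --index-url https://download.pytorch.org/whl/cpu
```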
If the audio pack is not installed, audio files (.mp3, .wav, .flac, .m4a, .ogg) are still detected and moved by the organizer but will not be transcribed or analyzed for content.
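Whether the audio pack is present can be checked at runtime by probing for the packages from the table above (this check is a sketch, not part of the organizer itself):

```bash
# Report whether all optional audio packages are importable
python3 - <<'EOF'
import importlib.util

pkgs = ["faster_whisper", "torch", "mutagen", "tinytag", "pydub"]
missing = [p for p in pkgs if importlib.util.find_spec(p) is None]

if missing:
    print("Audio support disabled; missing:", ", ".join(missing))
else:
    print("Audio support enabled")
EOF
```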
### Verifying Audio Support

```bash
# Verify FFmpeg
ffmpeg -version

# Verify faster-whisper
python3 -c "from faster_whisper import WhisperModel; print('faster-whisper OK')"

# Verify audio metadata
python3 -c "import mutagen; print('mutagen OK')"

# Verify torch device
python3 -c "import torch; print('Device:', 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu')"
```
## Verification

### Docker Verification
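With the stack running, a quick check might look like this; the health endpoint matches the manual verification below, but the port is otherwise an assumption (use `docker-compose` instead of `docker compose` with Compose v1):

```bash
# Confirm all containers are up
docker compose ps

# Tail recent application logs
docker compose logs --tail=50

# Probe the API health endpoint
curl http://localhost:8000/api/v1/health
```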
### Manual Installation Verification

```bash
# Verify Ollama is running
ollama ps

# Check available models
ollama list

# Test the API
curl http://localhost:8000/api/v1/health
```
## Next Steps
- See Deployment Guide for production setup
- See Configuration Guide for customization
- See Monitoring Guide for monitoring and maintenance