How CosmaSense Works

A deep dive into the technology that powers local-first AI search

The Complete Pipeline

CosmaSense runs your files through a six-stage pipeline that turns them into searchable, semantically indexed content. Here's how each stage works:

File Discovery & Monitoring

CosmaSense starts by scanning the directories you've configured for indexing. It then uses native file system watchers to detect new, modified, or deleted files in real time.

Key Technologies:

Native file system watchers (FSEvents on macOS, inotify on Linux)
Efficient directory traversal with configurable exclusion patterns
Smart filtering to ignore temporary files, build artifacts, and hidden files
Change detection using file hashes and modification timestamps
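
As a rough illustration of the change-detection idea above, the sketch below walks a directory, skips excluded entries, and compares modification timestamps and content hashes against a previous snapshot. It is a minimal sketch, not CosmaSense's actual implementation: the exclusion patterns, function names, and snapshot format are all assumptions made for this example, and it polls rather than using FSEvents or inotify.

```python
import fnmatch
import hashlib
import os

# Hypothetical exclusion patterns; the real defaults are not documented here.
EXCLUDE_PATTERNS = ["*.tmp", "*.swp", "node_modules", "target", "__pycache__"]


def is_excluded(name):
    """Return True for hidden entries and names matching an exclusion pattern."""
    if name.startswith("."):
        return True
    return any(fnmatch.fnmatch(name, pat) for pat in EXCLUDE_PATTERNS)


def file_hash(path):
    """SHA-256 of the file contents, read in blocks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def scan(root, previous):
    """Walk `root` and compare against `previous` ({path: (mtime_ns, hash)}).

    Returns (current_state, new, modified, deleted).
    """
    current, new, modified = {}, [], []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if not is_excluded(d)]
        for name in filenames:
            if is_excluded(name):
                continue
            path = os.path.join(dirpath, name)
            mtime = os.stat(path).st_mtime_ns
            old = previous.get(path)
            if old is not None and old[0] == mtime:
                # Timestamp unchanged: reuse the stored hash and skip re-reading.
                current[path] = old
                continue
            digest = file_hash(path)
            current[path] = (mtime, digest)
            if old is None:
                new.append(path)
            elif old[1] != digest:
                modified.append(path)
    deleted = [p for p in previous if p not in current]
    return current, new, modified, deleted
```

Checking the timestamp first keeps the common case cheap: a file is only re-read and re-hashed when its modification time differs from the stored one.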

Content Extraction & Parsing

Once a file is discovered, CosmaSense extracts its textual content with a parser specific to the file type, so that metadata and formatting are preserved where relevant.

Supported File Types:

Documents:

PDF (text extraction)
Microsoft Office (Word, Excel, PowerPoint)
LibreOffice formats (ODT, ODS, ODP)
Plain text and Markdown

Code & Configuration:

Source code (Python, JavaScript, Java, Go, Rust, etc.)
Configuration files (JSON, YAML, TOML, XML)
Shell scripts and batch files
Web files (HTML, CSS)
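
One common way to organize per-type parsers is a registry keyed by file extension. The sketch below shows that pattern using only the Python standard library, covering plain text, HTML, and JSON. The registry, the function names, and the choice to skip unsupported types are assumptions for illustration; binary formats such as PDF or Office documents would need dedicated libraries that are not shown here.

```python
import json
from html.parser import HTMLParser
from pathlib import Path


class _TextCollector(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def extract_plain(raw):
    """Plain text, Markdown, and source code are indexed as-is."""
    return raw


def extract_html(raw):
    """Return the visible text of an HTML document, joined by spaces."""
    collector = _TextCollector()
    collector.feed(raw)
    collector.close()
    return " ".join(collector.parts)


def extract_json(raw):
    """Re-serialize so keys and values appear as consistently formatted text."""
    return json.dumps(json.loads(raw), indent=2, sort_keys=True)


# Hypothetical registry mapping file extension to extractor.
EXTRACTORS = {
    ".txt": extract_plain,
    ".md": extract_plain,
    ".py": extract_plain,
    ".html": extract_html,
    ".json": extract_json,
}


def extract_text(path):
    """Return extracted text for `path`, or None if the type is unsupported."""
    extractor = EXTRACTORS.get(Path(path).suffix.lower())
    if extractor is None:
        return None
    raw = Path(path).read_text(encoding="utf-8", errors="replace")
    return extractor(raw)
```

Adding support for a new format in this design means registering one more function, without touching the dispatch logic.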

AI-Powered Summarization

The extracted content is processed by a local Large Language Model (LLM) to generate concise summaries. These summaries capture the key concepts and main ideas, making search results more informative.

How It Works:

Runs entirely on your local machine - no cloud API calls
Uses efficient quantized models optimized for CPU inference
Generates both short (1-2 sentences) and detailed summaries
Extracts key entities, topics, and themes from content

Vector Embedding Generation

This is the most critical step for semantic search: text is converted into high-dimensional vectors that capture its meaning. Documents with similar meanings have vectors that lie close together in this vector space.

Technical Details:

Uses state-of-the-art sentence transformer models
Generates 384- or 768-dimensional vectors, depending on model choice
Processes text in chunks to handle large documents
Local inference with ONNX runtime for optimal performance
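
To make the chunking step concrete, here is a minimal sketch of splitting text into overlapping word windows before each window is passed to an embedding model. The word-based splitting, the default sizes, and the function name are assumptions for this example; a real pipeline would more likely count model tokens and might split on sentence boundaries.

```python
def chunk_words(text, chunk_size=200, overlap=40):
    """Split `text` into windows of `chunk_size` words sharing `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some words twice.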

Indexing & Storage

All extracted information - raw text, summaries, embeddings, and metadata - is stored in an optimized local database designed for fast retrieval.

Storage Architecture:

SQLite for structured metadata and full-text search indices
Specialized vector index or database (e.g., FAISS, Qdrant) for similarity search
Incremental updates to minimize reindexing overhead
Compression to reduce storage requirements
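
The incremental-update idea can be sketched with Python's built-in sqlite3 module: a row is rewritten only when the file's content hash has changed. The table name, column names, and helper functions below are hypothetical, chosen for illustration, and the sketch covers only the metadata side, not the full-text or vector indices.

```python
import sqlite3

# Hypothetical schema; CosmaSense's real table layout is not documented here.
SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    path         TEXT PRIMARY KEY,
    content_hash TEXT NOT NULL,
    mtime_ns     INTEGER NOT NULL,
    summary      TEXT
)
"""


def open_index(db_path=":memory:"):
    """Open (or create) the metadata database and ensure the table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    return conn


def upsert_file(conn, path, content_hash, mtime_ns, summary):
    """Insert or refresh one file's row; return True if the row was written."""
    row = conn.execute(
        "SELECT content_hash FROM files WHERE path = ?", (path,)
    ).fetchone()
    if row is not None and row[0] == content_hash:
        return False  # unchanged content: nothing to reindex
    conn.execute(
        "INSERT OR REPLACE INTO files (path, content_hash, mtime_ns, summary) "
        "VALUES (?, ?, ?, ?)",
        (path, content_hash, mtime_ns, summary),
    )
    conn.commit()
    return True
```

A caller can use the return value to decide whether the more expensive steps, such as summarization and embedding, need to run again for that file.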

Hybrid Search Execution

When you search, CosmaSense combines multiple search strategies to deliver the most relevant results, balancing semantic understanding with traditional keyword matching.

Search Strategy:

Vector Search: Finds semantically similar content using cosine similarity
Full-Text Search: Traditional keyword matching with BM25 ranking
Result Fusion: Combines and ranks results from both methods
Re-ranking: Optional LLM-based re-ranking for enhanced relevance
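
The sketch below illustrates two pieces of this strategy: cosine similarity between embedding vectors, and one widely used way to fuse ranked lists, reciprocal rank fusion (RRF). Whether CosmaSense uses RRF specifically is an assumption; the page only says that results from both methods are combined. The function names and the constant k=60 (a conventional default for RRF) are likewise choices made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors `a` and `b` (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document ids; each hit contributes 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id for stability.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["a", "b", "c"]   # ids ordered by cosine similarity
keyword_hits = ["b", "d", "a"]  # ids ordered by BM25 score
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Because RRF works on ranks rather than raw scores, it sidesteps the problem that cosine similarities and BM25 scores live on different scales. In the example, "b" ends up first because it ranks highly in both lists.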

Architecture Components

CosmaSense is built with a modular architecture that separates concerns and allows for flexible deployment options.

Backend Core

The heart of CosmaSense, handling all indexing, search, and AI operations. Written in Rust for performance and safety.

Indexing engine and file watchers
Vector and full-text search implementation
REST API server for client interfaces

CLI Tool

Command-line interface for power users and automation. Perfect for scripting and CI/CD integration.

Index management commands
Query and search operations
Configuration and maintenance

Terminal UI (TUI)

Interactive terminal interface with rich visualizations. Built with modern TUI frameworks for an intuitive experience.

Real-time search with instant previews
Interactive result browsing
Index statistics and monitoring

macOS Native App

Native macOS GUI built with SwiftUI. Integrates with macOS features such as Spotlight.

Native macOS design language
Menu bar integration
Global keyboard shortcuts

Ready to Try It?

Experience the power of local-first AI search. Download CosmaSense and start exploring your files like never before.
