How CosmaSense Works

A deep dive into the technology that powers local-first AI search

The Complete Pipeline

CosmaSense uses a six-stage pipeline to transform your files into searchable, semantically aware content. Here's how each stage works:

1. File Discovery & Monitoring

CosmaSense starts by scanning the directories you've configured for indexing. Using native file system watchers, it then monitors those directories for new, modified, or deleted files in real time.

Key Technologies:

  • Native file system watchers (FSEvents on macOS, inotify on Linux)
  • Efficient directory traversal with configurable exclusion patterns
  • Smart filtering to ignore temporary files, build artifacts, and hidden files
  • Change detection using file hashes and modification timestamps
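
The change-detection idea above can be sketched in a few lines of Python. This is a minimal illustration rather than CosmaSense's actual implementation: it polls with os.walk instead of using FSEvents or inotify, and the exclusion patterns and function names (is_excluded, snapshot, diff) are hypothetical. The modification timestamp and size act as a cheap first check, and the content hash is only recomputed when they differ.

```python
import fnmatch
import hashlib
import os

# Hypothetical exclusion patterns; the real defaults are not documented here.
EXCLUDE_PATTERNS = ["*.tmp", "*.swp", "node_modules", "target", ".*"]


def is_excluded(name, patterns=EXCLUDE_PATTERNS):
    """Return True if a file or directory name matches any exclusion pattern."""
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)


def file_digest(path):
    """Hash file contents in blocks so large files are never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def snapshot(root, previous=None):
    """Map each non-excluded file under root to (mtime_ns, size, sha256)."""
    previous = previous or {}
    current = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if not is_excluded(d)]
        for name in filenames:
            if is_excluded(name):
                continue
            path = os.path.join(dirpath, name)
            stat = os.stat(path)
            old = previous.get(path)
            if old and old[0] == stat.st_mtime_ns and old[1] == stat.st_size:
                current[path] = old  # timestamp and size unchanged: reuse the hash
            else:
                current[path] = (stat.st_mtime_ns, stat.st_size, file_digest(path))
    return current


def diff(previous, current):
    """Classify paths as added, modified, or deleted by comparing content hashes."""
    added = sorted(p for p in current if p not in previous)
    deleted = sorted(p for p in previous if p not in current)
    modified = sorted(
        p for p in current if p in previous and current[p][2] != previous[p][2]
    )
    return added, modified, deleted
```

Calling snapshot(root) once, then snapshot(root, first) later and passing both results to diff yields the three lists of paths that would need to be indexed, reindexed, or dropped.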

2. Content Extraction & Parsing

Once a file is discovered, CosmaSense extracts its textual content using specialized parsers for different file types. This ensures that metadata and formatting are preserved where relevant.

Supported File Types:

Documents:

  • PDF (text extraction)
  • Microsoft Office (Word, Excel, PowerPoint)
  • LibreOffice formats (ODT, ODS, ODP)
  • Plain text and Markdown

Code & Configuration:

  • Source code (Python, JavaScript, Java, Go, Rust, etc.)
  • Configuration files (JSON, YAML, TOML, XML)
  • Shell scripts and batch files
  • Web files (HTML, CSS)
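
One common way to organize per-type parsers is a registry keyed by file extension. The sketch below assumes that design and covers only formats the Python standard library can read (HTML, JSON, plain text); PDF and office formats would need third-party parsers and are left out. The names PARSERS and extract_text are hypothetical, not CosmaSense's API.

```python
import json
from html.parser import HTMLParser
from pathlib import Path


class _TextCollector(HTMLParser):
    """Collect visible text from HTML, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def parse_html(raw):
    collector = _TextCollector()
    collector.feed(raw)
    collector.close()
    return " ".join(collector.parts)


def parse_json(raw):
    # Re-serialize so keys and values become uniformly formatted, searchable text.
    return json.dumps(json.loads(raw), indent=1, sort_keys=True)


def parse_plain(raw):
    return raw


# Hypothetical registry: file extension -> parser function.
PARSERS = {
    ".html": parse_html,
    ".htm": parse_html,
    ".json": parse_json,
    ".txt": parse_plain,
    ".md": parse_plain,
    ".py": parse_plain,
}


def extract_text(path):
    """Return extracted text, or None if no parser is registered for the type."""
    path = Path(path)
    parser = PARSERS.get(path.suffix.lower())
    if parser is None:
        return None
    return parser(path.read_text(encoding="utf-8", errors="replace"))
```

Supporting a new file type in this design means writing one function and adding one registry entry; unsupported files return None so the caller can skip them.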

3. AI-Powered Summarization

The extracted content is processed by a local Large Language Model (LLM) to generate concise summaries. These summaries capture the key concepts and main ideas, making search results more informative.

How It Works:

  • Runs entirely on your local machine - no cloud API calls
  • Uses efficient quantized models optimized for CPU inference
  • Generates both short (1-2 sentences) and detailed summaries
  • Extracts key entities, topics, and themes from content

4. Vector Embedding Generation

This is the most critical step for semantic search: text is converted into high-dimensional vectors that capture its meaning. Documents with similar meanings end up with vectors that are close together in this vector space.

Technical Details:

  • Uses state-of-the-art sentence transformer models
  • Generates 384- or 768-dimensional vectors, depending on model choice
  • Processes text in chunks to handle large documents
  • Local inference with ONNX Runtime for optimal performance
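
Chunking is the part of this stage that is easy to show without a model. The sketch below splits text into overlapping word windows so that a sentence cut at one chunk boundary still appears whole in the neighboring chunk. The function name and the default sizes are assumptions for illustration; real pipelines usually count model tokens rather than whitespace-separated words.

```python
def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into word windows of chunk_size that share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reaches the end of the text
    return chunks
```

Each chunk would then be embedded separately, which lets a search match the relevant passage of a long document rather than an average of the whole file.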

5. Indexing & Storage

All extracted information (raw text, summaries, embeddings, and metadata) is stored in local databases designed for fast retrieval.

Storage Architecture:

  • SQLite for structured metadata and full-text search indices
  • Specialized vector index or database (e.g., FAISS, Qdrant) for similarity search
  • Incremental updates to minimize reindexing overhead
  • Compression to reduce storage requirements
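
Incremental updates can be illustrated with the metadata side of the store. The sketch below uses Python's built-in sqlite3 module and a hypothetical single-table schema (the actual CosmaSense schema is not documented here): a file is rewritten only when its content hash differs from the stored one, so unchanged files cost a single lookup. The vector index and full-text indices are left out.

```python
import sqlite3

# Hypothetical schema for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS documents (
    path    TEXT PRIMARY KEY,
    sha256  TEXT NOT NULL,
    summary TEXT NOT NULL
)
"""


def open_index(db_path=":memory:"):
    """Open (or create) the metadata database and ensure the table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    return conn


def upsert_document(conn, path, sha256, summary):
    """Insert or refresh a row; return False if the stored hash already matches."""
    row = conn.execute(
        "SELECT sha256 FROM documents WHERE path = ?", (path,)
    ).fetchone()
    if row is not None and row[0] == sha256:
        return False  # content unchanged: skip the expensive reindexing work
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO documents (path, sha256, summary) "
            "VALUES (?, ?, ?)",
            (path, sha256, summary),
        )
    return True


def remove_document(conn, path):
    """Delete the row for a file that no longer exists on disk."""
    with conn:
        conn.execute("DELETE FROM documents WHERE path = ?", (path,))
```

In a full pipeline, the True return value would be the signal to regenerate the summary and embeddings for that file; a False return means every later stage can be skipped.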

6. Hybrid Search Execution

When you search, CosmaSense combines multiple search strategies to deliver the most relevant results, balancing semantic understanding with traditional keyword matching.

Search Strategy:

  • Vector Search: Finds semantically similar content using cosine similarity
  • Full-Text Search: Traditional keyword matching with BM25 ranking
  • Result Fusion: Combines and ranks results from both methods
  • Re-ranking: Optional LLM-based re-ranking for enhanced relevance
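
To make the vector search and fusion steps concrete, here is a minimal sketch. Cosine similarity is the measure named above. The fusion method shown is Reciprocal Rank Fusion (RRF), a widely used way to merge ranked lists; whether CosmaSense uses RRF or another scheme is an assumption, and k=60 is simply the constant commonly used with it. BM25 scoring itself is not implemented here; the keyword ranking is taken as given.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def vector_ranking(query_vec, doc_vecs):
    """Return document ids ordered from most to least similar to the query."""
    return sorted(
        doc_vecs,
        key=lambda doc_id: cosine_similarity(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )


def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked id lists; each list contributes 1 / (k + rank) per document."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Because RRF works on ranks rather than raw scores, it does not require the cosine and BM25 scores to be on comparable scales, and a document that appears in both lists naturally rises above one that appears in only one.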

Architecture Components

CosmaSense is built with a modular architecture that separates concerns and allows for flexible deployment options.

Backend Core

The heart of CosmaSense, handling all indexing, search, and AI operations. Written in Rust for performance and safety.

  • Indexing engine and file watchers
  • Vector and full-text search implementation
  • REST API server for client interfaces

CLI Tool

Command-line interface for power users and automation. Perfect for scripting and CI/CD integration.

  • Index management commands
  • Query and search operations
  • Configuration and maintenance

Terminal UI (TUI)

Interactive terminal interface with rich visualizations. Built with modern TUI frameworks for an intuitive experience.

  • Real-time search with instant previews
  • Interactive result browsing
  • Index statistics and monitoring

macOS Native App

A native macOS GUI built with SwiftUI. It integrates with macOS features like Spotlight.

  • Native macOS design language
  • Menu bar integration
  • Global keyboard shortcuts

Ready to Try It?

Experience the power of local-first AI search. Download CosmaSense and start exploring your files like never before.