FTS5 with Porter stemming treats 'kindness' and 'kind' as the same root word,
which caused stemmed matches to rank equally with exact matches. This adds a
secondary relevance boost on top of BM25 to prioritize exact matches.
Relevance scoring now:
- BM25 base score (from FTS5)
- +100 for exact phrase match in verse text
- +50 per exact word match (e.g., 'kindness' exactly)
- +10 per partial/stemmed match (e.g., 'kind' via stemming)
Example: Searching for 'kindness'
- Verses with 'kindness': BM25 + 150 (phrase + word)
- Verses with 'kind': BM25 + 10 (partial match)
This ensures exact matches appear first while still benefiting from Porter
stemming to find all word variations.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
The search index build was extremely slow (25 v/s) due to individual INSERT
statements without transactions. This refactors to use batch inserts with
SQLite transactions, which should increase speed by 100-1000x.
Changes:
- Add insertVersesBatch() method in searchDatabase.js
- Use BEGIN TRANSACTION / COMMIT for batch operations
- Collect all verses for a version, then insert in one transaction
- Removed per-verse insert calls from buildSearchIndex.js
- Batch insert ~31,000 verses per version in single transaction
Expected performance improvement:
- Before: 25 verses/second (~82 minutes for 124k verses)
- After: 2,000-5,000 verses/second (~30-60 seconds for 124k verses)
SQLite is optimized for batched transactions - this is the standard pattern
for bulk data loading.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Ensures the data directory exists before attempting to open the SQLite database
during Docker image build. This fixes SQLITE_CANTOPEN error during build-search-index.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>