feat: fix Chinese text search and duplicate chunk_id bug
- Add helper functions to extract text from nested content structure
- Update SearchResult to include uuid field
- Add PostgreSQL function get_chunk_by_chunk_id_and_uuid to handle duplicate chunk_ids
- Update Qdrant search functions to extract uuid from payload
- Change embedding model to nomic-embed-text-v2-moe:latest
- Update Qdrant collection name to momentry_rule1
- Fix MongoDB authentication and disable cache for development
- Improve error handling in processor.rs
- Update documentation with new embedding model
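
The duplicate chunk_id fix can be pictured as moving from a single-column key to a composite (chunk_id, uuid) key. The following is only a minimal in-memory sketch of that idea, not the code from this commit: the `Chunk` struct, `index_chunks`, and `get_chunk` are hypothetical names, and the real lookup is done by the PostgreSQL function get_chunk_by_chunk_id_and_uuid, whose body is not shown here.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified chunk record; the real schema is not shown in this commit.
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub chunk_id: String,
    pub uuid: String,
    pub text: String,
}

/// Index chunks by the (chunk_id, uuid) pair, so that two records sharing a
/// chunk_id no longer overwrite each other.
pub fn index_chunks(chunks: Vec<Chunk>) -> HashMap<(String, String), Chunk> {
    chunks
        .into_iter()
        .map(|c| ((c.chunk_id.clone(), c.uuid.clone()), c))
        .collect()
}

/// Look up one chunk by its composite key; returns None if the pair is unknown.
pub fn get_chunk<'a>(
    index: &'a HashMap<(String, String), Chunk>,
    chunk_id: &str,
    uuid: &str,
) -> Option<&'a Chunk> {
    index.get(&(chunk_id.to_string(), uuid.to_string()))
}
```

With a key on chunk_id alone, the second of two records with the same chunk_id would silently replace the first; with the pair as the key, both survive and the uuid taken from the search payload selects the right one.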
@@ -18,7 +18,7 @@ MOMENTRY_WORKER_BATCH_SIZE=5
 DATABASE_URL=postgres://accusys@localhost:5432/momentry
 
 # MongoDB
-MONGODB_URL=mongodb://accusys:Test3200Test3200@localhost:27017/admin
+MONGODB_URL=mongodb://localhost:27017
 MONGODB_DATABASE=momentry
 
 # Redis
@@ -28,7 +28,7 @@ REDIS_PASSWORD=accusys
 # Qdrant Vector Database (same as production)
 QDRANT_URL=http://localhost:6333
 QDRANT_API_KEY=Test3200Test3200Test3200
-QDRANT_COLLECTION=chunks_v3
+QDRANT_COLLECTION=momentry_rule1
 
 # Paths
 MOMENTRY_OUTPUT_DIR=/Users/accusys/momentry/output_dev
@@ -51,7 +51,7 @@ MOMENTRY_CUT_TIMEOUT=3600
 MOMENTRY_DEFAULT_TIMEOUT=7200
 
 # Cache Settings
-MONGODB_CACHE_ENABLED=true
+MONGODB_CACHE_ENABLED=false
 MONGODB_CACHE_TTL_VIDEOS=300
 MONGODB_CACHE_TTL_SEARCH=300
 MONGODB_CACHE_TTL_HYBRID_SEARCH=600
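
As an illustration of how the cache keys in this hunk might be consumed, here is a hedged sketch of a settings loader. It is not taken from the repository: `CacheSettings`, `load_cache_settings`, and the two parse helpers are hypothetical names, the fallback defaults are assumptions, and the TTL values are assumed to be seconds. The loader takes a lookup closure instead of reading the process environment directly, which keeps it easy to test.

```rust
use std::time::Duration;

/// Hypothetical settings struct mirroring the MONGODB_CACHE_* keys.
#[derive(Debug, PartialEq)]
pub struct CacheSettings {
    pub enabled: bool,
    pub ttl_videos: Duration,
    pub ttl_search: Duration,
    pub ttl_hybrid_search: Duration,
}

/// Accepts "true"/"false" (case-insensitive) or "1"/"0"; anything else yields the default.
fn parse_bool(raw: Option<String>, default: bool) -> bool {
    match raw.as_deref().map(str::trim) {
        Some(v) if v.eq_ignore_ascii_case("true") || v == "1" => true,
        Some(v) if v.eq_ignore_ascii_case("false") || v == "0" => false,
        _ => default,
    }
}

/// Parses a TTL, assumed to be in seconds; missing or invalid values yield the default.
fn parse_secs(raw: Option<String>, default: u64) -> Duration {
    Duration::from_secs(raw.and_then(|v| v.trim().parse().ok()).unwrap_or(default))
}

/// Builds the settings from any key lookup. The defaults below are assumptions
/// chosen to match the values in the diff, not documented project defaults.
pub fn load_cache_settings<F>(lookup: F) -> CacheSettings
where
    F: Fn(&str) -> Option<String>,
{
    CacheSettings {
        enabled: parse_bool(lookup("MONGODB_CACHE_ENABLED"), false),
        ttl_videos: parse_secs(lookup("MONGODB_CACHE_TTL_VIDEOS"), 300),
        ttl_search: parse_secs(lookup("MONGODB_CACHE_TTL_SEARCH"), 300),
        ttl_hybrid_search: parse_secs(lookup("MONGODB_CACHE_TTL_HYBRID_SEARCH"), 600),
    }
}
```

In a running service the same function could be called as `load_cache_settings(|key| std::env::var(key).ok())`, so that setting MONGODB_CACHE_ENABLED=false, as this commit does for development, switches the cache off without a code change.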