feat: Phase 2.6 edges migration to Qdrant (TKG-only architecture)
Phase 2.6.1: co_occurrence_edges migration
- build_co_occurrence_edges_from_qdrant()
- Qdrant embeddings → frame grouping → YOLO objects
- Result: 6679 edges (vs 6701 PostgreSQL)

Phase 2.6.2: face_face_edges migration
- build_face_face_edges_from_qdrant()
- Qdrant embeddings → frame grouping → face pairs
- mutual_gaze detection preserved
- Result: 6 edges (exact match)

Phase 2.6.3: speaker_face_edges migration
- build_speaker_face_edges_from_qdrant()
- Qdrant embeddings → trace_id frame ranges
- SPEAKS_AS edge creation

Architecture:
- All edges use Qdrant payload (no face_detections queries)
- PostgreSQL fallback for empty Qdrant
- Estimated 3.6x performance improvement

Testing:
- Playground (3003): ✓ All Phase 2.6 logs verified
- Edge counts: ✓ Close match with PostgreSQL
- Fallback: ✓ Working

Docs:
- docs_v1.0/DESIGN/TKG_PHASE2_6_EDGES_MIGRATION.md
- docs_v1.0/M4_workspace/2026-06-21_phase2_6_test.md
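The "frame grouping" step named in Phase 2.6.1 and the "PostgreSQL fallback for empty Qdrant" rule can be illustrated with a minimal sketch. This is not the commit's build_co_occurrence_edges_from_qdrant(); the payload keys `frame` and `label`, both function names, and the `load_from_postgres` callable are hypothetical names chosen for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def build_co_occurrence_edges(payloads):
    """Count how often two object labels appear in the same frame.

    `payloads` is a list of dicts with hypothetical keys "frame" and "label",
    standing in for whatever the Qdrant payload actually stores.
    """
    # Step 1: group labels by the frame they were detected in.
    labels_by_frame = defaultdict(set)
    for payload in payloads:
        labels_by_frame[payload["frame"]].add(payload["label"])

    # Step 2: every unordered pair of labels in one frame is one co-occurrence.
    edge_counts = defaultdict(int)
    for labels in labels_by_frame.values():
        for a, b in combinations(sorted(labels), 2):
            edge_counts[(a, b)] += 1
    return dict(edge_counts)


def build_edges_with_fallback(qdrant_payloads, load_from_postgres):
    """Use Qdrant payloads when present, otherwise call the fallback loader."""
    if not qdrant_payloads:
        # Assumed fallback shape: a zero-argument callable returning payload dicts.
        return build_co_occurrence_edges(load_from_postgres())
    return build_co_occurrence_edges(qdrant_payloads)
```

Sorting the labels before pairing keeps each edge key in one canonical order, so ("dog", "person") and ("person", "dog") are never counted separately.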
@@ -170,7 +170,7 @@ class SelfASRXFixed:
 
     def process(self, audio_path, output_path=None, file_uuid=None,
                 max_speakers=10, quality_threshold=0.85,
-                checkpoint_path=None):
+                checkpoint_path=None, asr_segments=None):
         """7-step speaker diarization pipeline
 
         Args:
@@ -180,6 +180,7 @@ class SelfASRXFixed:
             max_speakers: maximum number of speakers
             quality_threshold: threshold for high-quality voiceprints (0-1)
             checkpoint_path: path where the checkpoint is saved after Step 3 completes
+            asr_segments: external ASR segments (from asr.json); skips Step 1
 
         Returns:
             dict: segments, speaker_stats, n_speakers, total_duration, references
@@ -194,16 +195,21 @@ class SelfASRXFixed:
         print(f" Audio: {total_duration:.2f}s, {sample_rate}Hz")
 
         # ── Step 1: rough localization with whisper (faster-whisper) ──
-        print("\n[Step 1] Initial whisper transcription...")
-        t1 = time.time()
-        seg_gen, info = self.whisper.transcribe(audio_path)
-        rough_segments = []
-        for seg in seg_gen:
-            rough_segments.append({"start": seg.start, "end": seg.end, "text": seg.text})
-        language = info.language if info else None
-        print(f" Rough segments: {len(rough_segments)}")
-        print(f" Language: {language}")
-        print(f" Step 1 time: {time.time() - t1:.2f}s")
+        if asr_segments:
+            print(f"\n[Step 1] Skipping whisper, using {len(asr_segments)} provided ASR segments")
+            rough_segments = asr_segments
+            language = asr_segments[0].get("language") if isinstance(asr_segments[0].get("language"), str) else None
+        else:
+            print("\n[Step 1] Initial whisper transcription...")
+            t1 = time.time()
+            seg_gen, info = self.whisper.transcribe(audio_path)
+            rough_segments = []
+            for seg in seg_gen:
+                rough_segments.append({"start": seg.start, "end": seg.end, "text": seg.text})
+            language = info.language if info else None
+            print(f" Rough segments: {len(rough_segments)}")
+            print(f" Language: {language}")
+            print(f" Step 1 time: {time.time() - t1:.2f}s")
 
         if not rough_segments:
             print("[SelfASRX] No speech detected by whisper!")
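The Step 1 branch in this hunk can be exercised in isolation with a small sketch. It mirrors the logic shown in the diff (prefer caller-supplied segments, otherwise transcribe), but the function name `select_rough_segments`, the `transcribe` callable, and the stand-in objects below are hypothetical and not part of SelfASRXFixed.

```python
from types import SimpleNamespace


def select_rough_segments(asr_segments, transcribe):
    """Return (rough_segments, language) following the Step 1 branch.

    `transcribe` is a hypothetical zero-argument callable returning
    (segment_iterable, info), the same shape the diff unpacks from
    self.whisper.transcribe(audio_path).
    """
    if asr_segments:
        # External segments win; language is read from the first segment
        # only when it is a string, as in the diff.
        first_language = asr_segments[0].get("language")
        language = first_language if isinstance(first_language, str) else None
        return asr_segments, language

    # No external segments: fall back to the transcription pass.
    seg_gen, info = transcribe()
    rough_segments = [
        {"start": seg.start, "end": seg.end, "text": seg.text} for seg in seg_gen
    ]
    language = info.language if info else None
    return rough_segments, language


def fake_transcribe():
    """Stand-in for a whisper call, used only to drive the sketch."""
    segments = iter([SimpleNamespace(start=0.0, end=1.5, text="hello")])
    return segments, SimpleNamespace(language="zh")


provided = [{"start": 0.0, "end": 1.0, "text": "hi", "language": "en"}]
from_asr = select_rough_segments(provided, fake_transcribe)
from_whisper = select_rough_segments(None, fake_transcribe)
```

Because the check is `if asr_segments:`, an empty list behaves like None and still triggers transcription, which matches the truthiness test in the diff.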