Files
llama.cpp/tools/mtmd/debug/mtmd-debug.md
admin 8e5a449007
Some checks are pending
Copilot Setup Steps / copilot-setup-steps (push) Waiting to run
Check Pre-Tokenizer Hashes / pre-tokenizer-hashes (push) Waiting to run
Python check requirements.txt / check-requirements (push) Waiting to run
Python Type-Check / python type-check (push) Waiting to run
Update Operations Documentation / update-ops-docs (push) Waiting to run
llama.cpp verification source 2026-05-22
2026-05-22 16:44:08 +08:00

607 B

mtmd-debug

Debugging encode pass

Example of debugging an input gray image (raw, not preprocessed):

from transformers import AutoModel

model = AutoModel.from_pretrained(...)

def test_vision():
  img_size = 896 # number of patches per side
  pixel_values = torch.zeros(1, 3, img_size, img_size) + 0.5 # gray image
  with torch.no_grad():
    outputs = model.model.get_image_features(pixel_values=pixel_values)
  print("last_hidden_state shape:", outputs.last_hidden_state.shape)
  print("last_hidden_state:", outputs.last_hidden_state)

test_vision()

Debugging preprocess pass

(TODO)