browser-use - mmns-memo

browser-use

https://github.com/browser-use/browser-use

Official documentation

https://docs.browser-use.com/customize/agent/basics

DeepWiki

https://deepwiki.com/browser-use/browser-use

My fork

https://github.com/miminashi/browser-use

I wondered whether the tests need a real LLM; apparently they don't.

The tests apparently run against a mock LLM.

https://deepwiki.com/search/-llm-github-actions-llm_2e0e1c64-8fea-4994-b09b-eb98374db0d2
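To make the mock idea concrete, here is a minimal sketch. It is not browser-use's actual test code: `FakeLLM`, `invoke`, and `run_agent` are hypothetical names invented for illustration. The point is only that a loop driven by an LLM can be tested with canned replies and no network access.

```python
from dataclasses import dataclass, field


@dataclass
class FakeLLM:
    """Hypothetical stand-in for a real LLM client: replays canned replies."""
    responses: list
    calls: list = field(default_factory=list)

    def invoke(self, prompt: str) -> str:
        # Record the prompt so a test can inspect what the agent asked.
        self.calls.append(prompt)
        return self.responses[len(self.calls) - 1]


def run_agent(llm, task: str, max_steps: int = 5) -> list:
    """Hypothetical agent loop: ask the LLM for an action until it says 'done'."""
    history = []
    for step in range(max_steps):
        action = llm.invoke(f"task={task} step={step}")
        history.append(action)
        if action == "done":
            break
    return history
```

Because the replies are fixed, the test is deterministic and can run in CI without an API key.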

Why are 758 elements being evaluated even though a screenshot is captured?

Apparently the screenshot is not used for filtering DOM elements.

https://deepwiki.com/search/758-dom-info-browseruseagent-b_f285a1b1-b1db-4f94-a4c3-dfd9c196ddf8

How to output debug logs?

https://deepwiki.com/search/-info-agent-step-5-error-agent_93b69151-4740-4bba-815f-3e8e271c5f1a

code:sh

BROWSER_USE_LOGGING_LEVEL=debug uv run main.py

step_timeout and llm_timeout have no effect

The values seem to be hardcoded?

step_timeout

https://deepwiki.com/search/-debug-browseruseagent-step-6_6cec8bc6-6c86-4a0e-b939-4c1ff56f2a76

https://github.com/browser-use/browser-use/blob/59b56c1037e15868835dbf36a30b451fb6e1a7bc/browser_use/agent/service.py#L1651

llm_timeout

https://deepwiki.com/search/extract3extract-info-browserus_4630cf7d-626a-4f43-bd98-d7cf3ed51d47

https://github.com/browser-use/browser-use/blob/59b56c1037e15868835dbf36a30b451fb6e1a7bc/browser_use/tools/service.py#L678

It looks like this was fixed once in PR 2496

https://github.com/browser-use/browser-use/pull/2496/files#diff-3736f406057606f65f3ea79952b7f0ec8f8a22f25a49d7a0771f4cfbf43cbb92R1249

I was preparing a pull request, but one with the same intent had already been opened, so I left a comment on it instead

https://github.com/browser-use/browser-use/pull/3632#issuecomment-3566669135
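For reference, a minimal sketch of what "the timeout setting is honored" would look like. This is not browser-use's code: `slow_step` and `run_step_with_timeout` are hypothetical names, and the only real API used is `asyncio.wait_for`. The timeout is passed in as a parameter instead of being written as a literal at the call site.

```python
import asyncio


async def slow_step(delay: float) -> str:
    """Hypothetical step that takes `delay` seconds to finish."""
    await asyncio.sleep(delay)
    return "done"


async def run_step_with_timeout(delay: float, step_timeout: float) -> str:
    """Run one step, giving up after the caller-supplied `step_timeout` seconds."""
    try:
        # The timeout comes from the argument, not from a hardcoded constant.
        return await asyncio.wait_for(slow_step(delay), timeout=step_timeout)
    except asyncio.TimeoutError:
        return "timeout"
```

If the call site used a fixed number instead of `step_timeout`, changing the setting would have no effect, which matches the symptom noted above.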

Writing to the results.md file fails

It tries to append before the file has been created, and fails

The prompt looks like the likely cause

https://deepwiki.com/search/append-true-resultsmd-info-bro_efff2886-1278-42a9-8a2d-2614cb31c976
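A minimal sketch of an append that tolerates a missing file, in plain Python. `append_or_create` is a hypothetical helper, not part of browser-use's file tools; it only shows the behavior the agent seems to be assuming when it appends first.

```python
from pathlib import Path


def append_or_create(path: str, text: str) -> str:
    """Append `text` to `path`, creating the file (and parent dirs) if missing.

    Returns the full file contents after the write.
    """
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" creates the file when it does not exist yet.
    with p.open("a", encoding="utf-8") as f:
        f.write(text)
    return p.read_text(encoding="utf-8")
```

Since the failure seems to come from the prompt, another option is to have the prompt tell the agent to create results.md before appending to it.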

navigation back is not being called

Messages like the following occur frequently: ⚠️ Eval: Failed to navigate back to the Google search results page. The current page is still the YouTube channel page, not the search results. Verdict: Failure

Apparently the description of the go_back action is an empty string, so the LLM cannot tell that the action exists or what it is for

https://deepwiki.com/search/youtubegoogle-info-browserusea_ff0311e4-8f0a-4105-8133-7890798ecf22

The empty description was fixed in the following commit

https://github.com/browser-use/browser-use/commit/e5df80a0d699ad091e96d8452f54df0d69ffcde9

Version 0.9.6 or later should already include the fix

It seems good to specify page_extraction_llm separately from llm?

Apparently a smaller model than the main LLM is a good choice for it

https://deepwiki.com/search/agent-pageextractionllm_67402124-9f1f-4ca5-8744-e22e929367b6

How is the Agent parameter page_extraction_llm used?

The parameter does appear to be actually used internally

https://deepwiki.com/search/agent-pageextractionllm_41e246e9-ec69-4d2c-aaba-f3d8c081c9c7

Demo mode

https://deepwiki.com/search/_ed4a5751-fc98-4a43-98e0-043704911157

https://deepwiki.com/search/httpsdocsbrowserusecomcustomiz_0a8f4a41-341a-450e-b2b8-619a8f0b7983

Code reading

Flow from Agent.run to Tools.extract

https://deepwiki.com/search/agentrun-toolsextract_8b2656c0-082a-4037-8881-72104e0e5972

VLM candidates

https://huggingface.co/unsloth/Qwen3-VL-30B-A3B-Instruct-GGUF

https://huggingface.co/unsloth/Qwen3-VL-30B-A3B-Thinking-GGUF

https://huggingface.co/ggml-org/Kimi-VL-A3B-Thinking-2506-GGUF

Other

https://deepwiki.com/search/llmvlm_a451edfb-f7ca-41a6-82cd-f2d5280c77ad

https://deepwiki.com/search/judgellm_ada54470-045b-414c-8c21-920910487e4d