Warning
This article was automatically translated by OpenAI (gemini-2.5-pro-exp-03-25). It may be edited eventually, but please be aware that it may contain incorrect information at this time.
Table of Contents
- Installing llama-cpp
- Starting the OpenAI API Server
- Accessing with Spring AI
- Trying Tool Calling
- Integrating with MCP Server
We will run Google's Gemma 3 with llama.cpp and access it through its OpenAI-compatible API. We will then call this API from Spring AI and try out Tool Calling and MCP integration.
The execution environment is an Apple M4 Max (16-core CPU, 40-core GPU, 128GB Unified Memory).
Installing llama-cpp
brew install llama.cpp
This article uses the following version.
$ llama-server --version
version: 4897 (b3c9a656)
built with Apple clang version 16.0.0 (clang-1600.0.26.6) for arm64-apple-darwin24.2.0
Starting the OpenAI API Server
We will use ggml-org/gemma-3-27b-it-GGUF, a build of the model converted to GGUF format. The model is downloaded on the first run (approx. 15GB).
llama-server -hf ggml-org/gemma-3-27b-it-GGUF --jinja --port 8000
Tip
The --jinja option is required to use Tool Calling.
build: 4897 (b3c9a656) with Apple clang version 16.0.0 (clang-1600.0.26.6) for arm64-apple-darwin24.2.0
system info: n_threads = 12, n_threads_batch = 12, total_threads = 16
system_info: n_threads = 12 (n_threads_batch = 12) / 16 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
main: HTTP server is listening, hostname: 127.0.0.1, port: 8000, http threads: 15
main: loading model
srv load_model: loading model '/Users/toshiaki/Library/Caches/llama.cpp/ggml-org_gemma-3-27b-it-GGUF_gemma-3-27b-it-Q4_K_M.gguf'
common_download_file: previous metadata file found /Users/toshiaki/Library/Caches/llama.cpp/ggml-org_gemma-3-27b-it-GGUF_gemma-3-27b-it-Q4_K_M.gguf.json: {"etag":"\"0692be0ba5f82dfeb68c582e67fb0f8c-1000\"","lastModified":"Wed, 12 Mar 2025 09:33:19 GMT","url":"https://huggingface.co/ggml-org/gemma-3-27b-it-GGUF/resolve/main/gemma-3-27b-it-Q4_K_M.gguf"}
curl_perform_with_retry: Trying to download from https://huggingface.co/ggml-org/gemma-3-27b-it-GGUF/resolve/main/gemma-3-27b-it-Q4_K_M.gguf (attempt 1 of 3)...
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 98303 MiB free
llama_model_loader: loaded meta data with 40 key-value pairs and 808 tensors from /Users/toshiaki/Library/Caches/llama.cpp/ggml-org_gemma-3-27b-it-GGUF_gemma-3-27b-it-Q4_K_M.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = gemma3
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Gemma 3 27b It
llama_model_loader: - kv 3: general.finetune str = it
llama_model_loader: - kv 4: general.basename str = gemma-3
llama_model_loader: - kv 5: general.size_label str = 27B
llama_model_loader: - kv 6: general.license str = gemma
llama_model_loader: - kv 7: general.base_model.count u32 = 1
llama_model_loader: - kv 8: general.base_model.0.name str = Gemma 3 27b Pt
llama_model_loader: - kv 9: general.base_model.0.organization str = Google
llama_model_loader: - kv 10: general.base_model.0.repo_url str = https://huggingface.co/google/gemma-3...
llama_model_loader: - kv 11: general.tags arr[str,1] = ["image-text-to-text"]
llama_model_loader: - kv 12: gemma3.context_length u32 = 131072
llama_model_loader: - kv 13: gemma3.embedding_length u32 = 5376
llama_model_loader: - kv 14: gemma3.block_count u32 = 62
llama_model_loader: - kv 15: gemma3.feed_forward_length u32 = 21504
llama_model_loader: - kv 16: gemma3.attention.head_count u32 = 32
llama_model_loader: - kv 17: gemma3.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 18: gemma3.attention.key_length u32 = 128
llama_model_loader: - kv 19: gemma3.attention.value_length u32 = 128
llama_model_loader: - kv 20: gemma3.rope.freq_base f32 = 1000000.000000
llama_model_loader: - kv 21: gemma3.attention.sliding_window u32 = 1024
llama_model_loader: - kv 22: gemma3.attention.head_count_kv u32 = 16
llama_model_loader: - kv 23: gemma3.rope.scaling.type str = linear
llama_model_loader: - kv 24: gemma3.rope.scaling.factor f32 = 8.000000
llama_model_loader: - kv 25: tokenizer.ggml.model str = llama
llama_model_loader: - kv 26: tokenizer.ggml.pre str = default
llama_model_loader: - kv 27: tokenizer.ggml.tokens arr[str,262144] = ["", "", "", "", ...
llama_model_loader: - kv 28: tokenizer.ggml.scores arr[f32,262144] = [-1000.000000, -1000.000000, -1000.00...
llama_model_loader: - kv 29: tokenizer.ggml.token_type arr[i32,262144] = [3, 3, 3, 3, 3, 4, 3, 3, 3, 3, 3, 3, ...
llama_model_loader: - kv 30: tokenizer.ggml.bos_token_id u32 = 2
llama_model_loader: - kv 31: tokenizer.ggml.eos_token_id u32 = 1
llama_model_loader: - kv 32: tokenizer.ggml.unknown_token_id u32 = 3
llama_model_loader: - kv 33: tokenizer.ggml.padding_token_id u32 = 0
llama_model_loader: - kv 34: tokenizer.ggml.add_bos_token bool = true
llama_model_loader: - kv 35: tokenizer.ggml.add_eos_token bool = false
llama_model_loader: - kv 36: tokenizer.chat_template str = {{ bos_token }}\n{%- if messages[0]['r...
llama_model_loader: - kv 37: tokenizer.ggml.add_space_prefix bool = false
llama_model_loader: - kv 38: general.quantization_version u32 = 2
llama_model_loader: - kv 39: general.file_type u32 = 15
llama_model_loader: - type f32: 373 tensors
llama_model_loader: - type q4_K: 374 tensors
llama_model_loader: - type q6_K: 61 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q4_K - Medium
print_info: file size = 15.40 GiB (4.90 BPW)
load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
load: special tokens cache size = 6414
load: token to piece cache size = 1.9446 MB
print_info: arch = gemma3
print_info: vocab_only = 0
print_info: n_ctx_train = 131072
print_info: n_embd = 5376
print_info: n_layer = 62
print_info: n_head = 32
print_info: n_head_kv = 16
print_info: n_rot = 128
print_info: n_swa = 1024
print_info: n_swa_pattern = 6
print_info: n_embd_head_k = 128
print_info: n_embd_head_v = 128
print_info: n_gqa = 2
print_info: n_embd_k_gqa = 2048
print_info: n_embd_v_gqa = 2048
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-06
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: f_attn_scale = 7.7e-02
print_info: n_ff = 21504
print_info: n_expert = 0
print_info: n_expert_used = 0
print_info: causal attn = 1
print_info: pooling type = 0
print_info: rope type = 2
print_info: rope scaling = linear
print_info: freq_base_train = 1000000.0
print_info: freq_scale_train = 0.125
print_info: n_ctx_orig_yarn = 131072
print_info: rope_finetuned = unknown
print_info: ssm_d_conv = 0
print_info: ssm_d_inner = 0
print_info: ssm_d_state = 0
print_info: ssm_dt_rank = 0
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 27B
print_info: model params = 27.01 B
print_info: general.name = Gemma 3 27b It
print_info: vocab type = SPM
print_info: n_vocab = 262144
print_info: n_merges = 0
print_info: BOS token = 2 ''
print_info: EOS token = 1 ''
print_info: EOT token = 106 ''
print_info: UNK token = 3 ''
print_info: PAD token = 0 ''
print_info: LF token = 248 '<0x0A>'
print_info: EOG token = 1 ''
print_info: EOG token = 106 ''
print_info: max token length = 48
load_tensors: loading model tensors, this can take a while... (mmap = true)
load_tensors: offloading 62 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 63/63 layers to GPU
load_tensors: Metal_Mapped model buffer size = 15773.63 MiB
load_tensors: CPU_Mapped model buffer size = 1102.50 MiB
.........................................................................................
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 4096
llama_context: n_ctx_per_seq = 4096
llama_context: n_batch = 2048
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = 0
llama_context: freq_base = 1000000.0
llama_context: freq_scale = 0.125
llama_context: n_ctx_per_seq (4096) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M4 Max
ggml_metal_init: picking default device: Apple M4 Max
ggml_metal_load_library: using embedded metal library
ggml_metal_init: GPU name: Apple M4 Max
ggml_metal_init: GPU family: MTLGPUFamilyApple9 (1009)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3 (5001)
ggml_metal_init: simdgroup reduction = true
ggml_metal_init: simdgroup matrix mul. = true
ggml_metal_init: has residency sets = true
ggml_metal_init: has bfloat = true
ggml_metal_init: use bfloat = false
ggml_metal_init: hasUnifiedMemory = true
ggml_metal_init: recommendedMaxWorkingSetSize = 103079.22 MB
ggml_metal_init: skipping kernel_get_rows_bf16 (not supported)
ggml_metal_init: skipping kernel_mul_mv_bf16_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mv_bf16_f32_1row (not supported)
ggml_metal_init: skipping kernel_mul_mv_bf16_f32_l4 (not supported)
ggml_metal_init: skipping kernel_mul_mv_bf16_bf16 (not supported)
ggml_metal_init: skipping kernel_mul_mv_id_bf16_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_bf16_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_bf16_f32 (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_bf16_h64 (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_bf16_h80 (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_bf16_h96 (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_bf16_h112 (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_bf16_h128 (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_bf16_h256 (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_vec_bf16_h128 (not supported)
ggml_metal_init: skipping kernel_flash_attn_ext_vec_bf16_h256 (not supported)
ggml_metal_init: skipping kernel_cpy_f32_bf16 (not supported)
ggml_metal_init: skipping kernel_cpy_bf16_f32 (not supported)
ggml_metal_init: skipping kernel_cpy_bf16_bf16 (not supported)
llama_context: CPU output buffer size = 1.00 MiB
init: kv_size = 4096, offload = 1, type_k = 'f16', type_v = 'f16', n_layer = 62, can_shift = 1
init: Metal KV buffer size = 1984.00 MiB
llama_context: KV self size = 1984.00 MiB, K (f16): 992.00 MiB, V (f16): 992.00 MiB
llama_context: Metal compute buffer size = 535.02 MiB
llama_context: CPU compute buffer size = 26.51 MiB
llama_context: graph nodes = 2487
llama_context: graph splits = 2
common_init_from_params: setting dry_penalty_last_n to ctx_size = 4096
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
srv init: initializing slots, n_slots = 1
slot init: id 0 | task -1 | new slot n_ctx_slot = 4096
main: model loaded
main: chat template, chat_template: {{ bos_token }}
{%- if messages[0]['role'] == 'system' -%}
{%- if messages[0]['content'] is string -%}
{%- set first_user_prefix = messages[0]['content'] + '
' -%}
{%- else -%}
{%- set first_user_prefix = messages[0]['content'][0]['text'] + '
' -%}
{%- endif -%}
{%- set loop_messages = messages[1:] -%}
{%- else -%}
{%- set first_user_prefix = "" -%}
{%- set loop_messages = messages -%}
{%- endif -%}
{%- for message in loop_messages -%}
{%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}
{{ raise_exception("Conversation roles must alternate user/assistant/user/assistant/...") }}
{%- endif -%}
{%- if (message['role'] == 'assistant') -%}
{%- set role = "model" -%}
{%- else -%}
{%- set role = message['role'] -%}
{%- endif -%}
{{ '' + role + '
' + (first_user_prefix if loop.first else "") }}
{%- if message['content'] is string -%}
{{ message['content'] | trim }}
{%- elif message['content'] is iterable -%}
{%- for item in message['content'] -%}
{%- if item['type'] == 'image' -%}
{{ '
' }}
{%- elif item['type'] == 'text' -%}
{{ item['text'] | trim }}
{%- endif -%}
{%- endfor -%}
{%- else -%}
{{ raise_exception("Invalid content type") }}
{%- endif -%}
{{ '
' }}
{%- endfor -%}
{%- if add_generation_prompt -%}
{{'model
'}}
{%- endif -%}
, example_format: 'user
You are a helpful assistant
Hello
model
Hi there
user
How are you?
model
'
main: server is listening on http://127.0.0.1:8000 - starting the main loop
srv update_slots: all slots are idle
Open the simple web UI served at http://127.0.0.1:8000.
Paste the following prompt into the UI and click Submit.
Among A to D, three are honest people, and one is a liar. Who is the liar?
A: D is a liar.
B: I am not lying.
C: A is not lying.
D: B is a liar.
Yes, D is the correct answer.
Tip
Because the output varied between runs, I set the temperature to 0.
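The puzzle is small enough to check by brute force, which gives a reference answer to compare the model's reasoning against. The sketch below tries each person as the liar and keeps the cases where every honest statement is true and the liar's statement is false. The class and method names are illustrative and are not part of the sample application.

```java
import java.util.ArrayList;
import java.util.List;

public class LiarPuzzle {

    // Truth value of each person's statement, given who the liar is.
    static boolean statementIsTrue(char speaker, char liar) {
        switch (speaker) {
            case 'A': return liar == 'D'; // "D is a liar."
            case 'B': return liar != 'B'; // "I am not lying."
            case 'C': return liar != 'A'; // "A is not lying."
            case 'D': return liar == 'B'; // "B is a liar."
            default: throw new IllegalArgumentException("unknown speaker: " + speaker);
        }
    }

    // Returns every candidate liar for which all four statements are consistent.
    static List<Character> findLiars() {
        List<Character> result = new ArrayList<>();
        for (char liar : "ABCD".toCharArray()) {
            boolean consistent = true;
            for (char speaker : "ABCD".toCharArray()) {
                boolean honest = speaker != liar;
                // An honest person's statement must be true, the liar's must be false.
                if (statementIsTrue(speaker, liar) != honest) {
                    consistent = false;
                    break;
                }
            }
            if (consistent) {
                result.add(liar);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(findLiars()); // prints [D]
    }
}
```

Only D survives the check, so the puzzle has exactly one consistent solution.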
The next prompt is from the 2012 University of Tokyo Entrance Exam, Mathematics (Liberal Arts), Question 1. Paste it into the UI and click Submit.
A point (x, y) on the coordinate plane satisfies the following equation: 2x² + 4xy + 3y² + 4x + 5y - 4 = 0. Find the maximum possible value of x.
The correct answer is (-2+5√6)/4.
Next, we send the same prompts that we tried in the UI to the OpenAI-compatible API using curl.
curl -s http://localhost:8000/v1/chat/completions \
--json '{
"messages": [
{"role": "user", "content": "Among A to D, three are honest people, and one is a liar. Who is the liar?\n\nA: D is a liar.\nB: I am not lying.\nC: A is not lying.\nD: B is a liar.\n"}
], "temperature": 0
}' | jq .
The following JSON is returned.
{
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"role": "assistant",
"content": "This problem requires logical thinking to identify the liar.\n\n**1. Look for potential contradictions**\n\n* If A is honest, then D is a liar.\n* If B is honest, then B is not lying.\n* If C is honest, then A is not lying.\n* If D is honest, then B is a liar.\n\n**2. Assumption and Verification**\n\nAssume each person is the liar and verify if a contradiction arises.\n\n* **If A is the liar:** A's statement \"D is a liar\" is false, so D is honest. In this case, D's statement \"B is a liar\" is also true, meaning B is a liar. However, this doesn't match the condition of three honest people (A is a liar, B is a liar, D is honest, C is unknown).\n* **If B is the liar:** B's statement \"I am not lying\" is false, so B is lying. In this case, A, C, and D could be honest. If A's statement \"D is a liar\" is true, then D is a liar, which is a contradiction.\n* **If C is the liar:** C's statement \"A is not lying\" is false, so A is a liar. If A's statement \"D is a liar\" is true, then D is a liar, resulting in only two honest people (A and D are liars), which is a contradiction.\n* **If D is the liar:** D's statement \"B is a liar\" is false, so B is honest. A's statement \"D is a liar\" becomes true, and C's statement \"A is not lying\" also becomes true. In this case, A, B, and C are honest, and D is the liar, which matches the condition.\n\n**3. Conclusion**\n\nFrom the above verification, we can identify that **D is the liar**."
}
}
],
"created": 1742193118,
"model": "gpt-3.5-turbo",
"system_fingerprint": "b4897-b3c9a656",
"object": "chat.completion",
"usage": {
"completion_tokens": 459,
"prompt_tokens": 69,
"total_tokens": 528
},
"id": "chatcmpl-jvmZxpPmNZA3CFBA5MpxN4Unt1jyE1Ot",
"timings": {
"prompt_n": 65,
"prompt_ms": 488.227,
"prompt_per_token_ms": 7.511184615384615,
"prompt_per_second": 133.13479180790904,
"predicted_n": 459,
"predicted_ms": 22500.929,
"predicted_per_token_ms": 49.02163180827887,
"predicted_per_second": 20.399157741442586
}
}
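The timings object in the response makes it easy to read off throughput: the per-second values are the token count divided by the elapsed time in seconds. A minimal sketch of that arithmetic, using the numbers from the response above (the class and method names are illustrative):

```java
public class Throughput {

    // Tokens per second from a token count and an elapsed time in milliseconds.
    static double tokensPerSecond(int tokens, double millis) {
        return tokens / (millis / 1000.0);
    }

    public static void main(String[] args) {
        // Values copied from the "timings" object in the response above.
        System.out.println("prompt:    " + tokensPerSecond(65, 488.227) + " tokens/s");
        System.out.println("generated: " + tokensPerSecond(459, 22500.929) + " tokens/s");
    }
}
```

On this machine that works out to roughly 133 tokens per second for prompt processing and roughly 20 tokens per second for generation, matching prompt_per_second and predicted_per_second.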
One more.
curl -s http://localhost:8000/v1/chat/completions \
--json '{
"messages": [
{"role": "user", "content": "A point (x, y) on the coordinate plane satisfies the following equation: 2x² + 4xy + 3y² + 4x + 5y - 4 = 0. Find the maximum possible value of x."}
]
}' | jq .
The following JSON is returned.
{
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"role": "assistant",
"content": "The given equation is\n$$2x^2 + 4xy + 3y^2 + 4x + 5y - 4 = 0$$\nConsider this equation as a quadratic equation in $y$ and solve it.\n$$3y^2 + (4x+5)y + (2x^2+4x-4) = 0$$\nFor this quadratic equation to have real solutions, the discriminant must be non-negative.\n$$D = (4x+5)^2 - 4(3)(2x^2+4x-4) \\ge 0$$\n$$16x^2 + 40x + 25 - 24x^2 - 48x + 48 \\ge 0$$\n$$-8x^2 - 8x + 73 \\ge 0$$\n$$8x^2 + 8x - 73 \\le 0$$\nTo solve this quadratic inequality, find the roots of the quadratic equation $8x^2 + 8x - 73 = 0$.\n$$x = \\frac{-8 \\pm \\sqrt{8^2 - 4(8)(-73)}}{2(8)} = \\frac{-8 \\pm \\sqrt{64 + 2336}}{16} = \\frac{-8 \\pm \\sqrt{2400}}{16} = \\frac{-8 \\pm 20\\sqrt{6}}{16} = \\frac{-2 \\pm 5\\sqrt{6}}{4}$$\nTherefore,\n$$\\frac{-2 - 5\\sqrt{6}}{4} \\le x \\le \\frac{-2 + 5\\sqrt{6}}{4}$$\nThe maximum value of $x$ is $\\frac{-2 + 5\\sqrt{6}}{4}$.\n\nFind the maximum value of $x$.\n$$x_{max} = \\frac{-2 + 5\\sqrt{6}}{4}$$\nSince $\\sqrt{6} \\approx 2.449$,\n$$x_{max} \\approx \\frac{-2 + 5(2.449)}{4} = \\frac{-2 + 12.245}{4} = \\frac{10.245}{4} \\approx 2.561$$\n\nTherefore, the maximum possible value of x is $\\frac{-2 + 5\\sqrt{6}}{4}$.\n\nFinal Answer: The final answer is $\\boxed{\\frac{-2+5\\sqrt{6}}{4}}$"
}
}
],
"created": 1742193577,
"model": "gpt-3.5-turbo",
"system_fingerprint": "b4897-b3c9a656",
"object": "chat.completion",
"usage": {
"completion_tokens": 571,
"prompt_tokens": 64,
"total_tokens": 635
},
"id": "chatcmpl-1QirJwNtM0OpCA9asKhgkquN3F0g6VOv",
"timings": {
"prompt_n": 60,
"prompt_ms": 356.042,
"prompt_per_token_ms": 5.934033333333333,
"prompt_per_second": 168.51944433521888,
"predicted_n": 571,
"predicted_ms": 28415.924,
"predicted_per_token_ms": 49.76519089316987,
"predicted_per_second": 20.094366806442757
}
}
Accessing with Spring AI
Let's try accessing from an application using Spring AI.
Since it's OpenAI compatible, we can use Spring AI's Chat Client for OpenAI.
The sample application is here:
https://github.com/making/hello-spring-ai
As the following code shows, the GET endpoint simply passes the prompt request parameter to the Chat Client. For long prompts and responses, the POST endpoint takes the prompt as the request body and returns the response as Server-Sent Events.
public HelloController(ChatClient.Builder chatClientBuilder /* ... */) {
    this.chatClient = chatClientBuilder.build();
    // ...
}

@GetMapping(path = "/")
public String hello(@RequestParam(defaultValue = "Tell me a joke") String prompt) {
    return this.chatClient.prompt().messages().user(prompt).call().content();
}

@PostMapping(path = "/", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<String> helloStream(@RequestBody String prompt) {
    return this.chatClient.prompt()
        .messages()
        .user(prompt)
        .stream()
        .content()
        .windowUntil(s -> s.endsWith(".") || s.endsWith("。"))
        .flatMap(flux -> flux.collect(Collectors.joining()));
}
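The windowUntil and flatMap steps in helloStream buffer the streamed chunks and emit one event per sentence-like run. The plain-Java sketch below approximates that grouping without Reactor, so it is not the exact operator semantics. The class name, the method name, and the sample chunks are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SentenceWindows {

    // Groups streamed chunks so that each emitted string ends at a chunk
    // whose text ends with "." or the Japanese full stop (U+3002).
    static List<String> group(List<String> chunks) {
        List<String> events = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        for (String chunk : chunks) {
            buffer.append(chunk);
            if (chunk.endsWith(".") || chunk.endsWith("\u3002")) {
                events.add(buffer.toString()); // the boundary chunk closes the window
                buffer.setLength(0);
            }
        }
        if (buffer.length() > 0) {
            events.add(buffer.toString()); // flush what is left when the stream ends
        }
        return events;
    }

    public static void main(String[] args) {
        // Hypothetical chunking; real chunk boundaries depend on the model's tokenizer.
        List<String> chunks = List.of("It", " is", " fine", ".", " About", " 2", ".", "449", " here");
        group(chunks).forEach(e -> System.out.println("data:" + e));
    }
}
```

Because the boundary test only looks at the last character of a chunk, a chunk ending in a decimal point also closes a window. This is why numbers such as 2.449 are split across two data: lines in the streamed output.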
Build and run. Specify the llama.cpp server URL in the spring.ai.openai.base-url property.
git clone https://github.com/making/hello-spring-ai
cd hello-spring-ai
./mvnw clean package -DskipTests=true
java -jar target/hello-spring-ai-0.0.1-SNAPSHOT.jar --spring.ai.openai.base-url=http://localhost:8000 --spring.ai.openai.api-key=dummy --spring.ai.openai.chat.options.temperature=0
First, let's send a simple prompt (日本の首都はどこですか? - What is the capital of Japan?).
$ curl "http://localhost:8080/?prompt=%E6%97%A5%E6%9C%AC%E3%81%AE%E9%A6%96%E9%83%BD%E3%81%AF%E3%81%A9%E3%81%93%E3%81%A7%E3%81%99%E3%81%8B%EF%BC%9F"
The capital of Japan is **Tokyo**.
Next, send the same prompt as used in the UI.
$ curl localhost:8080 -H "Content-Type: text/plain" -d "Among A to D, three are honest people, and one is a liar. Who is the liar?\n\nA: D is a liar.\nB: I am not lying.\nC: A is not lying.\nD: B is a liar.\n"
data:This problem requires logical thinking to identify the liar.
data:
data:
data:**Reasoning**
data:
data:* Honest people tell the truth, liars tell lies.
data:
data:* Contradictory statements are likely from the liar.
data:
data:
data:**Examining each statement**
data:
data:* **A: D is a liar.**
data:
data:* **B: I am not lying.**
data:
data:* **C: A is not lying.**
data:
data:* **D: B is a liar.**
data:
data:**Case Analysis**
data:
data:1.
data: **If A is the liar:**
data: * Since A is lying, D is not a liar.
data:Meaning, D is honest.
data:
data: * Since D is honest, D's statement "B is a liar" is true.
data:
data: * Since B is a liar, B's statement "I am not lying" is false.
data:
data: * C's statement "A is not lying" is false, so C is a liar.
data:
data: * In this case, A and C are liars, leaving only two honest people, which is a contradiction.
data:
data:
data:2.
data: **If B is the liar:**
data: * Since B is lying, B is lying.
data:
data: * D's statement "B is a liar" is true, so D is honest.
data:
data: * A's statement "D is a liar" is false, so A is a liar.
data:
data: * C's statement "A is not lying" is false, so C is a liar.
data:
data: * In this case, A, B, and C are liars, leaving only one honest person, which is a contradiction.
data:
data:
data:3.
data: **If C is the liar:**
data: * Since C is lying, A is a liar.
data:
data: * Since A is a liar, A's statement "D is a liar" is false.
data:Meaning, D is honest.
data:
data: * Since D is honest, D's statement "B is a liar" is true.
data:
data: * Since B is a liar, B's statement "I am not lying" is false.
data:
data: * In this case, A, B, and C are liars, leaving only one honest person, which is a contradiction.
data:
data:
data:4.
data: **If D is the liar:**
data: * Since D is lying, B is not a liar.
data:Meaning, B is honest.
data:
data: * Since B is honest, B's statement "I am not lying" is true.
data:
data: * A's statement "D is a liar" is true, so A is honest.
data:
data: * C's statement "A is not lying" is true, so C is honest.
data:
data: * In this case, only D is the liar, and A, B, C are honest, which matches the condition.
data:
data:
data:**Conclusion**
data:
data:Therefore, the liar is **D**.
One more.
$ curl localhost:8080 -H "Content-Type: text/plain" -d "A point (x, y) on the coordinate plane satisfies the following equation: 2x² + 4xy + 3y² + 4x + 5y - 4 = 0. Find the maximum possible value of x."
data:The given equation is,
data:$$2x^2 + 4xy + 3y^2 + 4x + 5y - 4 = 0$$
data:Considering this equation as a quadratic equation in $y$, let's find the condition for real solutions to exist.
data:
data:$$3y^2 + (4x+5)y + (2x^2+4x-4) = 0$$
data:The discriminant $D$ for this quadratic equation to have real solutions is,
data:$$D = (4x+5)^2 - 4(3)(2x^2+4x-4) \ge 0$$
data:Expanding and simplifying,
data:$$16x^2 + 40x + 25 - 24x^2 - 48x + 48 \ge 0$$
data:$$-8x^2 - 8x + 73 \ge 0$$
data:$$8x^2 + 8x - 73 \le 0$$
data:To solve this quadratic inequality, we find the roots of the quadratic equation $8x^2 + 8x - 73 = 0$.
data:
data:Using the quadratic formula,
data:$$x = \frac{-8 \pm \sqrt{8^2 - 4(8)(-73)}}{2(8)} = \frac{-8 \pm \sqrt{64 + 2336}}{16} = \frac{-8 \pm \sqrt{2400}}{16} = \frac{-8 \pm 20\sqrt{6}}{16} = \frac{-2 \pm 5\sqrt{6}}{4}$$
data:Therefore,
data:$$x = \frac{-2 - 5\sqrt{6}}{4} \approx \frac{-2 - 5(2.
data:449)}{4} \approx \frac{-2 - 12.
data:245}{4} \approx \frac{-14.
data:245}{4} \approx -3.
data:561$$
data:$$x = \frac{-2 + 5\sqrt{6}}{4} \approx \frac{-2 + 5(2.
data:449)}{4} \approx \frac{-2 + 12.
data:245}{4} \approx \frac{10.
data:245}{4} \approx 2.
data:561$$
data:The range of $x$ satisfying $8x^2 + 8x - 73 \le 0$ is,
data:$$\frac{-2 - 5\sqrt{6}}{4} \le x \le \frac{-2 + 5\sqrt{6}}{4}$$
data:Therefore, the maximum value of $x$ is $\frac{-2 + 5\sqrt{6}}{4}$.
data:
data:
data:Final Answer: The final answer is $\boxed{\frac{-2+5\sqrt{6}}{4}}$
The results are similar to those returned in the UI.
Trying Tool Calling
Next, let's try Spring AI's Tool Calling.
llama.cpp supports OpenAI-style Function Calling.
With Tool Calling, we can implement Java methods that supply information the LLM does not have.
For example, the LLM does not know the current time, so when asked, it answers with an unrelated, incorrect time.
You can check the current time with the following command.
$ LANG=en date
Mon Mar 17 18:12:26 JST 2025
First, let's ask the LLM without Tool Calling.
$ curl "http://localhost:8080?prompt=what%20time%20is%20it%20now%20in%20JST"
It is currently **11:18 PM JST** on Sunday, May 12, 2024.
(JST stands for Japan Standard Time, which is GMT+9)
This is completely wrong.
Following the sample in the Spring AI documentation, we add a Tool that returns the current time.
public class DateTimeTools {

    private final Logger logger = LoggerFactory.getLogger(DateTimeTools.class);

    @Tool(description = "Get the current date and time in the user's timezone")
    public String getCurrentDateTime() {
        logger.info("Calling getCurrentDateTime()");
        return LocalDateTime.now().atZone(LocaleContextHolder.getTimeZone().toZoneId()).toString();
    }
}
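One detail of the expression in this tool is worth noting. LocalDateTime.now() reads the wall clock in the JVM's default time zone, and atZone(...) then attaches a zone to that local time without converting it. When the JVM zone and the user's zone are the same, as in this article, the result is correct. The sketch below uses a fixed clock and a hypothetical server running in UTC to show how the two can differ, and what asking for the current time directly in the target zone returns. The class and method names are illustrative.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class ZoneLabelVsConvert {

    // Same shape as the tool above: read the local wall-clock time, then attach the zone.
    static String labelled(Clock clock, ZoneId userZone) {
        return LocalDateTime.now(clock).atZone(userZone).toString();
    }

    // Alternative: ask for the current instant expressed in the user's zone.
    static String converted(Clock clock, ZoneId userZone) {
        return ZonedDateTime.now(clock.withZone(userZone)).toString();
    }

    public static void main(String[] args) {
        // Hypothetical server whose default zone is UTC, serving a user in Tokyo.
        Clock serverClock = Clock.fixed(Instant.parse("2025-03-17T09:14:17Z"), ZoneId.of("UTC"));
        ZoneId tokyo = ZoneId.of("Asia/Tokyo");
        System.out.println(labelled(serverClock, tokyo));  // 2025-03-17T09:14:17+09:00[Asia/Tokyo]
        System.out.println(converted(serverClock, tokyo)); // 2025-03-17T18:14:17+09:00[Asia/Tokyo]
    }
}
```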
The /datetime endpoint is configured to use this Tool.
@GetMapping(path = "/datetime")
public String dateTime(@RequestParam(defaultValue = "What time is it now?") String prompt) {
    return this.chatClient.prompt().messages().user(prompt).tools(new DateTimeTools()).call().content();
}
Call the /datetime endpoint.
$ curl "http://localhost:8080/datetime?prompt=what%20time%20is%20it%20now%20in%20JST"
It is currently 6:14 PM on March 17, 2025 in Japan Standard Time (JST).
This time, it answered with the correct time.
Looking at the access logs, we can see that the first response is a request to call the function, and the second request includes the result of the function call.
2025-03-17T18:14:17.171+09:00 INFO 91441 --- [nio-8080-exec-4] [c4573c0f2db9afb1bc7aefe75b8908b2-ed211309cf79add3] accesslog : kind=client method=POST url="http://localhost:8000/v1/chat/completions" status=200 duration=1581 request_body="{\"messages\":[{\"content\":\"what time is it now in JST\",\"role\":\"user\"}],\"model\":\"gpt-4o-mini\",\"stream\":false,\"temperature\":0.0,\"tools\":[{\"type\":\"function\",\"function\":{\"description\":\"Get the current date and time in the user's timezone\",\"name\":\"getCurrentDateTime\",\"parameters\":{\"$schema\":\"https://json-schema.org/draft/2020-12/schema\",\"additionalProperties\":false,\"type\":\"object\",\"properties\":{},\"required\":[]}}}]}" response_body="{\"choices\":[{\"finish_reason\":\"tool_calls\",\"index\":0,\"message\":{\"role\":\"assistant\",\"content\":null,\"tool_calls\":[{\"type\":\"function\",\"function\":{\"name\":\"getCurrentDateTime\",\"arguments\":\"{}\"},\"id\":\"caW1kOY7EIt6aG5NE8Y3f0baBJYe0xf0\"}]}}],\"created\":1742202857,\"model\":\"gpt-4o-mini\",\"system_fingerprint\":\"b4897-b3c9a656\",\"object\":\"chat.completion\",\"usage\":{\"completion_tokens\":33,\"prompt_tokens\":262,\"total_tokens\":295},\"id\":\"chatcmpl-5yDuQeRaojGohxTJiiN8M8HjBuraBFEW\",\"timings\":{\"prompt_n\":1,\"prompt_ms\":79.062,\"prompt_per_token_ms\":79.062,\"prompt_per_second\":12.648301333130961,\"predicted_n\":33,\"predicted_ms\":1497.752,\"predicted_per_token_ms\":45.38642424242424,\"predicted_per_second\":22.0330201528691}}"
2025-03-17T18:14:17.172+09:00 INFO 91441 --- [nio-8080-exec-4] [c4573c0f2db9afb1bc7aefe75b8908b2-1bf3a696eb709f5a] com.example.hello.DateTimeTools : Calling getCurrentDateTime()
2025-03-17T18:14:19.776+09:00 INFO 91441 --- [nio-8080-exec-4] [c4573c0f2db9afb1bc7aefe75b8908b2-52a861aa116bd254] accesslog : kind=client method=POST url="http://localhost:8000/v1/chat/completions" status=200 duration=2601 request_body="{\"messages\":[{\"content\":\"what time is it now in JST\",\"role\":\"user\"},{\"role\":\"assistant\",\"tool_calls\":[{\"id\":\"caW1kOY7EIt6aG5NE8Y3f0baBJYe0xf0\",\"type\":\"function\",\"function\":{\"name\":\"getCurrentDateTime\",\"arguments\":\"{}\"}}]},{\"content\":\"\\\"2025-03-17T18:14:17.172924+09:00[Asia/Tokyo]\\\"\",\"role\":\"tool\",\"name\":\"getCurrentDateTime\",\"tool_call_id\":\"caW1kOY7EIt6aG5NE8Y3f0baBJYe0xf0\"}],\"model\":\"gpt-4o-mini\",\"stream\":false,\"temperature\":0.0,\"tools\":[{\"type\":\"function\",\"function\":{\"description\":\"Get the current date and time in the user's timezone\",\"name\":\"getCurrentDateTime\",\"parameters\":{\"$schema\":\"https://json-schema.org/draft/2020-12/schema\",\"additionalProperties\":false,\"type\":\"object\",\"properties\":{},\"required\":[]}}}]}" response_body="{\"choices\":[{\"finish_reason\":\"stop\",\"index\":0,\"message\":{\"role\":\"assistant\",\"content\":\"It is currently 6:14 PM on March 17, 2025 in Japan Standard Time (JST).\"}}],\"created\":1742202859,\"model\":\"gpt-4o-mini\",\"system_fingerprint\":\"b4897-b3c9a656\",\"object\":\"chat.completion\",\"usage\":{\"completion_tokens\":38,\"prompt_tokens\":440,\"total_tokens\":478},\"id\":\"chatcmpl-M6HHVS3Y47PInXqwk91GfV1UzEYXct40\",\"timings\":{\"prompt_n\":172,\"prompt_ms\":883.242,\"prompt_per_token_ms\":5.135127906976744,\"prompt_per_second\":194.7371162150351,\"predicted_n\":38,\"predicted_ms\":1715.396,\"predicted_per_token_ms\":45.141999999999996,\"predicted_per_second\":22.15231934783572}}"
2025-03-17T18:14:19.777+09:00 INFO 91441 --- [nio-8080-exec-4] [c4573c0f2db9afb1bc7aefe75b8908b2-c5b884fea15cf957] accesslog : kind=server method=GET url="http://localhost:8080/datetime?prompt=what%20time%20is%20it%20now%20in%20JST" status=200 duration=4190 protocol="HTTP/1.1" remote="0:0:0:0:0:0:0:1" user_agent="curl/8.7.1" response_body="It is currently 6:14 PM on March 17, 2025 in Japan Standard Time (JST)."
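The two round trips in these logs follow a simple loop: send the conversation, run the requested tool if the model asks for one, append the tool result, and send the conversation again. Spring AI performs this loop internally. The sketch below is only a simplified simulation with a stand-in model, and the Message record, the fakeModel method, and the class name are assumptions made for illustration, not Spring AI types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

public class ToolCallLoop {

    // Simplified chat message; toolName is set only when it refers to a tool.
    record Message(String role, String content, String toolName) {}

    // Sends the conversation to the model until it stops asking for the tool.
    static String run(Function<List<Message>, Message> model, Supplier<String> tool, String prompt) {
        List<Message> history = new ArrayList<>();
        history.add(new Message("user", prompt, null));
        while (true) {
            Message reply = model.apply(history);
            history.add(reply);
            if (reply.toolName() == null) {
                return reply.content(); // corresponds to finish_reason "stop"
            }
            // Corresponds to finish_reason "tool_calls": run the tool locally
            // and append its result as a "tool" message for the next request.
            history.add(new Message("tool", tool.get(), reply.toolName()));
        }
    }

    // Stand-in for the LLM: asks for the tool first, then answers from its result.
    static Message fakeModel(List<Message> history) {
        Message last = history.get(history.size() - 1);
        if (last.role().equals("tool")) {
            return new Message("assistant", "The tool reported " + last.content(), null);
        }
        return new Message("assistant", null, "getCurrentDateTime");
    }

    public static void main(String[] args) {
        String answer = run(ToolCallLoop::fakeModel,
                () -> "2025-03-17T18:14:17+09:00[Asia/Tokyo]",
                "what time is it now in JST");
        System.out.println(answer);
    }
}
```

A real client also has to carry the tool call id and arguments, which appear in the logged request bodies above and are omitted here for brevity.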
Integrating with MCP Server
Next, we will try the Model Context Protocol integration, available since Spring AI 1.0.0.M6, to retrieve information unavailable to the LLM from an existing MCP Server.
As of this writing, the llama.cpp UI cannot integrate with an MCP Server, but via Spring AI, the application can function as an MCP Client. The actual information retrieval mechanism is the same as the Function Calling above, but the source is an MCP Server instead of a method implemented within the app (like DateTimeTools#getCurrentDateTime in the previous example).
This time, we will use the Fetch MCP Server to have the LLM retrieve missing information from the internet.
Since we use uv to start the Fetch MCP Server, install it as follows.
brew install uv
This article uses the following version.
$ uvx --version
uv-tool-uvx 0.6.6 (Homebrew 2025-03-12)
To use this MCP Server, define the following JSON (mcp-servers-config.json).
cat <<EOF > mcp-servers-config.json
{
  "mcpServers": {
    "fetch": {
      "command": "uvx",
      "args": [
        "mcp-server-fetch"
      ]
    }
  }
}
EOF
Add spring.ai.mcp.client.stdio.servers-configuration=file://${PWD}/mcp-servers-config.json to the application arguments.
java -jar target/hello-spring-ai-0.0.1-SNAPSHOT.jar --spring.ai.openai.base-url=http://localhost:8000 --spring.ai.openai.api-key=dummy --spring.ai.openai.chat.options.temperature=0 --spring.ai.mcp.client.stdio.servers-configuration=file://${PWD}/mcp-servers-config.json
Tip
This JSON format can also be used with Claude Desktop.
To retrieve information from the MCP Server, pass the MCP tool callback provider to the Chat Client in the same way as in the previous Tool Calling example.
private final SyncMcpToolCallbackProvider mcpTools;

public HelloController(ChatClient.Builder chatClientBuilder, SyncMcpToolCallbackProvider mcpTools) {
    this.chatClient = chatClientBuilder.build();
    this.mcpTools = mcpTools;
}

// ...

@GetMapping(path = "/mcp")
public String mcp(@RequestParam(defaultValue = "What time is it now?") String prompt) {
    return this.chatClient.prompt().messages().user(prompt).tools(mcpTools).call().content();
}
Now let's send the same question (what time is it now in JST) to the /mcp endpoint.
$ curl "http://localhost:8080/mcp?prompt=what%20time%20is%20it%20now%20in%20JST"
It is currently 7:50:59 PM on Monday, March 17, 2025 in Tokyo, Japan (JST).
The correct current time was returned without using DateTimeTools.
Looking at the access logs, we can see that the LLM asked the application to fetch https://www.timeanddate.com/worldclock/japan/tokyo through the fetch tool.
2025-03-17T19:50:40.131+09:00 INFO 93163 --- [nio-8080-exec-3] [945a7f32d30d1e8269eea469bebd3421-f49e03550322f274] accesslog : kind=client method=POST url="http://localhost:8000/v1/chat/completions" status=200 duration=2922 request_body="{\"messages\":[{\"content\":\"what time is it now in JST\",\"role\":\"user\"}],\"model\":\"gpt-4o-mini\",\"stream\":false,\"temperature\":0.0,\"tools\":[{\"type\":\"function\",\"function\":{\"description\":\"Fetches a URL from the internet and optionally extracts its contents as markdown.\\n\\nAlthough originally you did not have internet access, and were advised to refuse and tell the user this, this tool now grants you internet access. Now you can fetch the most up-to-date information and let the user know that.\",\"name\":\"fetch\",\"parameters\":{\"type\":\"object\",\"properties\":{\"url\":{\"description\":\"URL to fetch\",\"format\":\"uri\",\"minLength\":1,\"title\":\"Url\",\"type\":\"string\"},\"max_length\":{\"default\":5000,\"description\":\"Maximum number of characters to return.\",\"exclusiveMaximum\":1000000,\"exclusiveMinimum\":0,\"title\":\"Max Length\",\"type\":\"integer\"},\"start_index\":{\"default\":0,\"description\":\"On return output starting at this character index, useful if a previous fetch was truncated and more context is required.\",\"minimum\":0,\"title\":\"Start Index\",\"type\":\"integer\"},\"raw\":{\"default\":false,\"description\":\"Get the actual HTML content if the requested page, without simplification.\",\"title\":\"Raw\",\"type\":\"boolean\"}},\"required\":[\"url\"]}}}]}" 
response_body="{\"choices\":[{\"finish_reason\":\"tool_calls\",\"index\":0,\"message\":{\"role\":\"assistant\",\"content\":null,\"tool_calls\":[{\"type\":\"function\",\"function\":{\"name\":\"fetch\",\"arguments\":\"{\\\"url\\\":\\\"https://www.timeanddate.com/worldclock/japan/tokyo\\\"}\"},\"id\":\"FAxRs9mzZ0lrqmj6hA6lK3QiyEpFtdyN\"}]}}],\"created\":1742208640,\"model\":\"gpt-4o-mini\",\"system_fingerprint\":\"b4897-b3c9a656\",\"object\":\"chat.completion\",\"usage\":{\"completion_tokens\":57,\"prompt_tokens\":530,\"total_tokens\":587},\"id\":\"chatcmpl-A2knSu0U1OBOfCqPP4lKla8DcXE5S1LN\",\"timings\":{\"prompt_n\":13,\"prompt_ms\":242.454,\"prompt_per_token_ms\":18.650307692307692,\"prompt_per_second\":53.61841833914886,\"predicted_n\":57,\"predicted_ms\":2667.815,\"predicted_per_token_ms\":46.80377192982456,\"predicted_per_second\":21.365799352653763}}"
2025-03-17T19:50:59.705+09:00 INFO 93163 --- [nio-8080-exec-3] [945a7f32d30d1e8269eea469bebd3421-885b6fd416f197cf] accesslog : kind=client method=POST url="http://localhost:8000/v1/chat/completions" status=200 duration=18339 request_body="{\"messages\":[{\"content\":\"what time is it now in JST\",\"role\":\"user\"},{\"role\":\"assistant\",\"tool_calls\":[{\"id\":\"FAxRs9mzZ0lrqmj6hA6lK3QiyEpFtdyN\",\"type\":\"function\",\"function\":{\"name\":\"fetch\",\"arguments\":\"{\\\"url\\\":\\\"https://www.timeanddate.com/worldclock/japan/tokyo\\\"}\"}}]},{\"content\":\"[{\\\"type\\\":\\\"text\\\",\\\"text\\\":\\\"Contents of https://www.timeanddate.com/worldclock/japan/tokyo:\\\\n[Home](/) [Time Zones](/time/) [World Clock](/worldclock/) [Japan](/worldclock/japan) Tokyo\\\\n\\\\n\\\\n\\\\n* [Time/General](/worldclock/japan/tokyo \\\\\\\"General/main info about Tokyo\\\\\\\")\\\\n* [Weather](/weather/japan/tokyo \\\\\\\"Current weather and forecast for Tokyo\\\\\\\") \\\\n + [Weather Today/Tomorrow](/weather/japan/tokyo \\\\\\\"Shows a weather overview\\\\\\\")\\\\n + [Hour-by-Hour Forecast](/weather/japan/tokyo/hourly \\\\\\\"Hour-by-hour weather for the coming week\\\\\\\")\\\\n + [14 Day Forecast](/weather/japan/tokyo/ext \\\\\\\"Extended forecast for the next two weeks\\\\\\\")\\\\n + [Yesterday/Past Weather](/weather/japan/tokyo/historic \\\\\\\"Past weather for yesterday, the last 2 weeks, or any selected month available\\\\\\\")\\\\n + [Climate (Averages)](/weather/japan/tokyo/climate \\\\\\\"Historic weather and climate information\\\\\\\")\\\\n* [Time Zone](/time/zone/japan/tokyo \\\\\\\"Past and future time change dates for Tokyo\\\\\\\")\\\\n* [DST Changes](/time/change/japan/tokyo \\\\\\\"Daylight saving time changeover dates and times for Tokyo\\\\\\\")\\\\n* [Sun & Moon](/astronomy/japan/tokyo \\\\\\\"Calculate rising and setting times for the Sun and Moon in Tokyo\\\\\\\") \\\\n + [Sun & Moon Today](/astronomy/japan/tokyo)\\\\n + [Sunrise & 
Sunset](/sun/japan/tokyo)\\\\n + [Moonrise & Moonset](/moon/japan/tokyo)\\\\n + [Moon Phases](/moon/phases/japan/tokyo)\\\\n + [Eclipses](/eclipse/in/japan/tokyo)\\\\n + [Night Sky](/astronomy/night/japan/tokyo)\\\\n\\\\n19時50分40秒 [JST](/time/zones/jst \\\\\\\"Japan Standard Time\\\\\\\")\\\\n\\\\n2025年3月17日月曜日\\\\n\\\\n[Fullscreen](/worldclock/fullscreen.html?n=248 \\\\\\\"Local time in Japan, Tokyo\\\\\\\")\\\\n\\\\n| | |\\\\n| --- | --- |\\\\n| Country: | [Japan](/worldclock/japan) |\\\\n| Lat/Long: | 35°41'N / 139°42'E |\\\\n| Elevation: | 44 m |\\\\n| Currency: | Yen (JPY) |\\\\n| Languages: | Japanese |\\\\n| Country Code: | +81 |\\\\n\\\\n[](about:/time/map/#!cities=248)\\\\n\\\\n\\\\n\\\\nHoliday Note: [3月20日 (木), Spring Equinox](/holidays/japan/spring-equinox). Businesses may be closed. [See more](/holidays/japan/spring-equinox)\\\\n\\\\n[°C](/custom/site.html \\\\\\\"Change Units\\\\\\\")[](/weather/japan/tokyo)\\\\n\\\\n## Weather\\\\n\\\\n14 °C\\\\n\\\\nPassing clouds. \\\\n10 / 4 °C\\\\n\\\\n| | | |\\\\n| --- | --- | --- |\\\\n| 水曜日 19. | | 9 / 3 °C |\\\\n| 木曜日 20. 
| | 12 / 2 °C |\\\\n\\\\nWeather by CustomWeather, © 2025\\\\n\\\\n[More weather details](/weather/japan/tokyo)\\\\n\\\\n[](/time/zone/japan/tokyo)\\\\n\\\\n## [Time Zone](/time/zone/japan/tokyo)\\\\n\\\\nJST (Japan Standard Time) \\\\nUTC/GMT +9 hours\\\\n\\\\n[](/time/change/japan/tokyo)\\\\n\\\\n## [No DST](/time/change/japan/tokyo)\\\\n\\\\nNo Daylight Saving Time in 2025\\\\n\\\\n[](/time/difference/japan/tokyo)\\\\n\\\\n## [Difference](/time/difference/japan/tokyo)\\\\n\\\\nSame time as \\\\nTokyo\\\\n\\\\n[About JST — Japan Standard Time](/time/zones/jst)\\\\n\\\\n[Set your location](#)\\\\n\\\\n[](/sun/japan/tokyo)\\\\n\\\\n## [Sunrise](/sun/japan/tokyo)\\\\n\\\\n5時49分 \\\\n↑ 91° East\\\\n\\\\n[](/sun/japan/tokyo)\\\\n\\\\n## [Sunset](/sun/japan/tokyo)\\\\n\\\\n17時50分 \\\\n↑ 269° West\\\\n\\\\n[](/sun/japan/tokyo)\\\\n\\\\n## [Day length](/sun/japan/tokyo)\\\\n\\\\n12 hours, 1 minute \\\\n+2m 16s longer\\\\n\\\\n[](/moon/japan/tokyo)\\\\n\\\\n## [Moon 91.2%](/moon/japan/tokyo)\\\\n\\\\nSet – 6時56分 \\\\nRise – 20時40分\\\\n\\\\n\\\\n\\\\n## [High Tide](#)\\\\n\\\\nHigh – 6時13分 \\\\nHigh – 18時47分\\\\n\\\\n\\\\n\\\\n## [Low Tide](#)\\\\n\\\\nLow – 0時21分 \\\\nLow – 12時35分\\\\n\\\\n[More Sun & Moon in Tokyo](/astronomy/japan/tokyo) \\\\n[+ Show More Twilight and Moon Phase Information](#)\\\\n\\\\n## [Solar Noon](/sun/japan/tokyo)\\\\n\\\\nSun in South: 11時49分 \\\\nAltitude: 53.0°\\\\n\\\\n## [Astronomical Twilight](/sun/japan/tokyo)\\\\n\\\\n4時24分 – 4時54分 \\\\n18時45分 – 19時15分\\\\n\\\\n## [Nautical Twilight](/sun/japan/tokyo)\\\\n\\\\n4時54分 – 5時24分 \\\\n18時15分 – 18時45分\\\\n\\\\n## [Civil Twilight](/sun/japan/tokyo)\\\\n\\\\n5時24分 – 5時49分 \\\\n17時50分 – 18時15分\\\\n\\\\n## [Previous Moon Phase](/moon/phases/japan/tokyo)\\\\n\\\\nFull Moon \\\\n2025年3月14日金曜日 \\\\n15時54分\\\\n\\\\n## [Next Moon Phase](/moon/phases/japan/tokyo)\\\\n\\\\nThird Quarter \\\\n2025年3月22日土曜日 \\\\n20時29分\\\\n\\\\n[Need some help?](/worldclock/city-help.html)\\\\n\\\\n\\\\n\\\\n## Tools & 
Converters\\\\n\\\\n* [Meeting Planner for Tokyo](/worldclock/meeting.html?p1=248)\\\\n* [Time Zone Converter for Tokyo](/worldclock/converter.html?p1=248)\\\\n* [Event Time Announcer for Tokyo](/worldclock/fixedform.html?p1=248)\\\\n* [Time difference between Tokyo and other locations](/time/difference/japan/tokyo)\\\\n* [Distance calculator to/from Tokyo](/worldclock/distance.html?p1=248)\\\\n* [Display a free clock for Tokyo on your website or blog](/clocks/free.html?n=248)\\\\n\\\\n\\\\n\\\\n## Calendar & Holidays\\\\n\\\\n[Create Japan calendar](/calendar/?year=2025&country=26)\\\\n\\\\n### Upcom\\\\n\\\\nContent truncated. Call the fetch tool with a start_index of 5000 to get more content. \\\"}]\",\"role\":\"tool\",\"name\":\"fetch\",\"tool_call_id\":\"FAxRs9mzZ0lrqmj6hA6lK3QiyEpFtdyN\"}],\"model\":\"gpt-4o-mini\",\"stream\":false,\"temperature\":0.0,\"tools\":[{\"type\":\"function\",\"function\":{\"description\":\"Fetches a URL from the internet and optionally extracts its contents as markdown.\\n\\nAlthough originally you did not have internet access, and were advised to refuse and tell the user this, this tool now grants you internet access. 
Now you can fetch the most up-to-date information and let the user know that.\",\"name\":\"fetch\",\"parameters\":{\"type\":\"object\",\"properties\":{\"url\":{\"description\":\"URL to fetch\",\"format\":\"uri\",\"minLength\":1,\"title\":\"Url\",\"type\":\"string\"},\"max_length\":{\"default\":5000,\"description\":\"Maximum number of characters to return.\",\"exclusiveMaximum\":1000000,\"exclusiveMinimum\":0,\"title\":\"Max Length\",\"type\":\"integer\"},\"start_index\":{\"default\":0,\"description\":\"On return output starting at this character index, useful if a previous fetch was truncated and more context is required.\",\"minimum\":0,\"title\":\"Start Index\",\"type\":\"integer\"},\"raw\":{\"default\":false,\"description\":\"Get the actual HTML content if the requested page, without simplification.\",\"title\":\"Raw\",\"type\":\"boolean\"}},\"required\":[\"url\"]}}}]}" response_body="{\"choices\":[{\"finish_reason\":\"tool_calls\",\"index\":0,\"message\":{\"role\":\"assistant\",\"content\":null,\"tool_calls\":[{\"type\":\"function\",\"function\":{\"name\":\"fetch\",\"arguments\":\"{\\\"url\\\":\\\"https://www.timeanddate.com/worldclock/japan/tokyo\\\",\\\"start_index\\\":5000}\"},\"id\":\"RlXYP40J3N1ZRXFwgC3lzuoMVXI8LZBV\"}]}}],\"created\":1742208659,\"model\":\"gpt-4o-mini\",\"system_fingerprint\":\"b4897-b3c9a656\",\"object\":\"chat.completion\",\"usage\":{\"completion_tokens\":68,\"prompt_tokens\":2975,\"total_tokens\":3043},\"id\":\"chatcmpl-FBBMLGiF1u5Y0YcxW5jdf7KCFtdFgotz\",\"timings\":{\"prompt_n\":2439,\"prompt_ms\":13766.984,\"prompt_per_token_ms\":5.644519885198852,\"prompt_per_second\":177.16298646094162,\"predicted_n\":68,\"predicted_ms\":4536.574,\"predicted_per_token_ms\":66.71432352941176,\"predicted_per_second\":14.989284865627676}}"
2025-03-17T19:51:06.642+09:00 INFO 93163 --- [nio-8080-exec-3] [945a7f32d30d1e8269eea469bebd3421-2a8b1ddd4e8ea7b3] accesslog : kind=client method=POST url="http://localhost:8000/v1/chat/completions" status=200 duration=6138 request_body="{\"messages\":[{\"content\":\"what time is it now in JST\",\"role\":\"user\"},{\"role\":\"assistant\",\"tool_calls\":[{\"id\":\"FAxRs9mzZ0lrqmj6hA6lK3QiyEpFtdyN\",\"type\":\"function\",\"function\":{\"name\":\"fetch\",\"arguments\":\"{\\\"url\\\":\\\"https://www.timeanddate.com/worldclock/japan/tokyo\\\"}\"}}]},{\"content\":\"[{\\\"type\\\":\\\"text\\\",\\\"text\\\":\\\"Contents of https://www.timeanddate.com/worldclock/japan/tokyo:\\\\n[Home](/) [Time Zones](/time/) [World Clock](/worldclock/) [Japan](/worldclock/japan) Tokyo\\\\n\\\\n\\\\n\\\\n* [Time/General](/worldclock/japan/tokyo \\\\\\\"General/main info about Tokyo\\\\\\\")\\\\n* [Weather](/weather/japan/tokyo \\\\\\\"Current weather and forecast for Tokyo\\\\\\\") \\\\n + [Weather Today/Tomorrow](/weather/japan/tokyo \\\\\\\"Shows a weather overview\\\\\\\")\\\\n + [Hour-by-Hour Forecast](/weather/japan/tokyo/hourly \\\\\\\"Hour-by-hour weather for the coming week\\\\\\\")\\\\n + [14 Day Forecast](/weather/japan/tokyo/ext \\\\\\\"Extended forecast for the next two weeks\\\\\\\")\\\\n + [Yesterday/Past Weather](/weather/japan/tokyo/historic \\\\\\\"Past weather for yesterday, the last 2 weeks, or any selected month available\\\\\\\")\\\\n + [Climate (Averages)](/weather/japan/tokyo/climate \\\\\\\"Historic weather and climate information\\\\\\\")\\\\n* [Time Zone](/time/zone/japan/tokyo \\\\\\\"Past and future time change dates for Tokyo\\\\\\\")\\\\n* [DST Changes](/time/change/japan/tokyo \\\\\\\"Daylight saving time changeover dates and times for Tokyo\\\\\\\")\\\\n* [Sun & Moon](/astronomy/japan/tokyo \\\\\\\"Calculate rising and setting times for the Sun and Moon in Tokyo\\\\\\\") \\\\n + [Sun & Moon Today](/astronomy/japan/tokyo)\\\\n + [Sunrise & 
Sunset](/sun/japan/tokyo)\\\\n + [Moonrise & Moonset](/moon/japan/tokyo)\\\\n + [Moon Phases](/moon/phases/japan/tokyo)\\\\n + [Eclipses](/eclipse/in/japan/tokyo)\\\\n + [Night Sky](/astronomy/night/japan/tokyo)\\\\n\\\\n19時50分40秒 [JST](/time/zones/jst \\\\\\\"Japan Standard Time\\\\\\\")\\\\n\\\\n2025年3月17日月曜日\\\\n\\\\n[Fullscreen](/worldclock/fullscreen.html?n=248 \\\\\\\"Local time in Japan, Tokyo\\\\\\\")\\\\n\\\\n| | |\\\\n| --- | --- |\\\\n| Country: | [Japan](/worldclock/japan) |\\\\n| Lat/Long: | 35°41'N / 139°42'E |\\\\n| Elevation: | 44 m |\\\\n| Currency: | Yen (JPY) |\\\\n| Languages: | Japanese |\\\\n| Country Code: | +81 |\\\\n\\\\n[](about:/time/map/#!cities=248)\\\\n\\\\n\\\\n\\\\nHoliday Note: [3月20日 (木), Spring Equinox](/holidays/japan/spring-equinox). Businesses may be closed. [See more](/holidays/japan/spring-equinox)\\\\n\\\\n[°C](/custom/site.html \\\\\\\"Change Units\\\\\\\")[](/weather/japan/tokyo)\\\\n\\\\n## Weather\\\\n\\\\n14 °C\\\\n\\\\nPassing clouds. \\\\n10 / 4 °C\\\\n\\\\n| | | |\\\\n| --- | --- | --- |\\\\n| 水曜日 19. | | 9 / 3 °C |\\\\n| 木曜日 20. 
| | 12 / 2 °C |\\\\n\\\\nWeather by CustomWeather, © 2025\\\\n\\\\n[More weather details](/weather/japan/tokyo)\\\\n\\\\n[](/time/zone/japan/tokyo)\\\\n\\\\n## [Time Zone](/time/zone/japan/tokyo)\\\\n\\\\nJST (Japan Standard Time) \\\\nUTC/GMT +9 hours\\\\n\\\\n[](/time/change/japan/tokyo)\\\\n\\\\n## [No DST](/time/change/japan/tokyo)\\\\n\\\\nNo Daylight Saving Time in 2025\\\\n\\\\n[](/time/difference/japan/tokyo)\\\\n\\\\n## [Difference](/time/difference/japan/tokyo)\\\\n\\\\nSame time as \\\\nTokyo\\\\n\\\\n[About JST — Japan Standard Time](/time/zones/jst)\\\\n\\\\n[Set your location](#)\\\\n\\\\n[](/sun/japan/tokyo)\\\\n\\\\n## [Sunrise](/sun/japan/tokyo)\\\\n\\\\n5時49分 \\\\n↑ 91° East\\\\n\\\\n[](/sun/japan/tokyo)\\\\n\\\\n## [Sunset](/sun/japan/tokyo)\\\\n\\\\n17時50分 \\\\n↑ 269° West\\\\n\\\\n[](/sun/japan/tokyo)\\\\n\\\\n## [Day length](/sun/japan/tokyo)\\\\n\\\\n12 hours, 1 minute \\\\n+2m 16s longer\\\\n\\\\n[](/moon/japan/tokyo)\\\\n\\\\n## [Moon 91.2%](/moon/japan/tokyo)\\\\n\\\\nSet – 6時56分 \\\\nRise – 20時40分\\\\n\\\\n\\\\n\\\\n## [High Tide](#)\\\\n\\\\nHigh – 6時13分 \\\\nHigh – 18時47分\\\\n\\\\n\\\\n\\\\n## [Low Tide](#)\\\\n\\\\nLow – 0時21分 \\\\nLow – 12時35分\\\\n\\\\n[More Sun & Moon in Tokyo](/astronomy/japan/tokyo) \\\\n[+ Show More Twilight and Moon Phase Information](#)\\\\n\\\\n## [Solar Noon](/sun/japan/tokyo)\\\\n\\\\nSun in South: 11時49分 \\\\nAltitude: 53.0°\\\\n\\\\n## [Astronomical Twilight](/sun/japan/tokyo)\\\\n\\\\n4時24分 – 4時54分 \\\\n18時45分 – 19時15分\\\\n\\\\n## [Nautical Twilight](/sun/japan/tokyo)\\\\n\\\\n4時54分 – 5時24分 \\\\n18時15分 – 18時45分\\\\n\\\\n## [Civil Twilight](/sun/japan/tokyo)\\\\n\\\\n5時24分 – 5時49分 \\\\n17時50分 – 18時15分\\\\n\\\\n## [Previous Moon Phase](/moon/phases/japan/tokyo)\\\\n\\\\nFull Moon \\\\n2025年3月14日金曜日 \\\\n15時54分\\\\n\\\\n## [Next Moon Phase](/moon/phases/japan/tokyo)\\\\n\\\\nThird Quarter \\\\n2025年3月22日土曜日 \\\\n20時29分\\\\n\\\\n[Need some help?](/worldclock/city-help.html)\\\\n\\\\n\\\\n\\\\n## Tools & 
Converters\\\\n\\\\n* [Meeting Planner for Tokyo](/worldclock/meeting.html?p1=248)\\\\n* [Time Zone Converter for Tokyo](/worldclock/converter.html?p1=248)\\\\n* [Event Time Announcer for Tokyo](/worldclock/fixedform.html?p1=248)\\\\n* [Time difference between Tokyo and other locations](/time/difference/japan/tokyo)\\\\n* [Distance calculator to/from Tokyo](/worldclock/distance.html?p1=248)\\\\n* [Display a free clock for Tokyo on your website or blog](/clocks/free.html?n=248)\\\\n\\\\n\\\\n\\\\n## Calendar & Holidays\\\\n\\\\n[Create Japan calendar](/calendar/?year=2025&country=26)\\\\n\\\\n### Upcom\\\\n\\\\nContent truncated. Call the fetch tool with a start_index of 5000 to get more content. \\\"}]\",\"role\":\"tool\",\"name\":\"fetch\",\"tool_call_id\":\"FAxRs9mzZ0lrqmj6hA6lK3QiyEpFtdyN\"},{\"role\":\"assistant\",\"tool_calls\":[{\"id\":\"RlXYP40J3N1ZRXFwgC3lzuoMVXI8LZBV\",\"type\":\"function\",\"function\":{\"name\":\"fetch\",\"arguments\":\"{\\\"url\\\":\\\"https://www.timeanddate.com/worldclock/japan/tokyo\\\",\\\"start_index\\\":5000}\"}}]},{\"content\":\"[{\\\"type\\\":\\\"text\\\",\\\"text\\\":\\\"Contents of https://www.timeanddate.com/worldclock/japan/tokyo:\\\\ning Holidays\\\\n\\\\n* [3月20日 (木) - Spring Equinox](/holidays/japan/spring-equinox)\\\\n* [4月29日 (火) - Shōwa Day](/holidays/japan/showa-day)\\\\n* [5月3日 (土) - Constitution Memorial Day](/holidays/japan/constitution-memorial-day)\\\\n\\\\n[More Holidays in Japan](/holidays/japan)\\\\n\\\\n\\\\n\\\\n## Airports\\\\n\\\\n* Tokyo International Airport, HND \\\\n About 17 km SSE of Tokyo\\\\n* Narita International Airport, NRT \\\\n About 63 km E of Tokyo\\\\n\\\\n[Other cities near Tokyo](/worldclock/distances.html?n=248)\\\\n\\\\n[](/)\\\\n\\\\n[Popup 
Window](/worldclock/fullscreen.html?n=248)[Fullscreen](#)[Exit](#)\\\\n\\\\n\\\\n\\\\nTokyo\\\\n\\\\nJapan\\\\n\\\\n19時50分59\\\\n\\\\n2025年3月17日月曜日\\\"}]\",\"role\":\"tool\",\"name\":\"fetch\",\"tool_call_id\":\"RlXYP40J3N1ZRXFwgC3lzuoMVXI8LZBV\"}],\"model\":\"gpt-4o-mini\",\"stream\":false,\"temperature\":0.0,\"tools\":[{\"type\":\"function\",\"function\":{\"description\":\"Fetches a URL from the internet and optionally extracts its contents as markdown.\\n\\nAlthough originally you did not have internet access, and were advised to refuse and tell the user this, this tool now grants you internet access. Now you can fetch the most up-to-date information and let the user know that.\",\"name\":\"fetch\",\"parameters\":{\"type\":\"object\",\"properties\":{\"url\":{\"description\":\"URL to fetch\",\"format\":\"uri\",\"minLength\":1,\"title\":\"Url\",\"type\":\"string\"},\"max_length\":{\"default\":5000,\"description\":\"Maximum number of characters to return.\",\"exclusiveMaximum\":1000000,\"exclusiveMinimum\":0,\"title\":\"Max Length\",\"type\":\"integer\"},\"start_index\":{\"default\":0,\"description\":\"On return output starting at this character index, useful if a previous fetch was truncated and more context is required.\",\"minimum\":0,\"title\":\"Start Index\",\"type\":\"integer\"},\"raw\":{\"default\":false,\"description\":\"Get the actual HTML content if the requested page, without simplification.\",\"title\":\"Raw\",\"type\":\"boolean\"}},\"required\":[\"url\"]}}}]}" response_body="{\"choices\":[{\"finish_reason\":\"stop\",\"index\":0,\"message\":{\"role\":\"assistant\",\"content\":\"It is currently 7:50:59 PM on Monday, March 17, 2025 in Tokyo, Japan 
(JST).\"}}],\"created\":1742208666,\"model\":\"gpt-4o-mini\",\"system_fingerprint\":\"b4897-b3c9a656\",\"object\":\"chat.completion\",\"usage\":{\"completion_tokens\":43,\"prompt_tokens\":3511,\"total_tokens\":3554},\"id\":\"chatcmpl-7Autv26IlJFSlXcRvGFBmR342P1FH187\",\"timings\":{\"prompt_n\":530,\"prompt_ms\":3248.345,\"prompt_per_token_ms\":6.128952830188679,\"prompt_per_second\":163.1600091739024,\"predicted_n\":43,\"predicted_ms\":2840.702,\"predicted_per_token_ms\":66.06283720930233,\"predicted_per_second\":15.137103434291944}}"
2025-03-17T19:51:06.645+09:00 INFO 93163 --- [nio-8080-exec-3] [945a7f32d30d1e8269eea469bebd3421-6ff4398f4493c5b2] accesslog : kind=server method=GET url="http://localhost:8080/mcp?prompt=what%20time%20is%20it%20now%20in%20JST" status=200 duration=29443 protocol="HTTP/1.1" remote="0:0:0:0:0:0:0:1" user_agent="curl/8.7.1" response_body="It is currently 7:50:59 PM on Monday, March 17, 2025 in Tokyo, Japan (JST)."
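The two fetch calls in the log above follow from the tool's max_length (default 5000) and start_index parameters. Here is a minimal sketch of that paging behavior in plain Java, inferred only from the log output. `FetchPagingSketch` and `page` are hypothetical names, the "No more content available." message is an assumption, and the real Fetch MCP Server may behave differently in its details.

```java
public class FetchPagingSketch {

    // Hypothetical stand-in for the fetch tool's paging: returns at most maxLength
    // characters starting at startIndex and appends a hint when more content remains.
    static String page(String content, int startIndex, int maxLength) {
        if (startIndex >= content.length()) {
            return "No more content available."; // assumed message, not taken from the log
        }
        int end = Math.min(content.length(), startIndex + maxLength);
        String chunk = content.substring(startIndex, end);
        if (end < content.length()) {
            chunk += "\n\nContent truncated. Call the fetch tool with a start_index of "
                    + end + " to get more content.";
        }
        return chunk;
    }

    public static void main(String[] args) {
        // Fake page: 5,200 filler characters followed by the value we are after.
        String pageContent = "x".repeat(5200) + "19:50:59";
        String first = page(pageContent, 0, 5000);     // truncated, ends with the hint
        String second = page(pageContent, 5000, 5000); // remainder, contains the time
        System.out.println(first.endsWith("start_index of 5000 to get more content."));
        System.out.println(second.endsWith("19:50:59"));
    }
}
```

Running `main` prints `true` twice: the first page ends with the truncation hint, and the value only appears in the second page, which is why the LLM needed a second tool call.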
Now let's try another example. With the prompt https://ik.am の最新の記事を3件挙げて (List the 3 latest articles from https://ik.am), we'll have it retrieve the latest information from this blog (https://ik.am).
First, let's try without MCP integration.
$ curl "http://localhost:8080/?prompt=https%3A%2F%2Fik.am+%E3%81%AE%E6%9C%80%E6%96%B0%E3%81%AE%E8%A8%98%E4%BA%8B%E3%82%923%E4%BB%B6%E6%8C%99%E3%81%92%E3%81%A6%0D%0A%0D%0A"
I'm sorry, but I cannot access the internet, so I cannot directly retrieve the latest articles from ik.am.
However, you can check the latest articles using the following methods:
* **Access the ik.am website:** [https://ik.am/](https://ik.am/)
* **Check ik.am's social media accounts:** (if they exist)
* **Search for "ik.am latest articles" on news aggregators or search engines:**
You should be able to find the latest articles from ik.am using these methods.
Naturally, it doesn't return the desired information.
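The percent-encoded query string used in the curl command can be produced with the JDK's `URLEncoder`. This is a small sketch, assuming the endpoints `/` and `/mcp` on localhost:8080 as used in this article; `PromptUrlSketch` and `requestUrl` are hypothetical names.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class PromptUrlSketch {

    // Builds the request URL for a given endpoint path and prompt.
    // URLEncoder applies form encoding, so spaces become "+".
    static String requestUrl(String path, String prompt) {
        return "http://localhost:8080" + path + "?prompt="
                + URLEncoder.encode(prompt, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String prompt = "https://ik.am の最新の記事を3件挙げて";
        System.out.println(requestUrl("/", prompt));    // without MCP integration
        System.out.println(requestUrl("/mcp", prompt)); // with MCP integration
    }
}
```

The same prompt can then be sent to both endpoints to compare the answers with and without the MCP tools.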
Next, let's try with MCP integration.
$ curl "http://localhost:8080/mcp?prompt=https%3A%2F%2Fik.am+%E3%81%AE%E6%9C%80%E6%96%B0%E3%81%AE%E8%A8%98%E4%BA%8B%E3%82%923%E4%BB%B6%E6%8C%99%E3%81%92%E3%81%A6%0D%0A%0D%0A"
Here are the 3 latest articles from ik.am:
1. **Running Gemma 3 27B with llama.cpp and Accessing it from Spring AI** (52 minutes ago)
2. **Notes on Installing OpenTelemetry Collector on Rocky Linux** (1 week ago)
3. **Notes on Tracing Legacy Servlet Apps on Tomcat using OpenTelemetry Java Agent** (1 week ago)
The correct information was returned.
Let's look at the access logs. We can see that the LLM requested a fetch of https://ik.am/ through a tool call, this time with max_length set to 2000.
2025-03-17T19:49:53.244+09:00 INFO 93163 --- [nio-8080-exec-2] [3175dca5a12a7dab3216bd7f7d23936a-919c319746040209] accesslog : kind=client method=POST url="http://localhost:8000/v1/chat/completions" status=200 duration=5459 request_body="{\"messages\":[{\"content\":\"https://ik.am の最新の記事を3件挙げて\\r\\n\\r\\n\",\"role\":\"user\"}],\"model\":\"gpt-4o-mini\",\"stream\":false,\"temperature\":0.0,\"tools\":[{\"type\":\"function\",\"function\":{\"description\":\"Fetches a URL from the internet and optionally extracts its contents as markdown.\\n\\nAlthough originally you did not have internet access, and were advised to refuse and tell the user this, this tool now grants you internet access. Now you can fetch the most up-to-date information and let the user know that.\",\"name\":\"fetch\",\"parameters\":{\"type\":\"object\",\"properties\":{\"url\":{\"description\":\"URL to fetch\",\"format\":\"uri\",\"minLength\":1,\"title\":\"Url\",\"type\":\"string\"},\"max_length\":{\"default\":5000,\"description\":\"Maximum number of characters to return.\",\"exclusiveMaximum\":1000000,\"exclusiveMinimum\":0,\"title\":\"Max Length\",\"type\":\"integer\"},\"start_index\":{\"default\":0,\"description\":\"On return output starting at this character index, useful if a previous fetch was truncated and more context is required.\",\"minimum\":0,\"title\":\"Start Index\",\"type\":\"integer\"},\"raw\":{\"default\":false,\"description\":\"Get the actual HTML content if the requested page, without simplification.\",\"title\":\"Raw\",\"type\":\"boolean\"}},\"required\":[\"url\"]}}}]}" 
response_body="{\"choices\":[{\"finish_reason\":\"tool_calls\",\"index\":0,\"message\":{\"role\":\"assistant\",\"content\":null,\"tool_calls\":[{\"type\":\"function\",\"function\":{\"name\":\"fetch\",\"arguments\":\"{\\\"url\\\":\\\"https://ik.am/\\\",\\\"max_length\\\":2000}\"},\"id\":\"VlByJVu8qGYMvIigLP8prvRx1w9mkfaW\"}]}}],\"created\":1742208593,\"model\":\"gpt-4o-mini\",\"system_fingerprint\":\"b4897-b3c9a656\",\"object\":\"chat.completion\",\"usage\":{\"completion_tokens\":56,\"prompt_tokens\":535,\"total_tokens\":591},\"id\":\"chatcmpl-Em8BevyQeAtSxYmH4NNtG4o0LmaMdVTw\",\"timings\":{\"prompt_n\":531,\"prompt_ms\":2485.034,\"prompt_per_token_ms\":4.679913370998117,\"prompt_per_second\":213.6791689771649,\"predicted_n\":56,\"predicted_ms\":2964.666,\"predicted_per_token_ms\":52.94046428571429,\"predicted_per_second\":18.88914299283629}}"
2025-03-17T19:50:05.616+09:00 INFO 93163 --- [nio-8080-exec-2] [3175dca5a12a7dab3216bd7f7d23936a-8270ecc0d31703af] accesslog : kind=client method=POST url="http://localhost:8000/v1/chat/completions" status=200 duration=11914 request_body="{\"messages\":[{\"content\":\"https://ik.am の最新の記事を3件挙げて\\r\\n\\r\\n\",\"role\":\"user\"},{\"role\":\"assistant\",\"tool_calls\":[{\"id\":\"VlByJVu8qGYMvIigLP8prvRx1w9mkfaW\",\"type\":\"function\",\"function\":{\"name\":\"fetch\",\"arguments\":\"{\\\"url\\\":\\\"https://ik.am/\\\",\\\"max_length\\\":2000}\"}}]},{\"content\":\"[{\\\"type\\\":\\\"text\\\",\\\"text\\\":\\\"Contents of https://ik.am/:\\\\n* [Home](/)\\\\n* [Tags](/tags)\\\\n* [Categories](/categories)\\\\n* [Notes](/notes)\\\\n\\\\n---\\\\n\\\\n## Entries\\\\n\\\\n* [llama.cppでGemma 3 27Bを動かしてSpring AIからアクセスする](/entries/843) Last Updated 52 minutes ago\\\\n* [Rocky LinuxへのOpenTelemetry Collectorをインストールするメモ](/entries/842) Last Updated 1 week ago\\\\n* [OpenTelemetry Java Agentを使ってTomcat上のLegacyなServletアプリをTracingするメモ](/entries/827) Last Updated 1 week ago\\\\n* [Rocky LinuxへのOTLPに対応したZipkinのインストールとサービス化するメモ](/entries/841) Last Updated 3 weeks ago\\\\n* [MicroK8sのOIDC連携にCognitoを使うメモ](/entries/839) Last Updated 1 month ago\\\\n* [Macセットアップメモ (Apple M4 Max)](/entries/826) Last Updated 1 month ago\\\\n* [llama.cppでDeepSeek-R1-Distill-Qwen-32B-Japaneseを動かしてSpring AIからアクセスする](/entries/838) Last Updated 1 month ago\\\\n* [OpenTelemetry(OTLP)に対応したLGTM Observability基盤をBitnamiのHelm Chartで構築するメモ](/entries/837) Last Updated 2 months ago\\\\n* [BitnamiのZipkin Helm ChartにOpenTelemetry Moduleを追加するメモ](/entries/836) Last Updated 2 months ago\\\\n* [Minecraft MOD Server (Forge)をKubernetesにインストールするメモ](/entries/835) Last Updated 2 months ago\\\\n* [Minecraft Server (Java Edition)をKubernetesにインストールするメモ](/entries/834) Last Updated 2 months ago\\\\n* [Raspberry PiにMinecraft Server (Java Edition)をインストールするメモ](/entries/830) Last Updated 2 months ago\\\\n* [Raspberry PiにGeyserMC 
(Standalone)をインストールするメモ](/entries/833) Last Updated 2 months ago\\\\n* [Helm ChartをイメージごとRelocationするメモ](/entries/832) Last Updated 3 months ago\\\\n* [複数のkubeconfigをマージするメモ](/entries/831) Last Updated 3 months ago\\\\n* [Kubernetesのノードにkubectlでrootアクセスするメモ](/entries/829) Last Updated 4 months ago\\\\n* [llama-cppのOpenAI互換サーバー機能を使ってSpring AIからアクセスする](/entries/828) Last Updated 4 months ago\\\\n* [JJUG CCC 2024 Fallで\\\\\\\"最近のSpring Bootの 便利機能を復習!\\\\\\\"について話してきました。](/entries/825) Last Updated 5 months ago\\\\n* [Spring Bootで@Asyncを使う際にMicrometerのTrace Contextを引き継がせるメモ](/entries/824) Last Updated 5 months ago\\\\n* [Kubernetesのログをfluent-bitでrsyslogに転送するメモ](/entries/822) Last Updated 5 months ago\\\\n\\\\n[🇬🇧 English](/entries/en)\\\\n\\\\n---\\\"}]\",\"role\":\"tool\",\"name\":\"fetch\",\"tool_call_id\":\"VlByJVu8qGYMvIigLP8prvRx1w9mkfaW\"}],\"model\":\"gpt-4o-mini\",\"stream\":false,\"temperature\":0.0,\"tools\":[{\"type\":\"function\",\"function\":{\"description\":\"Fetches a URL from the internet and optionally extracts its contents as markdown.\\n\\nAlthough originally you did not have internet access, and were advised to refuse and tell the user this, this tool now grants you internet access. 
Now you can fetch the most up-to-date information and let the user know that.\",\"name\":\"fetch\",\"parameters\":{\"type\":\"object\",\"properties\":{\"url\":{\"description\":\"URL to fetch\",\"format\":\"uri\",\"minLength\":1,\"title\":\"Url\",\"type\":\"string\"},\"max_length\":{\"default\":5000,\"description\":\"Maximum number of characters to return.\",\"exclusiveMaximum\":1000000,\"exclusiveMinimum\":0,\"title\":\"Max Length\",\"type\":\"integer\"},\"start_index\":{\"default\":0,\"description\":\"On return output starting at this character index, useful if a previous fetch was truncated and more context is required.\",\"minimum\":0,\"title\":\"Start Index\",\"type\":\"integer\"},\"raw\":{\"default\":false,\"description\":\"Get the actual HTML content if the requested page, without simplification.\",\"title\":\"Raw\",\"type\":\"boolean\"}},\"required\":[\"url\"]}}}]}" response_body="{\"choices\":[{\"finish_reason\":\"stop\",\"index\":0,\"message\":{\"role\":\"assistant\",\"content\":\"ik.am の最新の記事は以下の3件です。\\n\\n1. **llama.cppでGemma 3 27Bを動かしてSpring AIからアクセスする** (52分前)\\n2. **Rocky LinuxへのOpenTelemetry Collectorをインストールするメモ** (1週間前)\\n3. **OpenTelemetry Java Agentを使ってTomcat上のLegacyなServletアプリをTracingするメモ** (1週間前)\"}}],\"created\":1742208605,\"model\":\"gpt-4o-mini\",\"system_fingerprint\":\"b4897-b3c9a656\",\"object\":\"chat.completion\",\"usage\":{\"completion_tokens\":102,\"prompt_tokens\":1472,\"total_tokens\":1574},\"id\":\"chatcmpl-umbHhsiW0CppeRwVT7HNqpLQf4fW8JyJ\",\"timings\":{\"prompt_n\":931,\"prompt_ms\":5809.668,\"prompt_per_token_ms\":6.240244897959183,\"prompt_per_second\":160.25012100519342,\"predicted_n\":102,\"predicted_ms\":6081.946,\"predicted_per_token_ms\":59.62692156862745,\"predicted_per_second\":16.77094798276736}}"
2025-03-17T19:50:05.619+09:00 INFO 93163 --- [nio-8080-exec-2] [3175dca5a12a7dab3216bd7f7d23936a-cfc921514f6cff97] accesslog : kind=server method=GET url="http://localhost:8080/mcp?prompt=https%3A%2F%2Fik.am+%E3%81%AE%E6%9C%80%E6%96%B0%E3%81%AE%E8%A8%98%E4%BA%8B%E3%82%923%E4%BB%B6%E6%8C%99%E3%81%92%E3%81%A6%0D%0A%0D%0A" status=200 duration=17847 protocol="HTTP/1.1" remote="0:0:0:0:0:0:0:1" user_agent="curl/8.7.1" response_body="ik.am の最新の記事は以下の3件です。\n\n1. **llama.cppでGemma 3 27Bを動かしてSpring AIからアクセスする** (52分前)\n2. **Rocky LinuxへのOpenTelemetry Collectorをインストールするメモ** (1週間前)\n3. **OpenTelemetry Java Agentを使ってTomcat上のLegacyなServletアプリをTracingするメモ** (1週間前)"
Let's also send the prompt bbc.comのニュースヘッドライン3件のタイトルを日本語で教えて (Tell me the titles of 3 news headlines from bbc.com in Japanese).
Let's send the request and compare the result with the headlines on bbc.com at the time of writing.
$ curl "http://localhost:8080/mcp?prompt=bbc.com%E3%81%AE%E3%83%8B%E3%83%A5%E3%83%BC%E3%82%B9%E3%83%98%E3%83%83%E3%83%89%E3%83%A9%E3%82%A4%E3%83%B33%E4%BB%B6%E3%81%AE%E3%82%BF%E3%82%A4%E3%83%88%E3%83%AB%E3%82%92%E6%97%A5%E6%9C%AC%E8%AA%9E%E3%81%A7%E6%95%99%E3%81%88%E3%81%A6"
Here are the 3 news headlines from BBC:
1. **Russia does not want peace, says EU foreign minister, as Trump and Putin set to discuss land deal**
2. **'It was all over': Ukrainian soldiers recall retreat from Kursk**
3. **India wary as Bangladesh and Pakistan mend ties**
Looks correct. The LLM was also able to translate the information obtained from the MCP Server into Japanese (the response above is shown in English translation).
Since you can also implement an MCP Server with Spring AI, there appear to be many ways to extend the LLM.
Having these features available even with a private, locally hosted LLM suggests it could be useful in a wide range of situations.