こちらの記事でllama-cpp-pythonを試しましたが、今度はllama.cppのJavaバインディングを試します。
次の依存ライブラリを追加
<dependency>
<groupId>de.kherud</groupId>
<artifactId>llama</artifactId>
<version>2.2.1</version>
</dependency>
次のコードを記述
package org.example;
import de.kherud.llama.InferenceParameters;
import de.kherud.llama.LlamaModel;
import de.kherud.llama.ModelParameters;
public class Main {
public static void main(String... args) {
ModelParameters modelParams = new ModelParameters.Builder()
.setNGpuLayers(1)
.build();
InferenceParameters inferParams = new InferenceParameters.Builder()
.setTemperature(1f)
.build();
try (LlamaModel model = new LlamaModel("/opt/models/mistral-7b-instruct-v0.1.Q4_K_M.gguf", modelParams)) {
Iterable<LlamaModel.Output> outputs = model.generate("""
[INST]
Tell me a joke.
[/INST]
""", inferParams);
for (LlamaModel.Output output : outputs) {
System.out.print(output);
}
}
}
}
実行
$ mvn -q compile exec:java -Dexec.mainClass=org.example.Main 2> /dev/null
/de/kherud/llama/Mac/aarch64
Extracted 'ggml-metal.metal' to '/var/folders/6p/vxhp1wpj2mq5w8drct9k8t4w0000gq/T/ggml-metal.metal'
Extracted 'libllama.dylib' to '/var/folders/6p/vxhp1wpj2mq5w8drct9k8t4w0000gq/T/libllama.dylib'
Extracted 'libjllama.dylib' to '/var/folders/6p/vxhp1wpj2mq5w8drct9k8t4w0000gq/T/libjllama.dylib'
Why don't scientists trust atoms?
Because they make up everything!%
JavaアプリにローカルLLMを組み込むのが簡単になりそうです。