Java-native tools for running and fine-tuning language models — on-prem, air-gapped, or in the cloud. No hype, no SaaS, no Python.
Java-native distributed LLM inference and fine-tuning. Runs open-source GGUF models locally, in a cluster, or embedded as a JVM library. OpenAI-compatible REST API included. No Python. No NCCL. No InfiniBand required.
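Because the REST API follows the OpenAI wire format, any stock HTTP client on the JVM can talk to it. A minimal sketch using only `java.net.http`; the endpoint URL, port, and model name here are placeholder assumptions, not Juno defaults:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ChatClient {
    // Assumed local endpoint for illustration; adjust host/port to your deployment.
    static final String ENDPOINT = "http://localhost:8080/v1/chat/completions";

    // Builds a minimal OpenAI-style chat-completion payload.
    // Note: no JSON escaping of inputs here; illustration only.
    static String chatRequestJson(String model, String prompt) {
        return """
            {"model":"%s","messages":[{"role":"user","content":"%s"}]}"""
            .formatted(model, prompt);
    }

    public static void main(String[] args) throws Exception {
        String body = chatRequestJson("llama-3-8b-q4", "Hello from the JVM");
        HttpRequest req = HttpRequest.newBuilder(URI.create(ENDPOINT))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
        if (args.length > 0) {
            // Only send when a server is actually running (pass any argument to enable).
            HttpResponse<String> resp = HttpClient.newHttpClient()
                .send(req, HttpResponse.BodyHandlers.ofString());
            System.out.println(resp.body());
        } else {
            System.out.println(body);
        }
    }
}
```

The same request works with curl or any OpenAI SDK pointed at the local base URL, which is the point of wire-format compatibility.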
No Python. No Spring Boot. No framework bloat. The JVM reads GGUF directly and runs inference end to end.
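Reading GGUF from the JVM starts with the format's fixed little-endian header: a 4-byte magic ("GGUF"), a uint32 version, a uint64 tensor count, and a uint64 metadata key-value count. A minimal parsing sketch with `java.nio` (a synthetic buffer stands in for a real model file; this is not Juno's internal reader):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class GgufHeader {
    // "GGUF" read as a little-endian uint32
    static final int GGUF_MAGIC = 0x46554747;

    record Header(int version, long tensorCount, long metadataKvCount) {}

    static Header parse(ByteBuffer buf) {
        buf.order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt() != GGUF_MAGIC) {
            throw new IllegalArgumentException("not a GGUF file");
        }
        int version = buf.getInt();      // uint32, currently 3
        long tensors = buf.getLong();    // uint64 tensor count
        long kvs = buf.getLong();        // uint64 metadata kv count
        return new Header(version, tensors, kvs);
    }

    public static void main(String[] args) {
        // Synthetic header: magic, version 3, 2 tensors, 5 metadata entries.
        ByteBuffer buf = ByteBuffer.allocate(24).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(GGUF_MAGIC).putInt(3).putLong(2L).putLong(5L).flip();
        System.out.println(parse(buf));
    }
}
```

In practice the header would come from a memory-mapped file (`FileChannel.map`), after which the metadata key-value section and tensor descriptors follow.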
On-prem and air-gapped first. No mandatory cloud dependency, no telemetry, no SaaS lock-in.
Tests before features. A module without a test suite is a module that cannot be trusted.
Honest documentation. Known gaps, open issues, real benchmarks — not marketing copy.
Pick an open issue and send a PR. Every module has its own test suite. github.com/ml-cab/juno
GPU numbers on real hardware are the most useful contribution right now. CUDA 12.x access and the integration suite are all you need.
Tried Juno on your setup? Found a rough edge? Open an issue. Specific, reproducible reports move things forward fastest.