Open source · Java ecosystem

Machine Learning Cabinet

Java-native tools for running and fine-tuning language models - on-prem, air-gapped, or in the cloud. No hype, no SaaS, no Python.

Projects
What we're building
Active · Session 31

Juno

Java-native distributed LLM inference and fine-tuning. Runs open-source GGUF models locally, in a cluster, or embedded as a JVM library. OpenAI-compatible REST API included. No Python. No NCCL. No InfiniBand required.


At a glance

  • LLaMA · Mistral · TinyLlama · Meta-Llama 3 · Gemma architectures; Phi-3 under development
  • CUDA 12.x and ROCm 6+ GPU backends via Panama FFI - FP16 resident weights, OOM fallback to CPU
  • Pipeline and tensor parallelism
  • FLOAT32 / FLOAT16 / INT8 activation wire formats
  • LoRA fine-tuning, inference overlay, and merge to standalone GGUF
  • OpenAI-compatible REST API - swap base URL, no glue code needed
  • Maven BOM on Central - cab.ml:juno-bom:0.1.0
  • AWS cluster automation via juno-deploy.sh
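
As a rough illustration of the "swap base URL" point above, here is a minimal sketch of a chat request sent with the JDK's built-in HTTP client. The base URL `http://localhost:8080/v1`, the model name `tinyllama`, and the class and helper names are assumptions made for the example, not values documented by Juno; the request body follows the general OpenAI chat-completions shape.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class JunoChatExample {

    // Builds a POST to {baseUrl}/chat/completions using the OpenAI chat format.
    // baseUrl, model and prompt are supplied by the caller; nothing here is Juno-specific.
    static HttpRequest buildRequest(String baseUrl, String model, String prompt) {
        String body = "{\"model\":\"" + escape(model) + "\","
                + "\"messages\":[{\"role\":\"user\",\"content\":\"" + escape(prompt) + "\"}]}";
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/chat/completions"))
                .timeout(Duration.ofSeconds(60))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    // Minimal escaping for backslashes and quotes only; use a JSON library for real input.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) throws Exception {
        // Assumed local endpoint; pass your own base URL as the first argument.
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:8080/v1";
        HttpRequest request = buildRequest(baseUrl, "tinyllama", "Hello from the JVM");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Only the base URL changes between a hosted OpenAI-style endpoint and a local one, so existing client code should be able to keep its request and response handling as-is.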
Principles
Our mission

We are bringing Java to the ML field with pure-JVM distributed inference and training.

On-prem first and network-ready. No mandatory cloud dependency, no SaaS lock-in. No Python, no GIL: pure JVM orchestration on a performance-proven engine.

JFR metrics. Custom Flight Recorder events across hot paths - instrumentation-driven development that shows you what is happening inside your runtime.
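
For readers new to JFR, a custom event around a hot path can look roughly like the sketch below. It uses only the standard `jdk.jfr` API; the event name `example.TokenStep`, its fields, and the `forward` method are hypothetical stand-ins, not Juno's actual events or code.

```java
import jdk.jfr.Category;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;

class JfrHotPathExample {

    // Hypothetical event type; the name, labels and fields are illustrative only.
    @Name("example.TokenStep")
    @Label("Token Step")
    @Category({"Example", "Inference"})
    static class TokenStepEvent extends Event {
        @Label("Layer Index")
        int layer;

        @Label("Tokens Processed")
        int tokens;
    }

    // Stand-in for a hot path: does some arithmetic and times it with a JFR event.
    static int forward(int layer, int tokens) {
        TokenStepEvent event = new TokenStepEvent();
        event.begin();                      // start timing
        int checksum = 0;
        for (int i = 0; i < tokens; i++) {
            checksum += (layer + 1) * i;
        }
        event.end();                        // stop timing
        if (event.shouldCommit()) {         // true only if a recording wants this event
            event.layer = layer;
            event.tokens = tokens;
            event.commit();
        }
        return checksum;
    }

    public static void main(String[] args) {
        for (int layer = 0; layer < 4; layer++) {
            System.out.println(forward(layer, 1_000));
        }
    }
}
```

Without an active recording the event calls are close to free. Starting the JVM with `-XX:StartFlightRecording=filename=run.jfr` and then running `jfr print --events example.TokenStep run.jfr` should list the recorded events with their durations and field values.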

Honest documentation. Solid testing. Clear issue tracking. Real benchmarks.

Contribute
How to get involved

Code

Fork the project and pick up an open issue from the board. When your change is ready, move the issue to QA and add a comment that references your fork.

Benchmarks

Please send us your performance report, especially if you are running on a GPU. Mail dev@ml.cab and include: GPU card details, the juno startup command, the conversation log, and the JFR Metrics Summary section.

Bug reports

Tried Juno on your setup? Found a rough edge? Open an issue on the board. Specific, reproducible reports move things forward fastest.