A small group of engineers working to make large language models a natural part of the Java world — not a foreign dependency bolted on from the outside.
Each project is independent — a focused tool that stands on its own.
Distributed LLM inference engine, fully on the JVM. Reads GGUF models directly, runs the transformer forward pass in pure Java, shards across commodity GPU nodes via gRPC. No Python, no JNI wrappers, no subprocess.
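Reading GGUF directly means parsing its binary header in plain `java.nio` — no native code needed. As a rough illustration (this is a hypothetical sketch, not Juno's actual API), GGUF files open with the ASCII magic "GGUF" followed by a little-endian uint32 version, which a pure-Java loader can validate like this:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class GgufHeader {
    // GGUF files begin with the 4-byte ASCII magic "GGUF",
    // which reads as 0x46554747 when interpreted as a
    // little-endian uint32, followed by a uint32 format version.
    static final int GGUF_MAGIC = 0x46554747;

    static boolean hasGgufMagic(byte[] firstBytes) {
        if (firstBytes.length < 8) return false; // magic + version
        ByteBuffer buf = ByteBuffer.wrap(firstBytes)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        return buf.getInt() == GGUF_MAGIC;
    }

    public static void main(String[] args) {
        // Simulated first 8 bytes of a GGUF file: magic + version 3.
        byte[] header = {'G', 'G', 'U', 'F', 3, 0, 0, 0};
        System.out.println(hasGgufMagic(header)); // prints "true"
    }
}
```

In a real loader the header would be followed by tensor metadata and key-value pairs, typically memory-mapped rather than copied.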
Pick an open issue, send a PR. All modules have their own test suite — contributions keep them green. Java 25+, Maven 3.9. github.com/ml-cab/juno
GPU numbers on real hardware are the most useful thing right now. If you have CUDA access, run the integration suite and report results in an issue.
Tried Juno on your cluster? Found a rough edge? Open an issue or start a discussion. Specific reports are more useful than general opinions.
The architecture doc is thorough but dense. Better guides, worked examples, and diagrams for specific use-cases are genuinely welcome.
This is volunteer work. Infrastructure, GPU time for testing, and time itself cost something. If Juno is useful to your organization, consider supporting its development.
ML Cabinet is an informal group of engineers with a shared premise: the Java ecosystem deserves production-quality ML tooling, not wrappers around Python libraries.
We're not a company. There's no roadmap committee or product manager. Projects start because someone has a real problem and grow because others find them useful.
The name comes from the idea of a cabinet of tools — each made carefully for one purpose, all interoperable.