A small group of engineers working to make large language models a natural part of the Java world — not a foreign dependency bolted on from the outside.
Each project is independent — a focused tool that stands on its own.
Distributed LLM inference engine, fully on the JVM. Reads GGUF models directly, runs the transformer forward pass in pure Java, shards across commodity GPU nodes via gRPC. No Python, no JNI wrappers, no subprocess.
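Reading GGUF directly means parsing its binary header in plain `java.nio` — no native code needed. As a rough illustration (this is a hypothetical sketch, not Juno's actual API), GGUF files open with the ASCII magic "GGUF" followed by a little-endian uint32 version, which a pure-Java loader can validate like this:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class GgufHeader {
    // GGUF files begin with the 4-byte ASCII magic "GGUF",
    // which reads as 0x46554747 when interpreted as a
    // little-endian uint32, followed by a uint32 format version.
    static final int GGUF_MAGIC = 0x46554747;

    static boolean hasGgufMagic(byte[] firstBytes) {
        if (firstBytes.length < 8) return false; // magic + version
        ByteBuffer buf = ByteBuffer.wrap(firstBytes)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        return buf.getInt() == GGUF_MAGIC;
    }

    public static void main(String[] args) {
        // Simulated first 8 bytes of a GGUF file: magic + version 3.
        byte[] header = {'G', 'G', 'U', 'F', 3, 0, 0, 0};
        System.out.println(hasGgufMagic(header)); // prints "true"
    }
}
```

In a real loader the header would be followed by tensor metadata and key-value pairs, typically memory-mapped rather than copied.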
Pick an open issue, send a PR. All modules have their own test suite — contributions keep them green. Java 25+, Maven 3.9. github.com/ml-cab/juno
GPU numbers on real hardware are the most useful thing right now. If you have CUDA access, run the integration suite and report results in an issue.
Tried Juno on your cluster? Found a rough edge? Open an issue or start a discussion. Specific reports are more useful than general opinions.
The architecture doc is thorough but dense. Better guides, worked examples, and diagrams for specific use-cases are genuinely welcome.
This is volunteer work. Infrastructure, GPU time for testing, and time itself cost something. If Juno is useful to your organization, consider supporting its development.
ML Cabinet is an informal group of engineers with a shared premise: the Java ecosystem deserves production-quality ML tooling, not wrappers around Python libraries.
We're not a company. There's no roadmap committee or product manager. Projects start because someone has a real problem and grow because others find them useful.
The name comes from the idea of a cabinet of tools — each made carefully for one purpose, all interoperable.