Jun 27, 2026 - eRacks Publishes a Private-AI Sizing Guide: 70B-Class Models Run On-Premise from $5,995

Fremont, CA - Jun 27, 2026

FOR IMMEDIATE RELEASE

eRacks Publishes a Private-AI Sizing Guide: 70B-Class Models Run On-Premise from $5,995

Plain-English guidance on how much GPU memory and system RAM each model size needs, when self-hosting beats cloud APIs, and how to keep AI data entirely in-house, for healthcare, legal, government, and research teams that cannot send data to a third party.

eRacks Open Source Systems today published "Private AI, Sized & Priced," a vendor-neutral buyer's guide for organizations that want to run large language models on their own hardware rather than through a cloud API, and announced that a 70-billion-parameter model now runs on-premise on its AILSA server starting at $5,995.

The guide addresses the question eRacks hears most from prospective AI buyers: how big a machine do I actually need? It explains that AI servers are priced largely by GPU memory (VRAM) and gives a model-size-to-VRAM table: a 70B model needs roughly 42 GB in 4-bit quantization and about 80 GB in 8-bit. It then adds the part most buyers miss: system RAM should be sized to about 1.5 to 2 times the VRAM, and a multi-GPU server needs a server-class CPU to supply enough PCIe lanes.

The guide also lays out the cost crossover where owning hardware beats a per-user cloud subscription. A 30-person team paying $30 per user per month for a cloud AI service spends about $10,800 a year, every year, with all prompts transiting a third party's servers. At the $5,995 starting price, an on-premise server covering the same everyday inference pays for itself in under a year.
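The arithmetic behind these figures can be checked with a short Python sketch. It is an illustration only, not a tool from the guide: the function names are hypothetical, the weight estimate counts model weights alone (the guide's 42 GB and 80 GB figures are higher, presumably because they leave room for runtime overhead such as the context cache), and the payback estimate assumes hardware price is the only cost, ignoring power, hosting, and staff time.

```python
def weight_gb(params_billion: float, bits: int) -> float:
    """Memory for the model weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: this is a lower bound; real deployments need extra VRAM
    for runtime overhead, which is why the guide quotes higher figures.
    """
    return params_billion * bits / 8


def ram_range_gb(vram_gb: float) -> tuple[float, float]:
    """System RAM range using the release's 1.5x to 2x VRAM rule of thumb."""
    return (1.5 * vram_gb, 2.0 * vram_gb)


def annual_cloud_cost(users: int, per_user_month: float) -> float:
    """Yearly spend on a per-seat cloud subscription."""
    return users * per_user_month * 12


def payback_months(hardware_cost: float, users: int, per_user_month: float) -> float:
    """Months until hardware cost equals cumulative subscription spend.

    Assumption: hardware price is the only on-premise cost considered.
    """
    return hardware_cost / (users * per_user_month)


if __name__ == "__main__":
    print(weight_gb(70, 4))                    # weights only, 4-bit
    print(weight_gb(70, 8))                    # weights only, 8-bit
    print(ram_range_gb(42))                    # RAM range for the 42 GB figure
    print(annual_cloud_cost(30, 30))           # the release's 30-person example
    print(round(payback_months(5995, 30, 30), 1))
```

With the release's numbers, the 30-person example comes to $10,800 a year, and the $5,995 entry price is recovered in roughly seven months under these simplified assumptions.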

For teams that handle regulated or confidential data, such as protected health information, attorney-client material, government data, or unpublished research, the calculus is simpler still: the data legally or contractually cannot leave their control, and a private server is the only option.

eRacks AI servers run entirely on open-source software: Ubuntu Linux with Ollama, Open WebUI, vLLM, llama.cpp, and PyTorch pre-installed and tested. Staff reach the AI from a browser on day one, with no per-seat fees, no per-token billing, and no data leaving the building. The line spans the 2U AILSA (from $5,995, up to 96 GB of GPU memory) through the 4U AISHA (up to 256 GB, for 70B models at full 8-bit precision).
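For teams that want to script against the server rather than use the browser interface, a local request might look like the Python sketch below. It assumes Ollama is listening on its default local port (11434) and uses Ollama's documented /api/generate endpoint; the model tag and the helper names are placeholders chosen for illustration, not details taken from the eRacks configuration.

```python
import json
import urllib.request

# Assumption: Ollama's default local address; adjust if the server differs.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: placeholder model tag; use whichever model is actually installed.
MODEL = "llama3.1:70b"


def build_request(prompt: str, model: str = MODEL,
                  url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a POST request for Ollama's generate endpoint (non-streaming)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt), timeout=300) as resp:
        return json.loads(resp.read())["response"]
```

Calling ask("Summarize this contract clause: ...") on the server itself sends the prompt only to localhost, so the text never leaves the machine.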

The guide is published at https://eracks.com/guides/private-ai-sizing/. eRacks has built custom open-source servers since 1999, and every system is configured to order and tested before shipping. The company will size a private AI server to a customer's exact models and user count at no charge.

About eRacks Open Source Systems

eRacks Open Source Systems is an open-source server and storage specialist founded in 1999 and headquartered in Fremont, CA. The company builds rackmount servers, NAS, HPC clusters, and AI inference servers configured to customer requirements, running Linux and open-source software. eRacks serves businesses, research institutions, healthcare providers, and government agencies worldwide.

Media Contact

Joseph Wolff
eRacks Open Source Systems
joe@eracks.com
https://eracks.com