Photo by Conner Baker on Unsplash

Operating large language models in-house

At Home

Article from ADMIN 88/2025
An internal AI server is an interesting way to retain data sovereignty. We show you how to set up an in-house AI server on your hardware and use it in parallel with AI services such as ChatGPT in the cloud.

Operating your own artificial intelligence (AI) server in your data center offers a number of advantages over cloud services. One decisive factor is retaining complete control over sensitive company data: The data never leaves your network, which improves data security and helps you comply with strict data protection requirements, especially in highly regulated industries. Moreover, an in-house AI server delivers consistent performance that does not depend on an Internet connection or external providers. Because requests do not have to travel to an external data center, data processing latency is reduced, which is particularly beneficial for computationally intensive tasks such as image or speech analysis.

Another advantage is the ability to customize your hardware and software environments. You can scale and configure your servers individually to meet the specific requirements of your AI applications, without being restricted by the standardized services of cloud providers. In the long term, an in-house server can also prove to be more cost-efficient, because recurring cloud fees are eliminated and the infrastructure can be fully amortized. Being independent of price adjustments or service conditions imposed by external providers also gives you financial and operational peace of mind.
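The cost argument can be made concrete with a simple break-even estimate. The following is a minimal sketch, not a full cost model: The function name and all figures (purchase price, monthly running costs, monthly cloud bill) are hypothetical placeholders that you would replace with your own numbers.

```python
# Rough break-even sketch for in-house vs. cloud AI costs.
# All figures used below are hypothetical assumptions, not measured values.

def break_even_months(hardware_cost, monthly_running_cost, monthly_cloud_cost):
    """Return the number of months until the in-house server has paid
    for itself, or None if it never does.

    hardware_cost        -- one-time purchase price of the server
    monthly_running_cost -- power, cooling, maintenance per month
    monthly_cloud_cost   -- cloud bill the server would replace per month
    """
    monthly_saving = monthly_cloud_cost - monthly_running_cost
    if monthly_saving <= 0:
        # Running the server costs at least as much as the cloud service,
        # so the purchase price is never recovered.
        return None
    return hardware_cost / monthly_saving


if __name__ == "__main__":
    months = break_even_months(
        hardware_cost=60000.0,        # assumed purchase price
        monthly_running_cost=800.0,   # assumed power and maintenance
        monthly_cloud_cost=3300.0,    # assumed cloud bill being replaced
    )
    if months is None:
        print("The in-house server never breaks even with these figures.")
    else:
        print(f"Break-even after about {months:.0f} months")
```

The sketch deliberately ignores factors such as staff time, hardware depreciation, and changing cloud prices, all of which can shift the result considerably.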

Hardware Requirements

The equipment for your large language model (LLM) environment depends on your requirements and the number of users, but the choice of graphics processing unit (GPU) is crucial for AI workloads. GPUs such as the NVIDIA A100 or the newer H100 are the market leaders because they are specifically optimized for deep learning and machine learning. These GPUs include Tensor Cores, hardware units that specialize in the matrix calculations at the core of neural networks and that offer a massive speed boost for both training and inference.
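When sizing a GPU, a useful first check is whether the model weights fit into GPU memory at all. The following is a rough sketch under stated assumptions: It estimates memory as parameter count times bytes per parameter, and the 20 percent headroom for runtime overhead is an assumed placeholder, not a measured value. The function names and the model sizes are illustrative only.

```python
# Rough sizing sketch: memory needed to hold LLM weights on a GPU.
# Model sizes and the headroom factor are illustrative assumptions.

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billion, precision="fp16", headroom=1.2):
    """Estimate memory in GB (10^9 bytes) for a model's weights.

    params_billion -- model size in billions of parameters
    precision      -- key into BYTES_PER_PARAM
    headroom       -- assumed multiplier for runtime overhead
    """
    # One billion parameters at one byte each occupy one GB.
    return params_billion * BYTES_PER_PARAM[precision] * headroom


def fits(params_billion, gpu_memory_gb, precision="fp16", headroom=1.2):
    """Return True if the estimate fits into the given GPU memory."""
    return weight_memory_gb(params_billion, precision, headroom) <= gpu_memory_gb


if __name__ == "__main__":
    gpu_memory_gb = 80  # memory size of common A100 and H100 variants
    for size in (7, 70):
        for precision in BYTES_PER_PARAM:
            need = weight_memory_gb(size, precision)
            ok = fits(size, gpu_memory_gb, precision)
            print(f"{size}B @ {precision}: ~{need:.0f} GB, fits: {ok}")
```

Actual memory use also depends on context length, batch size, and the number of concurrent users, so treat the result as a lower bound rather than a guarantee.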

The H100 is based on the Hopper architecture and offers significant performance gains with lower power consumption


...

