Frequently Asked Questions


Everything you need to know about on-prem enterprise AI deployment, NVIDIA DGX hardware, data sovereignty, pricing, and managed operations.

Getting Started

How quickly can we have production AI running?

Most engagements go from kickoff to first production workflow in 3-4 weeks. Week 1 is discovery, weeks 2-3 are hardware deployment and configuration, and by week 4 you have a working AI system with real data.

Do we need AI specialists on staff?

No. That's the point. We're your AI team. We handle hardware, models, configuration, and ongoing operations. Your IT team learns alongside us, but you don't need to hire AI engineers.

What does a typical engagement look like?

We start with a paid discovery sprint to identify the highest-impact AI opportunities in your business. If it's a fit, we deploy hardware, configure your first workflow, and provide ongoing managed operations.

Technical

What hardware do you deploy?

NVIDIA DGX Spark — enterprise-grade AI infrastructure with 1 petaflop of compute and 128GB unified memory. It sits in your server room, on your network.

What models can it run?

Open-source and NVIDIA-optimized models including Llama, Mistral, and domain-specific fine-tuned models. We also support cloud burst to NVIDIA NIM when needed.

How does this integrate with our existing systems?

We build integrations to your specific systems as part of the implementation. This isn't a generic plug-and-play — it's configured for your environment.

Security & Compliance

Does our data leave the building?

No. The entire system runs on hardware in your building, on your network. No API calls to external AI providers, no data in transit to third-party servers.

Is this HIPAA and SOC 2 compliant?

The architecture is designed for compliance. Because your data never leaves your network, the system inherits your existing security posture. We deploy with audit logging, access controls, and encryption built in.

Can AI models be trained on our data without consent?

No. Unlike cloud AI providers, there's no third party involved. Your models run on your hardware. Your data stays yours.

Pricing

How does pricing work?

Fixed annual cost covering hardware, implementation, and managed operations. No per-token metering, no usage surprises. You know exactly what you're paying.

What does a pilot engagement cost?

Discovery sprints range from $10K to $25K depending on scope. This isn't a free trial — it's a real engagement where we identify specific AI opportunities and prove value before you commit to a full deployment.

How does this compare to cloud AI costs?

At enterprise scale, cloud AI per-token costs compound fast and become unpredictable. Fixed-cost on-prem eliminates the variable — you get unlimited inference on hardware you own.
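As a rough illustration of why per-token costs become unpredictable at scale, the sketch below compares metered cloud spend against a fixed annual contract. All figures are hypothetical assumptions for the arithmetic, not OpenGate or cloud-provider pricing:

```python
# Illustrative break-even sketch. All dollar figures and token volumes are
# hypothetical assumptions, not actual OpenGate or cloud-provider pricing.

def annual_cloud_cost(tokens_per_day: int, price_per_million: float) -> float:
    """Annual cost of metered cloud inference at a steady daily token volume."""
    return tokens_per_day * 365 * price_per_million / 1_000_000

FIXED_ANNUAL_COST = 150_000.0   # assumed fixed on-prem contract ($/year)
PRICE_PER_MILLION = 10.0        # assumed blended cloud price ($ per 1M tokens)

# Daily token volume at which metered cloud spend matches the fixed cost:
break_even_tokens_per_day = FIXED_ANNUAL_COST / 365 / PRICE_PER_MILLION * 1_000_000

print(f"Cloud cost at 50M tokens/day: ${annual_cloud_cost(50_000_000, PRICE_PER_MILLION):,.0f}/yr")
print(f"Break-even volume: {break_even_tokens_per_day / 1_000_000:.1f}M tokens/day")
```

Under these assumed numbers, workloads above roughly 41M tokens per day cost more on metered cloud pricing than the fixed contract, and the gap widens as usage grows across teams — the compounding the answer above describes.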

On-Prem vs Cloud AI

What is on-premises AI and how is it different from cloud AI?

On-premises AI runs on hardware physically located in your building, on your network. Unlike cloud AI services such as OpenAI or Azure AI, which process your data on external servers, on-prem AI keeps 100% of your data inside your firewall. You get predictable fixed costs instead of per-token pricing, full data sovereignty, and no dependency on internet connectivity for AI operations.

Why are enterprises moving AI from cloud to on-premises?

The same pattern that played out with cloud computing is now happening with AI: companies moved to the cloud for convenience, then realized that costs compound fast at scale and that their data leaves their control. Enterprise AI repatriation is driven by unpredictable per-token costs, data sovereignty concerns (especially in regulated industries), and the need for consistent performance without cloud variability.

Is on-prem AI more expensive than cloud AI?

At enterprise scale, on-prem AI is typically less expensive. Cloud AI charges per token — which compounds quickly when running multiple workflows across teams. On-prem AI has a fixed annual cost covering hardware, implementation, and operations. You get unlimited inference on hardware you own, with no usage surprises.

What hardware does OpenGate use for on-prem AI deployment?

We deploy NVIDIA DGX Spark systems — enterprise-grade AI hardware with 1 petaflop of compute and 128GB unified memory. The DGX Spark sits in your server room, connects to your network, and runs AI models locally. As an NVIDIA Inception Partner, we have access to hardware discounts and the broader NVIDIA AI ecosystem.

Still Have Questions?

Talk to our founders directly. No sales pitch — just an honest conversation about whether on-prem AI is right for your organization.

Schedule a Strategy Call