The HP Z2 Mini G1a “David” vs. “Goliath” 9950X3D Workstation Review

HP sent us a five-pound Z2 Mini G1a Workstation, a remarkably compact system that occupies roughly half the space of a shoebox. Designed for professionals in AI/machine learning, 3D modeling, and digital content creation, it’s powered by AMD’s Ryzen AI Max PRO Series processor and notably omits a discrete GPU.

Our 50-pound comparison “workstation” is a high-end consumer PC featuring a Ryzen 9 9950X3D, RTX 5090, and 192 GB of Corsair DDR5-6000 memory. We tested both systems extensively to see how this seemingly uneven matchup would play out.

Before continuing, it’s important to clarify that our PC is not a true workstation like HP’s Z2 Mini G1a. It’s better described as a pseudo-workstation—capable of running most workstation-class tasks but lacking:
• ISV certifications
• ECC memory supported by workstation-grade motherboards
• Professional support and reliability guarantees that HP provides for mission-critical environments

Ours is instead an ultra-high-end Creator PC, ideal for elite gamers, game developers, media and entertainment professionals, and digital creators engaged in 3D rendering, video editing, graphic design, and AI workloads—but who don’t require absolute rendering accuracy.

Although we refer to our PC as a workstation throughout this review (and in the charts), since it handles SPECworkstation and SPECviewperf effectively, it shouldn’t be mistaken for one capable of absolute precision rendering. The distinction ultimately lies in driver optimizations and ISV certifications: A professional graphics card ensures consistent precision across its lineup, while even an RTX 5090 using Studio Drivers cannot guarantee identical, error-free renders or accurate viewport visualization demanded by professionals.

Our pseudo-workstation’s one notable limitation lies in the ASRock X870E Taichi motherboard, which restricts the 4 × 48 GB Corsair DDR5-6000 modules to 4200 MT/s. For performance evaluation, we employed a comprehensive suite of benchmarks, real-world workloads, multiple large language models (LLMs), and games to assess whether HP’s Z2 Mini G1a truly earns its workstation credentials.

Test Configuration

Z2 Mini G1a

Supplied by HP

Z2 Mini G1a Configuration
Processor: AMD Ryzen AI Max+ PRO 395 (Zen 5, 120 W, 3.0/5.1 GHz base/boost, 64 MB L3 cache, 16 cores / 32 threads)
Graphics: AMD Radeon 8060S (20 × RDNA 3.5 workgroup processors up to 2.9 GHz)
NPU: AMD Ryzen AI XDNA 2 (up to 50 TOPS)
Memory: 8 × 16 GB Micron LPDDR5X-8533 ECC (128 GB total, effective 8000 MT/s, soldered to motherboard)
Storage: 2 × Kioxia XG8 1 TB PCIe 4.0 NVMe SSDs (Gen 4 × 4)
Front/Top Ports: 1 × USB-A 10 Gbps, 1 × USB-C 10 Gbps
Rear Ports: 2 × Thunderbolt 4, 1 × USB-A 10 Gbps, 2 × USB-A 2.0 (480 Mbps), 2 × Mini DP 2.1, 1 × RJ-45 Ethernet, 2 × Flex I/O options (dual USB-A 10 G / dual USB-C 10 G / 1, 2.5, and 10 GbE plus 1 GbE fiber)
Connectivity: MediaTek MT7925 Wi-Fi 7 + Bluetooth 5.4
Audio: 3.5 mm headset jack
Power: HP 300 W PSU (internal)
Display: BenQ EW3270U 32″ 4K UHD 60 Hz FreeSync

Custom ‘Workstation’ PC

Supplied by AMD / ASRock / Corsair / NVIDIA

Custom ‘Workstation’ PC Configuration
Processor: AMD Ryzen 9 9950X3D
Motherboard: ASRock Taichi X870E (BIOS 3.50, PBO −0.3 mV, TDP cap 85 °C, SVM disabled)
Memory: 4 × 48 GB Corsair Vengeance DDR5-6000 @ 4200 MT/s (CL30-36-36-76)
Storage: Samsung 990 Pro 1 TB PCIe 4.0 SSD (C:); TEAMGROUP MP44 4 TB PCIe 4.0 SSD (storage)
GPU: NVIDIA RTX 5090 Founders Edition
Cooling: DeepCool Castle 360EX AIO
PSU: Super Flower Leadex SE 1200 W 80+ Platinum
Case: Corsair 5000D
Display: LG C1 48″ 4K 120 Hz

Software Stack

  • Windows 11 Professional 64-bit, version 24H2
  • NVIDIA Studio Driver 581.29 for apps (577.00 for PCMark 10 only) and Game Ready Driver 581.29 for games and 3DMark. All drivers, games, and apps updated
  • UL 3DMark Professional, courtesy of UL
  • UL Procyon Suite, courtesy of UL
  • UL PCMark 10 Professional, courtesy of UL
  • SPECworkstation 4.0
  • SPECviewperf 15
  • PugetBench for DaVinci Resolve (Basic), courtesy of Puget Systems
  • AIDA64 Engineer, courtesy of FinalWire
  • 7-Zip
  • CrystalDiskMark
  • Blender Benchmark 4.5
  • Cinebench 2024
  • LuxMark v3.1 & 4.0
  • CPU-Z Benchmark
  • Geekbench 6 CPU
  • Geekbench AI
  • HWiNFO64
  • LM Studio 0.3.28 – OpenAI gpt-oss 20B & 120B / Meta Llama 3.3 70B
  • MLPerf Client 1.0 – Phi 3.5 Mini Instruct / Phi 4 Reasoning 14B / Llama 3.1 8B Instruct / Llama 2 7B Chat
  • MaxxMem2
  • PassMark Performance Test
  • Novabench
  • Vovsoft RAM Benchmark
  • V-Ray CPU Benchmark
  • y-cruncher
  • Five games – Cyberpunk 2077, Indiana Jones and the Great Circle, Total War: Pharaoh Dynasties & Warhammer, and Sid Meier’s Civilization VII AI benchmark.

Let’s look closely at the Z2 Mini G1a, which lists for just over $5,000 but is available from B&H Photo, an authorized HP dealer, for under $3,350 in the same configuration as our review unit.

Unboxing and First Impressions

HP includes a three-year warranty with the Z2 Mini G1a, which arrives as a complete ready-to-use kit — keyboard, mouse, DP mini-to-full-size adapter, and power cord all included. Just add your own monitor and DP cable.

The system ships with Windows 11 Professional preinstalled (Linux is also available) and includes HP’s Proprietary Software Suite, featuring HP Assistant, Premium+ Enterprise Support, and Wolf Security, which extends far beyond typical antivirus protection.

We were surprised that HP Performance Advisor was not preinstalled, as it’s one of HP’s best workstation utilities. The tool allows users to configure BIOS and performance settings directly from Windows, optimize applications per workload, generate diagnostic reports, and recommend optimal drivers. It’s a must-have for any HP workstation and should be downloaded separately.

We were slightly disappointed that no HDMI ports or adapters were included. However, the Z2 Mini G1a supports up to four 4K displays simultaneously, providing strong multi-monitor flexibility despite its compact form.

Our review unit lacked Flex I/O modules, though they’re affordable upgrades. A fully equipped configuration can provide:
• Up to four Ethernet ports (including 10 GbE)
• Two DB9 serial ports
• Two additional USB ports, allowing for ten total I/O connections when both Flex modules are installed.

The included keyboard is full-sized and functional, though fairly basic; the mouse is equally serviceable and responsive. A key highlight: there’s no external power brick. The 300 W PSU is built directly into the chassis, which opens easily via a rear-mounted touch-to-unlock mechanism for maintenance.

Two 40 mm squirrel-cage fans direct airflow through a copper heatsink atop a vapor-chamber cooling system. Under normal loads, the Z2 Mini G1a operates nearly silently, although at full load, the fans are audible across a room. By default, the system runs in maximum performance mode, but fan RPM can be manually tuned down for quieter operation with minimal performance loss.

During the SPECworkstation Blender benchmark, peak CPU temperature reached 101.6 °C, with a brief 240 W power spike measured with a Kill A Watt meter. Under sustained workloads, the CPU stabilized at around 4.3–4.4 GHz, with power draw settling at a maximum of roughly 220 W, typical for full-load operation.

Source: HP

The Z2 Mini G1a features a sleek, professional design and a robust plastic case. It can be oriented vertically or horizontally, mounted under a desk, racked (up to five units per 4U), or attached to the rear of an HP display.

The rotating HP logo ensures correct orientation whether placed vertically or horizontally.

Rubberized feet prevent slippage and protect surfaces in either orientation.

Convenient side (or top) USB ports and a 3.5 mm headphone jack provide quick access to peripherals and adequate onboard audio.

The rear I/O panel includes two Mini DisplayPort 2.1, two USB 2.0, two USB-A (10 Gbps), two Thunderbolt 4, one USB-C (40 Gbps), 2.5 Gb Ethernet, and a cable-lock slot. Wireless connectivity is handled by Wi-Fi 7 and Bluetooth 5.4.

A rear label provides key system details and regulatory information.

HWiNFO64

Let’s take a closer look at the Z2 Mini G1a using HWiNFO64, whose system summary provides a detailed breakdown of the hardware configuration.

The memory clock speed registers correctly at 4000 MHz, and as it’s DDR (Double Data Rate), data transfers occur twice per clock cycle—yielding an effective rate of 8000 MT/s (megatransfers per second). The installed ECC LPDDR5X memory maintains strong bandwidth with modest latency, as shown below.
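As a rough illustration of the arithmetic above, the sketch below converts a memory clock into an effective transfer rate and a theoretical peak bandwidth. The bus widths (256-bit for the Ryzen AI Max platform, 128-bit for a dual-channel desktop DDR5 setup) are assumptions that this review does not state, and the function names are ours. Measured bandwidth will always be lower than these theoretical peaks.

```python
def effective_rate_mts(clock_mhz: float, transfers_per_clock: int = 2) -> float:
    """DDR memory moves data twice per clock, so MT/s = 2 x MHz."""
    return clock_mhz * transfers_per_clock


def peak_bandwidth_gbs(rate_mts: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s (decimal) for a given bus width."""
    bytes_per_transfer = bus_width_bits / 8
    return rate_mts * 1e6 * bytes_per_transfer / 1e9


# Z2 Mini G1a: 4000 MHz memory clock as reported by HWiNFO64.
mini_rate = effective_rate_mts(4000)  # 8000 MT/s
# ASSUMPTION: 256-bit LPDDR5X bus (not stated in the review).
mini_bw = peak_bandwidth_gbs(mini_rate, 256)

# Desktop: DDR5 kit running at 4200 MT/s.
# ASSUMPTION: dual-channel, 128-bit total bus width.
desktop_bw = peak_bandwidth_gbs(4200, 128)

print(f"Z2 Mini G1a: {mini_rate:.0f} MT/s, ~{mini_bw:.0f} GB/s theoretical peak")
print(f"Desktop:     4200 MT/s, ~{desktop_bw:.1f} GB/s theoretical peak")
```

Under these assumptions the Mini's theoretical ceiling is several times that of the desktop's downclocked kit, which is consistent with the direction of the bandwidth results reported later in the review.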

Gaming Performance Summary Chart & 3DMark

Although gaming is not the Z2 Mini G1a’s primary focus, we benchmarked five modern, CPU- and GPU-intensive titles alongside two synthetic 3DMark tests to illustrate its performance capabilities. All tests were run at 1920×1080, except for the Civilization VII AI benchmark, which was run at 3840×2160 to compare average turn times against the RTX 5090-equipped PC.

The HP Z2 Mini G1a is foremost a workstation, yet its RDNA 3.5 integrated graphics stand among the most capable on the market. It can handle modern games at 1080p, and even 1440p with upscaling, though ultra ray tracing remains out of reach. Remarkably, it trailed the RTX 5090 only slightly in the Civilization VII AI benchmark.

3DMark

3DMark’s CPU Profile test scales from single-threaded to maximum-thread workloads to evaluate efficiency and throughput. We also compared Fire Strike Extreme’s physics test at 4K to assess real-world CPU performance.

As expected, the Ryzen 9 9950X3D outperformed the Z2 Mini G1a’s Ryzen AI Max+ PRO 395 in synthetic workloads, despite both featuring 16 cores and 32 threads.

Next, we turn to non-gaming synthetic benchmarks designed to evaluate memory bandwidth, storage performance, GPU compute, and CPU efficiency across both systems.

Synthetic Benchmark Analysis

AIDA64 Engineer v8.00.80000

AIDA64 is an important industry tool for benchmarkers. Its memory bandwidth benchmarks (Memory Read, Memory Write, and Memory Copy) measure the maximum available memory data transfer bandwidth. The benchmark code is written in assembly language and is well optimized. We use the full Engineer version of AIDA64, courtesy of FinalWire. AIDA64 is free to try for 30 days.

The Memory Latency test in AIDA64 measures the delay between issuing a read command and the data arriving in the CPU’s integer registers. It also evaluates read/write/copy bandwidth and cache performance, providing a full view of system memory behavior.

Below are the AIDA64 Cache & Memory benchmark results for the Z2 Mini G1a.

The chart below compares Z2 Mini G1a results with the full-sized desktop, emphasizing CPU benchmarks sensitive to memory speed and latency.

The Z2 Mini G1a’s LPDDR5X-8000 ECC memory delivers exceptional bandwidth, significantly outperforming the desktop’s DDR5 kit running at 4200 MT/s. Although its latency is higher, the Ryzen AI Max+ PRO 395 consistently trades blows with the Ryzen 9 9950X3D, even leading in several tests.

Finally, here are the AIDA64 GPGPU results, highlighting compute performance across CPU and integrated GPU workloads.

Both the CPU and integrated GPU show strong compute throughput, reaffirming the Z2 Mini G1a’s efficiency despite its compact design.

CPU-Z Benchmark

The CPU-Z benchmark provides a quick synthetic test that measures both single-threaded and multi-threaded CPU performance, allowing direct comparison across processor generations.

Included here primarily for context, the Z2 Mini G1a performed surprisingly well, staying competitive against the higher-clocked Ryzen 9 9950X3D.

MaxxMem² v3.00.24.109

MaxxMem² is a lightweight, free memory benchmarking utility used to measure bandwidth and latency performance.

The results aligned closely with AIDA64’s findings, validating the Mini’s fast memory throughput and efficient cache design.

PassMark Software’s PerformanceTest

PassMark Software’s PerformanceTest tool evaluates system-wide performance through composite tests including Memory Mark, Graphics Mark, and Disk Mark.

The scores are relatively close except for graphics, where the RTX 5090 comes into play. Rather than relying on a composite score alone, we also compared SSD performance directly between our desktop and the Mini.

CrystalDiskMark

CrystalDiskMark measures sequential and random read/write throughput of installed drives, making it an industry-standard tool for SSD benchmarking.

The Kioxia XG8 SSDs demonstrate solid Gen 4×4 NVMe performance, though the 2 × 1 TB capacity may be limiting for heavy workloads. Users can configure HP’s 2 × 2 TB or 2 × 4 TB options, or upgrade later, though the Mini’s design requires more tools and care than a typical desktop SSD swap.

Novabench Summary

Novabench provides a rapid system benchmark that scores CPU, GPU, memory, and storage performance. Here is a summary:

The desktop leads in GPU score thanks to the RTX 5090, while CPU and memory scores remain surprisingly close between both systems.

Vovsoft RAM Benchmark

Vovsoft RAM Benchmark simulates real-world memory utilization by testing large, user-defined data blocks. Multiple block sizes were tested; an ‘x’ denotes configurations too large for the memory kit.

The desktop’s 192 GB DDR5 kit handled blocks up to 100 GB, while the Z2 Mini G1a remained competitive across all other tested sizes, demonstrating impressive stability and memory throughput.

7-Zip Compression Benchmark

The 7-Zip built-in benchmark is a synthetic test of LZMA and LZMA2 compression and decompression. It reports a rating in GIPS (giga, or billion, instructions per second), which is calculated from the measured speed.

Increasing the dictionary size revealed memory scalability limits: only the desktop’s 192 GB configuration could process the largest 768 MB test, and it also outpaced the Mini in compression throughput.

Y-cruncher Benchmark

Y-cruncher is a synthetic benchmark that tests the raw computational power of a system by calculating Pi, pushing the CPU and RAM to their limits. It is a free test that is fully multi-threaded. We chose the 20 billion digit test and kept the test running only in RAM, with no swapping to disk allowed.

The Z2 Mini G1a decisively outperformed the desktop in Y-cruncher, demonstrating the advantage of ultra-fast LPDDR5X-8000 memory in compute-intensive tasks.

With synthetic testing complete, we now move on to semi-real-world workloads that simulate practical performance across productivity and AI applications.

Semi-Real World Benchmarks

Geekbench 6 CPU and Geekbench AI Benchmarks

Geekbench 6 is a well-established cross-platform benchmark that evaluates CPU performance by timing the completion of diverse, real-world workloads.

Both Geekbench and Geekbench AI straddle the line between synthetic and real-world testing. They simulate practical workloads such as image compression, machine learning inference, and object recognition.

Below is a summary comparison of both systems’ results in Geekbench’s CPU and AI benchmarks:

The desktop scored higher in general CPU performance but failed to complete certain AI WinML inference tests, highlighting the Z2 Mini G1a’s hardware-accelerated NPU advantage.

Blender 4.5.0 Benchmark

Blender 4.5.0 serves as a real-world benchmark by rendering multiple professional-grade 3D scenes. While rendering is just one part of the full workflow, it offers a practical performance snapshot for creators.

We ran the latest Blender 4.5.0 benchmark, focusing on CPU rendering performance. The benchmark outputs results in samples per minute, automatically rendering each scene multiple times. It is available as a free download.

Below is the performance summary:

Cinebench 2024 (MAXON Redshift Engine)

Cinebench 2024 is based on Cinema 4D, MAXON’s professional 3D content creation suite. The latest version uses real-world rendering tasks from the Redshift engine, although it runs the Maxon renderer only on the export/render phase of the workflow. It automatically provides multi-core and single-core CPU scores (as well as a GPU score), and is an excellent tool for quick comparisons.

Here’s the Cinebench 2024 summary chart comparing both systems.

V-Ray CPU Benchmark

The V-Ray benchmark leverages the same rendering engine used by professionals in visual effects and design to measure real-world render performance. We tested CPU-only mode to isolate raw computational throughput.

LuxMark Benchmark (v3.1 & v4.0 Alpha)

LuxMark is a GPU benchmark using the LuxCore renderer, an open-source ray-tracing renderer used for visual effects and architectural applications. Both versions we ran are relevant for evaluating workstation rendering capabilities, where accurate light simulation is crucial. LuxMark measures OpenCL performance, and it is based on Intel Embree. Although narrow in scope, since rendering is only one part of the workflow, it is useful for comparing workstation performance.

We used the older 3.1 version and the newest 4.0 (Alpha). Each version offers 3 separate tests, and we chose the most complex scenes to give our workstations a real workout.

Below are the LuxMark benchmark results for each platform.

The two CPUs deliver comparable compute results, but unsurprisingly, the RTX 5090’s GPU compute power far exceeds the integrated Radeon 8060S in LuxMark workloads.

Next, we’ll move on to broader hybrid benchmark suites that incorporate office, AI, and mixed workload testing.

PCMark 10 Professional Edition

PCMark 10 offers a range of real-world timed benchmarks, simulating everyday tasks such as web browsing, video conferencing, photo and video editing, and casual gaming. We used the Extended Suite in this review (courtesy of UL) for a comprehensive evaluation.

Our analysis focuses on overall system performance along with photo and video editing subtests. While UL’s Procyon Office Suite has largely superseded PCMark 10, the benchmark remains fully supported and relevant, especially for systems lacking Adobe Creative Cloud licenses.

Since PCMark 10’s Extended Suite relies on GPU acceleration for certain tasks, the desktop with the RTX 5090 achieved higher overall scores.

UL Procyon Benchmark Suite

UL’s Procyon benchmark suite blends real-world applications and simulated workloads to measure performance across different domains. It includes Office, AI Inference, and Image Generation benchmarks, offering a balanced view of productivity and AI capabilities.

Procyon Office Benchmark

The Procyon Office suite evaluates performance in Microsoft Word, Excel, PowerPoint, and Outlook. The Z2 Mini G1a delivered detailed, consistent results across these productivity tasks.

For fairness, we disabled the RTX 5090 and used the Ryzen 9 9950X3D’s integrated graphics (IG) for comparison. Below is the summary of results:

The Z2 Mini G1a performed well for all Office tasks, though the desktop retained a modest lead.

Procyon AI Computer Vision Benchmark

The Procyon AI Computer Vision Benchmark evaluates AI inference performance across models from five vendors. It includes:
• MobileNet V3 – lightweight image classification
• Inception V3 – deeper image recognition
• ResNet-50 – neural network training and classification
• DeepLab V3 – image segmentation via pixel clustering
• Real-ESRGAN – intensive upscaling and image enhancement

We first ran the Windows ML Float32 version of the benchmark on both systems to establish baseline CPU inference performance.

Performance results were mixed—each system excelled in different inference workloads. We then compared the Z2 Mini G1a’s Ryzen AI NPU (50 TOPS) against the RTX 5090’s Tensor Cores, representing distinct AI acceleration architectures.

The Z2 Mini G1a’s Ryzen AI NPU performed surprisingly well, achieving results competitive with discrete GPU inference despite its 50 TOPS limit. Notably, it exceeds the 40 TOPS NPU requirement for Microsoft’s Copilot+ PCs, confirming its future readiness.

UL Procyon AI Image Generation Benchmark

The Procyon AI Image Generation benchmark measures text-to-image inference performance using Stable Diffusion models. We tested both using FP16 workloads:
• Standard (512×512, batch size 1)
• Stable Diffusion XL (1024×1024, batch size 4)
While primarily optimized for GPUs, it can also execute on CPU-integrated graphics for comparison.

Unsurprisingly, the RTX 5090 completed these tasks roughly 10× faster thanks to its dedicated Tensor Cores, whereas the 9950X3D’s integrated graphics required over two hours to finish the same workload.

We next compared Stable Diffusion XL (FP16) performance using ONNX Runtime with MS Olive, AMD-Optimized ONNX, and NVIDIA TensorRT, as well as the 9950X3D’s integrated GPU.

Technically, the 9950X3D’s integrated graphics can complete the Stable Diffusion XL workload—but full inference would take over 24 hours, underscoring the importance of dedicated AI acceleration.

UL Procyon AI Text Generation Benchmark

The Procyon AI Text Generation benchmark evaluates performance across several large language models (LLMs). Since it primarily leverages GPU inference, the RTX 5090-equipped PC unsurprisingly held a clear advantage.

The summary results are presented below:

With synthetic and hybrid workloads complete, we now transition to fully real-world application benchmarks, beginning with PugetBench for DaVinci Resolve.

PugetBench for DaVinci Resolve

Unlike many benchmarks that isolate final rendering, PugetBench for DaVinci Resolve evaluates the entire creative workflow, from timeline playback to rendering and export. It runs directly atop DaVinci Resolve, pulling data from real in-app performance. We used the free version of Blackmagic Design’s DaVinci Resolve for testing.

While not matching the RTX 5090-equipped desktop in raw performance, the Z2 Mini G1a performed exceptionally well, delivering smooth, stable results for most creative workloads.

SPECworkstation 4.0 Benchmark Suite

All of the SPECworkstation 4.0 benchmarks are based on professional applications, most of which are in the CAD/CAM or media and entertainment fields. All of these benchmarks are free except for vendors of computer-related products and/or services.

The most comprehensive SPECworkstation benchmark is the newly revamped 4.0 version. It’s a free-standing benchmark that does not require ancillary software. It measures GPU, CPU, storage, and all other major aspects of workstation performance based on actual applications and representative workloads. It features a new category of tests focusing on AI and ML workloads and has updated many others over the older version.

Below is our SPECworkstation 4.0 performance summary:

In all but three subtests, the Z2 Mini G1a outperformed the official SPEC reference workstation, affirming its true workstation-class capability. It also came remarkably close to the desktop’s scores — and even outperformed it in categories such as Hidden Line Removal, LAMMPS, OpenFOAM, Rodinia Life Sciences, SRMP, and Poisson.

SPECviewperf 15 Benchmark Suite

SPECviewperf 15, a major revision of the 2020 edition, benchmarks professional visualization workloads that leverage both system RAM and GPU resources. It simulates large data manipulation and rendering typical of engineering, architecture, and media workflows.

The RTX 5090-equipped desktop was clearly superior in GPU-heavy visualization tasks, falling behind only in the medical-04 test. However, the Z2 Mini G1a proved exceptionally capable for daily professional workloads, including CAD, BIM, and general visualization. Crucially, it delivers 100% accuracy and ISV-certified rendering—an area where the RTX 5090 falls short. In mission-critical fields like medical imaging, even a single visual artifact is unacceptable.

While the Radeon 8060S excels at handling large memory-bound models, it cannot match discrete GPUs in raw rendering throughput. In heavy 3D projects, viewport navigation may slow. However, professionals can connect an external Radeon Pro or other workstation GPU via Thunderbolt for expanded performance.

Beyond conventional workstation tasks, the Z2 Mini’s integrated GPU can access up to 96 GB of shared system memory, enabling it to run very large AI models—significantly larger than what the RTX 5090 can fit in its 32 GB of VRAM. AMD claims that the Ryzen AI Max PRO architecture can handle LLMs up to 128 billion parameters, effectively offering ChatGPT 3.0-scale performance locally, without the cloud!

With that in mind, we moved on to testing Large Language Model performance using LM Studio.

LM Studio (Local AI Inference Testing)

LM Studio is a powerful, free desktop application that enables users to download, run, and experiment with large language models (LLMs) locally. It’s ideal for those prioritizing data privacy and offline inference, unlike cloud-based platforms such as ChatGPT, which require subscriptions and transmit user prompts to remote servers. Once a model is downloaded, LM Studio can operate entirely offline, though search and API-dependent features are disabled.

We tested multiple open-source large language models over several days, focusing on chat performance and creative writing capabilities. The OpenAI GPT-OSS 120B and Meta Llama 70B models stood out for their natural language fluency, context retention, and coherent generation. However, the Llama 70B model’s monolithic structure demands far more memory and compute than either of our systems could comfortably provide, resulting in extremely slow responses.

In contrast, the GPT-OSS 120B model is highly optimized for efficiency. It uses MXFP4 quantization to reduce its Mixture-of-Experts (MoE) weights to 4.25 bits per parameter, enabling excellent inference performance even on lower-end hardware. The model’s Int4 deployment runs smoothly on the Ryzen AI Max+ PRO 395, with minimal quality loss, according to AMD and OpenAI.

However, when loading the 63 GB GPT-OSS 120B model on our RTX 5090 desktop, LM Studio displayed a “Partial GPU Offload Possible” warning — indicating memory constraints for full VRAM allocation.

The model is too large to fit into the RTX 5090’s 32 GB of VRAM, and good performance requires more than 96 GB of system RAM, which our 192 GB kit provides. Without enough memory, this large model compromises performance because system RAM ends up being used like a “swap file”. When the model is too big for the GPU’s VRAM, LM Studio offloads part of it to system RAM, and the constant data transfers between the two create a performance bottleneck.
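To make the VRAM shortfall concrete, here is a minimal sketch of the weight-size arithmetic. It assumes a nominal 120 billion parameters (taken from the model name) all stored at the 4.25 bits per parameter quoted above; the function names are ours, and the estimate ignores the KV cache, runtime buffers, and any layers kept at higher precision.

```python
def weight_footprint_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate size of the model weights in GB (decimal)."""
    return params_billions * 1e9 * bits_per_param / 8 / 1e9


def offloaded_fraction(model_gb: float, vram_gb: float) -> float:
    """Share of the weights that must live outside VRAM (0.0 if everything fits)."""
    return max(0.0, (model_gb - vram_gb) / model_gb)


# ASSUMPTION: nominal 120B parameters, uniformly quantized to 4.25 bits each.
model_gb = weight_footprint_gb(120, 4.25)

rtx_5090_vram_gb = 32  # from the review
spill = offloaded_fraction(model_gb, rtx_5090_vram_gb)

print(f"Estimated weights: ~{model_gb:.1f} GB")
print(f"Share held in system RAM on a 32 GB card: ~{spill:.0%}")
```

The estimate lands close to the 63 GB file size LM Studio reports, and it suggests that roughly half of the weights would have to sit in system RAM on a 32 GB card, which is why partial offload becomes the bottleneck.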

This is not an issue with the Z2 Mini G1a, since its 128 GB of memory is shared between the CPU and the GPU. It is not unified memory, because Windows treats CPU and GPU memory as separate banks; however, the BIOS allows GPU-dedicated memory to be configured from 512 MB up to 96 GB in steps. The problem with dedicating 96 GB to the GPU is that LM Studio needs to load the OpenAI 120B model fully into system RAM before copying it to VRAM, and the remaining 32 GB of system RAM is insufficient.

We tried dedicating 64GB of memory, and the model did load, but its performance was no different from using 32GB of dedicated GPU memory and allowing the driver to allocate memory to the GPU on the fly.
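The trade-off between GPU-dedicated memory and the system RAM left over can be sketched as simple bookkeeping. This is a simplified model, not a description of how LM Studio or the driver actually manages memory: it assumes the whole 63 GB model must fit in the remaining system RAM before it is copied to the GPU, and it ignores memory used by Windows and other software. The function names are hypothetical.

```python
TOTAL_MEMORY_GB = 128  # from the review
MODEL_GB = 63          # GPT-OSS 120B size quoted in the review


def system_ram_left(dedicated_gpu_gb: int, total_gb: int = TOTAL_MEMORY_GB) -> int:
    """Memory remaining for the OS and applications after the BIOS carve-out."""
    return total_gb - dedicated_gpu_gb


def can_stage_model(dedicated_gpu_gb: int, model_gb: int = MODEL_GB) -> bool:
    """ASSUMPTION: the whole model is staged in system RAM before the GPU copy.

    OS and application overhead are ignored, so this check is optimistic.
    """
    return system_ram_left(dedicated_gpu_gb) >= model_gb


for dedicated in (32, 64, 96):
    left = system_ram_left(dedicated)
    verdict = "can stage" if can_stage_model(dedicated) else "cannot stage"
    print(f"{dedicated} GB dedicated -> {left} GB system RAM: {verdict} the model")
```

Even this crude check reproduces what we saw: the model loads with 32 GB or 64 GB dedicated to the GPU, but not with 96 GB.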

Another issue is that LM Studio loads its models with a default context window of 4,096 tokens, which limits how much of the conversation it can use while generating responses. We ran out of context after about three to five prompts in conversations with the AI. GPT-OSS 120B actually supports a massive 131,072-token context window. We were able to use a 128,000-token window by selecting “manually load advanced settings” and “show advanced settings”, and turning on “Flash Attention”, as shown below.
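For a sense of scale, the sketch below estimates how many prompt-and-response exchanges fit in each context window. The figure of about 1,000 tokens per exchange is our assumption, chosen only because it roughly matches running out of context after three to five prompts; real exchanges vary widely, especially with high-effort reasoning.

```python
def exchanges_before_overflow(context_tokens: int, tokens_per_exchange: int) -> int:
    """Whole prompt-plus-response exchanges that fit in the context window."""
    return context_tokens // tokens_per_exchange


DEFAULT_CONTEXT = 4_096  # LM Studio default noted in the review
MAX_CONTEXT = 131_072    # GPT-OSS 120B maximum noted in the review

# ASSUMPTION: ~1,000 tokens per exchange (prompt, reasoning, and answer).
TOKENS_PER_EXCHANGE = 1_000

print(exchanges_before_overflow(DEFAULT_CONTEXT, TOKENS_PER_EXCHANGE))  # 4
print(exchanges_before_overflow(MAX_CONTEXT, TOKENS_PER_EXCHANGE))      # 131
print(MAX_CONTEXT // DEFAULT_CONTEXT)                                   # 32
```

The full window is 32 times the default, which is the difference between a handful of exchanges and a conversation that can run for hours.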

Afterward, the AI responded just like using ChatGPT 3.0 or even perhaps 3.5, except all of the prompts and conversations are completely private. In-depth discussions with the AI continued for hours without running out of context or having to pay for tokens used.

GPT-OSS 120B is an outstanding, fully customizable transformer model offering excellent reasoning for research, tool usage, problem-solving, and agentic workflows. It’s a top industry model for local inference, which avoids the cloud to ensure data privacy and to keep costs below those of using GPT-5. Even complex production workloads are supported.

Here are our results comparing three large language models using the high effort preset for deeper multi-step reasoning to produce more complex answers. Simpler prompts should use low-effort reasoning for less computation with faster responses.

OpenAI GPT-OSS 20B, which activates 3.6B parameters per token, gives less complex answers, responds far more quickly, and is at times somewhat inaccurate compared with the 120B model, which activates 5.1B. For our purposes, the 120B model is better for information gathering and research. Although the RTX 5090 is incredibly fast with the 20B model, the roughly 20 tokens per second produced by the Z2 Mini G1a is still faster than the average human can read.

When moving to the 120B model, the Z2 Mini G1a was faster than the RTX 5090 desktop. We still consider 10 tokens per second acceptable, especially for advanced math formulas with complex toy universe-building responses generated by our third prompt. But both PCs were very slow generating responses using the Meta Llama 3.3 70B model, as 1-2 tokens per second is just painful.
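The reading-speed comparison can be checked with a quick conversion from tokens per second to words per minute. Both constants below are assumptions rather than measurements from this review: roughly 0.75 English words per token is a common rule of thumb, and 250 words per minute is a typical adult silent-reading speed. With high-effort reasoning, some generated tokens are never shown to the reader, so visible output is slower than these figures imply.

```python
WORDS_PER_TOKEN = 0.75   # ASSUMPTION: common rule of thumb for English text
READING_SPEED_WPM = 250  # ASSUMPTION: typical adult silent-reading speed


def words_per_minute(tokens_per_second: float) -> float:
    """Convert generation speed to an approximate words-per-minute figure."""
    return tokens_per_second * WORDS_PER_TOKEN * 60


def keeps_up_with_reader(tokens_per_second: float) -> bool:
    """True if text is generated at least as fast as the assumed reading speed."""
    return words_per_minute(tokens_per_second) >= READING_SPEED_WPM


# Speeds mentioned in the review: ~20 tok/s (20B model on the Mini),
# ~10 tok/s (120B model), and 1-2 tok/s (Llama 3.3 70B).
for tps in (20, 10, 2, 1):
    wpm = words_per_minute(tps)
    print(f"{tps} tok/s ~ {wpm:.0f} wpm, keeps up with reader: {keeps_up_with_reader(tps)}")
```

Under these assumptions, both 20 and 10 tokens per second outpace a typical reader, while 1 to 2 tokens per second falls well short.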

MLPerf Client 1.0 (LLM and AI Inference Testing)

The LLM tests in MLPerf Client v1.0 use several different language models and are also considered a real-world benchmark. The benchmark measures and compares performance in code analysis, content generation, creative writing, and light and moderate summarization. As with LM Studio, you can use the command line or the new GUI.

Below are the MLPerf Client 1.0 benchmark results:

In all the benchmarks we ran, the desktop with the RTX 5090 was significantly faster than the Z2 Mini G1a, yet the small workstation provided a very good tokens per second output well above 20 and often above 40.

So, how well did the HP Z2 Mini G1a do overall? Let’s head to our analysis and conclusion.

Final thoughts

First and foremost, the HP Z2 Mini G1a is a true workstation for professionals whose livelihood depends on 100% accurate results. It offers (1) ISV certification, (2) ECC memory on a workstation motherboard that supports it, and (3) the kind of support that HP provides to its professional and business customers for mission-critical applications.

The Z2 Mini G1a holds ISV certifications for the following professional applications:

Z2 Mini G1a ISV certifications
Adobe After Effects | Adobe Illustrator | Adobe Photoshop
Adobe Premiere Pro | Adobe Substance 3D Modeler | Adobe Substance 3D Painter
Adobe Substance 3D Designer | Adobe Substance 3D Sampler | Adobe Substance 3D Stager
Ansys CFX | Ansys Discovery | Ansys Electromagnetics Suite
Ansys Mechanical | Autodesk AutoCAD | Autodesk Fusion
Autodesk Inventor | Autodesk Revit | Avid Media Composer
Bentley iTwin Capture Modeler | Bentley LumenRT | Bentley MicroStation
Blackmagic Design DaVinci Resolve | Blackmagic Design DaVinci Resolve Studio | Dassault Systemes eDrawings
Dassault Systemes ICEM Surf | Dassault Systemes SOLIDWORKS | Dassault Systemes SOLIDWORKS Visualize
Epic Games Unreal Engine | Maxon Cinema 4D | Maxon ZBrush
PTC Creo | Siemens Digital Industries Software NX | Siemens Digital Industries Software Solid Edge
Siemens Digital Industries Software Teamcenter Visualization | Siemens Digital Industries Software Tecnomatix Process Simulate | Vectorworks

In contrast, ours is a pseudo-workstation: an ultra-high-end “Creator” PC that is ideally suited for elite gamers, game developers, media and entertainment professionals, and creative professionals who do 3D rendering, video editing, graphic design, and AI/ML work but who don’t require 100% accuracy. It’s often faster than the HP Z2 Mini G1a, and many freelance 3D artists buy regular gaming cards because the render performance is often better for less money than professional cards with equivalent specifications.

Another key difference is support. If a self-built creator PC fails, the user is responsible for troubleshooting and downtime. HP workstations include a three-year warranty and enterprise support, which is critical for businesses where even a single day of downtime can exceed the price premium of a certified workstation.

The HP Z2 Mini G1a’s $5,000 list price is very high, but fortunately an authorized HP dealer, B&H Photo, has it in stock at the time of writing for under $3,350, and you can sometimes find it for less on HP’s own site. That’s about the same price as an AIB RTX 5090!

We were thoroughly impressed with the HP Z2 Mini G1a, which exceeded our high expectations. As we have noted, there is more to this system than its specifications indicate. AMD’s Ryzen PRO platform brings full-system memory encryption and ensures a secure, isolated environment for VMs. On top of this, HP adds Premium+ enterprise support and its own robust security stack.

Compared to our massive, power-hungry pseudo-workstation, the Z2 Mini G1a represents a more efficient and reliable alternative. While the desktop creator PC may execute certain workloads faster, it lacks the guaranteed accuracy, certifications, and support that professional users depend on. For businesses and professionals, the Z2 Mini G1a simply makes sense.

If you are primarily an elite gamer who maxes out every setting of AAA games at 4K, and you may also make a living as a 3D artist or specialize in creative tasks, then the Ryzen 9 9950X3D/RTX 5090/192 GB DDR5 combination is right for your deep pockets. It can also run very large LLMs and other AI workflows locally, although even the mighty RTX 5090 runs out of VRAM and must depend on the much slower system RAM. This is where the Ryzen AI Max platform featured in the Z2 Mini G1a excels, filling a specific niche in a very compact, power-sipping package that was once exclusively the domain of large workstations with datacenter GPUs.

Pros & Cons

Pros

  • Professional look and solid build quality
  • Excellent CPU performance
  • Powerful GPU with shared memory pool
  • Flexible I/O configuration options
  • Robust enterprise-grade security
  • Handles very large LLMs efficiently
  • ISV-certified for 37 professional applications
  • Three-year warranty and premium HP support

Cons

  • Noticeable fan noise under heavy load
  • High MSRP compared to consumer desktops

We are proud to present the Highly Recommended BTR Award to the HP Z2 Mini G1a, in recognition of its outstanding performance, compact design, and workstation-level reliability!