Product · 2024-04-03 · 6 min read

Complete Guide to VDI Performance Testing | Citrix, AVD, Omnissa Horizon (formerly VMware)

Why VDI testing matters, platform-specific load profiles, scenario design, live cockpit metrics. Citrix, AVD, Omnissa Horizon (formerly VMware), RDS.

LoadGen Engineering

Product Strategy

VDI workloads behave nothing like web traffic. A logon storm is not a request spike, an HDX channel is not an HTTP connection, and a slow-rendering Office add-in does not register on a load balancer's RPS chart. Generic load-testing tools — k6, JMeter, Locust — measure the wrong layer.

This guide walks through how to do VDI performance testing properly: why it matters, which load profiles fit which platform, how to design scenarios that mirror real users, and the five mistakes that quietly invalidate most VDI test runs.

Why VDI testing matters

Performance issues in EUC environments rarely look like outages. They look like login latency creeping up, HDX channels saturating during graphics-heavy work, profile-load times climbing on AVD as user counts rise, and broker queues backing up at peak. By the time helpdesk tickets surface, the platform team is reverse-engineering a problem that was discoverable weeks earlier — if anyone had measured.

Four reasons to invest in real VDI testing:

  • Capacity planning — Know how many users your farm or host pool can support before you onboard them. Right-size session-host SKUs, slot counts, and broker tiers on measurement, not on a vendor's reference architecture.
  • Change validation — Prove that an OS patch, a GPU driver bump, a Citrix CU, or a broker update doesn't degrade UX. Re-run the same scenario before and after; the comparison view makes drift visible.
  • Migration confidence — Baseline Citrix today, run the same scenario shape on AVD or Horizon next week, compare side-by-side. The cutover decision becomes data-backed.
  • SLA validation — Capture P90 / P95 response times and error rates on a schedule. The contract obligation becomes an automated, signed evidence trail rather than a quarterly fire drill. (A minimal percentile sketch follows this list.)
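
To make the P90 / P95 language concrete, here is a minimal sketch of the nearest-rank percentile math behind those SLA numbers. The sample values and the 5-second threshold are illustrative assumptions, not LoadGen output or defaults.

```typescript
// Nearest-rank percentile for SLA checks. Sample data and the
// 5-second threshold are illustrative assumptions, not LoadGen's
// internal implementation.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1; // 0-indexed nearest rank
  return sorted[Math.max(0, rank)];
}

// Example: logon times in milliseconds from one scheduled run.
const logonMs = [3200, 2900, 4100, 3750, 5100, 2800, 3300, 4800];
const p90 = percentile(logonMs, 90);
const p95 = percentile(logonMs, 95);

// Hypothetical SLA: P95 logon under 5 seconds.
console.log(`P90=${p90}ms P95=${p95}ms SLA ${p95 <= 5000 ? "PASS" : "FAIL"}`);
```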

Platform-specific load profiles

LoadGen ships dedicated wizards per platform. Each one understands the protocol, the connection topology, and the agent strategy that matches the target stack — instead of forcing a generic template onto a workload it can't represent.

| Platform | Wizard route | Agent type | What's configured |
|----------|--------------|------------|-------------------|
| Citrix | /config/load-profiles/new/citrix | Full + VDI | StoreFront, Gateway, PNAgent, External Login auth · ICA or HDX display · session config · Activate / Reset / Kill batch ops |
| AVD (WVD) | /config/load-profiles/new/wvd | VDI (port 4841) | Subscription, Resource Group, Host Pool · ARM-native discovery |
| RDS | /config/load-profiles/new/rdp | Full (port 4840) | RDP connection · RD Gateway · multi-session host config |
| Generic / Horizon / Web / FAT | /config/load-profiles/new/generic | Core / Full / VDI as appropriate | Custom workloads · Connection Server · Playwright · .lgs scripts |

Every wizard ships the same six-step shape: connection → users → agents → display → scenario → advanced. The phase model — Idle, RampUp, Sustain, RampDown — is shared across all platforms, so the same scenario shape runs on Citrix and on AVD without re-authoring.

That shared shape is the foundation of comparison. If your Citrix run and your AVD run use the same workload, the same think time, and the same ramp curve, the comparison view tells you something honest.
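
As a sketch of what that shared shape means in practice, here is an illustrative, platform-agnostic scenario definition. The type and field names are assumptions for illustration, not LoadGen's actual .lgs schema; the point is that one four-phase curve can drive both a Citrix run and an AVD run.

```typescript
// Illustrative scenario shape. Field names are hypothetical, not the
// actual .lgs schema; the same four-phase curve drives every platform.
type Phase = "Idle" | "RampUp" | "Sustain" | "RampDown";

interface PhaseStep {
  phase: Phase;
  durationMinutes: number;
  targetVUsers: number; // concurrency reached by the end of the phase
}

const morningShape: PhaseStep[] = [
  { phase: "Idle", durationMinutes: 5, targetVUsers: 0 },
  { phase: "RampUp", durationMinutes: 60, targetVUsers: 800 },   // morning logon window
  { phase: "Sustain", durationMinutes: 240, targetVUsers: 800 }, // working hours
  { phase: "RampDown", durationMinutes: 30, targetVUsers: 0 },   // afternoon logoff
];
```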

Scenario design that mirrors real users

The single biggest mistake in VDI load testing is authoring a scenario that doesn't look like the production workload. Symptoms: a "stress" run that the farm passes effortlessly, followed by a Monday-morning logon storm in production that brings StoreFront to its knees. The test was wrong.

Four practices keep scenarios honest:

  • Workload selection — Match user personas. An Office worker is not a CAD operator is not a knowledge worker with 14 browser tabs. LoadGen's .lgs workloads at /config/workloads capture per-persona action sequences and timing — version-controlled, reusable, and shared across scenarios.
  • Phase timing — RampUp simulates how users actually log in over the morning window; Sustain holds the load through the working hours; RampDown logs them off in the afternoon. Avoid instant-spike templates unless you specifically want to test logon-storm survival.
  • Think time — Real users don't click every second. They read, type, scroll, and switch context. Add randomized delays between actions (see the sketch after this list); results without realistic think time overstate concurrency limits by a wide margin.
  • Concurrency vs. throughput — More vUsers does not equal a better test. Match your target peak concurrency. A 1,500-vUser test on infrastructure sized for 800 concurrent sessions tells you nothing — it's noise.
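
As promised above, a minimal think-time sketch: randomized pauses between scripted actions. The uniform distribution and the 4-12 second window are assumptions for an Office-worker persona, not LoadGen defaults.

```typescript
// Uniformly random think time between actions. The 4-12 second window
// is an assumption for an Office-worker persona, not a LoadGen default.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

function thinkTimeMs(minMs = 4_000, maxMs = 12_000): number {
  return minMs + Math.random() * (maxMs - minMs);
}

async function runPersonaActions(actions: Array<() => Promise<void>>) {
  for (const action of actions) {
    await action();
    await sleep(thinkTimeMs()); // read / type / context-switch pause
  }
}
```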

Live cockpit — watch tests in real time

A test run that only produces a report after it finishes is a test run that ends with the wrong answer too late. The live cockpit at /testing/active is built on SignalR, so KPIs update as agents stream data:

  • Per-agent state — logon, running, error
  • Active sessions and concurrency curve
  • Real-time response times and throughput
  • Error counts by phase
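
For teams that want to consume the same live data in their own tooling, here is a hedged sketch of subscribing to a SignalR KPI stream with the @microsoft/signalr client. The hub URL, method name, and payload shape are assumptions for illustration; the actual LoadGen hub contract may differ.

```typescript
import * as signalR from "@microsoft/signalr";

// Hypothetical KPI payload. Field names are assumptions, not
// LoadGen's actual hub contract.
interface KpiUpdate {
  agentId: string;
  state: "logon" | "running" | "error";
  activeSessions: number;
  responseTimeMs: number;
  phase: string;
}

const connection = new signalR.HubConnectionBuilder()
  .withUrl("https://loadgen.example.com/hubs/cockpit") // hypothetical hub URL
  .withAutomaticReconnect()
  .build();

// "KpiUpdate" is a hypothetical hub method name.
connection.on("KpiUpdate", (update: KpiUpdate) => {
  console.log(
    `${update.agentId} [${update.phase}] ${update.state}: ` +
      `${update.activeSessions} sessions, ${update.responseTimeMs}ms`,
  );
});

connection.start().catch(console.error);
```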

When the curve bends in a way you didn't expect, abort and investigate — the audit trail still captures the partial run. A 90-minute test that ends in a 5-minute investigation beats a 90-minute test that ends in a 5-day post-mortem.

Results comparison — overlay up to five runs

/testing/results is where VDI testing earns its keep. Filter by date range, profile, or status. Overlay up to five runs to see how P90, P95, and error rates moved across changes. Drill into the per-run detail modal:

  • Overview — top-line KPIs
  • Moments — per-phase events (logon spike, broker latency, etc.)
  • Measurements — per-action timing
  • Errors — failure categorization

A trend strip shows the same scenario across releases — the long view that catches gradual drift before it becomes a customer-visible problem.

Five mistakes that quietly kill VDI test validity

  1. Wrong agent type. Use VDI agents (/agents/vdi, port 4841) for session-based load on AVD, Horizon, and Citrix VDI pools. Use Full agents (/agents/full, port 4840) for RDS and FAT-client workloads. Core agents (/agents/core, port 4850) are for web and API loads, not VDI sessions. Putting a Core agent in a VDI test is the fastest way to produce numbers that look fine and mean nothing.
  2. Too few agents. For session-based load you need an agent per active vUser. Scale agents before you scale vUsers — otherwise the test bottleneck is your test infrastructure, not your VDI farm.
  3. Ignoring SUT monitoring. Counter templates and SUT machines (/configuration/systems-under-test) bind broker, StoreFront, FSLogix, vCenter, or RD Gateway counters to every run. Without them, you're testing in a vacuum — you see the symptom, not the cause.
  4. No baseline. Always capture a baseline before any change. Without one, "we got slower" is an opinion. With one, it's a measurement (see the sketch after this list).
  5. Generic profile for Citrix. Don't force the generic wizard onto a Citrix target. Use /config/load-profiles/new/citrix — connection model, display config, and VDI-agent strategy are all platform-specific.
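
To illustrate the baseline point, a minimal sketch of a regression gate that compares a candidate run's P95 against a stored baseline. The RunSummary shape and the 10% tolerance are assumptions for illustration, not a LoadGen API.

```typescript
// Hypothetical regression gate: compare a new run's P95 against a
// stored baseline. The 10% tolerance and the RunSummary shape are
// assumptions for illustration, not a LoadGen API.
interface RunSummary {
  label: string;
  p95LogonMs: number;
}

function checkAgainstBaseline(
  baseline: RunSummary,
  candidate: RunSummary,
  tolerance = 0.1,
): boolean {
  const limit = baseline.p95LogonMs * (1 + tolerance);
  const ok = candidate.p95LogonMs <= limit;
  console.log(
    `${candidate.label}: P95 ${candidate.p95LogonMs}ms vs baseline ` +
      `${baseline.p95LogonMs}ms (limit ${limit.toFixed(0)}ms) -> ${ok ? "PASS" : "FAIL"}`,
  );
  return ok;
}

// Example: pre-patch baseline vs post-patch candidate.
checkAgainstBaseline(
  { label: "baseline (pre-CU)", p95LogonMs: 4200 },
  { label: "candidate (post-CU)", p95LogonMs: 4900 },
);
```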

Routes reference

| Surface | Route |
|---------|-------|
| Run a test | /testing/run |
| Live cockpit | /testing/active |
| Results + 5-test overlay | /testing/results |
| Citrix load profile | /config/load-profiles/new/citrix |
| AVD load profile | /config/load-profiles/new/wvd |
| RDP load profile | /config/load-profiles/new/rdp |
| Generic / Horizon / Web / FAT | /config/load-profiles/new/generic |
| Workloads (.lgs library) | /config/workloads |
| Agents | /agents/core · /agents/full · /agents/vdi |
| SUT counter templates | /configuration/systems-under-test |

Closing thought

VDI testing isn't web load testing. It's a different protocol stack, a different concurrency model, and a different definition of "fast." The platforms that host your users — Citrix, AVD, Horizon, RDS — each have their own quirks, and a serious testing practice meets each one on its own terms.

Use the right wizard. Author scenarios that look like your real workload. Watch the live cockpit, not just the post-run report. Compare against a baseline every time. Avoid the five mistakes above, and the difference between a healthy VDI farm and a Monday-morning incident becomes a measurable, defensible engineering decision — not a hand-wave.

Ready to baseline your environment?

Run the wizard, hit the cockpit, watch the audit trail build itself.
