November 11, 2025 • NordVarg Team

Use std::variant + std::visit to avoid virtual dispatch in C++

When the set of types is known ahead of time, prefer std::variant and visitors to eliminate virtual calls and improve performance and ownership semantics.

Why this matters

Virtual functions are the canonical way to express polymorphism in C++. They are flexible, familiar, and work well when code must be extensible at runtime. But virtual dispatch has a measurable cost: an indirect call through a vtable, potential cache effects, and lost optimization opportunities for the compiler. When your type hierarchy is closed (you know all concrete types ahead of time), std::variant combined with std::visit is a low-overhead alternative. std::visit still selects a branch at run time from the variant's stored index, but every possible target is visible to the compiler, so the calls can often be inlined. You also get value semantics: clearer ownership and fewer surprises around move and copy.

This article explains the trade-offs, shows a concrete Shape example with both virtual-based and std::variant approaches, presents a visitor helper for ergonomic code, and describes a simple benchmark harness readers can reproduce.

TL;DR

  • Use virtual polymorphism when your hierarchy must be extended by external modules, or when dynamic code loading/plugins are required.
  • Use std::variant + std::visit when the set of concrete types is closed and performance/ownership clarity matter.
  • std::variant trades some code verbosity and a fixed type list for dispatch over a set of alternatives known at compile time, which often means no per-object heap allocation and better inlining.

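Contract (small)

  • Input: a heterogeneous set of shapes (Circle and Rect in the examples; Triangle or other shapes follow the same pattern)
  • Output: ability to compute area (a simulated draw would follow the same pattern) via either virtual dispatch or std::visit on a std::variant
  • Error modes: std::visit requires the visitor to handle every alternative, so a missing case is a compile error; std::get on the wrong alternative throws std::bad_variant_access at run time. With virtuals, null pointers and dangling objects remain a concern if ownership is handled incorrectly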
  • Success criteria: readable example, copy/move behavior described, and a benchmark harness skeleton included

When virtual calls matter (short)

The cost of a virtual call is context-dependent. In a tight loop where an indirect call sits on the hot path, the extra cycles per call add up. Beyond the raw call cost, polymorphic objects usually live behind pointers (heap allocation), which increases pointer chasing and cache misses. Storing values directly and dispatching over a known set of types often removes much of that overhead.

But virtuals are excellent when you need runtime extensibility, so don't remove them reflexively. Use them where their strengths are required.

C++ basics: std::variant and std::visit

std::variant<Ts...> is a type-safe union that holds exactly one of its alternatives at a time (apart from the rare valueless_by_exception state). std::visit calls a visitor with the currently active alternative. Overload resolution for each alternative happens at compile time, so the compiler can often inline and optimize the calls.

Minimal example:

cpp
#include <variant>
#include <iostream>

struct A { void run() const { std::cout << "A\n"; } };
struct B { void run() const { std::cout << "B\n"; } };

int main() {
    std::variant<A, B> v = A{};
    // The generic lambda is instantiated once per alternative.
    std::visit([](auto&& x){ x.run(); }, v);
}
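
Besides std::visit, a variant can be inspected directly. The sketch below is illustrative only: the types A and B, the alias AB, and the helper names describe and get_wrong_alternative_throws are made up for this example. It shows std::holds_alternative, std::get_if, and the exception thrown by std::get when the requested alternative is not active.

```cpp
#include <string>
#include <variant>

// Hypothetical alternatives for illustration.
struct A { int id; };
struct B { std::string name; };
using AB = std::variant<A, B>;

// Builds a short description using the non-visitor query functions.
std::string describe(AB const& v) {
    if (std::holds_alternative<A>(v))
        return "A#" + std::to_string(std::get<A>(v).id);
    // std::get_if returns nullptr when the requested alternative is not active.
    if (auto const* b = std::get_if<B>(&v))
        return "B:" + b->name;
    return "valueless";
}

// std::get on an inactive alternative throws std::bad_variant_access.
bool get_wrong_alternative_throws(AB const& v) {
    try {
        (void)std::get<B>(v);
        return false;
    } catch (std::bad_variant_access const&) {
        return true;
    }
}
```

For code that handles every alternative, std::visit is usually the better choice, because a forgotten alternative is then a compile error rather than a silent fall-through.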

Shape example: virtual hierarchy vs variant

We'll demonstrate two versions of the same idea: computing the total area of a heterogeneous collection of shapes.

Virtual-based design (familiar):

cpp
// virtual_shapes.cpp
#include <iostream>
#include <memory>
#include <vector>

struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

// Types with virtual functions are not aggregates, so each needs a constructor.
struct Circle : Shape {
    double r;
    explicit Circle(double r_) : r(r_) {}
    double area() const override { return 3.14159 * r * r; }
};
struct Rect : Shape {
    double w, h;
    Rect(double w_, double h_) : w(w_), h(h_) {}
    double area() const override { return w * h; }
};

int main() {
    std::vector<std::unique_ptr<Shape>> shapes;
    shapes.push_back(std::make_unique<Circle>(2.0));
    shapes.push_back(std::make_unique<Rect>(3.0, 4.0));
    double total = 0;
    for (auto const& s : shapes) total += s->area();  // indirect call through the vtable
    std::cout << "total area: " << total << "\n";
}

Notes:

  • Each object typically sits on the heap (owned by a unique_ptr), and the loop makes an indirect call through the vptr.
  • Polymorphic storage is natural (a heterogeneous container), but the cost includes allocation and pointer indirection.

Variant-based design (closed set):

cpp
// variant_shapes.cpp
#include <iostream>
#include <variant>
#include <vector>

// Plain value types: no base class, no virtual functions.
struct Circle { double r; double area() const { return 3.14159 * r * r; } };
struct Rect   { double w, h; double area() const { return w * h; } };

using Shape = std::variant<Circle, Rect>;

int main() {
    std::vector<Shape> shapes;
    shapes.emplace_back(Circle{2.0});
    shapes.emplace_back(Rect{3.0, 4.0});
    double total = 0;
    for (auto const& s : shapes)
        total += std::visit([](auto const& x){ return x.area(); }, s);
    std::cout << "total area: " << total << "\n";
}

Notes:

  • Shapes are stored directly in the vector (no heap allocations per element).
  • std::visit calls the overload that matches the active alternative. Because every alternative is visible at the call site, the compiler can often inline the per-type bodies.
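
Extending the closed set is a matter of adding one alternative. As a sketch, assuming a Triangle type with base and height members (not part of the example above) and a helper named total_area introduced here for illustration:

```cpp
#include <variant>
#include <vector>

struct Circle   { double r;            double area() const { return 3.14159 * r * r; } };
struct Rect     { double w, h;         double area() const { return w * h; } };
// Assumed third shape; only the variant's type list changes to admit it.
struct Triangle { double base, height; double area() const { return 0.5 * base * height; } };

using Shape = std::variant<Circle, Rect, Triangle>;

// Sums the areas; the generic lambda works for any alternative that has area().
double total_area(std::vector<Shape> const& shapes) {
    double total = 0.0;
    for (auto const& s : shapes)
        total += std::visit([](auto const& x) { return x.area(); }, s);
    return total;
}
```

The generic visitor needs no change here. A visitor written as one lambda per type would stop compiling until it handles Triangle, which is the compile-time check mentioned in the contract.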

Visitor ergonomics: overload helper

Writing visitors with many lambdas can be noisy. A common helper is the overload utility:

cpp
template<class... Ts> struct overload : Ts... { using Ts::operator()...; };
// Deduction guide (required in C++17).
template<class... Ts> overload(Ts...) -> overload<Ts...>;

Usage:

cpp
std::visit(overload{
    [](Circle const &c){ return c.area(); },
    [](Rect const &r){ return r.area(); }
}, s);

This is concise and extensible. If the variant gains an alternative that no lambda accepts, the std::visit call fails to compile.
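
When only some alternatives need special handling, a generic lambda can act as a fallback. The following is a minimal sketch; the Triangle type and the describe function are assumptions made for the example, not part of the code above.

```cpp
#include <string>
#include <variant>

template<class... Ts> struct overload : Ts... { using Ts::operator()...; };
template<class... Ts> overload(Ts...) -> overload<Ts...>;

struct Circle   { double r; };
struct Rect     { double w, h; };
struct Triangle { double base, height; };  // assumed extra alternative
using Shape = std::variant<Circle, Rect, Triangle>;

// Every lambda must return the same type for std::visit to compile.
std::string describe(Shape const& s) {
    return std::visit(overload{
        [](Circle const&) { return std::string("circle"); },
        [](Rect const&)   { return std::string("rect"); },
        // Generic fallback: overload resolution prefers the exact lambdas
        // above, so this runs only for alternatives without one (Triangle).
        [](auto const&)   { return std::string("other shape"); }
    }, s);
}
```

The trade-off is that a fallback gives up the exhaustiveness check: a newly added alternative lands in the generic branch without any compiler diagnostic.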

Memory layout and copying semantics

  • std::variant stores the active alternative in place (subject to alignment) and adds a small discriminator that records which alternative is active. The variant itself never allocates on the heap.
  • Copying a variant copies the active alternative. A std::unique_ptr to a polymorphic object is move-only, so deep-copying it needs a virtual clone() and a fresh allocation per object.
  • A variant is at least as large as its largest alternative, so choose alternatives carefully (e.g., avoid very large but rarely used members if size matters).
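
The copy behaviour described above can be sketched side by side. The names PolyShape, PolyCircle, and copy_shapes are hypothetical and exist only for this comparison.

```cpp
#include <memory>
#include <variant>
#include <vector>

// Value types for the variant-based design.
struct Circle { double r; };
struct Rect   { double w, h; };
using Shape = std::variant<Circle, Rect>;

static_assert(sizeof(Shape) >= sizeof(Rect), "a variant must fit its largest alternative");

// Copying a vector of variants copies each active alternative in place.
std::vector<Shape> copy_shapes(std::vector<Shape> const& in) { return in; }

// Polymorphic counterpart: unique_ptr is move-only, so a deep copy
// goes through a virtual clone().
struct PolyShape {
    virtual ~PolyShape() = default;
    virtual std::unique_ptr<PolyShape> clone() const = 0;
};
struct PolyCircle : PolyShape {
    double r;
    explicit PolyCircle(double r_) : r(r_) {}
    std::unique_ptr<PolyShape> clone() const override {
        return std::make_unique<PolyCircle>(*this);
    }
};

std::vector<std::unique_ptr<PolyShape>>
copy_shapes(std::vector<std::unique_ptr<PolyShape>> const& in) {
    std::vector<std::unique_ptr<PolyShape>> out;
    out.reserve(in.size());
    for (auto const& p : in) out.push_back(p->clone());  // one heap allocation per element
    return out;
}
```

Both overloads produce independent copies; the variant version does so with the compiler-generated copy constructor and a single vector allocation.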

Edge cases:

  • If one alternative is much larger than others, the variant's size is the size of the largest alternative plus some overhead. You can wrap large members in std::unique_ptr inside the variant alternative to keep variant size small at the cost of pointer indirection for that alternative.
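
A rough illustration of the size trade-off, using made-up types Small and Huge and a hypothetical accessor first_sample:

```cpp
#include <array>
#include <memory>
#include <type_traits>
#include <utility>
#include <variant>

struct Small { int x; };
struct Huge  { std::array<double, 512> samples; };  // roughly 4 KiB payload

using Inline = std::variant<Small, Huge>;                  // every element pays for Huge
using Boxed  = std::variant<Small, std::unique_ptr<Huge>>; // Huge lives on the heap

static_assert(sizeof(Inline) >= sizeof(Huge), "inline variant holds the large payload");
static_assert(sizeof(Boxed) < sizeof(Inline), "boxing keeps the variant small");

// Reads one value from either alternative. The boxed branch costs an extra
// pointer hop and must tolerate a null pointer (e.g. after a move).
double first_sample(Boxed const& v) {
    return std::visit([](auto const& alt) -> double {
        using T = std::decay_t<decltype(alt)>;
        if constexpr (std::is_same_v<T, Small>) return alt.x;
        else return alt ? alt->samples[0] : 0.0;
    }, v);
}
```

Note that Boxed becomes move-only because std::unique_ptr cannot be copied; if copies are needed, the boxed alternative needs its own copy logic.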

When variant is a bad fit

  • Open hierarchies: if new concrete types must be added at runtime or by third-party plugins, std::variant is a poor fit. Adding an alternative changes the variant type and requires recompiling every translation unit that mentions it.
  • ABI boundaries: if you need a stable ABI where concrete types may be added without recompiling consumers, virtual dispatch is the right choice.
  • Heterogeneous third-party ownership: if types come from external libraries with their own polymorphism, mixing them into a variant is not always straightforward.

Interop: mixing variant-based and virtual-based code

Sometimes the best design mixes both: a closed core that uses std::variant for hot paths and an interface layer (virtual) at boundaries. Example patterns:

  • Use variant internally for performance and expose a virtual wrapper at library boundary.
  • Implement an adapter that converts polymorphic external types into an internal variant via visitor or explicit mapping.
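
Both patterns can be sketched in a few lines. Everything named here (IShape, ShapeHandle, ExtShape, ExtCircle, ExtBox, to_internal) is hypothetical and stands in for whatever interface and external library a real project has.

```cpp
#include <memory>
#include <optional>
#include <utility>
#include <variant>

// Internal closed set used on the hot path.
struct Circle { double r;    double area() const { return 3.14159 * r * r; } };
struct Rect   { double w, h; double area() const { return w * h; } };
using Shape = std::variant<Circle, Rect>;

// Pattern 1: a virtual interface at the library boundary wrapping a variant.
struct IShape {
    virtual ~IShape() = default;
    virtual double area() const = 0;
};
class ShapeHandle final : public IShape {
public:
    explicit ShapeHandle(Shape s) : shape_(std::move(s)) {}
    // One virtual call at the boundary, variant dispatch inside.
    double area() const override {
        return std::visit([](auto const& x) { return x.area(); }, shape_);
    }
private:
    Shape shape_;
};

// Pattern 2: an adapter mapping external polymorphic types to the variant.
struct ExtShape  { virtual ~ExtShape() = default; };
struct ExtCircle : ExtShape { double radius = 0.0; };
struct ExtBox    : ExtShape { double width = 0.0, height = 0.0; };

std::optional<Shape> to_internal(ExtShape const& e) {
    if (auto const* c = dynamic_cast<ExtCircle const*>(&e)) return Shape{Circle{c->radius}};
    if (auto const* b = dynamic_cast<ExtBox const*>(&e))    return Shape{Rect{b->width, b->height}};
    return std::nullopt;  // unknown external type: the closed set cannot represent it
}
```

The adapter uses dynamic_cast as the explicit mapping; if the external library offers its own visitor or type tag, that is usually a cheaper way to do the conversion.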

Benchmarks: what to measure and a simple harness

Benchmark goals:

  • Raw dispatch cost (area computation in a tight loop)
  • End-to-end cost including allocation and cache effects (e.g., vector of variant vs vector of unique_ptr)
  • Effect of inlining and LTO/PGO on both implementations

A minimal harness: compile two binaries (virtual version and variant version) with the same inputs, run each in tight microbench loops, and measure cycles or ns per operation. Use std::chrono::steady_clock for simple measurements; for more precise results, use perf, likwid, or platform PMUs.

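Benchmark skeleton:

cpp
// bench.cpp: build twice, once with -DVARIANT_IMPL and once without (virtual version)
#include <chrono>
#include <iostream>
#include <memory>
#include <random>
#include <variant>
#include <vector>
#ifdef VARIANT_IMPL
struct Circle { double r; double area() const { return 3.14159 * r * r; } };
struct Rect   { double w, h; double area() const { return w * h; } };
using Item = std::variant<Circle, Rect>;
template <class T, class... A> Item make(A... a) { return T{a...}; }
double area_of(Item const& s) { return std::visit([](auto const& x) { return x.area(); }, s); }
#else  // VIRTUAL_IMPL
struct Shape { virtual ~Shape() = default; virtual double area() const = 0; };
struct Circle : Shape { double r; explicit Circle(double r_) : r(r_) {} double area() const override { return 3.14159 * r * r; } };
struct Rect : Shape { double w, h; Rect(double w_, double h_) : w(w_), h(h_) {} double area() const override { return w * h; } };
using Item = std::unique_ptr<Shape>;
template <class T, class... A> Item make(A... a) { return std::make_unique<T>(a...); }
double area_of(Item const& s) { return s->area(); }
#endif

int main() {
    const std::size_t N = 10'000'000;  // lower this if memory is tight
    std::mt19937 rng(42);              // fixed seed: both builds see the same shape sequence
    std::vector<Item> shapes;
    shapes.reserve(N);
    for (std::size_t i = 0; i < N; ++i)
        shapes.push_back((rng() & 1) ? make<Circle>(2.0) : make<Rect>(3.0, 4.0));
    auto start = std::chrono::steady_clock::now();
    double total = 0.0;
    for (auto const& s : shapes) total += area_of(s);  // timed hot loop
    auto end = std::chrono::steady_clock::now();
    // Printing total keeps the compiler from discarding the loop.
    std::cout << "total: " << total << ", time: "
              << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() << " ms\n";
}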

Try these compile flags for fair comparison:

  • -O3 -march=native -flto (optionally PGO)
  • Run each binary multiple times and take median of wall-clock or cycles.
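
The median can also be taken inside a single process. The helper below is a sketch, not part of the harness above; the name median_ns and the default of 11 runs are arbitrary choices for the example.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Runs fn() `runs` times and returns the median wall-clock time in nanoseconds.
template <class Fn>
long long median_ns(Fn&& fn, std::size_t runs = 11) {
    if (runs == 0) return 0;
    std::vector<long long> samples;
    samples.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        auto start = std::chrono::steady_clock::now();
        fn();
        auto end = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count());
    }
    // nth_element puts the median at the middle position without a full sort.
    auto mid = samples.begin() + static_cast<std::ptrdiff_t>(samples.size() / 2);
    std::nth_element(samples.begin(), mid, samples.end());
    return *mid;
}
```

A typical call wraps the hot loop in a lambda and writes its result to a variable the optimizer cannot discard (for instance a volatile double), so the measured work is not removed. The median is less sensitive than the mean to one-off stalls such as page faults on the first run.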

Interpretation notes:

  • If the variant version is faster, the likely causes are fewer allocations, contiguous storage, and better inlining.
  • If the virtual version is similar or better, check whether the compiler devirtualized the calls or whether your test has other bottlenecks.

Try it yourself (quick)

Files to create locally (suggested):

  • variant_shapes.cpp (example above)
  • virtual_shapes.cpp (example above)
  • bench_variant.cpp, bench_virtual.cpp (thin harness that runs a hot loop)

Build commands (example using g++):

bash
g++ -std=c++17 -O3 -march=native -flto -pipe variant_shapes.cpp -o variant_shapes
g++ -std=c++17 -O3 -march=native -flto -pipe virtual_shapes.cpp -o virtual_shapes
# Run each a few times and compare
./variant_shapes
./virtual_shapes

For more precise profiling, use perf stat -r 10 ./variant_shapes and the same for virtual_shapes.

Edge cases and pitfalls

  • Don't assume std::visit will always be faster: the actual performance depends heavily on context, object sizes, memory layout, and whether heap allocations dominate the cost.
  • Be careful with exception safety: a visitor that throws does not by itself change the variant, but an assignment or emplace that throws while switching alternatives can leave it valueless_by_exception, and std::visit on a valueless variant throws std::bad_variant_access.
  • Beware of value semantics: std::variant stores values, not pointers to a base class. If you previously relied on base-class pointers or references and their dynamic behavior, migrating to variant may require redesign.
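
The exception-safety pitfall can be reproduced with a deliberately hostile type. This is a sketch: Bomb, emplace_bomb, and value_or are hypothetical names, and the standard only says the variant may become valueless, so the code checks the state instead of assuming it.

```cpp
#include <stdexcept>
#include <variant>

// Hypothetical alternative whose constructor always throws.
struct Bomb {
    explicit Bomb(int) { throw std::runtime_error("boom"); }
    Bomb(Bomb const&) {}  // user-provided copy, so Bomb is not trivially copyable
};

using V = std::variant<int, Bomb>;

// Tries to switch v to the Bomb alternative; returns true if the attempt threw.
bool emplace_bomb(V& v) {
    try {
        v.emplace<Bomb>(1);  // the old value may already be destroyed when this throws
        return false;
    } catch (std::runtime_error const&) {
        return true;
    }
}

// Safe read: std::get_if returns nullptr for a valueless variant,
// whereas std::visit would throw std::bad_variant_access.
int value_or(V const& v, int fallback) {
    if (auto const* p = std::get_if<int>(&v)) return *p;
    return fallback;
}
```

After a failed emplace, check v.valueless_by_exception() before visiting, or assign a fresh value to bring the variant back to a usable state.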

Summary

std::variant + std::visit is a powerful, modern alternative to virtual dispatch when your type set is closed. It dispatches over a set of types known at compile time, often lowers allocation and pointer-chasing overhead, and gives the compiler more opportunity to inline and optimize. Use virtuals when you need runtime extensibility or plugin-style architectures. If performance matters, write small benchmarks for your own workload; they are quick to build and often reveal surprising wins or bottlenecks.

Appendix: visitor overload helper (copyable)

cpp
// overload helper
template<class... Ts>
struct overload : Ts... { using Ts::operator()...; };

// Deduction guide (required in C++17).
template<class... Ts>
overload(Ts...) -> overload<Ts...>;
