Use std::variant + std::visit to avoid virtual dispatch in C++
When the set of types is known ahead of time, prefer std::variant and visitors to eliminate virtual calls and improve performance and ownership semantics.
Virtual functions are the canonical way to express polymorphism in C++. They are flexible, familiar, and work well when code must be extensible at runtime. But virtual dispatch comes with measurable cost: an indirect call through a vtable, potential cache effects, and lost optimization opportunities for the compiler. When your type hierarchy is closed (you know all concrete variants ahead of time), std::variant combined with std::visit gives you a zero- or low-overhead alternative: static dispatch, clearer ownership, and often fewer surprises for move/copy semantics.
This article explains the trade-offs, shows a concrete Shape example with both virtual-based and std::variant approaches, presents a visitor helper for ergonomic code, and describes a simple benchmark harness readers can reproduce.
- Prefer std::variant + std::visit when the set of concrete types is closed and performance/ownership clarity matter.
- std::variant trades some code verbosity for static dispatch and sometimes lower memory overhead and better inlining.
- std::variant visit std::variant; runtime null/vtable corruption is still a concern for virtuals if used incorrectly.

The cost of a virtual call is context-dependent. In tight loops where a single indirect call is on the hot path, the extra cycles per call add up. Besides the raw call cost, virtual objects commonly live behind pointers (heap allocation), which increases pointer chasing and cache-miss penalties. If you can store values directly and dispatch statically, you often gain significant performance.
But virtuals are excellent when you need runtime extensibility, so don't remove them reflexively—use them where their strengths are required.
std::variant<Ts...> is a type-safe union that holds exactly one of the alternatives. std::visit dispatches to a visitor callable based on the active alternative. The compiler resolves the visitor overloads and can often inline and optimize the calls.
Minimal example:
#include <variant>
#include <iostream>

struct A { void run() const { std::cout << "A\n"; } };
struct B { void run() const { std::cout << "B\n"; } };

int main() {
    std::variant<A, B> v = A{};
    std::visit([](auto&& x){ x.run(); }, v);
}

We'll demonstrate two versions of the same idea: compute area for a heterogenous collection of shapes.
Virtual-based design (familiar):
// virtual_shapes.cpp
#include <iostream>
#include <memory>
#include <vector>

struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

// Classes with virtual functions are not aggregates, so each needs a constructor.
struct Circle : Shape {
    double r;
    explicit Circle(double r) : r(r) {}
    double area() const override { return 3.14159 * r * r; }
};
struct Rect : Shape {
    double w, h;
    Rect(double w, double h) : w(w), h(h) {}
    double area() const override { return w * h; }
};

int main() {
    std::vector<std::unique_ptr<Shape>> shapes;
    shapes.push_back(std::make_unique<Circle>(2.0));
    shapes.push_back(std::make_unique<Rect>(3.0, 4.0));
    double total = 0;
    for (auto const& s : shapes) total += s->area();  // one virtual call per element
    std::cout << total << '\n';  // use the result so the loop is not optimized away
}

Notes:
Variant-based design (closed set):
// variant_shapes.cpp
#include <iostream>
#include <variant>
#include <vector>

struct Circle { double r; double area() const { return 3.14159 * r * r; } };
struct Rect { double w, h; double area() const { return w * h; } };

using Shape = std::variant<Circle, Rect>;

int main() {
    std::vector<Shape> shapes;
    shapes.emplace_back(Circle{2.0});
    shapes.emplace_back(Rect{3.0, 4.0});
    double total = 0;
    for (auto const& s : shapes)
        total += std::visit([](auto const& x){ return x.area(); }, s);
    std::cout << total << '\n';  // use the result so the loop is not optimized away
}

Notes:
- std::visit dispatches to the correct overload; the compiler can inline the call when the variant alternatives are known.

Writing visitors with many lambdas can be noisy. A common helper is the overload utility:
template<class... Ts> struct overload : Ts... { using Ts::operator()...; };
template<class... Ts> overload(Ts...) -> overload<Ts...>;

Usage:

std::visit(overload{
    [](Circle const &c){ return c.area(); },
    [](Rect const &r){ return r.area(); }
}, s);

This is concise and easy to extend.
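The helper pays off most when each alternative needs different handling. The following is a minimal sketch, assuming the Circle and Rect value types from the variant example; `describe` and `perimeter` are hypothetical functions introduced here only for illustration.

```cpp
#include <string>
#include <variant>

struct Circle { double r; double area() const { return 3.14159 * r * r; } };
struct Rect { double w, h; double area() const { return w * h; } };
using Shape = std::variant<Circle, Rect>;

// The overload helper: inherits operator() from every lambda passed in.
template<class... Ts> struct overload : Ts... { using Ts::operator()...; };
template<class... Ts> overload(Ts...) -> overload<Ts...>;

// Hypothetical helper: one lambda per alternative, all returning std::string.
std::string describe(Shape const& s) {
    return std::visit(overload{
        [](Circle const&) { return std::string{"circle"}; },
        [](Rect const&)   { return std::string{"rect"}; }
    }, s);
}

// Hypothetical helper: per-type logic that a single generic lambda could not express.
double perimeter(Shape const& s) {
    return std::visit(overload{
        [](Circle const& c) { return 2.0 * 3.14159 * c.r; },
        [](Rect const& r)   { return 2.0 * (r.w + r.h); }
    }, s);
}
```

If a lambda for one of the alternatives is left out, the std::visit call fails to compile, so a forgotten case surfaces at build time rather than at run time.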
- std::variant stores the active alternative in-place (subject to alignment) and adds a small discriminator to indicate which alternative is active. There is no per-element heap allocation by default.
- Virtual-based designs typically hold a unique_ptr to heap-allocated polymorphic objects, which requires allocation logic if deep-copying.

Edge cases:
- For a large alternative, consider a std::unique_ptr inside the variant alternative to keep variant size small at the cost of pointer indirection for that alternative.
- Adding an alternative to a std::variant requires recompilation of all translation units that mention the variant.

Sometimes the best design mixes both: a closed core that uses std::variant for hot paths and an interface layer (virtual) at boundaries. Example patterns:
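One way to mix the two approaches, as a hedged sketch rather than a prescribed design: a virtual interface at the boundary, with a variant-based core behind it. The names `AreaSource` and `ShapeBatch` are hypothetical and do not come from any library.

```cpp
#include <utility>
#include <variant>
#include <vector>

struct Circle { double r; double area() const { return 3.14159 * r * r; } };
struct Rect { double w, h; double area() const { return w * h; } };
using Shape = std::variant<Circle, Rect>;

// Boundary interface (hypothetical): code outside the core sees only this.
struct AreaSource {
    virtual ~AreaSource() = default;
    virtual double total_area() const = 0;
};

// Closed core: one virtual call per batch, static dispatch per element.
class ShapeBatch final : public AreaSource {
public:
    void add(Shape s) { shapes_.push_back(std::move(s)); }

    double total_area() const override {
        double total = 0.0;
        for (auto const& s : shapes_)
            total += std::visit([](auto const& x) { return x.area(); }, s);
        return total;
    }

private:
    std::vector<Shape> shapes_;  // values stored in place, no per-element allocation
};
```

The virtual call cost is paid once per batch instead of once per shape, while callers can still swap in other AreaSource implementations at runtime.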
Benchmark goals:
A minimal harness: compile two binaries (virtual version and variant version) with the same inputs, run each in tight microbench loops, and measure cycles or ns per operation. Use std::chrono::steady_clock for simple measurements; for more precise results, use perf, likwid, or platform PMUs.
Benchmark skeleton (conceptual):
// bench.cpp (pseudo)
// Build two versions: with VIRTUAL_IMPL or VARIANT_IMPL defined

#include <chrono>
#include <cstddef>
#include <iostream>
#include <random>
#include <vector>

int main() {
    const std::size_t N = 10'000'000;
    // build vector of shapes (either variant or unique_ptr based)
    auto start = std::chrono::steady_clock::now();
    double total = 0.0;
    // compute_area(i) is a placeholder for the area call on the i-th shape
    for (std::size_t i = 0; i < N; ++i) total += compute_area(i);
    auto end = std::chrono::steady_clock::now();
    // print the total so the optimizer cannot discard the loop
    std::cout << "total: " << total << "\n";
    std::cout << "time: " << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() << " ms\n";
}

Try these compile flags for fair comparison:
Interpretation notes:
Files to create locally (suggested):
- variant_shapes.cpp (example above)
- virtual_shapes.cpp (example above)
- bench_variant.cpp, bench_virtual.cpp (thin harness that runs a hot loop)

Build commands (example using g++):

g++ -std=c++17 -O3 -march=native -flto -pipe variant_shapes.cpp -o variant_shapes
g++ -std=c++17 -O3 -march=native -flto -pipe virtual_shapes.cpp -o virtual_shapes
# Run each a few times and compare
./variant_shapes
./virtual_shapes

For more precise profiling, use perf stat -r 10 ./variant_shapes and the same for virtual_shapes.
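The thin harness in bench_variant.cpp could be built from pieces like the following. This is only a sketch under stated assumptions: `make_shapes`, `total_area`, and `time_ms` are hypothetical helper names, the 50/50 mix of circles and rectangles and the fixed seed are arbitrary choices, and a matching bench_virtual.cpp would need the same structure over std::unique_ptr<Shape>.

```cpp
#include <chrono>
#include <cstddef>
#include <random>
#include <variant>
#include <vector>

struct Circle { double r; double area() const { return 3.14159 * r * r; } };
struct Rect { double w, h; double area() const { return w * h; } };
using Shape = std::variant<Circle, Rect>;

// Build n shapes from a fixed seed so repeated runs see the same input.
std::vector<Shape> make_shapes(std::size_t n, unsigned seed = 42) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> dim(0.5, 2.0);
    std::bernoulli_distribution pick_circle(0.5);
    std::vector<Shape> shapes;
    shapes.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        if (pick_circle(rng)) shapes.emplace_back(Circle{dim(rng)});
        else shapes.emplace_back(Rect{dim(rng), dim(rng)});
    }
    return shapes;
}

// The hot loop under test: one std::visit per element.
double total_area(std::vector<Shape> const& shapes) {
    double total = 0.0;
    for (auto const& s : shapes)
        total += std::visit([](auto const& x) { return x.area(); }, s);
    return total;
}

// Time a single call of f in milliseconds with steady_clock.
template <class F>
double time_ms(F&& f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

A main function would call make_shapes once outside the timed region, wrap total_area in time_ms, and print both the elapsed time and the computed total so the optimizer cannot discard the loop.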
- Do not assume std::visit will always be faster: the actual performance depends heavily on context, object sizes, memory layout, and whether heap allocations dominate the cost.
- std::variant stores values; if you previously relied on pointers to polymorphic base classes and their dynamic behavior, migrating to variant may require redesign.

std::variant + std::visit is a powerful, modern alternative to runtime polymorphism when your type set is closed. It enables static dispatch, often lowers allocation and pointer-chasing overhead, and gives the compiler more opportunity to inline and optimize. Use virtuals when you need runtime extensibility or plugin-style architectures. If performance matters, implement small benchmarks for your workload: microbenchmarks are easy to write and often reveal surprising wins or bottlenecks.
// overload helper
template<class... Ts>
struct overload : Ts... { using Ts::operator()...; };

template<class... Ts>
overload(Ts...) -> overload<Ts...>;
NordVarg Team is a software engineer at NordVarg specializing in high-performance financial systems and type-safe programming.