
How to Reduce Heap Allocations by Stack-Allocating Slices in Go

Last updated: 2026-05-05 03:12:35 · Programming

Introduction

Heap allocations are a major source of performance overhead in Go programs. Each call to the memory allocator consumes CPU cycles, and allocated objects place additional strain on the garbage collector. In contrast, stack allocations are nearly free—they require no explicit deallocation and are automatically reclaimed when the function returns. This guide walks you through a practical technique: stack-allocating slices of known or bounded size to eliminate heap pressure and accelerate your hot code paths.

Source: blog.golang.org

What You’ll Need

  • A Go development environment (version 1.20 or later for best escape analysis)
  • Basic familiarity with Go slices, arrays, and memory management
  • A profiler (e.g., pprof) to measure heap allocations before and after

Step-by-Step Instructions

Step 1: Identify Heap‑Allocation Hot Spots

Run your program under the profiler to locate functions where slices are repeatedly grown via append. Look for patterns like:

var tasks []task
for t := range c {
    tasks = append(tasks, t)
}
processAll(tasks)

Each time the slice’s backing array fills, the runtime must allocate a new—and usually larger—array on the heap. This produces garbage and slows down the inner loop. In your profiler output, pay special attention to runtime.mallocgc and runtime.growslice call stacks.
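Before reaching for a full pprof session, testing.AllocsPerRun from the standard library gives a quick allocation count for a suspect function. A minimal sketch (the buildTasks helper is hypothetical, standing in for the append pattern above):

```go
package main

import (
	"fmt"
	"testing"
)

// buildTasks mimics the pattern above: growing a slice from nil
// with append, which reallocates the backing array as it fills.
func buildTasks(n int) []int {
	var tasks []int
	for i := 0; i < n; i++ {
		tasks = append(tasks, i)
	}
	return tasks
}

func main() {
	// Average heap allocations per call; each growth step is one
	// runtime.growslice -> runtime.mallocgc allocation.
	allocs := testing.AllocsPerRun(100, func() { _ = buildTasks(64) })
	fmt.Println(allocs > 0) // true: the append loop hits the heap repeatedly
}
```

Seeing a nonzero count here confirms the function is worth refactoring in the following steps.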

Step 2: Determine the Maximum Slice Size

Ask yourself: Is the maximum number of elements known at compile time? Even a loose upper bound is enough. For example, if you know you will never process more than 64 tasks, that bound allows stack allocation. If the bound depends on runtime input (e.g., len(users)), you may still be able to pre‑allocate capacity with make, but that alone does not move the allocation to the stack.

Step 3: Replace the Dynamic Slice with a Fixed‑Size Array

When the maximum size is a compile‑time constant, use a stack‑allocated array and then slice it:

func process(c chan task) {
    var tasks [64]task // fixed size known at compile time
    var n int
    for t := range c {
        if n == len(tasks) {
            // Handle overflow (log, return an error, etc.)
            break
        }
        tasks[n] = t
        n++
    }
    processAll(tasks[:n]) // slice the array; no heap allocation
}

Because [64]task has a size fixed at compile time, the compiler can keep it on the stack, provided escape analysis shows it does not outlive the function (see Step 6). The subsequent slice tasks[:n] points into that stack memory, so no heap allocation occurs, and the growing append loop is gone entirely.
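A variant of the same idea (a sketch, not from the original article) keeps append's ergonomics: slice the fixed array to zero length and append into it, so nothing is allocated until the 65th element arrives:

```go
package main

import "fmt"

type task struct{ id int }

// process drains c into storage backed by a fixed-size array.
// Appending into buf[:0] reuses the array until its capacity (64)
// is exhausted; only then would append fall back to the heap.
func process(c chan task) int {
	var buf [64]task
	tasks := buf[:0] // len 0, cap 64
	for t := range c {
		tasks = append(tasks, t)
	}
	// Returning tasks itself would make buf escape to the heap,
	// so only a derived value leaves the function.
	return len(tasks)
}

func main() {
	c := make(chan task, 3)
	for i := 1; i <= 3; i++ {
		c <- task{id: i}
	}
	close(c)
	fmt.Println(process(c)) // 3
}
```

This avoids manual index tracking while preserving the stack-allocated backing array, at the cost of silently spilling to the heap if the bound is exceeded.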

Step 4: Use make with Capacity for Bounded but Dynamic Sizes

If the maximum size is a runtime value (e.g., len(data)), pre‑allocate the backing array with make([]task, 0, maxSize). This avoids the incremental growth overhead, but note that make itself still allocates on the heap. To truly push it to the stack, the array must have a compile‑time constant size (see Step 3). However, pre‑allocation dramatically reduces the number of allocations and garbage generated:

func process(c chan task, max int) {
    tasks := make([]task, 0, max)
    for t := range c {
        if len(tasks) == cap(tasks) {
            break
        }
        tasks = append(tasks, t)
    }
    // ...
}
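The difference is easy to measure. In this sketch (the grown and preallocated helpers are illustrative), pre-sizing collapses the whole growth sequence into at most one up-front allocation:

```go
package main

import (
	"fmt"
	"testing"
)

// grown lets append resize the backing array repeatedly.
func grown(n int) []int {
	var s []int
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	return s
}

// preallocated reserves the full capacity once, up front.
func preallocated(n int) []int {
	s := make([]int, 0, n)
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	return s
}

func main() {
	g := testing.AllocsPerRun(100, func() { _ = grown(1000) })
	p := testing.AllocsPerRun(100, func() { _ = preallocated(1000) })
	// Many growth allocations versus at most one up-front allocation.
	fmt.Println(g > p) // true
}
```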

Step 5: Leverage Pooling for Repeated Slices

If you must use dynamic slices and cannot determine a maximum size, consider reusing backing arrays via sync.Pool. While not strictly stack allocation, this reduces heap churn. Combine with Steps 1–4 to minimise allocations in the most critical paths.
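As a sketch of that pattern (names are illustrative), a sync.Pool of pointers to byte slices lets each call reuse a previously allocated backing array instead of allocating a fresh one:

```go
package main

import (
	"fmt"
	"sync"
)

// bufPool hands out reusable backing arrays. Pooling *[]byte rather
// than []byte avoids an extra allocation when the slice header is
// boxed into an interface on Put.
var bufPool = sync.Pool{
	New: func() any {
		b := make([]byte, 0, 1024)
		return &b
	},
}

func handle(data []byte) int {
	bp := bufPool.Get().(*[]byte)
	buf := (*bp)[:0]           // reuse the array, reset the length
	buf = append(buf, data...) // no allocation while len(data) <= 1024
	n := len(buf)
	*bp = buf
	bufPool.Put(bp) // hand the array back for the next caller
	return n
}

func main() {
	fmt.Println(handle([]byte("hello, pool"))) // 11
}
```

Note that the pool may be emptied by the garbage collector at any time, so treat it as a cache, never as the owner of the data.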

Step 6: Verify with Escape Analysis and Profiling

After refactoring, confirm that your objects stay on the stack. Use the -gcflags='-m' flag:

go build -gcflags='-m -l' yourfile.go

Look for lines like moved to heap or escapes to heap. If your fixed‑size array or slice is reported as “does not escape”, it is stack‑allocated. Rerun the profiler and verify reduced mallocgc calls and lower GC pauses.
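Beyond reading the -m output, you can assert zero allocations programmatically. A sketch using testing.AllocsPerRun (the sum helper is hypothetical):

```go
package main

import (
	"fmt"
	"testing"
)

// sum writes into a fixed-size array and only reads it locally,
// so nothing escapes and the buffer stays on the stack.
func sum(n int) int {
	var buf [64]int
	for i := 0; i < n; i++ {
		buf[i] = i
	}
	total := 0
	for _, v := range buf[:n] {
		total += v
	}
	return total
}

func main() {
	allocs := testing.AllocsPerRun(100, func() { _ = sum(10) })
	fmt.Println(allocs) // 0: no heap traffic at all
}
```

Turning this into a regression test guards the optimisation against future refactors that accidentally cause an escape.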

Conclusion and Tips

  • Start small. Only rewrite the top 2–3 hot spots identified by profiling. Premature optimisation can hurt readability.
  • Respect stack limits. Very large arrays do not stay on the stack anyway: the compiler moves oversized locals to the heap, and deep call stacks are bounded by the runtime's maximum goroutine stack size (1 GB by default on 64-bit platforms, adjustable via runtime/debug.SetMaxStack). Keep stack-allocated arrays modest, from a few bytes up to a few thousand small elements.
  • Watch for sharing. If part of the array is taken as a slice and returned or stored beyond the function call, the entire array escapes to the heap. Be careful with slices passed to goroutines or stored in global variables.
  • Keep code simple for escape analysis. A plain var buf [64]byte declared inside a function is almost always stack-allocated. The Go escape analyzer is conservative, so straightforward code gives it the best chance of proving that nothing escapes.
  • Test for correctness. When you replace append with manual index tracking, confirm you handle the “full” case gracefully (overflow, error, or dynamic fallback).
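
The sharing pitfall above can be demonstrated directly. In this sketch, the global sink keeps the returned slice reachable, forcing the whole array onto the heap:

```go
package main

import (
	"fmt"
	"testing"
)

// local keeps its array private: it stays on the stack.
func local() int {
	var buf [64]int
	buf[0] = 1
	return buf[0]
}

// leaked returns a slice of its array, so the entire [64]int must
// escape to the heap to outlive the call.
func leaked() []int {
	var buf [64]int
	buf[0] = 1
	return buf[:1]
}

var sink []int // storing here keeps the result reachable

func main() {
	fmt.Println(testing.AllocsPerRun(100, func() { _ = local() })) // 0
	fmt.Println(testing.AllocsPerRun(100, func() { sink = leaked() })) // nonzero: the array escapes
}
```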

By applying these steps, you can convert expensive heap allocations into cheap stack allocations, making your Go programs faster and more cache‑friendly. Start with the most performance‑sensitive loops and work outward—the gains can be substantial.