NUMA programming resources and references
This page collects external resources for learning NUMA programming concepts, implementation techniques, and optimization strategies. They complement our internal NUMA guides with material from hardware vendors, kernel developers, and the broader systems programming community.
Official documentation
Intel resources
- Intel NUMA Guide - Comprehensive guide to using NUMA on Intel platforms, covering QuickAssist Technology integration and performance optimization techniques
Linux kernel documentation
- Linux NUMA API - Reference for the user-space NUMA programming interface provided by libnuma, including numa_alloc_onnode, numa_bind, and topology discovery functions
- NUMA Best Practices - Linux kernel documentation on NUMA optimization, memory performance characteristics, and system configuration guidelines
- NUMA Policy Documentation - Detailed explanation of Linux NUMA memory policies and their performance implications
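As a rough illustration of what topology discovery involves, the sketch below reads the node list that Linux exposes under /sys/devices/system/node and prints the CPUs attached to each node. It uses only the Rust standard library. `parse_id_list` is a hypothetical helper name, not part of libnuma or any crate, and the program simply reports that nothing was found when the sysfs files are missing (for example on non-Linux systems).

```rust
use std::fs;

/// Parse the kernel's list format (for example "0-3,8-11") into individual ids.
/// Entries that fail to parse are skipped rather than treated as errors.
pub fn parse_id_list(s: &str) -> Vec<usize> {
    let mut ids = Vec::new();
    for part in s.trim().split(',').filter(|p| !p.is_empty()) {
        match part.split_once('-') {
            // A range such as "8-11" expands to 8, 9, 10, 11.
            Some((lo, hi)) => {
                if let (Ok(lo), Ok(hi)) = (lo.parse::<usize>(), hi.parse::<usize>()) {
                    ids.extend(lo..=hi);
                }
            }
            // A single id such as "5".
            None => {
                if let Ok(id) = part.parse::<usize>() {
                    ids.push(id);
                }
            }
        }
    }
    ids
}

fn main() {
    // The "online" file lists the NUMA node ids currently online.
    let nodes = fs::read_to_string("/sys/devices/system/node/online")
        .map(|s| parse_id_list(&s))
        .unwrap_or_default();

    if nodes.is_empty() {
        println!("no NUMA topology information found in sysfs");
    }

    for node in nodes {
        // Each node directory has a "cpulist" file in the same list format.
        let path = format!("/sys/devices/system/node/node{}/cpulist", node);
        let cpus = fs::read_to_string(&path)
            .map(|s| parse_id_list(&s))
            .unwrap_or_default();
        println!("node {}: {} CPUs {:?}", node, cpus.len(), cpus);
    }
}
```

Reading sysfs directly avoids a dependency for simple inspection. For portable topology discovery, or for anything beyond listing nodes and CPUs, libnuma or hwloc (listed under C/C++ resources) is likely the better fit.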
Programming language resources
Rust-specific
- Rust Allocator API - Official Rust documentation for the GlobalAlloc trait, essential for implementing custom NUMA-aware allocators
- Rust Memory Layout - Understanding Rust memory layout for optimal NUMA placement
- libc NUMA Bindings - Rust bindings for NUMA system calls
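To show where a custom allocator plugs in, here is a minimal sketch of the GlobalAlloc trait in use. `CountingAlloc` is a hypothetical name, and it is not NUMA-aware: it forwards every request to the system allocator and only counts the bytes requested. A NUMA-aware allocator would presumably replace the two `System` calls with node-local allocation, which is outside the scope of this sketch.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper allocator: delegates to `System` and keeps a
/// running total of the bytes requested. It performs no NUMA placement.
pub struct CountingAlloc {
    total_requested: AtomicUsize,
}

impl CountingAlloc {
    pub const fn new() -> Self {
        Self { total_requested: AtomicUsize::new(0) }
    }

    /// Total bytes requested so far (never decreases).
    pub fn total_requested(&self) -> usize {
        self.total_requested.load(Ordering::Relaxed)
    }
}

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        self.total_requested.fetch_add(layout.size(), Ordering::Relaxed);
        // A NUMA-aware allocator would choose a node-local region here
        // instead of calling the system allocator (assumption, not shown).
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Register the wrapper as the allocator for the whole program.
#[global_allocator]
static ALLOC: CountingAlloc = CountingAlloc::new();

fn main() {
    let before = ALLOC.total_requested();
    let buf = vec![0u8; 4096];
    // Keep the compiler from optimizing the allocation away.
    std::hint::black_box(&buf);
    let after = ALLOC.total_requested();
    println!("bytes requested while building the Vec: {}", after - before);
}
```

The counter uses relaxed atomics because it is only a statistic; nothing else is synchronized through it.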
C/C++ resources
- NUMA C Programming Guide - Academic paper on NUMA programming techniques in C
- hwloc Library - Hardware locality library for portable NUMA topology discovery
Performance analysis tools
Measurement and profiling
- Intel Memory Latency Checker - Tool for measuring memory latency and bandwidth characteristics across NUMA nodes
- numactl and numastat - Command-line tools for NUMA policy control and statistics monitoring
- Intel VTune Profiler - Advanced profiler with NUMA-aware memory analysis capabilities
Benchmarking frameworks
- STREAM Benchmark - Memory bandwidth benchmark useful for measuring NUMA performance characteristics
- Intel MLC (Memory Latency Checker) - Also listed under Measurement and profiling above; repeated here because it doubles as a memory subsystem benchmark
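For a sense of what a bandwidth benchmark measures, the following is a simplified, single-threaded sketch in the style of the STREAM triad kernel. It is not the official STREAM code. The array size, the repetition count, and the helper names `triad` and `triad_bytes` are illustrative assumptions, and the arrays should be much larger than the last-level cache for the figure to mean anything.

```rust
use std::time::Instant;

/// STREAM-style triad kernel: a[i] = b[i] + scalar * c[i].
pub fn triad(a: &mut [f64], b: &[f64], c: &[f64], scalar: f64) {
    for ((x, y), z) in a.iter_mut().zip(b).zip(c) {
        *x = *y + scalar * *z;
    }
}

/// Bytes counted per triad pass: two reads and one write of an f64 per element.
pub fn triad_bytes(n: usize) -> usize {
    3 * std::mem::size_of::<f64>() * n
}

fn main() {
    // Assumed size: 4M elements, 32 MiB per array. Increase it on machines
    // with large caches.
    let n = 1 << 22;
    let b = vec![1.0f64; n];
    let c = vec![2.0f64; n];
    let mut a = vec![0.0f64; n];

    // Take the best of several passes; the first pass also pays for
    // page faults on `a`.
    let mut best = f64::INFINITY;
    for _ in 0..5 {
        let start = Instant::now();
        triad(&mut a, &b, &c, 3.0);
        best = best.min(start.elapsed().as_secs_f64());
    }
    // Keep the compiler from discarding the computation.
    std::hint::black_box(&a);

    println!("triad best: {:.2} GB/s", triad_bytes(n) as f64 / best / 1e9);
}
```

One way to see NUMA effects with a program like this is to run it under numactl twice, once with `--cpunodebind=0 --membind=0` and once with `--cpunodebind=0 --membind=1`, and compare the reported bandwidth. The second run needs a machine with at least two nodes.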
Academic and research papers
Foundational papers
- NUMA: A User-Level Memory Management Framework - Classic paper introducing NUMA concepts and early implementations
- Optimizing Memory Performance in NUMA Systems - IEEE paper on NUMA optimization strategies
Recent research
- Modern NUMA Architectures and Programming - Recent survey of NUMA programming techniques and emerging architectures
- NUMA-Aware Data Structures - Research on designing data structures for NUMA systems
Hardware vendor guides
AMD resources
- AMD NUMA Optimization Guide - AMD-specific NUMA optimization techniques for EPYC processors
- AMD Memory Optimization - Memory subsystem optimization for AMD architectures
ARM resources
- ARM NUMA Guidelines - NUMA programming guidelines for ARM server processors
- ARM Memory System Guide - Memory ordering and NUMA considerations for ARM architectures
Practical implementation examples
Open source projects
- jemalloc NUMA Support - Real-world NUMA-aware allocator implementation
- Linux Kernel mm/ - Linux kernel memory management source code with extensive NUMA handling
- DPDK NUMA Optimizations - Data Plane Development Kit NUMA optimization examples
Case studies
- Facebook’s NUMA Optimizations - Real-world NUMA optimization case study from Facebook’s HHVM
- Google’s TCMalloc NUMA Features - Production NUMA-aware allocator used at scale
Community resources
Forums and discussion
- Stack Overflow NUMA Tag - Community Q&A for NUMA programming questions
- Linux Kernel Mailing List - Discussions about NUMA implementation and optimization in the Linux kernel
- Reddit r/systems - Systems programming community with regular NUMA discussions
Blogs and articles
- Brendan Gregg’s NUMA Posts - Performance engineering blog with excellent NUMA analysis
- LWN NUMA Articles - In-depth technical articles about NUMA development in Linux
Books and comprehensive guides
Technical books
- “What Every Programmer Should Know About Memory” by Ulrich Drepper - Comprehensive guide to memory systems including NUMA
- “Computer Architecture: A Quantitative Approach” by Hennessy & Patterson - Academic textbook with excellent NUMA coverage
- “Systems Performance” by Brendan Gregg - Practical performance engineering including NUMA optimization
Online courses
- MIT 6.172: Performance Engineering - University course covering NUMA and other performance topics
- Carnegie Mellon 15-418 - Parallel computer architecture course with NUMA content
Related HFT framework articles
For practical application of these resources in high-frequency trading contexts, see:
- Understanding NUMA allocators - Beginner-friendly introduction to NUMA concepts
- Advanced memory management - Complete NUMA implementation with working code
- Nanosecond precision benchmarking - Measuring NUMA performance in HFT systems
Contributing to this resource list
This resource collection is maintained as part of our HFT framework documentation. If you find additional high-quality NUMA resources that would benefit the community, please contribute them through our documentation process.
Resources are selected based on:
- Authority: Official documentation, peer-reviewed papers, and recognized experts
- Practicality: Resources that provide actionable implementation guidance
- Relevance: Content specifically applicable to systems programming and HFT development
- Quality: Well-written, accurate, and up-to-date information
Last updated: September 2, 2025
Resource count: 30+ external links across 8 categories