The inverse square root of a floating-point number $\frac{1}{\sqrt x}$ is used in calculating normalized vectors, which are in turn extensively used in various simulation scenarios such as computer graphics (e.g., to determine angles of incidence and reflection to simulate lighting).…
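A minimal sketch of the use case mentioned above, assuming a plain `vec3` type; the `1 / std::sqrt(x)` expression here is the exact-but-slow baseline that faster approximations compete against:

```cpp
#include <cmath>

struct vec3 { float x, y, z; };

// Scale a vector by the inverse square root of its squared length
// so that the result has unit length
vec3 normalize(vec3 v) {
    float r = 1.0f / std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x * r, v.y * r, v.z * r};
}
```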
If you are reading this chapter sequentially from the beginning, you might be wondering: why would I introduce integer arithmetic after floating-point arithmetic? Isn’t it supposed to be easier? True: plain integer representations are simpler. But, counterintuitively, their simplicity allows for more…
Compared to other arithmetic operations, division works very poorly on x86 and computers in general. Both floating-point and integer division are notoriously hard to implement in hardware. The circuitry takes a lot of space in the ALU, the computation has a lot of stages, and as a result, div and…
In 1940, the British mathematician G. H. Hardy published a famous essay titled “A Mathematician’s Apology” discussing the notion that mathematics should be pursued for its own sake rather than for the sake of its applications. Similar to mathematics, the various fields of computer science also form a…
Computers usually store time as the number of seconds that have passed since the 1st of January, 1970 — the start of the “Unix epoch” — and use these timestamps in all computations that have to do with time. We humans also keep track of time relative to some point in the past, which usually has a…
In modular arithmetic (and computational algebra in general), you often need to raise a number to the $n$-th power — to do modular division, perform primality tests, or compute some combinatorial values — and you usually want to spend fewer than $\Theta(n)$ operations calculating it. Binary…
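The technique this excerpt introduces is binary exponentiation; a self-contained sketch, assuming a fixed prime modulus `M` (the constant is just a common example):

```cpp
#include <cstdint>

typedef uint64_t u64;

const u64 M = 1'000'000'007; // an example prime modulus

// Compute a^n mod M in O(log n) multiplications by squaring the base
// and halving the exponent on each step
u64 binpow(u64 a, u64 n) {
    u64 r = 1;
    a %= M;
    while (n) {
        if (n & 1)
            r = r * a % M;
        a = a * a % M;
        n >>= 1;
    }
    return r;
}
```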
Fermat’s theorem allows us to calculate modular multiplicative inverses through binary exponentiation in $O(\log n)$ operations, but it only works with prime moduli. There is a generalization of it, Euler’s theorem, stating that if $m$ and $a$ are coprime, then $a^{\phi(m)} \equiv 1 \pmod m$…
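Spelled out, the step that turns Euler’s theorem into a division algorithm (a one-line derivation, not specific to this book):

```latex
a^{\phi(m)} \equiv 1 \pmod m
\;\implies\; a \cdot a^{\phi(m) - 1} \equiv 1 \pmod m
\;\implies\; a^{-1} \equiv a^{\phi(m) - 1} \pmod m
```

For a prime $m$, $\phi(m) = m - 1$, which recovers Fermat’s special case $a^{-1} \equiv a^{m-2} \pmod m$.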
Unsurprisingly, a large fraction of computation in modular arithmetic is often spent on calculating the modulo operation, which is as slow as general integer division and typically takes 15-20 cycles, depending on the operand size. The best way to deal with this nuisance is to avoid the modulo operation…
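One common instance of that avoidance strategy, sketched here under the assumption that both inputs are already reduced modulo `M`: modular addition never needs the `%` operator at all.

```cpp
#include <cstdint>

const uint32_t M = 1'000'000'007; // an example modulus

// For a, b < M the sum is below 2M, so one conditional subtraction
// (typically compiled to a branchless cmov) replaces the 15-20 cycle
// division hidden behind the % operator
uint32_t add_mod(uint32_t a, uint32_t b) {
    uint32_t s = a + b;
    return s >= M ? s - M : s;
}
```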
How long does it take to add two numbers together? Being one of the most frequently used instructions, add by itself only takes one cycle to execute. So, if the data is already loaded into registers, it takes just one cycle. But in the general case (*c = *a + *b), we need to fetch the operands from…
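A sketch of that general case; the comment shows roughly what a compiler emits (actual instruction selection varies):

```cpp
// Even a one-cycle add becomes a load-add-store chain once the
// operands live in memory rather than in registers:
//   mov eax, [rdi]   ; load *a
//   add eax, [rsi]   ; load *b and add
//   mov [rdx], eax   ; store *c
void add(int *a, int *b, int *c) {
    *c = *a + *b;
}
```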
Modern computer memory is highly hierarchical. It consists of multiple cache layers of varying speed and size, where higher levels typically store the most frequently accessed data from lower levels to reduce latency: each next level is usually an order of magnitude faster, but also smaller and/or more…
Early operating systems gave every process the freedom to read and modify any memory region it wants, including those allocated for other processes. While this keeps things simple, it also poses some problems: What if one of the processes is buggy or outright malicious? How do we prevent it…
To reason about the performance of memory-bound algorithms, we need to develop a cost model that is more sensitive to expensive block I/O operations but not so rigorous that it stops being useful. #Cache-Aware Model In the standard RAM model, we ignore the fact that primitive operations take unequal…
Now, let’s try to design some actually useful algorithms for the new external memory model. Our goal in this section is to slowly build up more complex things and eventually get to external sorting and its interesting applications. The algorithm will be based on the standard merge sort algorithm,…
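As a reference point for where the section is headed, here is a minimal sketch of the basic primitive, merging two sorted streams with O(1) state per stream; the file names are hypothetical, and a real implementation would read and write in large blocks:

```cpp
#include <fstream>

// Merge two sorted integer files into one; only one element per
// input stream is held in memory at any time
void merge_files(const char *fa, const char *fb, const char *fc) {
    std::ifstream a(fa), b(fb);
    std::ofstream c(fc);
    int x, y;
    bool ha = bool(a >> x), hb = bool(b >> y);
    while (ha || hb) {
        if (!hb || (ha && x <= y)) {
            c << x << '\n';
            ha = bool(a >> x);
        } else {
            c << y << '\n';
            hb = bool(b >> y);
        }
    }
}
```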
In this section, we will apply external sorting and joining to solve a problem that seems useless on the surface but is actually a key primitive used in a large number of external memory and parallel algorithms. Problem. Given a singly-linked list, compute the rank of each element, equal to its…
You can control the I/O operations of your program manually, but most of the time people just rely on automatic buffering and caching, either out of laziness or because of the limitations of the computing environment. But automatic caching comes with its own challenges. When a program runs out of…
In the context of the external memory model, there are two types of efficient algorithms: cache-aware algorithms that are efficient for known $B$ and $M$, and cache-oblivious algorithms that are efficient for any $B$ and $M$. For example, external merge sorting is a cache-aware, but not cache-oblivious…
To precisely assess the performance of an algorithm in terms of its memory operations, we need to take into account multiple characteristics of the cache system: the number of cache layers, the memory and block sizes of each layer, the exact strategy used for cache eviction by each layer, and…
In the previous chapter, we studied computer memory from a theoretical standpoint, using the external memory model to estimate the performance of memory-bound algorithms. While the external memory model is more or less accurate for computations involving HDDs and network storage, where the cost of…
On the data path between the CPU registers and the RAM, there is a hierarchy of caches that exist to speed up access to frequently used data: the layers closer to the processor are faster but also smaller in size. The word “faster” here applies to two closely related but separate timings: The delay…
Even though bandwidth is a more complicated concept, it is much easier to observe and measure than latency: you can simply execute a long series of independent read or write queries, and the scheduler, having access to them in advance, reorders and overlaps them, hiding their latency and maximizing…
The basic units of data transfer in the CPU cache system are not individual bits and bytes, but cache lines. On most architectures, the size of a cache line is 64 bytes, meaning that all memory is divided into blocks of 64 bytes, and whenever you request (read or write) a single byte, you are also…
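One practical consequence, sketched below: since a whole 64-byte line travels at once, data that different threads write should not share a line. The constant 64 is the typical x86 line size, an assumption worth verifying on your machine:

```cpp
#include <cstdint>
#include <cstdio>

// Force each counter onto its own cache line so that two threads
// incrementing different counters don't invalidate each other's lines
// (the "false sharing" problem)
struct alignas(64) padded_counter {
    int64_t value;
};

padded_counter counters[2];

int main() {
    printf("%zu\n", sizeof(padded_counter)); // 64: padded to a full line
}
```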
Starting at some level of the hierarchy, the cache becomes shared between different cores. This reduces the total die area and lets you add more cores on a single chip but also poses some “noisy neighbor” problems as it limits the effective cache size and bandwidth available to a single execution…
Memory requests can overlap in time: while you wait for a read request to complete, you can send a few others, which will be executed concurrently with it. This is the main reason why linear iteration is so much faster than pointer jumping: the CPU knows which memory locations it needs to fetch next…
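A sketch of how one can exploit that overlap explicitly: walking several independent pointer chains at once lets their cache misses be serviced in parallel (it assumes the permutation `q` contains cycles through both starting points):

```cpp
#include <utility>

// Two dependency chains instead of one: the loads on chain `a` and
// chain `b` don't depend on each other, so their misses overlap
std::pair<int, int> chase2(const int *q, int n, int s1, int s2) {
    int a = s1, b = s2;
    for (int i = 0; i < n; i++) {
        a = q[a];
        b = q[b];
    }
    return {a, b};
}
```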
Taking advantage of the free concurrency available in memory hardware, it can be beneficial to prefetch data that is likely to be accessed next if its location can be predicted. This is easy to do when there are no data or control hazards in the pipeline and the CPU can just run ahead of the…
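When the access pattern is data-dependent but still computable in advance, the hint can be issued manually; a sketch using GCC/Clang's `__builtin_prefetch`, where the index array `p` and the distance `D` are hypothetical parameters to tune:

```cpp
const int D = 16; // prefetch distance, machine-dependent

// Sum array elements in the random order given by p; since p itself is
// read sequentially, we can peek D iterations ahead and request the
// future cache line before it is needed
long long sum_gathered(const int *a, const int *p, int n) {
    long long s = 0;
    for (int i = 0; i < n; i++) {
        if (i + D < n)
            __builtin_prefetch(&a[p[i + D]]);
        s += a[p[i]];
    }
    return s;
}
```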
The fact that the memory is partitioned into 64B cache lines makes it difficult to operate on data words that cross a cache line boundary. When you need to retrieve some primitive type, such as a 32-bit integer, you really want to have it located on a single cache line — both because retrieving two…
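This is also why compilers keep primitive types naturally aligned by default, inserting padding into structs; a small illustration:

```cpp
#include <cstdio>

// The compiler inserts 3 bytes of padding after `c` so that `x` starts
// at a 4-byte boundary and can never straddle a cache line
struct padded {
    char c;
    int x;
};

int main() {
    printf("%zu\n", sizeof(padded)); // 8, not 5
}
```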
In the pointer chasing benchmark, for simplicity, we didn’t use actual pointers, but integer indices relative to a base address: The memory addressing operator on x86 is fused with the address computation, so the k = q[k] line folds into just a single terse instruction that also does multiplication…
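For context, a sketch of the loop in question; the comment shows the kind of single instruction it folds into (exact assembly depends on the compiler):

```cpp
// Follow a permutation for n steps; each load address depends on the
// previous load, so nothing can be fetched ahead of time.
// k = q[k] compiles to one mov with base + scaled-index addressing,
// e.g.: mov eax, dword ptr [rdi + rax*4]
int chase(const int *q, int n) {
    int k = 0;
    for (int i = 0; i < n; i++)
        k = q[k];
    return k;
}
```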
Consider a strided incrementing loop over an array of size $N=2^{21}$ with a fixed step size of 256: And then this one, with the step size of 257: Which one will be faster to finish? There are several considerations that come to mind: At first, you might think that there shouldn’t be much…
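A sketch of the two loops being compared (reconstructed from the description, not the article's exact code):

```cpp
const int N = 1 << 21;
int a[N];

// With a power-of-two stride, all touched addresses map into the same
// few cache sets, so the effective cache size collapses
void step256() {
    for (int i = 0; i < N; i += 256)
        a[i]++;
}

// An odd stride spreads the accesses evenly over all cache sets
void step257() {
    for (int i = 0; i < N; i += 257)
        a[i]++;
}
```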
Consider yet again the strided incrementing loop: We change the stride $D$ and increase the array size proportionally so that the total number of iterations $N$ remains constant. As the total number of memory accesses also remains constant, for all $D \geq 16$, we should be fetching exactly…
It is often beneficial to group together the data you need to fetch at the same time: preferably, on the same or, if that isn’t possible, neighboring cache lines. This improves the spatial locality of your memory accesses, positively impacting the performance of memory-bound algorithms. To…
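A sketch of what this looks like in practice, using hypothetical hot/cold fields: the fields scanned on every query live together, and the rarely touched payload is split off so that it doesn't dilute the cache lines carrying the hot data.

```cpp
const int N = 1000;

// Accessed on every query: several records fit on one cache line
struct hot { float x, y, z; };

// Accessed only on rare full lookups, kept in a separate array
struct cold { char payload[64]; };

hot  hot_records[N];
cold cold_records[N];
```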
When you try to explain a complex concept, it is generally a good idea to give a very simple and minimal example illustrating it. This is why in this book you see about a dozen different ways of calculating the sum of an array, each highlighting a certain CPU feature. But the main purpose of this…
In this section, we will derive a variant of gcd that is ~2x faster than the one in the C++ standard library. #Euclid’s Algorithm Euclid’s algorithm solves the problem of finding the greatest common divisor (GCD) of two integers $a$ and $b$, which is defined as the largest such number $g$…
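The baseline the section starts from, for reference (the ~2x speedup comes later from the binary GCD variant, which replaces the division with shifts):

```cpp
#include <utility>

// Euclid's algorithm: repeatedly replace (a, b) with (b, a mod b)
// until the second number becomes zero
int gcd(int a, int b) {
    while (b) {
        a %= b;
        std::swap(a, b);
    }
    return a;
}
```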
Computing the minimum of an array is easily vectorizable, as it is no different from any other reduction: in AVX2, you just need to use the convenient _mm256_min_epi32 intrinsic as the inner operation. It computes the minimum of two 8-element vectors in one cycle — even faster than in the scalar…
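A sketch of the full reduction built around that intrinsic, assuming AVX2 and an array length divisible by 8:

```cpp
#include <immintrin.h>
#include <climits>

int min_avx2(const int *a, int n) {
    // 8 independent running minima, one per vector lane
    __m256i m = _mm256_set1_epi32(INT_MAX);
    for (int i = 0; i < n; i += 8) {
        __m256i x = _mm256_loadu_si256((const __m256i*) &a[i]);
        m = _mm256_min_epi32(m, x);
    }
    // combine the 8 lane minima into a single scalar result
    int t[8];
    _mm256_storeu_si256((__m256i*) t, m);
    int r = t[0];
    for (int i = 1; i < 8; i++)
        r = t[i] < r ? t[i] : r;
    return r;
}
```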
The prefix sum, also known as cumulative sum, inclusive scan, or simply scan, is a sequence of numbers $b_i$ generated from another sequence $a_i$ using the following rule: $$\begin{aligned} b_0 &= a_0 \\ b_1 &= a_0 + a_1 \\ b_2 &= a_0 + a_1 + a_2 \\ &\ldots \end{aligned}$$…
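The straightforward scalar implementation of this rule runs in one linear pass; a minimal in-place version:

```cpp
// Overwrite a[i] with a[0] + a[1] + ... + a[i]; each element needs
// only the previous prefix sum
void prefix_sum(int *a, int n) {
    for (int i = 1; i < n; i++)
        a[i] += a[i - 1];
}
```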
In this case study, we will design and implement several algorithms for matrix multiplication. We start with the naive “for-for-for” algorithm and incrementally improve it, eventually arriving at a version that is 50 times faster and matches the performance of BLAS libraries while being under 40…
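The starting point, for reference; row-major square matrices and this particular loop order are assumptions of the sketch:

```cpp
// Naive O(n^3) matrix multiplication: c = a * b
void matmul(const float *a, const float *b, float *c, int n) {
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            float s = 0;
            for (int k = 0; k < n; k++)
                s += a[i * n + k] * b[k * n + j];
            c[i * n + j] = s;
        }
}
```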
Optimizing data structures is different from optimizing algorithms as data structure problems have more dimensions: you may be optimizing for throughput, for latency, for memory usage, or any combination of those — and this complexity blows up exponentially when you need to process multiple query…
While improving the speed of user-facing applications is the end goal of performance engineering, people don’t really get excited over 5-10% improvements in some databases. Yes, this is what software engineers are paid for, but these types of optimizations tend to be too intricate and…
This section is a follow-up to the previous one, where we optimized binary search by means of removing branching and improving the memory layout. Here, we will also be searching in sorted arrays, but this time we are not limited to fetching and comparing only one element at a time. In this…
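The core trick that lifts the one-element-at-a-time restriction, sketched under AVX2 assumptions: compare the search key against 8 sorted keys of a node at once and extract its rank from the comparison mask (the node layout here follows the general idea, not the article's exact code):

```cpp
#include <immintrin.h>

// Count how many of the 8 sorted keys in `node` are less than `key`
int rank8(int key, const int *node) {
    __m256i x = _mm256_set1_epi32(key);
    __m256i y = _mm256_loadu_si256((const __m256i*) node);
    __m256i mask = _mm256_cmpgt_epi32(x, y); // lanes where key > node[i]
    return __builtin_popcount(_mm256_movemask_ps(_mm256_castsi256_ps(mask)));
}
```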
In the previous article, we designed and implemented static B-trees to speed up binary searching in sorted arrays. In its last section, we briefly discussed how to make them dynamic again while retaining the performance gains from SIMD, and validated our predictions by adding and following explicit…
The lessons learned from optimizing binary search can be applied to a broad range of data structures. In this article, instead of trying to optimize something from the STL again, we focus on segment trees, structures that may be unfamiliar to most normal programmers and perhaps even most…
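For readers meeting them for the first time: a minimal classic (not yet optimized) segment tree supporting point updates and range sums, in the compact bottom-up formulation:

```cpp
#include <vector>

struct segtree {
    int n;
    std::vector<long long> t;

    segtree(int n) : n(n), t(2 * n, 0) {}

    // add v to the i-th element: update the leaf and all its ancestors
    void add(int i, long long v) {
        for (i += n; i > 0; i >>= 1)
            t[i] += v;
    }

    // sum over the half-open interval [l, r)
    long long sum(int l, int r) {
        long long s = 0;
        for (l += n, r += n; l < r; l >>= 1, r >>= 1) {
            if (l & 1) s += t[l++];
            if (r & 1) s += t[--r];
        }
        return s;
    }
};
```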
Notes on Myself: The book starts with Keith talking about himself: how he felt creative and inspired as a child, and how he was disappointed to see everything become colorless and dull as he got older. This childlike state he calls "the visionary world".…