Dev Notes
Intel Spots A 3888.9% Performance Improvement In The Linux Kernel From One Line Of Code

PLUS: How Google Ads Was Able to Support 4.77 Billion Users With a SQL Database

Meghanadh Vasireddy
November 11, 2024

Good Morning! Intel's kernel test robot uncovered a whopping 3,888.9% performance boost from a single-line code fix that corrects memory alignment issues in Linux's Transparent Huge Pages. Google's ad platform revealed how they tackled massive scale using Spanner, their distributed SQL database that combines GPS and atomic clocks to maintain synchronized time across global data centers while serving 4.77 billion users. In a reality check for the AI industry, recent testing showed OpenAI's latest model scoring a surprisingly low 42.7% on basic accuracy tests, with competitor Anthropic's Claude also struggling at 28.9%, highlighting ongoing challenges with AI reliability.

— Forrest Knight & Meghanadh Vasireddy

Intel Spots A 3888.9% Performance Improvement In The Linux Kernel From One Line Of Code

Context: Remember that December 2023 kernel update that was supposed to make Transparent Huge Pages (THP) more efficient? Well, it turns out that change had some unexpected side effects. The patch aligned anonymous memory mappings to THP boundaries whenever they hit the PMD_SIZE threshold – sounds good in theory, right?

Intel's kernel test robot just flagged a mind-blowing 3,888.9% performance improvement from a single-line fix. The culprit? That well-intentioned THP alignment was actually fragmenting memory in certain workloads, causing major slowdowns. The most dramatic example was the cactusBSSN benchmark, which ran up to 600% slower on some platforms.
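For a sense of scale, here is a quick sketch that turns those percentages into multipliers. It assumes both figures are relative changes against a baseline, (new - old) / old, which the report summary above does not spell out; the function name is just for illustration.

```python
def pct_change_to_factor(pct: float) -> float:
    """Convert a relative change in percent into a multiplier on the baseline.

    Assumes pct = (new - old) / old * 100, so +100% means "twice the baseline".
    """
    return 1.0 + pct / 100.0


# A 3,888.9% improvement is roughly a 40x jump in the measured metric.
print(round(pct_change_to_factor(3888.9), 3))  # 39.889

# A 600% slowdown means roughly 7x the baseline runtime.
print(pct_change_to_factor(600))  # 7.0
```

Read this way, the headline number is about a 40x gain on one benchmark metric, not a claim about the kernel as a whole.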

The fix tightens the alignment logic: a mapping is THP-aligned only when its length is an exact multiple of PMD_SIZE, not merely at least PMD_SIZE. Here's what the patch addresses:

Memory Behavior Changes:

Old: Any mapping ≥ PMD_SIZE got THP-aligned
New: Only mappings that are PMD_SIZE multiples get aligned
Result: Prevents fragmentation of odd-sized mappings
Impact: Eliminates TLB and cache aliasing issues
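The old and new rules above can be modeled in a few lines. This is a Python sketch of the decision only, not the kernel code; the 2 MiB PMD_SIZE is an assumption (the usual x86-64 value with 4 KiB base pages), and the function names and sample sizes are made up for illustration.

```python
PMD_SIZE = 2 * 1024 * 1024  # assumed: 2 MiB, typical for x86-64 with 4 KiB pages
KIB = 1024


def thp_aligned_old(length: int) -> bool:
    """Old rule as described above: any mapping of at least PMD_SIZE is aligned."""
    return length >= PMD_SIZE


def thp_aligned_new(length: int) -> bool:
    """New rule as described above: only non-zero multiples of PMD_SIZE are aligned."""
    return length >= PMD_SIZE and length % PMD_SIZE == 0


# Hypothetical mapping sizes: below the threshold, exact multiples, and an odd size.
for length in (1024 * KIB, 2048 * KIB, 4096 * KIB, 4632 * KIB):
    print(length // KIB, "KiB:", thp_aligned_old(length), "->", thp_aligned_new(length))
```

The interesting row is the odd-sized 4632 KiB mapping: the old rule aligns it, the new rule leaves it alone.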

The improvement was spotted on an Intel Xeon Platinum (Cooper Lake) server, but the fix should help various workloads, including the popular darktable photo editor. It's amazing how sometimes the biggest performance gains come from the smallest changes!

How Google Ads Was Able to Support 4.77 Billion Users With a SQL Database

Author: Neo Kim

As Google's ad platform exploded in growth, they faced a classic dilemma: they needed NoSQL's scalability but couldn't compromise on MySQL's ACID properties. Enter Google Spanner, their groundbreaking distributed SQL solution.

At its core, Spanner combines distributed systems magic with precise timekeeping. The secret sauce? A mix of battle-tested distributed algorithms and some seriously clever engineering:

Key Components:

TrueTime API: Uses GPS and atomic clocks to maintain synchronized time across global data centers
Two-Phase Commit (2PC): Ensures atomic transactions across partitions
Paxos Algorithm: Manages leader election and replication
Two-Phase Locking: Handles transaction isolation
Snapshot Isolation: Enables lock-free reads through MVCC
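To make the last item concrete, here is a toy, single-process sketch of how multi-version storage lets a reader pick a timestamp and read without taking locks. It is not Spanner's implementation: the class and method names are hypothetical, and timestamps are plain integers standing in for real commit timestamps.

```python
import bisect


class VersionedStore:
    """Toy multi-version store: every write keeps its commit timestamp."""

    def __init__(self):
        # key -> (sorted list of commit timestamps, values in the same order)
        self._versions = {}

    def write(self, key, value, commit_ts):
        """Add a new version; older versions stay readable."""
        ts_list, values = self._versions.setdefault(key, ([], []))
        idx = bisect.bisect_right(ts_list, commit_ts)
        ts_list.insert(idx, commit_ts)
        values.insert(idx, value)

    def read_at(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts, else None."""
        if key not in self._versions:
            return None
        ts_list, values = self._versions[key]
        idx = bisect.bisect_right(ts_list, snapshot_ts)
        return values[idx - 1] if idx else None


store = VersionedStore()
store.write("campaign_budget", 100, commit_ts=10)
store.write("campaign_budget", 250, commit_ts=20)

print(store.read_at("campaign_budget", 15))  # 100
print(store.read_at("campaign_budget", 25))  # 250
print(store.read_at("campaign_budget", 5))   # None
```

Because a reader at timestamp 15 only ever looks at versions committed at or before 15, a later write never changes what that reader sees, which is why no read lock is needed.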

Impact: The results speak for themselves. Spanner delivers 99.999% availability while powering Google's $237 billion (2023) ads business. What's particularly impressive is how it achieves strong consistency at a global scale: once a write commits in Europe, a strongly consistent read in Asia that starts afterward sees that change rather than stale data.
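As a back-of-the-envelope check on what "five nines" allows, assuming availability is measured over a 365-day year (the function name is illustrative):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_pct: float) -> float:
    """Maximum yearly downtime implied by an availability percentage."""
    return MINUTES_PER_YEAR * (100.0 - availability_pct) / 100.0


print(round(downtime_minutes_per_year(99.999), 2))  # 5.26
```

That works out to a little over five minutes of downtime per year.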

Under the hood, Spanner's architecture separates compute and storage layers, using Google's Colossus distributed file system for the latter. This separation, combined with its time-based consistency model, allows it to scale horizontally while maintaining the transactional guarantees that made traditional SQL databases so reliable.

OpenAI's Best Model Scores Just 42.7% on Basic Accuracy Test

The headline number that's raising eyebrows: OpenAI's latest o1-preview model only achieved a 42.7% accuracy rate on SimpleQA. To put this in dev terms, it answered fewer than half of the questions correctly. Their competitor, Anthropic's Claude-3.5-sonnet, scored even lower at 28.9%, though it showed better self-awareness by being more likely to admit uncertainty than to confidently serve up wrong answers.
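The difference between "low accuracy" and "fewer confident mistakes" is easier to see with numbers. The sketch below assumes each answer is graded as correct, incorrect, or not attempted; the grade lists are made up for illustration and are not the published results.

```python
from collections import Counter


def score(grades):
    """Summarize a list of grades: 'correct', 'incorrect', or 'not_attempted'."""
    if not grades:
        raise ValueError("grades must not be empty")
    counts = Counter(grades)
    total = len(grades)
    attempted = counts["correct"] + counts["incorrect"]
    return {
        "accuracy": counts["correct"] / total,
        "attempt_rate": attempted / total,
        # How often the model is right when it chooses to answer at all.
        "accuracy_when_attempted": counts["correct"] / attempted if attempted else 0.0,
    }


# Hypothetical model that almost always answers.
eager = ["correct"] * 4 + ["incorrect"] * 5 + ["not_attempted"] * 1
# Hypothetical model that often declines to answer.
cautious = ["correct"] * 3 + ["incorrect"] * 2 + ["not_attempted"] * 5

print(score(eager))
print(score(cautious))
```

In this made-up data the cautious model has the lower overall accuracy (0.3 versus 0.4) but is right more often when it does answer (3 of 5 versus 4 of 9), which is the kind of trade-off the comparison above is pointing at.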

The investigation revealed some critical issues with current LLM architecture:

Model Overconfidence: The systems demonstrate a concerning tendency to be highly confident in incorrect outputs, suggesting issues with their uncertainty calibration
Hallucination Persistence: Despite advances in model size and training techniques, the core problem of hallucinations remains stubbornly present
Real-world Impact: The implications are particularly concerning in high-stakes applications, with recent examples of AI-powered medical transcription services introducing fictional patient details and medications
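One simple way to put a number on the overconfidence point is to compare how sure a model says it is with how often it is actually right. This is a minimal sketch, assuming you can collect a stated confidence per answer; the function name and sample values are hypothetical.

```python
def overconfidence_gap(predictions):
    """Mean stated confidence minus observed accuracy.

    predictions: list of (confidence in [0, 1], was_correct) pairs.
    A positive result means the model claims more certainty than it earns.
    """
    if not predictions:
        raise ValueError("predictions must not be empty")
    mean_confidence = sum(conf for conf, _ in predictions) / len(predictions)
    accuracy = sum(1 for _, ok in predictions if ok) / len(predictions)
    return mean_confidence - accuracy


# Made-up sample: average stated confidence is about 0.8, but only half are right.
sample = [(0.9, True), (0.9, False), (0.8, False), (0.6, True)]
print(overconfidence_gap(sample))
```

A well-calibrated system would land near zero; the sample above lands around 0.3, meaning it sounds far more certain than its track record justifies.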

For developers and technical teams currently integrating LLMs into production systems, these findings underscore the critical importance of implementing robust validation layers and maintaining human oversight, especially in sensitive applications. While the industry races toward bigger training sets as a potential solution, the fundamental question of whether scale alone can solve these accuracy issues remains unresolved.
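A validation layer does not have to be elaborate. Here is a minimal sketch of the idea, assuming your pipeline already produces a structured output and some confidence score; every name, the threshold, and the sample formulary are hypothetical.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def route_output(
    output: T,
    confidence: float,
    validators: Iterable[Callable[[T], bool]],
    min_confidence: float = 0.8,  # hypothetical threshold; tune per application
) -> str:
    """Return 'accept' only if confidence is high enough and every check passes."""
    if confidence < min_confidence:
        return "human_review"
    if not all(check(output) for check in validators):
        return "human_review"
    return "accept"


# Hypothetical check: every medication named must appear in a known formulary.
FORMULARY = {"ibuprofen", "amoxicillin"}


def only_known_medications(medications):
    return set(medications) <= FORMULARY


print(route_output(["ibuprofen"], 0.95, [only_known_medications]))        # accept
print(route_output(["ibuprofen", "zanthrofil"], 0.95, [only_known_medications]))  # human_review
print(route_output(["ibuprofen"], 0.40, [only_known_medications]))        # human_review
```

The design choice here is to fail closed: anything the checks cannot vouch for goes to a person instead of straight into the record.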

📹 YouTube Spotlight

How I animate 3Blue1Brown | A Manim demo with Ben Sparks

3Blue1Brown
