SlideShare a Scribd company logo
HOSTED BY
Running a Go App in Kubernetes:
CPU Impacts
Teiva Harsanyi
SRE at Google
Teiva Harsanyi
SRE at Google
■ SRE in the Borg ML team
■ 100 Go Mistakes author — 100go.co/book
Introduction
● K8s is not straightforward, it's easy to be wrong
● Unveil some Go & k8s complexity
● Discuss the impacts of running a Go app inside k8s
—focus on CPU
Core Concepts
Go and k8s Scheduler, GOMAXPROCS
Go Scheduling
3 key components:
■ G: Goroutine
■ M: OS thread (machine)
■ P: CPU core (processor)
Main actors:
■ OS scheduler: assigns an M on a P
■ Go scheduler: assigns a G on an M
M1 (runnable)
P0
M0
LRQ0
GRQ
G - executing
P1
LRQ1
G - runnable
go f()
Go Scheduling G: goroutine, M: OS thread, P: CPU core
P0
M0
LRQ0
GRQ
G - executing
P1
M1
LRQ1
G - runnable
Step 1
1/61
G - executing
Go Scheduling G: goroutine, M: OS thread, P: CPU core
P0
M0
LRQ0
GRQ
G - executing
P1
M1
LRQ1 G - waiting
Go Scheduling G: goroutine, M: OS thread, P: CPU core
G - executing
P0
M0
LRQ0
GRQ
G - executing
P1
M1
LRQ1
G - executing
G - runnable
Go Scheduling G: goroutine, M: OS thread, P: CPU core
Step 2
P0
M0
LRQ0
GRQ
G - executing
P1
M1
LRQ1
G - runnable
Step 3
Work stealing
Go Scheduling G: goroutine, M: OS thread, P: CPU core
Step 2
G - executing
P0
M0
LRQ0
GRQ
G - executing
P1
M1
LRQ1
Go Scheduling G: goroutine, M: OS thread, P: CPU core
Step 5 Network polling
Step 3
No goroutine,
what's next?
Step 4
GOMAXPROCS
Variable that defines the number of M (OS threads) that can execute
user-level Go code simultaneously
runtime.GOMAXPROCS(8) // Set GOMAXPROCS to 8
n := runtime.GOMAXPROCS(0) // Get the current value of GOMAXPROCS
Notes:
■ If an M is blocked, the Go scheduler can spin up more Ms
■ The GC derives the limit of how much CPU it should consume from GOMAXPROCS
k8s Deployment Config
spec:
containers:
- name: img
image: img:latest
resources:
requests:
cpu: "2000m" <--- Guaranteed minimum amount of CPU
resources
limits:
cpu: "4000m" <--- Maximum amount of CPU resources
k8s Scheduler
k8s uses Completely Fair Scheduler (CFS) as a process scheduler
Two main parameters:
■ cpu.cfs_period_us: Period (set to 100 ms by default)
■ cpu.cfs_quota_us: The amount of CPU time the app can consume during the defined
period
Example: if the resources.limits.cpu is set to 2000m (= 2 cores) => 2x100 = 200 ms
=> Each period of 100 ms, the app can consume up to 200 ms of CPU resource
Wait... What’s the default value of GOMAXPROCS?
Equal to runtime.NumCPU(), the number of CPU cores
GOMAXPROCS
Kubernetes
4 CPU-Core Machine spec:
resources:
limits:
cpu: 1000m
App
=> In this situation, will GOMAXPROCS be equal to 4 or to 1?
It will be equal to 4, Go isn’t CFS-aware (github.com/golang/go/issues/33803)
Experiment
Scenario Kubernetes
Load tester HTTP
spec:
resources:
requests:
cpu: 1000m
limits:
cpu: 1000m
4 CPU-Core Machine
CPU-bound App
~50 ms
Let's benchmark this scenario with:
■ GOMAXPROCS = 1
■ GOMAXPROCS = 4
Results
Why? rps < 20
Quota: 100 ms
Used: 99 ms ✅
100 ms period
28
ms
53
ms
18
ms
Quota: 100 ms
Used: 86 ms ✅
100 ms period
19
ms
17
ms
25
ms
25
ms
Quota: 100 ms
Used: 90 ms ✅
Core 0
Core 1
Core 2
Core 3
100 ms period
25
ms
28
ms
19
ms
18
ms
GOMAXPROCS=4
...
...
...
...
Why? rps >= 20
Throttling
Throttling
Throttling
Throttling
Quota: 100 ms
Used: 100 ms ❌
25
ms
25
ms
25
ms
25
ms
Core 0
Core 1
Core 2
Core 3
100 ms period
Throttling
Throttling
Throttling
Throttling
Quota: 100 ms
Used: 100 ms ❌
25
ms
25
ms
25
ms
25
ms
100 ms period
Throttling
Throttling
Throttling
Throttling
Quota: 100 ms
Used: 100 ms ❌
25
ms
25
ms
25
ms
25
ms
100 ms period
GOMAXPROCS=4
Solution
Is the solution to remove the limit?
Yes, in most cases
Main drawbacks:
■ May increase latency variance
■ Be careful in some specific conditions;
For example, if a workload has direct
correlation between CPU and memory
usage, watch out to OOMs
Main benefit:
■ A workload can use all the idle
CPU in the node
At Google?
■ We do have CPU limits
■ Not set manually, the CPU limit is recalculated each time a job is
rescheduled onto different machine
■ GOMAXPROCS is also set automatically depending on the CPU limit
Conclusion
Conclusion
■ Be aware that Go isn't CFS-aware
■ GOMAXPROCS should reflect the available compute parallelism
■ Be careful with k8s CPU limits
■ Benchmarks for the win
Teiva Harsanyi
@teivah
blog.teivah.io
Thank you! Let’s connect.

More Related Content

Similar to Running a Go App in Kubernetes: CPU Impacts

Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latency
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and LatencyOptimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latency
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latency
Henning Jacobs
 
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Henning Jacobs
 
How to Automate Performance Tuning for Apache Spark
How to Automate Performance Tuning for Apache SparkHow to Automate Performance Tuning for Apache Spark
How to Automate Performance Tuning for Apache Spark
Databricks
 
SOLR Power FTW: short version
SOLR Power FTW: short versionSOLR Power FTW: short version
SOLR Power FTW: short version
Alex Pinkin
 
Toronto meetup 20190917
Toronto meetup 20190917Toronto meetup 20190917
Toronto meetup 20190917
Bill Liu
 
16 artifacts to capture when there is a production problem
16 artifacts to capture when there is a production problem16 artifacts to capture when there is a production problem
16 artifacts to capture when there is a production problem
Tier1 app
 
Java 어플리케이션 성능튜닝 Part1
Java 어플리케이션 성능튜닝 Part1Java 어플리케이션 성능튜닝 Part1
Java 어플리케이션 성능튜닝 Part1
상욱 송
 
Ensuring Kubernetes Cost Efficiency across (many) Clusters - DevOps Gathering...
Ensuring Kubernetes Cost Efficiency across (many) Clusters - DevOps Gathering...Ensuring Kubernetes Cost Efficiency across (many) Clusters - DevOps Gathering...
Ensuring Kubernetes Cost Efficiency across (many) Clusters - DevOps Gathering...
Henning Jacobs
 
Top-5-production-devconMunich-2023-v2.pptx
Top-5-production-devconMunich-2023-v2.pptxTop-5-production-devconMunich-2023-v2.pptx
Top-5-production-devconMunich-2023-v2.pptx
Tier1 app
 
How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014
Puppet
 
How to build a feedback loop in software
How to build a feedback loop in softwareHow to build a feedback loop in software
How to build a feedback loop in software
Sandeep Joshi
 
Speedrunning the Open Street Map osm2pgsql Loader
Speedrunning the Open Street Map osm2pgsql LoaderSpeedrunning the Open Street Map osm2pgsql Loader
Speedrunning the Open Street Map osm2pgsql Loader
GregSmith458515
 
Become a GC Hero
Become a GC HeroBecome a GC Hero
Become a GC Hero
Tier1app
 
Microservices with Micronaut
Microservices with MicronautMicroservices with Micronaut
Microservices with Micronaut
QAware GmbH
 
High Performance Erlang - Pitfalls and Solutions
High Performance Erlang - Pitfalls and SolutionsHigh Performance Erlang - Pitfalls and Solutions
High Performance Erlang - Pitfalls and Solutions
Yinghai Lu
 
Enterprise application performance - Understanding & Learnings
Enterprise application performance - Understanding & LearningsEnterprise application performance - Understanding & Learnings
Enterprise application performance - Understanding & Learnings
Dhaval Shah
 
Docker Monitoring Webinar
Docker Monitoring  WebinarDocker Monitoring  Webinar
Docker Monitoring Webinar
Sematext Group, Inc.
 
Am I reading GC logs Correctly?
Am I reading GC logs Correctly?Am I reading GC logs Correctly?
Am I reading GC logs Correctly?
Tier1 App
 
Benchmarking for HTTP/2
Benchmarking for HTTP/2Benchmarking for HTTP/2
Benchmarking for HTTP/2
Kit Chan
 
Integrating ChatGPT with Apache Airflow
Integrating ChatGPT with Apache AirflowIntegrating ChatGPT with Apache Airflow
Integrating ChatGPT with Apache Airflow
Tatiana Al-Chueyr
 

Similar to Running a Go App in Kubernetes: CPU Impacts (20)

Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latency
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and LatencyOptimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latency
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latency
 
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
 
How to Automate Performance Tuning for Apache Spark
How to Automate Performance Tuning for Apache SparkHow to Automate Performance Tuning for Apache Spark
How to Automate Performance Tuning for Apache Spark
 
SOLR Power FTW: short version
SOLR Power FTW: short versionSOLR Power FTW: short version
SOLR Power FTW: short version
 
Toronto meetup 20190917
Toronto meetup 20190917Toronto meetup 20190917
Toronto meetup 20190917
 
16 artifacts to capture when there is a production problem
16 artifacts to capture when there is a production problem16 artifacts to capture when there is a production problem
16 artifacts to capture when there is a production problem
 
Java 어플리케이션 성능튜닝 Part1
Java 어플리케이션 성능튜닝 Part1Java 어플리케이션 성능튜닝 Part1
Java 어플리케이션 성능튜닝 Part1
 
Ensuring Kubernetes Cost Efficiency across (many) Clusters - DevOps Gathering...
Ensuring Kubernetes Cost Efficiency across (many) Clusters - DevOps Gathering...Ensuring Kubernetes Cost Efficiency across (many) Clusters - DevOps Gathering...
Ensuring Kubernetes Cost Efficiency across (many) Clusters - DevOps Gathering...
 
Top-5-production-devconMunich-2023-v2.pptx
Top-5-production-devconMunich-2023-v2.pptxTop-5-production-devconMunich-2023-v2.pptx
Top-5-production-devconMunich-2023-v2.pptx
 
How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014
 
How to build a feedback loop in software
How to build a feedback loop in softwareHow to build a feedback loop in software
How to build a feedback loop in software
 
Speedrunning the Open Street Map osm2pgsql Loader
Speedrunning the Open Street Map osm2pgsql LoaderSpeedrunning the Open Street Map osm2pgsql Loader
Speedrunning the Open Street Map osm2pgsql Loader
 
Become a GC Hero
Become a GC HeroBecome a GC Hero
Become a GC Hero
 
Microservices with Micronaut
Microservices with MicronautMicroservices with Micronaut
Microservices with Micronaut
 
High Performance Erlang - Pitfalls and Solutions
High Performance Erlang - Pitfalls and SolutionsHigh Performance Erlang - Pitfalls and Solutions
High Performance Erlang - Pitfalls and Solutions
 
Enterprise application performance - Understanding & Learnings
Enterprise application performance - Understanding & LearningsEnterprise application performance - Understanding & Learnings
Enterprise application performance - Understanding & Learnings
 
Docker Monitoring Webinar
Docker Monitoring  WebinarDocker Monitoring  Webinar
Docker Monitoring Webinar
 
Am I reading GC logs Correctly?
Am I reading GC logs Correctly?Am I reading GC logs Correctly?
Am I reading GC logs Correctly?
 
Benchmarking for HTTP/2
Benchmarking for HTTP/2Benchmarking for HTTP/2
Benchmarking for HTTP/2
 
Integrating ChatGPT with Apache Airflow
Integrating ChatGPT with Apache AirflowIntegrating ChatGPT with Apache Airflow
Integrating ChatGPT with Apache Airflow
 

More from ScyllaDB

Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throug...
Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throug...Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throug...
Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throug...
ScyllaDB
 
Mitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing SystemsMitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing Systems
ScyllaDB
 
Measuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at TwitterMeasuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at Twitter
ScyllaDB
 
Architecting a High-Performance (Open Source) Distributed Message Queuing Sys...
Architecting a High-Performance (Open Source) Distributed Message Queuing Sys...Architecting a High-Performance (Open Source) Distributed Message Queuing Sys...
Architecting a High-Performance (Open Source) Distributed Message Queuing Sys...
ScyllaDB
 
Noise Canceling RUM by Tim Vereecke, Akamai
Noise Canceling RUM by Tim Vereecke, AkamaiNoise Canceling RUM by Tim Vereecke, Akamai
Noise Canceling RUM by Tim Vereecke, Akamai
ScyllaDB
 
Always-on Profiling of All Linux Threads, On-CPU and Off-CPU, with eBPF & Con...
Always-on Profiling of All Linux Threads, On-CPU and Off-CPU, with eBPF & Con...Always-on Profiling of All Linux Threads, On-CPU and Off-CPU, with eBPF & Con...
Always-on Profiling of All Linux Threads, On-CPU and Off-CPU, with eBPF & Con...
ScyllaDB
 
Performance Budgets for the Real World by Tammy Everts
Performance Budgets for the Real World by Tammy EvertsPerformance Budgets for the Real World by Tammy Everts
Performance Budgets for the Real World by Tammy Everts
ScyllaDB
 
Using Libtracecmd to Analyze Your Latency and Performance Troubles
Using Libtracecmd to Analyze Your Latency and Performance TroublesUsing Libtracecmd to Analyze Your Latency and Performance Troubles
Using Libtracecmd to Analyze Your Latency and Performance Troubles
ScyllaDB
 
Reducing P99 Latencies with Generational ZGC
Reducing P99 Latencies with Generational ZGCReducing P99 Latencies with Generational ZGC
Reducing P99 Latencies with Generational ZGC
ScyllaDB
 
5 Hours to 7.7 Seconds: How Database Tricks Sped up Rust Linting Over 2000X
5 Hours to 7.7 Seconds: How Database Tricks Sped up Rust Linting Over 2000X5 Hours to 7.7 Seconds: How Database Tricks Sped up Rust Linting Over 2000X
5 Hours to 7.7 Seconds: How Database Tricks Sped up Rust Linting Over 2000X
ScyllaDB
 
How Netflix Builds High Performance Applications at Global Scale
How Netflix Builds High Performance Applications at Global ScaleHow Netflix Builds High Performance Applications at Global Scale
How Netflix Builds High Performance Applications at Global Scale
ScyllaDB
 
Conquering Load Balancing: Experiences from ScyllaDB Drivers
Conquering Load Balancing: Experiences from ScyllaDB DriversConquering Load Balancing: Experiences from ScyllaDB Drivers
Conquering Load Balancing: Experiences from ScyllaDB Drivers
ScyllaDB
 
Interaction Latency: Square's User-Centric Mobile Performance Metric
Interaction Latency: Square's User-Centric Mobile Performance MetricInteraction Latency: Square's User-Centric Mobile Performance Metric
Interaction Latency: Square's User-Centric Mobile Performance Metric
ScyllaDB
 
How to Avoid Learning the Linux-Kernel Memory Model
How to Avoid Learning the Linux-Kernel Memory ModelHow to Avoid Learning the Linux-Kernel Memory Model
How to Avoid Learning the Linux-Kernel Memory Model
ScyllaDB
 
99.99% of Your Traces are Trash by Paige Cruz
99.99% of Your Traces are Trash by Paige Cruz99.99% of Your Traces are Trash by Paige Cruz
99.99% of Your Traces are Trash by Paige Cruz
ScyllaDB
 
Square's Lessons Learned from Implementing a Key-Value Store with Raft
Square's Lessons Learned from Implementing a Key-Value Store with RaftSquare's Lessons Learned from Implementing a Key-Value Store with Raft
Square's Lessons Learned from Implementing a Key-Value Store with Raft
ScyllaDB
 
Making Python 100x Faster with Less Than 100 Lines of Rust
Making Python 100x Faster with Less Than 100 Lines of RustMaking Python 100x Faster with Less Than 100 Lines of Rust
Making Python 100x Faster with Less Than 100 Lines of Rust
ScyllaDB
 
A Deep Dive Into Concurrent React by Matheus Albuquerque
A Deep Dive Into Concurrent React by Matheus AlbuquerqueA Deep Dive Into Concurrent React by Matheus Albuquerque
A Deep Dive Into Concurrent React by Matheus Albuquerque
ScyllaDB
 
The Latency Stack: Discovering Surprising Sources of Latency
The Latency Stack: Discovering Surprising Sources of LatencyThe Latency Stack: Discovering Surprising Sources of Latency
The Latency Stack: Discovering Surprising Sources of Latency
ScyllaDB
 
eBPF vs Sidecars by Liz Rice at Isovalent
eBPF vs Sidecars by Liz Rice at IsovalenteBPF vs Sidecars by Liz Rice at Isovalent
eBPF vs Sidecars by Liz Rice at Isovalent
ScyllaDB
 

More from ScyllaDB (20)

Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throug...
Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throug...Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throug...
Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throug...
 
Mitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing SystemsMitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing Systems
 
Measuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at TwitterMeasuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at Twitter
 
Architecting a High-Performance (Open Source) Distributed Message Queuing Sys...
Architecting a High-Performance (Open Source) Distributed Message Queuing Sys...Architecting a High-Performance (Open Source) Distributed Message Queuing Sys...
Architecting a High-Performance (Open Source) Distributed Message Queuing Sys...
 
Noise Canceling RUM by Tim Vereecke, Akamai
Noise Canceling RUM by Tim Vereecke, AkamaiNoise Canceling RUM by Tim Vereecke, Akamai
Noise Canceling RUM by Tim Vereecke, Akamai
 
Always-on Profiling of All Linux Threads, On-CPU and Off-CPU, with eBPF & Con...
Always-on Profiling of All Linux Threads, On-CPU and Off-CPU, with eBPF & Con...Always-on Profiling of All Linux Threads, On-CPU and Off-CPU, with eBPF & Con...
Always-on Profiling of All Linux Threads, On-CPU and Off-CPU, with eBPF & Con...
 
Performance Budgets for the Real World by Tammy Everts
Performance Budgets for the Real World by Tammy EvertsPerformance Budgets for the Real World by Tammy Everts
Performance Budgets for the Real World by Tammy Everts
 
Using Libtracecmd to Analyze Your Latency and Performance Troubles
Using Libtracecmd to Analyze Your Latency and Performance TroublesUsing Libtracecmd to Analyze Your Latency and Performance Troubles
Using Libtracecmd to Analyze Your Latency and Performance Troubles
 
Reducing P99 Latencies with Generational ZGC
Reducing P99 Latencies with Generational ZGCReducing P99 Latencies with Generational ZGC
Reducing P99 Latencies with Generational ZGC
 
5 Hours to 7.7 Seconds: How Database Tricks Sped up Rust Linting Over 2000X
5 Hours to 7.7 Seconds: How Database Tricks Sped up Rust Linting Over 2000X5 Hours to 7.7 Seconds: How Database Tricks Sped up Rust Linting Over 2000X
5 Hours to 7.7 Seconds: How Database Tricks Sped up Rust Linting Over 2000X
 
How Netflix Builds High Performance Applications at Global Scale
How Netflix Builds High Performance Applications at Global ScaleHow Netflix Builds High Performance Applications at Global Scale
How Netflix Builds High Performance Applications at Global Scale
 
Conquering Load Balancing: Experiences from ScyllaDB Drivers
Conquering Load Balancing: Experiences from ScyllaDB DriversConquering Load Balancing: Experiences from ScyllaDB Drivers
Conquering Load Balancing: Experiences from ScyllaDB Drivers
 
Interaction Latency: Square's User-Centric Mobile Performance Metric
Interaction Latency: Square's User-Centric Mobile Performance MetricInteraction Latency: Square's User-Centric Mobile Performance Metric
Interaction Latency: Square's User-Centric Mobile Performance Metric
 
How to Avoid Learning the Linux-Kernel Memory Model
How to Avoid Learning the Linux-Kernel Memory ModelHow to Avoid Learning the Linux-Kernel Memory Model
How to Avoid Learning the Linux-Kernel Memory Model
 
99.99% of Your Traces are Trash by Paige Cruz
99.99% of Your Traces are Trash by Paige Cruz99.99% of Your Traces are Trash by Paige Cruz
99.99% of Your Traces are Trash by Paige Cruz
 
Square's Lessons Learned from Implementing a Key-Value Store with Raft
Square's Lessons Learned from Implementing a Key-Value Store with RaftSquare's Lessons Learned from Implementing a Key-Value Store with Raft
Square's Lessons Learned from Implementing a Key-Value Store with Raft
 
Making Python 100x Faster with Less Than 100 Lines of Rust
Making Python 100x Faster with Less Than 100 Lines of RustMaking Python 100x Faster with Less Than 100 Lines of Rust
Making Python 100x Faster with Less Than 100 Lines of Rust
 
A Deep Dive Into Concurrent React by Matheus Albuquerque
A Deep Dive Into Concurrent React by Matheus AlbuquerqueA Deep Dive Into Concurrent React by Matheus Albuquerque
A Deep Dive Into Concurrent React by Matheus Albuquerque
 
The Latency Stack: Discovering Surprising Sources of Latency
The Latency Stack: Discovering Surprising Sources of LatencyThe Latency Stack: Discovering Surprising Sources of Latency
The Latency Stack: Discovering Surprising Sources of Latency
 
eBPF vs Sidecars by Liz Rice at Isovalent
eBPF vs Sidecars by Liz Rice at IsovalenteBPF vs Sidecars by Liz Rice at Isovalent
eBPF vs Sidecars by Liz Rice at Isovalent
 

Recently uploaded

Salesforce AI & Einstein Copilot Workshop
Salesforce AI & Einstein Copilot WorkshopSalesforce AI & Einstein Copilot Workshop
Salesforce AI & Einstein Copilot Workshop
CEPTES Software Inc
 
IPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite SolutionIPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite Solution
IPLOOK Networks
 
Data Integration Basics: Merging & Joining Data
Data Integration Basics: Merging & Joining DataData Integration Basics: Merging & Joining Data
Data Integration Basics: Merging & Joining Data
Safe Software
 
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyyActive Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
RaminGhanbari2
 
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
Edge AI and Vision Alliance
 
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Muhammad Ali
 
Opencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of MünsterOpencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of Münster
Matthias Neugebauer
 
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdfAcumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
BrainSell Technologies
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
ldtexsolbl
 
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptxRPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
SynapseIndia
 
CHAPTER-8 COMPONENTS OF COMPUTER SYSTEM CLASS 9 CBSE
CHAPTER-8 COMPONENTS OF COMPUTER SYSTEM CLASS 9 CBSECHAPTER-8 COMPONENTS OF COMPUTER SYSTEM CLASS 9 CBSE
CHAPTER-8 COMPONENTS OF COMPUTER SYSTEM CLASS 9 CBSE
kumarjarun2010
 
Google I/O Extended Harare Merged Slides
Google I/O Extended Harare Merged SlidesGoogle I/O Extended Harare Merged Slides
Google I/O Extended Harare Merged Slides
Google Developer Group - Harare
 
Feature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptxFeature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptx
ssuser1915fe1
 
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptxIntroduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
313mohammedarshad
 
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
sunilverma7884
 
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
maigasapphire
 
Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024
aakash malhotra
 
Vulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive OverviewVulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive Overview
Steven Carlson
 
Using LLM Agents with Llama 3, LangGraph and Milvus
Using LLM Agents with Llama 3, LangGraph and MilvusUsing LLM Agents with Llama 3, LangGraph and Milvus
Using LLM Agents with Llama 3, LangGraph and Milvus
Zilliz
 
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
bhumivarma35300
 

Recently uploaded (20)

Salesforce AI & Einstein Copilot Workshop
Salesforce AI & Einstein Copilot WorkshopSalesforce AI & Einstein Copilot Workshop
Salesforce AI & Einstein Copilot Workshop
 
IPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite SolutionIPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite Solution
 
Data Integration Basics: Merging & Joining Data
Data Integration Basics: Merging & Joining DataData Integration Basics: Merging & Joining Data
Data Integration Basics: Merging & Joining Data
 
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyyActive Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
 
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Usef...
 
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
 
Opencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of MünsterOpencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of Münster
 
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdfAcumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
 
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptxRPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
 
CHAPTER-8 COMPONENTS OF COMPUTER SYSTEM CLASS 9 CBSE
CHAPTER-8 COMPONENTS OF COMPUTER SYSTEM CLASS 9 CBSECHAPTER-8 COMPONENTS OF COMPUTER SYSTEM CLASS 9 CBSE
CHAPTER-8 COMPONENTS OF COMPUTER SYSTEM CLASS 9 CBSE
 
Google I/O Extended Harare Merged Slides
Google I/O Extended Harare Merged SlidesGoogle I/O Extended Harare Merged Slides
Google I/O Extended Harare Merged Slides
 
Feature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptxFeature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptx
 
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptxIntroduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
 
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
 
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
 
Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024
 
Vulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive OverviewVulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive Overview
 
Using LLM Agents with Llama 3, LangGraph and Milvus
Using LLM Agents with Llama 3, LangGraph and MilvusUsing LLM Agents with Llama 3, LangGraph and Milvus
Using LLM Agents with Llama 3, LangGraph and Milvus
 
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
 

Running a Go App in Kubernetes: CPU Impacts

  • 1. HOSTED BY Running a Go App in Kubernetes: CPU Impacts Teiva Harsanyi SRE at Google
  • 2. Teiva Harsanyi SRE at Google ■ SRE in the Borg ML team ■ 100 Go Mistakes author — 100go.co/book
  • 3. Introduction ● K8s is not straightforward, it's easy to be wrong ● Unveil some Go & k8s complexity ● Discuss the impacts of running a Go app inside k8s —focus on CPU
  • 4. Core Concepts Go and k8s Scheduler, GOMAXPROCS
  • 5. Go Scheduling 3 key components: ■ G: Goroutine ■ M: OS thread (machine) ■ P: CPU core (processor) Main actors: ■ OS scheduler: assigns an M on a P ■ Go scheduler: assigns a G on an M
  • 6. M1 (runnable) P0 M0 LRQ0 GRQ G - executing P1 LRQ1 G - runnable go f() Go Scheduling G: goroutine, M: OS thread, P: CPU core
  • 7. P0 M0 LRQ0 GRQ G - executing P1 M1 LRQ1 G - runnable Step 1 1/61 G - executing Go Scheduling G: goroutine, M: OS thread, P: CPU core
  • 8. P0 M0 LRQ0 GRQ G - executing P1 M1 LRQ1 G - waiting Go Scheduling G: goroutine, M: OS thread, P: CPU core G - executing
  • 9. P0 M0 LRQ0 GRQ G - executing P1 M1 LRQ1 G - executing G - runnable Go Scheduling G: goroutine, M: OS thread, P: CPU core Step 2
  • 10. P0 M0 LRQ0 GRQ G - executing P1 M1 LRQ1 G - runnable Step 3 Work stealing Go Scheduling G: goroutine, M: OS thread, P: CPU core Step 2 G - executing
  • 11. P0 M0 LRQ0 GRQ G - executing P1 M1 LRQ1 Go Scheduling G: goroutine, M: OS thread, P: CPU core Step 5 Network polling Step 3 No goroutine, what's next? Step 4
  • 12. GOMAXPROCS Variable that defines the number of M (OS threads) that can execute user-level Go code simultaneously runtime.GOMAXPROCS(8) // Set GOMAXPROCS to 8 n := runtime.GOMAXPROCS(0) // Get the current value of GOMAXPROCS Notes: ■ If an M is blocked, the Go scheduler can spin up more Ms ■ The GC derives the limit of how much CPU it should consume from GOMAXPROCS
  • 13. k8s Deployment Config spec: containers: - name: img image: img:latest resources: requests: cpu: "2000m" <--- Guaranteed minimum amount of CPU resources limits: cpu: "4000m" <--- Maximum amount of CPU resources
  • 14. k8s Scheduler k8s uses Completely Fair Scheduler (CFS) as a process scheduler Two main parameters: ■ cpu.cfs_period_us: Period (set to 100 ms by default) ■ cpu.cfs_quota_us: The amount of CPU time the app can consume during the defined period Example: if the resources.limits.cpu is set to 2000m (= 2 cores) => 2x100 = 200 ms => Each period of 100 ms, the app can consume up to 200 ms of CPU resource
  • 15. Wait... What’s the default value of GOMAXPROCS? Equal to runtime.NumCPU(), the number of CPU cores GOMAXPROCS Kubernetes 4 CPU-Core Machine spec: resources: limits: cpu: 1000m App => In this situation, will GOMAXPROCS be equal to 4 or to 1? It will be equal to 4, Go isn’t CFS-aware (github.com/golang/go/issues/33803)
  • 17. Scenario Kubernetes Load tester HTTP spec: resources: requests: cpu: 1000m limits: cpu: 1000m 4 CPU-Core Machine CPU-bound App ~50 ms Let's benchmark this scenario with: ■ GOMAXPROCS = 1 ■ GOMAXPROCS = 4
  • 19. Why? rps < 20 Quota: 100 ms Used: 99 ms ✅ 100 ms period 28 ms 53 ms 18 ms Quota: 100 ms Used: 86 ms ✅ 100 ms period 19 ms 17 ms 25 ms 25 ms Quota: 100 ms Used: 90 ms ✅ Core 0 Core 1 Core 2 Core 3 100 ms period 25 ms 28 ms 19 ms 18 ms GOMAXPROCS=4
  • 20. ... ... ... ... Why? rps >= 20 Throttling Throttling Throttling Throttling Quota: 100 ms Used: 100 ms ❌ 25 ms 25 ms 25 ms 25 ms Core 0 Core 1 Core 2 Core 3 100 ms period Throttling Throttling Throttling Throttling Quota: 100 ms Used: 100 ms ❌ 25 ms 25 ms 25 ms 25 ms 100 ms period Throttling Throttling Throttling Throttling Quota: 100 ms Used: 100 ms ❌ 25 ms 25 ms 25 ms 25 ms 100 ms period GOMAXPROCS=4
  • 21. Solution Is the solution to remove the limit? Yes, in most cases Main drawbacks: ■ May increase latency variance ■ Be careful in some specific conditions; For example, if a workload has direct correlation between CPU and memory usage, watch out to OOMs Main benefit: ■ A workload can use all the idle CPU in the node
  • 22. At Google? ■ We do have CPU limits ■ Not set manually, the CPU limit is recalculated each time a job is rescheduled onto different machine ■ GOMAXPROCS is also set automatically depending on the CPU limit
  • 24. Conclusion ■ Be aware that Go isn't CFS-aware ■ GOMAXPROCS should reflect the available compute parallelism ■ Be careful with k8s CPU limits ■ Benchmarks for the win