# senpai

**Repository Path**: ctyun-os-kernel/senpai

## Basic Information

- **Project Name**: senpai
- **License**: GPL-2.0
- **Default Branch**: main

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-03-03
- **Last Updated**: 2024-03-03

## README

# Senpai

Senpai is an automated memory sizing tool for container applications.

## Background

Determining the exact amount of memory required by an application (the
*workingset size*) is a difficult, error-prone task.

Libraries and code pages used during startup are loaded into memory
only to never be touched again. On top of that, the Linux
filesystem cache doesn't evict cold data until that memory is
required for new data. ***Allocated* memory is not a good proxy for
*required* memory**. This makes it difficult to provision memory
correctly and maintain adequate safety margins: too little, and
applications experience thrashing or out-of-memory kills during load
peaks; too much, and costly hardware resources are wasted.

Senpai is a userspace tool that determines the actual memory
requirement of containerized applications.

Using Linux PSI (pressure stall information) metrics and cgroup2 memory limits, senpai applies just
enough memory pressure on a container to page out the cold and unused
memory pages that aren't necessary for nominal workload
performance. It dynamically adapts to load peaks and troughs, and so
provides a workingset profile of an application over time.
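
As a rough illustration of the input such a tool works with, the sketch
below parses the text format that the kernel exposes in a cgroup2
`memory.pressure` file. The helper name `parse_psi` and the sample
values are assumptions made up for this example; this is not senpai's
own code.

```python
def parse_psi(text):
    """Parse PSI text into {"some": {...}, "full": {...}}."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        values = {}
        for field in fields:
            key, _, raw = field.partition("=")
            # "total" is cumulative stall time in microseconds;
            # the avg* fields are percentages over 10s/60s/300s windows.
            values[key] = int(raw) if key == "total" else float(raw)
        result[kind] = values
    return result


# Hypothetical sample in the kernel's PSI format (values are invented).
SAMPLE = (
    "some avg10=0.18 avg60=0.05 avg300=0.01 total=117678359\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=98012345\n"
)

psi = parse_psi(SAMPLE)
print(psi["some"]["avg10"], psi["some"]["total"])
```

On a live system the text would come from reading the container's
`memory.pressure` file instead of `SAMPLE`. Taking the difference
between two successive `total` readings gives the stall time
accumulated in between.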

This information helps system operators eliminate waste, shore up for
contingencies, optimize task placement in compute grids, and plan
long-term capacity/hardware requirements.

## Examples

An example kernel compile job has a peak memory consumption of 800M:

    $ time make -j4 -s
    real    3m58.050s
    user    13m33.735s
    sys     1m30.130s

    $ sort -n memory.current-nolimit.log | tail -n 1
    803934208
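
The log files in these examples are assumed to contain one
`memory.current` sample in bytes per line; how they were recorded is
not shown here. Under that assumption, the peak lookup done with
`sort | tail` could be sketched in Python as follows (`peak_bytes` and
`to_mib` are hypothetical helper names, and all samples except the
peak are invented):

```python
def peak_bytes(lines):
    """Return the largest sample, ignoring blank lines."""
    samples = [int(line) for line in lines if line.strip()]
    return max(samples)


def to_mib(n_bytes):
    """Convert a byte count to mebibytes."""
    return n_bytes / (1024 * 1024)


# Stand-in for the lines of a log file such as memory.current-nolimit.log.
log = ["512000000\n", "803934208\n", "\n", "640000000\n"]

peak = peak_bytes(log)
print(peak, f"{to_mib(peak):.1f} MiB")
```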

However, when a memory limit of 600M is applied, the job finishes in
nearly the same amount of time - with 25% less available memory:

    # echo 600M > memory.high

    $ time make -j4 -s
    real    4m0.654s
    user    13m28.493s
    sys     1m31.509s

    $ sort -n memory.current-600M.log | tail -n 1
    629116928

Clearly, the full 800M aren't required. But 600M still has an unknown
amount of slack - even a 400M limit doesn't materially affect runtime:

    # echo 400M > memory.high

    $ time make -j4 -s
    real    4m3.186s
    user    13m20.452s
    sys     1m31.085s

    $ sort -n memory.current-400M.log | tail -n 1
    419368960

At 300M, on the other hand, the workload struggles to make forward
progress and finish within a reasonable amount of time:

    # echo 300M > memory.high

    $ time make -j4 -s
    ^C
    real    9m9.974s
    user    10m59.315s
    sys     1m16.576s

Finding the exact cutoff where job performance begins to plummet is a
tedious trial-and-error process. It also only works when the job does
a fixed amount of work every time it runs, as in this example. That
isn't true for many datacenter services, which run indefinitely and
process highly variable user input.
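
To put numbers on the manual trials, the sketch below converts the
`real` times of the three completed runs to seconds and reports each
run's slowdown relative to the unlimited baseline. The 300M run is
left out because it was interrupted. The helper name `parse_real` is
an assumption for this example.

```python
import re


def parse_real(value):
    """Convert a `time` value such as '3m58.050s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", value)
    if match is None:
        raise ValueError(f"unexpected time format: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Wall-clock times copied from the completed runs above.
RUNS = {"no limit": "3m58.050s", "600M": "4m0.654s", "400M": "4m3.186s"}

baseline = parse_real(RUNS["no limit"])
slowdown = {
    name: (parse_real(real) / baseline - 1) * 100
    for name, real in RUNS.items()
}

for name, pct in slowdown.items():
    print(f"{name:>8}: {pct:+.1f}% wall-clock time")
```

The 600M and 400M runs come out roughly 1% and 2% slower than the
baseline, which is why neither limit is considered to affect runtime
materially.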

Senpai determines the memory requirement of an application while the
application is running:

    # senpai .
    2019-08-19 14:26:05 Configuration:
    2019-08-19 14:26:05   cgpath = /sys/fs/cgroup/kernelbuild
    2019-08-19 14:26:05   min_size = 104857600
    2019-08-19 14:26:05   max_size = 107374182400
    2019-08-19 14:26:05   interval = 5
    2019-08-19 14:26:05   pressure = 1000
    2019-08-19 14:26:05   max_probe = 0.01
    2019-08-19 14:26:05   max_backoff = 0.1
    2019-08-19 14:26:05   log_probe = 1000
    2019-08-19 14:26:05   log_backoff = 10
    2019-08-19 14:26:05 Resetting limit to memory.current.
    2019-08-19 14:26:06 limit=100.00M pressure=0.000000 time_to_probe= 6 total=117669927 delta=0 integral=0
    2019-08-19 14:26:07 limit=100.00M pressure=0.000000 time_to_probe= 5 total=117669927 delta=0 integral=0
    2019-08-19 14:26:08 limit=100.00M pressure=0.000000 time_to_probe= 4 total=117669927 delta=0 integral=0

    $ time make -j4 -s

    2019-08-19 14:26:09 limit=100.00M pressure=0.000000 time_to_probe= 3 total=117678359 delta=8432 integral=8432
    2019-08-19 14:26:09   backoff: 0.09259305978684715
    2019-08-19 14:26:10 limit=109.26M pressure=0.180000 time_to_probe= 5 total=117719536 delta=41177 integral=41177
    2019-08-19 14:26:10   backoff: 0.1
    2019-08-19 14:26:11 limit=120.18M pressure=0.180000 time_to_probe= 5 total=117768197 delta=48661 integral=48661

    ...

    2019-08-19 14:26:43 limit=340.48M pressure=0.160000 time_to_probe= 5 total=118045638 delta=202 integral=202
    2019-08-19 14:26:44 limit=340.48M pressure=0.130000 time_to_probe= 4 total=118045638 delta=0 integral=202
    2019-08-19 14:26:45 limit=340.48M pressure=0.130000 time_to_probe= 3 total=118045638 delta=0 integral=202
    2019-08-19 14:26:46 limit=340.48M pressure=0.110000 time_to_probe= 2 total=118045638 delta=0 integral=202
    2019-08-19 14:26:47 limit=340.48M pressure=0.110000 time_to_probe= 1 total=118045690 delta=52 integral=254
    2019-08-19 14:26:48 limit=340.48M pressure=0.090000 time_to_probe= 0 total=118045690 delta=0 integral=254
    2019-08-19 14:26:48   probe: -0.001983887611266873
    2019-08-19 14:26:49 limit=339.80M pressure=0.090000 time_to_probe= 5 total=118045690 delta=0 integral=0

    ...

    real    4m9.420s
    user    13m21.723s
    sys     1m33.037s

    $ sort -n memory.current-senpai.log | tail -n 1
    347762688
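
The log above suggests a feedback loop: the limit is raised
("backoff") when accumulated pressure exceeds the configured target,
and lowered ("probe") after a calm interval. The sketch below is a
deliberately simplified controller in that spirit. It reuses the
configuration names from the log, but the linear scaling rules and the
`step` function are assumptions for illustration, not senpai's actual
formula.

```python
from dataclasses import dataclass


@dataclass
class Config:
    min_size: int = 104857600       # lower bound for the limit, bytes
    max_size: int = 107374182400    # upper bound for the limit, bytes
    interval: int = 5               # calm ticks required before probing
    pressure: int = 1000            # tolerated pressure integral (units as in the log)
    max_probe: float = 0.01         # largest fractional decrease per step
    max_backoff: float = 0.1        # largest fractional increase per step


def clamp(value, low, high):
    return max(low, min(high, value))


def step(limit, integral, calm_ticks, conf):
    """Return (new_limit, new_integral, new_calm_ticks) for one tick."""
    if integral > conf.pressure:
        # Too much pressure: raise the limit, scaled by the overshoot
        # (assumed linear rule) and capped at max_backoff.
        overshoot = (integral - conf.pressure) / conf.pressure
        frac = conf.max_backoff * min(1.0, overshoot)
        new_limit = clamp(int(limit * (1 + frac)), conf.min_size, conf.max_size)
        return new_limit, 0, 0
    if calm_ticks + 1 < conf.interval:
        # Still inside the observation interval: hold the limit.
        return limit, integral, calm_ticks + 1
    # Calm for a full interval: lower the limit, more boldly the
    # further the integral stayed below the target (assumed rule).
    headroom = 1 - integral / conf.pressure
    frac = conf.max_probe * headroom
    new_limit = clamp(int(limit * (1 - frac)), conf.min_size, conf.max_size)
    return new_limit, 0, 0


conf = Config()
limit, integral, calm = 340 * 1024 * 1024, 0, 0
for delta in [8432, 0, 0, 52, 0, 0]:  # invented per-tick pressure deltas
    integral += delta
    limit, integral, calm = step(limit, integral, calm, conf)
    print(limit, integral, calm)
```

A real tool would apply each new limit to the cgroup, for instance by
writing it to `memory.high` as in the manual experiments above, and
would take the per-tick delta from the cgroup's pressure readings.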

## Requirements
* Linux v4.20 or later with CONFIG_PSI=y
* python3
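
A quick way to check these requirements on a host might look like the
sketch below. The helper names are assumptions for this example, and
the presence of `/proc/pressure/memory` is only a heuristic: PSI can be
compiled in but disabled at boot.

```python
import os
import platform
import re


def kernel_at_least(release, minimum=(4, 20)):
    """Compare the leading 'major.minor' of a kernel release string."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return False  # unrecognized release string
    return (int(match.group(1)), int(match.group(2))) >= minimum


def psi_available(path="/proc/pressure/memory"):
    """Heuristic: the PSI memory file exists on kernels with PSI support."""
    return os.path.exists(path)


print("kernel ok:", kernel_at_least(platform.release()))
print("psi ok:", psi_available())
```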

## License
senpai is licensed under GPL v2.0, as found in the LICENSE file.