# Performance Tuning

## Performance Test

OvenMediaEngine provides a tester called OvenRtcTester for measuring WebRTC performance. It is written in Go and uses the `pion/webrtc/v3` and `gorilla/websocket` modules. Many thanks to the [pion/webrtc](https://github.com/pion/webrtc/) and [gorilla/websocket](https://github.com/gorilla/websocket) teams for these great projects.

### Getting Started with OvenRtcTester

#### Install Go

OvenRtcTester is written in Go, so Go must be installed on your system. Install Go from the following URL: <https://golang.org/doc/install>

OvenRtcTester has been tested with the latest Go 1.17 release.

#### Run

You can run the tester as shown below. The `-url` option is required. If you do not use the `-life` option, the tester runs indefinitely until you press `ctrl+c`.

```bash
$ cd OvenMediaEngine/misc/oven_rtc_tester
$ go run OvenRtcTester.go
-url parameter is required and must be vaild. (input : undefined)
  -cint int
        [Optional] PeerConnection connection interval (milliseconds) (default 100)
  -life int
        [Optional] Number of times to execute the test (seconds)
  -n int
        [Optional] Number of client (default 1)
  -sint int
        [Optional] Summary information output cycle (milliseconds) (default 5000)
  -url string
        [Required] OvenMediaEngine's webrtc streaming URL (default "undefined")

```

You can also use `go build` or `go install`, depending on your preference.
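When the tester is launched from a script, it can help to assemble the command line in one place. The following is a minimal sketch that uses only the flags listed in the help output above. The function name `build_tester_args` and its parameter names are hypothetical, and it assumes `-life` takes a duration in seconds, as the help text suggests.

```python
import shlex
from typing import List, Optional


def build_tester_args(
    url: str,
    clients: int = 1,
    life_seconds: Optional[int] = None,
    connect_interval_ms: int = 100,
    summary_interval_ms: int = 5000,
) -> List[str]:
    """Build the argument list for `go run OvenRtcTester.go` (hypothetical helper)."""
    if not url or url == "undefined":
        # The tester itself rejects a missing URL; fail early with a clearer message.
        raise ValueError("-url is required")
    if clients < 1:
        raise ValueError("clients must be at least 1")
    args = [
        "go", "run", "OvenRtcTester.go",
        "-url", url,
        "-n", str(clients),
        "-cint", str(connect_interval_ms),
        "-sint", str(summary_interval_ms),
    ]
    if life_seconds is not None:
        # Without -life the tester keeps running until it is interrupted.
        args += ["-life", str(life_seconds)]
    return args


# Print a copy-and-paste friendly command line for a 60 second run with 5 clients.
print(shlex.join(build_tester_args("ws://192.168.0.160:13333/app/stream", clients=5, life_seconds=60)))
```

The resulting list can be passed to `subprocess.run(..., cwd="OvenMediaEngine/misc/oven_rtc_tester")` if you want the script to start the tester as well.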

{% hint style="warning" %}
OvenRtcTester must be run against OvenMediaEngine 0.12.4 or later. Versions of OvenMediaEngine earlier than 0.12.4 calculate RTP timestamps incorrectly, which causes OvenRtcTester to report incorrect `Video Delay` values.
{% endhint %}

```bash
$ go run OvenRtcTester.go -url ws://192.168.0.160:13333/app/stream -n 5
client_0 connection state has changed checking 
client_0 has started
client_1 connection state has changed checking 
client_1 has started
client_0 connection state has changed connected 
client_1 connection state has changed connected 
client_1 track has started, of type 100: video/H264 
client_0 track has started, of type 100: video/H264 
client_1 track has started, of type 101: audio/OPUS 
client_0 track has started, of type 101: audio/OPUS 
client_2 connection state has changed checking 
client_2 has started
client_2 connection state has changed connected 
client_2 track has started, of type 100: video/H264 
client_2 track has started, of type 101: audio/OPUS 
client_3 connection state has changed checking 
client_3 has started
client_3 connection state has changed connected 
client_3 track has started, of type 100: video/H264 
client_3 track has started, of type 101: audio/OPUS 
client_4 connection state has changed checking 
client_4 has started
client_4 connection state has changed connected 
client_4 track has started, of type 100: video/H264 
client_4 track has started, of type 101: audio/OPUS 
<Summary>
Running time : 5s
Number of clients : 5
ICE Connection State : New(0), Checking(0) Connected(5) Completed(0) Disconnected(0) Failed(0) Closed(0)
Avg Video Delay(54.20 ms) Max Video Delay(55.00 ms) Min Video Delay(53.00 ms)
Avg Audio Delay(37.00 ms) Max Audio Delay(55.00 ms) Min Audio Delay(26.00 ms)
Avg FPS(30.15) Max FPS(30.25) Min FPS(30.00)
Avg BPS(4.1 Mbps) Max BPS(4.1 Mbps) Min BPS(4.0 Mbps)
Total Bytes(11.6 MBytes) Avg Bytes(2.3 MBytes)
Total Packets(13897) Avg Packets(2779)
Total Packet Losses(0) Avg Packet Losses(0)

<Summary>
Running time : 10s
Number of clients : 5
ICE Connection State : New(0), Checking(0) Connected(5) Completed(0) Disconnected(0) Failed(0) Closed(0)
Avg Video Delay(43.60 ms) Max Video Delay(45.00 ms) Min Video Delay(42.00 ms)
Avg Audio Delay(36.60 ms) Max Audio Delay(55.00 ms) Min Audio Delay(25.00 ms)
Avg FPS(30.04) Max FPS(30.11) Min FPS(30.00)
Avg BPS(4.0 Mbps) Max BPS(4.0 Mbps) Min BPS(4.0 Mbps)
Total Bytes(24.3 MBytes) Avg Bytes(4.9 MBytes)
Total Packets(28832) Avg Packets(5766)
Total Packet Losses(0) Avg Packet Losses(0)

<Summary>
Running time : 15s
Number of clients : 5
ICE Connection State : New(0), Checking(0) Connected(5) Completed(0) Disconnected(0) Failed(0) Closed(0)
Avg Video Delay(36.60 ms) Max Video Delay(38.00 ms) Min Video Delay(35.00 ms)
Avg Audio Delay(49.20 ms) Max Audio Delay(68.00 ms) Min Audio Delay(38.00 ms)
Avg FPS(30.07) Max FPS(30.07) Min FPS(30.07)
Avg BPS(4.0 Mbps) Max BPS(4.0 Mbps) Min BPS(4.0 Mbps)
Total Bytes(36.8 MBytes) Avg Bytes(7.4 MBytes)
Total Packets(43717) Avg Packets(8743)
Total Packet Losses(0) Avg Packet Losses(0)

^CTest stopped by user
***************************
Reports
***************************
<Summary>
Running time : 15s
Number of clients : 5
ICE Connection State : New(0), Checking(0) Connected(5) Completed(0) Disconnected(0) Failed(0) Closed(0)
Avg Video Delay(23.60 ms) Max Video Delay(25.00 ms) Min Video Delay(22.00 ms)
Avg Audio Delay(11.20 ms) Max Audio Delay(18.00 ms) Min Audio Delay(5.00 ms)
Avg FPS(30.07) Max FPS(30.07) Min FPS(30.07)
Avg BPS(4.0 Mbps) Max BPS(4.0 Mbps) Min BPS(4.0 Mbps)
Total Bytes(38.6 MBytes) Avg Bytes(7.7 MBytes)
Total Packets(45662) Avg Packets(9132)
Total Packet Losses(0) Avg Packet Losses(0)

<Details>
[client_0]
        running_time(15s) connection_state(connected) total_packets(9210) packet_loss(0)
        last_video_delay (22.0 ms) last_audio_delay (52.0 ms)
        total_bytes(7.8 Mbytes) avg_bps(4.0 Mbps) min_bps(3.6 Mbps) max_bps(4.3 Mbps)
        total_video_frames(463) avg_fps(30.07) min_fps(28.98) max_fps(31.00)

client_0 connection state has changed closed 
client_0 has stopped
[client_1]
        running_time(15s) connection_state(connected) total_packets(9210) packet_loss(0)
        last_video_delay (22.0 ms) last_audio_delay (52.0 ms)
        total_bytes(7.8 Mbytes) avg_bps(4.0 Mbps) min_bps(3.6 Mbps) max_bps(4.3 Mbps)
        total_video_frames(463) avg_fps(30.07) min_fps(28.98) max_fps(31.00)

client_1 has stopped
[client_2]
        running_time(15s) connection_state(connected) total_packets(9145) packet_loss(0)
        last_video_delay (23.0 ms) last_audio_delay (63.0 ms)
        total_bytes(7.7 Mbytes) avg_bps(4.0 Mbps) min_bps(3.6 Mbps) max_bps(4.5 Mbps)
        total_video_frames(460) avg_fps(30.07) min_fps(28.97) max_fps(31.02)

client_1 connection state has changed closed 
client_2 has stopped
[client_3]
        running_time(15s) connection_state(connected) total_packets(9081) packet_loss(0)
        last_video_delay (25.0 ms) last_audio_delay (65.0 ms)
        total_bytes(7.7 Mbytes) avg_bps(4.0 Mbps) min_bps(3.6 Mbps) max_bps(4.3 Mbps)
        total_video_frames(457) avg_fps(30.07) min_fps(29.00) max_fps(31.03)

client_2 connection state has changed closed 
client_3 has stopped
client_3 connection state has changed closed 
[client_4]
        running_time(15s) connection_state(connected) total_packets(9016) packet_loss(0)
        last_video_delay (26.0 ms) last_audio_delay (36.0 ms)
        total_bytes(7.6 Mbytes) avg_bps(4.0 Mbps) min_bps(3.6 Mbps) max_bps(4.3 Mbps)
        total_video_frames(454) avg_fps(30.07) min_fps(28.99) max_fps(31.02)

client_4 has stopped
```

## Performance Tuning

### Monitoring the usage of threads

Linux offers various tools for monitoring CPU usage per thread. Let's start with the simplest one, the `top` command. Running `top -H -p [pid]` displays a screen like the following.

<figure><img src="/files/i6o3whHoY8KuPrjkg4mH" alt=""><figcaption></figcaption></figure>
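If you prefer to collect the same per-thread figures from a script, `top` can be run once in batch mode (`-b -n 1`) and its output parsed. The sketch below is an illustration only: the function names are hypothetical, and it assumes the default `top` column layout (with `%CPU` and `COMMAND` columns) and a locale that uses `.` as the decimal separator.

```python
import subprocess
from typing import Dict, List


def read_top_threads(pid: int) -> str:
    """Run `top -H -b -n 1 -p PID` once in batch mode and return its text output."""
    result = subprocess.run(
        ["top", "-H", "-b", "-n", "1", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_top_threads(top_output: str) -> Dict[str, float]:
    """Map each thread name to the highest %CPU value seen for that name."""
    usage: Dict[str, float] = {}
    cpu_col = cmd_col = -1
    for line in top_output.splitlines():
        fields = line.split()
        if cpu_col < 0:
            # Skip the summary area until the table header row is found.
            if fields[:1] == ["PID"] and "%CPU" in fields and "COMMAND" in fields:
                cpu_col = fields.index("%CPU")
                cmd_col = fields.index("COMMAND")
            continue
        if len(fields) <= cmd_col:
            continue  # blank or truncated row
        name = " ".join(fields[cmd_col:])
        usage[name] = max(usage.get(name, 0.0), float(fields[cpu_col]))
    return usage


def hot_threads(usage: Dict[str, float], threshold: float = 95.0) -> List[str]:
    """Return the names of threads at or above the (assumed) saturation threshold."""
    return sorted(name for name, cpu in usage.items() if cpu >= threshold)
```

For example, `hot_threads(parse_top_threads(read_top_threads(pid)))` lists the thread names that are close to saturating a core. The 95% threshold is an arbitrary choice for this sketch.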

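You can use OvenRtcTester to test the capacity of the server, as shown below. When testing maximum performance, OvenRtcTester itself uses a lot of system resources, so run it on a system separate from the one running OvenMediaEngine. Running OvenRtcTester on multiple servers is also recommended. For example, you can simulate 500 players with `-n 500` on one OvenRtcTester instance and use four servers to simulate 2000 players.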

{% hint style="warning" %}
Building and running OvenMediaEngine in debug mode severely degrades performance. Be sure to test maximum performance with a binary produced by `make release && make install`.
{% endhint %}

```bash
$ go run OvenRtcTester.go -url ws://192.168.0.160:13333/app/stream -n 100
client_0 connection state has changed checking 
client_0 has started
client_0 connection state has changed connected 
client_0 track has started, of type 100: video/H264 
client_0 track has started, of type 101: audio/OPUS 
client_1 connection state has changed checking 
client_1 has started
client_1 connection state has changed connected 
client_1 track has started, of type 100: video/H264 
client_1 track has started, of type 101: audio/OPUS 
client_2 connection state has changed checking 
client_2 has started
client_2 connection state has changed connected 
client_2 track has started, of type 100: video/H264 
client_2 track has started, of type 101: audio/OPUS
....
client_94 connection state has changed checking 
client_94 has started
client_94 connection state has changed connected 
client_94 track has started, of type 100: video/H264 
client_94 track has started, of type 101: audio/OPUS 
client_95 connection state has changed checking 
client_95 has started
client_95 connection state has changed connected 
client_95 track has started, of type 100: video/H264 
client_95 track has started, of type 101: audio/OPUS 
client_96 connection state has changed checking 
client_96 has started
<Summary>
Running time : 10s
Number of clients : 97
ICE Connection State : New(0), Checking(1) Connected(96) Completed(0) Disconnected(0) Failed(0) Closed(0)
Avg Video Delay(13.51 ms) Max Video Delay(47.00 ms) Min Video Delay(0.00 ms)
Avg Audio Delay(22.42 ms) Max Audio Delay(67.00 ms) Min Audio Delay(0.00 ms)
Avg FPS(27.20) Max FPS(32.51) Min FPS(0.00)
Avg BPS(3.7 Mbps) Max BPS(4.6 Mbps) Min BPS(0bps)
Total Bytes(238.7 MBytes) Avg Bytes(2.5 MBytes)
Total Packets(285013) Avg Packets(2938)
Total Packet Losses(306) Avg Packet Losses(3)
```
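When the load is spread over several tester machines, as recommended above, each machine needs its own `-n` value. The following sketch divides a target player count as evenly as possible. The function name `split_clients` is hypothetical and is not part of OvenRtcTester.

```python
from typing import List


def split_clients(total_clients: int, tester_hosts: int) -> List[int]:
    """Return the `-n` value for each tester host so the values sum to total_clients."""
    if total_clients < 0:
        raise ValueError("total_clients must not be negative")
    if tester_hosts < 1:
        raise ValueError("at least one tester host is required")
    base, extra = divmod(total_clients, tester_hosts)
    # The first `extra` hosts take one additional client each.
    return [base + 1 if index < extra else base for index in range(tester_hosts)]


# 2000 players on four tester machines, matching the example in the text.
for host_index, count in enumerate(split_clients(2000, 4)):
    print(f"host {host_index}: -n {count}")
```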

When the capacity of OvenMediaEngine is exceeded, the Summary report of OvenRtcTester shows it through `Avg Video Delay`, `Avg Audio Delay`, or `Packet loss`.
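These checks can also be scripted by parsing a `<Summary>` block from the tester output. The sketch below extracts a few of the fields shown in the logs above and compares them with thresholds. The function names and all threshold values (200 ms, 25 fps, zero losses) are assumptions made for illustration, not limits defined by OvenMediaEngine.

```python
import re
from typing import Dict, List

# Patterns follow the field layout of the <Summary> lines shown in the logs above.
_PATTERNS = {
    "avg_video_delay_ms": r"Avg Video Delay\(([\d.]+) ms\)",
    "avg_audio_delay_ms": r"Avg Audio Delay\(([\d.]+) ms\)",
    "avg_fps": r"Avg FPS\(([\d.]+)\)",
    "total_packet_losses": r"Total Packet Losses\((\d+)\)",
}


def parse_summary(text: str) -> Dict[str, float]:
    """Extract the selected metrics from one <Summary> block."""
    metrics: Dict[str, float] = {}
    for name, pattern in _PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            metrics[name] = float(match.group(1))
    return metrics


def overload_signs(
    metrics: Dict[str, float],
    max_delay_ms: float = 200.0,  # assumed threshold
    min_fps: float = 25.0,        # assumed threshold
    max_losses: int = 0,          # assumed threshold
) -> List[str]:
    """Return human-readable signs that the server may be over capacity."""
    signs: List[str] = []
    if metrics.get("avg_video_delay_ms", 0.0) > max_delay_ms:
        signs.append("high video delay")
    if metrics.get("avg_audio_delay_ms", 0.0) > max_delay_ms:
        signs.append("high audio delay")
    if "avg_fps" in metrics and metrics["avg_fps"] < min_fps:
        signs.append("low fps")
    if metrics.get("total_packet_losses", 0.0) > max_losses:
        signs.append("packet loss")
    return signs
```

Choose thresholds that match your own stream: a 60 fps source or a high-latency network path needs different values than the defaults used here.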

<figure><img src="/files/mQ4umi2IcPnYPTsfAyJb" alt=""><figcaption></figcaption></figure>

On the right side of the capture above, OvenRtcTester simulates 400 players. The `<Summary>` of OvenRtcTester shows that `Avg Video Delay` and `Avg Audio Delay` are very high and `Avg FPS` is low.

On the left, the `top -H -p` command shows CPU usage per thread. It shows that the `StreamWorker` threads are running at 100%, so the server can be scaled by increasing the number of `StreamWorker` threads. If OvenMediaEngine is not using 100% of all cores on the server, you can improve performance by [tuning the number of threads](#tuning-the-number-of-threads).

<figure><img src="/files/tycuu2NlnAjyIQPCpQTP" alt=""><figcaption></figcaption></figure>

This is the result of setting `StreamWorkerCount` to 8 in the configuration. This time, OvenRtcTester simulated 1000 players, and you can see that the server runs stably.

### Tuning the number of threads

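`WorkerCount` in `<Bind>` sets the number of threads responsible for sending and receiving over the socket. In the Publisher settings, `AppWorkerCount` sets the number of threads used for per-stream processing such as RTP packaging, and `StreamWorkerCount` sets the number of threads used for per-session processing such as SRTP encryption.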

```xml

<Bind>
    <Providers>
        <RTMP>
            <Port>1935</Port>
            <WorkerCount>1</WorkerCount>
        </RTMP>
        ...
    </Providers>
    ...
    <Publishers>
        <WebRTC>
            <Signalling>
                <Port>3333</Port>
                <WorkerCount>1</WorkerCount>
            </Signalling>
            <IceCandidates>
                <TcpRelay>*:3478</TcpRelay>
                <IceCandidate>*:10000/udp</IceCandidate>
                <TcpRelayWorkerCount>1</TcpRelayWorkerCount>
            </IceCandidates>
    ...
</Bind>
        
<Application>
    <Publishers>
        <AppWorkerCount>1</AppWorkerCount>
        <StreamWorkerCount>8</StreamWorkerCount>
    </Publishers>
</Application>
```

#### Scalable Threads and Configuration

<table data-header-hidden><thead><tr><th width="289">Thread name</th><th>Element in the configuration</th></tr></thead><tbody><tr><td>Thread name</td><td>Element in the configuration</td></tr><tr><td>AW-XXX</td><td>&#x3C;Application>&#x3C;Publishers>&#x3C;AppWorkerCount></td></tr><tr><td>StreamWorker</td><td>&#x3C;Application>&#x3C;Publishers>&#x3C;StreamWorkerCount></td></tr><tr><td>SPICE-XXX</td><td><p>&#x3C;Bind>&#x3C;Providers>&#x3C;WebRTC>&#x3C;IceCandidates>&#x3C;TcpRelayWorkerCount></p><p>&#x3C;Bind>&#x3C;Publishers>&#x3C;WebRTC>&#x3C;IceCandidates>&#x3C;TcpRelayWorkerCount></p></td></tr><tr><td>SPRtcSignalling</td><td><p>&#x3C;Bind>&#x3C;Providers>&#x3C;WebRTC>&#x3C;Signalling>&#x3C;WorkerCount></p><p>&#x3C;Bind>&#x3C;Publishers>&#x3C;WebRTC>&#x3C;Signalling>&#x3C;WorkerCount></p></td></tr><tr><td>SPSegPub</td><td><p>&#x3C;Bind>&#x3C;Publishers>&#x3C;HLS>&#x3C;WorkerCount></p><p>&#x3C;Bind>&#x3C;Publishers>&#x3C;DASH>&#x3C;WorkerCount></p></td></tr><tr><td>SPRTMP-XXX</td><td>&#x3C;Bind>&#x3C;Providers>&#x3C;RTMP>&#x3C;WorkerCount></td></tr><tr><td>SPMPEGTS</td><td>&#x3C;Bind>&#x3C;Providers>&#x3C;MPEGTS>&#x3C;WorkerCount></td></tr><tr><td>SPOvtPub</td><td>&#x3C;Bind>&#x3C;Publishers>&#x3C;OVT>&#x3C;WorkerCount></td></tr><tr><td>SPSRT</td><td>&#x3C;Bind>&#x3C;Providers>&#x3C;SRT>&#x3C;WorkerCount></td></tr></tbody></table>

#### AppWorkerCount

| Type    | Value |
| ------- | ----- |
| Default | 1     |
| Minimum | 1     |
| Maximum | 72    |

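`AppWorkerCount` sets the number of threads used to distribute stream processing when hundreds of streams are created in a single application. When an application is asked to create a stream, the stream is assigned to one of these threads so that streams are spread evenly. The main role of a Stream is to packetize raw media packets into the media format of the protocol used for transmission. With thousands of streams, a single thread can hardly process them all. In addition, if `StreamWorkerCount` is set to 0, the `AppWorkerCount` threads are responsible for sending media packets to the sessions.

It is recommended that this value not exceed the number of CPU cores.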

#### StreamWorkerCount

| Type    | Value |
| ------- | ----- |
| Default | 8     |
| Minimum | 0     |
| Maximum | 72    |

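Sending data to thousands of viewers from a single thread may not be possible. `StreamWorkerCount` distributes sessions across multiple threads so that they can be served simultaneously. This means that the resources required for SRTP encryption in WebRTC or TLS encryption in HLS/DASH can be spread across multiple threads. It is recommended that this value not exceed the number of CPU cores.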
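The ranges in the tables above and the CPU core recommendation can be checked before deploying a configuration. The sketch below reads a `<Publishers>` fragment and reports values that fall outside the documented ranges or exceed the core count. The function name is hypothetical, and treating a missing element as its documented default is an assumption of this sketch.

```python
import os
import xml.etree.ElementTree as ET
from typing import List, Optional

# Minimum and maximum values taken from the tables in this section.
LIMITS = {"AppWorkerCount": (1, 72), "StreamWorkerCount": (0, 72)}
# Documented defaults, applied here when an element is absent (assumption).
DEFAULTS = {"AppWorkerCount": 1, "StreamWorkerCount": 8}


def check_worker_counts(publishers_xml: str, cpu_cores: Optional[int] = None) -> List[str]:
    """Return warnings for a <Publishers> fragment; an empty list means no findings."""
    cores = cpu_cores or os.cpu_count() or 1
    root = ET.fromstring(publishers_xml)
    warnings: List[str] = []
    for name, (low, high) in LIMITS.items():
        node = root.find(name)
        value = int(node.text) if node is not None and node.text else DEFAULTS[name]
        if not low <= value <= high:
            warnings.append(f"{name}={value} is outside the documented range {low}-{high}")
        elif value > cores:
            warnings.append(f"{name}={value} exceeds the {cores} available CPU cores")
    return warnings
```

Pass `cpu_cores` explicitly when you check a configuration on a machine other than the one that will run OvenMediaEngine.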

### Use-Case

If many streams are created and only a few viewers connect to each stream, increase `AppWorkerCount` and lower `StreamWorkerCount` as follows.

```xml
<Publishers>
  <AppWorkerCount>32</AppWorkerCount>
  <StreamWorkerCount>0</StreamWorkerCount>
</Publishers>
```

If only a few streams are created and a very large number of viewers connect to each stream, lower `AppWorkerCount` and increase `StreamWorkerCount` as follows.

```xml
<Publishers>
  <AppWorkerCount>1</AppWorkerCount>
  <StreamWorkerCount>32</StreamWorkerCount>
</Publishers>
```


