Reliable TCP Connections in Go with Keepalive

wqwq
3 min read · Jul 28, 2024

Introduction

I previously wrote about NLB and ECS Fargate, focusing on the infrastructure side, such as configuring them with Terraform. In this article, I want to shift the focus to the application side.

Components

First, let's look at the architecture. The main components are an NLB and ECS Fargate, and communication is primarily socket-based: clients connect through the NLB to the application running on Fargate.

Basic Socket Implementation in Go

A basic TCP server is built on Listen and Accept from the net package. net.Listen takes the network and the address to listen on and returns a listener. The listener's Accept method then returns each incoming connection, which we handle as usual, typically in its own goroutine.

package main

import (
	"fmt"
	"net"
	"os"
)

// handleConnection manages the communication with the client.
func handleConnection(conn net.Conn) {
	defer conn.Close()
	fmt.Println("Handling new connection from", conn.RemoteAddr())

	// Example of reading from and writing to the connection
	buffer := make([]byte, 1024)
	for {
		n, err := conn.Read(buffer)
		if err != nil {
			fmt.Println("Read error:", err)
			return
		}
		if n == 0 {
			break
		}
		fmt.Printf("Received message: %s\n", string(buffer[:n]))
		_, err = conn.Write([]byte("Message received"))
		if err != nil {
			fmt.Println("Write error:", err)
			return
		}
	}
}

func main() {
	// Listen for connections on port 8080
	listener, err := net.Listen("tcp", "localhost:8080")
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error setting up listener: %v\n", err)
		os.Exit(1)
	}
	defer listener.Close()
	fmt.Println("Listening on port 8080...")

	for {
		// Accept connections
		conn, err := listener.Accept()
		if err != nil {
			fmt.Fprintf(os.Stderr, "Error accepting connection: %v\n", err)
			continue
		}
		fmt.Println("Accepted connection from", conn.RemoteAddr())
		// Handle client connection in a goroutine
		go handleConnection(conn)
	}
}

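Network Load Balancer Timeout

An NLB applies an idle timeout of 350 seconds to TCP connections; at the time of writing, this value was fixed and could not be changed. To keep the connection between the client and the NLB healthy, we should use TCP keepalive. When a connection is idle, TCP keepalive periodically sends small probe packets. These confirm that the peer is still alive and, as long as the interval is shorter than 350 seconds, reset the NLB's idle timer.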

What if we don't have TCP keepalive?

As explained above, we cannot ensure a healthy connection without TCP keepalive. When the NLB idle timeout is exceeded, the connection may not be closed cleanly, so it remains open on the server even though it is no longer usable. Every TCP connection holds a file descriptor, and the number of file descriptors a process on ECS Fargate can open is capped by its ulimit. If dead connections keep accumulating, the server eventually runs out of available file descriptors, can no longer accept new connections, and becomes effectively unusable.
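To make the limit concrete, here is a minimal sketch that prints the open file limit of the current process. It assumes a Unix-like environment such as the Linux containers on Fargate, and the helper name fdLimits is only illustrative.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// fdLimits returns the soft and hard limits on open file descriptors
// (RLIMIT_NOFILE) for the current process.
// Illustrative helper; Unix-like systems only.
func fdLimits() (soft, hard uint64, err error) {
	var rl syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, 0, err
	}
	return uint64(rl.Cur), uint64(rl.Max), nil
}

func main() {
	soft, hard, err := fdLimits()
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error reading file descriptor limit: %v\n", err)
		os.Exit(1)
	}
	// Every open TCP connection counts against the soft limit.
	fmt.Printf("Open file limit: soft=%d hard=%d\n", soft, hard)
}
```

Comparing this number with the number of connections the server currently holds gives a rough idea of how much headroom is left before Accept starts failing.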

How to implement it in Go

The implementation is simple in Go. We can use the SetKeepAlive and SetKeepAlivePeriod methods of net.TCPConn.

// conn is a *net.TCPConn
conn.SetKeepAlive(true)
conn.SetKeepAlivePeriod(30 * time.Second)

Note that connections accepted through the net package, for example with the AcceptTCP method, already have keepalive enabled, with a default period of 15 seconds. If you want to change this value, you can use the SetKeepAlivePeriod method as well.

c, err := l.AcceptTCP()
