- Packet loss minimization – The operating system buffers must be large enough to handle incoming network traffic while the application is paused during garbage collection. UDP (User Datagram Protocol) is usually used to transmit multicast messages to the server instances in a cluster; sizing the operating system buffers appropriately avoids excessive UDP datagram loss and limits the need to retransmit messages.
- TCP/IP – On some systems the default time wait interval (the time a closed connection remains in the TIME_WAIT state) is too high and needs to be lowered. When the number of connections in the TIME_WAIT state approaches the maximum number of file descriptors per process, the application’s throughput degrades because new connections have to wait for a free slot in the application’s file descriptor table.
- Swapping – Swapping, also known as paging, is the use of secondary storage to store and retrieve data for use in RAM. The operating system swaps automatically, typically when the available RAM is depleted. Swapping can degrade performance significantly and should thus be avoided.
- Network interface card (NIC) – Configure the network card at its maximum link speed and at full duplex.
- Maximum number of open file descriptors – Most operating systems handle sockets as a form of file access and use file descriptors to keep track of which sockets are open. To contain the resources used per process, the operating system restricts the number of file descriptors per process. On Linux the default limit is 1024 open file descriptors per process, which may be too low for optimal performance.
- Large pages – Large pages are essentially blocks of contiguous physical memory addresses that are reserved for a process. They improve the performance of applications that access memory frequently, because the application uses the translation look-aside buffer (TLB) in the processor more effectively. The TLB is a cache of recently used virtual-to-physical address translations kept in the processor. To obtain data from memory, the processor first looks in the TLB for the physical address that holds the required data. With large pages, a single TLB entry covers a large contiguous address range, which reduces the number of TLB misses and thereby avoids frequent look-ups in the hierarchical page table stored in memory.
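As a rough way to see how close a process is to its descriptor limit, the following sketch compares the number of descriptors open in a process with the soft limit of the current shell. It assumes a Linux /proc file system; the helper names `open_fds` and `fd_usage_percent` are hypothetical, and the script inspects its own shell unless a PID is passed as the first argument.

```shell
#!/bin/sh
# Sketch: compare the open file descriptors of a process with the soft limit.
# Assumes Linux (/proc/<pid>/fd); helper names are hypothetical.

# Count the descriptors currently open in process $1 (prints 0 if unreadable).
open_fds() {
    if [ -d "/proc/$1/fd" ]; then
        ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
    else
        echo 0
    fi
}

# Integer percentage of the limit ($2) used by the open descriptors ($1).
fd_usage_percent() {
    echo $(( $1 * 100 / $2 ))
}

pid=${1:-$$}            # PID to inspect; defaults to this shell
limit=$(ulimit -n)      # soft limit of the current shell, not of $pid
used=$(open_fds "$pid")

if [ "$limit" = "unlimited" ]; then
    echo "pid $pid: $used descriptors open, no soft limit"
else
    echo "pid $pid: $used of $limit descriptors in use ($(fd_usage_percent "$used" "$limit")%)"
fi
```

Note that `ulimit -n` reports the limit of the shell that runs the script; the limits that apply to another process can be read from `/proc/<pid>/limits`, given sufficient permissions.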
The contents of /etc/sysctl.conf look as follows:
# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.
# Controls IP packet forwarding
net.ipv4.ip_forward = 0
# Controls source route verification
net.ipv4.conf.default.rp_filter = 1
# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0
# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0
# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1
# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1
# Disable netfilter on bridges.
#net.bridge.bridge-nf-call-ip6tables = 0
#net.bridge.bridge-nf-call-iptables = 0
#net.bridge.bridge-nf-call-arptables = 0
# Controls the default maximum size of a message queue
kernel.msgmnb = 65536
# Controls the maximum size of a message, in bytes
kernel.msgmax = 65536
# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736
# Controls the maximum total amount of shared memory, in pages
kernel.shmall = 4294967296
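# Increase the maximum socket buffer sizes (these values can be changed depending on the type of NIC and the round-trip time)
# Maximum and default socket receive buffer size, in bytes (applies to all sockets, including UDP)
net.core.rmem_max = 8388608
net.core.rmem_default = 8388608
# Maximum and default socket send buffer size, in bytes (applies to all sockets, including UDP)
net.core.wmem_max = 8388608
net.core.wmem_default = 8388608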
# memory reserved for TCP receive buffers (vector of 3 integers: [min, default, max])
net.ipv4.tcp_rmem = 4096 87380 8388608
# memory reserved for TCP send buffers (vector of 3 integers: [min, default, max])
net.ipv4.tcp_wmem = 4096 87380 8388608
# increase the maximum number of packets queued on the input side when an interface receives packets faster than the kernel can process them
net.core.netdev_max_backlog = 30000
# maximum size of the ancillary (option) memory buffer per socket, in bytes (could be set equal to net.core.rmem_max and net.core.wmem_max)
net.core.optmem_max = 20480
# maximum length of the socket listen backlog
net.core.somaxconn = 1024
# tcp selective acknowledgements (disable them on high-speed networks)
net.ipv4.tcp_sack = 1
net.ipv4.tcp_dsack = 1
# Timestamps add 12 bytes to the TCP header
net.ipv4.tcp_timestamps = 1
# Support for large TCP Windows - Needs to be set to 1 if the Max TCP Window is over 65535
net.ipv4.tcp_window_scaling = 1
# The interval between the last data packet sent (simple ACKs are not considered data) and the first keepalive probe
net.ipv4.tcp_keepalive_time = 1800
# The interval between subsequent keepalive probes, regardless of what the connection has exchanged in the meantime
net.ipv4.tcp_keepalive_intvl = 30
# The number of unacknowledged probes to send before considering the connection dead and notifying the application layer
net.ipv4.tcp_keepalive_probes = 5
# The time that must elapse before TCP/IP can release a closed connection and reuse its resources.
net.ipv4.tcp_fin_timeout = 30
# Size of the backlog connections queue.
net.ipv4.tcp_max_syn_backlog=4096
# The tcp_tw_reuse setting is particularly useful in environments where numerous short connections are open and left in TIME_WAIT state, such as web servers.
net.ipv4.tcp_tw_reuse=1
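# Note: tcp_tw_recycle is known to drop connections from clients behind NAT and was removed in Linux 4.12.
net.ipv4.tcp_tw_recycle=1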
# How aggressively the kernel swaps memory pages to disk (0-100; lower values reduce swapping)
vm.swappiness = 0
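# The percentage of main memory filled with dirty pages at which the pdflush daemon starts writing data out to disk in the background.
vm.dirty_background_ratio=25
# The percentage of main memory filled with dirty pages at which a process generating disk writes starts writing the data out itself.
vm.dirty_ratio=20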
# set the number of huge pages based on the Hugepagesize, i.e., 2048kB
# when we want to reserve 1GB in huge pages we have to set the number of huge pages to:
# (1024*1024*1024*1)/(1024*1024*2) = 1073741824/2097152 = 512
vm.nr_hugepages = 512
# give permission to the group that runs the process to access the shared memory segment
# to this end open the /etc/group file and retrieve the group-id (javainstall:x:500:)
vm.hugetlb_shm_group = 500
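The arithmetic behind vm.nr_hugepages can be sketched as a small script. It reads the huge page size from /proc/meminfo and rounds the number of pages up; the helper names `hugepage_size_kb` and `hugepages_for`, the fallback of 2048 kB when the size cannot be read, and the default of 1024 MB are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: derive vm.nr_hugepages from the amount of memory to reserve.
# Helper names are hypothetical; the page size is read from /proc/meminfo.

# Huge page size in kB as reported by the kernel (falls back to 2048).
hugepage_size_kb() {
    size=$(awk '/^Hugepagesize:/ {print $2}' /proc/meminfo 2>/dev/null)
    echo "${size:-2048}"
}

# Number of huge pages needed to reserve $1 MB with pages of $2 kB (rounded up).
hugepages_for() {
    echo $(( ($1 * 1024 + $2 - 1) / $2 ))
}

reserve_mb=${1:-1024}   # memory to reserve in MB; 1024 MB = 1 GB as in the text
page_kb=$(hugepage_size_kb)

echo "Hugepagesize:    ${page_kb} kB"
echo "vm.nr_hugepages = $(hugepages_for "$reserve_mb" "$page_kb")"
```

After /etc/sysctl.conf has been edited, `sysctl -p` (run as root) reloads the file, and `grep Huge /proc/meminfo` shows how many huge pages the kernel actually reserved. The kernel may reserve fewer pages than requested when physical memory is fragmented, so the value is worth checking.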
The contents of /etc/security/limits.conf look as follows:
# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#<domain> <type> <item> <value>
#
#Where:
#<domain> can be:
# - an user name
# - a group name, with @group syntax
# - the wildcard *, for default entry
# - the wildcard %, can be also used with %group syntax,
# for maxlogin limit
#
#<type> can have the two values:
# - "soft" for enforcing the soft limits
# - "hard" for enforcing hard limits
#
#<item> can be one of the following:
# - core - limits the core file size (KB)
# - data - max data size (KB)
# - fsize - maximum filesize (KB)
# - memlock - max locked-in-memory address space (KB)
# - nofile - max number of open files
# - rss - max resident set size (KB)
# - stack - max stack size (KB)
# - cpu - max CPU time (MIN)
# - nproc - max number of processes
# - as - address space limit (KB)
# - maxlogins - max number of logins for this user
# - maxsyslogins - max number of logins on the system
# - priority - the priority to run user process with
# - locks - max number of file locks the user can hold
# - sigpending - max number of pending signals
# - msgqueue - max memory used by POSIX message queues (bytes)
# - nice - max nice priority allowed to raise to values: [-20, 19]
# - rtprio - max realtime priority
#
#<domain> <type> <item> <value>
#
#* soft core 0
#* hard rss 10000
#@student hard nproc 20
#@faculty soft nproc 20
#@faculty hard nproc 50
#ftp hard nproc 0
#@student - maxlogins 4
# open file descriptors
@javainstall soft nofile 8192
@javainstall hard nofile 8192
# memlock - maximum locked in-memory address space (kB), we set this equal to:
# number_of_huge_pages * huge_page_size = 512 * 2048 = 1048576
@javainstall soft memlock 1048576
@javainstall hard memlock 1048576
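The memlock value and the resulting limits can be checked with a short script. This is a minimal sketch: the helper name `memlock_kb` is hypothetical, and the `ulimit` calls report the limits of the shell that runs the script, so it is meant to be run in a fresh login session of a user in the javainstall group.

```shell
#!/bin/sh
# Sketch: compute the memlock value and show the limits of the current shell.
# The helper name is hypothetical.

# memlock in kB for $1 huge pages of $2 kB each.
memlock_kb() {
    echo $(( $1 * $2 ))
}

echo "memlock for 512 pages of 2048 kB: $(memlock_kb 512 2048) kB"

# Limits in effect for the current session (soft and hard values).
echo "open files (soft):            $(ulimit -Sn)"
echo "open files (hard):            $(ulimit -Hn)"
echo "max locked memory (soft, kB): $(ulimit -Sl)"
echo "max locked memory (hard, kB): $(ulimit -Hl)"
```

Changes to /etc/security/limits.conf are applied when a session is opened, so an existing shell keeps its old limits until the user logs in again.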