Posted by: rolande | December 30, 2010

Performance Tuning the Network Stack on Mac OS X

There is a decent amount of documentation out there that details all of the tunable parameters on the Mac OSX IP stack. However, most of these documents either provide basic suggestions without much background on a particular setting or they discuss some of the implications of changing certain parameters but don’t give you very solid guidance or recommendations on the best configuration in a particular scenario. Many of the parameters are dependent upon others. So, the configuration should be addressed with that in mind. This document applies to OSX 10.5 Leopard, 10.6 Snow Leopard, 10.7 Lion, and 10.8 Mountain Lion.

The closest thing I have found to a full discussion and tutorial on the topic can be found here. Unfortunately, that link is now offline. For now, you can at least find it in the Wayback Machine archive here. Thank you to martineau(at)linxure.net for pointing this out.

The above document was a great reference to bookmark, but I thought I would also include my own thoughts on this topic to shed some additional light on the subject.

LAST UPDATED – 10/23/13

This post has generated a fair amount of feedback, due to issues encountered. I have attempted to provide an update to address certain strange connectivity behaviors. I believe that most issues were caused by the aggressive TCP keepalive timers. I have not been able to recreate any of the strange side effects, yet, with these updates. So far, so good. After over a year of using these settings on 10.6 Snow Leopard and the past year on 10.8 Mountain Lion, I can report that all local applications and systems are running well. I definitely notice more zip in webpage display inside Chrome and I am able to sustain higher throughputs on various speed tests compared to before.

For reference, here are the custom settings I have added to my own sysctl.conf file:


kern.ipc.maxsockbuf=4194304
kern.ipc.somaxconn=2048
kern.ipc.nmbclusters=2048
net.inet.tcp.rfc1323=1
net.inet.tcp.win_scale_factor=4
net.inet.tcp.sockthreshold=16
net.inet.tcp.sendspace=1042560
net.inet.tcp.recvspace=1042560
net.inet.tcp.mssdflt=1448
net.inet.tcp.msl=15000
net.inet.tcp.always_keepalive=0
net.inet.tcp.delayed_ack=3
net.inet.tcp.slowstart_flightsize=20
net.inet.tcp.blackhole=2
net.inet.udp.blackhole=1
net.inet.icmp.icmplim=50

The easiest way to edit this file is to open a Terminal window and execute ‘sudo nano /etc/sysctl.conf’. The sudo command allows you to elevate your rights to admin. You will be prompted to enter your password if you have admin rights. nano is the name of the command line text editor program. The above entries just get added to this file one line at a time.

You can also update your running settings without rebooting by using the ‘sudo sysctl -w’ command. Just append each of the above settings one at a time after this command. kern.ipc.maxsockets and kern.ipc.nmbclusters can only be modified from the sysctl.conf file upon reboot.
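
As a rough illustration of the two steps above, the following Python sketch turns sysctl.conf-style text into the equivalent ‘sudo sysctl -w’ commands. The function names and the sample text are hypothetical, treating ‘#’ as a comment marker is an assumption, and the two boot-only names are simply the ones listed in the paragraph above. It only builds the command strings; it does not run them.

```python
# Sketch only: helper names and SAMPLE are illustrative, not part of OS X.

# Settings the article says only take effect from sysctl.conf at reboot.
BOOT_ONLY = {"kern.ipc.maxsockets", "kern.ipc.nmbclusters"}


def parse_sysctl_conf(text):
    """Return (name, value) pairs from sysctl.conf-style text."""
    settings = []
    for raw in text.splitlines():
        # Assumption: anything after '#' is a comment.
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings.append((name.strip(), value.strip()))
    return settings


def runtime_commands(settings):
    """Build one 'sudo sysctl -w' command per setting that can change live."""
    return [
        f"sudo sysctl -w {name}={value}"
        for name, value in settings
        if name not in BOOT_ONLY
    ]


SAMPLE = """\
kern.ipc.nmbclusters=2048
net.inet.tcp.rfc1323=1
net.inet.tcp.msl=15000
"""

if __name__ == "__main__":
    for command in runtime_commands(parse_sysctl_conf(SAMPLE)):
        print(command)
```

The boot-only entries are skipped on purpose, since changing them on a running system is not expected to work.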

Below are my explanations of each of the parameters I have customized or included in my sysctl.conf file:

  1. One suggestion out there is to set the kern.ipc.maxsockbuf value to the sum of the net.inet.tcp.sendspace and net.inet.tcp.recvspace variables. The key is that this value can’t be any less than the sum of those 2 values or it can cause some fatal errors with your system. The default value, at least what I found on 10.6.5, is 4194304. That is more than enough. In my case, I have just hard coded this value into my sysctl.conf file to ensure that it does not change to prevent problems. If you are trying to tune for high throughput with Gigabit connectivity you may want to increase this value as recommended in the TCP Tuning Guide for FreeBSD. Generally the suggestion seems to be to minimally set this value to twice the Bandwidth Delay Product. For the majority of the world that is using their Mac on a DSL or Cable connection, that value would be much less than what you would need to support local transfers on your LAN. Personally, I’d leave this at the default but hard code it just to be sure.
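
The rule in item 1 is easy to check before rebooting. This is a minimal Python sketch, assuming the three values are read by hand from your own sysctl.conf; the function names are hypothetical.

```python
# Sketch only: checks the "maxsockbuf >= sendspace + recvspace" rule.

def maxsockbuf_headroom(maxsockbuf, sendspace, recvspace):
    """Bytes left over after both TCP windows; negative means the rule is broken."""
    return maxsockbuf - (sendspace + recvspace)


def maxsockbuf_ok(maxsockbuf, sendspace, recvspace):
    """True when kern.ipc.maxsockbuf covers the sum of the two window sizes."""
    return maxsockbuf_headroom(maxsockbuf, sendspace, recvspace) >= 0


if __name__ == "__main__":
    # Values from the sysctl.conf listing in this post.
    print(maxsockbuf_ok(4194304, 1042560, 1042560))
    print(maxsockbuf_headroom(4194304, 1042560, 1042560))
```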

  2. kern.ipc.somaxconn limits the maximum number of pending connections that can be queued on a single listening socket at any one time. The default here is just 128. If an attacker can flood you with a sufficiently high number of SYN packets in a short enough period of time, the listen queue will fill up and new connections will be refused, thus successfully denying your users access to the service. Increasing this value is also beneficial if you run any automated programs like P2P clients that can fill that queue very quickly.

  3. kern.ipc.maxsockets is now a baseline setting and the ceiling is dynamically calculated based on available system memory. This setting is no longer relevant and is not user configurable even at boot time.

  4. kern.ipc.nmbclusters sets the connection thresholds for the entire system. One socket is created per network connection, and one per Unix domain socket connection. While remote servers and clients will connect to you on the network, more and more local applications are taking advantage of using Unix domain sockets for inter-process communication. There is far less overhead as full TCP packets don’t have to be constructed. The speed of Unix domain socket communication is also much faster as data does not have to go over the network stack but can instead go almost directly to the application. The number of sockets you’ll need depends on what applications will be running. I would recommend starting with a value matching the number of network buffers, and then tuning it as appropriate. You can find out how many network buffer clusters are in use with the command netstat -m. The defaults are usually fine for most people. However, if you want to host Torrents, you will likely want to tune these values to 2 or 4 times the default of 512. The kern.ipc.nmbclusters value appears to default to 32768 on 10.7 Lion. So, this should not be something you have to tune going forward.

  5. I have hard-coded the enabling of RFC1323 (net.inet.tcp.rfc1323) which is the TCP High Performance Options (Window Scaling). This should be on by default on all future OSX builds. It should be noted that this setting also enables TCP Timestamps by default. This adds an additional 12 bytes to the TCP header, thus reducing the MSS to 1448 bytes. The default value of the Window Scale Factor (net.inet.tcp.win_scale_factor), at this point, is arbitrarily set to 3. I have hard-coded the Window Scaling factor to 4 because it matches my need to fill up my existing Internet connection. Ensuring this value is set to 4 allows me the ability to fully utilize my 45Meg AT&T Uverse connection. I calculated this based on my Internet connection’s bandwidth-delay product. On average I should be able to achieve 45Mbps or 45 x 10^6 bits per second. My average maximum roundtrip latency is somewhere around 50 milliseconds or 0.05 seconds. 45 x 10^6 x 0.05 = 2,250,000 bits. So, my network can sustain approximately 2,250,000 / 8 bits per byte = 281,250 Bytes of outstanding, unacknowledged data on the network if my aim is to fully utilize my bandwidth. The TCP window field is 16 bits wide yielding a maximum value of 65535 Bytes. A window scaling factor of 3, which is the same as saying 2^3 = 8, is more than enough to fill my connection. If the TCP window is set to 65535 with a window scale factor of 3, I would be able to transmit 2^3 x 65,535 Bytes = 524,280 Bytes on the network before requiring an ACK packet. So, a value of 3 for the Window Scale Factor setting should be more than adequate for the vast majority of individuals’ Internet connections. Once you get beyond 100Mbps with an average peak latency around 50 milliseconds, you might want to consider bumping the Window Scale Factor up to 4.

    If you notice unacceptably poor performance with key applications you use, I would suggest you disable this option and make sure your net.inet.tcp.sendspace and net.inet.tcp.recvspace values are set no higher than 65535. Any applications with load balanced servers that are using a Layer 5 type ruleset can exhibit performance problems with window scaling if the explicit window scaling configuration has not been properly addressed on the Load Balancer.
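
The bandwidth-delay arithmetic in item 5 can be reproduced with a few lines of Python. This is only a sketch of the calculation described above: the function names are hypothetical, and it uses integer bits per second and whole milliseconds to keep the numbers exact.

```python
# Sketch only: bandwidth-delay product and the smallest window scale that covers it.

TCP_MAX_WINDOW = 65535  # largest value of the 16-bit TCP window field


def bdp_bytes(bandwidth_bps, rtt_ms):
    """Bytes that can be in flight, unacknowledged, on a full link."""
    return bandwidth_bps * rtt_ms // 1000 // 8


def min_window_scale(bdp):
    """Smallest scale factor s such that 65535 * 2^s covers the BDP."""
    shift = 0
    while (TCP_MAX_WINDOW << shift) < bdp:
        shift += 1
    return shift


if __name__ == "__main__":
    bdp = bdp_bytes(45_000_000, 50)      # 45Mbps at 50 ms round trip
    print(bdp, min_window_scale(bdp))
    bdp = bdp_bytes(100_000_000, 50)     # 100Mbps at 50 ms round trip
    print(bdp, min_window_scale(bdp))
```

For the 45Mbps example the sketch arrives at a factor of 3, and at 100Mbps with the same latency it arrives at 4, which lines up with the guidance above.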

  6. The next option I have set is net.inet.tcp.sockthreshold. This parameter sets the number of open sockets at which the system will begin obeying the values you set in the net.inet.tcp.recvspace and net.inet.tcp.sendspace parameters. The default value is 64. Essentially think of this as a Quality of Service threshold. Prior to reaching this number of simultaneous sockets, the system will restrict itself to a max window of 65536 bytes per connection. As long as your window sizes are set above 65536, once you hit the socket threshold, the performance should always be better than anything to which you were previously accustomed. The higher you set this value, the more opportunity you give the system to “take over” all of your network resources.

  7. The net.inet.tcp.sendspace and net.inet.tcp.recvspace settings control the maximum TCP window size the system will allow sending from the machine or receiving to the machine when the connection counts are over a pre-defined threshold. Up until the latest releases of most operating systems, these values defaulted to 65535 bytes. This has been the de facto standard essentially from the beginning of the TCP protocol’s existence in the 1970s. Now that the RFC1323 High Performance TCP Options are starting to be more widely accepted and configured, these values can be increased to improve performance. I have set mine both to 1042560 bytes. That is essentially 16 times the previous limit. I arrived at this value using the following calculation:

    MSS x 45 x 2^4 = 1448 x 45 x 16 = 1042560

    • The MSS I am using is 1448 because I have RFC1323 enabled which enables TCP Timestamps and reduces the default MSS of 1460 bytes by 12 bytes to 1448 bytes.
    • 2^4 matches the Window Scaling Factor of 4 I have chosen to configure.
    • The value of 45 is a little bit more convoluted to figure out. This number is the largest whole number of MSS-sized segments that fits within the max TCP Window field value of 65535 bytes. So, 1448 x 45 = 65160. If you were using an MSS of 1460, this value would be set to 44. But, in the case of OSX, since TCP Timestamps are automatically enabled when you enable RFC1323, you shouldn’t set the MSS higher than 1448. It might be less if you have additional overhead on your line such as PPPoE on a DSL line etc.

    You must have the RFC1323 options enabled, in order to set these values above 65535.
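
The sendspace/recvspace calculation in item 7 can be sketched in Python as follows. The function names are hypothetical; the inputs are the MSS and Window Scale Factor discussed above.

```python
# Sketch only: window size as (whole segments per 65535-byte window) x MSS x 2^scale.

TCP_MAX_WINDOW = 65535  # largest value of the 16-bit TCP window field


def segments_per_window(mss):
    """Largest whole number of MSS-sized segments that fits in 65535 bytes."""
    return TCP_MAX_WINDOW // mss


def tuned_window(mss, scale_factor):
    """Value to use for net.inet.tcp.sendspace and net.inet.tcp.recvspace."""
    return mss * segments_per_window(mss) * 2 ** scale_factor


if __name__ == "__main__":
    print(segments_per_window(1448))   # 45
    print(tuned_window(1448, 4))       # 1042560
```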

  8. The net.inet.tcp.mssdflt setting seems simple to configure on the surface. However, arriving at the optimum setting for your particular network setup and requirements can be a mathematical exercise that is not straightforward. The default MSS value that Apple has configured is a measly 512 bytes. That value is better suited to dial-up users. The impact is not really noticeable on a high speed LAN segment. But it can be a performance bottleneck across a typical residential broadband connection. This setting adjusts the Maximum Segment Size that your system can transmit. You need to understand the characteristics of your own network connection, in order to determine the appropriate value.

    For a machine that only communicates with other hosts across a normal Ethernet network, the answer is very simple. The value should be set to 1460 bytes, as this is the standard MSS on Ethernet networks. TCP/IP packets have a standard 40 byte header (20 bytes for IP plus 20 bytes for TCP). With a standard MTU of 1500 bytes on Ethernet, that leaves 1460 bytes for payload in the IP packet.

    In my case, I had a DSL line that used PPPoE for its transport protocol. In order to get the most out of that DSL line and avoid wasteful protocol overhead, I wanted this value to be exactly equal to the amount of payload data I can attach within a single PPPoE frame. Fragmented segments create additional PPPoE frames and ATM cells, which adds to the overall overhead on my DSL line and reduces my effective bandwidth. There are quite a few references out there to help you determine the appropriate setting. So, to configure for a DSL line that uses PPPoE like mine, an appropriate MSS value would be 1452 bytes. 1460 bytes is the normal MSS on Ethernet for IP traffic, as I described earlier. With PPPoE you have to subtract an additional 8 bytes of overhead for the PPPoE header. That leaves you with an MSS of 1452 bytes.

    There is one other element to account for: ATM. Many DSL providers, like mine, use the ATM protocol as the underlying transport carrier for your PPPoE data. That used to be the only way it was done. ATM uses 53 byte cells, and each cell has a 5 byte header. That leaves 48 bytes for payload in each cell. If I set my MSS to 1452 bytes, that does not divide evenly across ATM’s 48 byte cell payloads. 1452 / 48 = 30.25, so I am left with 12 bytes of additional data to send at the end. Ultimately ATM will fill the last cell with 36 bytes of null data in that scenario. To avoid this overhead, I reduce the MSS to 1440 bytes so that it will evenly fit into the ATM cells. 30 x 48 = 1440 < 1452

    I now have AT&T Uverse which uses VDSL with Packet Transfer Mode (PTM) as the transport protocol. It provides an MTU of 1500. So this eliminates all the complexity of the above calculations and takes things back to the default of 1460 bytes. However, if you have enabled the RFC1323 option for TCP Window Scaling, the MSS should be set to 1448 to account for the 12 byte TCP Timestamp headers that OSX includes when that option is enabled.
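
The MSS arithmetic in item 8 can be written down as a small Python sketch. It mirrors the subtraction and rounding described above and nothing more; the function name and flags are hypothetical, and the ATM step follows this post's approach of rounding the MSS itself down to a multiple of 48.

```python
# Sketch only: reproduces the MSS values worked out in the text.

IP_TCP_HEADERS = 40     # 20 bytes IP + 20 bytes TCP
PPPOE_HEADER = 8        # extra PPPoE overhead per frame
TIMESTAMP_OPTION = 12   # TCP Timestamps added when RFC1323 is enabled
ATM_CELL_PAYLOAD = 48   # payload bytes in each 53 byte ATM cell


def mss_for(mtu=1500, pppoe=False, atm=False, timestamps=False):
    """Return the MSS for a link, following the steps in this post."""
    mss = mtu - IP_TCP_HEADERS
    if pppoe:
        mss -= PPPOE_HEADER
    if timestamps:
        mss -= TIMESTAMP_OPTION
    if atm:
        # Round down so the segment fills whole ATM cells with no padding.
        mss -= mss % ATM_CELL_PAYLOAD
    return mss


if __name__ == "__main__":
    print(mss_for())                        # plain Ethernet
    print(mss_for(pppoe=True))              # PPPoE
    print(mss_for(pppoe=True, atm=True))    # PPPoE over ATM
    print(mss_for(timestamps=True))         # Ethernet or PTM with RFC1323
```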

  9. net.inet.tcp.msl defines the Maximum Segment Life. This is the maximum amount of time to wait for an ACK in reply to a SYN-ACK or FIN-ACK, in milliseconds. If an ACK is not received in this time, the segment can be considered “lost” and the network connection is freed. This setting is primarily about DoS protection but it is also important when it comes to TCP sequence reuse or Twrap. There are two implications for this. When you are trying to close a connection, if the final ACK is lost or delayed, the socket will still close, and more quickly. However, if a client is trying to open a connection to you and their ACK is delayed longer than the MSL, the connection will not form. RFC 793 defines the MSL as 120 seconds (120000ms), however this was written in 1981 and timing issues have changed slightly since then. Today, FreeBSD’s default is 30000ms. This is sufficient for most conditions, but for stronger DoS protection you will want to lower this. I have set mine to 15000 or 15 seconds. This will work best for speeds up to 1Gbps. See Section 1.2 on TCP Reliability starting on Page 4 of RFC1323 for a good description of the importance of TCP MSL as it relates to link bandwidth and TCP sequence reuse or Twrap. If you are using Gig links, you should set this value shorter than 17 seconds or 17000 milliseconds to prevent TCP sequence reuse issues.
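
The 17 second figure comes from the Twrap relationship in RFC1323 Section 1.2, where Twrap = 2^31 / B and B is the link rate in bytes per second. A minimal Python sketch of that check follows; the function names are hypothetical.

```python
# Sketch only: time for TCP sequence numbers to wrap at a given link rate.

SEQUENCE_HALF_SPACE = 2 ** 31  # bytes, per the Twrap formula in RFC1323


def twrap_seconds(bandwidth_bps):
    """Seconds until sequence numbers can wrap at full line rate."""
    bytes_per_second = bandwidth_bps / 8
    return SEQUENCE_HALF_SPACE / bytes_per_second


def msl_is_safe(msl_ms, bandwidth_bps):
    """True when the configured MSL is shorter than Twrap for this link."""
    return msl_ms / 1000 < twrap_seconds(bandwidth_bps)


if __name__ == "__main__":
    print(round(twrap_seconds(1_000_000_000), 1))   # Gigabit link
    print(msl_is_safe(15000, 1_000_000_000))
    print(msl_is_safe(30000, 1_000_000_000))
```

On a Gigabit link the sketch gives a Twrap of a little over 17 seconds, so an MSL of 15000 passes the check and the 30000 default does not.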

  10. It appears that the aggressive TCP keepalive timers below are not well liked by quite a few built-in applications. I have removed these adjustments and kept the defaults for the time being. net.inet.tcp.keepidle sets the interval in milliseconds when the system will send a keepalive packet to test an idle connection to see if it is still active. I set this to 120000 or 120 seconds which is a fairly common interval. The default is 2 hours.

  11. net.inet.tcp.keepinit sets the keepalive probe interval in milliseconds during initiation of a TCP connection. I had set mine to the same as the regular interval, which is 1500 or 1.5 seconds.

  12. net.inet.tcp.keepintvl sets the interval, in milliseconds, between keepalive probes sent to remote machines. After TCPTV_KEEPCNT (default 8) probes are sent, with no response, the (TCP) connection is dropped. I have set this value to 1500 or 1.5 seconds.
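
To get a feel for how aggressive these timers were, here is a rough Python estimate of how long an idle connection to a dead peer survives. It assumes the simple model described in items 10 through 12 (one idle period, then 8 unanswered probes spaced keepintvl apart); the real stack may differ in detail, and the function name is hypothetical.

```python
# Sketch only: simple model of keepalive timing, not the kernel's exact logic.

def dead_peer_detection_ms(keepidle_ms, keepintvl_ms, keepcnt=8):
    """Approximate time before an idle connection to a dead peer is dropped."""
    return keepidle_ms + keepintvl_ms * keepcnt


if __name__ == "__main__":
    # The aggressive values this post previously used.
    print(dead_peer_detection_ms(120000, 1500))
```

With the values I previously used, the probing phase adds only 12 seconds on top of the 120 second idle interval.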

  13. net.inet.tcp.delayed_ack controls the behavior when sending TCP acknowledgements. Allowing delayed ACKs can cause pauses at the tail end of data transfers and used to be a known problem for Macs. This was due to a known poor interaction with the Nagle algorithm in the TCP stack when dealing with slow start and congestion control. I previously had recommended disabling this feature by setting it to “0”. I have learned that Apple has updated the behavior of Delayed ACK, since the release of OSX 10.5 Leopard, to support Greg Minshall’s “Proposed Modification to Nagle’s Algorithm”. I have now reverted this setting back to the default and enabled this feature in auto-detect mode by setting the value to “3”. This effectively enables the Nagle algorithm but prevents the unacknowledged runt packet problem causing an ACK deadlock which can unnecessarily pause transfers and cause significant delays. For your reference, following are the available options:

    • delayed_ack=0 responds after every packet (OFF)
    • delayed_ack=1 always employs delayed ack, 6 packets can get 1 ack
    • delayed_ack=2 immediate ack after 2nd packet, 2 packets per ack (Compatibility Mode)
    • delayed_ack=3 should auto detect when to employ delayed ack, 4 packets per ack. (DEFAULT)
  14. net.inet.tcp.slowstart_flightsize sets the number of outstanding packets permitted with non-local systems during the slowstart phase of TCP ramp up. In order to more quickly overcome TCP slowstart, I have bumped this up to a value of 20. This allows my system to use up to 10% of my bandwidth during TCP ramp up. I calculated this by figuring my Bandwidth-Delay Product and taking 10% of that value divided by the MSS of 1448 bytes to get a rough packet count. So, taking the line rate of 45Mbps, 45 x 10^6 bits per second x 0.05 seconds / 8 bits per byte x 10% / 1448 bytes per packet, I came up with roughly 20 packets.
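
The flight size estimate in item 14 can be reproduced with this short Python sketch. The function name and the choice to round up are my own assumptions for illustration; the inputs are the 45Mbps line rate, 50 ms latency, 1448 byte MSS, and 10% target from the text.

```python
# Sketch only: packets needed to cover a chosen share of the bandwidth-delay product.
import math


def slowstart_flightsize(bandwidth_bps, rtt_ms, mss, percent_of_bdp=10):
    """Packets covering percent_of_bdp of the BDP, rounded up."""
    bdp = bandwidth_bps * rtt_ms // 1000 // 8   # bytes in flight on a full link
    return math.ceil(bdp * percent_of_bdp / 100 / mss)


if __name__ == "__main__":
    print(slowstart_flightsize(45_000_000, 50, 1448))
```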

  15. net.inet.tcp.blackhole defines what happens when a TCP packet is received on a closed port. When set to ‘1’, SYN packets arriving on a closed port will be dropped without a RST packet being sent back. When set to ‘2’, all packets arriving on a closed port are dropped without an RST being sent back. This saves both CPU time because packets don’t need to be processed as much, and outbound bandwidth as packets are not sent out.

  16. net.inet.udp.blackhole is similar to net.inet.tcp.blackhole in its function. As the UDP protocol does not have states like TCP, there is only a need for one choice when it comes to dropping UDP packets. When net.inet.udp.blackhole is set to ‘1’, all UDP packets arriving on a closed port will be dropped.

  17. The name ‘net.inet.icmp.icmplim’ is somewhat misleading. This sysctl controls the maximum number of ICMP “Unreachable” and also TCP RST packets that will be sent back every second. It helps curb the effects of attacks which generate a lot of reply packets. I have set mine to a value of 50.

Responses

  1. Hi, thanks for this article. However, I noticed some side effects after installing your sysctl.conf file on my 2 computers (an iMac and a MacBook Pro):
    1. The printer got uninstalled
    2. The computers don’t “see” each other via Bonjour. No external access to the disk and no screen sharing possible. The Bonjour protocol does work- a NAS server is visible for both computers via Bonjour.
    Can you help me out?

    • When I first updated my settings everything was fine, but as the system ran and more processes started up, things would get weird. It started with not being able to use my VPN configurations. When I tried to create a VPN the GUI would say something like unrecoverable error check your configurations, but this was a working configuration an hour earlier.

      The console log had

      configd[32] IPSec Controller: cannot create racoon control socket
      configd[32] IPSec Controller: cannot connect racoon control socket (errno = 61)

      So I thought I would just remove and try to reconfigure…the weird thing was I could delete the VPN configuration but when I hit reply it would reappear, so that I could not save any changes to the systems preferences pane. In the log I had these messages.

      .com.apple.systempreferences[9546] _SCHelperOpen connect() failed: Connection refused

      The next thing I noticed was my IP printer was gone and the system would not allow me to add it back…So I turned to Google and found a site talking about the sysctl mibs and sockets availability that had the same errors in the logs (site was in Russian). The light bulb went on. I backed out all the variables and voilà, all my services were working.

      I believe the culprit is the

      net.inet.tcp.sockthreshold=16

      variable. I have added all of the others back and have had no problems so far.

  2. Just a quick update: I renamed the sysctl.conf file on both computers, restarted then and lo and behold, everything is back to normal. The printer is there, screen sharing and remote access work as they should.
    So now we have a question: which line in the sysctl.conf file caused this trouble?

  3. Interesting find. I don’t use Bonjour or desktop sharing, so I hadn’t noticed. I will see if I can recreate and identify which particular parameter is causing the headache. That is really odd behavior. As far as the printer disappearing, is it locally attached or over the network? I wonder if it has something to do with the MSL timer setting? Try setting that back to 30000 which is the default for BSD.

  4. The printer is attached via USB port to the iMac. I tried reinstalling the drivers first, but it failed- drivers could not be found via Apple despite the info that they are available there. I downloaded the driver from Canon’s website, but installation of these failed too.
    The MSL timer- first I’ll do some reading about what it is, then give it a try.
    Thanks a lot for your quick reply. W.

    • MSL is the maximum segment lifetime. That is the maximum amount of time a segment of data can be outstanding on the network without acknowledgement or response from the remote end.

      I narrowed the issue down to one of the TCP timer options. At least for the printing problem I was able to recreate. It appears that the OSX Printer & Fax settings screen uses local network communication to talk to the CUPS printing system to identify the installed printers and their status. Something about the tuned TCP timers/keepalives was causing this particular communication to fail. I am not 100% sure yet which one is at fault. I pulled out the MSL setting and the keepalive settings and put it back to default and the printers reappeared without rebooting.

      I set these back to default:

      net.inet.tcp.msl=10000 -> 15000
      net.inet.tcp.always_keepalive=1 -> 0
      net.inet.tcp.delayed_ack=0 -> 3
      net.inet.tcp.blackhole=2 -> 0
      net.inet.udp.blackhole=1 -> 0

  5. Hello again, Scott.
    While perusing the subject of TCP tuning, I came across IPNETTUNERX by http://www.sustworks.com .
    I don’t think this software is worth its money, but the interesting thing was a list of parameters it modifies. They are (name – current – default):
    kern.ipc.maxsockbuf – 4194304 – 262144
    kern.ipc.maxsockets – 512 – 512
    kern.maxfiles – 12288 – 12288
    kern.maxvnodes – 33792 -

    net.inet.ip.accept_sourceroute – 0 – 0
    net.inet.ip.fastforwarding – 0 – 0
    net.inet.ip.forwarding – 0 – 0
    net.inet.ip.fw.dyn_count – 0 – 0
    net.inet.ip.fw.dyn_max – 4096 – 4096
    net.inet.ip.fw.enable – 1 – 1
    net.inet.ip.fw.verbose – 2 – 0
    net.inet.ip.fw.verbose_limit – 0 – 0
    net.inet.ip.intr_queue_drops – 0 – 0
    net.inet.ip.intr_queue_maxlen – 50 – 50
    net.inet.ip.keepfaith – 0 – 0
    net.inet.ip.portrange.first – 49152 – 49152
    net.inet.ip.portrange.hifirst – 49152 – 49152
    net.inet.ip.portrange.hilast – 65535 – 65535
    net.inet.ip.portrange.last – 65535 – 65535
    net.inet.ip.portrange.lowfirst – 1023 – 1023
    net.inet.ip.portrange.lowlast – 600 – 600
    net.inet.ip.redirect – 1 – 1
    net.inet.ip.rtexpire – 140 – 3600
    net.inet.ip.rtmaxcache – 128 – 128
    net.inet.ip.rtminexpire – 10 – 10
    net.inet.ip.sourceroute – 0 – 0
    net.inet.ip.ttl – 64 – 64
    net.inet.raw.maxdgram – 8192 – 8192
    net.inet.raw.recvspace – 8192 – 8192
    net.inet.tcp.always_keepalive – 0 – 0
    net.inet.tcp.blackhole – 2 – 0
    net.inet.tcp.delacktime – – 50
    net.inet.tcp.delayed_ack – 3 – 3
    net.inet.tcp.do_tcpdrain – 0 – 0
    net.inet.tcp.drop_synfin – 1 – 1
    net.inet.tcp.icmp_may_rst – 1 – 1
    net.inet.tcp.isn_reseed_interval – 0 – 0
    net.inet.tcp.keepidle – 7200000 – 144000
    net.inet.tcp.keepinit – 75000 – 1500
    net.inet.tcp.keepintvl – 75000 – 1500
    net.inet.tcp.local_slowstart_flightsize – 8 – 4
    net.inet.tcp.log_in_vain – 3 – 0
    net.inet.tcp.minmss – 216 – 216
    net.inet.tcp.minmssoverload – 0 – 0
    net.inet.tcp.msl – 15000 – 600
    net.inet.tcp.mssdflt – 512 – 512
    net.inet.tcp.newreno – 0 – 0
    net.inet.tcp.path_mtu_discovery – 1 – 1
    net.inet.tcp.pcbcount – 28 –
    net.inet.tcp.recvspace – 65536 – 32768
    net.inet.tcp.rfc1323 – 1 – 1
    net.inet.tcp.rfc1644 – 0 – 0
    net.inet.tcp.rttdflt – – 3 (not available in Mac OS 10.2 or later)
    net.inet.tcp.sack – 1 – 1
    net.inet.tcp.sack_globalholes – 0 – 0
    net.inet.tcp.sack_globalmaxholes – 65536 – 65536
    net.inet.tcp.sack_maxholes – 128 – 128
    net.inet.tcp.sendspace – 65536 – 32768
    net.inet.tcp.slowlink_wsize – 8192 – 8192
    net.inet.tcp.slowstart_flightsize – 1 – 1
    net.inet.tcp.sockthreshold – 64 – 256
    net.inet.tcp.strict_rfc1948 – 0 – 0
    net.inet.tcp.tcbhashsize – 4096 – 4096
    net.inet.tcp.tcp_lq_overflow – 1 – 1
    net.inet.tcp.v6mssdflt – – 50
    net.inet.udp.blackhole – 1 – 0
    net.inet.udp.checksum – 1 – 1
    net.inet.udp.log_in_vain – 3 – 0
    net.inet.udp.maxdgram – 9216 – 9216
    net.inet.udp.pcbcount – 41 –
    net.inet.udp.recvspace – 42080 – 42080
    net.link.ether.inet.apple_hwcksum_rx – 1 – 1
    net.link.ether.inet.apple_hwcksum_tx – 1 – 1
    net.link.ether.inet.host_down_time – 20 – 20
    net.link.ether.inet.log_arp_wrong_iface – – 0
    net.link.ether.inet.max_age – 1200 – 1200
    net.link.ether.inet.maxtries – 5 – 5
    net.link.ether.inet.proxyall – 0 – 0
    net.link.ether.inet.prune_intvl – 300 – 300
    net.link.ether.inet.useloopback – 1 – 1
    net.local.stream.recvspace – 8192 – 8192
    net.local.stream.sendspace – 8192 – 8192

    what do you think?

  6. Thanks for sharing. Note that OS X server variants have some different values from the desktop, some noticeable differences (derived from a 10.5 upgraded from 10.4 OS X server):

    kern.maxvnodes: 120000
    kern.ipc.somaxconn: 2500
    kern.ipc.maxsockbuf: 8388608
    net.inet.tcp.recvspace: 65536
    net.inet.tcp.sendspace: 65536

  7. I’ve been trying to tune my IP stack on and off for a couple of months without success. I am unable to get the send and receive spaces to be anything other than 65536. I updated my parameters to match what you prescribed above, but when I do a sudo sysctl net.inet.tcp.sendspace, I get a response of 65536 for both. Any idea what is overriding what I have in sysctl.conf? (OSX 10.6.7)

    Thanks in advance,

    Frederick

    • You have to set kern.ipc.maxsockbuf to a value equivalent to the sum of those 2 fields or you will cause problems on your machine. You also must already have net.inet.tcp.rfc1323=1. If that is not set, which it should be by default, the system will not allow a window size greater than 65535.

  8. It hid all my system monitor actions

    • You likely did not set kern.ipc.maxsockbuf to a value equivalent to the sum of the send and receive window sizes. If you do not do this you will cause strange issues with local services on your Mac.

  9. Thanks for the post, I’ve found the detailed description of each variable modification very useful.

    However when I tried to increase kern.ipc.somaxconn to 32768, clients on my home LAN could not connect to my local FTP server. :-o I cannot tell how kern.ipc.somaxconn should affect this in any way, but that’s what happens.

    By default (without any sysctl mods) clients have no problem connecting. If I create the /etc/sysctl.conf with the only line “kern.ipc.somaxconn=32768” and reboot, clients fail to connect. Do you have any clue what might cause this? :-o

    I don’t know if this helps, but here’s my config and a few parameters:
    - MacBook Pro (Intel CPU, bought in 2006 Dec.) with 2 GB RAM
    - runs ipfw (configured via WaterRoof)
    - has Little Snitch (but that should not interfere with values set in kern.ipc.somaxconn)
    - FTP server is PureFTPd 1.0.29
    - clients are running Windows (Win7 and FTP client is FileZilla)

    I googled for potential pitfalls with changing kern.ipc.somaxconn, but nothing came up.

    One other thing: do I understand correctly that kern.ipc.somaxconn determines the max. number of _incoming_ connections for a listening socket? Ie. a local webserver, FTP server or torrent client?

    And does kern.ipc.maxsockets determine the max. number of sockets (of all kinds)? The latter has a default of 512. Does this mean that my Mac can have only 512 network connections (incoming + outgoing together) at any given moment? Do I have to set other variables too if I wanted to increase the max. number of network connections? I’m just asking because I’ve read in a blog post that maybe the max. number of open files/handles must be increased as well, because they’re somehow related (?).

    Thanks for your help!

    • First of all, I apologize for the extremely delayed response.

      The kern.ipc.somaxconn variable is highly dependent upon the amount of available memory on your system. This value represents the number of simultaneously permitted connections in a single socket’s listening queue. Your assessment is accurate. Most documentation I have read indicates that the default value of 128 is usually fine for a home/work machine. You really only need to tune this up if you are doing some high volume server hosting and experiencing connection failures. If the value is too high, it is likely you will run into resource issues.

      kern.ipc.maxsockets and kern.ipc.nmbclusters kind of go hand in hand. They do set the threshold for maximum connections on the system. I have set mine both to 2048.

      • No problem. Thanks for the reply. Hope you get well soon. :-)

  10. [...] have been reading TCP tuning guides on the web. Most of them suggest tweaking tcp parameters. Specifically, they recommend increasing values for net.inet.tcp.sendspace [...]

  11. This post has generated a fair amount of feedback, due to issues encountered with various applications. Unfortunately, I was mostly offline and unable to focus on responding for the past 6 or 7 months because of a serious medical issue I was dealing with. I have finally updated this post to address observed local application connectivity failure as a result of a couple settings. I believe that most issues were caused by aggressive TCP keepalive timers. I have not been able to recreate any of the previous strange side effects, yet. So far, so good. As always, your feedback is welcome. Thank you.

  12. Rolande,
    Thank you for a very informative article. As you may be aware, P2P clients can occasionally put OS X into system freeze. Other than your references above, do you have any other suggestions for settings that might reduce the possibility of this unpleasant occurrence.
    Also, I wonder if certain settings such as “net.inet.tcp.send/recvspace” are, in at least some contexts, more like a soft limit only. While using P2P, for example, “getsockopt” returns values double and quadruple the defaults shown in “sysctl -a” for “tcp.send/recvspace”. Also the default value “kern.ipc.maxsockbuf: 4194304” could easily be exceeded when using P2P but perhaps that is just a soft limit as well? Do you have any thoughts on the responsibility of the application code to try to limit socket buffer size via “setsockopt”? Hope at least some of this makes sense! :)

    • I don’t believe those are soft limits from a network stack perspective. If you allow the tcp.send/recvspace values to exceed the maxsockbuf in aggregate it will cause your system to puke. Typically it is the sheer volume of TCP connections generated by a P2P client that causes the system trouble, in that it will eat up a decent amount of memory and cause the machine to chew on the CPU to perform all of the connection management. I would tune the P2P client to be less aggressive by setting reasonable connection limits and not try to use system settings to account for the P2P client’s aggressive behavior.
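As a rough illustration of the two ideas in this exchange, the Python sketch below checks the rule of thumb from the reply (send and receive defaults together should stay within kern.ipc.maxsockbuf) and shows how an application can read and request per-socket buffer sizes. The function names and the sample numbers are hypothetical, and the size the kernel actually grants after setsockopt depends on the operating system, so no particular output is promised.

```python
import socket


def buffers_fit(sendspace, recvspace, maxsockbuf):
    """Return True when the two TCP defaults together stay within maxsockbuf."""
    return sendspace + recvspace <= maxsockbuf


def socket_buffer_sizes(sock):
    """Return (send, receive) buffer sizes the kernel reports for this socket."""
    snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return snd, rcv


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(buffers_fit(1048576, 1048576, 4194304))

    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("before:", socket_buffer_sizes(s))
        # An application may ask for a different size; the kernel decides
        # what it actually grants, so read the value back afterwards.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 262144)
        print("after: ", socket_buffer_sizes(s))
```

Reading the value back after setsockopt is the only reliable way to see what a given socket ended up with, which may explain why the numbers seen from inside an application differ from the sysctl defaults.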

  13. Hi, are those settings also suitable for direct connections via crossover cable? I have my Mac Pro connected directly to a PC, both with gigabit LAN, and I want to get the best performance out of it. Is there a way to reduce CPU usage for network connections?

    • These settings will work for a LAN only connection. They aren’t 100% optimal but they are definitely better than the defaults out of the box. If you want to squeeze all the juice you can out of a high speed LAN link, there are a number of things you can do, however, the trade off is that if you need to access a slower broadband Internet connection using that same high speed interface, your performance can and will suffer. The key is to optimize performance on a single interface to cover the majority of traffic scenarios.

  14. The link to the full discussion and tutorial on the topic is dead. Fortunately I was able to retrieve a copy using the WayBack machine. Here’s a link to that:
    > http://web.archive.org/web/20110611213330/http://macgeekery.com/tips/configuration/mac_os_x_network_tuning_guide_revisited

    • Martin, the link you posted is not looking very alive from here either. I eventually get timed out.

      • Sorry about that. I just put the url from your link into the Wayback Machine’s form and took the most recent thing it had (May 27, 2011). The Wayback Machine’s website is: http://archive.org/web/web.php

      • Interesting, I just tried the link in the quote of my message and it worked fine.

        I left a reply on your web page detailing how I arrived at it.

        LMK if you’d like a PDF I made of it.

      • Worked now when I just tried it again. Thanks for the link. I’ll update. That is too bad that Adam’s blog went offline.

  15. Wow, your info has been far better than anything else I could find on the web, thanks!

    • Thank you. I’m glad you found it useful.

  16. hi rolande i was wondering if you could help me out.

    I’m having trouble with a slow transfer rate on Snow Leopard computers transferring files to and from a Linux server using SMB, AFP, and iperf3. One Snow Leopard computer will download a file at about 60MB/sec and upload at an inconsistent 30-55MB/sec (see note below). Another Snow Leopard computer will only get up to about 15-20MB/sec. A Mountain Lion computer (that is actually older than the other computers) will also transfer both ways at a consistent 60MB/sec. Transfers from a Mac to a Windows 7 computer run at about the same speeds as to the servers. Both Snow Leopard computers were installed from the same net restore image file. All speeds are about the same whether I use AFP or SMB. If I transfer between two Macs the rate goes up to about 105MB/sec. All Windows computers transfer at about 105MB/sec to the server and to other clients.

    When I run iperf3 I get about 70MB/sec to the server and 112MB/sec from either server. I haven’t gotten iperf(3) to work on the Windows computers (install issue), so I haven’t been able to test that. Do you know a good place to find iperf3 for Windows? I haven’t had any luck yet.

    I tried adding the sysctl.conf file; however, there was no change.

    The servers run Fedora 16 and 17 (two servers).

    Both have NIC teaming (round robin) configured in the OS.

    The Fedora 16 server has a dual-port Intel card installed. When I’m testing with that server, if I have both network ports connected I get the 30-55MB/sec on one of the Snow Leopard computers. If I unplug one network port the speed goes up to 60MB/sec.
    The network switch is a gigabit Netgear 48-port GS748Tv4.

    thanks,
    nate

  17. Hi Rolande, thank you for your very informative article! I’m looking forward to implementing these settings on our 10.6 production servers, after a lengthy troubleshooting process to find the “bottlenecks” that I’m seeing in application performance.

    I read your replies above, and I understand that we are seeking a balanced optimization for a slightly wider range of network topology and usage.

    I have been optimizing Apache, PHP and part of an application, and now that those are almost done, the benefit from this work is quite remarkable. Even though the application can be optimized more, network connections are the biggest issue now.

    The real bottleneck is that, using a web service such as http://gtmetrix.com, I find that the main Apache web server is taking 5-6 seconds to accept a connection! Safari’s blue loading bar moves 1-2 blocks to the right, then halts for 5-6 seconds, then boom, everything comes bursting down (access from the Internet, not intranet).

    Just wondering if you can comment on the following setup, and which parameters should be set more aggressively.

    * all macmini as servers, all running gigabit intranet
    * the NAT gateway is an AirPort Extreme
    * the minis provide various services to the Internet through the AE via the gigabit network; the Internet connection has a 100Mb uplink and downlink
    * the Ethernet packet size is set to the maximum allowable values on different servers

    • Typically the slow connection response on Apache is due to a reverse DNS lookup issue. You probably have Apache configured to log reverse lookups of connecting clients (the HostnameLookups directive). If reverse DNS fails, it takes 5 seconds for the lookup to time out before Apache will release the connection for service.

      As far as the rest of your configuration, you have a fairly beefy Internet connection at 100Meg. Is it a straight ethernet handoff or what is the connection? If it is straight ethernet, you can set the MSS to 1460 and MTU to 1500. If the connection is using anything like PPPoE or PPPoA, you will want to dial the MSS down to account for all of the WAN link overhead. At 100Meg, though, you won’t notice a significant jump in performance, unless you are pushing really large volumes of simultaneous smaller transactions.
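The MSS figures in this reply follow from simple header arithmetic, sketched below in Python. The sketch assumes IPv4 and TCP headers without options (20 bytes each) and the standard 8 bytes of PPPoE encapsulation overhead; other WAN encapsulations such as PPPoA have different overheads that are not modeled here, and the function names are hypothetical.

```python
IPV4_HEADER = 20     # bytes, assuming no IP options
TCP_HEADER = 20      # bytes, assuming no TCP options
PPPOE_OVERHEAD = 8   # bytes: 6-byte PPPoE header plus 2-byte PPP protocol field


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def pppoe_mtu(ethernet_mtu=1500):
    """MTU left for IP once PPPoE encapsulation is carried inside Ethernet."""
    return ethernet_mtu - PPPOE_OVERHEAD


if __name__ == "__main__":
    print(mss_for_mtu(1500))         # 1460 on a straight Ethernet handoff
    print(pppoe_mtu())               # 1492 over PPPoE
    print(mss_for_mtu(pppoe_mtu()))  # 1452 over PPPoE
```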

  18. Hi, thanks for this info. Unfortunately I am still on a 10.4.11 Tiger PPC G5 iMac. Do you have settings for that on a 10MBps DSL line? I sure would appreciate any insight and info you could provide, with much thanks!!

    • What I have posted should still work with OSX 10.4.11 Tiger on PPC, although I have not tested it. You can view the current settings using ‘sysctl -a’ and then just grep for whatever you are interested in. The kernel buffer and socket settings may be a little more conservative depending on how much memory you have on your machine.
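For anyone who would rather script the "sysctl -a, then grep" step, here is a small Python sketch. It assumes the output uses either the "name: value" form quoted elsewhere in this thread or a "name = value" form; the function names are made up for this example, and current_settings() only works on a machine that has the sysctl command.

```python
import subprocess


def parse_sysctl_output(text, needle=""):
    """Parse `sysctl -a` style text into a dict, keeping names that contain `needle`."""
    settings = {}
    for line in text.splitlines():
        # Try "name: value" first, then "name = value".
        for sep in (": ", " = "):
            name, found, value = line.partition(sep)
            if found and " " not in name:
                if needle in name:
                    settings[name] = value.strip()
                break
    return settings


def current_settings(needle=""):
    """Run `sysctl -a` and return the matching settings."""
    result = subprocess.run(
        ["sysctl", "-a"], capture_output=True, text=True, check=True
    )
    return parse_sysctl_output(result.stdout, needle)


if __name__ == "__main__":
    sample = "kern.ipc.maxsockbuf: 4194304\nkern.ipc.somaxconn: 128\n"
    print(parse_sysctl_output(sample, "kern.ipc"))
```

Comparing the result against the contents of sysctl.conf after a reboot is a quick way to see which values actually took effect.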

  19. Hi
    Thanks for this post.
    I am running OS X 10.7.5.
    Unfortunately, when I reboot the /etc/sysctl.conf is being ignored.
    When I query the settings using ‘sysctl -w’ they are different to the values in sysctl.conf.
    Are there any other steps?
    Thanks in advance.

    • I think you mean you queried using ‘sysctl -a’. ‘sysctl -w’ is used for overwriting existing values on the fly. Are you sure the file has the correct ownership and permissions? It should look like this when you run ‘ls -l /etc/sysctl.conf’:

      -rw-r--r-- 1 root wheel 440 Jun 11 2012 /etc/sysctl.conf

      If not, use ‘sudo chmod 644 /etc/sysctl.conf’ and ‘sudo chown root:wheel /etc/sysctl.conf’ to set it correctly.
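The same ownership and permission check can be scripted. The Python sketch below is only an illustration: the function name is hypothetical, and it assumes that root is uid 0 and that the wheel group is gid 0, which is the usual arrangement on OS X but worth confirming on your own system.

```python
import os
import stat


def check_sysctl_conf(path="/etc/sysctl.conf", mode=0o644, uid=0, gid=0):
    """Return a list of problems found; an empty list means the file looks right."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return [f"{path} does not exist"]

    problems = []
    actual_mode = stat.S_IMODE(st.st_mode)  # permission bits only
    if actual_mode != mode:
        problems.append(f"mode is {actual_mode:o}, expected {mode:o}")
    if st.st_uid != uid or st.st_gid != gid:
        problems.append(
            f"owner is uid {st.st_uid} gid {st.st_gid}, expected {uid}:{gid}"
        )
    return problems


if __name__ == "__main__":
    for problem in check_sysctl_conf():
        print(problem)
```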

      • Hi thanks for the speedy reply.

        Yes the ownership and permissions look right:
        $ ls -l /etc/sysctl.conf
        -rw-r--r-- 1 root wheel 25 Mar 15 11:39 /etc/sysctl.conf

        But after reboot, the settings do not reflect the values in sysctl.conf when I query them with ‘sysctl -a’.

      • Are you able to update the running settings using ‘sudo sysctl -w ‘? What OSX release are you on? I have not run into one that did not load a sysctl.conf file on boot if it was present with the right permissions and ownership and had correct syntax. OSX has supported this since like release 10.3. I have it working on my own Macbook running Mountain Lion and a work machine running Lion, as well as when I had previous releases like Leopard and Snow Leopard installed.

  20. Hello Rolande, very informative guide! I’m a Mac newbie, and just came across your page when I was looking for ways to improve my utorrent performance. Recently, after I upgraded OSX from 10.8.2 to 10.8.3, I noticed a significant drop in performance (namely, I can’t seem to connect to other peers even though the client says my connection is fine). So I decided to tweak the TCP performance a bit with “sudo sysctl -w kern.ipc.somaxconn=2048”, which I understand increases my TCP max connections from 128 to 2048, but it still doesn’t seem to work as well as before. May I have your thoughts on this one? Am I doing this right?

  21. I took a whack at optimizing a very slow Leopard machine yesterday [link attached] and while it showed some early and immediate improvement, this morning seems to be much less impressive. Also, snmp was not reporting the right (or any) values after I made some of these changes. Its registers may be getting stepped on by these changes. Even monitoring it directly (not through a graphing program like cacti or mrtg) shows the same problem.

    It looks like kern.ipc.somaxconn was set to 512 on the Leopard machine and only 1024 on the Mountain Lion system. It stands to reason that synchronizing them is just as important as optimizing them. So you may end up with performance that’s not the best possible but the best you can get.

    kern.ipc.nmbclusters in Leopard seems to be hard-coded at 32768, even after a reboot. Everything else seems to be modifiable.

    I use netstat -w1 -I en1 to monitor bandwidth during transfers: I don’t know if you have a better method.

    I’ll revisit some of this later and update my notes.

  22. Scott, have you repeated your tests on 10.8.4? Network performance is significantly slower there, and there are lots of new features:
    net.inet.tcp.lro: 1
    net.link.generic.system.rxpoll: 1
    net.inet.tcp.ack_prioritize: 1
    net.inet.tcp.doautorcvbuf: 1
    net.inet.tcp.doautosndbuf: 1
    net.link.generic.system.flow_advisory: 1

    Apart from that, since 10.8, Mac OS X has supported the QFQ packet scheduler. I am a newbie to Mac OS X. I have a kernel-mode driver that does a socket send, and it used to hang in 10.8.2. To work around that I used net.link.generic.system.flow_advisory = 0. But network performance still sucks.
    If you have explored anything in this area, please update us with your findings.

    • I have 10.8 running on my personal Macbook Pro. I have applied essentially the same settings on its interface. I have not stress tested it on the local segment for 100Meg or Gig but performance has been good when testing my broadband connection at 24Meg downstream. When you say performance sucks, describe what you are measuring and what methodology you are using.

  23. Sorry for the delay in replying. I am using a Mac Pro 4,1 and the performance issue is seen only with the built-in Intel NIC, which is using the QFQ scheduler. To reproduce the issue, I am using the netperf request-response test. When you have multiple sockets executing the request-response test [say 12], 3-4 of these 12 sockets will hang during sock_send (on 10.8.2, 10.8.3, 10.8.4). I reported a bug on Apple Bug Reporter for this and it has been fixed in 10.8.5. But now, on the 10.8.5 pre-release build, in the above scenario where we are sending data over multiple sockets, 3-4 sockets out of 12 will see significant performance degradation during socket send.

    What stress test do you normally run on the Mac?

  24. Thanks for the info – been looking for something like this for a while. Very much appreciated. Running on an iMac with 10.8.4. Some questions:

    First, there is no /etc/sysctl.conf file on my system. I didn’t see whether you mentioned if the file exists on yours. I’m assuming that even though it doesn’t exist by default, the file will be read on reboot and its values applied once I create it?

    I’ve applied the settings on a temporary basis using ‘sysctl -w’. I just set up a quick shell script to read the sysctl.conf file (which right now resides in my user directory; I just want to test without it becoming “permanent” by putting it into /etc right now). Anyway, one thing I noticed by doing this was an error trying to set kern.ipc.nmbclusters: “sysctl: oid ‘kern.ipc.nmbclusters’ is read only”. Going through the posts and your explanation, it sounds like this can only be set on a reboot. Someone else notes that this value seems to be hard-coded in Leopard. So, is this now a “tuneable” parameter in ML with a reboot?

    Other than that – all parameters were applied via my shell script. This way is kind of nice, as the sysctl command shows you the existing value, before setting it to the new value.

    So far, it does seem to have made an improvement in speed. I’m testing right now using a couple of online “speed tests” for U/L and D/L speeds (yeah, I know, REAL scientific ) – but it does seem to have made a roughly 20-30% increase in D/L speed with these same sites. I’ll continue checking through the day to see if results continue with those improvements – or they were an aberration for the 5 times I tried them.
    Thanks!
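The approach described in this comment, reading a private copy of sysctl.conf and applying each line with ‘sysctl -w’, might look roughly like the Python sketch below. It is a dry run that only builds the commands rather than executing them. The function names are hypothetical, and treating kern.ipc.nmbclusters as read-only at runtime is an assumption based on the error reported above.

```python
def read_sysctl_conf(text):
    """Return (name, value) pairs from sysctl.conf text, skipping blanks and # comments."""
    pairs = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        pairs.append((name.strip(), value.strip()))
    return pairs


def sysctl_commands(pairs, read_only=("kern.ipc.nmbclusters",)):
    """Build one `sysctl -w` command per setting, skipping names assumed read-only."""
    commands = []
    for name, value in pairs:
        if name in read_only:
            continue  # reported as read only at runtime in the comment above
        commands.append(["sudo", "sysctl", "-w", f"{name}={value}"])
    return commands


if __name__ == "__main__":
    conf = (
        "kern.ipc.maxsockbuf=4194304\n"
        "kern.ipc.somaxconn=2048\n"
        "kern.ipc.nmbclusters=2048\n"
    )
    for cmd in sysctl_commands(read_sysctl_conf(conf)):
        print(" ".join(cmd))  # dry run: print instead of executing
```

Printing the commands first makes it easy to review what would change before running anything with sudo.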

  25. Nice description of many of the sysctl options. One thing I’ve noticed is that I don’t have to change the value of net.inet.tcp.mssdflt, since my connections all appear to use the correct value of MSS (e.g. 1460 for an Ethernet MTU of 1500). There may be some circumstances where the small default is used, but I haven’t been able to create any with my testing with Wireshark. BTW, I’m running 10.8.5 on a Late 2008 Mac Pro.

  26. Still wrangling with this here. I was doing some file transfers between two gigabit-enabled machines through a gigabit switch and maxing out at 20 Mbits. Yuck. So I have messed with iperf a bit and found that I can get as much as 550 Mbits over the same wire. I assume some disk I/O was part of the other slow result, but that much? I was copying between two NFS-mounted disks exported from another gigabit-enabled device, so there is likely lots of overhead. But 20 Mbits? Yikes.

    • The settings I have posted here are definitely not designed for maximum or optimal throughput on a Gigabit network. However, by themselves, they should not create such poor throughput results as you are experiencing. I have been able to achieve over 45Mbps of throughput with these settings running on my Macbook with Mountain Lion using AT&T Uverse. I don’t have any network based file storage to do any real testing with locally at Gigabit speeds. However, I have done tests in the past and have found that much of the off the shelf NAS hardware provides pretty poor bulk throughput. They are typically I/O limited. I had a 1TB Buffalo Terastation that had 4 250GB drives running RAID5 in software and I could not get much over 85-88Mbps of throughput, even when connected using Gig. The primary limitation was that the setup was running RAID5 in software with no hardware acceleration and did not have much horsepower to begin with on the system. If I had disabled RAID5 and run it as one large disk volume, my guess is that I would likely have seen 2 to 3 times the performance.

      You mention you are copying between two NFS mounted disks. When you do this, all the data must be downloaded to the host from the source NFS mount and then uploaded to the destination NFS mount. That means you would be doubling up the traffic on the host interface. Are you sure that your switch is not enabling Ethernet flow control? It is likely that if you were to run a packet capture you would discover slow ACKs or TCP retransmissions and/or backoff occurring that artificially limit your throughput. It is not as much a network issue as it is something else artificially restricting TCP from making full use of the network. Another possibility is that you are overwhelming the hardware capability of your switch. Depending on the switch port layout and configuration, it is possible that you are overwhelming shared buffer memory and causing tail dropping or head of line blocking and forcing TCP to retransmit etc.

  27. It’s been a while since I worked with NFS, but in the old days the default datagram size it used was 2048 Bytes. This meant all the frames had to be fragmented, so it was useful to modify the NFS options to a smaller value that would fit in a 1500 Byte Ethernet frame. NFS also uses UDP by default, so TCP parameters don’t come into play.

    • There are other things with NFS to consider for throughput performance like the Read and Write size for the mounts. I would think on lower power hardware that NFS over TCP would be more reliable and give more consistent throughput than using UDP. It could be very easy on a fast enough network with UDP to overwhelm the receiving host and cause tail dropping. This would cause NFS to have to perform recovery. I think this would have to be slower than letting TCP take care of it if tuned properly.

      • I did some poking around and my knowledge of NFS is indeed quite old. The newer versions default to TCP now, for pretty much the same reasons you describe.

  28. I need to review this. There was some other stuff in the mix like where/how the disks themselves were housed. I have gotten sustained 100 mbit, up to 125, but that’s the best real world result I have seen. I suspect a lot of my issues are about disk speed.

  29. So here’s a puzzle. I fired up an old iMac G5 with a whopping 2G of RAM and am pulling a backup, so lots of bandwidth would be good. It maxes out at 2G of RAM but its kern.ipc.maxsockbuf is 8388608, double what my newer Macbook Pro i5 supports. The network throughput is maxing out at 48Mbits/second with no modifications or tweaks, which is less than the Macbook Pro can do over wireless N.

    So I wonder why the kern.ipc.maxsockbuf value is higher (I can’t set it that high on the newer machine and assumed it was limited by available RAM).

    • The kern.ipc.maxsockbuf value (surprise) is set at the kernel level. It can only be done at boot time. I believe it is calculated at boot time based on the hardware configuration and version of OSX running, unless it is overridden with your own static setting in sysctl.conf. The system has to figure out how to allocate the resources in an optimal manner. The method used to determine the best allocation has changed through version releases. Regardless, this memory allocation itself does not dictate network performance. Think of it more as just a governor on potential capacity. It simply provides the buffer capacity for the network stack to be able to achieve higher scale throughput. It does not mean that the system can or will make use of it.

      You have to look at all components from end to end to identify where the bottlenecks may be. What method of file transfer were you using? I assume this was all on your local LAN on the same switch? The host you were pulling the backup from was not doing anything else and has plenty of memory and its IP stack has been tuned similarly, as well? Do you know what read rate the disks on that host can sustain? Do you know what the sustained write rate is on the iMac G5? The odds are that the root cause lies with the disk throughput on either end. I assume you are Gig attached. Typically on a 100Meg or Gig LAN connection I would expect to see a ceiling more like 80Mbps on a single transfer with general consumer grade hardware. With my old NAS box using CIFS I could get around 88Mbps on Gig connectivity between it and my Macbook Pro. That is what you get with software based RAID5 with typical consumer grade drives without a real hardware controller and dedicated memory.

  30. I think that’s right, that kern.ipc.maxsockbuf is determined by the OS/kernel, which is why I was surprised to see a figure on a 2GB G5 that was double what I found on my 4GB Intel i5.

    So I can fiddle around with iperf all day long but I want to find a real test of throughput. Wire speed is all very well, that means my cables are good (made ‘em myself), my hardware and software is good, but my guess is disk speed is always going to cut into those benefits. If I push a 1 Gb file over my network, I get 8.4 Mbits/sec, about 2:33 according to time(1). That’s over wireless N. Wired seems even slower, clocking in at 7.1. Mounting a volume from a time capsule (no wireless, its radio is off) and copying a file there took about half the time: 1:23. So the time capsule must write faster, I guess. The disk used in the first tests is only UDMA100, 1.5 Gb/s.

    This through a switch to the time capsule. If I take the switch out, I get right around 1 minute to copy that single file. No idea how the TCP stack on the time capsule is set up.

    Not very scientific but reasonably “real world” I think.

    If 80Mbits is the ceiling, I guess I’m doing OK, since I can hit that and pass it at times. I had a long restore job that ran up to 114Mbits with sustained speeds of over 100 for several hours.

  31. And now, to complicate matters, the gigabit card I just added to the mix seems to have crapped out (got it from a recycler so not 100% sure how good it was. It had been dropping in and out so maybe it wasn’t the driver as I suspected). So I dropped back to the onboard 100Mbit and it’s just about the same, as far as actual file transfers go. It won’t shine in iperf with 600 Mbit wire speed but it moves actual data at the same rate as the gigabit card did.

  32. Last comment on this, I think. I have confirmed to myself that disk I/O is more of a bottleneck than anything else in my case. I did a massive file copy (lots of files of all sizes copied via rsync) and got sustained copy speeds of almost 60 Mbits over the old RealTek interface (built into the main board, 100 Mbits) via wireless N. Then through a 3Com gigabit card, I was able to get more than 80 Mbits. Not an elegant test, as I was writing to an external disk connected over 480 Mbit USB in both cases. It claimed to support 40 MB/sec speed (da0: 40.000MB/s transfers). I am reading that as megabytes.

    Sadly, it either overruns the switch and causes it to hang or something goes wrong with the card, as the transfer breaks when running over gigabit.

    So I have all the speed I’m gonna get out of all this old junk ;-)

    The takeaways for me are:
    • that you need large volumes of data, so you spend more time transferring and less time setting up and tearing down connections. Even if that’s not the intended workload, you can find the carrying capacity that way.
    • fewer components are better
    • good data collection (SNMP, graphed by mrtg or cacti/rrdtool) is useful.
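Since the last few comments move back and forth between megabits and megabytes, a small Python sketch of the conversions may help when comparing numbers. It assumes decimal units (1 MB = 1,000,000 bytes) and, for the timing example, that the test file was about 1 GB (10^9 bytes); both are assumptions for illustration, not measurements from the thread, and the function names are hypothetical.

```python
def mbytes_to_mbits(mbytes_per_sec):
    """Convert megabytes per second to megabits per second."""
    return mbytes_per_sec * 8


def parse_duration(text):
    """Convert an 'M:SS' duration such as '2:33' to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def throughput_mbits(size_bytes, seconds):
    """Average throughput in megabits per second for a transfer."""
    return size_bytes * 8 / seconds / 1_000_000


if __name__ == "__main__":
    # A disk rated at 40 MB/s can absorb at most 320 Mbit/s.
    print(mbytes_to_mbits(40))
    # Assuming a 1 GB file copied in 2:33 (153 seconds).
    elapsed = parse_duration("2:33")
    print(round(throughput_mbits(10**9, elapsed), 1))
```

Working from file size and elapsed time avoids any doubt about which unit a tool is reporting.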

  33. Scott,

    socket threshold is no longer available in ML & Mavericks. Does it have a replacement?

  34. Hi Scott,

    thanks for your great blog article, it works great!
    Just a question:
    We have several OS X virtual machines on NFS storage. Do you know how we can increase the disk timeout inside OS X? Is there also a sysctl parameter?

    Thanks and best regards
    Reto

  35. FYI

    Setting this to:

    net.inet.udp.recvspace=4194304

    Will give better connection to UDP Trackers under “Transmission”

  36. Hi Scott!

    How are you? Any update on OS X Mavericks optimization?

    Thanks for this awesome work u r doing!

