
       #include <sys/socket.h>
       #include <netinet/in.h>
       #include <netinet/tcp.h>
       tcp_socket = socket(PF_INET, SOCK_STREAM, 0);


       This  is  an implementation of the TCP protocol defined in
       RFC793, RFC1122 and RFC2001  with  the  NewReno  and  SACK
       extensions.  It provides a reliable, stream oriented, full
       duplex connection between two sockets on top of ip(7), for
       both  v4  and  v6  versions.  TCP guarantees that the data
       arrives in order and retransmits lost packets.  It  gener­
       ates  and  checks a per packet checksum to catch transmis­
       sion errors.  TCP does not preserve record boundaries.

       A fresh TCP socket has no remote or local address  and  is
       not fully specified.  To create an outgoing TCP connection
       use connect(2) to establish a connection  to  another  TCP
       socket.   To  receive new incoming connections bind(2) the
       socket first to a local address and  port  and  then  call
       listen(2)  to  put the socket into listening state.  After
       that a new socket for  each  incoming  connection  can  be
       accepted  using  accept(2).  A socket which has had accept
       or connect successfully called on it  is  fully  specified
       and may transmit data.  Data cannot be transmitted on lis­
       tening or not yet connected sockets.

       Linux supports RFC1323 TCP  high  performance  extensions.
       These  include Protection Against Wrapped Sequence Numbers
       (PAWS), Window Scaling  and  Timestamps.   Window  scaling
       allows  the  use  of large (> 64K) TCP windows in order to
       support links with high latency or bandwidth.  To make use
       of  them,  the  send  and  receive  buffer  sizes  must be
       increased.   They   can   be   set   globally   with   the
       net.ipv4.tcp_wmem  and net.ipv4.tcp_rmem sysctl variables,
       or on  individual  sockets  by  using  the  SO_SNDBUF  and
       SO_RCVBUF socket options with the setsockopt(2) call.

       The  maximum  sizes  for  socket  buffers declared via the
       SO_SNDBUF and SO_RCVBUF  mechanisms  are  limited  by  the
       global  net.core.rmem_max  and  net.core.wmem_max sysctls.
       Note that TCP actually allocates twice  the  size  of  the
       buffer  requested in the setsockopt(2) call, and so a suc­
       ceeding getsockopt(2) call will not return the  same  size
       of  buffer  as  requested  in the setsockopt(2) call.  TCP
       uses this for administrative purposes and internal  kernel
       structures,  and  the  sysctl variables reflect the larger
       sizes compared to the actual TCP windows.   On  individual
       connections,  the  socket buffer size must be set prior to
       the listen() or connect() calls in order to have  it  take
       effect. See socket(7) for more information.

       Linux 2.4 introduced  a  number  of  changes  for  improved
       throughput and scaling, as well as enhanced functionality.
       Some of these features include support for zerocopy  send­
       file(2),  Explicit Congestion Notification, new management
       of TIME_WAIT sockets, keep-alive socket options  and  sup­
       port for Duplicate SACK extensions.


       TCP  is  built on top of IP (see ip(7)).  The address for­
       mats defined by ip(7) apply to  TCP.   TCP  only  supports
       point-to-point  communication; broadcasting and multicast­
       ing are not supported.


       These    variables    can    be    accessed     by     the
       /proc/sys/net/ipv4/*  files  or  with the sysctl(2) inter­
       face.  In addition, most IP sysctls also apply to TCP; see

              Enable  resetting connections if the listening ser­
              vice is too slow and unable to keep up  and  accept
              them.  It is not enabled by default, which means
              that a connection can still recover if the overflow
              was caused by a burst.  Enable this option _only_ if
              you are really sure that the listening daemon cannot
              be tuned to accept connections faster.  Enabling
              this option can harm the clients of your server.

              Count         buffering         overhead         as
              bytes/2^tcp_adv_win_scale  (if  tcp_adv_win_scale >
              0) or bytes-bytes/2^(-tcp_adv_win_scale), if it  is
              <= 0. The default is 2.

              The  socket  receive buffer space is shared between
              the application and kernel.  TCP maintains part of
              the buffer as the TCP window; this is the size of
              the receive window advertised to the other end.
              The  rest of the space is used as the "application"
              buffer, used to isolate the network from scheduling
              and  application  latencies.  The tcp_adv_win_scale
              default value of 2 implies that the space used  for
              the  application  buffer  is one fourth that of the

              This variable defines how many  bytes  of  the  TCP
              window are reserved for buffering overhead.

              The larger of window/2^tcp_app_win and mss bytes in
              the window is reserved for the application buffer.

              Enable  TCP Forward Acknowledgement support.  It is
              enabled by default.

              How many seconds to wait for  a  final  FIN  packet
              before  the  socket  is  forcibly  closed.  This is
              strictly a violation of the TCP specification,  but
              required   to   prevent   denial-of-service   (DoS)
              attacks.  The default value in 2.4 kernels  is  60,
              down from 180 in 2.2.

              The   number  of  seconds  between  TCP  keep-alive
              probes.  The default value is 75 seconds.

              The maximum number of TCP keep-alive probes to send
              before  giving  up and killing the connection if no
              response is  obtained  from  the  other  end.   The
              default value is 9.

              The number of seconds a connection needs to be idle
              before TCP begins sending  out  keep-alive  probes.
              Keep-alives  are  only  sent  when the SO_KEEPALIVE
              socket option is enabled.   The  default  value  is
              7200 seconds (2 hours).  An idle connection is
              terminated after approximately an additional 11
              minutes (9 probes at an interval of 75 seconds)
              when keep-alive is enabled.

              Note that underlying connection tracking mechanisms
              and application timeouts may be much shorter.

              The maximum number of orphaned (not attached to any
              user file handle) TCP sockets allowed in  the  sys­
              tem.   When  this  number is exceeded, the orphaned
              connection is reset and a warning is printed.  This
              limit  exists  only  to prevent simple DoS attacks.
              Lowering this limit  is  not  recommended.  Network
              conditions might require you to increase the number
              of orphans allowed, but note that each  orphan  can
              eat  up to ~64K of unswappable memory.  The default
              initial value is set equal to the kernel  parameter
              NR_FILE.   This initial default is adjusted depend­
              ing on the memory in the system.

              The maximum number of  queued  connection  requests
              which  have  still  not received an acknowledgement
              prevent  simple  DoS attacks.  The default value of
              NR_FILE*2 is adjusted depending on  the  memory  in
              the system.  If this number is exceeded, the socket
              is closed and a warning is printed.

              This is a vector of  3  integers:  [low,  pressure,
              high].   These  bounds are used by TCP to track its
              memory usage.  The defaults are calculated at  boot
              time from the amount of available memory.

              low  -  TCP  doesn't regulate its memory allocation
              when the number of pages it has allocated  globally
              is below this number.

              pressure  -  when the amount of memory allocated by
              TCP exceeds this number of pages, TCP moderates its
              memory  consumption.  This memory pressure state is
              exited once the number  of  pages  allocated  falls
              below the low mark.

              high  - the maximum number of pages, globally, that
              TCP will allocate.  This value overrides any  other
              limits imposed by the kernel.

              The  maximum  number  of attempts made to probe the
              other end of a connection which has been closed  by
              our end.  The default value is 8.

              The  maximum  a  packet  can  be reordered in a TCP
              packet stream without TCP assuming packet loss  and
              going  into  slow  start.  The default is 3.  It is
              not advisable to change this  number.   This  is  a
              packet reordering detection metric designed to min­
              imize unnecessary back off and retransmits provoked
              by reordering of packets on a connection.

              Try  to  send full-sized packets during retransmit.
              This is enabled by default.

              The number of times TCP will attempt to  retransmit
              a  packet  on  an  established connection normally,
              without the extra effort  of  getting  the  network
              layers  involved.   Once  we  exceed this number of
              retransmits, we first have the network layer update
              the  route  if possible before each new retransmit.
              The default is the RFC specified minimum of 3.

              of the TIME_WAIT period.

              This is a vector  of  3  integers:  [min,  default,
              max].  These parameters are used by TCP to regulate
              receive buffer sizes.  TCP dynamically adjusts  the
              size of the receive buffer from the defaults listed
              below, in the  range  of  these  sysctl  variables,
              depending on memory available in the system.

              min  -  minimum  size of the receive buffer used by
              each TCP socket.  The default value is 4K,  and  is
              lowered  to  PAGE_SIZE bytes in low memory systems.
              This value is used to ensure that in  memory  pres­
              sure  mode,  allocations below this size will still
              succeed.  This is not used to bound the size of the
              receive buffer declared using SO_RCVBUF on a socket.

              default - the default size of  the  receive  buffer
              for  a  TCP socket.  This value overwrites the ini­
              tial default buffer size from  the  generic  global
              net.core.rmem_default  defined  for  all protocols.
              The default value is 87380 bytes, and is lowered to
              43689  in  low  memory  systems.  If larger receive
              buffer sizes are  desired,  this  value  should  be
              increased (to affect all sockets).  To employ large
              TCP windows, the  net.ipv4.tcp_window_scaling  must
              be enabled (default).

              max  -  the maximum size of the receive buffer used
              by each TCP socket.  This value does  not  override
              the  global net.core.rmem_max.  This is not used to
              limit the size of the receive buffer declared using
              SO_RCVBUF  on  a  socket.   The  default  value  of
              87380*2 bytes is lowered to 87380 in low memory systems.

              Enable  RFC2018 TCP Selective Acknowledgements.  It
              is enabled by default.

              Enable the strict RFC793 interpretation of the  TCP
              urgent-pointer  field.   The  default is to use the
              BSD-compatible  interpretation   of   the   urgent-
              pointer,  pointing  to  the  first  byte  after the
              urgent data.  The RFC793 interpretation is to  have
              it point to the last byte of urgent data.  Enabling
              this option may lead to interoperability problems.

              such  as TCP extensions.  It can cause problems for
              clients and relays.  It is  not  recommended  as  a
              tuning mechanism for heavily loaded servers to help
              with overloaded or misconfigured  conditions.   For
              recommended  alternatives  see tcp_max_syn_backlog,
              tcp_synack_retries, tcp_abort_on_overflow.

              The maximum number of times  initial  SYNs  for  an
              active  TCP  connection attempt will be retransmit­
              ted.  This value should not  be  higher  than  255.
              The  default  value  is  5,  which  corresponds  to
              approximately 180 seconds.

              Enable RFC1323 TCP timestamps.  This is enabled by default.

              Enable  fast recycling of TIME-WAIT sockets.  It is
              not enabled by default.  Enabling  this  option  is
              not  recommended  since  this  causes problems when
              working with NAT (Network Address Translation).

              Enable RFC1323 TCP window scaling.  It  is  enabled
              by default.  This feature allows the use of a large
              window (> 64K) on  a  TCP  connection,  should  the
              other  end support it.  Normally, the 16 bit window
              length field in the TCP header  limits  the  window
              size to less than 64K bytes.  If larger windows are
              desired, applications  can  increase  the  size  of
              their  socket buffers and the window scaling option
              will be employed.  If  tcp_window_scaling  is  dis­
              abled,  TCP  will  not  negotiate the use of window
              scaling with the other end during connection setup.

              This  is  a  vector  of  3 integers: [min, default,
              max].  These parameters are used by TCP to regulate
              send  buffer  sizes.   TCP  dynamically adjusts the
              size of the send buffer  from  the  default  values
              listed  below,  in  the range of these sysctl vari­
              ables, depending on memory available.

              min - minimum size of the send buffer used by  each
              TCP  socket.   The default value is 4K bytes.  This
              value is used to ensure  that  in  memory  pressure
              mode,  allocations  below this size will still suc­
              ceed.  This is not used to bound the  size  of  the
              send buffer declared using SO_SNDBUF on a socket.

              SO_SNDBUF  on  a socket.  The default value is 128K
              bytes.  It is lowered to 64K depending on the  mem­
              ory available in the system.


       To  set  or get a TCP socket option, call getsockopt(2) to
       read or setsockopt(2) to write the option with the  option
       level  argument  set to SOL_TCP.  In addition, most SOL_IP
       socket options are valid on TCP sockets. For more informa­
       tion see ip(7).

              If  set, don't send out partial frames.  All queued
              partial frames are sent when the option is  cleared
              again.   This  is  useful  for  prepending  headers
              before calling sendfile(2), or for throughput opti­
              mization.   This  option  cannot  be  combined with
              TCP_NODELAY.  This option should  not  be  used  in
              code intended to be portable.

              Allows  a  listener  to  be awakened only when data
              arrives on the socket.  Takes an integer value
              (seconds); this can bound the maximum number of
              attempts TCP will make to complete the connection.
              This  option should not be used in code intended to
              be portable.

              Used to collect information about this socket.  The
              kernel  returns a struct tcp_info as defined in the
              file /usr/include/linux/tcp.h.  This option  should
              not be used in code intended to be portable.

              The  maximum  number of keepalive probes TCP should
              send before dropping the connection.   This  option
              should not be used in code intended to be portable.

              The time  (in  seconds)  the  connection  needs  to
              remain  idle  before  TCP  starts sending keepalive
              probes, if the socket option SO_KEEPALIVE has  been
              set on this socket.  This option should not be used
              in code intended to be portable.

              The time (in seconds) between individual  keepalive
              probes.   This  option  should  not be used in code
              intended to be portable.

              mum bounds over the value provided.

              If  set,  disable  the Nagle algorithm.  This means
              that segments are always sent as soon as  possible,
              even if there is only a small amount of data.  When
              not set, data is buffered until there is  a  suffi­
              cient amount to send out, thereby avoiding the fre­
              quent sending of small packets,  which  results  in
              poor  utilization of the network.  This option can­
              not  be  used  at  the  same  time  as  the  option

              Enable quickack mode if set, or disable quickack
              mode if cleared.  In quickack mode, acks are sent
              immediately, rather than delayed if needed in
              accordance with normal TCP operation.  This flag is
              not permanent; it only enables a switch to or from
              quickack mode.  Subsequent operation of the TCP
              protocol will once again enter or leave quickack
              mode depending on internal protocol processing and
              factors such as delayed ack timeouts occurring and
              data transfer.  This option should not be used in
              code intended to be portable.

              Set  the  number of SYN retransmits that TCP should
              send before aborting the attempt  to  connect.   It
              cannot  exceed 255.  This option should not be used
              in code intended to be portable.

              Bound the size of the  advertised  window  to  this
              value.   The  kernel  imposes  a  minimum  size  of
              SOCK_MIN_RCVBUF/2.  This option should not be  used
              in code intended to be portable.


       These  ioctls can be accessed using ioctl(2).  The correct
       syntax is:

              int value;
              error = ioctl(tcp_socket, ioctl_type, &value);

              Returns the amount of queued  unread  data  in  the
              receive  buffer.  Argument is a pointer to an inte­
              ger.  The socket must not be in LISTEN state,  oth­
              erwise an error (EINVAL) is returned.

       ETIMEDOUT or the last received error on this connection is

       Some applications require a  quicker  error  notification.
       This  can  be  enabled  with  the  SOL_IP level IP_RECVERR
       socket option.  When this option is enabled, all  incoming
       errors  are  immediately  passed to the user program.  Use
       this option with care - it  makes  TCP  less  tolerant  to
       routing changes and other normal network conditions.


       When an error occurs during connection setup while a socket
       write is in progress, SIGPIPE is raised only when the
       SO_KEEPALIVE socket option is set.

       TCP  has  no real out-of-band data; it has urgent data. In
       Linux this means that if the other end sends newer out-of-band
       data, the older urgent data is inserted as normal data into
       the stream (even when SO_OOBINLINE is not set).  This differs
       from BSD-based stacks.

       Linux uses the BSD compatible interpretation of the urgent
       pointer field by default.  This violates RFC1122,  but  is
       required  for  interoperability with other stacks.  It can
       be changed by the tcp_stdurg sysctl.


       EPIPE  The other end closed the socket unexpectedly  or  a
              read is executed on a shut down socket.

              The other end didn't acknowledge retransmitted data
              after some time.

              Passed socket address type in  sin_family  was  not

       Any  errors  defined for ip(7) or the generic socket layer
       may also be returned for TCP.


       Not all errors are documented.
       IPv6 is not described.


       Support for  Explicit  Congestion  Notification,  zerocopy
       sendfile,  reordering  support  and  some  SACK extensions
       (DSACK) were  introduced  in  2.4.   Support  for  forward
       acknowledgement  (FACK),  TIME_WAIT recycling, per connec­
       tion keepalive socket options and sysctls were  introduced
       in 2.3.

       RFC793 for the TCP specification.
       RFC1122  for the TCP requirements and a description of the
       Nagle algorithm.
       RFC1323 for TCP timestamp and window scaling options.
       RFC1644 for a description of TIME_WAIT assassination hazards.
       RFC2481 for a description of Explicit Congestion Notification.
       RFC2581 for TCP congestion control algorithms.
       RFC2018 and RFC2883 for SACK and extensions to SACK.

Linux Man Page              2003-08-21                     TCP(7)
