[Chapter 3] 3.4 Mail Services

3.4 Mail Services

Users consider electronic mail the most important network service because they use it for interpersonal communications. Some applications are newer and fancier. Other applications consume more network bandwidth. Others are more important for the continued operation of the network. But email is the application people use to communicate with each other. It isn't very fancy, but it's vital.

TCP/IP provides a reliable, flexible email system built on a few basic protocols: Simple Mail Transfer Protocol (SMTP), Post Office Protocol (POP), and Multipurpose Internet Mail Extensions (MIME). There are other TCP/IP mail protocols. Interactive Mail Access Protocol (IMAP), defined in RFC 1176, is an interesting protocol designed to supplant POP. It provides remote text searches and message parsing features not found in POP. We will touch only briefly on IMAP. It and other protocols have some very interesting features, but they are not yet widely implemented.

Our coverage concentrates on the three protocols you are most likely to use building your network: SMTP, POP, and MIME. We start with SMTP, the foundation of all TCP/IP email systems.

3.4.1 Simple Mail Transfer Protocol

SMTP is the TCP/IP mail delivery protocol. It moves mail across the Internet and across your local network. SMTP is defined in RFC 821, A Simple Mail Transfer Protocol. It runs over the reliable, connection-oriented service provided by Transmission Control Protocol (TCP), and it uses well-known port number 25. [7] Table 3.1 lists some of the simple, human-readable commands used by SMTP.

[7] Most standard TCP/IP applications are assigned a well-known port in the Assigned Numbers RFC, so that remote systems know how to connect to the service.

Table 3.1: SMTP Commands
Command	Syntax	Function
Hello	HELO <`sending-host`>	Identify sending SMTP
From	MAIL FROM:<`from-address`>	Sender address
Recipient	RCPT TO:<`to-address`>	Recipient address
Data	DATA	Begin a message
Reset	RSET	Abort a message
Verify	VRFY <`string`>	Verify a username
Expand	EXPN <`string`>	Expand a mailing list
Help	HELP [`string`]	Request online help
Quit	QUIT	End the SMTP session

SMTP is such a simple protocol you can literally do it yourself. telnet to port 25 on a remote host and type mail in from the command line using the SMTP commands. This technique is sometimes used to test a remote system's SMTP server, but we use it here to illustrate how mail is delivered between systems. The example below shows mail manually input from Daniel on peanut.nuts.com to Tyler on almond.nuts.com.

% telnet almond.nuts.com 25
Trying 172.16.12.1 ...
Connected to almond.nuts.com.
Escape character is '^]'.
220 almond Sendmail 4.1/1.41 ready at Tue, 29 Mar 94 17:21:26 EST
helo peanut.nuts.com
250 almond Hello peanut.nuts.com, pleased to meet you
mail from:<daniel@peanut.nuts.com>
250 <daniel@peanut.nuts.com>... Sender ok
rcpt to:<tyler@almond.nuts.com>
250 <tyler@almond.nuts.com>... Recipient ok
data
354 Enter mail, end with "." on a line by itself
Hi Tyler!
.
250 Mail accepted
quit
221 almond delivering mail
Connection closed by foreign host.

The user types the telnet command, the SMTP commands (helo, mail from, rcpt to, data, and quit), and the message text. All of the other lines are output from the system. This example shows how simple it is. A TCP connection is opened. The sending system identifies itself. The From address and the To address are provided. The message transmission begins with the DATA command and ends with a line that contains only a period (.). The session terminates with a QUIT command. Very simple, and very few commands are used.
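The command sequence in the session above can be sketched in a few lines of Python. This is only an illustration, not a mail client: the function name smtp_commands is made up for this example, nothing is sent over the network, and the doubling of a leading period is an assumption taken from RFC 821 rather than from the session shown here.

```python
def smtp_commands(helo_host, sender, recipient, body_lines):
    """Return, in order, the lines a client would send for one message."""
    commands = [
        f"HELO {helo_host}",           # identify the sending system
        f"MAIL FROM:<{sender}>",       # the From address
        f"RCPT TO:<{recipient}>",      # the To address
        "DATA",                        # message text follows
    ]
    for line in body_lines:
        # Assumption from RFC 821: a body line that starts with a period
        # gets an extra period so it is not mistaken for the end marker.
        commands.append("." + line if line.startswith(".") else line)
    commands.append(".")               # a line with only a period ends the message
    commands.append("QUIT")            # end the session
    return commands


session = smtp_commands(
    "peanut.nuts.com",
    "daniel@peanut.nuts.com",
    "tyler@almond.nuts.com",
    ["Hi Tyler!"],
)
for line in session:
    print(line)
```

A real client would also read the numeric reply after each command and stop if the server reported an error.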

There are other commands (SEND, SOML, SAML, and TURN) defined in RFC 821 that are optional and not widely implemented. Even some of the commands that are implemented are not commonly used. The commands HELP, VRFY, and EXPN are designed more for interactive use than for the normal machine-to-machine interaction used by SMTP. The following excerpt from an SMTP session shows how these odd commands work.

HELP
214-Commands:
214-    HELO    MAIL    RCPT    DATA    RSET
214-    NOOP    QUIT    HELP    VRFY    EXPN
214-For more info use "HELP <topic>".
214-For local information contact postmaster at this site.
214 End of HELP info
HELP RSET
214-RSET
214-    Resets the system.
214 End of HELP info
VRFY <jane>
250 <jane@brazil.nuts.com>
VRFY <mac>
250 Kathy McCafferty <<mac>>
EXPN <admin>
250-<sara@pecan.nuts.com>
250-David Craig <<david>>
250 <tyler@nuts.com>

The HELP command prints out a summary of the commands implemented on the system. The HELP RSET command specifically requests information about the RSET command. Frankly, this help system isn't very helpful!

The VRFY and EXPN commands are more useful, but are often disabled for security reasons because they provide user account information that might be exploited by network intruders. The EXPN <admin> command asks for a listing of the email addresses in the mailing list admin, and that is what the system provides. The VRFY command asks for information about an individual instead of a mailing list. In the case of the VRFY <mac> command, mac is a local user account and the user's account information is returned. In the case of VRFY <jane>, jane is an alias in the /etc/aliases file. The value returned is the email address for jane found in that file. The three commands in this example are interesting, but rarely used. SMTP depends on the other commands to get the real work done.
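The replies in the excerpt above follow the SMTP reply convention: a three-digit code, then a hyphen on every line except the last, which has a space instead. The sketch below groups such lines into one reply. The function name parse_reply is hypothetical, and the sample lines are copied from the HELP RSET exchange.

```python
def parse_reply(lines):
    """Collect one SMTP reply from a list of lines.

    Returns (code, text_lines). A hyphen after the three-digit code marks a
    continuation line; anything else marks the final line of the reply.
    """
    if not lines:
        raise ValueError("no reply lines given")
    code = lines[0][:3]
    texts = []
    for line in lines:
        this_code, separator, text = line[:3], line[3:4], line[4:]
        if this_code != code:
            raise ValueError(f"reply code changed from {code} to {this_code}")
        texts.append(text)
        if separator != "-":   # a space (or nothing) ends the reply
            break
    return int(code), texts


help_reply = [
    "214-RSET",
    "214-    Resets the system.",
    "214 End of HELP info",
]
code, texts = parse_reply(help_reply)
print(code, texts)
```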

SMTP provides direct end-to-end mail delivery. This is unusual. Most mail systems use store and forward protocols like UUCP and X.400 that move mail toward its destination one hop at a time, storing the complete message at each hop and then forwarding it on to the next system. The message proceeds in this manner until final delivery is made. Figure 3.3 illustrates both store and forward and direct delivery mail systems. The UUCP address clearly shows the path that the mail takes to its destination, while the SMTP mail address implies direct delivery. [8]

[8] The address doesn't have anything to do with whether or not a system is store and forward or direct delivery. It just happens that UUCP provides an address that helps to illustrate this point.

Figure 3.3: Mail delivery systems

Direct delivery allows SMTP to deliver mail without relying on intermediate hosts. If the delivery fails, the local system knows it right away. It can inform the user who sent the mail or queue the mail for later delivery without reliance on remote systems. The disadvantage of direct delivery is that it requires both systems to be fully capable of handling mail. Some systems cannot handle mail, particularly small systems such as PCs or mobile systems such as laptops. These systems are usually shut down at the end of the day and are frequently offline. Mail directed from a remote host fails with a "cannot connect" error when the local system is turned off or offline. To handle these cases, features of DNS are used to route the message to a mail server in lieu of direct delivery. The mail is then moved from the server to the client system when the client is back online. The protocol most TCP/IP networks use for this task is POP.

3.4.2 Post Office Protocol

There are two versions of POP in widespread use: POP2 and POP3. POP2 is defined in RFC 937 and POP3 is defined in RFC 1725. POP2 uses port 109 and POP3 uses port 110. These are incompatible protocols that use different commands, but they perform the same basic functions. The POP protocols verify the user's login name and password, and move the user's mail from the server to the user's local mail reader.

A sample POP2 session clearly illustrates how a POP protocol works. POP2 is a simple request/response protocol, and just as with SMTP, you can type POP2 commands directly into its well-known port (109) and observe their effect. In the example below, the user types the telnet command and the HELO, READ, RETR, ACKD, and QUIT commands:

% telnet almond.nuts.com 109
Trying 172.16.12.1 ...
Connected to almond.nuts.com.
Escape character is '^]'.
+ POP2 almond POP2 Server at Wed 30-Mar-94 3:48PM-EST
HELO hunt WatsWatt
#3  ...(From folder 'NEWMAIL')
READ
=496
RETR
{The full text of message 1}
ACKD
=929
RETR
{The full text of message 2}
ACKD
=624
RETR
{The full text of message 3}
ACKD
=0
QUIT
+OK POP2 Server exiting (0 NEWMAIL messages left)
Connection closed by foreign host.

The HELO command provides the username and password for the account of the mailbox that is being retrieved. (This is the same username and password used to log into the mail server.) In response to the HELO command the server sends a count of the number of messages in the mailbox, three (#3) in our example. The READ command begins reading the mail. RETR retrieves the full text of the current message. ACKD acknowledges receipt of the message and deletes it from the server. After each acknowledgment the server sends a count of the number of bytes in the new message. If the byte count is zero (=0) it indicates that there are no more messages to be retrieved and the client ends the session with the QUIT command. Simple! Table 3.2 lists the full set of POP2 commands.

Table 3.2: POP2 Commands
Command	Syntax	Function
Hello	HELO `user password`	Identify user account
Folder	FOLD `mail-folder`	Select mail folder
Read	READ [`n`]	Read mail, optionally start with message `n`
Retrieve	RETR	Retrieve message
Save	ACKS	Acknowledge and save
Delete	ACKD	Acknowledge and delete
Failed	NACK	Negative acknowledgement
Quit	QUIT	End the POP2 session
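The read-and-delete loop from the sample POP2 session can be sketched as follows. This is a simulation under stated assumptions: the function name pop2_fetch_all is invented for the example, and the byte_counts list stands in for the "=n" replies a real server would send, so no connection is made.

```python
def pop2_fetch_all(user, password, byte_counts):
    """Return the commands a client sends to read and delete every message.

    byte_counts simulates the "=n" replies the server gives to READ and to
    each ACKD. A count of 0 means there are no more messages.
    """
    commands = [f"HELO {user} {password}", "READ"]
    for count in byte_counts:
        if count == 0:
            break                  # =0: nothing left to retrieve
        commands.append("RETR")    # retrieve the current message
        commands.append("ACKD")    # acknowledge receipt and delete it
    commands.append("QUIT")
    return commands


# Byte counts taken from the sample session: =496, =929, =624, =0
commands = pop2_fetch_all("hunt", "WatsWatt", [496, 929, 624, 0])
for line in commands:
    print(line)
```

Replacing ACKD with ACKS in the loop would leave the messages on the server instead of deleting them.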

The commands for POP3 are completely different from the commands used for POP2. Table 3.3 shows the set of POP3 commands defined in RFC 1725.

Table 3.3: POP3 Commands
Command	Function
USER `username`	The user's account name
PASS `password`	The user's password
STAT	Display the number of unread messages/bytes
RETR `n`	Retrieve message number `n`
DELE `n`	Delete message number `n`
LAST	Display the number of the last message accessed
LIST [`n`]	Display the size of message `n` or of all messages
RSET	Undelete all messages; reset message number to 1
TOP `n l`	Print the headers and `l` lines of message `n`
NOOP	Do nothing
QUIT	End the POP3 session

Despite the fact that these commands are different from those used by POP2, they can be used to perform similar functions. In the POP2 example we logged into the server and read and deleted three mail messages. Here's a similar session using POP3:

% telnet almond 110
Trying 172.16.12.1 ...
Connected to almond.nuts.com.
Escape character is '^]'.
+OK almond POP3 Server Process 3.3(1) at Mon 15-May-95 4:48PM-EDT
user hunt
+OK User name (hunt) ok. Password, please.
pass Watts?Watt?
+OK 3 messages in folder NEWMAIL (V3.3 Rev B04)
stat
+OK 3 459
retr 1
+OK 146 octets
  The full text of message 1
dele 1
+OK message # 1 deleted
retr 2
+OK 155 octets
  The full text of message 2
dele 2
+OK message # 2 deleted
retr 3
+OK 158 octets
  The full text of message 3
dele 3
+OK message # 3 deleted
quit
+OK POP3 almond Server exiting (0 NEWMAIL messages left)
Connection closed by foreign host.

Naturally you don't really type these commands in yourself, but experiencing hands-on interaction with SMTP and POP gives you a clearer understanding of what these programs do and why they are needed.
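As a rough sketch of what a mail reader does with the POP3 session above, the code below parses the STAT reply and builds the retrieve-and-delete commands. The function names parse_stat and fetch_and_delete_commands are hypothetical, and the reply string is copied from the sample session rather than read from a server.

```python
def parse_stat(reply):
    """Parse a STAT reply such as '+OK 3 459' into (messages, octets)."""
    if not reply.startswith("+OK"):
        raise ValueError(f"server reported an error: {reply}")
    _, count, octets = reply.split()[:3]
    return int(count), int(octets)


def fetch_and_delete_commands(message_count):
    """Return the commands that retrieve and then delete each message."""
    commands = []
    for n in range(1, message_count + 1):
        commands += [f"RETR {n}", f"DELE {n}"]
    commands.append("QUIT")
    return commands


count, octets = parse_stat("+OK 3 459")
commands = fetch_and_delete_commands(count)
for line in commands:
    print(line)
```

In practice you would not build these strings by hand. Python's standard poplib module, for example, wraps the same commands in methods such as user, pass_, stat, retr, dele, and quit.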

3.4.3 Multipurpose Internet Mail Extensions

The last email protocol on our quick tour is MIME. [9] As its name implies, Multipurpose Internet Mail Extensions is an extension of the existing TCP/IP mail system, not a replacement for it. MIME is more concerned with what the mail system delivers than it is with the mechanics of delivery. It doesn't attempt to replace SMTP or TCP; it extends the definition of what constitutes "mail."

[9] MIME is also an integral part of the Web and HTTP.

The structure of the mail message carried by SMTP is defined in RFC 822, Standard for the Format of ARPA Internet Text Messages. RFC 822 defines a set of mail headers that are so widely accepted they are used by many mail systems that do not use SMTP. This is a great benefit to email because it provides a common ground for mail translation and delivery through gateways to different mail networks. MIME extends RFC 822 into two areas not covered by the original RFC:

Support for various data types. The mail system defined by RFC 821 and RFC 822 transfers only 7-bit ASCII data. This is suitable for carrying text data composed of US ASCII characters, but it does not support several languages that have richer character sets and it does not support binary data transfer.
Support for complex message bodies. RFC 822 does not provide a detailed description of the body of an electronic message. It concentrates on the mail headers.

MIME addresses these two weaknesses by defining encoding techniques for carrying various forms of data, and by defining a structure for the message body that allows multiple objects to be carried in a single message. The RFC 1521, MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies, defines two headers that give structure to the mail message body and allow it to carry various forms of data. These are the Content-Type header and the Content-Transfer-Encoding header.

As the name implies, the Content-Type header defines the type of data being carried in the message. The header has a Subtype field that refines the definition. Many subtypes have been defined since the original RFC was released. A current list of MIME types can be obtained from the Internet. [10] The original RFC defines seven initial content types and a few subtypes:

[10] Go to ftp://ftp.isi.edu/in-notes/iana/assignments/media-types and retrieve the file media-types.

text: Text data. RFC 1521 defines text subtypes plain and richtext. Several subtypes have since been added, including enriched and html.
application: Binary data. The primary subtype defined in RFC 1521 is octet-stream, which indicates the data is a stream of 8-bit binary bytes. One other subtype, PostScript, is defined in the standard. Since then more than 90 subtypes have been defined. They specify binary data formatted for a particular application. For example, msword is an application subtype.
image: Still graphic images. Two subtypes are defined in RFC 1521: jpeg and gif. More than 10 additional subtypes have since been added, including widely used image data standards such as tiff, cgm, and g3fax.
video: Moving graphic images. The initially defined subtype was mpeg, which is a widely used standard for computer video data. A few others have since been added, including quicktime.
audio: Audio data. The only subtype initially defined for audio was basic, which means the sounds are encoded using pulse code modulation (PCM).
multipart: Data composed of multiple independent sections. A multipart message body is made up of several independent parts. RFC 1521 defines four subtypes. The primary subtype is mixed, which means that each part of the message can be data of any content type. Other subtypes are: alternative, meaning that the same data is repeated in each section in different formats; parallel, meaning that the data in the various parts is to be viewed simultaneously; and digest, meaning that each section is data of the type message. Several subtypes have since been added, including support for voice messages (voice-message) and encrypted messages.
message: Data that is an encapsulated mail message. RFC 1521 defines three subtypes. The primary subtype, rfc822, indicates that the data is a complete RFC 822 mail message. The other subtypes, partial and External-body, are both designed to handle large messages. partial allows large encapsulated messages to be split among multiple MIME messages. External-body points to an external source for the contents of a large message body, so that only the pointer, not the message itself, is contained in the MIME message. Two additional subtypes have been defined: news for carrying network news, and http for HTTP traffic formatted to comply with MIME content typing.
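To see the Content-Type header at work, the sketch below uses Python's standard email package to build a message with a text part and a binary attachment. The addresses are borrowed from the earlier SMTP example; the subject, the four-byte payload, and the filename data.bin are made up for illustration.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "daniel@peanut.nuts.com"
msg["To"] = "tyler@almond.nuts.com"
msg["Subject"] = "MIME example"            # hypothetical subject line

msg.set_content("Hi Tyler!")               # stored as a text/plain body

# Adding a binary attachment turns the message into multipart/mixed.
msg.add_attachment(
    bytes([0, 1, 2, 255]),                 # hypothetical binary payload
    maintype="application",
    subtype="octet-stream",
    filename="data.bin",                   # hypothetical filename
)

print(msg.get_content_type())
for part in msg.iter_parts():
    # Each part carries its own Content-Type and Content-Transfer-Encoding.
    print(part.get_content_type(), part["Content-Transfer-Encoding"])
```

Printing str(msg) shows the complete message, including the boundary lines that separate the parts.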

The Content-Transfer-Encoding header identifies the type of encoding used on the data. Traditional SMTP systems only forward 7-bit ASCII data with a line length of less than 1000 bytes. To ensure that the data from a MIME system is forwarded through gateways that may only support 7-bit ASCII, the data can be encoded. RFC 1521 defines six types of encoding. Some types are used to identify the encoding inherent in the data. Only two types are actual encoding techniques defined in the RFC. The six encoding types are:

7bit: US ASCII data. No encoding is performed on 7-bit ASCII data.
8bit: Octet data. No encoding is performed. The data is binary, but the lines of data are short enough for SMTP transport; i.e., the lines are fewer than 1000 bytes long.
binary: Binary data. No encoding is performed. The data is binary and the lines may be longer than 1000 bytes. There is no difference between binary and 8bit data except the line length restriction; both types of data are unencoded byte (octet) streams. MIME does not handle unencoded bitstream data.
quoted-printable: Encoded text data. This encoding technique handles data that is largely composed of printable ASCII text. The ASCII text is sent unencoded, while bytes with a value greater than 127 or less than 33 are sent encoded as strings made up of the equal sign followed by the hexadecimal value of the byte. For example: the ASCII form feed character, which has the hexadecimal value of 0C, is sent as =0C. Naturally there's more to it than this - for example, the literal equal sign has to be sent as =3D, and the newline at the end of each line is not encoded. But this is the general idea of how quoted-printable data is sent.
base64: Encoded binary data. This encoding technique can be used on any byte-stream data. Three octets of data are encoded as four 6-bit characters, which increases the size of the file by one-third. The 6-bit characters are a subset of US ASCII, chosen because they can be handled by any type of mail system. The maximum line length for base64 data is 76 characters. Figure 3.4 illustrates this 3 to 4 encoding technique.
x-token: Specially encoded data. It is possible for software developers to define their own private encoding techniques. If they do so, the name of the encoding technique must begin with X-. Doing this is strongly discouraged because it limits interoperability between mail systems.
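The quoted-printable rules described in the list above can be tried with Python's standard quopri module. This is a small demonstration on made-up sample strings, not a full treatment of the encoding.

```python
import quopri

samples = {
    "plain text": b"Hi Tyler!",   # printable ASCII passes through unchanged
    "form feed": b"\x0c",         # a control character below 33
    "equal sign": b"a=b",         # the literal equal sign must be encoded
}

results = {}
for label, raw in samples.items():
    encoded = quopri.encodestring(raw)
    # Decoding must give back exactly the original bytes.
    assert quopri.decodestring(encoded) == raw
    results[label] = encoded
    print(f"{label}: {raw!r} -> {encoded!r}")
```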

Figure 3.4: base64 encoding
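The 3 to 4 technique can also be worked through in code. The sketch below encodes one group of three octets by hand and compares the result with Python's standard base64 module. The function name encode_three_octets is invented for this example, and the sketch deliberately ignores the padding needed when the data is not a multiple of three octets.

```python
import base64
import string

# The 64 characters used by base64: A-Z, a-z, 0-9, "+" and "/".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_three_octets(octets):
    """Encode exactly three octets as four base64 characters."""
    if len(octets) != 3:
        raise ValueError("this sketch handles only complete 3-octet groups")
    bits = int.from_bytes(octets, "big")   # 24 bits held in one integer
    # Take the 24 bits six at a time, most significant group first.
    return "".join(ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0))


by_hand = encode_three_octets(b"Man")
by_library = base64.b64encode(b"Man").decode("ascii")
print(by_hand, by_library)
```

Four output characters for every three input octets is where the one-third size increase comes from.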

The number of supported data types and encoding techniques grows as new data formats appear and are used in message transmissions. New RFCs constantly define new data types and encoding. Read the latest RFCs to keep up with MIME developments.

MIME defines data types that SMTP was not designed to carry. To handle these and other future requirements, RFC 1869, SMTP Service Extensions, defines a technique for making SMTP extensible. The RFC does not define new services for SMTP; in fact, the only service extensions mentioned in the RFC are defined in other RFCs. What this RFC does define is a simple mechanism for systems to negotiate which SMTP extensions are supported. The RFC defines a new hello command (EHLO) and the legal responses to that command. One response is for the receiving system to return a list of the SMTP extensions it supports. This response allows the sending system to know what extended services can be used, and to avoid those that are not implemented on the remote system. SMTP implementations that support the EHLO command are called Extended SMTP (ESMTP).

Several ESMTP service extensions have been defined for MIME mailers. Table 3.4 lists some of these. The table lists the EHLO keyword associated with each extension, the number of the RFC that defines it, and its purpose. These service extensions are just the beginning. Undoubtedly more will be defined to support MIME and other SMTP enhancements.

Table 3.4: SMTP Service Extensions
Keyword	RFC	Server Extension
8BITMIME	1652	Accept 8bit binary data
CHUNKING	1830	Accept messages cut into chunks
CHECKPOINT	1845	Checkpoint/restart mail transactions
PIPELINING	1854	Accept multiple commands in a single send
SIZE	1870	Display maximum acceptable message size
DSN	1891	Provide delivery status notifications
ETRN	1985	Accept remote queue processing requests
ENHANCEDSTATUSCODES	2034	Provide enhanced error codes

It is easy to check which extensions are supported by your server by using the EHLO command. The following example is from a sendmail 8.8.5 system:

> telnet localhost 25
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
220 peanut ESMTP Sendmail 8.7.5/8.7.3; Tue, 11 Nov 1997 15:22:34 -0500
ehlo peanut
250-peanut Hello craig@localhost [127.0.0.1], pleased to meet you
250-EXPN
250-HELP
250-8BITMIME
250-SIZE
250-DSN
250-ETRN
250-VERB
250-ONEX
250 XUSR
quit
221 peanut closing connection
Connection closed by foreign host.

The sample system lists nine commands in response to the EHLO greeting. Two of these, EXPN and HELP, are standard SMTP commands that aren't implemented on all systems (the standard commands are listed in Table 3.1). 8BITMIME, SIZE, DSN, and ETRN are ESMTP extensions, all of which are described in Table 3.4. The last three keywords in the response are VERB, ONEX, and XUSR. All of these are specific to sendmail version 8. None is defined in an RFC. VERB simply places the sendmail server in verbose mode. ONEX limits the session to a single message transaction. XUSR is as yet unimplemented, but it will be equivalent to the -U sendmail command-line argument. [11] As the last three keywords indicate, the RFCs allow for private ESMTP extensions.

[11] See Appendix E, A sendmail Reference, for a list of the sendmail command-line arguments.

The specific extensions implemented on each operating system are different. For example, on a Solaris 2.5.1 system only three keywords (EXPN, SIZE, and HELP) are displayed in response to EHLO. The purpose of EHLO is to identify these differences at the beginning of the SMTP mail exchange.
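A sending system has to turn the EHLO reply into a list of usable extensions before it can decide what to send. The sketch below shows one way to do that. The function name parse_ehlo and the shortened sample reply are assumptions made for this example, not output from a particular server.

```python
def parse_ehlo(lines):
    """Return the extension keywords from a successful EHLO reply.

    The first line is the greeting. Each later line carries one keyword,
    optionally followed by parameters.
    """
    keywords = []
    for line in lines[1:]:
        if not line.startswith("250"):
            raise ValueError(f"unexpected reply line: {line!r}")
        text = line[4:].strip()            # skip the code and its separator
        if text:
            keywords.append(text.split()[0].upper())
    return keywords


# Hypothetical, shortened reply in the same format as the example above.
reply = [
    "250-peanut Hello craig@localhost [127.0.0.1], pleased to meet you",
    "250-8BITMIME",
    "250-SIZE",
    "250-DSN",
    "250 ETRN",
]
supported = parse_ehlo(reply)
print("8BITMIME" in supported, "CHUNKING" in supported)
```

Library clients usually do this for you. Python's smtplib, for instance, records the advertised extensions after its ehlo() call and exposes them through has_extn().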

ESMTP and MIME are important because they provide a standard way to transfer non-ASCII data through email. Users share lots of application-specific data that is not 7-bit ASCII. Many users depend on email as a file transfer mechanism.

SMTP, POP, and MIME are essential parts of the mail system, but other email protocols may also be essential in the future. The one certainty is that the network will continue to change. You need to track current developments and incorporate helpful technologies into your planning. In the next section we look at the various types of TCP/IP configuration servers. Unlike DNS and email, configuration servers are not used on most networks. This is changing, however. The demand for easier installation and improved mobility may make configuration servers part of your network's future.

