Telnet for Humans

I recently spent some time working on telnet support for a MUD server. It took a bit of reading to find out how to implement it; the RFCs aren’t the friendliest thing to untangle, and I didn’t run across anything that was much easier. It doesn’t help that the standard is split across some thirty RFCs published over twenty years. And the standard is big for how simple it seems: did you know that telnet supports encryption, for instance?

The most frustrating thing is that pretty much no MUD client supports any part of this, and MUD clients are poorly behaved in general. (In contrast, the telnet(1) utility on Linux seems pretty rugged, assuming the worst of the servers it interacts with.) It’s partly because the standard is so intractable, so I’m writing this guide to help out.

Core concept

[Image: an IBM 3277 display. Quite a bit smarter than Telnet.]

Telnet is a standardization of client/server communication where the client is a moderately intelligent terminal. Think back to the days when you had a mainframe in the bowels of a CS department and grad students logged in from glass terminals throughout the building. The terminal’s a dumb thing that can pretty much just display text and send text back to you. It sends you whatever the user types in, and you send it what you want it to display.

Lesson One: The Client Is Always Stupid

The terminal is guaranteed to be smart enough to handle text scrolling and that’s about it. (And really, the only reason we assume it can scroll is because the original terminals were printers paired with keyboards.)

What does that mean? For starters, the client is not doing any layout. (I mean, it’s free to, but it never will.) If you send it a long line, it would be valid for it to cut the line off. In practice, clients wrap long lines by character rather than by word. So if your window is 26 characters wide, you might log into the MUD to see:

You see a large golden thr
oneroom filled with throne
s, each more regal than th
e last.

Let’s face it, if it’s possible for the terminal to do something wrong while technically doing what you tell it, it will. You must explicitly negotiate for anything you want the client to do. (And no, there’s no “wrap lines intelligently” option.)
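Since the client won't wrap by word, the server has to. Here is a minimal Python sketch of the difference; the function names and the ROOM constant are made up for illustration, and the width is something you would have to learn from the user or the client. The example lines above are 26 characters long, so wrapping at 26 reproduces them.

```python
import textwrap


def char_wrap(text: str, width: int) -> list[str]:
    """Break text every `width` characters, the way a naive client does."""
    return [text[i:i + width] for i in range(0, len(text), width)]


def word_wrap(text: str, width: int) -> list[str]:
    """Break text at word boundaries so that no line exceeds `width`."""
    return textwrap.wrap(text, width=width)


# Hypothetical room description, matching the example output above.
ROOM = ("You see a large golden throneroom filled with thrones, "
        "each more regal than the last.")

for line in word_wrap(ROOM, 26):
    print(line)
```

Sending the word-wrapped lines yourself means the client never sees a line long enough to mangle.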

Command stream

Interleaved within the data stream are commands. These commands generally deal with the terminal state and capabilities or the format of the data.

There are no guarantees about how commands and data are interleaved, and similarly there’s no guarantee about commands arriving in one packet versus several. This makes things annoying, to say the least.

Commands are distinguished from data by starting with 0xFF, and there are only a few formats allowed, so it’s not a Herculean task to detect them. If you want to implement a non-compliant but mostly working telnet handler, you can simply filter out the relevant octet patterns.

(If you happen to need a 0xFF in your data stream, just send it twice: 0xFF 0xFF is a command to send a literal 0xFF character to the client.)
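As a small illustration, escaping outgoing data is a one-liner in Python. The function name is my own; the only assumption is that you call it on plain data, before any real commands are spliced into the stream.

```python
IAC = 0xFF  # "interpret as command"


def escape_iac(data: bytes) -> bytes:
    """Double every 0xFF so the receiver reads it as data, not a command."""
    return data.replace(b"\xff", b"\xff\xff")


# In Latin-1, the character "ÿ" encodes to the single octet 0xFF.
print(escape_iac("ÿ".encode("latin-1")))
```

This matters mostly for 8-bit encodings and binary payloads; valid UTF-8 never contains a 0xFF octet.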

Negotiation

Generally, a server wants to know what a client supports so it can send the appropriate stuff, and the client wants to advertise to the server what it expects so it can get input it can deal with. But neither can send something until both agree. So what you initially see is a negotiation.

For example, Telnet clients and servers are required to support 7-bit ASCII. However, many other encodings exist. In RFC 2066 (published in 1997), the IETF established a means for clients and servers to choose a different character set — the CHARSET option. To negotiate enabling the CHARSET option, we might see an exchange like:


server: IAC WILL CHARSET (255 251 42)
client: IAC DO CHARSET (255 253 42)

The first octet for every command is 0xFF, which the RFCs call “IAC”, or “interpret as command”. This makes it easy to distinguish commands from data. The next octet is WILL, 0xFB, which indicates that the server is willing to deal with the option it’s discussing. The command ends with the CHARSET option (0x2A), which is what it’s negotiating about.

The client responds in kind, but it replaces WILL with DO (0xFD). This indicates that it’s received the request and agrees. It could also have sent IAC DONT CHARSET to indicate that it can’t deal with this option. Both are legal and valid.

WILL indicates that the sending party is willing to do something, and DO indicates that the sending party expects the receiving party to do it. The client could send IAC WILL CHARSET instead of the server, indicating that the client wants to handle the character encoding negotiations instead of the server.

For many options, there’s not much difference if the client says WILL and the server says DO versus if the server says WILL and the client says DO.
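To make the three-octet format concrete, here is a rough Python sketch of building a negotiation command and choosing a reply. The SUPPORTED set and both function names are hypothetical. A real implementation also has to remember which requests it sent itself, so it doesn't answer an answer and loop forever (RFC 1143 describes one way to do that); this sketch skips that bookkeeping.

```python
IAC = 255
WILL, WONT, DO, DONT = 251, 252, 253, 254
CHARSET, NAWS = 42, 31

# Hypothetical policy: the options this endpoint is prepared to use.
SUPPORTED = {CHARSET, NAWS}


def negotiate(verb: int, option: int) -> bytes:
    """Build a three-octet command such as IAC WILL CHARSET."""
    return bytes([IAC, verb, option])


def reply_to(verb: int, option: int) -> bytes:
    """Pick a reply to a request the peer initiated (no loop prevention)."""
    if verb == WILL:
        return negotiate(DO if option in SUPPORTED else DONT, option)
    if verb == DO:
        return negotiate(WILL if option in SUPPORTED else WONT, option)
    return b""  # WONT and DONT need no reply in this simplified sketch


print(list(negotiate(WILL, CHARSET)))  # the server's offer from the example
print(list(reply_to(WILL, CHARSET)))   # the client's agreement
```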

Extended commands

So we’ve covered one format of command: IAC [WILL|WONT|DO|DONT] option. That’s a simple three-octet sequence. But our example was character set negotiation. How are you going to do this in three octets?

You aren’t. You need a longer format. Specifically:

IAC SB option payload IAC SE

Essentially, we have a command: “begin a subcommand for the following option.” This option value is the same as the one we used in negotiation. Then we have an option-defined payload of arbitrary values, followed by another command: “end this subcommand”. (There are different begin and end markers, so you could theoretically nest them. For your own sanity, don’t.)
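Because a command can be split across packets, the easiest way I know to handle both formats is a small state machine that remembers where it was between reads. The following Python sketch is one way to do it, not a compliant implementation: the class name and the event tuples are invented here, nested subcommands aren't supported, and malformed input is handled leniently.

```python
IAC, SB, SE = 255, 250, 240
WILL, WONT, DO, DONT = 251, 252, 253, 254


class TelnetParser:
    """Incremental parser: state survives between feed() calls."""

    def __init__(self):
        self.state = "data"
        self.verb = 0
        self.option = 0
        self.payload = bytearray()

    def feed(self, chunk: bytes) -> list:
        events = []
        text = bytearray()

        def flush():
            # Emit pending plain text before a command event to keep order.
            if text:
                events.append(("data", bytes(text)))
                text.clear()

        for b in chunk:
            if self.state == "data":
                if b == IAC:
                    self.state = "iac"
                else:
                    text.append(b)
            elif self.state == "iac":
                if b == IAC:  # IAC IAC is a literal 0xFF
                    text.append(IAC)
                    self.state = "data"
                elif b in (WILL, WONT, DO, DONT):
                    self.verb = b
                    self.state = "negotiate"
                elif b == SB:
                    self.state = "sb_option"
                else:  # some other two-octet command
                    flush()
                    events.append(("command", b))
                    self.state = "data"
            elif self.state == "negotiate":
                flush()
                events.append(("negotiate", self.verb, b))
                self.state = "data"
            elif self.state == "sb_option":
                self.option = b
                self.payload = bytearray()
                self.state = "sb_data"
            elif self.state == "sb_data":
                if b == IAC:
                    self.state = "sb_iac"
                else:
                    self.payload.append(b)
            elif self.state == "sb_iac":
                if b == SE:
                    flush()
                    events.append(
                        ("subnegotiation", self.option, bytes(self.payload)))
                    self.state = "data"
                else:
                    # IAC IAC is a literal 0xFF; anything else is malformed,
                    # and this sketch simply keeps the octet.
                    self.payload.append(b)
                    self.state = "sb_data"
        flush()
        return events
```

Feed it whatever each socket read returns. A command that straddles two reads only produces its event once the final octet arrives.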

For example, the server might advertise character sets it can use to send data to the client:


server: IAC SB CHARSET REQUEST ";UTF-8;ISO-8859-1" IAC SE
(255 250 42 1 59 85 84 70 45 56 59 73 83 79 45 56 56 53 57 45 49 255 240)

Here, the first subcommand contains the CHARSET-specific REQUEST code (0x1) followed by a literal ASCII string specifying the supported encodings, delimited by an arbitrary one-octet delimiter. That delimiter appears as the first octet in the sequence to avoid a separate step to agree on a delimiter.
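Here is a hedged Python sketch of building and reading that message. The function names are mine, and the parser assumes the simple form shown above; RFC 2066 also allows an optional "[TTABLE]" prefix before the delimiter, which this sketch ignores.

```python
IAC, SB, SE = 255, 250, 240
CHARSET, REQUEST = 42, 1


def charset_request(names: list[str], delimiter: str = ";") -> bytes:
    """Build IAC SB CHARSET REQUEST <delimiter-separated names> IAC SE."""
    if len(delimiter) != 1 or any(delimiter in name for name in names):
        raise ValueError("delimiter must be one character found in no name")
    payload = "".join(delimiter + name for name in names).encode("ascii")
    return bytes([IAC, SB, CHARSET, REQUEST]) + payload + bytes([IAC, SE])


def parse_charset_request(payload: bytes) -> list[str]:
    """Parse the octets found between 'IAC SB CHARSET' and 'IAC SE'."""
    if len(payload) < 3 or payload[0] != REQUEST:
        raise ValueError("not a CHARSET REQUEST")
    delimiter = payload[1:2]  # the first octet after REQUEST
    return [name.decode("ascii") for name in payload[2:].split(delimiter)]


print(list(charset_request(["UTF-8", "ISO-8859-1"])))
```

Putting the delimiter in front of every name, including the first, is what lets the receiver discover it without a separate step.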

Using the command stream as a side data channel

If the client and server agree on a protocol, you can use the command stream to send data rather than simply agreeing on connection properties. RFC 1073 (published in 1988) creates a protocol for the client to inform the server of its dimensions, for instance.

Specifically for MUDs, you may be interested in GMCP, the Generic Mud Communication Protocol. This lets you send a select collection of character stats from the server to the client.

What telnet options are interesting?

Charset

You want to use UTF-8. The protocol doesn’t support it natively; it assumes 7-bit ASCII. You have to negotiate for it. (Even if you want to use Extended ASCII, you have to negotiate for that, but it’s got a slightly different process.)

We saw an example above, of the server advertising which character sets it can support. A happy response would be:


client: IAC SB CHARSET ACCEPTED "UTF-8" IAC SE
(255 250 42 2 85 84 70 45 56 255 240)

Note that the delimiter is gone.

But the client doesn’t necessarily support any encodings that the server can provide. A sad response would be:


client: IAC SB CHARSET REJECTED IAC SE
(255 250 42 3 255 240)
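On the server side, handling the two replies might look like this Python sketch. RFC 2066 names the codes ACCEPTED (2) and REJECTED (3). The function names and the ASCII fallback are my own choices, and the sketch assumes the agreed name is one that Python's codec registry recognizes.

```python
from typing import Optional

ACCEPTED, REJECTED = 2, 3


def parse_charset_reply(payload: bytes) -> Optional[str]:
    """Return the agreed charset name, or None if the client rejected all."""
    if payload[:1] == bytes([ACCEPTED]):
        return payload[1:].decode("ascii")
    if payload == bytes([REJECTED]):
        return None
    raise ValueError("unexpected CHARSET reply")


def encode_for_client(text: str, charset: Optional[str]) -> bytes:
    """Encode output with the agreed charset, falling back to 7-bit ASCII."""
    return text.encode(charset or "ascii", errors="replace")


agreed = parse_charset_reply(bytes([2, 85, 84, 70, 45, 56]))
print(agreed, encode_for_client("héllo", agreed))
```

With a rejection, anything outside ASCII degrades to a question mark instead of arriving as garbage.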

Negotiate About Window Size

It’s pretty damn essential to know how wide the client terminal is because clients don’t do wrapping. It’s also important to know how tall the terminal is — if you’re displaying a help file that’s 50 lines long, should you send it all in one go or paginate?

Discworld MUD has explicit settings for that. Wouldn’t it be great if you could automatically detect changes? If you could do the right thing for a client without the user having to deal with it?

How it works: you enable NAWS (code 31) using negotiation as covered above. Then the client can send an extended command about the window size at any point in time:


client: IAC SB NAWS 0 80 0 24 IAC SE
(255 250 31 0 80 0 24 255 240)

The payload is simply the width followed by the height as 16-bit values in network byte order — in this case, 80 columns wide and 24 rows high. If I were mudding on a truly monstrous display, I might send:

IAC SB NAWS 10 25 4 12 IAC SE

This would have a width of 10 * 256 + 25 = 2585 columns and a height of 4 * 256 + 12 = 1036 rows.

The client can send this at any time and should send it whenever the number of displayable rows or columns changes. That could be the window being resized, or it could be switching fonts or font sizes.
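Decoding the payload is a couple of lines with Python's struct module. A minimal sketch, with function names of my own; parse_naws assumes the payload has already had any doubled 0xFF octets collapsed, and build_naws doubles them on the way out.

```python
import struct

IAC, SB, SE, NAWS = 255, 250, 240, 31


def parse_naws(payload: bytes) -> tuple[int, int]:
    """Decode (width, height) from the four unescaped payload octets."""
    if len(payload) != 4:
        raise ValueError("NAWS payload must be exactly four octets")
    width, height = struct.unpack("!HH", payload)  # two big-endian uint16s
    return width, height


def build_naws(width: int, height: int) -> bytes:
    """Build the client's message, doubling any 0xFF octet in the sizes."""
    body = struct.pack("!HH", width, height).replace(b"\xff", b"\xff\xff")
    return bytes([IAC, SB, NAWS]) + body + bytes([IAC, SE])


print(parse_naws(bytes([0, 80, 0, 24])))
print(parse_naws(bytes([10, 25, 4, 12])))
```

The escaping matters more than it looks: a window exactly 255 columns wide puts a raw 0xFF in the payload.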

Negotiate About Carriage-Return Disposition

If you’re not familiar with this: remember how, with typewriters, you’d finish a line, then you’d need to move the paper forward (a line feed), and then you’d need to return the head of the typewriter, known as the carriage, all the way to the start? We brought that over into ASCII. I don’t know what we were thinking. In our defense, we were pretty smashed.

Telnet, by default, interprets a line feed character as a command to go to the next line but stay at the same column, and it interprets a carriage return character as a command to go to the start column while staying on the same line.

(As a historical note, UNIX-derived operating systems always used just the line feed character, ‘\n’, for this; Windows has always required the carriage return followed by the line feed, ‘\r\n’; and Macintosh around System 7 required just the carriage return, ‘\r’. Are we having fun yet?)

As is, you need to be careful to send ‘\r\n’ rather than just ‘\n’ for newlines. This is kind of annoying. Wouldn’t it be great if the client could interpret ‘\n’ correctly? It would be less work for you, definitely, and it would safeguard against some data handling problems.

How do you tell the client what sort of carriage return handling you want?


server: IAC WILL NAOCRD (255 251 10)
client: IAC DO NAOCRD (255 253 10)
server: IAC SB NAOCRD DS 252 IAC SE (255 250 10 1 252 255 240)

The value 252 is a magic number telling the client that it alone should handle carriage returns, and that it should simply discard any carriage returns it does receive.
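Until a client actually agrees to this, the safe default is to normalize newlines on the server before sending. A small Python sketch (the function name is made up) that turns bare line feeds into '\r\n' and leaves existing '\r\n' pairs alone:

```python
import re


def to_telnet_newlines(text: str) -> str:
    """Replace each '\n' not already preceded by '\r' with '\r\n'."""
    return re.sub(r"(?<!\r)\n", "\r\n", text)


print(repr(to_telnet_newlines("one\ntwo\r\nthree\n")))
```

It doesn't touch a lone '\r', so it only covers the common case of text written with UNIX line endings.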

GMCP

GMCP is the Generic Mud Communication Protocol. It lets a MUD send specific types of data to the client, using the extended command stream as a data transfer mechanism. It’s also got some options for the client to send data back, including login information and the client name.

You might want to enable it just to get the client name. That lets you enable quirks modes for different popular clients. Yeah, it’s crud, but you need clients to work well.

One awesome thing that GMCP enables is auto mapping. As long as you have a stable unique integer to identify each room (sorry, LP MUDs), you can send sufficient data to clients that they can automatically generate maps for your MUD.

GMCP’s code is 201, and it has far too many options to list. I’ll just show a brief exchange:


server: IAC WILL GMCP
client: IAC DO GMCP
client: IAC SB GMCP Core.Hello { "client": "Mudlet", "version": "2.1.0" } IAC SE
client: IAC SB GMCP Core.Supports.Set [ "Char 1", "Room 1" ] IAC SE
client: IAC SB GMCP Char.Login { "name": "dhasenan", "password": "it's a secret!" } IAC SE
server: IAC SB GMCP Char.Vitals { "hp": "10", "mp": "18" } IAC SE
server: IAC SB GMCP Room.Info { "num": 1200, "name": "The Hall of Immortals", "area": "gods" } IAC SE
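Each GMCP message is a package name, a space, and a JSON value, wrapped in a subcommand for option 201. Here is a rough Python sketch of encoding and decoding one; the function names are hypothetical, and it assumes UTF-8 for the payload, which conveniently never contains a 0xFF octet, so no escaping is needed.

```python
import json

IAC, SB, SE, GMCP = 255, 250, 240, 201


def gmcp_message(package: str, data=None) -> bytes:
    """Wrap 'Package.Name <json>' in IAC SB GMCP ... IAC SE."""
    body = package if data is None else f"{package} {json.dumps(data)}"
    return bytes([IAC, SB, GMCP]) + body.encode("utf-8") + bytes([IAC, SE])


def parse_gmcp(payload: bytes):
    """Split the octets between 'IAC SB GMCP' and 'IAC SE' into (name, data)."""
    package, _, raw = payload.decode("utf-8").partition(" ")
    return package, (json.loads(raw) if raw.strip() else None)


msg = gmcp_message("Room.Info", {"num": 1200, "name": "The Hall of Immortals"})
print(parse_gmcp(msg[3:-2]))
```

Splitting on the first space works because package names don't contain spaces, while the JSON that follows may.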

No, I won’t!

Telnet allows you to negotiate about many things. Almost nothing succeeds.

Remember how clients don’t do word wrap? And how, for the past thirty years, it’s been possible to have the client send its window dimensions to the server? It would be really handy if the client would actually do that. Which ones do?

  • CMud: yes
  • GGMud: no
  • GNOME Mud: yes, and it helpfully breaks up its notifications into two packets just to thwart you
  • KildClient: no
  • Mudlet: yes
  • MUSHclient: no
  • telnet(1): no
  • tinyfugue: yes

Uh…huh.

Well, we’ve got the most popular Windows client and the most popular command line client, anyway. That’s nice, I guess…?

Okay, well, it would be nice if we could at least use a modern character set rather than this blasted 7-bit ASCII. What clients support negotiating about the character set? Or at least handle UTF-8 characters?

  • CMud: no; it looks like it does Latin-1 by default.
  • GGMud: no; it omits characters it can’t deal with.
  • GNOME Mud: no negotiation, but it seems to handle UTF-8 characters okay.
  • KildClient: no; it munges as if it’s Latin-1.
  • Mudlet: no; it also munges like Latin-1.
  • MUSHclient: no; also Latin-1.
  • telnet(1): no, but it blindly passes data back to the terminal, so UTF-8 usually works.
  • tinyfugue: no; it munges characters in a rather unique way (é -> C for some bizarre reason).

You begin to see, I think, why it’s so annoying to use Telnet.

Why bother?

If there’s no sensible client, why bother with all these neat and annoying options? Why waste your time implementing the negotiate about window size option if no clients will use it?

Well, first off, MUDs are often expected to support GMCP. So you have that much work to start with. And once you’ve done that, it’s only another couple hours to support window size and charset. You’ll want to support window size specifications using internal configuration options; telnet commands just offer another way of manipulating the setting — one you don’t have to explain to your users.

Character sets are incredibly important, and it’s disappointing that so few MUD clients support UTF-8. MUDs are one of the few kinds of games that work for blind people, and right now they’re restricted to people who know English and a scant few other languages. Blind non-Anglophones deserve games, and right now it’s damn hard to find them.

Finally, there’s no pressure for clients to support options that servers don’t. By supporting it on the server side, you’re encouraging clients to support it.

Further reading

Telnet: PCMicro’s Telnet collation, which puts the constants you need and the RFCs you hate all in one place.

GMCP: Iron Realms’s writeup of the portions they support, which should be moderately comprehensive.
