Creating a browser-based interactive terminal (Using XtermJS and NodeJS)

Why a terminal in the browser?

A browser-based terminal can be useful for many different use cases.

Instead of sharing server access with people through keys, you can give them access to a web-based terminal that sits behind a login screen you control. For example, DigitalOcean provides browser-based terminal access to droplets, and Play with Docker provides terminal access to containers right in the browser.

If you create developer tools, chances are that at some point you will need to build an embedded terminal.

Another interesting use case is controlled terminal access. By this I mean being able to disable certain commands so that they cannot be used.
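
To make the idea of disabling commands a little more concrete, here is a minimal sketch of a filter the backend could run before forwarding anything to the shell. It is not part of the demo: the function name isCommandAllowed and the blocklist contents are assumptions made up for illustration.

```javascript
// Hypothetical blocklist; adjust it to whatever should be off-limits.
const BLOCKED_COMMANDS = new Set(['rm', 'shutdown', 'reboot']);

// Returns true when the first word of the command line is not blocked.
// This is a naive check: it does not understand pipes, subshells, aliases,
// or absolute paths such as /bin/rm, so treat it as a starting point only.
function isCommandAllowed(commandLine, blocked = BLOCKED_COMMANDS) {
  const firstWord = commandLine.trim().split(/\s+/)[0];
  return !blocked.has(firstWord);
}

console.log(isCommandAllowed('ls -la'));      // true
console.log(isCommandAllowed('rm -rf /tmp')); // false
```

A check like this only looks at the first word, so it is easy to bypass. Real restrictions are better enforced on the server itself, for example with a restricted user account or a container.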

Get the source code

The source code used for the interactive terminal can be found on GitHub. The README.md file contains all the steps you need to get it up and running.

The purpose of this post is to explain how the different parts of the code work.

How it works

[Image: browser terminal architecture]

The frontend is a terminal emulator built with nothing more than HTML, CSS, and JavaScript.

Whenever the user types a command in the emulator, it is sent to a NodeJS backend over a WebSocket.

The reason for using a WebSocket instead of plain HTTP requests has to do with how an actual terminal functions.

When the backend gets the command from the emulator, it runs the command against an actual terminal. If the input is a command like ls, the terminal returns the list of files and folders to the emulator and that is the end of it: a simple request-and-response exchange that HTTP could handle.

However, some commands keep updating their output, such as top or netstat in continuous mode (netstat -c). The terminal keeps refreshing the screen with changes, and it would be difficult to convey these changes back and forth over HTTP. With a WebSocket, the real terminal can send output whenever it has something, and so can the frontend emulator. This way even progress bars can be emulated without problems.

Frontend setup

As mentioned earlier, what you see in the browser is an emulator: it is meant to reproduce the experience you get with a real terminal. For this demo, I used XtermJS.

It provides the look of a terminal along with a bunch of hooks. Xterm does not interpret terminal commands; it only provides the interface, captures the user's input, and makes that input available to you through hooks. It also deals with character encoding and all the little things that deliver the terminal look and feel.

You can pull in XtermJS as a Node module or use a CDN, which is what I did in the demo.

First there is the stylesheet for, well, the emulator styling.

<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/xterm@4.19.0/css/xterm.css" />

Then there is the JavaScript library for initializing XtermJS and accessing its functionality, such as the hooks.

<script src="https://cdn.jsdelivr.net/npm/xterm@4.19.0/lib/xterm.js"></script>

You will also need to let XtermJS know which HTML element to embed the terminal DOM elements in.

In the demo, it is this one:

<div id="terminal"></div>

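OK, so now let's look at some of the frontend JavaScript code. The following code instantiates Xterm.

// Create the emulator; the only option set here is a blinking cursor.
var term = new window.Terminal({
    cursorBlink: true
});
// Render the emulator inside the <div id="terminal"> element.
term.open(document.getElementById('terminal'));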


Once Xterm is pulled in through the CDN, it adds the Terminal constructor to window. It contains everything we need to work with Xterm; we just need to instantiate it with the options we want. In the demo, we only set the cursor to blink.

The last line passes in the DOM element (the div) we would like to embed the terminal into.

There is also some WebSocket code we should look at.

The line below instantiates the WebSocket and establishes a connection to the backend.

const socket = new WebSocket("ws://localhost:6060");


Whenever a user types a command and hits Enter, it is sent over the socket to the backend. Note the newline character at the end of the command. It is added so that when the backend writes the command to the actual terminal, the newline simulates pressing the Enter key after the command (loosely speaking).

socket.send(command + '\n');
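
How the command string gets assembled before it is sent is not shown here. As a rough sketch of one way to do it, the helper below collects the characters that a hook such as Xterm's onData delivers and hands over the finished line when Enter arrives. The name createLineBuffer is hypothetical and the demo may well do this differently; an array stands in for the socket so the snippet runs outside the browser.

```javascript
// Collects typed characters into a line and passes the finished line to
// onLine when Enter ('\r') arrives. Backspace arrives as '\x7f'.
function createLineBuffer(onLine) {
  let line = '';
  return function handleData(data) {
    // Skip escape sequences (arrow keys and friends); they are not handled here.
    if (data.startsWith('\x1b')) return;
    for (const ch of data) {
      if (ch === '\r') {
        onLine(line);
        line = '';
      } else if (ch === '\x7f') {
        line = line.slice(0, -1);
      } else if (ch >= ' ') {
        line += ch; // printable character; other control characters are ignored
      }
    }
  };
}

// Stand-in for the socket: record what would have been sent.
const sent = [];
const handle = createLineBuffer((command) => sent.push(command + '\n'));

// Simulate typing "lsx", pressing Backspace, then Enter.
['l', 's', 'x', '\x7f', '\r'].forEach(handle);
console.log(JSON.stringify(sent)); // ["ls\n"]
```

In the browser, the same handler could be registered with term.onData(handle), with the callback calling socket.send(command + '\n') instead of pushing to an array.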


Once a command is sent to the backend, the onmessage handler fires for every response the backend sends back. In this case, the handler writes the data directly into the emulator.

socket.onmessage = (event) => {
    term.write(event.data);
};


Backend implementation

Now let's talk about the backend implementation. It acts as a bridge between the emulator and the real terminal, forwarding input and output between them.

It does that through a WebSocket connection using the ws Node package.

When the frontend sends a command to the backend, we need to forward it to the actual terminal. We could use NodeJS's exec method (from the child_process module) to run the command, but there is one big problem.

exec is built for one-shot commands: it buffers the output and hands it back in a callback once the command has exited, and it does not attach a terminal (TTY) to the process. So if running a command produces, say, a question that requires the user's input, the user never gets a real chance to answer it, because there is no interactive session to type into.
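
Part of what makes exec awkward for interactive programs is that the child process is connected to pipes rather than to a terminal. The sketch below shows this by asking a child Node process whether its stdout is a TTY. It uses execFileSync, a sibling of exec in the same child_process module, only to keep the example synchronous and free of shell quoting.

```javascript
const { execFileSync } = require('child_process');

// Ask a child Node process whether its stdout is a terminal.
// Like exec, execFileSync connects the child's stdout to a pipe by default.
const answer = execFileSync(
  process.execPath, // path of the Node binary running this script
  ['-e', 'console.log(Boolean(process.stdout.isTTY))'],
  { encoding: 'utf8' }
).trim();

console.log(answer); // false
```

Many interactive programs change their behavior, or refuse to run, when they detect that they are not attached to a terminal. A pseudo-terminal solves exactly that.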

To get around this, we could manipulate the standard streams as part of the command we hand to exec, or just bring in another Node package: node-pty.

We spawn a process using node-pty. This process is either a PowerShell or a Bash shell instance, depending on the OS.

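var os = require('os');
var pty = require('node-pty');

// Pick a shell based on the operating system.
var shell = os.platform() === 'win32' ? 'powershell.exe' : 'bash';
// Spawn the shell inside a pseudo-terminal.
var ptyProcess = pty.spawn(shell, [], {
    name: 'xterm-color',
    //   cwd: process.env.HOME,
    env: process.env,
});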


The code below is responsible for sending commands to the actual terminal whenever the frontend sends them over. ptyProcess.write writes directly to the terminal's stdin.

ws.on('message', command => {
    ptyProcess.write(command);
});

And we listen for the response from the terminal using the code below.

ptyProcess.on('data', function (data) {
    ws.send(data);
    console.log(data);
});

The ws.send(data) call sends the response back to the frontend to be displayed to the user, while console.log(data) just logs it on the server.
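
To see both directions side by side, here is a small sketch that wires a socket to a pty process in one helper. The helper name bridge is hypothetical and not part of ws or node-pty, and the two Fake classes are stand-ins so the snippet runs without either package installed. It also stops the shell when the socket closes, which is an addition on my part rather than something the demo is known to do.

```javascript
const { EventEmitter } = require('events');

// Wires a WebSocket-like object to a pty-like object in both directions.
function bridge(socket, ptyProcess) {
  // Browser -> shell. Recent versions of ws deliver messages as Buffers,
  // so convert to a string before writing.
  socket.on('message', (command) => ptyProcess.write(command.toString()));
  // Shell -> browser.
  ptyProcess.on('data', (data) => socket.send(data));
  // Stop the shell when the browser goes away.
  socket.on('close', () => ptyProcess.kill());
}

// Stand-ins that mimic only the methods used above.
class FakeSocket extends EventEmitter {
  constructor() { super(); this.sent = []; }
  send(data) { this.sent.push(data); }
}
class FakePty extends EventEmitter {
  constructor() { super(); this.written = []; this.killed = false; }
  write(data) { this.written.push(data); }
  kill() { this.killed = true; }
}

const socket = new FakeSocket();
const shell = new FakePty();
bridge(socket, shell);

socket.emit('message', Buffer.from('ls\n')); // the frontend sends a command
shell.emit('data', 'file.txt\r\n');          // the shell produces output
socket.emit('close');                        // the browser tab is closed

console.log(shell.written, socket.sent, shell.killed);
```

With the real packages, the socket argument would be the ws connection object and ptyProcess the value returned by pty.spawn.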

And that's pretty much it. This is by no means a complete, fully functioning terminal. It currently lacks a few things, like capturing arrow key movements and key combinations and sending them over to the actual terminal.

That will be important for getting things like Vim working in our web terminal. It's something I have actually started looking into, and I will surely share my results once I am done :).
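
As a possible starting point for that, one common approach is to stop assembling commands in the browser and forward every keystroke to the backend as it happens, letting the real shell handle echo, arrow keys, and key combinations. The sketch below assumes Xterm's onData hook; attachRawInput is a hypothetical name, and the fake objects only exist so the snippet runs outside the browser.

```javascript
// Forwards every chunk the emulator produces straight to the socket.
// No newline is appended: Enter already arrives as '\r'.
function attachRawInput(term, socket) {
  return term.onData((data) => socket.send(data));
}

// Stand-ins so the sketch runs outside the browser.
const listeners = [];
const fakeTerm = {
  onData(listener) {
    listeners.push(listener);
    return { dispose() { listeners.length = 0; } };
  },
};
const forwarded = [];
const fakeSocket = { send: (data) => forwarded.push(data) };

attachRawInput(fakeTerm, fakeSocket);
listeners.forEach((fn) => fn('\x1b[A')); // what Xterm emits for Arrow Up
listeners.forEach((fn) => fn('\x03'));   // Ctrl+C

console.log(JSON.stringify(forwarded)); // ["\u001b[A","\u0003"]
```

With this approach the frontend no longer echoes characters itself; whatever the shell sends back is what gets written to the emulator.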
