We’ve compiled a list of the most used and useful APIs that are built in to the standard Node.js runtime. For each module you’ll find simple english explanations and examples to help you understand.
This guide has been adapted from my course Node.js: Novice to Ninja. Check it out there to follow comprehensive course to build your own multi-user real time chat application. It also includes quizzes, videos, code to run your own docker containers.
When building your first Node.js application it’s helpful to know what utilities and APIs node offers out of the box to help with common use cases and development needs.
The process object provides information about your Node.js application as well as control methods. Use it to get information like environment variables, and CPU and Memory usage. process is available globally: you can use it without import, although the Node.js documentation recommends you explicitly reference it:
The os API has similarities to process (see the “Process” section above), but it can also return information about the Operating System Node.js is running in. This provides information such as what OS version, CPUs and up time.
URL is another global object that lets you safely create, parse, and modify web URLs. It’s really useful for quickly extracting protocols, ports, parameters and hashes from URLs without resorting to regex. For example:
You can view and change any property. For example:
You can then use the URLSearchParams API to modify query string values. For example:
There are also methods for converting file system paths to URLs and back again.
The dns module provides name resolution functions so you can look up the IP address, name server, TXT records, and other domain information.
The fs API can create, read, update, and delete files, directories, and permissions. Recent releases of the Node.js runtime provide promise-based functions in fs/promises, which make it easier to manage asynchronous file operations.
You’ll often use fs in conjunction with path to resolve file names on different operating systems.
The following example module returns information about a file system object using the stat and access methods:
When passed a filename, the function returns an object with information about that file. For example:
The main filecompress.js script uses path.resolve() to resolve input and output filenames passed on the command line into absolute file paths, then fetches information using getFileInfo() above:
The code validates the paths and terminates with error messages if necessary:
The whole file is then read into a string named content using readFile():
The resulting string is output to a file using writeFile(), and a status message shows the saving:
Run the project code with an example HTML file:
You often need to execute multiple functions when something occurs. For example, a user registers on your app, so the code must add their details to a database, start a new logged-in session, and send a welcome email. The Events module :
This series of function calls is tightly coupled to user registration. Further activities incur further function calls. For example:
You could have dozens of calls managed in this single, ever-growing code block.
The Node.js Events API provides an alternative way to structure the code using a publish–subscribe pattern. The userRegister() function can emit an event—perhaps named newuser —after the user’s database record is created.
Any number of event handler functions can subscribe and react to newuser events; there’s no need to change the userRegister() function. Each handler runs independently of the others, so they could execute in any order.
In most situations, you’re attaching handlers for user or browser events, although you can raise your own custom events. Event handling in Node.js is conceptually similar, but the API is different.
Objects that emit events must be instances of the Node.js EventEmitter class. These have an emit() method to raise new events and an on() method for attaching handlers.
The event example project provides a class that triggers a tick event on predefined intervals. The ./lib/ticker.js module exports a default class that extends EventEmitter:
Its constructor must call the parent constructor. It then passes the delay argument to a start() method:
The start() method checks delay is valid, resets the current timer if necessary, and sets the new delay property:
It then starts a new interval timer that runs the emit() method with the event name "tick". Subscribers to this event receive an object with the delay value and number of seconds since the Node.js application started:C
The main event.js entry script imports the module and sets a delay period of one second (1000 milliseconds):Copy
It attaches handler functions triggered every time a tick event occurs:
A third handler triggers on the first tick event only using the once() method:
Finally, the current number of listeners is output:
Run the project code with node event.js.
The output shows handler 3 triggering once, while handler 1 and 2 run on every tick until the app is terminated.
The file system example code above (in the “File System” section) reads a whole file into memory before outputting the minified result. What if the file was larger than the RAM available? The Node.js application would fail with an “out of memory” error.
The solution is streaming. This processes incoming data in smaller, more manageable chunks. A stream can be:
Each chunk of data is returned as a Buffer object, which represents a fixed-length sequence of bytes. You may need to convert this to a string or another appropriate type for processing.
The example code has a filestream project which uses a transform stream to address the file size problem in the filecompress project. As before, it accepts and validates input and output filenames before declaring a Compress class, which extends Transform:
The _transform method is called when a new chunk of data is ready. It’s received as a Buffer object that’s converted to a string, minified, and output using the push() method. A callback() function is called once chunk processing is complete.
The application initiates file read and write streams and instantiates a new compress object:
The incoming file read stream has .pipe() methods defined, which feed the incoming data through a series of functions that may (or may not) alter the contents. The data is piped through the compress transform before that output is piped to the writeable file. A final on('finish') event handler function executes once the stream has ended:
Run the project code with an example HTML file of any size:
This is a small demonstration of Node.js streams. Stream handling is a complex topic, and you may not use them often. In some cases, a module such as Express uses streaming under the hood but abstracts the complexity from you.
You should also be aware of data chunking challenges. A chunk could be any size and split the incoming data in inconvenient ways. Consider minifying this code:
Two chunks could arrive in sequence:
Processing each chunk independently results in the following invalid minified script:
The solution is to pre-parse each chunk and split it into whole sections that can be processed. In some cases, chunks (or parts of chunks) will be added to the start of the next chunk.
Minification is best applied to whole lines, although an extra complication occurs because <!-- --> and /* */ comments can span more than one line. Here’s a possible algorithm for each incoming chunk:
The process runs again for each incoming chunk.
That’s your next coding challenge— if you’re willing to accept it!
Complex calculations that process data from a file or database may be less problematic, because each stage runs asynchronously as it waits for data to arrive. Processing occurs on separate iterations of the event loop.
The example code has a worker project that exports a diceRun() function in lib/dice.js. This throws any number of N-sided dice a number of times and records a count of the total score (which should result in a Normal distribution curve):
The code in index.js starts a process that runs every second and outputs a message:
Two dice are then thrown one billion times using a standard call to the diceRun() function:
This halts the timer, because the Node.js event loop can’t continue to the next iteration until the calculation completes.
The code then tries the same calculation in a new Worker. This loads a script named worker.js and passes the calculation parameters in the workerData property of an options object:
Event handlers are attached to the worker object running the worker.js script so it can receive incoming results:
… and tidy up once processing has completed:
The worker.js script starts the diceRun() calculation and posts a message to the parent when it’s complete—which is received by the "message" handler above:
The timer isn’t paused while the worker runs, because it executes on another CPU thread. In other words, the Node.js event loop continues to iterate without long delays.
Run the project code with node index.js.
You should note that the worker-based calculation runs slightly faster because the thread is fully dedicated to that process. Consider using workers if you encounter performance bottlenecks in your application.
It’s sometimes necessary to call applications that are either not written in Node.js or have a risk of failure.
I worked on an Express application that generated a fuzzy image hash used to identify similar graphics. It ran asynchronously and worked well—until someone uploaded a malformed GIF containing a circular reference (animation frameA referenced frameB which referenced frameA).
The hash calculation never ended. The user gave up and tried uploading again. And again. And again. The whole application eventually crashed with memory errors.
The problem was fixed by running the hashing algorithm in a child process. The Express application remained stable because it launched, monitored, and terminated the calculation when it took too long.
The child process API allows you to run sub-processes that you can monitor and terminate as necessary. There are three options:
Unlike worker threads, child processes are independent from the main Node.js script and can’t access the same memory.
Is your 64-core server CPU under-utilized when your Node.js application runs on a single core? Clusters allow you to fork any number of identical processes to handle the load more efficiently.
The initial primary process can fork itself—perhaps once for each CPU returned by os.cpus(). It can also handle restarts when a process fails, and broker communication messages between forked processes.
Clusters work amazingly well, but your code can become complex. Simpler and more robust options include:
Both can start, monitor, and restart multiple isolated instances of the same Node.js application. The application will remain active even when one fails.
It’s worth mentioning: make your application stateless to ensure it can scale and be more resilient. It should be possible to start any number of instances and share the processing load.
This article has provided a sample of the more useful Node.js APIs, but I encourage you to browse the documentation and discover them for yourself. The documentation is generally good and shows simple examples, but it can be terse in places.
As mentioned, this guide is based on my course Node.js: Novice to Ninja which is available on SitePoint Premium.
Craig is a freelance UK web consultant who built his first page for IE2.0 in 1995. Since that time he's been advocating standards, accessibility, and best-practice HTML5 techniques. He's created enterprise specifications, websites and online applications for companies and organisations including the UK Parliament, the European Parliament, the Department of Energy & Climate Change, Microsoft, and more. He's written more than 1,000 articles for SitePoint and you can find him @craigbuckler.