Bring back struct dirent->d_namlen

Most UNIX-like operating systems built around the C programming language adhere to two standards: POSIX.1 and the Single UNIX Specification. The specific part we’re interested in for this article is the structure of a directory entry (“struct dirent”) as specified in dirent.h. The only field of a dirent that’s guaranteed to exist is the actual file name of the entry (d_name), but many other optional dirent members exist out there. The inode or serial number of the entry (d_ino, or d_fileno on BSD) is technically an X/Open System Interface (XSI) extension but almost everything in use today provides it. d_off, d_reclen, and d_type are all available on most UNIX-like systems. The variances in what dirent members are supported by each compilation environment are a notable source of compilation failures, portability issues, and general confusion.

What we’re interested in today is a lesser-known dirent member that is available on BSD systems, QNX (under a slightly different name), and not much else: d_namlen, the length of d_name. SUS doesn’t specify it, and neither does POSIX.1 or Android’s Bionic C library. glibc and some other C libraries provide a macro _DIRENT_HAVE_D_NAMLEN to discover that it’s not supported on most systems. Linus Torvalds himself has said that” d_namlen should go away” while “d_reclen actually makes sense.” It’s this claim combined with my recent experience with libjodycode that has motivated this article.

Linux, POSIX, and SUS have all shunned d_namlen. The logic often seen for its rejection is “just do strlen(dirent->d_name) instead because that’s all it is anyway” and the most rationale I’ve seen has come from Linus Torvalds in 1995:

I personally would like to totally forget about “d_namlen”, for a couple of reasons:
– it’s not POSIX. Thus program which uses it is broken.
– SVR4 doesn’t seem to have it, so programs that use it are doubly broken.
– it’s useless. Any broken program which uses it can trivially be altered to use “strlen(dirent->d_name)” instead.
…Compared to d_namlen, d_reclen actually makes sense.

This isn’t good enough for me. Standards like POSIX define what shall be present but generally don’t prohibit providing more. SUSv2 doesn’t have d_off, d_reclen, d_namlen, or d_type. The same is true for POSIX.1-2017. That leaves the third point: “it’s useless.” This is objectively incorrect. I’m here to argue that we should bring back d_namlen and enjoy the improvement in software performance that it can bring to the table.

Rationale for resurrecting d_namlen

How libjodycode is bringing it back

The original idea to pass file name length from readdir() calls back into applications came to me while working on the Windows side of libjodycode. The Windows Unicode support requires that file names read with e.g. FindFirstFileW() have their lengths counted for allocation operations. Why not pass that completed work into jdupes? Why should jdupes always have to use strlen(dirent->d_name), duplicating the work we’ve already done in libjodycode? Exploring this idea is how I discovered d_namlen and decided to include it in the Windows definition of a libjodycode dirent structure so it could be passed along. Most of the Linux/BSD/macOS side of libjodycode functions as a pass-through; that is, jc_readdir() just calls readdir() and uses the existing struct dirent definitions for whatever system it’s built on. Adding d_namlen would require tons of extra data copying that would hurt far more than having d_namlen would help. Even worse: Linux provides no equivalent, so d_namlen would be calculated with strlen() even if not used later.

Enter jc_get_d_namlen()! This new function included beside jc_readdir() allows a libjodycode program to extract the length of d_name in the most efficient way possible on the platform. On Windows it takes advantage of the d_namlen as provided by JC_DIRENT. On BSD and macOS it uses d_namlen already provided by struct dirent.

Remember that 1995 opinion by Linus about d_reclen making sense and how I said he was wrong? On Linux, structs are padded to 4- or 8-byte boundaries for efficiency, so doing some math against d_reclen only gives you the allocated size of the name, not the actual name length. Fortunately, this still makes it possible to skip over part of the name without checking it. In the absence of d_namlen provided by either the OS or a JC_DIRENT, the d_reclen size is used to calculate a skip count, then perform strlen() only on the last few bytes.

Of course, if d_reclen and d_namlen are both unavailable, jc_get_d_namlen() simply calls strlen() without any other work.

Synthetic benchmarks on Linux non-recursively running jc_readdir() 20,000 times against /usr/include and using write() to print the contents show that the jc_get_d_namlen() code is up to 13% faster than using strlen() directly, with Valgrind showing a similar drop in total CPU instructions executed. The worst performance boost I managed to achieve in all of my benchmarking was 0.7%. BSD/macOS and Windows should see even larger performance improvements since d_namlen is directly available and requires none of the work behind the d_reclen skip. I encourage anyone reading this and writing Linux C programs to steal my d_reclen skip code and see how much of a difference it makes.

The moral of the story is that only a fool duplicates their effort just to end up in the exact same place as the first time around.

News: working toward jdupes 2.0

I have wanted to rewrite jdupes for a very long time. Some of the decisions inherited from fdupes have been a real pain when trying to add features. Unfortunately, every attempt at a comtoreplete rewrite was too daunting of a task, and several major overhauls have been started and subsequently abandoned. One of them has finally stuck, though, and it’s well on the way to becoming worthy of the version 2.0 milestone.

The biggest goal I’ve had is to replace the “file tree” data structure with something more flexible; thus, the size tree was born. There is no way for two files to be identical if their sizes are different. Grouping files into sizes before further work makes a lot of sense! I’ve also been replacing code containing “knowledge” of the actual structures behind file information storage with more generic “get next thing” style interfaces that are more flexible in nature.

One of the long-term goals is to parallelize jdupes. The widespread availability of multi-core processors, solid-state storage, and RAID storage means that a lot of opportunities to speed things up are lost in the current simple single-threaded operation paradigm. Abstracting out things in the right way will make it far easier to split work into multiple threads where it makes sense to do so. A specific case where a ton of performance is being left on the table is when files are being compared that are on different disks; a close second is exploiting the internally parallel nature of modern SSDs to perform parallel I/O on the same SSD.

As of this news post, I’m in the process of writing a “query” interface that will give other code modules a standard way to acquire information about discovered duplicate sets and make better decisions. The file tree information model made it very difficult to add features such as hard linking to the files with the highest hard link count first or intelligently establishing hard links when duplicates exist on different volumes. This rigidity led to a long-standing problem with reproducible builds in Debian that I couldn’t solve very easily. The only real solution to the problem of reproducible results was to nuke the file tree paradigm and the tree-specific code in all of the action modules in favor of a new system that allows for querying and sorting the data properly after duplicate scanning is done.

News: fix for v1.27.3, libjodycode 4 work

I’ve backported a fix in the current development version of jdupes to v1.27.3 which avoids losing the existing hash database on a write failure:

This fix writes out the in-memory hash database to a new temporary file, then removes the existing database and renames the temporary file to the original file name. If anything goes wrong during creation of the new database then the original database is not overwritten or deleted. I ran into this issue because the disk I was deduplicating ran out of disk space and the hash database file was destroyed as a result. I’ve emailed the patch to several jdupes distribution package maintainers but I can’t possibly find everyone. This patch should be applied for anyone who is distribution v1.27.3 anywhere.

I’ve been working on a major update to libjodycode and I’ve converted most of the file management functions called by jdupes that need Windows-specific support into libjodycode wrapper functions. Most of the functions simply pass through the call on non-Windows platforms. I’m hoping to have a few more basic functions ready to go before libjodycode 4 is officially released.