16 December 2022

"Failed to Enable Personal Hotspot on iPhone" – finally, a solution

Some of my older Macs running High Sierra or Mojave were unable to join the Personal Hotspot of one or even two of my iPhones.

I searched the Apple "Community" Support forums and found several related questions, with lots of "I have the same issue" tags, but with no helpful answer:
  • https://discussions.apple.com/thread/8153026
  • https://discussions.apple.com/thread/8337291
  • https://discussions.apple.com/thread/254471591
So I did figure this out on my own.

If you get this error when trying to connect from your Mac to the Personal Hotspot of your iPhone, try the following for a fix:

 There is a file called "com.apple.airport.preferences.plist" in the folder "/Library/Preferences/SystemConfiguration".

 If you cannot locate it directly, use a tool to find it, e.g. with my program Find Any File (no need to pay for it if you only use it once - just download it).

Then make a back-up of the file, in case you get it wrong below! For instance, copy the file to your Desktop.

Then you have two options:
  1. Delete this file. But that will also reset all your network interface settings, e.g. a LAN connection with a fixed IP address or other configurations you made in System Preferences' Network panel. And after deleting it, you need to restart the Mac so that it sets up the network interfaces again. Then try to connect to the phone again.
  2. If you're more adept, edit the file, e.g. with the app PrefEdit or BBEdit, then find all entries that mention "phone", and remove those. After saving the changes, you should be able to reconnect to the phone without even requiring a restart.
If something went wrong, copy the backed-up file back into the folder (you'll have to enter your admin password).

Good luck!

19 February 2022

What I did in the past 2 years, programming related

I had stopped blogging here in 2019 because I wasn't too happy with the blogging platform I'm using (Blogger). I was looking for alternatives, but with the constraint that I'd be able to keep the existing blog posts, along with their sometimes valuable comments. I gave WordPress a few attempts but am not too happy with its complexity and the rather frequent discovery of new vulnerabilities. I also considered Jekyll, which is nice but can't do comments.

So for now, I'm just staying with Blogger until it dies on its own, probably.

So, here's a brief summary of things I did in the past two years that I had thought of blogging about, but didn't, because all this time I was waiting for a better blogging system to come along.

Excel Diff

I needed to be able to visualize differences when comparing Excel spreadsheets.

I worked with the premise that tables in the sheets have a column or row with an identifying key. The program then compares rows and columns with identical keys, even if they have been moved around between the compared versions, identifying which rows & columns got removed or added, and which cells got changed. It's fairly usable for that kind of data.

I had considered making it public and selling it, but so far the interest seemed rather small, so I never made the effort.

If you, as a programmer, are interested in taking over this project, and if you are familiar with the Xojo development system, contact me.

Repair .textClipping, .webloc and related files restored from git

I store a lot of information in an "Info" folder inside my Documents folder. As I have several computers in use, I like to have those files present on all Macs, and in sync. I don't like using iCloud's automatic sync service for this, so I decided to simply store it all in a git repository.

The problem with git is that it's not made to store arbitrary file data but is mainly meant for plain text files. Because of this, git won't store resource forks and Mac-specific metadata.

This led to the effect that many of the older .textClipping and .webloc files I had in that folder did not restore with any data on the other Macs, because they had their information in the resource fork, which didn't transport over.

But there is a way to work around this: Modern macOS versions store the information for .textClipping and .webloc in their data fork.

So I wrote a program that scans a folder for files that still use the resource fork for data storage and copies the data over to the data fork. It also works in the opposite direction, copying the data fork's contents back to the resource fork. That's mainly needed to make QuickLook work with .textClipping files (apparently, Apple's QuickLook code has not yet been updated to look for the text clipping data in the now-preferred data fork).
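
For illustration, here is a minimal C sketch (not the actual tool) of how such candidate files can be detected: the resource fork is exposed as the "com.apple.ResourceFork" extended attribute, so a file with an empty data fork but a non-empty resource fork is a likely candidate for repair. The actual conversion of the resource contents into the modern data-fork format is file-type specific and only hinted at in the comments.

// Sketch: detect files whose payload still sits only in the resource fork.
// The resource fork is readable as the "com.apple.ResourceFork" extended
// attribute (XATTR_RESOURCEFORK_NAME).
#include <sys/xattr.h>
#include <sys/stat.h>
#include <stdio.h>

static int needs_repair(const char *path) {
    struct stat st;
    if (stat(path, &st) != 0)
        return 0;

    // Size of the resource fork; <= 0 means there is none.
    ssize_t rsrcSize = getxattr(path, XATTR_RESOURCEFORK_NAME, NULL, 0, 0, 0);

    // Empty data fork but non-empty resource fork: such a file comes back
    // empty after a round trip through git and needs its data moved over.
    return (st.st_size == 0 && rsrcSize > 0);
}

int main(int argc, char *argv[]) {
    for (int i = 1; i < argc; i++) {
        if (needs_repair(argv[i]))
            printf("resource-fork only: %s\n", argv[i]);
        // A repair tool would now read the fork with getxattr(path,
        // XATTR_RESOURCEFORK_NAME, buf, size, 0, 0), convert the contained
        // resources into the modern data-fork representation for the file
        // type in question, and write that to the file's data fork.
    }
    return 0;
}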

Finally, the tool also identifies restored Finder Alias files that don't work any more because the information that the file is an Alias got lost in the git operation.

If you're interested in trying this tool, contact me. I still need some feedback to gauge how well it works for others before I release it into the wild.

utis.cc

I set up a website that's supposed to be a central landing point for all things related to Uniform Type Identifiers. It currently points to two projects:

  • nspasteboard.org (Identifying and Handling Transient or Special Data on the Clipboard)
  • DotPathsFileSpec (".paths" file extension format for storing and exchanging lists of file references), see also next chapter

The .paths file format

In my program Find Any File, which, after a search you perform, lists a bunch of files and folders, I added the option to save that list to a text file. Initially I thought to just write them to a .txt file, but then I thought: wait, other apps may want to read that file too, and use it as input for more operations on such files. I also realized that other file search programs have similar needs. So I contacted the author of Scherlokk and proposed using an easily interchangeable file format. That program used a plist format at the time, so it could also store metadata, but the author agreed to my proposal to use a plain text file, and we then came up with a method to keep adding metadata, by using the JSON format.

The result is the .paths file specification.

The idea is that the .paths file extension signals that the file contains a list of file paths, and any app that opens such a file should read those paths and treat them as input files for whatever operation the program performs on a set of given files.

If you are a developer, or a user who thinks it would be beneficial for a macOS app to process such a file list as input or to save a given list to a file, please adopt this specification or suggest it to the app's developer. If you don't need to add metadata, the format is practically nothing more than a bunch of POSIX paths, separated by LFs (or NULs, if you prefer).
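
For illustration, here's a minimal C sketch of a reader for the simple case, i.e. plain LF-separated POSIX paths without the optional JSON metadata (see the specification for the full format):

// Minimal .paths reader: one POSIX path per LF-terminated line, no metadata.
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
    if (argc < 2) { fprintf(stderr, "usage: %s file.paths\n", argv[0]); return 1; }
    FILE *f = fopen(argv[1], "r");
    if (!f) { perror("fopen"); return 1; }

    char line[4096];
    while (fgets(line, sizeof(line), f)) {
        line[strcspn(line, "\n")] = '\0';   // strip the trailing LF
        if (line[0] == '\0') continue;      // ignore empty lines
        // Each remaining line is one POSIX path; hand it to whatever
        // operation your program performs on its input files.
        printf("input file: %s\n", line);
    }
    fclose(f);
    return 0;
}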

Also note that this is the easiest way to pass a bunch of paths that are output by command line tools, even if there are thousands of them, as passing that many paths via the "odoc" AppleEvent (such as via the Terminal command open) will not work.

Be aware of NSDockTilePlugIn collisions

Just a quick heads-up: If you are a developer who includes a Dock Tile Plugin in your app, please make sure that its NSPrincipalClass is not just DockTilePlugIn! If it is, and another app also provides a Dock Tile Plugin with the same class name, they're likely to clash when loaded by the Dock, as the Dock does not, as one might expect, use the plugin's bundle ID to separate and identify the plugins. Instead, they may end up sharing the same space (especially in macOS 10.14 and earlier, where they even share the same process), and one of them will overpower the other, making it stop working.

So, always make sure to use unique class names for any type of plugin. Apple recommends using three-letter prefixes as namespaces in ObjC, for instance (which may still clash with other plugins if they use the same prefix, so you had better integrate your app's name or something more unique into all of a plugin's class names).

Find Any File

My most popular program has gotten quite a few notable improvements in the past two years:

  • Integration with PopClip, Alfred and Keyboard Maestro.
  • Can save and re-open results saved to ".paths" files.
  • Supports searching for Tags, inodes, Date Last Opened, Date Added.
  • Can ignore diacritics in searches (e.g. that even é is found when searching for e)
  • And just now in the works: Scripting! Write your own search rules in Lua or JavaScript (requires v2.3.3, which is currently in beta).

Comments?

If you have comments or questions, write them below or on Twitter (I'm @tempelorg).

30 June 2019

TT's generic programming guidelines

Declare your local variables close to where they're used

Do not declare all your variables at the top. Instead, declare them where you use them first. That makes it easier to see their type and also leads to better locality, avoiding accidental use of stale (old) values.

Bad style:

var i as integer
var item as FileReference
var files() as FileReference

for i = 1 to directory.Count
    item = directory.Item(i)
    if item.isRegularFile then
        files.Append item
    end
next

Better style:

var files() as FileReference

for i as integer = 1 to directory.Count
    var item as FileReference = directory.Item(i)
    if item.isRegularFile then
        files.Append item
    end
next

Working with Booleans

Naming

Functions, properties and variables of boolean type should be named so that statements using them read with proper grammar.

Examples:
  • function isHidden() as boolean
  • var hasVisibleItems as boolean

Avoid testing for true & false

It isn't good style to write boolean tests like this:
if hidden = true then ... // bad style
nor:
if hidden = false then ... // bad style
Instead, following the naming rule above, turn it into a proper sentence:
if isHidden then ... // good style
and:
if not isHidden then ... // good style

Keep function code concise

Avoid having long functions. Ideally, a function (subroutine) should have no more than 20-40 code lines, so that it's easy to look at in a single editor page, without scrolling.

If the code gets longer, try to split it up into subroutines, even if those subroutines only get called once by your shortened main function. And if you manage to name these subroutines in a good way, so that each subroutine's name expresses clearly what it does, your main function will be quite self-explanatory about what it does, without the need for comments.

Exit early, avoiding nested if / else constructs

Compare these alternative ways to code the same result (which is to collect hidden items into an array, returning false otherwise).

Convoluted version (sub-optimal):
if fileReference.isValid then
    if fileReference.isDirectory then
        if fileReference.isHidden then
            hiddenFiles.Append fileReference
            return true
        else
            return false
        end
    else
        return false
    end
else
    return false
end
Alternative, better readable, version:
if not fileReference.isValid then
    return false
end
if not fileReference.isDirectory then
    return false
end
if not fileReference.isHidden then
    return false
end
hiddenFiles.Append fileReference
return true

The latter version is not necessarily shorter - it may even be longer. But by sorting out all the "bad" cases first, it makes clear what's left over in the end. This flow of control is easy to follow because you do not have to scan for the else parts that may follow each if.

Avoid code duplication

Whenever you have the urge to copy some code and modify it for a similar purpose, don't do it! (There are always exceptions, of course.)

Duplicated code can easily lead to mistakes later: when you find an issue with it, you may end up changing only the one copy where you noticed the issue, and forget to also update the other copies that perform similar operations and need the same fix.

So instead of copying lines and then making small changes to the copies, share the code by putting it into a subroutine, and if there are differing details to perform for one case or the other, add a parameter that tells the subroutine what to do. That way, most of the code stays the same and doesn't need to be duplicated.

More on this: DRY

Avoid getting the same values repeatedly

Consider this code that iterates over values in an array:
for i as integer = 0 to items.count-1
    if items(i).isSelected then
        addToResult (items(i))
    end
next
Note that items(i) is fetched twice here. Avoid doing that: not only may it make your code slower, it can also make your code more difficult to alter later. Instead, fetch the value into a local variable and use that, even if it makes the code slightly longer:
for i as integer = 0 to items.count-1
    var item as MyItemType = items(i)
    if item.isSelected then
        addToResult (item)
    end
next

Document what you want to achieve in your code

If you write code to implement any kind of algorithm, such as searching for matching words in a text (string), don't add comments about the obvious things your code does. Instead, write comments that explain what the code is supposed to do, e.g. explain the algorithm you want to express as code. That way, if you make a mistake in your code, you or someone else can later understand what was meant to happen in the code, and can fix it accordingly.

If you do not explain the intent of the code, then it is difficult to tell later whether an unexpected result is a bug or intentional behavior. When that happens, people tend not to touch the questionable code and instead add another copy of it that behaves slightly differently (with the bug fix in it), and soon you have a mess of duplicated code that leads to a difficult-to-maintain project.

If in doubt, ask the fellow programmer who wrote the original code about the intention, and add the documentation once it's clear. Then add your fix, and maybe also a comment on why you fixed it that way, explaining how it changes the old behavior.

Document the behavior of a function

If you write a function, it can also be helpful to state in plain English what you expect the passed parameters to contain and what the result will be. Also: what will happen if they do not meet the expectations?

For example, if you write a function that returns the index of the occurrence of a string in an array of strings, explain what happens if the item is not found at all (you could return -1, the index past the last item, or raise an exception), and what happens if there are multiple possible results (will it return the index of the first or of the last occurrence, or will it do something else to indicate that the result is ambiguous?).
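
Here's a sketch of what that could look like, written in C with hypothetical names:

#include <string.h>

// Returns the index of the first element of 'strings' (of length 'count')
// that is exactly equal to 'needle' (case-sensitive comparison).
// Returns -1 if 'needle' does not occur at all.
// If 'needle' occurs more than once, the index of the FIRST occurrence is
// returned; later occurrences are ignored.
int index_of_string(const char *strings[], int count, const char *needle) {
    for (int i = 0; i < count; i++) {
        if (strcmp(strings[i], needle) == 0)
            return i;
    }
    return -1;   // not found
}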

Avoid side effects

Functions should not have side effects, i.e. they should not modify state outside of their scope. The reason for this is that such side effects are hard to detect and thus difficult to debug.

For instance, if you make a function that calculates something and returns a result, that function should not also change some properties or static vars unless they're solely meant for that function (such as for caching).

If you want to maintain state, pass the variables/objects for keeping the state as (inout, byref) parameters to the function, so that the caller knows that a state has changed, and can decide where to store it.
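
A minimal sketch of the difference, in C with made-up names:

// Side effect (avoid): the function silently updates a global variable.
static int g_totalSize = 0;

void addFileSizeBad(int fileSize) {
    g_totalSize += fileSize;      // hidden state change, invisible at the call site
}

// Better: the caller owns the state and passes it in by reference,
// so the state change is visible where the function is called.
void addFileSize(int *totalSize, int fileSize) {
    *totalSize += fileSize;
}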

Use a version control system for tracking each of your changes

Use git or something like it, and commit your changes frequently, even if you work all on your own.

For instance, if you are fixing a bug, commit that fix as a single commit, so that, if someone later suspects that your bug fix caused new problems, that particular commit can easily be undone to check whether doing so makes the new problem go away.

More

See this older article of mine for some more guidelines.

11 May 2019

Cloning APFS Volumes & Containers ("APFS inverter failed")

(This is an older article that I hadn't published back then because it might not be fully accurate, i.e. the steps may not be applicable. Yet, it contains some valuable information so I decided to publish it now. Read with a grain of salt. If you have corrections, don't hesitate to comment or email me.)

TL;DR

If Disk Utility fails to clone an APFS container, giving "APFS inverter failed" as the error message, a workaround is to copy the partition with the "dd" command into a same-sized partition, then fix the cloned container's UUIDs.

About cloning (copying) entire Mac volumes

Usually, when you want to clone a Mac volume, you'd use Disk Utility's Restore operation. It performs a sector-by-sector copy (while skipping unused sectors). The alternative is to perform a file-by-file copy, as done by 3rd party tools like Carbon Copy Cloner.

A clone operation by copying individual files has several disadvantages over a sector copy:

  • It's much slower.
  • If you're using Finder Alias files, these may not work any more afterwards (that's because they rely on each file's unique ID, and those IDs change when copying files over to the destination).
  • More programs may request re-activation (re-registration).
However, when I recently tried to make an identical copy of my macOS Mojave system from my MacBook Air (2015), copying it to an external SSD, I ran into problems:

After the copy and verification operations have apparently finished, an additional inverter process needs to run on cloned APFS volumes. I do understand that this is necessary when I copy only a single volume out of an APFS container, or copy a volume into a target APFS container without replacing it entirely. But even when I try to clone the entire container, erasing the target, it still wants to run an inversion process - and that makes no sense to me.

Now, the error that I kept seeing is: APFS inverter failed to invert the volume

And I'm not the only one, see here and here.

I tried many things, including First Aid, removal of all snapshots, and running the cmdline tool "asr", which showed me more detailed error messages. Still, no success. I kept getting variations of the same issue.

What I'm showing now is a way to clone a complete APFS container (with all contained volumes) the way it should work.

How to clone an APFS container

Note: This may not work with encrypted volumes. Or it might. I have not tried.

We simply copy the entire partition (which contains the APFS container) sector by sector. (Small disadvantage over Disk Utility's Restore operation: This will also copy unused sectors, so it'll take a bit longer.)

Afterwards, we need to change the UUIDs of the cloned container and its volumes, or the Mac (and especially Disk Utility) may get confused when both the original and the cloned volumes are present on the same Mac.

Perform the sector copy operation

In case you want to copy your bootable macOS system, you will have to start up from a different macOS system first. If you have no other external disk or partition with another macOS system, simply start up from the Recovery system.

When ready, connect the target disk, then start Terminal.app.


Get an overview of our disk names by entering (and always pressing Return afterwards):

diskutil list
Here's a sample output of my Mac that has four partitions, two of which use HFS+ ("AirElCap", "AirData") and the other two use APFS ("AirMojave", "AirHighSierra"):
/dev/disk0 (internal):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                         121.3 GB   disk0
   1:                        EFI EFI                     314.6 MB   disk0s1
   2:                 Apple_APFS Container disk1         40.9 GB    disk0s2
   3:                  Apple_HFS AirElCap                20.1 GB    disk0s3
   4:                 Apple_Boot Boot OS X               134.2 MB   disk0s4
   5:                 Apple_APFS Container disk2         39.8 GB    disk0s5
   6:                  Apple_HFS AirData                 19.9 GB    disk0s6

/dev/disk1 (synthesized):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      APFS Container Scheme -                      +40.9 GB    disk1
                                 Physical Store disk0s2
   1:                APFS Volume AirMojave               20.7 GB    disk1s1
   2:                APFS Volume Preboot                 47.1 MB    disk1s2
   3:                APFS Volume Recovery                512.7 MB   disk1s3
   4:                APFS Volume VM                      2.1 GB     disk1s4

/dev/disk2 (synthesized):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      APFS Container Scheme -                      +39.8 GB    disk2
                                 Physical Store disk0s5
   1:                APFS Volume AirHighSierra           23.2 GB    disk2s1
   2:                APFS Volume Preboot                 20.9 MB    disk2s2
   3:                APFS Volume Recovery                519.0 MB   disk2s3
   4:                APFS Volume VM                      3.2 GB     disk2s4

I plan to clone "AirMojave", so the disk identifier would be disk1, because that is the container for that volume. In your case it may be a different identifier. I will write the commands below using diskS for the source container and diskTsP for the target partition.

Now unmount all volumes of our source container. In Terminal, enter:

diskutil unmountDisk diskS
If this does not succeed, then you either have programs or files open on one of the volumes, or you're trying to unmount the boot volume (in that case, you should have restarted into a different system as explained above). Do not continue if the unmount was not successful, or you're likely to end up with a corrupted clone.
Next, find out the exact size of the source container:
diskutil apfs list diskS
+-- Container disk2 5099518D-36C0-47CC-B034-0CDA52C51CE8
    ====================================================
    APFS Container Reference:     disk2
    Size (Capacity Ceiling):      39838416896 B (39.8 GB)
    Capacity In Use By Volumes:   27042205696 B (27.0 GB)
    Capacity Not Allocated:       12796211200 B (12.8 GB)
Then resize the target partition so that it matches the source container's size exactly:
sudo diskutil resizeVolume diskTsP size
In place of size, use the number that's shown after "Size (Capacity Ceiling):".
Make sure the target partition is unmounted as well:
diskutil unmount diskTsP
This is the command that will perform the copying:
sudo dd bs=1m if=/dev/diskS of=/dev/diskTsP

(See also Adrian's comment below, suggesting to use "rdisk" instead of "disk" for higher speed.)

While dd is doing its work, it won't show any progress on its own. And it can take hours, even days. To check where it's at, type ctrl-T in the Terminal window. That will print a line showing its current progress.

Fix the UUIDs

Enter in Terminal:
sudo /System/Library/Filesystems/apfs.fs/Contents/Resources/apfs.util -s /dev/diskT
You'll likely not get any response from this command that tells you whether it succeeded or failed.

To check whether the UUIDs have been changed successfully, enter in Terminal:
diskutil apfs list
In the output, find your source and target disks, and compare the UUIDs of their container and container volumes. They should all be different. If the UUIDs of source and target volumes show the same code, then the fix did not work. Try again.

Make the system volume bootable

The last step is to make the cloned volume bootable again (in case it was before). For that, you need to mount the Preboot volume, then rename the single folder inside (which is named after the old UUID) to the new UUID of the main volume.

To mount it:
diskutil mount diskTsX
(X stands for the partition number containing the "Preboot" volume)

To show it in Finder:
open /Volumes/Preboot
Now you should see a Finder window with a folder named after the UUID of the original volume you copied. Rename that folder to the new UUID. To verify that you used the correct UUID, open System Preferences, Startup Disk, select the cloned bootable volume and click the Restart button. If it says that it can't bless the volume, you got it wrong. Otherwise, the system should now restart, meaning the UUID was set correctly and the volume is bootable.

Understanding bugs in Xojo, or not getting them fixed

A former employee of Xojo Inc. once met with an Apple Developer Support (DTS) engineer, looking over some code. The Apple engineer saw a note mentioning my name and told the Xojo employee: "You know Thomas Tempelmann? I know him, too. He's the best type of user, the one that debugs a problem so far that he tells you exactly what you're doing wrong, and how to fix it."

I'm not infallible, but I believe I can claim that I have quite some experience and understanding of a lot of things under a computer's hood. After all, I've been doing this for nearly 40 years. I'm not so great with abstract algorithms, but when it comes to writing efficient code or debugging it, I'm surely not the best, but have skills that are well above average.


And I've proven that a lot of times with a development system I really love to use: Xojo, formerly known as REALbasic.


I've been using Xojo (or RB) for about 20 years now. I've been one of the first to write plugins for it, and they were quite popular (the plugins eventually became part of the MBS plugins).


Here are a few examples of bugs and solutions I found in Xojo:


  • Back when we still used 68k CPUs, there was a serious issue that only a few customers had: their apps crashed when they got large. I had some suspicions, looked at the compiled code and soon found the issue - an overflow of a 16 bit offset in a jmp instruction. Worth noting is that the Xojo engineers had previously not been able to find the bug, yet I, without even having the source code, found it within an hour or so - because I had once written a 68k compiler myself and knew what could go wrong. I'd also like to point out that Xojo's current EULA prohibits us from looking at the compiled code the way I did, in order to learn how it works. I later argued against that, even citing this example where I had to do exactly that because no one else was able to find the bug - to no avail. This was not the last time I had to dig into Xojo's code in order to work around a bug when Xojo refused to look into it, and solved the issue for myself, but I can't publish such workarounds any more for the benefit of others because of, well, Xojo's EULA and the threat it encompasses.
  • In 2008, I ran into a very rare case where I'd lose some data when I had tens of thousands of objects in a particular data structure (WeakRefs, IIRC). It turned out that the internal dictionary code did not handle collisions correctly. After I proposed a code change, the reproducible issue was gone. I had to personally pursue this issue down into the code because, again, the responsible engineer (who was, admittedly, not its original author) did not even believe in the problem I described. Regardless, due to some lucky circumstances I was able to get the fix into the framework, to the benefit of all of us.
  • The Xojo IDE's Back (History) button has not worked reliably since 2013, when this IDE was introduced. More often than not, the Back button simply does not go back to previously visited locations. As a proof of concept, I wrote an external program that talks to the IDE via the IDE communication socket, regularly requesting the current location and offering a list with the history. You can then click any history item and the IDE actually jumps back to it. Surprisingly, this works more reliably than the IDE's own back button. Yet, when Xojo CEO Geoff Perlman was recently asked at the MBS conference in Munich about this shortcoming, he insisted that this is a very complex matter that is not easy to solve. I find that hard to believe if even I can do better with an external program.
  • Xojo code can use Threads, but can run them only cooperatively, not concurrently. That's because the runtime functions do not use locking to protect sensitive operations, such as object creation, against interruption by another thread. Thus, Xojo's runtime has its own thread scheduler that uses semaphores to make sure only one Xojo thread runs at any time. Now, there are cases where we users need to use Declare statements to invoke OS-provided functions. Some of them may even call back into our own Xojo functions. But if those callbacks happen on threads that Xojo does not control, this can lead to crashes when our Xojo code then accesses Xojo objects. I came up with a proposal to make this safe, effectively by using locks that suspend the callback thread until Xojo is in a safe state. Apart from the possibility of creating a deadlock (which is under the control of the programmer), I was able to supply a demo project that has yet to be shown to be unstable. Xojo, however, ignores all my explanations and demonstrations and simply keeps telling their users that this can't ever be safe.
  • Related to callbacks, there's also a long-known issue with passing function addresses to the OS. This is done by creating a so-called delegate object via the AddressOf operator. The delegate object can be used inside Xojo like a function variable, i.e. one can store the address of a function in a property and call it later. This even works for object instance methods, i.e. methods that are part of an object and have a "self" reference. This self reference is simply a pointer to the object, which gets passed along when the method is invoked on an object. A delegate stores this object reference and passes it to the function as necessary. However, when passing such a delegate to an OS function (via Declare), Xojo does not pass a pointer to the stub function that sets up the self reference but instead passes the target function address. That means that when the callback is invoked, the self reference is not properly set up, leading to a crash. The issue has been known for a long time, and in the past bug reports of this kind have been closed as "works as designed". Since this could be fixed, the official answer sounds to me like an excuse for "we don't like to deal with it" or "we don't really care".


All this shows that there are a lot of things in Xojo, mostly low level, that could be fixed - but aren't, because of a lack of comprehension. Sadly, in many cases where I offered solutions, even proofs, I hit a wall. I don't understand how a company that specifically caters to developers can be so ignorant of the needs and offerings of its willing customers.

22 April 2019

Performance considerations when reading directories on macOS

(Latest update: 11 May 2019, see end of text)

I'm developing (and selling) a fairly popular file search program for the Mac called Find Any File, or just FAF.

It works differently from Spotlight, the Mac's primary search tool, in that it always scans the live file system instead of using a database. This makes it somewhat slower in many cases, but has the advantage that it looks at every file on the targeted disk (whereas Spotlight skips system files by default, for instance).

My primary goal is to make the search as fast as possible.

Fast search built into macOS


Until recently, this went quite well, because Mac disks (volumes) were formatted in HFS+ (aka Mac OS Extended), and Apple provides a special file search operation (CatalogSearch or searchfs) for these volumes, by which FAF can ask for the file name the user is looking for, and macOS searches the volume's directory by itself and returns only the matching files. This is very fast.

Unfortunately, with Apple's new file system APFS, and the fact that any Mac running High Sierra or Mojave got its startup volume converted from HFS+ to APFS, search performance has decreased by a factor of 5 to 6! Where searching the entire startup disk for a file like "hosts" took just 5 seconds on a fast Mac with an HFS+ volume, it now takes half a minute or more on APFS.

Besides, the old network file server protocol AFP also supports the fast search operation, but only on real Mac servers - some NAS systems pretend to support this as well, but my experience shows that this is very unreliable. The newer SMB protocol, OTOH, does not appear to support searchfs.

Searching the classic way


When the searchfs operation is not available, unreliable or inefficient, FAF falls back to looking at every directory entry itself, looking for matches to the search, then looking at the contents of every subdirectory, and so on. This is called a recursive search (whereas searchfs performs a flat search over all directory entries of a volume).

There are several ways to read these directories. I'll list the most interesting ones:

  • -[NSFileManager contentsOfDirectoryAtURL:includingPropertiesForKeys:options:error:]
  • opendir() & readdir_r()
  • getattrlistbulk()
  • fts_open() & fts_read()

The first is the standard high-level (Foundation) method. It lets you choose which attributes (besides the file name) it shall fetch alongside. This is useful if you want to look at the file sizes, for instance. If you have them fetched along, they get cached in the returned NSURL objects, thereby increasing performance when you call -[NSURL getResourceValue:forKey:error:] later to read a value.

readdir() is a very old UNIX / POSIX function to read a directory's file names, and nothing else, one by one.

getattrlistbulk() is a special Mac BSD function that's an extension to the older getattrlist(). It is supposed to be optimized for faster reading, as it can fetch the entire contents of a directory at once, along with attributes such as file dates and sizes. [NSFileManager contentsOfDirectoryAtURL...] supposedly uses this function, thereby making use of its performance advantage.
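
For reference, the basic getattrlistbulk() loop looks roughly like this - a sketch along the lines of the man page example, listing only the entry names, with most error handling omitted:

// Sketch: list the names in a directory with getattrlistbulk().
#include <sys/attr.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>
#include <stdint.h>
#include <stdio.h>

void list_directory(const char *dirPath) {
    int dirfd = open(dirPath, O_RDONLY, 0);
    if (dirfd < 0) return;

    struct attrlist attrList;
    memset(&attrList, 0, sizeof(attrList));
    attrList.bitmapcount = ATTR_BIT_MAP_COUNT;
    attrList.commonattr  = ATTR_CMN_RETURNED_ATTRS | ATTR_CMN_NAME;

    char buf[256 * 1024];           // each call returns as many entries as fit in here
    for (;;) {
        int count = getattrlistbulk(dirfd, &attrList, buf, sizeof(buf), 0);
        if (count <= 0) break;      // 0 = no more entries, -1 = error

        char *entry = buf;
        for (int i = 0; i < count; i++) {
            uint32_t length = *(uint32_t *)entry;   // total size of this packed entry
            char *field = entry + sizeof(uint32_t);

            attribute_set_t returned = *(attribute_set_t *)field;
            field += sizeof(attribute_set_t);

            if (returned.commonattr & ATTR_CMN_NAME) {
                attrreference_t nameRef = *(attrreference_t *)field;
                // The name's offset is relative to the attrreference's position.
                printf("%s\n", field + nameRef.attr_dataoffset);
                field += sizeof(attrreference_t);
            }
            entry += length;        // advance to the next entry
        }
    }
    close(dirfd);
}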

fts_open() is a long-existing BSD function that is specialized in traversing directory trees. I added this one only after the initial tests, so its discussion below is a bit briefer.
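
For comparison, a minimal recursive name search with the fts functions could look like this sketch (the FTS_XDEV option additionally keeps the traversal from crossing onto other volumes, see the update from 29 Apr 2019 below):

// Sketch: recursively print all files whose name contains the search term.
#include <fts.h>
#include <string.h>
#include <stdio.h>

void search(const char *rootPath, const char *nameToFind) {
    char *paths[] = { (char *)rootPath, NULL };
    // FTS_PHYSICAL: don't follow symlinks; FTS_NOCHDIR: don't chdir() around;
    // FTS_XDEV: don't descend into directories that are on other volumes.
    FTS *fts = fts_open(paths, FTS_PHYSICAL | FTS_NOCHDIR | FTS_XDEV, NULL);
    if (!fts) return;

    FTSENT *ent;
    while ((ent = fts_read(fts)) != NULL) {
        if (ent->fts_info == FTS_F && strcasestr(ent->fts_name, nameToFind))
            printf("%s\n", ent->fts_path);
    }
    fts_close(fts);
}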

Test methods


I've tried to find out which of the various methods of reading directories, looking only for file names, is the fastest. For that, I had to scan the same directory tree with every method separately.

Testing performance is a bit difficult because macOS tends to cache recently read directories for a short while. For instance, the first time I scan a directory tree with 100,000 items, it may take 10s, and when I run the same test again within a few seconds, it'll take only 2s. If I wait half a minute, it may again take 10s. And if I'm searching on a file server, that server may also cache the information in RAM. For instance, my NAS, equipped with hard disks, will be rather loud the first time I search on it, due to the HDs performing lots of seeking, whereas a repeat search will make hardly any noise due to little to no seeking, which also increases the search performance.

Therefore, I performed the tests twice in succession: Once after freshly mounting the target volume (so that the cache was clear) and once again right after. This would give me both the worst and best case performances. I repeated this several times and averaged the results.

I also had to test on different media (e.g. fast SSD vs. slower HD) and formats (HFS+, APFS, NTFS) and network protocols (AFP vs. SMB, from both a NAS and another Mac) because they all behave quite differently.

The Xcode project I used for timing all three scanning functions can be downloaded here.

Test results


Most tests were performed on macOS 10.13.6. The NAS is a Synology DS213j with firmware DSM 6.2.1, connected over 1 GBit Ethernet, and both AFP and SMB tests were made on the same NAS directory. The Terminal cmd "smbutil statshares -a" indicates that the latest SMB3 protocol was used. The remote Mac ran macOS 10.14.4, and the targeted directory on it was on a HFS+ volume so that I could compare performance between AFP and SMB (APFS vols can't be shared over AFP). I also did a few tests on the 10.14.4 Mac, though I only recorded the best case results as the others were difficult to create (I'd have had to reboot between every test and I wasn't too keen on that).

I was expecting that contentsOfDirectoryAtURL would always be as fast as its low-level counterpart getattrlistbulk, whereas readdir would be slower, as it wasn't optimized for this purpose. Surprisingly, this was not always the case. (I did not include the fts method in this run of tests - its results are discussed in a separate chapter below.)

The values show elapsed time in seconds for completing a search of a deep folder structure. The values are only comparable within each line, but not across lines, because the folder contents were different. The exception is the network volumes, where the same folders were used for AFP and SMB.

The lowest numbers in each row mark the best results. The value marked with an asterisk (*) points out an anomaly, discussed below.

                 contentsOfDirectoryAtURL | getattrlistbulk        | opendir/readdir
                 worst case | best case   | worst case | best case | worst case | best case
HD HFS+          12.4       | 2.8         | 12.4       | 2.3       | 12.4       | 2.35
SSD HFS+         4.9        | 2.8         | 4.6        | 2.26      | 4.6        | 2.47
SSD APFS         11.2       | 10.6        | 10         | 6.8       | 8.6        | 3.2
SSD NTFS         28         | 6           | 28         | 6         | 8          | 4.7
10.14 APFS       -          | 12          | -          | 10        | -          | 8
10.14 HFS+       -          | 4.1         | -          | 3.8       | -          | 4
NAS via AFP      5.6        | 2.5         | 4.8        | 2.14      | 5.6        | 2.7
NAS via SMB      15         | 15          | 17         | 15        | 9          | 5.7
Mac via SMB      4.4 *      | 5.4         | 6.8        | 5.5       | 6.5        | 5
Mac via AFP      5.3        | 3.6         | 5.1        | 3.7       | 5.9        | 4.3


Observations

  • HD vs. SSD shows that the initial search takes much longer on HDs, which makes sense because HDs have a higher latency. Once the data is in the cache, though, both are equally fast (which makes sense as well).
  • contentsOfDirectoryAtURL and getattrlistbulk perform equally indeed, just as predicted, with the latter usually being a bit faster once the data comes from the cache.
  • On APFS, NTFS and SMB, readdir() is significantly faster than the other methods, which is quite surprising to me.
  • SMB performance is worse than AFP in nearly all cases (even so, Apple has declared AFP obsolete).
  • When accessing a Mac via SMB, contentsOfDirectoryAtURL is faster than the other methods, but only on the first run (see the value marked with * in the table). Once the caches have been filled, it's slower. I can't make sense of it, but it's a very consistent effect in my tests.

The fts functions


fts_open() / fts_read() are, in most cases, faster than readdir(), contentsOfDirectoryAtURL and getattrlistbulk. Exceptions are the network protocols, where especially the retrieval of additional attributes makes them slower than the other methods.

Fetching additional attributes


When extra attributes, such as file dates or sizes, are needed during the scan, the timing of the various methods changes as follows:

  • For contentsOfDirectoryAtURL and getattrlistbulk, there is little impact if these extra attributes are requested with the function call.
  • For readdir(), fetching additional attributes (through lstat()) turns it into the slowest method.
  • The fts functions are the least affected by getting attributes that are also available through the lstat() function if a local file system is targeted. However, for network volumes via AFP, they become about 20% slower in my tests, whereas getattrlistbulk stays faster.

Differences between macOS versions


When searching the same volumes (both HFS+ and APFS) from Sierra (10.12.6), High Sierra (10.13.6) and Mojave (10.14.4), I measured consistently worse performance on Mojave, meaning that scanning directories got slower in 10.14 vs. 10.13, by about 15%.

Also, getting additional attributes on 10.12, compared to 10.13 and later, takes about twice as long, across all methods. Which could mean that something improved in 10.13 regarding fetching attributes.

Conclusion


It appears that for optimal performance, I need to implement several methods, and select them depending on which file system or protocol I talk to.

Here's my current list of fastest method per file system:

  • HFS+: Always fts
  • APFS: Always fts
  • AFP: Always getattrlistbulk
  • SMB: If no attributes are needed: readdir; otherwise fts or getattrlistbulk

Update on 29 Apr 2019


When traversing a directory tree, one must take care not to cross over into other volumes, which can happen if you encounter mounted file systems along your path, such as when you descend from "/" into "/Volumes".

The safe way to check for this is to determine which volume a folder is on before you dive into it. To identify a volume, get its volume or device ID. One way is to call stat() and check the st_dev value; another is to get the NSURLVolumeIdentifierKey. Or, in the case of fts_read, it's already provided - which adds to its superior efficiency.
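
For example, the stat()-based check could be as simple as this sketch (the function name is made up):

// Before descending into a sub-directory, verify it is on the same volume
// as the folder the search started in, by comparing the device IDs.
#include <sys/stat.h>
#include <stdbool.h>

bool is_on_same_volume(const char *startPath, const char *subdirPath) {
    struct stat startInfo, subdirInfo;
    if (stat(startPath, &startInfo) != 0 || stat(subdirPath, &subdirInfo) != 0)
        return false;
    return startInfo.st_dev == subdirInfo.st_dev;   // same device ID = same volume
}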

My testing shows an unpleasant performance impact, though:

When traversing with contentsOfDirectoryAtURL, calling stat() is less efficient than getting the value for NSURLVolumeIdentifierKey. That makes sense, because stat() fetches more data, which could cause additional disk I/O.

OTOH, the file system layer should know the ID of the volume without the need to perform disk I/O.

Meaning, getting the value for NSURLVolumeIdentifierKey should cost no significant time at all, because the information is known to the upper file system layer even before the request is passed on to the actual file system driver for the particular volume, so the value should be readily available at a much higher level. Nevertheless, fetching this volume ID takes about as much time as getting an actual value from the lowest level, such as a file size or date.

However, when I add fetching this volume ID for every encountered file & folder, the scan time increases by over 30%. Fortunately, one only has to fetch this value for directories, not for files, which gives it a smaller overall impact. Still, the performance of this could be better if Apple engineering were to look into it, I believe. After all, identifying the volume ID is needed by almost any directory scanner.

Update on 11 May 2019


When discussing my findings on an Apple forum (actually, on one of the few remaining Apple mailing lists), Jim Luther suggested trying enumeratorAtURL. And indeed, this method does better than any of the others, at least in my tests on local disks, both on HFS+ and APFS. Like fts_read, it takes care of staying on the same volume, so I do not have to check the volume ID myself.

I have updated my test project with the use of this function.

Comments, concerns?


Feel free to download the Xcode project and run your own tests.

Comments are welcome here or on Twitter to me: @tempelorg