Architecture on macOS with 2 threads

Hello. It's been a while since I've posted here, but I've just run into an issue I'd be grateful for some advice on.

I started a new project using the same overall architecture as a game I made some years ago. It uses Core Video's display link, CVDisplayLink (https://developer.apple.com/documentation/corevideo/cvdisplaylink-k0k), for v-sync, which should give a nice stable frame rate matched to the display's refresh (60fps on a typical 60Hz display).

This results in two threads that are running the app (not including auxiliary worker threads): the main thread that starts the application, and then the display link thread. For the most part, this was fine and worked well for my game. One issue I had to deal with, though, was input handling, since the method I used for that called a macOS API that must be used on the main thread only (NSApp.nextEvent(...)).

The issue was that the main thread and the display link thread would run at different rates, which wasn't going to work for input handling, especially with input buffering. So in order to make the two threads run at the same rate, I used a condition variable to make them run in lock-step. This actually worked out really well, and I had no issues with it in my game.

In my new project, however, I'm using Dear ImGui for the UI, and it has to be updated from the main thread. So now I have two views in my application window: one is updated from the display link thread, and the ImGui view is updated on the main thread. While it is working, and for the most part seems perfectly fine, the two views share data, which will inevitably require some kind of synchronization (i.e. the ImGui view will update data that is read by the view on the display link thread).

Ideally I would like to avoid this by having both views updated by the main thread, so I'm curious about other options to make this work.

One thought I've had is to move all work onto the main thread and use the display link thread solely as a "driver" of the main thread (i.e. use it to control its rate/speed). Perhaps I can still use a condition variable for that. Are there any other options, or better ways to do this?
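To make the "driver" idea concrete, here is a minimal sketch in which a background thread only paces the calling thread and all per-frame work stays on the caller. It is an illustration under assumptions, not the actual app code: `FrameDriver` and `runFrames` are hypothetical names, the background thread is a stand-in for the display link callback, and a `DispatchSemaphore` is used instead of a condition variable.

```swift
import Foundation
import Dispatch

// Hypothetical stand-in for the display link: a background thread that
// "ticks" at a fixed period and does nothing except wake the main loop.
final class FrameDriver {
    private let tick = DispatchSemaphore(value: 0)
    private let period: TimeInterval

    init(period: TimeInterval) {
        self.period = period
    }

    // Emit `tickCount` ticks from a background thread.
    func start(tickCount: Int) {
        let tick = self.tick
        let period = self.period
        let thread = Thread {
            for _ in 0..<tickCount {
                Thread.sleep(forTimeInterval: period)
                tick.signal() // In a real app this would be the display link callback.
            }
        }
        thread.start()
    }

    // Block the caller (the main thread) until the next tick.
    func waitForTick() {
        tick.wait()
    }
}

// All per-frame work runs on the calling thread; the driver only paces it.
func runFrames(_ frameCount: Int, period: TimeInterval, work: (Int) -> Void) -> Int {
    let driver = FrameDriver(period: period)
    driver.start(tickCount: frameCount)
    var processed = 0
    for frame in 0..<frameCount {
        driver.waitForTick()
        work(frame) // Events, ImGui, and rendering would all go here.
        processed += 1
    }
    return processed
}

let processed = runFrames(5, period: 0.005) { _ in }
print("processed \(processed) frames")
```

One design consequence worth noting: a semaphore counts ticks, so a main thread that falls behind will work through the backlog rather than skip frames. Whether that is wanted depends on the app.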

Here is some code to illustrate the architecture I have right now. This is the main.swift file, which starts the application:

let cocoaApp = FSCocoaApp.initApp()

let window = MinuetWindow(title: "Minuet")

window.startDisplayLink()
window.makeKeyAndOrderFront(nil)

var running = true
while running {
    cocoaApp.processEvents() // Calls NSApp.nextEvent(...)
    if cocoaApp.isKeyPressed(keycode: Keycodes.vkEscape.rawValue) {
        running = false
    }
    
    window.syncWithMainThread()
    cocoaApp.resetInput()
}

window.stopDisplayLink()

cocoaApp.shutdown()

The window's syncWithMainThread() method updates the UI/ImGui view, and waits on a condition variable to be signalled by the display link thread:

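func syncWithMainThread() {
    // Update the ImGui view on the main thread.
    uiView?.update()
    
    // Block until the display link thread signals the end of its frame.
    pthread_mutex_lock(&frameLock)
    pthread_cond_wait(&frameEnd, &frameLock)
    pthread_mutex_unlock(&frameLock)
}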

The display link thread calls this next method every time it needs a frame, updating the main view and then signalling the condition variable for the main thread:

private func getFrame(outputTime: UnsafePointer<CVTimeStamp>) -> CVReturn {
    // Convert host clock ticks since the last frame into seconds.
    let timeStamp = outputTime.pointee
    let dt = Double(timeStamp.hostTime - lastFrameTime) / CVGetHostClockFrequency()
    lastFrameTime = timeStamp.hostTime
    
    // Update the main view with the latest input.
    let input = FSCocoaApp.shared.getInput()
    minuetView?.update(input: input, dt: Float(dt))
    
    // Wake the main thread waiting in syncWithMainThread().
    pthread_mutex_lock(&frameLock)
    pthread_cond_signal(&frameEnd)
    pthread_mutex_unlock(&frameLock)
    
    return kCVReturnSuccess
}
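One caveat with a bare condition wait: POSIX allows spurious wakeups from pthread_cond_wait, and a pthread_cond_signal sent while no thread is waiting has no effect. The usual remedy is to pair the condition variable with a predicate. The sketch below shows that pattern with a frame counter; `FrameSignal` and its methods are hypothetical names, and Foundation's `NSCondition` is swapped in for the raw pthread calls to keep the example short.

```swift
import Foundation

// Hypothetical helper, not part of the code above: pairs the condition
// variable with a frame counter so every wakeup has a predicate to check.
final class FrameSignal {
    private let condition = NSCondition()
    private var frameCount: UInt64 = 0 // Guarded by `condition`.
    private var lastSeen: UInt64 = 0   // Guarded by `condition`.

    // Display link side: publish that one more frame has elapsed.
    func signalFrame() {
        condition.lock()
        frameCount += 1
        condition.signal()
        condition.unlock()
    }

    // Main thread side: block until at least one new frame has been published.
    // Returns how many frames elapsed since the previous call.
    func waitForFrame() -> UInt64 {
        condition.lock()
        defer { condition.unlock() }
        while frameCount == lastSeen { // The loop guards against spurious wakeups.
            condition.wait()
        }
        let elapsed = frameCount - lastSeen
        lastSeen = frameCount
        return elapsed
    }
}

let frameSignal = FrameSignal()
// Two frames are signalled before anyone waits; the counter remembers them.
frameSignal.signalFrame()
frameSignal.signalFrame()
let elapsed = frameSignal.waitForFrame()
print("frames elapsed: \(elapsed)") // prints "frames elapsed: 2"
```

Note the behavioral difference: if a tick arrives while the main thread is still busy, this version returns immediately on the next wait and reports the number of elapsed frames, whereas the code above blocks until the following signal.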

I've done a bit more experimenting on how to synchronize the two threads in a way that would eliminate shared data between them. As I alluded to in the original post, one idea I had was to use the display link thread only to control the rate of the main thread, and have them "ping pong", or run in lock-step. This was the basic idea in pseudocode:

func mainThread() {
    waitOnConditionVariable()
    
    doMainThreadWork()
    
    signalConditionVariable()
}

func displayLink() {
    signalConditionVariable()
    
    waitOnConditionVariable()
}

The problem I was trying to solve here is the case where the main thread ends up taking longer than the display link. In other words, the display link thread runs at a stable 60Hz, but what happens if the main thread takes longer? Shouldn't the display link thread know about that? At least that was my thinking at the time.

However, as one might expect, I ran into frequent deadlocks (about 50% of the time), especially when it came to starting the threads. While there is certainly work that could be done to prevent the deadlocks, I didn't feel the risk and the effort were worth it. I like simpler solutions. :)
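For reference, one plausible cause of such deadlocks (a guess, since the deadlocking code isn't shown) is a lost wakeup: a condition variable does not remember a signal sent before the other thread starts waiting, so whichever thread arrives late can wait forever. A ping-pong built on two counting semaphores avoids that particular failure, because a semaphore retains its signal. The sketch below is only an illustration; `EventLog` and `runLockStep` are hypothetical names, and a plain background thread stands in for the display link.

```swift
import Foundation
import Dispatch

// Small thread-safe log so the interleaving can be inspected afterwards.
final class EventLog: @unchecked Sendable {
    private let lock = NSLock()
    private var entries: [String] = []

    func append(_ entry: String) {
        lock.lock()
        entries.append(entry)
        lock.unlock()
    }

    var snapshot: [String] {
        lock.lock()
        defer { lock.unlock() }
        return entries
    }
}

func runLockStep(frames: Int) -> [String] {
    let frameStart = DispatchSemaphore(value: 0) // ticker -> main: "begin a frame"
    let frameDone = DispatchSemaphore(value: 0)  // main -> ticker: "frame finished"
    let log = EventLog()

    // Stand-in for the display link thread.
    let ticker = Thread {
        for frame in 0..<frames {
            log.append("tick \(frame)")
            frameStart.signal() // Remembered even if main is not waiting yet.
            frameDone.wait()    // Do not tick again until main has finished.
        }
    }
    ticker.start()

    // The calling thread plays the role of the main thread.
    for frame in 0..<frames {
        frameStart.wait()
        log.append("work \(frame)")
        frameDone.signal()
    }
    return log.snapshot
}

let events = runLockStep(frames: 3)
print(events)
```

Because each side blocks on its own semaphore, the startup order of the two threads no longer matters, and the log strictly alternates between "tick" and "work" entries.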

So what I ended up doing was to essentially keep what I had before, and just move work I was doing on the display link thread to the main thread:

func mainThread() {
    doMainThreadWork()
    
    waitOnConditionVariable()
}

func displayLink() {
    signalConditionVariable()
}

The idea here is that even if the main thread is slow, it should end up running at a whole multiple of the display link's period (i.e. roughly 16.7ms, 33.3ms, 50ms, etc.), so it should still be more or less synchronized with the display refresh rate. Here is a small snippet of the results of a timing test I ran where the main thread has work that can easily run at 60Hz:

"[display link] dt = 0.016680414 s"
"[main thread] update work, dt = 0.016754564 s"
"[display link] dt = 0.016680414 s"
"[main thread] update work, dt = 0.016655273 s"
"[display link] dt = 0.016680414 s"
"[main thread] update work, dt = 0.016653811999999997 s"
"[display link] dt = 0.016680414 s"
"[main thread] update work, dt = 0.016668004 s"
"[display link] dt = 0.016680414 s"
"[main thread] update work, dt = 0.016696715999999997 s"
"[display link] dt = 0.016680414 s"
"[display link] dt = 0.016680414 s"
"[main thread] update work, dt = 0.033617309 s"

So the display link thread is extremely stable, and the main thread follows it very closely, but it missed one frame. Since the main thread had practically no work to do in this test, I can only assume this has something to do with the thread's priority or scheduling?

If I add an arbitrary sleep to the main thread to simulate work that takes > 16ms, I get these results:

"[display link] dt = 0.016680414 s"
"[display link] dt = 0.016680414 s"
"[main thread] update work, dt = 0.033330865 s"
"[display link] dt = 0.016680414 s"
"[display link] dt = 0.016680414 s"
"[main thread] update work, dt = 0.033367336000000004 s"
"[display link] dt = 0.016680414 s"
"[display link] dt = 0.016680414 s"
"[main thread] update work, dt = 0.033352483999999995 s"
"[display link] dt = 0.016680414 s"
"[display link] dt = 0.016680414 s"
"[main thread] update work, dt = 0.033612915 s"
"[display link] dt = 0.016680414 s"
"[display link] dt = 0.016680414 s"
"[display link] dt = 0.016680414 s"
"[main thread] update work, dt = 0.049857011 s"

Again, very close, apart from missing one frame for some reason, but the main thread does essentially run at a multiple of the display link's rate.

I do wonder if the slight deviation of the "dt"s between the two threads would affect v-sync at all since it isn't quite as stable as the dedicated display link thread. I haven't done any substantial tests on it, but I think what I have is good enough for my application.
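If the jitter in the main thread's dt ever matters for simulation or animation, one common mitigation is to snap the measured value to the nearest whole multiple of the refresh period. This is a sketch of that idea under assumptions, not something tested in this app: `frameCount` and `snappedDelta` are hypothetical helpers, and the cap of 4 frames is an arbitrary choice.

```swift
import Foundation

// Number of whole refresh periods that best matches a measured frame time,
// clamped to the range 1...maxFrames.
func frameCount(measured: Double, refreshPeriod: Double, maxFrames: Int = 4) -> Int {
    let frames = Int((measured / refreshPeriod).rounded())
    return min(max(frames, 1), maxFrames)
}

// Measured frame time snapped to a whole multiple of the refresh period.
func snappedDelta(measured: Double, refreshPeriod: Double, maxFrames: Int = 4) -> Double {
    let frames = frameCount(measured: measured, refreshPeriod: refreshPeriod, maxFrames: maxFrames)
    return Double(frames) * refreshPeriod
}

// Values taken from the timing logs above.
let refreshPeriod = 0.016680414
for measured in [0.016754564, 0.033617309, 0.049857011] {
    let frames = frameCount(measured: measured, refreshPeriod: refreshPeriod)
    let snapped = snappedDelta(measured: measured, refreshPeriod: refreshPeriod)
    print("measured \(measured) s -> \(frames) frame(s), snapped dt \(snapped) s")
}
```

With the logged values, the three samples map to 1, 2, and 3 frames respectively, so the update step sees exact multiples of the display link's dt instead of the slightly noisy measurements.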

I'd be curious to know if anyone has any additional insight into this approach, or can see any potential issues with it? Although this seems like a pretty good solution for my app at this time, are there better or more robust solutions?