Mr4thDimention
Allen Webster
265 posts
2 projects

Heyo

#10853 Windows File Handling Problems
6 months, 2 weeks ago

I've been failing to solve a really basic problem for two years now.

The two main features I have been trying to get, plus one constraint on them:

1. I need to maintain a set of files being edited that I can quickly query to see what is loaded and to get the buffer data structures associated. I need to be able to add and remove things from this set as the user loads files and kills buffers.

2. I need to get a notification any time one of the files in the set is edited and saved by an application other than my own.

3. I do not care if the system can tell that two different file names are the same file because of links/subst directories, but I do not want feature 1 or 2 to fail because of such a problem.

The various approaches I have taken that have all failed:

Approach 1.
At first I just stored a hash table of the loaded files, hashing them by name. In order to get file changes I used ReadDirectoryChangesW on a separate thread.

This fails because ReadDirectoryChangesW reports files by name, but that name does not necessarily match the name that was used as the key in the hash table. I set up a canonical way of representing names by making them all lower case except for the drive letter. It still failed whenever files were loaded from a subst'ed directory, which I and many of my users use a lot.
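For illustration, here is a minimal sketch of a name-keyed lookup that ignores case, assuming ASCII-only paths. The names `path_key_hash` and `path_key_equal` are hypothetical, and FNV-1a is just one convenient hash, not necessarily what the editor used. Folding case inside the hash and the comparison means the stored key never has to be rewritten:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold one byte to lower case. ASCII only (assumption: no Unicode folding). */
static unsigned char fold_ascii(unsigned char c) {
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c - 'A' + 'a') : c;
}

/* FNV-1a over the case-folded bytes, so "W:\Code\Main.cpp" and
   "w:\code\main.cpp" hash to the same value. */
uint64_t path_key_hash(const char *path) {
    assert(path != NULL);
    uint64_t h = 14695981039346656037ULL;   /* FNV-1a 64-bit offset basis */
    for (; *path; ++path) {
        h ^= fold_ascii((unsigned char)*path);
        h *= 1099511628211ULL;              /* FNV-1a 64-bit prime */
    }
    return h;
}

/* Case-insensitive equality, used to confirm a match after hashing. */
int path_key_equal(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    while (*a && *b) {
        if (fold_ascii((unsigned char)*a) != fold_ascii((unsigned char)*b)) return 0;
        ++a;
        ++b;
    }
    return *a == *b;   /* equal only if both strings ended together */
}
```

This handles the case mismatch only. A subst'ed name and its real name are still different strings and still land on different keys, which is exactly the failure described above.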

Approach 2.
I tried getting hashes for files by combining nFileIndexHigh and nFileIndexLow to make unique identifiers for each file in the table.
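A minimal sketch of that combination, assuming the three values have already been copied out of a BY_HANDLE_FILE_INFORMATION struct. The `FileIdentity` struct and the function names are hypothetical. The volume serial number is included because the file index is only meaningful within one volume:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the BY_HANDLE_FILE_INFORMATION fields that matter here. */
typedef struct {
    uint32_t volume_serial;   /* dwVolumeSerialNumber */
    uint32_t index_high;      /* nFileIndexHigh */
    uint32_t index_low;       /* nFileIndexLow */
} FileIdentity;

/* Join the two 32-bit halves into one 64-bit file index. */
uint64_t file_index_64(uint32_t high, uint32_t low) {
    return ((uint64_t)high << 32) | (uint64_t)low;
}

/* Two identities name the same file only if volume and index both match. */
int same_file(FileIdentity a, FileIdentity b) {
    return a.volume_serial == b.volume_serial &&
           file_index_64(a.index_high, a.index_low) ==
           file_index_64(b.index_high, b.index_low);
}

/* Map an identity to a hash-table slot; bucket_count must be non-zero. */
size_t identity_bucket(FileIdentity id, size_t bucket_count) {
    assert(bucket_count > 0);
    uint64_t h = file_index_64(id.index_high, id.index_low);
    h ^= (uint64_t)id.volume_serial * 0x9E3779B97F4A7C15ULL;   /* mix in the volume */
    return (size_t)(h % bucket_count);
}
```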

This turned out to have a lot of problems. In particular, the file index can change because of the way applications often save files: making a temp, deleting the original, and renaming the temp. According to some documentation out there, the index can change for other reasons too. So the table hashed by this method was completely unusable once the index was lost.

Maybe this could be improved by fixing the table whenever an event happens on the file that changes its index, but in order to make that change the application needs the original id, and so it needs a way to look file ids up by name anyway. So it brings me right back to the original problem in Approach 1 of not having a way to store a table of file names because the names can be inconsistent.

Approach 3.
I tried abandoning the hash table and instead getting the file index every time I want to check for equality between two names. While this gave perfectly correct behavior, the need to query the set is high enough that it is too slow to keep.

Approach 4.
I tried opening file handles and keeping them open with some sort of read/write sharing settings in the hopes of being able to skip the CreateFile call which would speed it up pretty well, but that fails because read/write sharing settings apparently don't do exactly what I was expecting.

Approach 5.
Finally I tried building a system to handle subst and just ignore the problem with link files. This way I can compare files by their names instead of by indexes, which skips all the file creating slowness and allowed me to go back to using a hash table. To do this I just queried all the drive letters at startup to find how they were subst'ed. Then in the routine for creating canonical file names I replaced subst'ed directories with their C:\* equivalent. This worked, except that for some users querying the drives would cause a crash at startup for reasons I was never able to understand.

Thanks to anyone who can clear this mess up for me!
ratchetfreak
294 posts
#10854 Windows File Handling Problems
6 months, 2 weeks ago Edited on Feb. 10, 2017, 10:40 p.m.

Have you tried periodically fstat-ing (or calling GetFileAttributesEx or GetFileTime on) each file you have open, alongside the ReadDirectoryChangesW? This would guarantee that you pick up changes eventually.

This is especially nice on refocus of the application and after user actions that could change the files (like running a command).
Mr4thDimention
Allen Webster
#10855 Windows File Handling Problems
6 months, 2 weeks ago

Yes I've done stuff like supplementing ReadDirectoryChangesW with rolling through files to check their last save times, but the ReadDirectoryChangesW doesn't do any good if the filename issue is not resolved which is what I'm really getting at here.

One option is to abandon ReadDirectoryChangesW and just manually check file save times, which I used to do, and it was easier to get that to work properly. But it takes a whole lot more time. I could just trigger mass scans on refocus and after completing calls to commands. Which is what I might end up doing.
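A rough sketch of such a mass scan, written against the portable `stat` call rather than the Win32 equivalents so that it is easy to try anywhere. The `TrackedFile` struct and `scan_tracked_files` are hypothetical names, and treating a missing file as "changed" is an assumption about what the caller wants:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

typedef struct {
    const char *path;    /* name the file was loaded under */
    time_t last_mtime;   /* modification time seen at load or at the last scan */
    int changed;         /* set when the file on disk looks different or is gone */
} TrackedFile;

/* Scan every tracked file once; returns how many were flagged as changed. */
size_t scan_tracked_files(TrackedFile *files, size_t count) {
    assert(files != NULL || count == 0);
    size_t flagged = 0;
    for (size_t i = 0; i < count; ++i) {
        struct stat st;
        if (stat(files[i].path, &st) != 0) {
            files[i].changed = 1;            /* missing or unreadable */
            ++flagged;
        } else if (st.st_mtime != files[i].last_mtime) {
            files[i].last_mtime = st.st_mtime;   /* remember the new time */
            files[i].changed = 1;
            ++flagged;
        }
    }
    return flagged;
}
```

Because each file is looked up by the same name it was loaded under, this sidesteps the name-matching issue entirely, at the cost of one query per tracked file per scan.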
mmozeiko
Mārtiņš Možeiko
1441 posts
1 project
#10857 Windows File Handling Problems
6 months, 1 week ago

How different are the file names reported by ReadDirectoryChangesW? Couldn't you simply construct the full path to the file, convert it to lowercase (because NTFS is case insensitive) and store that as the hash key?
Mr4thDimention
Allen Webster
#10858 Windows File Handling Problems
6 months, 1 week ago

With subst directories I can have "W:\project\code\main.cpp" be the name I used to load the file, and the name that is hashed and stored, but "C:\work\project\code\main.cpp" is the name reported by ReadDirectoryChangesW.

I suppose you're suggesting I turn "W:\project\code\main.cpp" into "c:\work\project\code\main.cpp" and store that in the hash table as the "canonical" name?
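A rough sketch of what that translation could look like, assuming a table of subst mappings has already been built somehow (for example at startup). `SubstEntry` and `canonical_path` are hypothetical names, the lowercasing is ASCII-only, and a subst that points at another subst'ed drive is not handled:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    char drive;           /* subst'ed drive letter, e.g. 'W' */
    const char *target;   /* real directory it maps to, e.g. "C:\\work" */
} SubstEntry;

static char lower_ascii(char c) {
    return (c >= 'A' && c <= 'Z') ? (char)(c - 'A' + 'a') : c;
}

/* Writes the canonical form of `path` into `out`: subst prefix expanded,
   everything lower-cased. Returns 0 on success, -1 if `out` is too small. */
int canonical_path(const char *path, const SubstEntry *table, size_t table_count,
                   char *out, size_t out_size) {
    assert(path != NULL && out != NULL);
    const char *prefix = "";
    const char *rest = path;
    if (path[0] != '\0' && path[1] == ':') {
        for (size_t i = 0; i < table_count; ++i) {
            if (lower_ascii(table[i].drive) == lower_ascii(path[0])) {
                prefix = table[i].target;
                rest = path + 2;   /* skip "W:" and keep the "\project\..." tail */
                break;
            }
        }
    }
    size_t prefix_len = strlen(prefix);
    size_t rest_len = strlen(rest);
    if (prefix_len + rest_len + 1 > out_size) return -1;
    memcpy(out, prefix, prefix_len);
    memcpy(out + prefix_len, rest, rest_len + 1);   /* copies the terminator too */
    for (char *p = out; *p; ++p) *p = lower_ascii(*p);
    return 0;
}
```

With a table entry mapping W to "C:\work", both "W:\project\code\main.cpp" and "C:\work\project\code\main.cpp" come out as "c:\work\project\code\main.cpp", so either name finds the same hash table entry.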
Quarter
stephen goglin
20 posts

aka Quartertron

#10859 Windows File Handling Problems
6 months, 1 week ago

I am also doing some ReadDirectoryChangesW stuff in a side project.

If you can give me an exact-ish reproducer of why #2 fails I'll give it a look see.

Mr4thDimention
Allen Webster
#10860 Windows File Handling Problems
6 months, 1 week ago

I don't have any code demonstrating the approach available right now. If you do try it, see what happens when you edit and save a file in Visual Studio. You should find that the hash of the file changes, and thus future queries into the table suggest that the file is not loaded even though it is.
mmozeiko
Mārtiņš Možeiko
#10861 Windows File Handling Problems
6 months, 1 week ago Edited by Mārtiņš Možeiko on Feb. 11, 2017, 2:05 a.m.

Ah, subst'ed directories. Not sure how the notifications interact with that, I haven't looked into that before. But yeah, before calculating the hash you could get the real location of the file and hash that, so it will match the string retrieved by ReadDirectoryChangesW. I think that would solve this issue. There should be a function that does this (maybe GetFinalPathNameByHandle or GetFileInformationByHandleEx+FileNameInfo).
Quarter
stephen goglin
#10862 Windows File Handling Problems
6 months, 1 week ago

After a little investigation using my own code, process monitor and VS, I think I know what VS does.

Take this with a grain of salt since I'm no expert and I'm a little tipsy.

When you open a file, watch that file's directory.
When something else, like VS, does the whole new file, delete, rename thing, you get these events for your idea of the file's path:

FILE_ACTION_REMOVED,
FILE_ACTION_RENAMED_NEW_NAME,
and FILE_ACTION_MODIFIED

I tried watching the SUBST'd dir and the non SUBST'd (altering the other) and I always got the filename for the one I was actually watching.

Looking at the Process Monitor for VS I assume they do the same thing and actually open and read the file if they see that it has been "modified" or whatever, and if it differs from their internal buffer they prompt the user to read in the new contents.
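A minimal sketch of that "open it and compare against the buffer" step, in portable C. The name `file_differs_from_buffer` is hypothetical, and treating an unreadable file as different is an assumption. This is only a guess at the general shape, not what VS actually does:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the file's bytes differ from the buffer (or the file cannot
   be read), 0 if they are identical. */
int file_differs_from_buffer(const char *path, const void *buffer, size_t size) {
    assert(path != NULL && (buffer != NULL || size == 0));
    FILE *f = fopen(path, "rb");
    if (!f) return 1;   /* unreadable counts as "different" */
    const unsigned char *expect = (const unsigned char *)buffer;
    unsigned char chunk[4096];
    size_t offset = 0;
    int differs = 0;
    for (;;) {
        size_t got = fread(chunk, 1, sizeof chunk, f);
        if (got == 0) break;   /* end of file or read error */
        /* File is longer than the buffer, or the bytes do not match. */
        if (got > size - offset || memcmp(chunk, expect + offset, got) != 0) {
            differs = 1;
            break;
        }
        offset += got;
    }
    /* A read error, or a file shorter than the buffer, also counts. */
    if (!differs && (ferror(f) || offset != size)) differs = 1;
    fclose(f);
    return differs;
}
```

Calling this only after one of the events above keeps the cost down, and it avoids prompting the user when another program rewrote the file with identical contents.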

Make sense?

anael
anaël seghezzi
19 posts
1 project
#10865 Windows File Handling Problems
6 months, 1 week ago

In CToy I use the stat function and "struct stat" (st_mtime) in a thread to know if a file was updated; it's cross-platform.

Used in this code:
https://github.com/anael-seghezzi/CToy/blob/master/src/ctoy_tcc.c