USB Mass storage and local FATFS conflicts

There is a fundamental problem in exposing the same storage to a local file system and as a USB Mass Storage device: they can’t synchronize with each other. That’s because the host caches the directory structure and files on its side and only sends back sector modifications. If you create a file on the host, it won’t be seen on the BP5; ls won’t show it. Vice versa, if you create a file on the BP5, you won’t see it on the host until the next restart of the BP5. I believe this can lead to lost data and a corrupted file system.
I did a little bit of research on the topic and the best remedy seems to be to stop USBMS before a local change to the file system takes place, then restart it after the change is complete. I think it may be acceptable to leave the USBMS endpoint running but simulate a read-only file system during BP5 file change activity, and then briefly stop and restart the USBMS endpoint at the end of that activity. Android picked a much more complicated solution, which is to forgo USBMS and use the Media Transfer Protocol instead. I wouldn’t recommend going this way, as it is a complicated solution and one that does not exactly emulate a file system on the host.
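A minimal sketch of the read-only simulation, assuming TinyUSB's MSC device class: TinyUSB exposes a weak tud_msc_is_writable_cb() callback, and returning false from it makes the host see the LUN as write-protected. The flag and the begin/end helpers below are illustrative names, not from the Bus Pirate source:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag the firmware would set around local file writes.
 * Names are illustrative, not existing Bus Pirate APIs. */
static volatile bool local_fs_write_in_progress = false;

void local_fs_write_begin(void) { local_fs_write_in_progress = true; }
void local_fs_write_end(void)   { local_fs_write_in_progress = false; }

/* TinyUSB weak MSC callback: returning false makes the LUN appear
 * write-protected, so the host cannot push sector writes while the
 * local file system is mid-update. */
bool tud_msc_is_writable_cb(uint8_t lun)
{
    (void)lun;
    return !local_fs_write_in_progress;
}
```

This only blocks host writes during the local update; a brief simulated eject/re-insert would still be needed afterwards so the host refreshes its cached directory blocks.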


Yes, I’ve also noticed (but not yet researched in detail) this architectural problem.

If and how corruption occurs depends on the filesystem and block device implementations on the host. So, for example, corruption could occur on Linux and not on Windows, or vice versa. Or it could differ between operating system versions. It probably would also differ with the exact order of operations you execute (copy, open, move), how much free RAM the system has, and so on, which makes it very hard to reproduce or debug.

I also think MTP would be a good replacement. It is widely used by mobile phones, so it is well supported and gives exactly the control necessary to prevent any concurrent access problems.


The problem with MTP is that it isn’t compatible with any app relying on file system APIs. Apps must be aware of the MTP endpoint and provide for it. On Windows, Explorer.exe knows about it but CLI (cmd, PS, mingw shells) will not interact with MTP devices. The second problem is that on Windows at least, you need a driver for it.
The advantage is that it is possible to have a truly seamless synchronization between the host view and the BP5 file system view.


I’ve noticed some bugs here and there. MTP mode is an option, but seems like it would be a monumental project with a lot of downsides.

Interesting suggestion to stop and restart the MSD. I mentioned something similar in another thread. It’s worth looking into; there doesn’t seem to be another way to get the OS to purge its cache.

On the Bus Pirate side it’s easy enough to re-init FATFS, say on a timer after an MSD write. I created a folder on my drive. When I copy things to the BP from Windows, I cd into the folder and then cd back; the new files are then visible and usable.


Note that some messages from the MSD can help. I believe you can know when a write has been completed, i.e. all blocks have been written to the device, and the file system should be coherent at that moment.


The TinyUSB MSD used in the pico SDK just exposes 4 functions. Is it an ioctl command? I can add something for the switch statement there.

It is a weak callback that isn’t implemented. From msc_device.h, line 140:
// Invoke when Write10 command is complete, can be used to flush flash caching
TU_ATTR_WEAK void tud_msc_write10_complete_cb(uint8_t lun);


Nice, thank you. I’ll have a look.

It isn’t as nice as I thought. A Write10 can result in many USB transactions, and the write10_complete callback is called only once, at the completion of the Write10 command; but an OS file operation can result in many Write10 commands. Even renaming a file under Windows Explorer will produce 2 Write10 commands, so we get 2 completion callbacks.


That, combined with a timer before re-init, should at least improve things on the Bus Pirate side a bit.
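A sketch of that callback-plus-timer idea, with illustrative names. Each completed Write10 pushes a remount deadline out, and FATFS is only re-mounted once the bus has been quiet for a while, so a burst of Write10 commands triggers a single remount. The current time is passed in explicitly to keep the sketch testable; on the RP2040 it could come from time_us_32() or a pico-sdk one-shot alarm instead:

```c
#include <stdbool.h>
#include <stdint.h>

#define QUIET_MS 250  /* how long the MSD must be idle before remount */

static uint32_t remount_deadline_ms = 0;
static bool     remount_pending     = false;

/* Call from tud_msc_write10_complete_cb(): each completed Write10
 * (re)arms the quiet-period deadline. */
void msd_write_completed(uint32_t now_ms)
{
    remount_pending     = true;
    remount_deadline_ms = now_ms + QUIET_MS;
}

/* Poll from the main loop; returns true exactly once when the quiet
 * period has elapsed and FATFS should be re-mounted (f_mount again)
 * to pick up the host's changes. Wrap-safe signed comparison. */
bool msd_remount_due(uint32_t now_ms)
{
    if (remount_pending && (int32_t)(now_ms - remount_deadline_ms) >= 0) {
        remount_pending = false;
        return true;
    }
    return false;
}
```

This debounces the many write10_complete calls mentioned above into one re-init per burst of host activity.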

I’m having another file sync issue. I downloaded the test.zip file for the tutorial, extracted test.tut, and copied the file onto the filesystem on my Linux machine. I did this while I was not connected to my BP.
I then connected to the BP, and the file was not there.

I used the BP format command to init the internal memory.
I tried again. That didn’t help.

Lastly on my Ubuntu machine, I plugged in my BP10 (but did not connect to the interface).

  • I copied the tutorial file onto the filesystem mounted on my * machine.
  • I listed the directory and the files were there.
  • I used the Unix umount utility to unmount the file system.
  • I unplugged and replugged the BP.
  • The files were not there.

In other words, without connecting to the serial port of my BP, I cannot copy and retain files onto the rp2040 filesystem.

My BP8 is better but still buggy.

I am not connected to my BP8 but it is plugged in.

  • I copy a file onto the BP file system in Unix.
  • I connect to my BP8. The file is not there.
  • I type the “#” command
  • When I reconnect to the BP8 - the file is there.

In other words, my BP8 needs to reconnect twice to see a file loaded from the host system.

This is a well-known problem at this point. The host has its own copy of the directory blocks and will not refresh them until the drive is unmounted/remounted. Conversely, the file system of the BP5 currently has no knowledge of the writes to the flash coming from the host. It will be fixed, but for now, follow the procedure you described when you modify a file on the host side or the BP5 side (restart after file modification).

@grymoire: I want to add something. In the case you described, there are 2 connections but only one restart. The connections are not significant because they don’t reset any internal or host state. What made your file visible to BP5 is the fact that you restarted it. You can do this like you did or by USB unplug/replug.

I feel like if we open the terminal, maybe it should go into read-only mode?

I wrote some code that latches onto the write controls, but it’s really dirty. I need to dig deeper into the rp2040 timers and how to properly reset them; the SDK doesn’t really provide that. It’s an issue you can see in the pirate.c main loop with the screen saver timer.

@ian I believe it would be sufficient to go into read-only mode when a file gets opened in write mode on the BP5 side, then eject/re-insert on close. I have been experimenting with this part and it isn’t that simple, but the code is fairly isolated if we use new pirate APIs for open and close.
I presume you have been working on changes from the host side and how to reset the local file system. I haven’t thought about it yet, but I think it is trickier, as you said.
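A sketch of what such pirate open/close wrappers might track. All names here are hypothetical, not existing Bus Pirate APIs: a writer count flips the MSC side to read-only, and closing the last writer queues a simulated eject/re-insert for the USB task to service:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wrapper state: how many BP-side files are currently
 * open for write, and whether an eject/re-insert has been requested. */
static int  open_writers    = 0;
static bool eject_requested = false;

/* Call instead of a bare f_open(..., FA_WRITE). */
void pirate_fopen_write(void)  { open_writers++; }

/* Call instead of a bare f_close(); the last writer closing
 * requests a simulated eject so the host drops its cached
 * directory blocks. */
void pirate_fclose_write(void)
{
    if (open_writers > 0 && --open_writers == 0) {
        eject_requested = true;
    }
}

/* Feed this into tud_msc_is_writable_cb(): host sees read-only
 * while any local writer is active. */
bool msc_read_only(void) { return open_writers > 0; }

/* USB task polls this; returns true once per requested eject. */
bool msc_take_eject_request(void)
{
    bool r = eject_requested;
    eject_requested = false;
    return r;
}
```

The counter (rather than a simple flag) keeps the volume read-only until the last concurrent writer closes.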


Nice to see this thread, I was thinking about posting something really similar for a couple of days :smiley:

Completely agree with the problem analysis and that MTP is the only safe long-term (long long term?) solution.

Also, I think we should force a FAT sync from FatFs when rebooting, and maybe add a user command fsync to force a sync when the user adds a file from the PC to the BP.

A possible implementation could be:

  1. make sync_fs() in ff.c non-static
  2. add a storage_fsync() which calls sync_fs from user command fsync
  3. call sync_fs() also from storage_unmount() and then call storage_unmount() where needed (before reboot, jump to bootloader, etc.).
    Something like this:
extern FRESULT sync_fs(FATFS* fs);

void storage_fsync(void)   // <-- call this from the to-be-created 'fsync' command
{
    printf("Storage sync\r\n");
    sync_fs(&fs);
}

void storage_unmount(void)  // --> call this before reboot and jump to boot
{
    storage_fsync();
    system_config.storage_available=false;
    system_config.storage_mount_error=0;
    printf("Storage removed\r\n");
}

Temporary pain relief is to disable write caching to the drive. I think macOS does this by default for external drives dangling from a wire, because unplugging that wire results in data loss.

Linux users can mount the Bus Pirate with the ‘sync’ option. That’ll reduce the number of outstanding WRITE1{0,2,6}'s in flight and should eliminate long-standing pools of buffers, such as the inode table and superblock, which is what some people above are (unknowingly) reporting.
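For example, on a typical Linux host (the device node and mount point are assumptions; check lsblk or dmesg output for the real ones):

```shell
# Mount the Bus Pirate's MSC volume with synchronous writes, so the
# kernel does not hold dirty FAT/directory blocks in its page cache.
sudo mount -o sync /dev/sda1 /mnt/buspirate

# Or add the option to an already-mounted volume:
sudo mount -o remount,sync /mnt/buspirate
```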

As is always the case on Linux, it depends on the version and the distro (I’d try the tips unless the context just really doesn’t make sense on whatever distro you’re using), but users may generally find the following inspirational:

How to switch off caching for usb device when writing to it? - Ask Ubuntu (I can’t speak to modern Linux, but hdparm used to be for IDE/Sata drives only and not USB…No harm in trying. Answer #11 is likely to be the key.)
Write cache problems when writing files to a USB stick - Linux Mint Forums (maybe telling the kernel to knock it off helps, but I don’t know how that recipe doesn’t impact main system writes.)
usb drive - How do I make usb sticks not ever use any write cache? - Ask Ubuntu
…and similar. You “just” need to tell the OS to quit holding USB-bound data in its buffer bladder and just let it flooooow over the wire.

Note that write performance will be even more terrible, but on an ESP32 or RP2040, the only available speed is the puny USB 1.1 full speed of 12 Mbit/sec, so things may already be so slow that you don’t really notice the difference.


This does not help. Windows automatically disables caching for removable USB drives. The problem is that the host pushes the block updates to the storage layer, but the BP5 file system does not know about the changes. Conversely, when the BP5 file system writes to the storage layer, the host does not notice, because it won’t refresh its local view of the storage; this isn’t supported by USBMS. The only way to convince the host to refresh is to eject/re-insert the USB storage. This is somewhat supported by USBMS, but I found out that you need to unmount the USB storage and leave it unmounted until the host issues a query for storage availability and notices the media ejection, then insert the media again. Here, ejection and insertion are simulated by software, of course.
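A self-contained sketch of that simulated eject/re-insert handshake. tud_msc_test_unit_ready_cb() and tud_msc_set_sense() are real TinyUSB MSC device APIs, but the state flags and helper names here are illustrative, and the sense call is left as a comment so the sketch stands alone:

```c
#include <stdbool.h>
#include <stdint.h>

/* While "ejected", the TEST UNIT READY callback reports no medium.
 * The key detail from the discussion above: the medium must stay
 * ejected until the host has actually polled and seen that state,
 * otherwise it never drops its cached view. */
static bool medium_ejected   = false;
static bool host_saw_ejected = false;

void msc_begin_eject(void)
{
    medium_ejected   = true;
    host_saw_ejected = false;
}

/* TinyUSB weak MSC callback (signature from msc_device.h). */
bool tud_msc_test_unit_ready_cb(uint8_t lun)
{
    (void)lun;
    if (medium_ejected) {
        /* Real code would also report the SCSI "medium not present"
         * sense data here:
         * tud_msc_set_sense(lun, SCSI_SENSE_NOT_READY, 0x3A, 0x00); */
        host_saw_ejected = true;
        return false;
    }
    return true;
}

/* Re-insert the medium only after the host noticed the ejection;
 * returns true when the re-insert actually happened. */
bool msc_try_reinsert(void)
{
    if (medium_ejected && host_saw_ejected) {
        medium_ejected = false;
        return true;
    }
    return false;
}
```

The next TEST UNIT READY after re-insertion succeeds, which is what prompts the host to re-read the FAT and directory sectors.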

Sorry @Anonymous, I should say that disabling caching at the OS level is necessary, but it isn’t sufficient to avoid the problem.