Bus Pirate 5 resetting itself, causing storage corruption

See PR #90. Windows builds currently fail, so I adjusted the workflow’s strategy to fail-fast: false, so the other builds will complete.

Linux build succeeded, with artifacts available.

I verified the BP5 REV10 can format, save config changes, and rejects all writes from the host.

Can you please try the binary from the linked GitHub build artifact?
(It’s just a ZIP file with the two built firmwares…)

Changes of note:

  1. Removed the 16-block / 512-byte demo disk from ever being reported. Not sure why it was there for REV10, but I’m guessing it’s related.

  2. Always reports the media as being read-only to the host. If corruption continues, then it’s 100% internal to the firmware.

While the overall goal is to see if this prevents the corruption, also of interest:
A. Whether the device resets also disappear as a result. (e.g., [32712.675448] usb 3-3.4.1.4.4: reset full-speed USB device number 86 using xhci_hcd)
B. Whether there is any 8k capacity volume reported … ever.

Note: I’m not part of DangerousPrototypes … just a contributor.

2 Likes

Thanks so much for the work, especially as a contributor. I really appreciate that!

I’ll pull the artifacts tomorrow morning and do some testing and report back.

2 Likes

Downloaded: artifact

Installed: Minicom

Crashed Minicom:

Please Hold…

1 Like

I tried to replicate tonight. Great detail @mbrugman and @henrygab. Unfortunately I’m going to have to call it a night just as I got minicom to connect. I’ll try again tomorrow if I can get to it before someone else does. Wish I could have offered more assistance, but we will see what tomorrow brings.

1 Like

tl;dr - turning off flash write from host did not solve corruption problem (which is kind of a good thing, amirite? :wink: )

I pulled the artifact and loaded it. Verified by doing an info:

Bus Pirate 5 REV10
Firmware main branch @ unknown (2024-09-18T22:26:05Z)
RP2040 with 264KB RAM, 128Mbit FLASH
S/N: 264235D3012961E4
https://BusPirate.com/
Storage:   0.10GB (FAT16 File System)

Connected it to my Linux box, verified in the kernel log that initial Flash is read only:

[177457.364177] usb 3-3.4.1.4.3: new full-speed USB device number 118 using xhci_hcd
[177457.507467] usb 3-3.4.1.4.3: New USB device found, idVendor=1209, idProduct=7332, bcdDevice= 1.01
[177457.507490] usb 3-3.4.1.4.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[177457.507497] usb 3-3.4.1.4.3: Product: Bus Pirate 5
[177457.507503] usb 3-3.4.1.4.3: Manufacturer: Bus Pirate
[177457.507508] usb 3-3.4.1.4.3: SerialNumber: 264235D3012961E4
[177457.533790] cdc_acm 3-3.4.1.4.3:1.0: ttyACM0: USB ACM device
[177457.535591] cdc_acm 3-3.4.1.4.3:1.2: ttyACM1: USB ACM device
[177457.536111] usb-storage 3-3.4.1.4.3:1.4: USB Mass Storage device detected
[177457.536419] scsi host1: usb-storage 3-3.4.1.4.3:1.4
[177458.555996] scsi 1:0:0:0: Direct-Access     BP5      Storage          1.0  PQ: 0 ANSI: 2
[177458.557051] sd 1:0:0:0: Attached scsi generic sg1 type 0
[177458.565833] sd 1:0:0:0: [sdb] 47824 2048-byte logical blocks: (97.9 MB/93.4 MiB)
[177458.566655] sd 1:0:0:0: [sdb] Write Protect is on
[177458.566663] sd 1:0:0:0: [sdb] Mode Sense: 03 00 80 00
[177458.567382] sd 1:0:0:0: [sdb] No Caching mode page found
[177458.567386] sd 1:0:0:0: [sdb] Assuming drive cache: write through
[177458.591213]  sdb: sdb1

Trust, but verify: checked that BP storage was mounted and that I could not create, modify, or delete anything on that volume.

In minicom, I set up SPI mode (which creates the config file for it). Saw some strangeness in the kernel log, but it worked as expected and did create the correct config file:

[177532.650267] sd 1:0:0:0: [sdb] Sense Key : Illegal Request [current] 
[177532.650273] sd 1:0:0:0: [sdb] Add. Sense: Invalid command operation code
[177547.829317] sd 1:0:0:0: [sdb] Sense Key : Illegal Request [current] 
[177547.829322] sd 1:0:0:0: [sdb] Add. Sense: Invalid command operation code
[177547.867596] sd 1:0:0:0: [sdb] Sense Key : Illegal Request [current] 
[177547.867602] sd 1:0:0:0: [sdb] Add. Sense: Invalid command operation code
[177547.868558] sdb: detected capacity change from 191296 to 0
[177547.876969] sd 1:0:0:0: [sdb] 47824 2048-byte logical blocks: (97.9 MB/93.4 MiB)
[177547.879407] sdb: detected capacity change from 0 to 191296  

I was sitting idle in minicom after that and saw the detach/attach messages in the kernel log:

[177587.199292] usb 3-3.4.1.4.3: reset full-speed USB device number 118 using xhci_hcd
[177587.341601] cdc_acm 3-3.4.1.4.3:1.0: ttyACM0: USB ACM device
[177587.341985] cdc_acm 3-3.4.1.4.3:1.2: ttyACM1: USB ACM device
[177587.353480] sdb: detected capacity change from 191296 to 0
[177587.360928] sd 1:0:0:0: [sdb] 47824 2048-byte logical blocks: (97.9 MB/93.4 MiB)
[177587.363700] sdb: detected capacity change from 0 to 191296
[177591.080420] sdb: detected capacity change from 191296 to 0
[177591.398629] sd 1:0:0:0: [sdb] 47824 2048-byte logical blocks: (97.9 MB/93.4 MiB)
[177591.400982] sdb: detected capacity change from 0 to 191296
[177591.430639]  sdb: sdb1

I went ahead and did the flash read -f flash.bin command anyway:

SPI> A 0
IO0 set to OUTPUT: 1

SPI> A 1
IO1 set to OUTPUT: 1

SPI> W
Power supply
Volts (0.80V-5.00V)
x to exit (3.30) > 
Maximum current (0mA-500mA), <enter> for none
x to exit (none) > 
3.30V requested, closest value: 3.30V
Current limit:Disabled

Power supply:Enabled
Vreg output: 3.3V, Vref/Vout pin: 3.3V, Current: 2.6mA

SPI> flash read -f flash.bin
Probing:
                Device ID       Manuf ID        Type ID         Capacity ID
RESID (0xAB)    0x16
REMSID (0x90)   0x16            0xc8
RDID (0x9F)                     0xc8            0x40            0x17

SFDP (0x5A): found 0x50444653 "PDFS"
 Version: 1.0
 Headers: 2

**Param Table 0**
                Type            Ver.    Length  Address
Table 0         JEDEC (0x00)    1.0     36      0x000030

MSB-LSB  3    2    1    0
[0001] 0xFF 0xF1 0x20 0xE5
[0002] 0x03 0xFF 0xFF 0xFF
[0003] 0x6B 0x08 0xEB 0x44
[0004] 0xBB 0x42 0x3B 0x08
[0005] 0xFF 0xFF 0xFF 0xEE
[0006] 0xFF 0x00 0xFF 0xFF
[0007] 0xFF 0x00 0xFF 0xFF
[0008] 0x52 0x0F 0x20 0x0C
[0009] 0xFF 0x00 0xD8 0x10

Density: 8388608 bytes
Address bytes: 3
Write granularity:>=64B
Write Enable Volatile: 0
Write Enable instruction: 0x50
4K erase instruction: 0x20

Fast read:      1-1-2   1-1-4   1-2-2   1-4-4   2-2-2   4-4-4
Instruction:    0x3b    0x6b    0xbb    0xeb    --      --
Wait states:    8       8       2       4       0       0
Mode clocks:    0       0       4       4       0       0

Erase:          1       2       3       4
Instruction:    0x20    0x52    0xd8    0xff
Size:           4K      32K     64K     1B

**Param Table 1**
                Type            Ver.    Length  Address
Table 1         manuf (0xc8)    1.0     12      0x000060

MSB-LSB  3    2    1    0
[0001] 0x27 0x00 0x36 0x00
[0002] 0x64 0x77 0xF9 0x9E
[0003] 0xFF 0xFF 0xEB 0xFC

VCC min: 2700mV
VCC max: 3600mV
/Reset pin: -
/Hold pin: Y
Deep Power Down (DPDM): Y
SW reset: Y (instruction 0x99)
Suspend/Resume program: Y
Suspend/Resume erase: Y
Wrap Read mode: Y (instruction 0x77, length 100)
Individual block lock: - (nonvolatile -, instruction 0xff, default 0)
Secured OTP: Y
Read lock: -
Permanent lock: Y

Initializing SPI flash...
Flash device manufacturer ID 0xC8, type ID 0x40, capacity ID 0x17
SFDP V1.0, 1 parameter headers
                Type            Ver.    Length  Address
Table 0         JEDEC (0x00)    1.0     36B     0x000030
JEDEC basic flash parameter table info:
MSB-LSB  3    2    1    0
[0001] 0xFF 0xF1 0x20 0xE5
[0002] 0x03 0xFF 0xFF 0xFF
[0003] 0x6B 0x08 0xEB 0x44
[0004] 0xBB 0x42 0x3B 0x08
[0005] 0xFF 0xFF 0xFF 0xEE
[0006] 0xFF 0x00 0xFF 0xFF
[0007] 0xFF 0x00 0xFF 0xFF
[0008] 0x52 0x0F 0x20 0x0C
[0009] 0xFF 0x00 0xD8 0x10
4 KB Erase is supported throughout the device (instruction 0x20)
Write granularity is 64 bytes or larger
Flash status register is non-volatile
3-Byte only addressing
Capacity is 8388608 Bytes
Flash device supports 4KB block erase (instruction 0x20)
Flash device supports 32KB block erase (instruction 0x52)
Flash device supports 64KB block erase (instruction 0xD8)
Found a GigaDevice  flash chip (8388608 bytes)
Flash device reset success
Dumping to flash.bin...
[o o o o o o o o o o ]

It finished right away, which is a bad sign. I did an ls and saw corrupted flash:

SPI> ls
   <DIR>   ��������.���
   <DIR>   ��������.���
   <DIR>   ��������.���
   <DIR>   ��������.���
 <<<<<  snipped out 500-some entries of the same >>>>>
   <DIR>   ��������.���
   <DIR>   ��������.���
   <DIR>   ��������.���
   <DIR>   ��������.���
512 dirs, 0 files

SPI> 

Tried to format the flash, got the usual error I’ve been seeing:

SPI> format
Erase the internal storage?
y/n> y
Are you sure?
y/n> y

Formatting...
Error: filesystem not foundError: Format failed...
Error: filesystem not found
SPI> 

A question is why didn’t I see this yesterday when I was doing the testing with automount turned off? That is a good questions - maybe the conditions that cause this just didn’t line up that time :man_shrugging:

Anyway, I think this is good news, right? At least it is contained on the device and not a Linux-related thing.

For reference, I do own a pico probe. I saw the thread on the 3rd com port and saw this as an alternative, so I can help remotely debug that way, too.

As always, thanks the community for the help and response!

1 Like

Yes, definitely! :slight_smile:

I’m starting to think this is either a power issue, or a NAND with bad blocks (and maybe that dhara does not detect them as bad … which I thought dhara handled). Since the local, on-BP5 format fails once this occurs, I’m guessing the cause puts the NAND chip into a bad state (whether it was power fluctuation or bad sectors or other). [I suppose it could also be a runaway pointer corrupting some critical memory area… but I think that’s less likely at the moment.]

On the off chance that these have not been tried already:

  1. Can you try different USB ports?
  2. Can you try an overpriced and/or brand-name USB cable? (less expensive cables, with their thinner wires, sometimes have too much voltage drop to reliably support higher power devices… so common that Proxmark 3 (RFID research tool) support is almost always starting with swapping from one good USB cable to a second different good USB cable.
  3. Can you plug the BP5 into a powered USB hub, so it draws power from the hub instead of the PC?

Finally, when this next occurs, can you attach a disk image of the exposed flash volume:

dd if=/dev/sdb1 of=~/sdb1_backup.img

I’d like to look at the actual FAT structures … it might hint at the problem.

3 Likes

That makes a lot of sense (I’ve seen similar issues with other devices like the proxmark, now that you mention it)

I have changed USB ports so it plugged directly into a powered USB hub. I’ve tried a couple of different cables, including “name branded” cables.

As of now, it is plugged into the powered USB hub (powered by a USB-C PD port connected to a 45W supply). The cable is a 1 meter cable that came with a McBook, so I’m assuming that’s good quality.

I did a self test, and it’s reporting a USB voltage of 4.89V (which according to the test is a “pass”):

HiZ> ~
SELF TEST STARTING
DISABLE IRQ: OK
ADC SUBSYSTEM: VUSB  4.89 OK
....

First, when I plugged it in this morning (in a corrupt state), the sdb1 partition didn’t show in lsblk:

loop24        7:24   0 181.5M  1 loop /snap/#############
sda           8:0    1     0B  0 disk 
sdb           8:16   1     8K  0 disk 
mtdblock0    31:0    0    32M  1 disk 
mtdblock1    31:1    0     4M  0 disk 
nvme0n1     259:0    0 931.5G  0 disk 
├─nvme0n1p1 259:1    0 892.4G  0 part /#############

I thought that was interesting. I looked at the partion table in fdisk, and here’s what I saw:

Command (m for help): p

Disk /dev/sdb: 8 KiB, 8192 bytes, 16 sectors
Disk model: Storage         
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00000000

Kind of interesting, right? Matches the 8K we saw in the kernel log. I mounted it, and there was a single text file on it:

No storage mounted.
Kind regards,
Ian and Chris

https://buspirate.com/

I haven’t taken any time to look at the firmware, but I’m guessing it’s a “dummy” drive presented as a mass storage device to the host when internal NOR is hosed. Explains a few things.

I unplugged/replugged and sdb1 was there:

sda           8:0    1     0B  0 disk 
sdb           8:16   1  93.4M  0 disk 
└─sdb1        8:17   1  93.3M  0 part /media/matty/BUS_PIRATE5
mtdblock0    31:0    0    32M  1 disk 

In Linux, the volume mounted but showed as empty. In minicom, an ls of it showed the bad 512 directories seen in an earlier post. I dd’d the sdb1 partition to the host and took a quick look with hexdump. Figured I could save zipping and transferring a 93 M file:

hexdump -C sdb1.bin      
00000000  eb fe 90 4d 53 44 4f 53  35 2e 30 00 08 04 01 00  |...MSDOS5.0.....|
00000010  01 00 02 91 ba f8 38 00  3f 00 ff 00 3f 00 00 00  |......8.?...?...|
00000020  00 00 00 00 80 01 29 00  00 21 50 4e 4f 20 4e 41  |......)..!PNO NA|
00000030  4d 45 20 20 20 20 46 41  54 20 20 20 20 20 00 00  |ME    FAT     ..|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000800  f8 ff ff ff ff ff 00 00  00 00 00 00 00 00 00 00  |................|
00000810  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00017800  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00018000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
0001c800  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
0001d000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00020800  7b 0a 22 62 61 75 64 72  61 74 65 22 3a 20 34 30  |{."baudrate": 40|
00020810  30 30 30 30 30 2c 0a 22  64 61 74 61 5f 62 69 74  |00000,."data_bit|
00020820  73 22 3a 20 38 2c 0a 22  73 74 6f 70 5f 62 69 74  |s": 8,."stop_bit|
00020830  73 22 3a 20 30 2c 0a 22  70 61 72 69 74 79 22 3a  |s": 0,."parity":|
00020840  20 30 2c 0a 22 63 73 5f  69 64 6c 65 22 3a 20 31  | 0,."cs_idle": 1|
00020850  0a 7d 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |.}..............|
00020860  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00021000  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
05d48800

I see the contents of the SPI config file are still there, but not the filename. It’s been quite a while since I’ve dug into the cornflakes of FAT, so I’ll leave it to y’all. If you want the full file, I can send it, but I think this should do.

Thanks so much for all the help and interest!
Matt

2 Likes

Perfect. Apple’s cables tend to be of significantly higher quality than most other cables. The powered USB hub will help negate some of the power issues.

Yes, that’s correct … it’s a dummy drive with static content. For Rev8 (SD card), maybe someone added this to the firmware to prevent / undo the effects of the Windows OS feature of hiding drive letters when no media is installed?

I tried to remove this feature in the firmware I provided, but seems either I failed, or I pointed you to older artifacts.

Thank you for the hexdump. Unfortunately, that doesn’t even show an initialized FAT file system. Lots of zero where I’d expect there to be a FAT table, bitmaps, root directory entries, etc. … The 16-bytes of 0xFF might be a FAT table … but much too small … maybe the right size for an 8k fake disk? It should never be that small for the full 1GiB NAND. Makes me think the method currently used to “surprise remove” the media isn’t being noticed by the Linux system, and thus that cached data from the 8k fake volume isn’t being invalidate (and thus may be improperly used).

@ian … Does the storage code report a UNIT ATTENTION error at least once, and consistently report NOT_READY/MEDIUM_NOT_PRESENT when pretending the media is not available (e.g., for the ejection / reinsertion of media)? It’s absolutely mandatory that the host become aware of the media change for it to invalidate its cached data … the above behavior might be explainable by “mix-n-match” of the 8k fake volume with the actual NAND contents.

@ian … As to the potential cause being power… In my rambling code review while trying to add a third COM port, I noticed the following line:

  TUD_CONFIG_DESCRIPTOR(1, ITF_NUM_TOTAL, 0, CONFIG_TOTAL_LEN, 0x00, 100),

That last argument to the macro set the power requirements to 100mA.

Should that be 500ma? I know the BP5 has a USB-C port, but given the chip, I think it’s just a choice of physical connector, while still being USB2 under the covers… is that right? If so, could this be resulting in a highly-conforming upstream USB port limiting the BP5 power to 100mA?

Note that I’m quite ignorant about the BP5’s power monitoring / etc. … Therefore, I’m just trying to exclude insufficient power as a potential cause when erasing/programming the NAND and dumping external SPI NAND chips, as that seems to be a common thread (or exacerbates / increases likelihood of exposing this problem). May also explain the bus resets initiated by the host…

2 Likes

I went back to the latest release version - I was having some issues with the test version. I’m sorry I didn’t mention that.

1 Like

I opened issue on github for the USB descriptor indicating 100mA (instead of 500mA or more).

There is definitely something awry and lurking, causing the NAND chip to end up in a bad state. Without a better logging framework (whether RTT or 3rd CDC port), I’m unsure of next steps. I predict this issue will continue to raise its head. Anyone?

2 Likes

One way would be to measure the current on the USB to get a realistic figure of the current consumption. What do you think?

1 Like

Great place to start. It would give a quick sanity check.

I connected my BP5 to a USB current test and I monitored the peak current using a Fluke 289. I didn’t load the Vout but I turned it on (3.3V). Idle, the peak consumption is 103mA. I then transferred some files from the PC to the BP5. The new peak is 132mA.
These figures are not too far off the USB descriptor value. I think it should be adjusted to 150mA. However, I am not sure that it matters much. I know of no laptop and no motherboard that limit the USB current based on the USB descriptor current value. The laptops I own, and I have opened have resettable fuses that limit the current north of 500mA for USB 2 sockets and north of 2A for USB 3.1 sockets.

1 Like

For folks following this thread, if you’ve hit this issue in any somewhat reliable way, please try out the firmware from:

Looking for data: Is it no longer occurring? Does it occur just as often? Is it now only occurring sometimes? etc.

Thanks!

Just at the airport on my way home. Should be about to test tomorrow.

Thanks much!!

1 Like

I was able to load and test this version today.

tl;dr: this version behaved fairly well, with a few hiccups, and I did not see any flash storage corruption.

First, to verify I’m running the correct version:

Bus Pirate 5 REV10
Firmware main branch @ 9c0bfa4 (2024-10-01T21:28:04Z)
RP2040 with 264KB RAM, 128Mbit FLASH
S/N: 264235D3012961E4
https://BusPirate.com/
Storage:   0.10GB (FAT16 File System)

Plugging the BP in to my Linux boxen:

[206813.157809] usb 3-3.4.1.4.3: new full-speed USB device number 36 using xhci_hcd
[206813.300190] usb 3-3.4.1.4.3: New USB device found, idVendor=1209, idProduct=7332, bcdDevice= 1.01
[206813.300208] usb 3-3.4.1.4.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[206813.300216] usb 3-3.4.1.4.3: Product: Bus Pirate 5
[206813.300221] usb 3-3.4.1.4.3: Manufacturer: Bus Pirate
[206813.300225] usb 3-3.4.1.4.3: SerialNumber: 264235D3012961E4
[206813.325249] cdc_acm 3-3.4.1.4.3:1.0: ttyACM0: USB ACM device
[206813.327114] cdc_acm 3-3.4.1.4.3:1.2: ttyACM1: USB ACM device
[206813.327835] usb-storage 3-3.4.1.4.3:1.4: USB Mass Storage device detected
[206813.328334] scsi host1: usb-storage 3-3.4.1.4.3:1.4
[206814.352551] scsi 1:0:0:0: Direct-Access     BP5      Storage          1.0  PQ: 0 ANSI: 2
[206814.353764] sd 1:0:0:0: Attached scsi generic sg1 type 0
[206814.363294] sd 1:0:0:0: [sdb] 47824 2048-byte logical blocks: (97.9 MB/93.4 MiB)
[206814.364217] sd 1:0:0:0: [sdb] Write Protect is off
[206814.364234] sd 1:0:0:0: [sdb] Mode Sense: 03 00 00 00
[206814.365768] sd 1:0:0:0: [sdb] No Caching mode page found
[206814.365780] sd 1:0:0:0: [sdb] Assuming drive cache: write through
[206814.388385]  sdb: sdb1
[206814.388786] sd 1:0:0:0: [sdb] Attached SCSI removable disk

All looked good; the BP volume mounted in read/write mode (as implied in the log). I was able to create a simple text file on the volume from the host.

Then I connected to the BP with minicom:

[206868.696145] sdb: detected capacity change from 191296 to 0
[206868.737574] sd 1:0:0:0: [sdb] tag#0 access beyond end of device
[206868.737610] I/O error, dev sdb, sector 252 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[206868.737624] Buffer I/O error on dev sdb1, logical block 0, lost sync page write
[206870.878523] sd 1:0:0:0: [sdb] 47824 2048-byte logical blocks: (97.9 MB/93.4 MiB)
[206870.880347] sd 1:0:0:0: [sdb] Write Protect is on
[206870.880365] sd 1:0:0:0: [sdb] Mode Sense: 03 00 80 00
[206870.883195] sdb: detected capacity change from 0 to 191296
[206870.893722]  sdb: sdb1
[206959.648352] sd 1:0:0:0: [sdb] 47824 2048-byte logical blocks: (97.9 MB/93.4 MiB)
[206979.021349] usb 3-3.4.1.4.3: reset full-speed USB device number 36 using xhci_hcd
[206979.166060] cdc_acm 3-3.4.1.4.3:1.0: ttyACM0: USB ACM device
[206979.167045] cdc_acm 3-3.4.1.4.3:1.2: ttyACM1: USB ACM device
[206979.190234] sd 1:0:0:0: [sdb] 47824 2048-byte logical blocks: (97.9 MB/93.4 MiB)
[206979.190994] sd 1:0:0:0: [sdb] Write Protect is off
[206979.190996] sd 1:0:0:0: [sdb] Mode Sense: 03 00 00 00
[206979.917774] sdb: detected capacity change from 191296 to 0
[206980.209789] sd 1:0:0:0: [sdb] 47824 2048-byte logical blocks: (97.9 MB/93.4 MiB)
[206980.210549] sd 1:0:0:0: [sdb] Write Protect is on
[206980.210553] sd 1:0:0:0: [sdb] Mode Sense: 03 00 80 00
[206980.212214] sdb: detected capacity change from 0 to 191296
[206980.222518]  sdb: sdb1

There was en error with access beyond end of device. I tried the process several times, but only saw those errors on the first try.

In minicom, I could do an ls and could configure SPI with no problem. In minicom, I did a flash read -f test.bin and the contents of the connected flash chip were stored on the BP’s internal storage.

Interestingly, the volume did not auto re-mount on the Linux host after connecting minicom. I could mount it manually with sudo mount /dev/sdb1 ./mnt and it mounted read-only:

┌──(matty💊s76)-[~/data/projects/cypherCon8.0/flash]
└─$ sudo mount /dev/sdb1 ./mnt
mount: /home/matty/data/projects/cypherCon8.0/flash/mnt: WARNING: source write-protected, mounted read-only.

From there I could copy files off of storage as expected.

There was an xhci_hcd reset along the way. Minicom was connected; it showed a temporary disconnect, then reconnected. Again, the storage didn’t automount on the host, but was mountable manually.

[207131.641121] usb 3-3.4.1.4.3: USB disconnect, device number 36
[207132.901039] usb 3-3.4.1.4.3: new full-speed USB device number 37 using xhci_hcd
[207133.045192] usb 3-3.4.1.4.3: New USB device found, idVendor=1209, idProduct=7332, bcdDevice= 1.01
[207133.045216] usb 3-3.4.1.4.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[207133.045227] usb 3-3.4.1.4.3: Product: Bus Pirate 5
[207133.045234] usb 3-3.4.1.4.3: Manufacturer: Bus Pirate
[207133.045240] usb 3-3.4.1.4.3: SerialNumber: 264235D3012961E4
[207133.070683] cdc_acm 3-3.4.1.4.3:1.0: ttyACM0: USB ACM device
[207133.072329] cdc_acm 3-3.4.1.4.3:1.2: ttyACM1: USB ACM device
[207133.073012] usb-storage 3-3.4.1.4.3:1.4: USB Mass Storage device detected
[207133.073512] scsi host1: usb-storage 3-3.4.1.4.3:1.4
[207134.101619] scsi 1:0:0:0: Direct-Access     BP5      Storage          1.0  PQ: 0 ANSI: 2
[207134.102614] sd 1:0:0:0: Attached scsi generic sg1 type 0
[207134.104449] sd 1:0:0:0: [sdb] Media removed, stopped polling
[207134.105403] sd 1:0:0:0: [sdb] Attached SCSI removable disk
[207136.226572] sd 1:0:0:0: [sdb] 47824 2048-byte logical blocks: (97.9 MB/93.4 MiB)
[207136.228450] sd 1:0:0:0: [sdb] Write Protect is on
[207136.228475] sd 1:0:0:0: [sdb] Mode Sense: 03 00 80 00
[207136.231059] sdb: detected capacity change from 0 to 191296
[207136.241568]  sdb: sdb1

One other thing of note: When I disconnected minicom (but left the BP USB cable connected), the storage volume remained mounted as read-only; it didn’t go back to read-write. This was also reflected by no output in the kernel log when disconnecting minicom (the logical serial connection to /dev/ttyACM0). If I manually unmount with sudo umount /dev/sdb1 and remount, it is still read-only.

The last thing I noted is that the first time I connected using minicom there was a slight delay after opening the USB serial port and it accepting input from the terminal; I’m guessing there’s some blocking process going on when the storage is changing from read-write on the host to read-only.

So, just the slight hiccups where the volume doesn’t always automount on the host. I didn’t see any instances where the “fake” 8K volume appeared; it was always the full 98MB.

From my point of view, this is a huge improvement! :slight_smile: I don’t mind if I have to manually mount the volume on the host, as long as the storage remains uncorrupted. I know not everyone may feel that way… :man_shrugging:

Thanks again for all of the work!

1 Like

Thank you for the great logs and explanations of your processes!

The write beyond end of device error indicates the host was writing to the volume (for whatever reason) when the media was surprise-removed. Until we move to the new architecture (no multi-initiators when the volume is writable by either), there will continue to be edge cases which we cannot really fix.

I don’t know enough about the various linux automount options to know specifics, but my guess is some additional Sense/ASC/ASCQ might be needed.

As soon as RTT is implemented, the first debug channel I want to add is one that dumps all the command blocks sent to the NAND, and the Sense/ASC/ASCQ results … sort of a mini-bustrace. the system logs are helpful.

Tagging @phdussud … as he updated the code, and may have better thoughts.

:tada: Me too! Good enough, imho, that the next focus should be the major architectural change (single-initiator when the volume is writable … either host or firmware). :tada:

2 Likes

@phdussud pushed an additional fix addressing storage issues, feedback appreciated.

2 Likes

Hello, can you still reproduce this in some specific environment? I think regardless of the cause on BP side this sounds like some serious race condition in minicom itself