Issue #255: BPIO binmode beta crashes

COBS should definitely not cause that amount of slowdown. There is likely something else at play.

There is a way to avoid the secondary buffer for COBS encoding.

In-place encoding concept

This presumes the buffer is large enough for the encoding overhead, the maximum of which can be pre-calculated for any given message size. The following is pseudo-code … not actually tested.
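
For reference, here's a minimal sketch of that pre-calculation (calc_cobs_overhead() is just a placeholder name, not from any particular library): COBS needs one code byte per 254 payload bytes, rounded up, with a minimum of one code byte even for an empty message.

#include <stddef.h>

// worst-case COBS overhead (code bytes beyond the payload) for a message
// of n bytes; a trailing 0x00 frame delimiter, if used, is one byte extra
static size_t calc_cobs_overhead(size_t n) {
    size_t overhead = (n + 253u) / 254u;
    return (overhead == 0) ? 1u : overhead;
}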

uint8_t * buffer = ...;
// shift the message forward by the maximum overhead so the encoded
// output can be written in place from the start of the buffer
size_t overhead = calc_cobs_overhead(message_length);
// don't use memcpy ... its pointers are restrict-qualified and these
// regions overlap; memmove handles the overlap
memmove(buffer + overhead, buffer, message_length);

// walk each byte ... cpu-specific optimizations possible here
// such as reading / writing in native word size
uint8_t * next_zero = buffer; // may point up to 254 bytes back...
uint8_t * next_output = buffer + 1;
uint8_t * to_encode = buffer + overhead;
uint8_t bytes_until_next_zero = 1;
size_t bytes_encoded = 0;

while (bytes_encoded < message_length) {
    if (bytes_until_next_zero == 0xFFu) {
        // overhead ... group already holds 254 bytes of a long stream
        // without zero data: close it with code 0xFF and start a new
        // group without consuming the current input byte
        *next_zero = 0xFFu;
        next_zero = next_output;
        bytes_until_next_zero = 1;
        ++next_output;
        // does not increase bytes_encoded nor to_encode
    } else if (*to_encode == 0) {
        // zero byte: the pending code byte gets the distance to here
        *next_zero = bytes_until_next_zero;
        next_zero = next_output;
        bytes_until_next_zero = 1;
        ++next_output;
        ++to_encode;
        ++bytes_encoded;
    } else {
        // ordinary non-zero byte: copy it through
        *next_output = *to_encode;
        ++next_output;
        ++to_encode;
        ++bytes_encoded;
        ++bytes_until_next_zero;
    }
}
// finalize: the last group's code byte is the running count
*next_zero = bytes_until_next_zero;
// the encoded length is next_output - buffer; a trailing 0x00 frame
// delimiter, if used, would be written at next_output (one extra byte)
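
Putting it together, a rough caller-side sizing sketch (assuming the placeholder helper above; the numbers are just an example): for a 1024-byte message the worst case is ⌈1024/254⌉ = 5 code bytes, so the buffer needs 1024 + 5 bytes, plus one more if the 0x00 delimiter lives in the same buffer.

// sizing example: worst case for a 1024-byte message
size_t message_length = 1024;
size_t overhead = calc_cobs_overhead(message_length); // ceil(1024 / 254) == 5
size_t buffer_size = message_length + overhead + 1;   // 1030, +1 for the 0x00 delimiter
// allocate (or statically reserve) buffer_size bytes, build the message at
// the start of the buffer, memmove() it forward by `overhead`, then run the
// loop above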


I'll try to scan the API / docs of that library to see what they require for in-place encoding.  Most likely, it's just having the buffer large enough when it's first allocated, but maybe it also requires a memmove() before calling the encoding API.

I've got a full schedule today, but if you have a supported version of J-Link, SEGGER has some profiling tools that may help you understand where the time is coming from.

[quote="robjwells, post:19, topic:1298"]
If you’re using 0x00 as the end-of-packet byte, I wonder if it would help to change it to something else?
[/quote]

Unfortunately, because COBS was originally defined with the zero byte as the sentinel, few libraries expose the ability to use a different sentinel byte, so changing it would make things harder for clients (unfortunately). Besides, zeros in the payload never add COBS overhead; it's runs of 254 or more non-zero bytes that cost an extra code byte, so more zeros actually encode more space-efficiently.

I’m not sure how to implement that, because I’m not sure how, or how much, memory FlatBuffers allocates after the buffer is finalized. I did think about running it directly on the reference to the final buffer to see what happens, but I imagine it will overrun something eventually.

One of the things on my list is to figure out how memory is being used and optimize the maximum read and write sizes.

I’d be really thrilled if this wasn’t the slowdown, but I didn’t change anything else. The code with COBS is much cleaner and more straightforward, so there’s less room for issues.
