Question

Borrow checking for C struct field

Imagine that a C API exposes an opaque pointer to some data and two accessors for one of its fields, void set_string(struct foo*, const char*) and const char* get_string(struct foo*), and that the documentation states something along the lines of:

The string returned by get_string is valid as long as the opaque pointer to foo is valid and no subsequent set_string call is made. Otherwise the behavior is undefined.

This is a simple example of a C header that exemplifies the issue:

//foo.h
struct foo;
const char* get_string(struct foo*);
void set_string(struct foo*, const char*);

I'm wondering whether, and how, it is possible to make the Rust borrow checker keep an eye on that external reference and avoid potential UB after generating bindings with bindgen. I have been experimenting with

use foo::foo;
use std::ffi::CStr;
use std::marker::PhantomData;

struct Foo<'a> {
    pointer: *const foo,
    _phantom: PhantomData<&'a CStr>,
}

but it seems to be a dead end.


Solution


So the documentation essentially tells you that the value returned by get_string is borrowed from struct foo, and that set_string mutates it and therefore needs a mutable borrow. You can express that without any lifetime parameter on struct Foo on the Rust side: just wrap the calls to set_string and get_string in safe methods that abstract over them:

// creating a module to contain the code that needs to be checked for soundness
mod foo_mod {
    use std::ffi::{c_char, CString, CStr};
    struct foo;
    pub struct Foo {
        // mustn't be `pub`
        pointer: *mut foo,
    }
    impl Foo {
        pub fn set_string(&mut self, s: CString) {
            extern "C" {
                fn set_string(this: *mut foo, s: *const c_char);
            }
            // SAFETY:
            // this is safe assuming:
            // - `self.pointer` is always a valid pointer to a `struct foo`
            // - `set_string` in C does not deallocate `s` (unless by leveraging the appropriate Rust code to do so)
            unsafe { set_string(self.pointer, s.into_raw()) }
        }
        pub fn get_string(&self) -> &CStr {
            extern "C" {
                fn get_string(this: *mut foo) -> *const c_char;
            }
            // SAFETY:
            // this is safe assuming:
            // - `self.pointer` is always a valid pointer to a `struct foo`
            // - `char* get_string()` always returns a valid C string given a valid `struct foo` pointer
            unsafe {
                let ptr = get_string(self.pointer);
                CStr::from_ptr(ptr)
            }
        }
        pub fn new() -> Self {
            // this or any constructors must make sure to only create `Foo`s that contain a valid `pointer`
            todo!()
        }
    }
}
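
To see what those two signatures buy you, here is a minimal pure-Rust analogue. It does no FFI at all: `Owner` and `example` are hypothetical names, and the type simply owns a `CString` itself. The method signatures match the wrapper above, though, so the borrow checker treats callers the same way.

```rust
use std::ffi::{CStr, CString};

// Hypothetical pure-Rust stand-in for the wrapper: it owns the string itself
// instead of calling into C, but the method signatures are the same.
pub struct Owner {
    s: CString,
}

impl Owner {
    pub fn new(s: &CStr) -> Self {
        Owner { s: s.to_owned() }
    }

    // Elided form of `fn get_string<'a>(&'a self) -> &'a CStr`:
    // the result cannot outlive the shared borrow of `self`.
    pub fn get_string(&self) -> &CStr {
        &self.s
    }

    // `&mut self` demands exclusive access, so no `get_string` result
    // may still be in use when this is called.
    pub fn set_string(&mut self, s: &CStr) {
        self.s = s.to_owned();
    }
}

pub fn example() -> (String, String) {
    let first = CString::new("first").unwrap();
    let second = CString::new("second").unwrap();

    let mut owner = Owner::new(&first);
    // Copy the bytes out so nothing borrowed survives the next `set_string`.
    let before = owner.get_string().to_string_lossy().into_owned();
    owner.set_string(&second);
    let after = owner.get_string().to_string_lossy().into_owned();
    (before, after)

    // Rejected by the compiler (E0502), mirroring the misuse the C docs warn about:
    //     let s = owner.get_string();
    //     owner.set_string(&second);
    //     println!("{s:?}");
}
```

Holding on to the `&CStr` across a `set_string` call is a compile error, which is exactly the rule from the C documentation. The other half of the rule (the string is only valid while the foo pointer is) falls out of the same signature, because the `&CStr` can't outlive the value it was borrowed from.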



Note: Foo::set_string in Rust is safe under the stated assumptions, but it leaks the passed-in CString every time you use it. In a production environment that's unlikely to be what you want, but dealing with it properly depends on how C's set_string expects the string to be passed, as kmdreko notes.
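
As one possible way around the leak, here is a sketch that assumes the C side copies the string during the call (the question's header doesn't say whether it does, so check the real documentation). Under that assumption the setter can take a `&CStr` and pass `as_ptr()`, so nothing is handed over and nothing leaks. The `foo` struct and the two `extern "C"` functions below are a hypothetical Rust mock of the C library, there only so the example runs without linking any C.

```rust
use std::ffi::{c_char, CStr, CString};

// Hypothetical Rust mock of the C side so the sketch runs without linking C.
// ASSUMPTION: like this mock, the real `set_string` copies its argument.
#[allow(non_camel_case_types)]
pub struct foo {
    s: *mut c_char,
}

unsafe extern "C" fn set_string(this: *mut foo, s: *const c_char) {
    unsafe {
        // Copy the caller's string, then free the previously stored copy.
        let copy = CStr::from_ptr(s).to_owned().into_raw();
        let old = std::mem::replace(&mut (*this).s, copy);
        drop(CString::from_raw(old));
    }
}

unsafe extern "C" fn get_string(this: *mut foo) -> *const c_char {
    unsafe { (*this).s }
}

pub struct Foo {
    // mustn't be `pub`
    pointer: *mut foo,
}

impl Foo {
    pub fn new() -> Self {
        // Start with an empty string so `get_string` always has something valid.
        let inner = foo { s: CString::default().into_raw() };
        Foo { pointer: Box::into_raw(Box::new(inner)) }
    }

    pub fn set_string(&mut self, s: &CStr) {
        // SAFETY: `self.pointer` is valid and `s` outlives the call; because
        // the callee copies the bytes, it keeps no pointer into `s`.
        unsafe { set_string(self.pointer, s.as_ptr()) }
    }

    pub fn get_string(&self) -> &CStr {
        // SAFETY: `self.pointer` is valid and the callee returns a valid,
        // NUL-terminated string that lives until the next `set_string`.
        unsafe { CStr::from_ptr(get_string(self.pointer)) }
    }
}

impl Drop for Foo {
    fn drop(&mut self) {
        // Only the mock is freed this way; a real binding would call
        // whatever destructor the C API provides.
        unsafe {
            let inner = Box::from_raw(self.pointer);
            drop(CString::from_raw(inner.s));
        }
    }
}

pub fn example() -> String {
    let mut wrapper = Foo::new();
    let owned = CString::new("hello").unwrap();
    wrapper.set_string(&owned);
    drop(owned); // fine under the copy assumption: C holds its own copy
    wrapper.get_string().to_string_lossy().into_owned()
}
```

If the real set_string instead stores the pointer it was given, this signature would be unsound, and the wrapper would have to keep the CString alive itself (for example in a field of Foo) for as long as C may read it.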

2024-07-07
cafce25