Hey! This is my second post, technically, but it's my first real one. (Just gonna ignore the t3d post.)
I also haven't used this account in forever, but whatever.
In this post, I'm going to cover what rs4j is, how to use it, and how I built it.
rs4j is a Rust library I created to ease the creation of Java libraries that use native code written in Rust. It generates JNI (Java Native Interface) code to accomplish this.
rs4j lets you offload compute-heavy work to a much faster runtime instead of running it all in the JVM (looking at you, garbage collector) and ruining performance. Minecraft mods like Create Aeronautics (or Create Simulated, to be more correct) use this technique for some of their physics calculations, which would otherwise be very laggy in Java.
rs4j lets you create native interfaces like this, or port entire libraries for use from Java, with minimal code.
Using it is easy! Just follow these steps:
# Cargo.toml
[lib]
crate-type = ["cdylib"]
cargo add rs4j
cargo add rs4j --build -F build # Enable the `build` feature
# Also add anyhow for error handling
cargo add anyhow --build
// build.rs
use rs4j::build::BindgenConfig;
use anyhow::Result;
fn main() -> Result<()> {
    // Make a new config
    BindgenConfig::new()
        // Set the package for export
        .package("your.package.here")
        // Where to save the Rust bindings
        .bindings(format!("{}/src/bindings.rs", env!("CARGO_MANIFEST_DIR")))
        // Where the input files are
        .glob(format!("{}/bindings/**/*.rs4j", env!("CARGO_MANIFEST_DIR")))?
        // Where to save java classes (is a directory)
        .output(format!("{}/java", env!("CARGO_MANIFEST_DIR")))
        // Enable JetBrains annotations (this is a TODO on my end)
        .annotations(true)
        // Go!
        .generate()?;
    Ok(())
}
rs4j uses a post-build script to run extra actions after the build finishes.
This is technically optional, but recommended.
# Cargo.toml
[features]
default = []
post-build = ["rs4j/build", "anyhow"]
[[bin]]
name = "post-build"
path = "post-build.rs"
required-features = ["post-build"]
[dependencies]
anyhow = { version = "[...]", optional = true } # Set the version to whatever you want
rs4j = "[...]" # Whatever you had before
// post-build.rs
use anyhow::Result;
use rs4j::build::BindgenConfig;
fn main() -> Result<()> {
    let out_path = format!("{}/generated", env!("CARGO_MANIFEST_DIR"));
    let src_path = format!("{}/java/src/generated", env!("CARGO_MANIFEST_DIR"));
    BindgenConfig::new()
        // This should be the same as the normal buildscript
        .package("com.example")
        .bindings(format!("{}/src/bindings.rs", env!("CARGO_MANIFEST_DIR")))
        .glob(format!("{}/bindings/**/*.rs4j", env!("CARGO_MANIFEST_DIR")))?
        .output(&out_path)
        .annotations(false)
        // Run post-build actions
        .post_build()?
        // Copy it to your Java project
        .copy_to(src_path)?;
    Ok(())
}
Installing the CLI is optional; skip it if you don't want to use the post-build script.
cargo install rs4j --features cli
Modify any scripts like so:
- cargo build
+ rs4j build # `rs4j build` supports all of `cargo build`'s arguments after a `--`.
Here's a basic run-down of the syntax:
// This class, Thing, takes in one type parameter, `A`.
// You can omit this if it doesn't take any type parameters.
class Thing<A> {
    // This makes it so that Rust knows that the type for `A`
    // will have `Clone + Copy`. This doesn't change anything
    // on the Java side, it's just so that Rust will compile.
    bound A: Clone + Copy;
    // This will generate getters and setters for the field `some`.
    field some: i32;
    // Here, the Rust function's name is `new`, and Java will treat
    // it as a constructor.
    static init fn new(value: A) -> Thing;
    // This gets the value. Since this is in snake_case, rs4j will
    // automatically convert it into camelCase, renaming this to
    // `getValue` on the Java side.
    fn get_value() -> A;
    // This marks this function as mutable, meaning in Rust it will
    // mutate the struct, as if it took a `&mut self` as an argument.
    mut fn set_value(value: A);
    // You can even include trait methods, as long as Rust can find the
    // trait it belongs to!
    fn clone() -> A;
};
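The snake_case to camelCase renaming mentioned in the comments above can be sketched in a few lines. This is only an illustration of the idea: `to_camel_case` is a hypothetical helper name, and rs4j's real conversion may handle edge cases differently.

```rust
/// Converts a snake_case identifier to camelCase.
/// Hypothetical helper for illustration; not taken from rs4j's source.
fn to_camel_case(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    let mut upper_next = false;
    for ch in name.chars() {
        if ch == '_' {
            // Drop the underscore and capitalize the next character,
            // unless nothing has been emitted yet (leading underscore).
            upper_next = !out.is_empty();
        } else if upper_next {
            out.extend(ch.to_uppercase());
            upper_next = false;
        } else {
            out.push(ch);
        }
    }
    out
}
```

With this sketch, `get_value` becomes `getValue`, and a name without underscores such as `new` is left unchanged.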
rs4j uses a PEG parser to process its language. The parser turns the input directly into an abstract syntax tree, which is then turned into code.
rs4j is strongly-typed. I have a Type struct and a TypeKind enum to accomplish this.
These are parsed using this code:
parser! {
    /// The rs4j parser.
    pub grammar rs4j_parser() for str {
        ...
        // Type kinds
        rule _u8_k() -> TypeKind = "u8" { TypeKind::U8 }
        rule _u16_k() -> TypeKind = "u16" { TypeKind::U16 }
        rule _u32_k() -> TypeKind = "u32" { TypeKind::U32 }
        rule _u64_k() -> TypeKind = "u64" { TypeKind::U64 }
        rule _i8_k() -> TypeKind = "i8" { TypeKind::I8 }
        rule _i16_k() -> TypeKind = "i16" { TypeKind::I16 }
        rule _i32_k() -> TypeKind = "i32" { TypeKind::I32 }
        rule _i64_k() -> TypeKind = "i64" { TypeKind::I64 }
        rule _f32_k() -> TypeKind = "f32" { TypeKind::F32 }
        rule _f64_k() -> TypeKind = "f64" { TypeKind::F64 }
        rule _bool_k() -> TypeKind = "bool" { TypeKind::Bool }
        rule _char_k() -> TypeKind = "char" { TypeKind::Char }
        rule _str_k() -> TypeKind = "String" { TypeKind::String }
        rule _void_k() -> TypeKind = "()" { TypeKind::Void }
        rule _other_k() -> TypeKind = id: _ident() { TypeKind::Other(id) }
        rule _uint_k() -> TypeKind = _u8_k() / _u16_k() / _u32_k() / _u64_k()
        rule _int_k() -> TypeKind = _i8_k() / _i16_k() / _i32_k() / _i64_k()
        rule _float_k() -> TypeKind = _f32_k() / _f64_k()
        rule _extra_k() -> TypeKind = _bool_k() / _char_k() / _str_k() / _void_k()
        ...
    }
}
As you can see, there's a different rule for each primitive type, and then a catch-all. This allows me to verify and output the correct code easily.
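To show why a separate kind per primitive plus a catch-all is handy, here is a rough standalone sketch that does not use the peg crate. The enum below is a trimmed-down stand-in for the real `TypeKind`, and `parse_kind` and `java_name` are hypothetical names; the Java type mapping is my assumption of what a generator like this would want, not a copy of rs4j's tables.

```rust
/// Trimmed-down stand-in for the TypeKind enum (only a few variants).
#[derive(Debug, Clone, PartialEq)]
enum TypeKind {
    I32,
    I64,
    F64,
    Bool,
    String,
    Void,
    /// Catch-all: any identifier that is not a known primitive.
    Other(String),
}

/// Maps a type name to its kind, falling back to `Other`.
/// Hypothetical hand-written equivalent of the grammar rules.
fn parse_kind(src: &str) -> TypeKind {
    match src {
        "i32" => TypeKind::I32,
        "i64" => TypeKind::I64,
        "f64" => TypeKind::F64,
        "bool" => TypeKind::Bool,
        "String" => TypeKind::String,
        "()" => TypeKind::Void,
        other => TypeKind::Other(other.to_string()),
    }
}

/// Picks the Java-side type name for a kind (assumed mapping).
fn java_name(kind: &TypeKind) -> String {
    match kind {
        TypeKind::I32 => "int".to_string(),
        TypeKind::I64 => "long".to_string(),
        TypeKind::F64 => "double".to_string(),
        TypeKind::Bool => "boolean".to_string(),
        TypeKind::String => "String".to_string(),
        TypeKind::Void => "void".to_string(),
        // User-defined types keep their own name on the Java side.
        TypeKind::Other(name) => name.clone(),
    }
}
```

Because every primitive is its own variant, the code generator can `match` exhaustively and the compiler flags any kind that was forgotten.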
You can see more of the parser here.
rs4j uses a custom codegen system that heavily uses format!() to create the code. While this isn't the most correct or safe, it creates correct code in almost all of my tests (the only issue is generics, which I'm working on).
The codegen is done with each AST node having its own functions to turn it into Java and Rust code.
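As a rough picture of what "each AST node turns itself into code with format!()" can look like, here is a minimal sketch. The `Method` struct and its method names are hypothetical and much simpler than the real AST; the only thing taken from rs4j's actual output is the shape of the generated strings (`jni_` and `__wrapped_` prefixes, leading `long ptr` argument).

```rust
/// Hypothetical AST node for a non-static method.
/// `params` holds (name, Java type) pairs.
struct Method {
    name: String,
    ret: String,
    params: Vec<(String, String)>,
}

impl Method {
    /// Emits the private native declaration for the Java class.
    fn to_java_native(&self) -> String {
        // Every instance method receives the pointer first.
        let mut args = vec!["long ptr".to_string()];
        for (name, ty) in &self.params {
            args.push(format!("{} {}", ty, name));
        }
        format!(
            "private static native {} jni_{}({});",
            self.ret,
            self.name,
            args.join(", ")
        )
    }

    /// Emits the name of the wrapper function on the Rust side.
    fn to_rust_wrapper_name(&self) -> String {
        format!("__wrapped_{}", self.name)
    }
}
```

String formatting like this is easy to read and debug, but nothing checks that the output is valid Java or Rust, which is the trade-off mentioned above.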
In your lib.rs, you have to include!() your bindings.rs file, which contains the native implementations.
Each struct you generate bindings for will be wrapped with JNI. Here's an example of what this looks like:
class MyOtherStruct {
    field a: String;
    field b: MyStruct;
    static init fn new() -> Self;
    fn say_only(message: String);
    fn say(p2: String);
    fn say_with(p1: MyStruct, p2: String);
};
// lib.rs
...
#[derive(Debug)]
pub struct MyOtherStruct {
    pub a: String,
    pub b: MyStruct,
}
impl MyOtherStruct {
    pub fn new() -> Self {
        Self {
            a: String::new(),
            b: MyStruct::new(),
        }
    }
    pub fn say_only(&self, message: String) {
        println!("{}", message);
    }
    pub fn say(&self, p2: String) {
        println!("{}{}", self.b.a, p2);
    }
    pub fn say_with(&self, p1: MyStruct, p2: String) {
        println!("{}{}", p1.a, p2);
    }
}
include!("bindings.rs");
// bindings.rs
// #[allow(...)] statements have been removed for brevity.
#[allow(non_camel_case_types)]
pub struct __JNI_MyOtherStruct {
    pub a: String,
    pub b: *mut MyStruct,
}
impl __JNI_MyOtherStruct {
    pub unsafe fn of(base: MyOtherStruct) -> Self {
        Self {
            a: base.a.clone(),
            // yes, this is an intentional memory leak.
            b: Box::leak(Box::new(base.b)) as *mut MyStruct,
        }
    }
    pub unsafe fn to_rust(&self) -> MyOtherStruct {
        MyOtherStruct {
            a: self.a.clone(),
            b: (&mut *self.b).clone(),
        }
    }
    pub unsafe fn __wrapped_new() -> Self {
        let base = MyOtherStruct::new();
        Self::of(base)
    }
    pub unsafe fn __wrapped_say_only(&self, message: String) -> () {
        MyOtherStruct::say_only(&self.to_rust(), message).clone()
    }
    pub unsafe fn __wrapped_say(&self, p2: String) -> () {
        MyOtherStruct::say(&self.to_rust(), p2).clone()
    }
    pub unsafe fn __wrapped_say_with(&self, p1: MyStruct, p2: String) -> () {
        MyOtherStruct::say_with(&self.to_rust(), p1, p2).clone()
    }
}
When an object is constructed, it calls the wrapped method, which intentionally leaks every nested object to get its pointer. This lets me access the object whenever I need to, in any context.
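The leak-then-point trick is easier to see without any JNI around it. The sketch below uses only the standard library; `Counter` and the four function names are made up for illustration. It leaks a boxed value, passes its address around as an `i64` (the Rust-side type of a Java `long`), and later reclaims it.

```rust
/// Hypothetical stand-in for a struct exposed to Java.
struct Counter {
    value: i32,
}

/// Moves the value to the heap, leaks it, and returns its address
/// as a 64-bit handle that Java could store in a `long`.
fn into_handle(counter: Counter) -> i64 {
    Box::leak(Box::new(counter)) as *mut Counter as i64
}

/// Reads through the handle.
/// Safety: `handle` must come from `into_handle` and not be freed yet.
unsafe fn get(handle: i64) -> i32 {
    (*(handle as *const Counter)).value
}

/// Writes through the handle and hands the handle back to the caller.
/// Safety: same requirements as `get`.
unsafe fn set(handle: i64, value: i32) -> i64 {
    (*(handle as *mut Counter)).value = value;
    handle
}

/// Reclaims ownership of the leaked box so it is dropped normally.
/// Safety: `handle` must not be used again after this call.
unsafe fn free(handle: i64) {
    drop(Box::from_raw(handle as *mut Counter));
}
```

Nothing stops a caller from passing a stale or made-up handle, which is exactly why every function that dereferences it has to be `unsafe`.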
All methods are wrapped to allow JNI to call them much more easily.
Speaking of which, the JNI code looks like this:
// This is a field, here's the getter and setter.
// #[allow(...)] statements have been removed for brevity.
#[no_mangle]
pub unsafe extern "system" fn Java_com_example_MyOtherStruct_jni_1set_1a<'local>(
    mut env: JNIEnv<'local>,
    class: JClass<'local>,
    ptr: jlong,
    val: JString<'local>,
) -> jlong {
    let it = &mut *(ptr as *mut __JNI_MyOtherStruct);
    let val = env.get_string(&val).unwrap().to_str().unwrap().to_string();
    it.a = val;
    ptr as jlong
}
#[no_mangle]
pub unsafe extern "system" fn Java_com_example_MyOtherStruct_jni_1get_1a<'local>(
    mut env: JNIEnv<'local>,
    class: JClass<'local>,
    ptr: jlong,
) -> jstring {
    let it = &*(ptr as *mut __JNI_MyOtherStruct);
    env.new_string(it.a.clone()).unwrap().as_raw()
}
This is pretty standard stuff for the jni crate, except for accessing the object. That &*(ptr as *mut __JNI_MyOtherStruct) might look unsafe, and that's because it is. This is intentional, however, as the pointer should always be valid if done correctly.
Notice that at the end of the setter, it returns the pointer of the object. This is intended. This allows Java to reset its internal pointer, keeping track of the latest valid pointer.
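If the `_1` in names like `Java_com_example_MyOtherStruct_jni_1set_1a` looks odd: that comes from JNI's name-mangling rules, where dots in the class name become `_` and a literal underscore becomes `_1`. Here is a small sketch of those rules; `mangle` and `jni_symbol` are hypothetical names, and this is not how rs4j itself is necessarily written. Overloaded native methods additionally get a `__` plus mangled signature suffix, which this sketch leaves out.

```rust
/// Escapes one name component following the JNI mangling rules.
fn mangle(part: &str) -> String {
    let mut out = String::new();
    for ch in part.chars() {
        match ch {
            // Package separators become a plain underscore.
            '.' | '/' => out.push('_'),
            // A literal underscore is escaped so it stays unambiguous.
            '_' => out.push_str("_1"),
            ';' => out.push_str("_2"),
            '[' => out.push_str("_3"),
            c if c.is_ascii_alphanumeric() => out.push(c),
            // Anything else is written as _0xxxx per UTF-16 unit.
            c => {
                let mut buf = [0u16; 2];
                for unit in c.encode_utf16(&mut buf).iter() {
                    out.push_str(&format!("_0{:04x}", unit));
                }
            }
        }
    }
    out
}

/// Builds the exported symbol for a native method of `class`.
fn jni_symbol(class: &str, method: &str) -> String {
    format!("Java_{}_{}", mangle(class), mangle(method))
}
```

For example, `jni_symbol("com.example.MyOtherStruct", "jni_set_a")` gives the setter name used in the code above.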
Freeing memory essentially reclaims the pointer and then drops it. It frees all non-primitive fields too.
// #[allow(...)] statements have been removed for brevity.
#[no_mangle]
pub unsafe extern "system" fn Java_com_example_MyOtherStruct_jni_1free<'local>(
    _env: JNIEnv<'local>,
    _class: JClass<'local>,
    ptr: jlong,
) {
    // Reclaim the pointer
    let it = Box::from_raw(ptr as *mut __JNI_MyOtherStruct);
    // Reclaim the other field
    let _ = Box::from_raw(it.b);
}
There is a known bug with this method, however, which is that the method will always end up leaking memory if there is a nested object more than one level deep. I have some ideas on how to fix this, but I've been focused on other things.
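To make the problem a bit more concrete: a raw pointer has no destructor, so dropping the outer wrapper does nothing for whatever it points to. Each leaked level has to be reclaimed by hand. The sketch below uses made-up `Inner` and `Outer` types and a drop counter to show one level being reclaimed correctly; it is only an illustration of the mechanism, not rs4j's code and not a fix for the bug.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many `Inner` values have actually been dropped.
static DROPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical nested object.
struct Inner {
    _id: u32,
}

impl Drop for Inner {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hypothetical wrapper that holds its nested object as a leaked pointer.
struct Outer {
    inner: *mut Inner,
}

/// Leaks both levels and returns the outer address as a handle.
fn new_outer() -> i64 {
    let inner = Box::leak(Box::new(Inner { _id: 1 })) as *mut Inner;
    Box::leak(Box::new(Outer { inner })) as *mut Outer as i64
}

/// Reclaims the outer box, then the nested one.
/// Safety: `handle` must come from `new_outer` and be freed only once.
unsafe fn free_outer(handle: i64) {
    let outer = Box::from_raw(handle as *mut Outer);
    // Without this line, `Inner` would never be dropped.
    drop(Box::from_raw(outer.inner));
}
```

If `Inner` itself held another leaked pointer, `free_outer` would need to know about that too, which is where a free routine that only looks one level down starts to leak.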
Every Java class that rs4j generates implements two interfaces, ParentClass and NativeClass.
Here's the definition of both:
// NativeClass.java
package org.stardustmodding.rs4j.util;
public interface NativeClass {
    long getPointer();
}
// ParentClass.java
package org.stardustmodding.rs4j.util;
public interface ParentClass {
    void updateField(String field, long pointer);
}
Each class is made up of a few parts, including:
// Notice how all of these functions take a `long ptr` as an argument. This is the pointer to the underlying struct in Rust.
// This is a constructor - it takes no pointer but returns one.
private native long jni_init_new();
// Methods
private static native void jni_say_only(long ptr, String message);
private static native void jni_say(long ptr, String p2);
private static native void jni_say_with(long ptr, long p1, String p2);
// Getters & Setters
private static native long jni_set_a(long ptr, String value);
private static native String jni_get_a(long ptr);
// Notice how this field isn't primitive, so it uses the pointer instead.
private static native long jni_set_b(long ptr, long value);
private static native long jni_get_b(long ptr);
// Freeing memory
private static native void jni_free(long ptr);

// The pointer to the Rust object
private long __ptr = -1;
// If this is a field in another class, it keeps track of it for updating purposes
private ParentClass __parent = null;
// The name of the field in the other class
private String __parentField = null;

public MyOtherStruct() {
    // Sets the pointer using the constructor
    __ptr = jni_init_new();
}

// Notice how these all just call the JNI method, providing the pointer.
public void sayOnly(String message) {
    jni_say_only(__ptr, message);
}
public void say(String p2) {
    jni_say(__ptr, p2);
}
public void sayWith(MyStruct p1, String p2) {
    jni_say_with(__ptr, p1.getPointer(), p2);
}

// Notice how the setters all update the field in the parent. This allows the user to have Java-like behavior, where modifying a class that is a property of another will update that reference.
public void setA(String value) {
    __ptr = jni_set_a(__ptr, value);
    if (__parent != null) {
        __parent.updateField(__parentField, __ptr);
    }
}
public String getA() {
    return jni_get_a(__ptr);
}
public void setB(MyStruct value) {
    // .getPointer() gets the underlying pointer, this is from the NativeClass interface.
    __ptr = jni_set_b(__ptr, value.getPointer());
    if (__parent != null) {
        __parent.updateField(__parentField, __ptr);
    }
}
public MyStruct getB() {
    // Essentially this is a glorified cast.
    return MyStruct.from(jni_get_b(__ptr), this, "b");
}

// Just creates an instance from a pointer.
private MyOtherStruct(long ptr) {
    __ptr = ptr;
}
// Creates an instance from a pointer, with a parent
private MyOtherStruct(long ptr, ParentClass parent, String parentField) {
    __ptr = ptr;
    __parent = parent;
    __parentField = parentField;
}
// These are for other classes to "cast" to this class.
public static MyOtherStruct from(long ptr) {
    return new MyOtherStruct(ptr);
}
public static MyOtherStruct from(long ptr, ParentClass parent, String parentField) {
    return new MyOtherStruct(ptr, parent, parentField);
}

// I'M FREE!!!!
// This is ESSENTIAL for memory management, as Rust will otherwise never know when to free the memory that was leaked.
public void free() {
    jni_free(__ptr);
}
// Override from NativeClass.
@Override
public long getPointer() {
    return __ptr;
}
// Override from ParentClass.
@Override
public void updateField(String field, long pointer) {
    // `b` is non-primitive, so when it's updated it also has to be updated here.
    // Compare string contents with equals(); `==` only compares references.
    if ("b".equals(field)) {
        __ptr = jni_set_b(__ptr, pointer);
    }
}
This project is probably one of my proudest projects right now, as it's taken so much work and is proving to be pretty useful for me. I hope you'll check it out and play around with it, too!
Anyway, see you in the next one! I'll try to post more often if I can!
Thanks to @RyanHCode for giving me a few tips on this!