Lisp globals#
Functions#
This remacs documentation compares the C implementation for the atan
function with the ported Rust version. Since emacs-ng isn't about
porting the C code base, this example is only intended to show the
differences. However, these features can be used by functions that are
part of new features.
The first thing to look at is atan. It takes an optional second
argument, which makes it interesting. The complicated mathematical
bits, on the other hand, are handled by the standard library. This
allows us to focus on the porting process without getting distracted
by the math.
The Lisp values we are given as arguments are tagged pointers; in this
case they are pointers to doubles. The code has to check the tag and
follow the pointer to retrieve the real values. Note that this code
invokes a C macro (called DEFUN) that reduces some of the
boilerplate. The macro declares a static variable called Satan that
holds the metadata the Lisp compiler will need in order to
successfully call this function, such as the docstring and the pointer
to the Fatan function, which is what the C implementation is named:
DEFUN ("atan", Fatan, Satan, 1, 2, 0,
       doc: /* Return the inverse tangent of the arguments.
If only one argument Y is given, return the inverse tangent of Y.
If two arguments Y and X are given, return the inverse tangent of Y
divided by X, i.e. the angle in radians between the vector (X, Y)
and the x-axis.  */)
  (Lisp_Object y, Lisp_Object x)
{
  double d = extract_float (y);
  if (NILP (x))
    d = atan (d);
  else
    {
      double d2 = extract_float (x);
      d = atan2 (d, d2);
    }
  return make_float (d);
}
extract_float checks the tag (signaling an "invalid argument" error
if it's not the tag for a double), and returns the actual
value. NILP checks to see if the tag indicates that this is a null
value, indicating that the user didn't supply a second argument at
all.
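To make the tag check concrete, here is a toy model of the same control flow. It is only a sketch: the `LispValue` enum, the `WrongType` error, and the `nilp` helper are hypothetical names invented for illustration, and the real Emacs representation packs the tag into the bits of a machine word rather than using an enum.

```rust
// Toy model of a tagged Lisp value (hypothetical; not the Emacs layout).
#[derive(Debug, Clone, Copy, PartialEq)]
enum LispValue {
    Nil,
    Int(i64),
    Float(f64),
}

// Hypothetical error type standing in for a signaled Lisp error.
#[derive(Debug, PartialEq)]
struct WrongType;

// Plays the role of `extract_float`: check the tag, then unwrap the value.
fn extract_float(v: LispValue) -> Result<f64, WrongType> {
    match v {
        LispValue::Float(d) => Ok(d),
        _ => Err(WrongType),
    }
}

// Plays the role of `NILP`: was this argument omitted?
fn nilp(v: LispValue) -> bool {
    v == LispValue::Nil
}

// Same shape as the C function above: one or two arguments.
fn atan(y: LispValue, x: LispValue) -> Result<LispValue, WrongType> {
    let d = extract_float(y)?;
    let result = if nilp(x) {
        d.atan()
    } else {
        d.atan2(extract_float(x)?)
    };
    Ok(LispValue::Float(result))
}
```

Every step the C code performs by hand (tag check, unwrap, re-wrap) shows up explicitly in this model.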
Next take a look at the current Rust implementation. It must also take an optional argument, and it also invokes a (Rust) macro to reduce the boilerplate of declaring the static data for the function. However, it also takes care of all of the type conversions and checks that we need to do in order to handle the arguments and return value:
/// Return the inverse tangent of the arguments.
/// If only one argument Y is given, return the inverse tangent of Y.
/// If two arguments Y and X are given, return the inverse tangent of Y
/// divided by X, i.e. the angle in radians between the vector (X, Y)
/// and the x-axis
#[lisp_fn(min = "1")]
pub fn atan(y: EmacsDouble, x: Option<EmacsDouble>) -> EmacsDouble {
    match x {
        None => y.atan(),
        Some(x) => y.atan2(x)
    }
}
You can see that we don't have to check whether our arguments are of
the correct type; the code generated by the lisp_fn macro does this
for us. We also asked for the second argument to be an
Option<EmacsDouble>. This is the Rust type for a value which is
either a valid double or isn't specified at all. We use a match
expression to handle both cases.
This code is so much better that it's hard to believe just how simple
the implementation of the macro is. It just calls .into() on the
arguments and the return value; the compiler does the rest when it
dispatches this method call to the correct implementation.
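The conversion idea can be sketched with ordinary `From` implementations. Everything below is an assumption made for illustration: `LispObject` is a toy enum rather than the real tagged word, the wrapper `Fatan` is written by hand instead of generated, and a panic stands in for signaling a Lisp error.

```rust
// Toy stand-in for the real `LispObject` (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum LispObject {
    Nil,
    Float(f64),
}

type EmacsDouble = f64;

// Required argument: must be a float.
impl From<LispObject> for EmacsDouble {
    fn from(o: LispObject) -> Self {
        match o {
            LispObject::Float(d) => d,
            // The real conversion would signal a Lisp error instead.
            LispObject::Nil => panic!("wrong type argument"),
        }
    }
}

// Optional argument: nil becomes `None`.
impl From<LispObject> for Option<EmacsDouble> {
    fn from(o: LispObject) -> Self {
        match o {
            LispObject::Nil => None,
            LispObject::Float(d) => Some(d),
        }
    }
}

// Return value: wrap the double back into a Lisp object.
impl From<EmacsDouble> for LispObject {
    fn from(d: EmacsDouble) -> Self {
        LispObject::Float(d)
    }
}

fn atan(y: EmacsDouble, x: Option<EmacsDouble>) -> EmacsDouble {
    match x {
        None => y.atan(),
        Some(x) => y.atan2(x),
    }
}

// Roughly the shape a generated wrapper could take: the parameter and
// return types of `atan` select which `From` implementation is used.
#[allow(non_snake_case)]
fn Fatan(y: LispObject, x: LispObject) -> LispObject {
    atan(y.into(), x.into()).into()
}
```

The wrapper itself never names a target type; the signature of `atan` carries all of that information.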
Attributes#
This macro creates the necessary FFI functions and symbols
automatically. It handles normal functions and functions that take an
arbitrary number of arguments (functions with MANY as the maximum
number of arguments on the C side).
It is used like this:
/// Return the same argument
#[lisp_fn(name = "same", c_name = "same", min = "1")]
fn same(obj: LispObject) -> LispObject {
    obj
}
Here the name argument specifies the symbol name that is going
to be used in Emacs Lisp, c_name specifies the name for the Fsame
and Ssame statics used in C, and min specifies the minimum
number of arguments that can be passed to this function. The
maximum number of arguments is calculated automatically from the
function signature.
All three of these arguments are optional, and have sane defaults.
The default for name is the Rust function name with _ replaced by -.
The default for c_name is the Rust function name. The default for min is
the number of Rust arguments, giving a function without optional
arguments.
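The default rules above can be written down as a small function. This is a sketch of the rules as described here, not the macro's actual code; `LispFnArgs` and `resolve_args` are hypothetical names.

```rust
// Hypothetical model of the three attribute settings after defaults apply.
#[derive(Debug, PartialEq)]
struct LispFnArgs {
    name: String,
    c_name: String,
    min: usize,
}

// Apply the documented defaults to whatever the user left unspecified.
fn resolve_args(
    rust_name: &str,
    rust_arg_count: usize,
    name: Option<&str>,
    c_name: Option<&str>,
    min: Option<usize>,
) -> LispFnArgs {
    LispFnArgs {
        // Lisp symbol: Rust name with `_` replaced by `-`.
        name: name
            .map(String::from)
            .unwrap_or_else(|| rust_name.replace('_', "-")),
        // C name: the Rust name unchanged.
        c_name: c_name
            .map(String::from)
            .unwrap_or_else(|| rust_name.to_string()),
        // Minimum arity: every Rust argument is required.
        min: min.unwrap_or(rust_arg_count),
    }
}
```

For a one-argument Rust function named `syntax_table_p` with no attribute arguments, this model yields the Lisp name `syntax-table-p`, the C name `syntax_table_p`, and a minimum of one argument.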
In this example the attribute generates the Fsame function that is
going to be called in C, and the Ssame structure that holds the
function information. You still need to register the function with
defsubr to make it visible in Emacs Lisp. To make a function visible
to C you need to export it in the crate root (lib.rs) as follows:
use somemodule::Fsome;
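To illustrate why registration is a separate step, here is a simplified model of a function table. It is purely hypothetical: the `Subr` fields, the `Runtime` type, and the use of a plain integer for `LispObject` are assumptions made for this sketch and do not mirror the real C structures.

```rust
use std::collections::HashMap;

// Toy stand-in for a Lisp value (hypothetical).
type LispObject = i64;

// Simplified metadata record, playing the role of an `S...` static.
struct Subr {
    symbol_name: &'static str,
    min_args: usize,
    max_args: usize,
    function: fn(&[LispObject]) -> LispObject,
}

fn same(args: &[LispObject]) -> LispObject {
    args[0]
}

#[allow(non_upper_case_globals)]
static Ssame: Subr = Subr {
    symbol_name: "same",
    min_args: 1,
    max_args: 1,
    function: same,
};

// Toy runtime: a function exists for Lisp only once it is in the table.
struct Runtime {
    table: HashMap<&'static str, &'static Subr>,
}

impl Runtime {
    fn new() -> Self {
        Runtime { table: HashMap::new() }
    }

    // Plays the role of `defsubr`.
    fn defsubr(&mut self, subr: &'static Subr) {
        self.table.insert(subr.symbol_name, subr);
    }

    // Look the function up by name and check the arity before calling it.
    fn call(&self, name: &str, args: &[LispObject]) -> Option<LispObject> {
        let subr = self.table.get(name)?;
        if args.len() < subr.min_args || args.len() > subr.max_args {
            return None;
        }
        Some((subr.function)(args))
    }
}
```

In this model, defining `Ssame` alone changes nothing observable; only the `defsubr` call makes `same` callable by name.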
Functions with a dynamic number of arguments (MANY)#
This attribute also handles the definition of functions that take an
arbitrary number of arguments. You can still specify a min number of
arguments for such functions.
They are created as follows:
/// Returns the first argument.
#[lisp_fn(min = "1")]
fn first(args: &mut [LispObject]) -> LispObject {
    args[0]
}
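One way to picture what happens at the FFI boundary for such a function is a wrapper that receives a count and a pointer and rebuilds the slice. This is a sketch under assumptions: the `(nargs, args)` signature and the toy `LispObject` newtype are illustrative and are not taken from the generated code.

```rust
// Toy stand-in for the real `LispObject` (hypothetical).
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct LispObject(i64);

// The user-written function works on an ordinary slice.
fn first(args: &mut [LispObject]) -> LispObject {
    args[0]
}

// Assumed shape of the C-facing entry point for a MANY function.
#[allow(non_snake_case)]
unsafe extern "C" fn Ffirst(nargs: isize, args: *mut LispObject) -> LispObject {
    // Safety: the caller promises that `args` points to `nargs` valid,
    // initialized objects that stay alive for the duration of the call.
    let slice = unsafe { std::slice::from_raw_parts_mut(args, nargs as usize) };
    first(slice)
}
```

Indexing `args[0]` is only sound here because a minimum of one argument is declared, so the caller is expected never to pass an empty argument list.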
Variables#
At the end of the C file where the DEFUN is defined there is a function called syms_of.... In this function the C code calls defsubr to set up the link between the C code and the Lisp engine. When porting a DEFUN from C, the defsubr call needs to be removed as well. For instance, if syntax-table-p is being ported, then find the line like defsubr (&Ssyntax_table_p); and remove it. All Rust functions declared with lisp_fn have a defsubr line generated for them by the build, so there is nothing to do on the Rust side.
DEFSYM
In C, the DEFSYM macro is used to create an entry in the Lisp symbol table. These are analogous to global variables in the C/Rust code. Like defsubr, you will most often see these in the syms_of... functions. When porting DEFUNs, check to see if there is a matching DEFSYM as well. If there is, remove it from the C code and add a line like this below the ported Rust code: def_lisp_sym!(Qsyntax_table_p, "syntax-table-p");.
Lisp Variables
You may also be aware that the C code must quickly and frequently access the current value of a large number of Lisp variables. To make this possible, the C code stores these values in global variables. Yes, lots of global variables. In fact, these aren't just file-level globals accessible to only one translation unit; they are variables that are accessible across the whole program. We've started porting these to Rust now as well.
DEFVAR_LISP ("post-self-insert-hook", Vpost_self_insert_hook,
doc: /* Hook run at the end of `self-insert-command'.
This is run after inserting the character. */);
Vpost_self_insert_hook = Qnil;
Like DEFUN, DEFVAR_LISP takes both a Lisp name and the C name. The C name becomes the name of the global variable, while the Lisp name is what gets used in Lisp source code. Setting the default value of this variable happens in a separate statement, which is fine.
/// Hook run at the end of `self-insert-command'.
/// This is run after inserting the character.
defvar_lisp!(Vpost_self_insert_hook, "post-self-insert-hook", Qnil);
The Rust version must still take both names (this could be simplified if we wrote this macro using a procedural macro), but it also takes a default value. As before, the docstring becomes a comment which all other Rust tooling will recognize.
You might be interested in how this is implemented as well:
#define DEFVAR_LISP(lname, vname, doc) \
  do { \
    static struct Lisp_Objfwd o_fwd; \
    defvar_lisp (&o_fwd, lname, &globals.f_ ## vname); \
  } while (false)
The C macro is not very complicated, but there are two somewhat subtle points. The macro creates an (uninitialized) static variable called o_fwd, of type Lisp_Objfwd. This gives the Lisp runtime access to the variable's value, which is a Lisp_Object. It then calls the defvar_lisp function to initialize the fields of this struct, and also to register the variable in the Lisp runtime's global environment, making it accessible to Lisp code.
The first subtle point is that every invocation of this macro uses the same variable name, o_fwd. If you invoked this macro more than once inside the same scope, the invocations would all refer to the exact same static variable. Instead, the macro body is wrapped inside a do { ... } while (false) loop so that each one has a separate little scope to live in.
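The same scoping trick works in Rust, where a static declared inside a block is private to that block. The following is a minimal sketch; `bump_both` and `SLOT` are names made up for this example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Two blocks each declare a static named `SLOT`. Because every block is
// its own scope, these are two independent variables.
fn bump_both() -> (usize, usize) {
    let first = {
        static SLOT: AtomicUsize = AtomicUsize::new(0);
        SLOT.fetch_add(1, Ordering::SeqCst) + 1
    };
    let second = {
        static SLOT: AtomicUsize = AtomicUsize::new(0);
        SLOT.fetch_add(10, Ordering::SeqCst) + 10
    };
    (first, second)
}
```

If the two declarations shared storage, the second counter would also see the increment made by the first; instead each one advances independently across calls.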
The other subtlety is that the Lisp_Objfwd struct actually only has a pointer to the value. We still have to allocate some storage for that value somewhere. We take the address of a field on something called globals here. That's the real storage location. This globals object is just a big global struct that holds all the global variables. One day when Emacs is really multi-threaded, there can be one of these per thread and a lot of the rest of the code will just work.
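The relationship between the forwarding struct and its storage can be modeled in safe Rust. This is only an illustration: `Globals`, `ObjFwd`, the field names, and the integer `LispObject` are hypothetical, and the real code uses raw pointers into a C struct rather than references to `Cell`.

```rust
use std::cell::Cell;

// Toy stand-in for a Lisp value (hypothetical).
type LispObject = i64;

// Miniature `globals`: one field of real storage per Lisp variable.
struct Globals {
    post_self_insert_hook: Cell<LispObject>,
    some_other_variable: Cell<LispObject>,
}

// Toy forwarding record: it refers to the storage, it does not own a value.
struct ObjFwd<'a> {
    objvar: &'a Cell<LispObject>,
}

impl<'a> ObjFwd<'a> {
    // What the Lisp side would do to read the variable.
    fn get(&self) -> LispObject {
        self.objvar.get()
    }

    // What the Lisp side would do to set the variable.
    fn set(&self, value: LispObject) {
        self.objvar.set(value)
    }
}
```

A write through the forwarding record is immediately visible to code that reads the field directly, and the reverse holds too, because both paths touch the same storage.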
#[macro_export]
macro_rules! defvar_lisp {
    ($field_name:ident, $lisp_name:expr, $value:expr) => {{
        #[allow(unused_unsafe)]
        unsafe {
            use $crate::bindings::Lisp_Objfwd;
            static mut o_fwd: Lisp_Objfwd = Lisp_Objfwd {
                type_: $crate::bindings::Lisp_Fwd_Type::Lisp_Fwd_Obj,
                objvar: unsafe { &$crate::bindings::globals.$field_name as *const _ as *mut _ },
            };
            $crate::bindings::defvar_lisp(
                &o_fwd,
                concat!($lisp_name, "\0").as_ptr() as *const libc::c_char,
            );
            $crate::bindings::globals.$field_name = $value;
        }
    }};
}
The Rust version of this macro is rather longer. Primarily this is because it takes a lot more typing to get a proper uninitialized value in a Rust program. Some would argue that all of this typing is a bad thing, but this is very much an unsafe operation. We're basically promising very precisely that we know this value is uninitialized, and that it will be completely and correctly initialized by the end of this unsafe block.
We then call the same defvar_lisp function with the same values, so that the Lisp_Objfwd struct gets initialized and registered in exactly the same way as in the C code. We do have to take care to ensure that the Lisp name of the variable is a null-terminated string, though.
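The null-termination step can be tried in isolation. In this sketch the helper macro `c_str_ptr` and the function `lisp_name_round_trip` are hypothetical names; only `concat!`, `CStr`, and `c_char` come from the standard library.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// Hypothetical helper: append a NUL byte to a string literal at compile
// time and expose the result as a C string pointer.
macro_rules! c_str_ptr {
    ($s:expr) => {
        concat!($s, "\0").as_ptr() as *const c_char
    };
}

// Read the pointer back the way C code would, stopping at the NUL byte.
fn lisp_name_round_trip() -> String {
    let ptr: *const c_char = c_str_ptr!("post-self-insert-hook");
    // Safety: `ptr` comes from a 'static literal that ends with a NUL byte.
    let c_str = unsafe { CStr::from_ptr(ptr) };
    c_str.to_str().expect("literal is valid UTF-8").to_owned()
}
```

Rust string literals carry a length and no terminator, so without the appended `"\0"` a C function reading the name would run past the end of the string.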