
crt0stack


Nathaniel
Next, let's talk about crt0stack, and we'll give an example of how it works. There isn't much documentation for this, so I'll give a little overview of what it's about. When an application is started by the Linux kernel (this doesn't have anything to do with Enarx specifically), the kernel sets up the environment for that process and has to communicate some information to it, so that it knows about things like environment variables and so forth. The way this actually works is that when the kernel starts a new process, it sets up the stack for that process and then pushes data onto the stack in a specific format. That format is what we call the crt0 stack. So when the application starts up, it can look at the data on the stack, introspect it, and see, oh, I have certain environment variables set, and so forth.

We needed an implementation of this because internally in Enarx we are emulating a Linux binary, and we needed the ability to write out this information in the correct format, but also to read it, because there are some cases where we need to read this information as well. So this crate, crt0stack, is really just a utility crate for reading and writing the stack format that the Linux kernel uses for a process. It can be used outside of Enarx as well; it doesn't really have anything directly tied to Enarx, other than that we needed to support this format.

You can see some examples of this here. This is creating a stack using a builder. We create this stack with 512 bytes and get a mutable reference to it. Then we create a builder, and the first thing we do is push the executable name for the binary onto the stack. That's followed by an environment variable. The last section is a set of other information like the user ID, the group ID, and so forth. When you're all done, you just call the builder's done method, and that gives you a pointer to the top of the stack. You can then set the stack pointer to point to that, and everything should work. We use this inside the shims, which we'll talk about later, when we are setting up and preparing to execute the exec layer.

Okay, so this is the generated documentation for this crate. You'll note here that we create a builder, and then we create a new binding for the variable every time we call this done method. The reason for this is that a crt0 stack has three different regions, and there are different kinds of data you can put in each region. So the builder object is actually a state machine. When you create a new builder, we start off in the first region, and then you call done. That consumes the builder object and creates a new builder for the next phase of the operation, the next region of the crt0 stack. The first region is where we put the arguments, the arguments array. And as you know, argv[0] points to the application, and we're saying that /init is the name of the application.
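For reference, here is a minimal sketch of the flow just described. The Builder, push, and done calls follow the crate's documented builder API, but treat the specific values and auxiliary Entry variants here as illustrative rather than a verbatim copy of the crate's own example:

    use crt0stack::{Builder, Entry};

    // A small buffer standing in for the new process's stack.
    let mut stack = [1u8; 512];
    let stack = stack.as_mut();

    // Region 1: the argument array (argv).
    let mut builder = Builder::new(stack);
    builder.push("/init").unwrap();             // argv[0], the executable name
    let mut builder = builder.done().unwrap();  // move on to region 2

    // Region 2: environment variables.
    builder.push("HOME=/root").unwrap();
    let mut builder = builder.done().unwrap();  // move on to region 3

    // Region 3: auxiliary entries such as the user and group IDs.
    builder.push(&Entry::Uid(1000)).unwrap();
    builder.push(&Entry::Gid(1000)).unwrap();

    // Finishing the last region yields a handle locating the top of the
    // prepared stack; the new process's stack pointer is then set to it.
    let _handle = builder.done().unwrap();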

Mayank
Yeah, so I just wanted to ask: why do we unwrap the result at every step?

Nathaniel
There are some cases where it can fail. I think the only case where it fails is if you run out of space to write additional data. So in this example we're calling unwrap, which basically asserts that we got the success case; otherwise the application will panic. That's pretty typical for documentation examples in Rust. If you were writing real code, you would need to handle the error appropriately and do something sensible with it. I'm pretty sure the only error case here is if you actually run out of memory to write the stack into.
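As a sketch of what handling the error rather than panicking could look like (the function name and the string mapping here are made up for illustration; the only failure described above is running out of space):

    fn prepare_args(stack: &mut [u8]) -> Result<(), &'static str> {
        let mut builder = crt0stack::Builder::new(stack);
        // Propagate the out-of-space error to the caller instead of panicking.
        builder.push("/init").map_err(|_| "ran out of stack space")?;
        let builder = builder.done().map_err(|_| "ran out of stack space")?;
        // ... the environment and auxiliary regions would follow here ...
        let _ = builder;
        Ok(())
    }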

So, just to continue where I was going: each time we call done, we're moving to the next stage, the next region of data that we're writing into the stack. If we actually look at the builder type, you'll see that the builder type is itself generic, parameterised, so each stage has its own qualified type. Basically, every time you call done, it moves to the next section. Notice that when we create a new builder, we get back a builder that is in the argument region, a builder with a qualified type of Arg. Then when we call done on that, in the success case we get back a builder for the environment section, and now we're writing to the environment section. This enforces, with the type system, that the state machine is being respected.
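A stripped-down illustration of that typestate idea (this is not the crate's actual code; it just shows how a stage type parameter lets done consume one stage and hand back the next):

    use core::marker::PhantomData;

    struct Arg; // region 1: arguments
    struct Env; // region 2: environment
    struct Aux; // region 3: auxiliary vector

    struct Builder<'a, Stage> {
        stack: &'a mut [u8],
        stage: PhantomData<Stage>,
    }

    impl<'a> Builder<'a, Arg> {
        // Consumes the Arg-stage builder and returns an Env-stage builder.
        fn done(self) -> Builder<'a, Env> {
            Builder { stack: self.stack, stage: PhantomData }
        }
    }

    impl<'a> Builder<'a, Env> {
        // Consumes the Env-stage builder and returns an Aux-stage builder.
        fn done(self) -> Builder<'a, Aux> {
            Builder { stack: self.stack, stage: PhantomData }
        }
    }

In the crate itself, done also returns a Result, and each stage only accepts the kind of data that belongs in that region.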

So yeah, all of this compiles away; there's no runtime cost, it's a zero-cost abstraction. We're just making sure that the type system enforces which region of the stack we're writing into. And this is pretty important here, because when you need to write out a crt0 stack, handling errors at that point can be somewhat complicated: you're just about to start a new process, and you've set up a ton of other things for that process. So you basically don't want it to fail, and you want to know ahead of time, from the compiler, that everything is actually going to work correctly.

Mayank
If it's okay, can you just once again explain why you have multiple mutable bindings for the builder?

Nathaniel
Yes. What's actually happening is that every time you call done, the done method is consuming the previous builder and creating a new instance of the builder for a new state in the state machine. If we go back to the docs here, you'll see that this done method doesn't take a reference to self; it takes self by move. So it actually consumes the previous value, and on success it returns the next state in the state machine.
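In code, the rebinding from the example boils down to shadowing the same name with the value returned by done (same calls and setup as in the first sketch above):

    let mut builder = Builder::new(stack);      // builder for the argument region
    builder.push("/init").unwrap();
    let mut builder = builder.done().unwrap();  // the old builder has been moved
                                                // into done; this new binding is
                                                // the environment-region builder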

Mayank
So, one question: if we push multiple times before calling done, how does that change the sequence of the state machine?

Nathaniel
It doesn't; you can push as many times as you want, as long as you have sufficient memory in the stack to write to. For example, in the first phase you're writing the argument variables. If you're writing a program in C, you take in an integer, the count of the number of arguments, and you take an array of those arguments. That's what we're constructing in this space before the application is run. Right now we're only putting one argument on the stack, which is the name of the application, but if we wanted to pass additional arguments, we would call push once for each of them.
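As a sketch, pushing several arguments before finishing the region would look like this, continuing from the setup in the first example (the extra argument strings are made up):

    // Emulating something like running: /init --verbose /etc/config
    let mut builder = Builder::new(stack);
    builder.push("/init").unwrap();         // argv[0]
    builder.push("--verbose").unwrap();     // argv[1]
    builder.push("/etc/config").unwrap();   // argv[2]
    let builder = builder.done().unwrap();  // the program will then see argc == 3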

Mayank
So we are doing it this way right now so that we know we have run out of memory if we encounter an error, right?

Nathaniel
Correct. And in this example case we have allocated 512 bytes of memory, which is enough to run this example. A real stack, you know, I think Linux allocates 2M by default for the stack size, so an actual stack is going to be significantly larger. But this is enough for our demo.