rct1

Nathaniel
We have a crate called rct1. I don't think I'm going to open the documentation for this one; I'm just going to show you the code. Notice that this crate can't just be used directly: you can't simply add it as a dependency, you must also add this section to your Cargo.toml to optimize it. The reason is that this crate contains only one function, the dyn_reloc function, and it is very carefully written code to relocate a binary. What that means is this: when you load a binary from disk, it can operate in one of two modes, static or dynamic. In static mode, all of the functions are put in the binary and loaded at a very specific address, and they can't be moved. All of the instructions are fully resolved, so if you're going to call another function, you actually have the address of that function in memory. But static binaries are really bad for a whole bunch of reasons, particularly security. With a static binary, all your functions are at a fixed location in memory. So if an attacker finds a security bug, they also know the locations of all of your functions, and that gives them lots of opportunity to attack your binary.
So a much better solution is a dynamic binary. A dynamic binary does not have the absolute addresses of its functions. Instead, all the code sections in the ELF binary are laid out as if loaded at position zero, and everything is therefore referenced as an offset from zero. Then, when the Linux kernel loads that binary, it can place it anywhere in memory, at random. And all of the linkage still works, because it's all offsets: each reference is an offset from the current instruction.
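A minimal sketch of why offset-based linkage survives a random load address. The two offsets and the function names are hypothetical, chosen only for illustration, and the displacement is simplified (real call instructions measure from the end of the instruction).

```rust
/// Hypothetical layout: offsets of two functions from the start of the image.
const CALLER_OFFSET: u64 = 0x1000;
const CALLEE_OFFSET: u64 = 0x2400;

/// Absolute address of a symbol once the image is loaded at `base`.
fn absolute(base: u64, offset: u64) -> u64 {
    base.wrapping_add(offset)
}

/// Distance from caller to callee, as an instruction-relative reference
/// would encode it (simplified). It does not depend on `base`.
fn relative_displacement(base: u64) -> i64 {
    let caller = absolute(base, CALLER_OFFSET);
    let callee = absolute(base, CALLEE_OFFSET);
    callee.wrapping_sub(caller) as i64
}
```

Calling `relative_displacement` with two different bases gives the same value, while `absolute` gives a different address for each base. That difference is what an attacker has to guess.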
If you've ever seen instruction-relative offsets for pointers and such in generated code, that's precisely what's happening here. But the problem is that if you're going to load a dynamic binary, which we do in Enarx, then before that binary starts executing you have to do what's called a relocation. There's a table in the binary that says, you know, this function is at such-and-such an offset, and you have to fill in that table with the actual addresses. This function performs that relocation. I'm going to show you, in another tab, one example of where this code is used. It's in the main Enarx repo; we're going to look at the SGX shim, in main. This is the assembly code that is called the very first time you enter an SGX enclave. You'll notice that right here is the start function, so this is the code that's called immediately when we enter the enclave. One of the problems is that we can't jump from the assembly code into Rust code until relocation is done, because Rust presumes that relocation has already been completed by the time it's executing. So it's one of the steps you have to complete before you jump to Rust code. You can see that we very clearly have an instruction here on line 250 where we jump to Rust. So somewhere before this in the assembly code we must do the relocation, and the answer is that we do it on line 243: we call the relocation function. If you look down here, the relocation function is this relocate symbol, which is up here. The relocate symbol stores all of the register state, which is what these push statements at the top are doing, and then it does some special setup for the relocation function.
Finally, on line 169, it calls the rct1 crate's dynamic relocation function, which is where the actual relocation is performed. After everything is done, we restore all the CPU state and return, and now everything has been relocated. So it's a really fiddly, very horrible piece of code. There's not a whole lot you can do to make it better; it's just how the system works. We didn't build that, we inherited it. Any questions? That's the only thing that's in the rct1 crate. Oh, shoot, wrong tab. Yeah, so now we're at the rct1 crate. The only function in this crate is this one relocation function. And the reason it has to be separate is that we need to turn on these optimization flags. Otherwise, the compiler will presume that relocations have been done and try to do other things. We want Rust to generate pure code that assumes nothing in that regard.
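To make the "fill in the table" step concrete, here is a rough sketch of a relocation pass over a simulated image held in a byte buffer. It is not the crate's dyn_reloc code: the `Rela` struct and `apply_relocations` function are illustrative names, and the sketch assumes only x86_64 `RELATIVE` relocations (type 8 in the ELF psABI), where the fixed-up value is simply the load base plus the addend.

```rust
use std::convert::TryFrom;

/// x86_64 ELF relocation type meaning "store base + addend".
const R_X86_64_RELATIVE: u64 = 8;

/// One entry of an ELF64 `Rela` relocation table (illustrative mirror of the layout).
#[derive(Clone, Copy)]
struct Rela {
    offset: u64, // where in the image the fixed-up value is stored
    info: u64,   // relocation type lives in the low 32 bits
    addend: i64, // link-time offset of the target from the image base
}

/// Applies every RELATIVE relocation to `image`, as if it were loaded at `base`.
/// Assumption: a real implementation patches memory in place and cannot
/// rely on anything that itself needs relocating; this only simulates it.
fn apply_relocations(image: &mut [u8], base: u64, table: &[Rela]) -> Result<(), &'static str> {
    for rela in table {
        if rela.info & 0xffff_ffff != R_X86_64_RELATIVE {
            return Err("unsupported relocation type");
        }
        let start = usize::try_from(rela.offset).map_err(|_| "offset too large")?;
        let end = start.checked_add(8).ok_or("offset overflow")?;
        let slot = image.get_mut(start..end).ok_or("offset outside image")?;
        // The actual address is the random load base plus the stored offset.
        let value = base.wrapping_add(rela.addend as u64);
        slot.copy_from_slice(&value.to_le_bytes());
    }
    Ok(())
}
```

Until a pass like this has run, any pointer stored in the image still holds a zero-based offset, which is why the shim has to relocate before it jumps into ordinary Rust code.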

So I'll switch back to that tab. If you ever find yourself writing inline assembly for this project, you should emulate this style. Each instruction goes on a separate line, in a separate string, with a Rust comment on every line that describes exactly what's happening on that line. You should also group blocks of related instructions together by functionality, with a comment at the top of each block identifying what it does. It's really the only way you can keep assembly looking clean and make it approachable to somebody who is perhaps not familiar with all the intricacies.
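A small sketch of that style, using a toy function rather than anything from the shim. The function name `add_and_double` is made up for illustration, and the fallback for other architectures exists only so the example compiles everywhere.

```rust
/// Returns `(a + b) * 2`, computed with wrapping arithmetic.
#[cfg(target_arch = "x86_64")]
fn add_and_double(a: u64, b: u64) -> u64 {
    use std::arch::asm;

    let out: u64;
    // SAFETY: the block only touches the registers named in its operands.
    unsafe {
        asm!(
            // Compute the sum of the two inputs
            "mov {out}, {a}", // out = a
            "add {out}, {b}", // out += b

            // Double the result
            "shl {out}, 1",   // out <<= 1

            a = in(reg) a,
            b = in(reg) b,
            out = out(reg) out,
            options(pure, nomem, nostack),
        );
    }
    out
}

/// Plain Rust fallback so the sketch builds on non-x86_64 targets.
#[cfg(not(target_arch = "x86_64"))]
fn add_and_double(a: u64, b: u64) -> u64 {
    a.wrapping_add(b) << 1
}
```

Each instruction sits in its own string with its own comment, and each group of instructions has a header comment, so a reviewer can check the block one line at a time.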

It's in the coding conventions section in the main repo, right here. Under style, for inline assembly, we define precisely what the code style should look like, and it roughly looks like that. You group the instructions together, and you have comments on each line. If an instruction is incredibly obvious, you don't have to put a comment on it, but you should err on the side of over-commenting for inline assembly. And all of this matters because we eventually want Enarx to be audited by an actual auditing company, so all of this will make auditing faster and cheaper, and it lets many eyes find any problems that may exist. All right, any other questions about rct1?