Writing a Linux int 80h system-call wrapper in GNU C inline assembly

Content sourced from Stack Overflow; translated and used under the CC BY-SA 3.0 license.


I'm trying to use inline assembly. I read this page http://www.codeproject.com/KB/cpp/edujini_inline_asm.aspx but I don't understand how the parameters are passed to my function.

I'm writing a C write example. This is my function header:

write2(char *str, int len){
}

And this is my assembly code:

global write2
write2:
    push ebp
    mov ebp, esp
    mov eax, 4      ;sys_write
    mov ebx, 1      ;stdout
    mov ecx, [ebp+8]    ;string pointer
    mov edx, [ebp+12]   ;string size
    int 0x80        ;syscall
    leave
    ret

What do I have to do to put that code into the C function? I'm doing something like this:

write2(char *str, int len){
    asm ( "movl 4, %%eax;"
          "movl 1, %%ebx;"
          "mov %1, %%ecx;"
          //"mov %2, %%edx;"
          "int 0x80;"
           :
           : "a" (str), "b" (len)
    );
}

Is that because I don't have an output variable? If so, how do I handle that? Also, with this code:

global main
main:
    mov ebx, 5866       ;PID
    mov ecx, 9      ;SIGKILL
    mov eax, 37     ;sys_kill
    int 0x80        ;interruption
    ret 

How can I put that code inline in my C code, so that I can ask the user for the pid? This is my attempt so far:

void killp(int pid){
    asm ( "mov %1, %%ebx;"
          "mov 9, %%ecx;"
          "mov 37, %%eax;"
           :
           : "a" (pid)         /* optional */
    );
}

Answer:

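Well, you don't say specifically, but from your post it appears you're using gcc and its inline asm with constraints syntax (other C compilers have very different inline syntax). That said, you probably need to use AT&T assembler syntax rather than Intel, as that's what gets used with gcc.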

So with the above said, let's look at your write2 function. First, you don't want to create a stack frame, as gcc will create one; if you create another in the asm code, you'll end up with two frames, and things will probably get very confused. Second, since gcc is laying out the stack frame, you can't access variables with "ebp + offset", because you don't know how the frame is laid out.

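That's what the constraints are for: you say what kind of place you want gcc to put the value (any register, memory, a specific register) and then refer to it as "%X" in the asm code, where X is the operand number. Finally, if you use explicit registers in the asm code, you need to list them in the third section (after the input constraints) so gcc knows you are using them. Otherwise it might put some important value in one of those registers, and your asm would clobber that value.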

You also need to tell the compiler that inline asm will or might read from or write to memory pointed-to by the input operands; that is not implied.

So with all that, your write2 function looks like:

void write2(char *str, int len) {
    __asm__ volatile (
        "movl $4, %%eax;"      // SYS_write
        "movl $1, %%ebx;"      // file descriptor = stdout_fd
        "movl %0, %%ecx;"
        "movl %1, %%edx;"
        "int $0x80"
        :: "g" (str), "g" (len)       // input values we MOV from
        : "eax", "ebx", "ecx", "edx", // registers we destroy
          "memory"                    // memory has to be in sync so we can read it
     );
}

Note the AT&T syntax -- src, dest rather than dest, src and % before the register name.

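Now this will work, but it's inefficient, since it will contain lots of extra movs. In general, you should not use mov instructions or explicit registers in asm code; you're much better off using constraints to say where you want things and letting the compiler ensure that they're there. That way, the optimizer can probably get rid of most of the movs, particularly if it inlines the function (which it will do if you specify -O3). Conveniently, the i386 machine model has constraints for specific registers, so you can instead do: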

void write2(char *str, int len) {
    __asm__ volatile (
        "movl $4, %%eax;"
        "movl $1, %%ebx;"
        "int $0x80"
        :: "c" (str), /* c constraint tells the compiler to put str in ecx */
           "d" (len)  /* d constraint tells the compiler to put len in edx */
        : "eax", "ebx", "memory");
}

or even better

// UNSAFE: destroys EAX (with return value) without telling the compiler
void write2(char *str, int len) {
    __asm__ volatile ("int $0x80"
        :: "a" (4), "b" (1), "c" (str), "d" (len)
        : "memory");
}

Note also the use of volatile which is needed to tell the compiler that this can't be eliminated as dead even though its outputs (of which there are none) are not used. (asm with no output operands is already implicitly volatile, but making it explicit doesn't hurt when the real purpose isn't to calculate something; it's for a side effect like a system call.)

edit

One final note -- this function is doing a write system call, which does return a value in eax -- either the number of bytes written or an error code. So you can get that with an output constraint:

int write2(const char *str, int len) {
    __asm__ volatile ("int $0x80" 
     : "=a" (len)
     : "a" (4), "b" (1), "c" (str), "d" (len),
       "m"( *(const char (*)[])str )       // "dummy" input instead of memory clobber
     );
    return len;
}

All system calls return in EAX. Values from -4095 to -1 (inclusive) are negative errno codes, other values are non-errors. (This applies globally to all Linux system calls).
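
As a small illustration of that range check, here is a minimal sketch in plain C. The helper names syscall_failed and syscall_result are hypothetical (they do not come from the answer or from any library); the second one mimics the usual libc convention of returning -1 and setting errno.

```c
#include <errno.h>

/* Hypothetical helper: nonzero if a raw value returned in EAX is an error,
   i.e. it lies in the range -4095..-1 inclusive. */
static int syscall_failed(long ret)
{
    return ret < 0 && ret >= -4095;
}

/* Hypothetical helper: convert a raw kernel result to the libc convention.
   On error, store the positive code in errno and return -1. */
static long syscall_result(long ret)
{
    if (syscall_failed(ret)) {
        errno = (int)-ret;
        return -1;
    }
    return ret;
}
```

With these, something like syscall_result(write2(str, len)) would return the byte count on success, or -1 with errno set on failure.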

If you're writing a generic system-call wrapper, you probably need a "memory" clobber because different system calls have different pointer operands, and might be inputs or outputs. See https://godbolt.org/z/GOXBue for an example that breaks if you leave it out, and this answer for more details about dummy memory inputs/outputs.
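
A rough sketch of what such a generic wrapper might look like for up to three arguments follows. The name syscall3 is hypothetical. The int $0x80 path is only compiled for 32-bit x86; on any other target the sketch falls back to libc's syscall(), which is an assumption added purely so the example still builds and runs there.

```c
#include <errno.h>
#include <sys/syscall.h>   /* SYS_* numbers for the target being compiled */
#include <unistd.h>

/* Hypothetical generic wrapper: returns the raw kernel result,
   i.e. a negative errno code on failure. */
static long syscall3(long nr, long a1, long a2, long a3)
{
#if defined(__i386__)
    long ret;
    __asm__ volatile ("int $0x80"
        : "=a" (ret)                       /* result comes back in EAX */
        : "a" (nr), "b" (a1), "c" (a2), "d" (a3)
        : "memory");                       /* the call may read or write memory
                                              through any pointer argument */
    return ret;
#else
    /* Assumption for illustration: not 32-bit x86, so use libc instead
       and convert its -1/errno convention back to a negative code. */
    long ret = syscall(nr, a1, a2, a3);
    return ret == -1 ? -errno : ret;
#endif
}
```

A write to stdout could then be issued as syscall3(SYS_write, 1, (long)str, len). Because the wrapper cannot know which arguments are pointers, the "memory" clobber covers every case.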

With this output operand, you need the explicit volatile -- exactly one write system call per time the asm statement "runs" in the source. Otherwise the compiler is allowed to assume that it exists only to compute its return value, and can eliminate repeated calls with the same input instead of writing multiple lines. (Or remove it entirely if you didn't check the return value.)
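
The kill snippet from the question can be handled the same way. The following is a sketch, not a definitive implementation: kill2 is a hypothetical name, the call number 37 and signal 9 are taken from the question's assembly, and the non-i386 branch using libc's kill() is an assumption added only so the example builds on other targets. No "memory" clobber is used because sys_kill takes no pointer arguments.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>     /* getpid(), used in the usage note below */

/* Hypothetical wrapper: returns 0 on success or a negative errno code. */
static int kill2(int pid, int sig)
{
#if defined(__i386__)
    int ret;
    __asm__ volatile ("int $0x80"
        : "=a" (ret)                        /* result in EAX */
        : "a" (37), "b" (pid), "c" (sig));  /* 37 = sys_kill on i386 */
    return ret;
#else
    /* Assumption for illustration: not 32-bit x86, so defer to libc. */
    return kill((pid_t)pid, sig) == 0 ? 0 : -errno;
#endif
}

/* The question's killp: send SIGKILL (9) to the given pid. */
static int killp(int pid)
{
    return kill2(pid, 9);
}
```

Calling kill2(getpid(), 0) sends no signal and only checks that the process exists, which makes it a harmless way to try the wrapper before passing a pid read from the user to killp.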
