Linux Overflow Vulnerability General Hardened Defense Technology

Catalog

1. Grsecurity/PaX
2. Hardened toolchain
3. Default addition of the Stack Smashing Protector (SSP): Compiler Flag: GS
4. Automatic generation of Position Independent Executables (PIEs): System Characteristic + Compiler Flag: ASLR
5. Default to marking read-only, sections that can be so marked after the loader is finished (RELRO): Compiler Flag + System Characteristic: DEP / NX
6. Default full binding at load-time (BIND_NOW)
7. Heap Protector
8. Pointer Obfuscation
9. Built with Fortify Source
10. /proc/$pid/maps protection
11. ptrace scope
12. /dev/mem protection
13. Block module loading
14. Syscall Filtering
15. How To Harden Linux 

 

1. Grsecurity/PaX

0x1: What is PaX

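PaX is a hardening patch for the Linux kernel that applies the principle of least privilege to memory pages, which makes it an effective defense against system-level 0-day exploits. The first version was designed and implemented in 2000. The Linux kernel did not accept PaX upstream at the time because many people considered it hard to maintain. The kernel later introduced LSM (Linux Security Modules), which uses a set of capability-style hooks to provide interfaces for restricting what user-space programs may access; SELinux and AppArmor are both built on LSM. Note that an LSM is not a Linux kernel module in the traditional sense: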

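1. It must be enabled when the bootloader starts the kernel; it cannot be enabled after the kernel has finished loading
2. Two LSM implementations cannot be active at the same time. An LSM stacking mechanism was implemented later, but it did not gain native support in the Linux kernel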

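PaX is usually written as Grsecurity/PaX. From the beginning PaX focused mainly on defending against and detecting memory corruption. The Grsecurity community later found that PaX was very close to its own interests, so the two projects merged. For a long time PaX concentrated on memory corruption while Grsecurity implemented other features, including RBAC, but the boundary between the two later blurred: features such as USERCOPY, STACKLEAK, and RANDSTRUCT were implemented jointly by Grsecurity/PaX as a whole.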

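The PaX team holds that an exploitable bug gives the attacker (note that "attacker" and "hacker" are different terms) access to the targeted process at three different levels: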

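1. Executing arbitrary code
2. Executing existing code, but out of the original execution order
3. Executing existing code in the original execution order, but with arbitrary data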

0x2: The design of vma mirroring in PaX

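In late 2003 PaX implemented mirroring of virtual memory areas (vma mirroring). Its purpose is to give a special file mapping two different linear addresses backed by the same set of physical pages, and to keep both addresses unchanged even after swap-out/swap-in or copy-on-write (COW). This serves several scenarios: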

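1. Mapping executable regions into the code segment under SEGMEXEC. On a 32-bit Linux kernel the 4GB address space gives 3GB to user space and 1GB to kernel space. vma mirroring splits the 3GB of user space into two 1.5GB halves, one for the code segment and one for the data segment, and any data contained in an executable region (constant strings, function pointer tables, etc.) is mirrored into the data segment
2. Implementing address randomization of executable regions (RANDEXEC)
3. This leads to the third case, where SEGMEXEC and RANDEXEC are both active. The effect is similar to PIE+ASLR, except that instead of randomizing the code segment of the whole ELF binary, the code and data segments are randomized at mirroring time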
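
The address arithmetic behind the SEGMEXEC split can be sketched in a few lines of C. This is only an illustration under stated assumptions: the constants follow the 3GB and 1.5GB figures above, the choice of the lower half for data and the upper half for code is assumed here, and the names SEGMEXEC_HALF_SIZE, in_data_half, and mirror_address are hypothetical rather than PaX identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 3 GB of 32-bit user space split into two 1.5 GB halves. */
#define USER_SPACE_SIZE    0xC0000000u           /* 3 GB */
#define SEGMEXEC_HALF_SIZE (USER_SPACE_SIZE / 2) /* 1.5 GB = 0x60000000 */

/* Returns 1 if addr lies in the lower half (assumed to be the data half). */
int in_data_half(uint32_t addr)
{
    return addr < SEGMEXEC_HALF_SIZE;
}

/* Address at which a lower-half mapping would be mirrored in the upper half. */
uint32_t mirror_address(uint32_t data_addr)
{
    assert(in_data_half(data_addr));
    return data_addr + SEGMEXEC_HALF_SIZE;
}
```

Under these assumptions, a mapping at 0x08048000 would have its mirror at 0x68048000, a fixed 1.5GB offset, which is why the two linear addresses can stay stable across swapping and COW.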

Relevant Link:

https://raw.githubusercontent.com/citypw/DNFWAH/master/4/d4_0x01_DNFWAH_archeological_hacking_on_pax.txt
https://wiki.gentoo.org/wiki/Hardened/PaX_Quickstart

 

2. Hardened toolchain

Hardened toolchain introduces a number of changes to the default behaviour of the toolchain (gcc, binutils, glibc/uclibc) intended to improve security. It supports other initiatives taken by the hardened project; most directly PaX and Grsecurity, but can also be applied to SELinux and RSBAC.

0x1: Check Tools


Relevant Link:

https://github.com/slimm609/checksec.sh
https://wiki.gentoo.org/wiki/Hardened/Toolchain
https://wiki.ubuntu.com/Security/Features

 

3. Default addition of the Stack Smashing Protector (SSP): Compiler Flag: GS

First developed by Dr Hiroaki Etoh at IBM for the 3.x series of GCC (originally under the name ProPolice) and re-developed in a different way for the 4.x series by RedHat, the Stack Smashing Protector attempts to protect against stack buffer overflows. It causes the compiler to insert a check for stack buffer overflows before function returns. If an attempt is made to exploit a previously unfixed (and probably undiscovered) error that exposes a buffer overflow vulnerability, the application will be killed immediately. This reduces any potential exploit to a denial-of-service.
Normally the compiler must be explicitly directed to switch on the stack protection via compiler options.
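This is similar to the /GS compiler option on Windows: in essence, when a function returns, it checks whether the current stack frame has been corrupted.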
The Stack Smashing Protector (SSP) compiler feature helps detect stack buffer overruns by aborting if a secret value on the stack is changed. This serves a dual purpose: it makes the occurrence of such bugs visible, and it acts as an exploit mitigation against return-oriented programming. SSP merely detects stack buffer overruns; it does not prevent them. The detection can be beaten by preparing the input so that the stack canary is overwritten with its correct value, so it does not offer perfect protection. The stack canary is the size of a native word and, if chosen randomly, an attacker has to guess the right value among 2^32 or 2^64 combinations (revealing the bug if the guess is wrong), or resort to clever means of determining it.

0x1: Description

Compilers implement this feature by selecting appropriate functions, storing the stack canary during the function prologue and checking the value at the epilogue, invoking a failure handler if it was changed. For instance, consider the code:

void foo(const char* str)
{
    char buffer[16];
    strcpy(buffer, str);
}

Conceptually, SSP transforms that code into something like this:

/* 
Buffer overruns are undefined behavior, so compilers tend to optimize checks like these away if you write them yourself. The scheme only works robustly because the compiler inserts the checks itself. 
*/
#include <stdint.h>
#include <stdnoreturn.h>
#include <string.h>

extern uintptr_t __stack_chk_guard;
noreturn void __stack_chk_fail(void);
void foo(const char* str)
{
    // On entry, copy the guard into the frame at the position closest to the return address
    uintptr_t canary = __stack_chk_guard;
    char buffer[16];
    strcpy(buffer, str);
    // Just before returning, verify the canary (the SSP check)
    if ( (canary = canary ^ __stack_chk_guard) != 0 )
        __stack_chk_fail();
}

Note how the secret value is stored in a global variable (initialized at program load time) and is copied into the stack frame, and how it is safely erased from the stack as part of the check. Since stacks grow downwards on many architectures, the canary gets overwritten whenever the input to strcpy is at least 16 characters. The caller's return pointer, which return-oriented programming attacks exploit, is not accessed until after the value has been validated, thus defusing such attacks.

0x2: Implementation

Run-time support needs only two components: A global variable and a check failure handler. For instance, a minimal implementation could be:

#include <stdint.h>
#include <stdlib.h>
 
#if UINT32_MAX == UINTPTR_MAX
#define STACK_CHK_GUARD 0xe2dee396
#else
#define STACK_CHK_GUARD 0x595e9fbd94fda766
#endif
 
uintptr_t __stack_chk_guard = STACK_CHK_GUARD;
 
__attribute__((noreturn))
void __stack_chk_fail(void)
{
#if __STDC_HOSTED__
    abort();
#elif __is_myos_kernel
    panic("Stack smashing detected");
#endif
}

Note how the secret guard value is hard-coded rather than being decided during program load. You should have the program loader (the bootloader in the case of the kernel) randomize the values. You can do this by putting the guard value in a special segment that the loader knows to randomize.
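
As a rough sketch of what such randomization could look like in a hosted environment, the helper below draws the guard from /dev/urandom and falls back to a caller-supplied constant if that fails. The function name make_stack_guard is hypothetical, and the use of /dev/urandom is an assumption for illustration; a kernel or loader would use whatever entropy source it has.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns a random guard value, or `fallback` when no
   entropy could be read from /dev/urandom. */
uintptr_t make_stack_guard(uintptr_t fallback)
{
    assert(fallback != 0);  /* an all-zero guard is a poor fallback */

    uintptr_t guard = fallback;
    FILE *source = fopen("/dev/urandom", "rb");
    if (source != NULL) {
        uintptr_t value;
        if (fread(&value, sizeof value, 1, source) == 1)
            guard = value;
        fclose(source);
    }
    return guard;
}
```

A startup routine could then assign the result to __stack_chk_guard. That assignment has to happen before any protected function is entered, and from code built without stack protection, because a function whose prologue saw the old guard would fail its own epilogue check.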

0x3: compile flag

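GCC 4.1 provides three compiler options related to stack protection:

1. -fstack-protector: enables stack protection, but inserts the protection code only into functions whose local variables include a char array
2. -fstack-protector-all: enables stack protection and inserts the protection code into all functions
3. -fno-stack-protector: disables stack protection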
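
To see which of these options a translation unit was actually built with, one can inspect the macros GCC predefines for stack protection. The sketch below relies on __SSP__ (value 1), __SSP_ALL__ (value 2), and __SSP_STRONG__ (value 3); the last one belongs to -fstack-protector-strong, a later option that is not part of the GCC 4.1 list above. The function names are hypothetical.

```c
#include <assert.h>

/* Returns 0 (off), 1 (-fstack-protector), 2 (-fstack-protector-all) or
   3 (-fstack-protector-strong), based on GCC's predefined macros. */
int ssp_level(void)
{
#if defined(__SSP_ALL__)
    return 2;
#elif defined(__SSP_STRONG__)
    return 3;
#elif defined(__SSP__)
    return 1;
#else
    return 0;
#endif
}

/* Maps a level returned by ssp_level() to the matching option name. */
const char *ssp_level_name(int level)
{
    static const char *const names[] = {
        "off",
        "-fstack-protector",
        "-fstack-protector-all",
        "-fstack-protector-strong",
    };
    assert(level >= 0 && level <= 3);
    return names[level];
}
```

Calling ssp_level_name(ssp_level()) from a program built with and without each flag is a quick way to confirm what a distribution's default toolchain enables.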

Relevant Link:

http://en.wikipedia.org/wiki/Buffer_overflow_protection
http://wiki.osdev.org/Stack_Smashing_Protector 
https://www.ibm.com/developerworks/cn/linux/l-cn-gccstack/ 
http://www.linuxfromscratch.org/hints/downloads/files/ssp.txt 

 

4. Automatic generation of Position Independent Executables (PIEs): System Characteristic + Compiler Flag: ASLR

Standard executables have a fixed base address, and they must be loaded to this address otherwise they will not execute correctly. Position Independent Executables can be loaded anywhere in memory much like shared libraries, allowing PaX's Address Space Layout Randomisation (ASLR) to take effect. This is achieved by building the code to be position-independent, and linking them as ELF shared objects.
This is similar to ASLR on Windows: by randomizing the address space, including the stack, it makes it considerably harder for an attacker to write working shellcode.

In computing, position-independent code (PIC) or position-independent executable (PIE) is a body of machine code that, being placed somewhere in the primary memory, executes properly regardless of its absolute address.

1. PIC is commonly used for shared libraries, so that the same library code can be loaded in a location in each program address space where it will not overlap any other uses of memory (for example, other shared libraries)
2. PIC was also used on older computer systems lacking an MMU, so that the operating system could keep applications away from each other even within the single address space of an MMU-less system.

Position-independent code can be executed at any memory address without modification. This differs from relocatable code, where a link editor or program loader modifies a program before execution, so that it can be run only from a particular memory location.

0x1: Technical details

Procedure calls inside a shared library are typically made through small procedure linkage table stubs, which then call the definitive function. This notably allows a shared library to inherit certain function calls from previously loaded libraries rather than using its own versions.
Data references from position-independent code are usually made indirectly, through global offset tables (GOTs), which store the addresses of all accessed global variables. There is one GOT per compilation unit or object module, and it is located at a fixed offset from the code (although this offset is not known until the library is linked). When a linker links modules to create a shared library, it merges the GOTs and sets the final offsets in code. It is not necessary to adjust the offsets when loading the shared library later.
Position independent functions accessing global data start by determining the absolute address of the GOT given their own current program counter value.

0x2: compile flag

1. -fpic
Generate position-independent code (PIC) suitable for use in a shared library, if supported for the target machine. Such code accesses all constant addresses through a global offset table (GOT). The dynamic loader resolves the GOT entries when the program starts (the dynamic loader is not part of GCC; it is part of the operating system). If the GOT size for the linked executable exceeds a machine-specific maximum size, you get an error message from the linker indicating that -fpic does not work; in that case, recompile with -fPIC instead. (These maximums are 8k on the SPARC and 32k on the m68k and RS/6000. The x86 has no such limit.)
Position-independent code requires special support, and therefore works only on certain machines. For the x86, GCC supports PIC for System V but not for the Sun 386i. Code generated for the IBM RS/6000 is always position-independent.
When this flag is set, the macros __pic__ and __PIC__ are defined to 1. 

2. -fPIC
If supported for the target machine, emit position-independent code, suitable for dynamic linking and avoiding any limit on the size of the global offset table. This option makes a difference on the m68k, PowerPC and SPARC.
Position-independent code requires special support, and therefore works only on certain machines.
When this flag is set, the macros __pic__ and __PIC__ are defined to 2. 

3. -fpie
4. -fPIE
These options are similar to -fpic and -fPIC, but generated position independent code can be only linked into executables. Usually these options are used when -pie GCC option is used during linking.
-fpie and -fPIE both define the macros __pie__ and __PIE__. The macros have the value 1 for -fpie and 2 for -fPIE. 
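
The macro values listed above can be checked from inside a program. The following is a minimal sketch that only reports what the compiler defined for the current translation unit; the function names pic_level, pie_level, and codegen_mode are hypothetical.

```c
#include <assert.h>

/* Value of __PIC__ (1 for -fpic, 2 for -fPIC), or 0 if it is not defined. */
int pic_level(void)
{
#ifdef __PIC__
    return __PIC__;
#else
    return 0;
#endif
}

/* Value of __PIE__ (1 for -fpie, 2 for -fPIE), or 0 if it is not defined. */
int pie_level(void)
{
#ifdef __PIE__
    return __PIE__;
#else
    return 0;
#endif
}

/* Short description of the code-generation mode in effect. */
const char *codegen_mode(void)
{
    /* Only the values 1 and 2 are documented for these macros. */
    assert(pic_level() <= 2 && pie_level() <= 2);
    if (pie_level() > 0)
        return "position-independent executable code";
    if (pic_level() > 0)
        return "position-independent code";
    return "position-dependent code";
}
```

Many current distributions build PIE by default, so codegen_mode() may report position-independent executable code even when no flag was passed explicitly.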

Position Independent Executables (PIE) are an output of the hardened package build process. A PIE binary and all of its dependencies are loaded into random locations within virtual memory each time the application is executed. This makes Return Oriented Programming (ROP) attacks much more difficult to execute reliably.

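PIE hardening needs support from both the operating system and the compiler: the operating system provides the underlying capability, and the compiler builds the program to take advantage of it.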

1. Linux kernel features 
Address Space Layout Randomization (ASLR) can help defeat certain types of buffer overflow attacks. ASLR can locate the base, libraries, heap, and stack at random positions in a process's address space, which makes it difficult for an attacking program to predict the memory address of the next instruction
    1) echo value > /proc/sys/kernel/randomize_va_space
    2) To persist the setting, add kernel.randomize_va_space = value to /etc/sysctl.conf, then run sysctl -p

2. compiler flag
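-fpie or -fPIE when compiling an executable (usually together with -pie when linking); -fpic or -fPIC for shared libraries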
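
The kernel side of this can also be inspected from C. The sketch below reads the current randomize_va_space value and turns it into a description; the meanings given for 0, 1, and 2 follow the kernel's sysctl documentation, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Reads one integer from a procfs-style file.
   Returns 0 on success and -1 if the file cannot be opened or parsed. */
int read_int_setting(const char *path, int *value)
{
    assert(path != NULL && value != NULL);

    FILE *file = fopen(path, "r");
    if (file == NULL)
        return -1;
    int parsed = (fscanf(file, "%d", value) == 1);
    fclose(file);
    return parsed ? 0 : -1;
}

/* Describes a kernel.randomize_va_space value. */
const char *aslr_mode_name(int value)
{
    switch (value) {
    case 0:  return "disabled";
    case 1:  return "stack, mmap base and VDSO randomized";
    case 2:  return "stack, mmap base, VDSO and heap randomized";
    default: return "unknown";
    }
}
```

A caller would pass "/proc/sys/kernel/randomize_va_space" as the path and feed the result to aslr_mode_name(); on a hardened system the expected value is 2.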

Relevant Link:

http://en.wikipedia.org/wiki/Position-independent_code
https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html
https://securityblog.redhat.com/2012/11/28/position-independent-executables-pie/ 

 

5. Default to marking read-only, sections that can be so marked after the loader is finished (RELRO): Compiler Flag + System Characteristic: DEP / NX

There are several sections that need to be writable by the loader before the application starts, but do not need to be writable by the application itself later. Setting relro instructs the linker to record which sections this applies to, and the loader will mark them read-only before passing or returning execution control to the application. Typical sections affected include .ctors, .dtors, .jcr, .dynamic and .got, although the exact list varies according to arch.

The difference between Partial RELRO and Full RELRO is that the Global Offset Table (and Procedure Linkage Table) which act as kind-of process-specific lookup tables for symbols (names that need to point to locations elsewhere in the application or even in loaded shared libraries) are marked read-only too in the Full RELRO. Downside of this is that lazy binding (only resolving those symbols the first time you hit them, making applications start a bit faster) is not possible anymore.
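
Whether the linker recorded a RELRO region shows up as a PT_GNU_RELRO program header. As a sketch, a glibc program can look for that header in its own image with dl_iterate_phdr, a glibc extension declared in link.h. The function names below are hypothetical, and the check only says that some RELRO segment exists; it does not tell Partial from Full RELRO.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <elf.h>
#include <link.h>
#include <stddef.h>

/* dl_iterate_phdr callback. The first object visited is the main program,
   so the callback inspects it and then stops the iteration. */
static int find_relro(struct dl_phdr_info *info, size_t size, void *data)
{
    int *found = data;
    (void)size;
    assert(info != NULL && found != NULL);

    for (int i = 0; i < info->dlpi_phnum; i++) {
        if (info->dlpi_phdr[i].p_type == PT_GNU_RELRO)
            *found = 1;
    }
    return 1;  /* a non-zero return value ends the iteration */
}

/* Returns 1 if the main program has a PT_GNU_RELRO segment, 0 otherwise. */
int main_program_has_relro(void)
{
    int found = 0;
    dl_iterate_phdr(find_relro, &found);
    return found;
}
```

Built with -Wl,-z,norelro the function would be expected to return 0, and with -Wl,-z,relro it would be expected to return 1.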

0x1: GOT Overwrite Demo

// Include standard I/O declarations
#include <stdio.h>
// Include atoi
#include <stdlib.h>
// Include string declarations
#include <string.h>
 
// Program entry point
int main(int argc, char** argv) 
{
    // Terminate if program is not run with three parameters.
    if (argc != 4) 
    {
        // Print out the proper use of the program
        puts("./a.out <size> <offset> <string>");
        // Return failure
        return -1;
    }
 
    // Convert size to an integer
    int size = atoi(argv[1]);
    // Convert offset to an integer
    int offset = atoi(argv[2]);
    // Place string into its own string on the stack
    char* str = argv[3];
    // Declare a 256 byte buffer on the stack
    char buffer[256];
 
    // Print the location of the buffer for calculating the offset.
    printf("Buffer:\t\t%8x\n", &buffer);
 
    // Fill the buffer with the letter 'A'.
    memset(buffer, 65, 256 - 1);
    // Null-terminate the buffer.
    buffer[255] = 0;
    // Attempt to copy the specified string into the specified location.
    strncpy(buffer + offset, str, size);
    // Print out the buffer.
    printf("%s", buffer);
 
    // Return success
    return 0;
}
//gcc -g -O0 -Wl,-z,norelro -fno-stack-protector -o a.out a.c


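This demo shows a memory-copy bug that allows arbitrary virtual memory to be overwritten. An attacker can use it to overwrite the process's GOT and thereby hijack function calls.
First, obtain the virtual address of the GOT: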
readelf -a a.out


Compute the offset and run the overflow proof of concept:

(gdb) x 0x0600a58
0x600a58 <_GLOBAL_OFFSET_TABLE_>:    0x006008c0
(gdb) p -(0x1c6f0940 - 0x0600a58) % 0x80000000
$1 = 1676738840
(gdb) r 4 1676738840 \$\$\$\$
Starting program: /zhenghan/relro/a.out 4 1676738840 \$\$\$\$
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x2aaaaaaab000
Buffer:        ffffe800

Program received signal SIGSEGV, Segmentation fault.
0x0000003e72e78fc9 in strncpy () from /lib64/libc.so.6

Here, we see that we can overwrite the GOT entry for printf with our string. The program crashes because it’s trying to jump to memory that is not mapped.

0x2: RELRO: RELocation Read-Only

To prevent the above exploitation technique, we can tell the linker to resolve all dynamically linked functions at the beginning of execution and make the GOT read-only.

gcc -g -O0 -Wl,-z,relro,-z,now -fno-stack-protector -o b.out a.c 

First, obtain the virtual address of the GOT:
readelf -a a.out


(gdb) x 0x0600a58
0x600a58 <_GLOBAL_OFFSET_TABLE_>:    0x006008c0
(gdb) p -(0x7fffffffe800 - 0x0600a58) % 0x80000000
$1 = -2141183400
(gdb) r 4 -2141183400 hello
Starting program: /zhenghan/relro/a.out 4 -2141183400 hello
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x2aaaaaaab000
Buffer:        ffffe7f0

Program received signal SIGSEGV, Segmentation fault.
0x0000003e72e78fc9 in strncpy () from /lib64/libc.so.6

Here, we see that we cannot overwrite the GOT entry for printf with our string. The program crashes because it is trying to write to a memory segment that is read-only.

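RELRO hardening needs support from both the operating system and the compiler: the operating system provides the underlying capability, and the compiler builds the program to take advantage of it.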

1. Linux kernel features 
The Data Execution Prevention (DEP) feature prevents an application or service from executing code in a non-executable memory region. Hardware-enforced DEP works in conjunction with the NX (Never eXecute) bit on compatible CPUs.
    1) You cannot disable the DEP feature.

2. compiler flag
-Wl,-z,relro (add -Wl,-z,now for Full RELRO)

Relevant Link:

http://larrythecow.org/universe/archives/2011-07-16.html.gz 
https://isisblogs.poly.edu/2011/06/01/relro-relocation-read-only/
http://docs.oracle.com/cd/E37670_01/E36387/html/ol_kernel_sec.html
http://www.win.tue.nl/~aeb/linux/hh/protection.html

 

6. Default full binding at load-time (BIND_NOW)

To reduce the time between starting an application and actually being able to use it, most software is built with "lazy binding". This means that references to functions in shared libraries are resolved when they are actually used for the first time, rather than when the application is loaded. The hardened toolchain changes this behaviour so that by default it will set the "BIND_NOW" flag, which causes the loader to sort out all of these links before starting execution. It improves the effectiveness of RELRO.

Relevant Link:

https://wiki.ubuntu.com/Security/Features

 

7. Heap Protector

The GNU C Library heap protector (both automatic via ptmalloc and manual) provides corrupted-list/unlink/double-free/overflow protections to the glibc heap memory manager (first introduced in glibc 2.3.4). This stops the ability to perform arbitrary code execution via heap memory overflows that try to corrupt the control structures of the malloc heap memory areas.
This protection has evolved over time, adding more and more protections as additional corner-cases were researched. As it currently stands, glibc 2.10 and later appears to successfully resist even these hard-to-hit conditions.
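
One of these checks, double-free detection, can be observed safely by triggering the bug in a child process and looking at how the child dies. This is a sketch under stated assumptions: it expects the glibc allocator, which aborts the process on a detected double free, and a different allocator may behave differently. The function name is hypothetical.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Frees the same chunk twice in a child process.
   Returns 1 if the child was killed by SIGABRT, 0 if it ended any other
   way, and -1 if fork or waitpid failed. */
int double_free_is_detected(size_t chunk_size)
{
    assert(chunk_size > 0);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* volatile keeps the compiler from removing the calls */
        void *volatile chunk = malloc(chunk_size);
        if (chunk == NULL)
            _exit(2);
        free(chunk);
        free(chunk);  /* deliberate bug: second free of the same chunk */
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGABRT;
}
```

Running the faulty code in a forked child keeps the abort away from the calling process, so the check can sit in a test suite.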

 

8. Pointer Obfuscation

Some pointers stored in glibc are obfuscated via PTR_MANGLE/PTR_UNMANGLE macros internally in glibc, preventing libc function pointers from being overwritten during runtime.
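
The idea can be illustrated with a simplified stand-in: combine the pointer with a per-process secret before storing it and reverse the operation before use. The sketch below uses XOR followed by a bit rotation. The rotation count, the function names, and the way the secret is supplied are assumptions for illustration; this is not glibc's PTR_MANGLE implementation.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_ROTATE_BITS 17u  /* illustrative rotation count */

static uintptr_t rotate_left(uintptr_t value, unsigned bits)
{
    const unsigned width = (unsigned)(sizeof(uintptr_t) * 8);
    assert(bits > 0 && bits < width);
    return (value << bits) | (value >> (width - bits));
}

static uintptr_t rotate_right(uintptr_t value, unsigned bits)
{
    const unsigned width = (unsigned)(sizeof(uintptr_t) * 8);
    assert(bits > 0 && bits < width);
    return (value >> bits) | (value << (width - bits));
}

/* Obfuscates a pointer value with a per-process secret before storing it. */
uintptr_t demo_ptr_mangle(uintptr_t pointer, uintptr_t secret)
{
    return rotate_left(pointer ^ secret, DEMO_ROTATE_BITS);
}

/* Recovers the original pointer value; requires the same secret. */
uintptr_t demo_ptr_demangle(uintptr_t stored, uintptr_t secret)
{
    return rotate_right(stored, DEMO_ROTATE_BITS) ^ secret;
}
```

An attacker who can overwrite the stored value but does not know the secret cannot choose where the demangled pointer will land, which is what makes overwriting such a pointer unreliable.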

Relevant Link:

http://udrepper.livejournal.com/13393.html

 

9. Built with Fortify Source

Programs built with "-D_FORTIFY_SOURCE=2" (and -O1 or higher), enable several compile-time and run-time protections in glibc:

1. expand unbounded calls to "sprintf", "strcpy" into their "n" length-limited cousins when the size of a destination buffer is known (protects against memory overflows).
2. stop format string "%n" attacks when the format string is in a writable memory segment.
3. require checking various important function return codes and arguments (e.g. system, write, open).
4. require explicit file mask when creating new files.
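
The first protection depends on the compiler knowing the size of the destination object. A hand-written analogue can be sketched with the GCC and Clang builtin __builtin_object_size, which evaluates to (size_t)-1 when the size is unknown. The names checked_strcpy and CHECKED_STRCPY are hypothetical, and unlike the glibc checks, which abort the program, this sketch reports the overflow through its return value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define UNKNOWN_SIZE ((size_t)-1)

/* Copies src into dst only if it fits into dst_size bytes.
   Returns 0 on success and -1 if the copy would overflow dst.
   With an unknown size it behaves like a plain strcpy. */
int checked_strcpy(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t needed = strlen(src) + 1;  /* include the terminating NUL */
    if (dst_size != UNKNOWN_SIZE && needed > dst_size)
        return -1;
    memcpy(dst, src, needed);
    return 0;
}

/* Lets the compiler supply the destination size where it can determine it. */
#define CHECKED_STRCPY(dst, src) \
    checked_strcpy((dst), __builtin_object_size((dst), 0), (src))
```

The compiler usually needs optimization enabled to compute object sizes, which is why the fortify protections are tied to -O1 or higher.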

 

10. /proc/$pid/maps protection

With ASLR, a process's memory space layout suddenly becomes valuable to attackers. The "maps" file is made readable only by the process itself or the owner of the process. This went into the mainline kernel with a sysctl toggle in 2.6.22. The toggle was made non-optional in 2.6.27, forcing the privacy to be enabled regardless of sysctl settings (this is a good thing).
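
The effect is easy to probe from C: a process can read its own maps file, while the same read against a process owned by another user is expected to fail. The helper below is a small sketch that counts the lines of a maps-format file; the name count_mappings is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Counts the mappings (one per line) listed in a maps-format file.
   Returns -1 if the file cannot be opened. */
long count_mappings(const char *maps_path)
{
    assert(maps_path != NULL);

    FILE *file = fopen(maps_path, "r");
    if (file == NULL)
        return -1;

    long lines = 0;
    int ch;
    while ((ch = fgetc(file)) != EOF) {
        if (ch == '\n')
            lines++;
    }
    fclose(file);
    return lines;
}
```

Passing "/proc/self/maps" should give a positive count, while the maps path of another user's process should give -1 or no lines, depending on the kernel version, when called without privileges.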

 

11. ptrace scope

A troubling weakness of the Linux process interfaces is that a single user is able to examine the memory and running state of any of their processes. For example, if one application was compromised, it would be possible for an attacker to attach to other running processes (e.g. SSH sessions, GPG agent, etc) to extract additional credentials and continue to immediately expand the scope of their attack without resorting to user-assisted phishing or trojans.
In Ubuntu 10.10 and later, users cannot ptrace processes that are not a descendant of the debugger. The behavior is controllable through the /proc/sys/kernel/yama/ptrace_scope sysctl, available via Yama (not supported by the CentOS kernel).


 

12. /dev/mem protection

Some applications (Xorg) need direct access to the physical memory from user-space. The special file /dev/mem exists to provide this access. In the past, it was possible to view and change kernel memory from this file if an attacker had root access. The CONFIG_STRICT_DEVMEM kernel option was introduced to block non-device memory access (originally named CONFIG_NONPROMISC_DEVMEM).

 

13. Block module loading

In Ubuntu 8.04 LTS and earlier, it was possible to remove CAP_SYS_MODULE from the system-wide capability bounding set, which would stop any new kernel modules from being loaded. This was another layer of protection to stop kernel rootkits from being installed. The 2.6.25 Linux kernel (Ubuntu 8.10) changed how bounding sets worked, and this functionality disappeared. Starting with Ubuntu 9.10, it is now possible to block module loading again by setting "1" in /proc/sys/kernel/modules_disabled.


 

14. Syscall Filtering

Programs can filter out the availability of kernel syscalls by using the seccomp_filter interface. This is done in containers or sandboxes that want to further limit the exposure to kernel interfaces when potentially running untrusted software.
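
A minimal seccomp filter can be sketched as follows. The BPF program makes one chosen syscall fail with EPERM and allows everything else, and it is installed in a forked child so the calling process stays unrestricted. The function names are hypothetical, and the filter is deliberately simplified: a production filter should also validate the arch field of seccomp_data and would normally use an allow-list rather than a single denied call.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Installs a filter that makes one syscall fail with EPERM and allows all
   others. Returns 0 on success, -1 on failure. */
int deny_syscall(int syscall_nr)
{
    assert(syscall_nr >= 0);

    struct sock_filter filter[] = {
        /* Load the syscall number from the seccomp data. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* On a match fall through to the EPERM return, otherwise skip it. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, (unsigned int)syscall_nr, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog program = {
        .len = (unsigned short)(sizeof filter / sizeof filter[0]),
        .filter = filter,
    };

    /* Unprivileged processes must set no_new_privs before adding a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &program);
}

/* Runs the filter in a child and reports the outcome: 0 if getppid was
   blocked, 1 if it still worked, 2 if the filter could not be installed,
   -1 if fork or waitpid failed. */
int filtered_child_result(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (deny_syscall(SYS_getppid) != 0)
            _exit(2);
        errno = 0;
        long rc = syscall(SYS_getppid);
        _exit(rc == -1 && errno == EPERM ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because a filter cannot be removed once it is installed, applying it after fork() and before running the untrusted code is the usual pattern for sandboxes.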

 

15. How To Harden Linux

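System-level hardening against error-based DoS and buffer overflow attacks has to address both the Linux kernel configuration and the compiler's build options. Most of the time, however, the compiler options are outside our control, so the kernel configuration is the only side we can work on.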

0x1: Ubuntu

1. echo 2 > /proc/sys/kernel/randomize_va_space
2. echo 1 > /proc/sys/kernel/yama/protected_sticky_symlinks
3. echo 1 > /proc/sys/kernel/yama/protected_nonaccess_hardlinks
4. echo 1 > /proc/sys/kernel/yama/ptrace_scope
5. echo 1 > /proc/sys/kernel/modules_disabled: use with caution, this kernel option prevents even root from loading LKMs
6. echo 1 > /proc/sys/kernel/kptr_restrict
7. echo 1 > /proc/sys/kernel/dmesg_restrict: use with caution, this kernel option restricts reading the kernel log (dmesg) to users with CAP_SYSLOG

0x2: Centos

Relevant Link:

https://fedoraproject.org/wiki/SecurityBasics
https://wiki.ubuntu.com/Security/Features
http://wiki.centos.org/HowTos/Custom_Kernel
https://www.usenix.org/legacy/event/sec08/tech/full_papers/dalton/dalton_html/
https://www.blackhat.com/presentations/bh-usa-04/bh-us-04-silberman/bh-us-04-silberman-paper.pdf
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.208.9387&rep=rep1&type=pdf
http://www.informit.com/articles/article.aspx?p=2036582&seqNum=6

 

Copyright (c) 2015 LittleHann All rights reserved

 
