c-language-tips

Overview

parameter passing

x86 (32-bit) normally passes function parameters on the stack, while x86-64 passes parameters in registers whenever possible (it has more general-purpose registers than x86).

  • ESP (Extended Stack Pointer) always points to the top of the stack. On x86-64 it is RSP.

  • EBP (Extended Base Pointer) marks the base of the current function's stack frame: all stack variables are addressed as offsets from it. On x86-64 it is RBP. The frame base has the highest address, since the stack grows downward.

  • push

    • decrease ESP (allocating a slot), then save the value into that slot (4 or 8 bytes, depending on operand size)
  • pop

    • load the value from the top of the stack, then increase ESP

How the stack changes when a function is called on x86-64

#include <stdio.h>
#include <stdlib.h>

int foobar(int a, int b, int c)
{
    int xx = a + 2;
    int yy = b + 3;
    int zz = c + 4;
    int sum = xx + yy + zz;

    return xx * yy * zz + sum;
}

int main()
{
    return foobar(77, 88, 99);
}

On x86-64, the arguments to foobar() are passed in registers, while its local variables, along with some bookkeeping data, are stored on the stack when foobar() is called. This region of the stack is called the function's frame.

stack

  1. the caller loads the parameters into registers (rdi, rsi, rdx, ...)
  2. call pushes the return address (IP) onto the stack (done by the CPU at runtime; you cannot see this push in the assembly)
  3. push the caller's rbp onto the stack
  4. set rbp to rsp

after the call, inside foobar

  1. store local variables on the stack
  2. compute
  3. pop rbp
  4. ret pops the return address (IP) and jumps back to the caller

Note: a process has just one stack (per thread), not one per function. The exact stack locations of xx, yy, zz may differ depending on the compiler and the OS/architecture. Before the called function starts executing, the CPU has pushed the address of the next instruction (the saved IP) onto the stack.

x86-64 assembly

 4:a.c **** int foobar(int a, int b, int c)
 5:a.c **** {
 11 0000 55        pushq %rbp
 14 0001 4889E5    movq  %rsp, %rbp
 16 0004 897DEC    movl  %edi, -20(%rbp)    # fetch parameter from register, save to local stack
 17 0007 8975E8    movl  %esi, -24(%rbp)
 18 000a 8955E4    movl  %edx, -28(%rbp)
 6:a.c **** int xx = a + 2;
 20 000d 8B45EC    movl  -20(%rbp), %eax
 21 0010 83C002    addl  $2, %eax
 22 0013 8945F0    movl  %eax, -16(%rbp)    # save result to xx
 7:a.c **** int yy = b + 3;
 24 0016 8B45E8    movl  -24(%rbp), %eax
 25 0019 83C003    addl  $3, %eax
 26 001c 8945F4    movl  %eax, -12(%rbp)    # save result to yy
 8:a.c **** int zz = c + 4;
 28 001f 8B45E4    movl  -28(%rbp), %eax
 29 0022 83C004    addl  $4, %eax
 30 0025 8945F8    movl  %eax, -8(%rbp)     # save result to zz
 9:a.c **** int sum = xx + yy + zz;
 32 0028 8B55F0    movl  -16(%rbp), %edx
 33 002b 8B45F4    movl  -12(%rbp), %eax
 34 002e 01C2      addl  %eax, %edx         # edx now holds xx + yy
 35 0030 8B45F8    movl  -8(%rbp), %eax
 36 0033 01D0      addl  %edx, %eax
 37 0035 8945FC    movl  %eax, -4(%rbp)     # save result to sum
10:a.c ****
11:a.c **** return xx * yy * zz + sum;
 39 0038 8B45F0    movl  -16(%rbp), %eax
 40 003b 0FAF45F4  imull -12(%rbp), %eax
 41 003f 0FAF45F8  imull -8(%rbp), %eax
 42 0043 89C2      movl  %eax, %edx
 43 0045 8B45FC    movl  -4(%rbp), %eax
 44 0048 01D0      addl  %edx, %eax
12:a.c **** }
...
14:a.c **** int main()
15:a.c **** {
 58 004c 55        pushq %rbp               # save caller's rbp
 61 004d 4889E5    movq  %rsp, %rbp         # set rbp to rsp
16:a.c **** return foobar(77, 88, 99);
 64 0050 BA630000  movl  $99, %edx          # pass parameters in registers
 65 0055 BE580000  movl  $88, %esi
 66 005a BF4D0000  movl  $77, %edi
 67 005f E8000000  call  foobar
17:a.c **** }
 69 0064 5D        popq  %rbp

function vs macro

Both are used to factor out similar code that appears in many places. A macro makes the code simpler to write and easier to read, but it does not reduce the generated assembly, since the body is expanded at every use site. A function gives the same readability benefit while keeping the assembly small, so the binary takes less disk space and less memory when loaded. The disadvantage is that calling a function costs some time unless it is inlined. Below are some general rules for choosing between them.

rules to obey

  • for large pieces of similar code, use a function

    Here I say similar, not identical: identical code is easy to turn into a function, but similar code can be trickier. By "similar" I mean the logic is the same while the variables may have different TYPEs, and a function parameter has exactly one TYPE. The solution is to abstract out the part that is common to every TYPE and access only that part in the shared function. If the TYPEs have no common members, define a tiny per-TYPE helper (taking void*) for the part that differs, and call that helper from the common function. The common function (taking void*) then replaces the 'similar code' and stays easy to read and to extend with new TYPEs.

// A() and B() share the same logic; only a small part differs per type.
enum obj_type { TYPE_A, TYPE_B };

void tiny_a(void *v) { /* part specific to A */ }
void tiny_b(void *v) { /* part specific to B */ }

void common(void *v, enum obj_type type) {
    // ...shared logic...
    if (type == TYPE_A)
        tiny_a(v);      // special-case for A
    if (type == TYPE_B)
        tiny_b(v);      // special-case for B
    // ...shared logic...
}

void A(int a) {
    common(&a, TYPE_A);
}

void B(char b) {
    common(&b, TYPE_B);
}
  • for small pieces of similar code, there are two choices depending on the code
    • if the similar code accesses variables of the same type, use an inline function (a macro works as well)
    • if the similar code accesses variables of different types, use a macro

tips for using macro

  • you can pass a parameter from one macro to another.

  • # stringizes: in the body it produces the text of the passed value as a string literal

  • ## pastes: in the body it joins the parameter with adjacent tokens

  • you can NOT use # and ## at the call site during parameter passing; they have no effect there

  • do NOT define a variable inside a macro and use it outside of it

  • a macro is in effect only from its definition to the end of that file; but if you define it in a *.h file, the header is textually copied into every *.c file that includes it

difference between ## and # in macro

// '#' yields the string of the argument: pass hello without quotes,
// and #msg expands to "hello" !!!
#include <stdio.h>
#include <stdlib.h>

#define print(msg) printf("msg: %s\n", #msg)

int main() {
    print(hello);
    print("hello");
}

msg: hello

msg: "hello"

file: a.c
// '##' pastes tokens together; normally a parameter replaces a matching
// token in the body of the macro.
// How is a token in the macro body identified? A token is delimited by
// whitespace, ';' or other punctuation after the compiler has parsed it.
// Below: msg inside b_msg is NOT replaced (it is part of one token),
// while b_##msg becomes b_hello and b_##msg##_a becomes b_hello_a.
// (Note: a comment after a trailing '\' would break the line
// continuation, so the comments live up here instead.)

#include <stdio.h>
#include <stdlib.h>

#define print(msg) \
{ \
    printf("%s\n", b_msg); \
    printf("%s\n", b_##msg); \
    printf("%s\n", b_##msg##_a); \
}

int main() {
    print(hello);
}

root@dev:/tmp# gcc -o t a.c
a.c: In function ‘main’:
a.c:6:20: error: ‘b_msg’ undeclared (first use in this function)
printf("%s\n", b_msg);\
^
a.c:12:9: note: in expansion of macro ‘print’
print(hello);
^
a.c:6:20: note: each undeclared identifier is reported only once for each function it appears in
printf("%s\n", b_msg);\
^
a.c:12:9: note: in expansion of macro ‘print’
print(hello);
^
a.c:7:20: error: ‘b_hello’ undeclared (first use in this function)
printf("%s\n", b_##msg);\
^
a.c:12:9: note: in expansion of macro ‘print’
print(hello);
^
a.c:8:20: error: ‘b_hello_a’ undeclared (first use in this function)
printf("%s\n", b_##msg##_a);\
^
a.c:12:9: note: in expansion of macro ‘print’
print(hello);

ifdef/if vs macro

Macro expansion and #ifdef are both handled before compilation proper (in the **preprocessor**). Directives are processed in order within the file: a #define must appear before the #ifdef that tests it, and in a #if expression any macros are expanded before the condition is evaluated.

  • how to identify a token (tokens are separated by whitespace or punctuation) that will be replaced by a macro; it can be:
    • a function name
    • a parameter
    • an lvalue/rvalue

If a token is identical to the macro's name, the preprocessor replaces it with the macro body!!!

#include <stdio.h>
#include <stdlib.h>

#define HELLO hello
#ifdef HELLO
#define NUM 12
#endif

void hello(int a)
{
    printf("hello %d \n", a);
}

int main()
{
    HELLO(NUM);     // expands to hello(12)
    return 0;
}
  • what can be used as a condition in #if

Integer constant expressions: literals such as the number 12 or the char 'H'. If you use an ordinary identifier that is not a defined macro, it always evaluates to 0.

#include <stdio.h>
#include <stdlib.h>

// OK: replacement happens first, so the condition becomes 12 > 10
#define NUM 12
#if NUM > 10
#define var1 10
#endif

// 'name' was never defined, so it evaluates to 0 and var3 is defined
#if name != 'H'
#define var3 30
#endif

// 'name' and 'name1' were never defined, so both are 0 and var4 is defined
#if name == name1
#define var4 40
#endif

int main()
{
    printf("%d %d %d\n", var1, var3, var4);
    return 0;
}

Free memory when process exits

It depends on the operating system. The majority of modern (and all major) operating systems free a process's memory when it exits.

But relying on this is bad practice, and it is better to free memory explicitly. The issue isn't just that your code looks bad. You may decide to integrate your small program into a larger, long-running one, and then spend hours tracking down memory leaks a while later.

Relying on a feature of an operating system also makes the code less portable.

Even if your OS frees memory at exit (not all do), there are still reasons to free it explicitly:

  • it's good manners
  • it adds symmetry, so the code looks better
  • if someone takes this code and puts it into a program that uses the libraries for only a small part of its runtime, the resources should be freed when no longer needed
  • if you are hunting real memory leaks, your debugger won't be distracted by these unimportant ones
  • the OS does not automatically free some resources at exit, such as devices (sensors, scanners, ...), temporary files, and shared memory

Send file descriptor to another process

A file descriptor can be sent only over an AF_LOCAL (AF_UNIX) socket. The fd number itself is not what is transmitted (it is local to the sending process); instead, the kernel's file description (the file struct in the kernel) is carried in the skb as SCM ancillary data. The receiver then installs that file description into its own process under a new fd number in its own scope. After the transfer, both sender and receiver are linked to the same open file; in most cases the sender closes its fd, so only the receiver still accesses that file description.

Steps to take

  • Set cmsg_level of a struct cmsghdr to SOL_SOCKET
  • Set cmsg_type of the struct cmsghdr to SCM_RIGHTS
  • Place the file descriptor in the data part of the struct cmsghdr
  • Assign the struct cmsghdr to msg_control of a struct msghdr
  • Pass the struct msghdr to sendmsg(2) on the Unix domain socket

CMSG_DATA makes it easy to get a pointer to the data part of the struct cmsghdr for step 3.

Client

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/un.h>
#include <sys/wait.h>
#include <sys/socket.h>

#define handle_error(msg) \
    do { \
        perror(msg); \
        exit(EXIT_FAILURE); \
    } while (0)

static void send_fd(int socket, int* fds, int n) // send fds over the socket
{
    struct msghdr msg = { 0 };
    struct cmsghdr* cmsg;
    char buf[CMSG_SPACE(n * sizeof(int))], dup[256];
    memset(buf, '\0', sizeof(buf));
    struct iovec io = { .iov_base = &dup, .iov_len = sizeof(dup) };

    msg.msg_iov = &io;
    msg.msg_iovlen = 1;
    msg.msg_control = buf;
    msg.msg_controllen = sizeof(buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(n * sizeof(int));

    memcpy((int*)CMSG_DATA(cmsg), fds, n * sizeof(int));

    if (sendmsg(socket, &msg, 0) < 0)
        handle_error("Failed to send message");
}

int
main(int argc, char* argv[])
{
    int sfd, fds[1];
    struct sockaddr_un addr;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s <file-name1>\n", argv[0]);
        exit(1);
    }

    // use AF_LOCAL (AF_UNIX) to transfer the fd
    sfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sfd == -1)
        handle_error("Failed to create socket");

    memset(&addr, 0, sizeof(struct sockaddr_un));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, "/tmp/fd-pass.socket", sizeof(addr.sun_path) - 1);

    fds[0] = open(argv[1], O_RDONLY);
    if (fds[0] < 0)
        handle_error("Failed to open file 1 for reading");
    else
        fprintf(stdout, "Opened fd %d in parent\n", fds[0]);

    // connect to the UNIX server before sending
    if (connect(sfd, (struct sockaddr*)&addr, sizeof(struct sockaddr_un)) == -1)
        handle_error("Failed to connect to socket");

    send_fd(sfd, fds, 1);

    fprintf(stdout, "Sleep 1000s after sent fd\n");
    sleep(1000);
    exit(EXIT_SUCCESS);
}

Server

#include <fcntl.h>
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/un.h>
#include <sys/wait.h>
#include <sys/socket.h>

#define handle_error(msg) \
    do { \
        perror(msg); \
        exit(EXIT_FAILURE); \
    } while (0)

static int* recv_fd(int socket, int n)
{
    int* fds = malloc(n * sizeof(int));
    struct msghdr msg = { 0 };
    struct cmsghdr* cmsg;
    char buf[CMSG_SPACE(n * sizeof(int))], dup[256];
    memset(buf, '\0', sizeof(buf));
    struct iovec io = { .iov_base = &dup, .iov_len = sizeof(dup) };

    msg.msg_iov = &io;
    msg.msg_iovlen = 1;
    msg.msg_control = buf;
    msg.msg_controllen = sizeof(buf);

    // get the transferred file descriptors
    if (recvmsg(socket, &msg, 0) < 0)
        handle_error("Failed to receive message");

    cmsg = CMSG_FIRSTHDR(&msg);

    memcpy(fds, (int*)CMSG_DATA(cmsg), n * sizeof(int));

    return fds;
}

int
main(int argc, char* argv[])
{
    ssize_t nbytes;
    char buffer[256];
    int sfd, cfd, *fds;
    struct sockaddr_un addr;
    int i;

    sfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sfd == -1)
        handle_error("Failed to create socket");

    if (unlink("/tmp/fd-pass.socket") == -1 && errno != ENOENT)
        handle_error("Removing socket file failed");

    memset(&addr, 0, sizeof(struct sockaddr_un));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, "/tmp/fd-pass.socket", sizeof(addr.sun_path) - 1);

    if (bind(sfd, (struct sockaddr*)&addr, sizeof(struct sockaddr_un)) == -1)
        handle_error("Failed to bind to socket");

    // listen on the given address
    if (listen(sfd, 5) == -1)
        handle_error("Failed to listen on socket");

    // accept one connection
    cfd = accept(sfd, NULL, NULL);
    if (cfd == -1)
        handle_error("Failed to accept incoming connection");

    fds = recv_fd(cfd, 1);

    for (i = 0; i < 1; ++i) {
        fprintf(stdout, "Reading from passed fd %d\n", fds[i]);
        // read the file and write it to stdout
        while ((nbytes = read(fds[i], buffer, sizeof(buffer))) > 0)
            write(1, buffer, nbytes);
        *buffer = '\0';
    }

    printf("sleep 1000 after read file\n");
    sleep(1000);
    free(fds);

    if (close(cfd) == -1)
        handle_error("Failed to close client socket");

    return 0;
}

IPC

  • Pipe, socketpair, socket(AF_LOCAL)
  • signal, eventfd
  • shared memory, file, need lock

pipe

Normally parent process —> child process (it can also be used between unrelated processes, by sending one end's fd to the other process); it is stream-oriented.

It's a one-way communication mechanism, with one end open for reading (fd[0]) and the other end for writing (fd[1]). Parent and child therefore need to agree on which direction to use the pipe: from parent to child, or the other way around. A pipe is also a stream mechanism: all bytes sent through the pipe arrive in order. When the reader asks for a certain number of bytes from this stream, it is given as many bytes as are available, up to the amount requested. Note that these bytes may have come from a single call to write() or from several calls to write() whose data is concatenated.

socketpair

Only for parent process <—> child process; it can be stream- or datagram-oriented.

Two-way communication mechanism, an extension of pipe.

named pipe

any process<—->any process

int mkfifo(const char *pathname, mode_t mode)

A FIFO special file is similar to a pipe, except for how it is created. Instead of being an anonymous communication channel, a FIFO special file is entered into the filesystem by calling mkfifo(). Make sure your application deletes it (call remove()/unlink()); otherwise it is left on the system.

One side

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

#define BUFFSIZE 512

int main(void)
{
    int fd;
    ssize_t n;
    char buf[BUFFSIZE];

    // creates a file on disk; you can see it:
    // prw-r--r-- 1 root root 0 Aug 5 16:50 fifo_x
    mkfifo("fifo_x", 0666);

    if ((fd = open("fifo_x", O_WRONLY)) < 0) {
        perror("open");
        exit(1);
    }

    while ((n = read(STDIN_FILENO, buf, BUFFSIZE)) > 0) {
        if (write(fd, buf, n) != n) {   // write what was read, not strlen()
            perror("write");
            exit(1);
        }
    }
    close(fd);
    return 0;
}

the other

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

#define BUFFSIZE 512

int main(void)
{
    int fd;
    ssize_t n;
    char buf[BUFFSIZE];

    if ((fd = open("fifo_x", O_RDONLY)) < 0) {
        perror("open");
        exit(1);
    }

    while ((n = read(fd, buf, BUFFSIZE)) > 0) {
        if (write(STDOUT_FILENO, buf, n) != n)
            exit(1);
    }
    close(fd);
    return 0;
}

Like a pipe, a FIFO carries a single byte stream; for two-way communication it is common to use a pair of FIFOs, one per direction.

AF_LOCAL(AF_UNIX) socket

any process<—->any process

A Unix domain socket allows communication between any two processes on the same machine.

signal

any process<—->any process within same machine

// send a signal to a process with the kill() API
int kill(pid_t pid, int signal);

tiny: only the signal number is sent, it cannot carry a payload (though sigqueue() can attach a small value)

eventfd

it's an fd, but one used as an event counter; parent <—> child

Used between threads or between related processes to signal that something happened: the writer adds to a counter kept in the kernel, which wakes up the reader to handle it.

#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

uint64_t u;
int efd = eventfd(0, 0);
switch (fork()) {
case 0:                         // child: add 10 to the counter
    u = 10;
    write(efd, &u, sizeof(u));
    _exit(0);                   // don't fall through into the parent's read
default:                        // parent: blocks until the counter is non-zero
    read(efd, &u, sizeof(uint64_t));
}

tiny: the only payload is a 64-bit counter value

share memory

Any process <—> any process on the same machine; most of the time a lock is needed.

system-v style

It uses a key to identify a named shared-memory segment, so that other processes can attach to it. It can also be used without a key, for private memory shared between parent and child: the segment is created before fork(), so both see it.

// one process
key_t key = 1000;
shmget(key, SHMSZ, IPC_CREAT | 0666); // create shared memory with flag IPC_CREAT

// another process
key_t key = 1000;
shmget(key, SHMSZ, 0666); // attach to the shared memory identified by key 1000
  • private shared memory for parent/child

shmget() with IPC_PRIVATE as the key

IPC_PRIVATE isn't a flag but a key_t value. If this special value is used for the key, the system call ignores everything but the least significant 9 bits of shmflg and creates a new shared memory segment (on success). The name choice IPC_PRIVATE was perhaps unfortunate; IPC_NEW would describe its function more clearly. Child and parent attach the same segment, so they always use the same memory!!!

posix style

Intended for sharing memory between unrelated processes. One process creates the file, then another opens it and maps it for sharing (think of it as a normal file; actually it is an identifier for the memory).

It uses a special file, located under /dev/shm, to identify the shared memory.

// one process
fd = shm_open("test", O_CREAT | O_RDWR, 0666); // O_CREAT creates /dev/shm/test
ftruncate(fd, PAGE_SIZE);                      // set the shared memory size

addr = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
            MAP_SHARED, fd, 0);


// the other
shm_fd = shm_open("test", O_RDWR, 0666);
// attach to it
addr = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
            MAP_SHARED, shm_fd, 0);

shm_open() returns a file descriptor, which is closed with close(). The object itself is named: a process must also call shm_unlink() to remove the name (dropping its reference) so the memory can eventually be freed.

Warning

If no process ever calls shm_unlink(), the name still holds a reference after every fd is closed, so the shared memory may never be freed at all.

  • shared memory between parent/child

Without shm_open(): use the MAP_ANONYMOUS flag and fd = -1. Since the mapping is created before fork(), parent and child both know about the shared memory.

// before fork()
addr = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  • for mmap() with MAP_PRIVATE

Create a private copy-on-write mapping. Updates to the mapping are not visible to other processes mapping the same file, and are not carried through to the underlying file. It is unspecified whether changes made to the file after the mmap() call are visible in the mapped region

That means a process that writes through the mapping gets its own copy of the affected pages, and other processes mapping the same file do not see the changed part!

Multiplexing

The poll() API performs a similar job to the existing select() API. The only difference between the two is the interface provided to the caller.

The select() API requires the application to pass in a set of bits in which one bit represents each descriptor number. For example, if we create only one socket with fd 1000, the first argument to select() must be 1001, and the kernel has to create and scan 1001 slots, an unnecessary loop, even though only one socket exists. When descriptor numbers are very large, the set can overflow the 30KB allocated memory size, forcing multiple iterations; this overhead can adversely affect performance.


The poll() API lets the application pass an array of structures rather than a set of bits. Because each pollfd structure occupies only 8 bytes, the application passes just one structure for each descriptor it actually created, even if the descriptor numbers are very large.


Both poll() and select() require the full list of file descriptors to watch on each invocation, and the kernel must then walk that list. When the list grows to hundreds or even thousands of file descriptors, walking it on every invocation becomes a scalability bottleneck.

  • select needs the monitored fd set rebuilt on every call (it modifies the passed sets)

  • select limits fd numbers to 1024; poll has no such limit

  • even with a single fd of 1000, select has to iterate over slots 0 to 1000; poll does not

  • poll's array of monitored fds only needs to be filled once and can be reused across calls

  • poll places no limit on fd numbers

  • with a single fd of 1000, poll iterates just once: it walks only the monitored fds to see which have events


epoll circumvents this problem by decoupling monitor registration from the actual monitoring: one system call (epoll_create) initializes an epoll context, another (epoll_ctl) adds monitored file descriptors to or removes them from the context, and a third (epoll_wait) performs the actual event wait.

  • epoll registers the monitored fds only once (via epoll_ctl)
  • epoll places no limit on fd numbers
  • it iterates only over the monitored fds that actually have events: fast

select

One API: select()

select(fdmax + 1, &read_fds, NULL, NULL, NULL);
for (i = 0; i <= fdmax; i++) {
    // check which fd has a read event (FD_ISSET)
}
// fdmax: the highest fd we monitor
// read_fds: a 1024-bit set; set a fd's bit to monitor that fd

For each loop

  • pass the max fd and the set of fds to monitor
  • if an event happens, check every fd from 0 to max to find which one has the event (in both kernel and user space)

The major disadvantages include:

  • select modifies the passed fd_sets so none of them can be reused. Even if you don’t need to change anything, such as if one of descriptors received data and needs to receive more data, a whole set has to be either recreated again or restored from a backup copy via FD_COPY. And this has to be done each time the select is called.

  • To find out which descriptors raised the events you have to manually iterate through all the descriptors in the set and call FD_ISSET on each one of them. When you have 2,000 of those descriptors and only one of them is active, likely, the last one, you’re wasting CPU cycles each time you wait.

  • Max number of supported fd 1024, Did I just mention 2,000 descriptors? Well, select cannot support that much. At least on Linux. The maximum number of the supported descriptors is defined by the FD_SETSIZE constant, which Linux happily defines as 1024. And while some operating systems allow you to hack this restriction by redefining the FD_SETSIZE before including the sys/select.h, this is not portable. Indeed, Linux would just ignore this hack and the limit will stay the same.

  • Can not close fd if it's in select() now, You cannot modify the descriptor set from a different thread while waiting. Suppose a thread is executing the code above. Now suppose you have a housekeeping thread which decided that sock1 has been waiting too long for the input data, and it is time to cut the cord. Since this socket could be reused to serve another paying working client, the housekeeping thread wants to close the socket. However the socket is in the fd_set which select is waiting for. Now what happens when this socket is closed? man select has the answer, and you won’t like it. The answer is, “If a file descriptor being monitored by select() is closed in another thread, the result is unspecified”.

  • Can't send data on fd if it's in select() now Same problem arises if another thread suddenly decides to send something via sock1. It is not possible to start monitoring the socket for the output event until select returns.

  • select puts extra burden on you when filling up the descriptor list to calculate the largest descriptor number and provide it as a function parameter.

Is there any reason to use select at all?

  • The first reason is portability. select has been around for ages, and you can be sure that every single platform around which has network support and nonblocking sockets will have a working select implementation while it might not have poll at all. And unfortunately I’m not talking about the tubes and ENIAC here; poll is only available on Windows Vista and above which includes Windows XP – still used by the whooping 34% of users as of Sep 2013 despite the Microsoft pressure. Another option would be to still use poll on those platforms and emulate it with select on those which do not have it; it is up to you whether you consider it reasonable investment.

  • The second reason is more exotic, and is related to the fact that select can, theoretically, handle timeouts with one-nanosecond precision, while both poll and epoll can only handle one-millisecond precision. This is not likely to be a concern on a desktop or server system, whose clocks don't even run with such precision, but it may be necessary on a realtime embedded platform interacting with hardware components. Such as lowering control rods to shut down a nuclear reactor, in which case, please, use select to make sure we all stay safe!

The case above would probably be the only case where you would have to use select and could not use anything else. However if you are writing an application which would never have to handle more than a handful of sockets (like, 200), the difference between using poll and select would not be based on performance, but more on personal preference or other factors.

poll

One API: poll()

poll(ufds, 2, 3500);
/* ufds: the array of pollfds to monitor
 * 2:    the array size (how many fds we want to monitor)
 * 3500: timeout in milliseconds (3.5 s)
 */

For each loop

  • pass only the fds being monitored, not the max fd as select requires
  • if an event happens, iterate over all monitored entries and check revents to find the fds that have events (in both kernel and user space)

poll was mainly created to fix the pending problems select had, so it has the following advantages over it:

  • There is no hard limit on the number of descriptors poll can monitor, so the limit of 1024 does not apply here.

  • It does not modify the data passed in the struct pollfd. Therefore it can be reused between poll() calls, as long as you zero the revents member for those descriptors which generated events. The IEEE specification states that "In each pollfd structure, poll() shall clear the revents member, except that where the application requested a report on a condition by setting one of the bits of events listed above, poll() shall set the corresponding bit in revents if the requested condition is true". However, in my experience at least one platform did not follow this recommendation, and man 2 poll on Linux does not make such a guarantee either (man 3p poll does, though).

  • It allows more fine-grained control of events comparing to select. For example, it can detect remote peer shutdown without monitoring for read events.

poll still has a few issues which need to be kept in mind:

  • Like select, it is still not possible to find out which descriptors have the events triggered without iterating through the whole list and checking the revents. Worse, the same happens in the kernel space as well, as the kernel has to iterate through the list of file descriptors to find out which sockets are monitored, and iterate through the whole list again to set up the events.

  • Like select, it is not possible to dynamically modify the set or close the socket which is being polled.

poll should be your preferred method even over epoll if the following is true:

  • You need to support more than just Linux, and do not want to use epoll wrappers such as libevent (epoll is Linux only);

  • Your application needs to monitor less than 1000 sockets at a time (you are not likely to see any benefits from using epoll);

  • Your application needs to monitor more than 1000 sockets at a time, but the connections are very short-lived (this is a close case, but most likely in this scenario you are not likely to see any benefits from using epoll because the speedup in event waiting would be wasted on adding those new descriptors into the set – see below)

  • Your application is not designed the way that it changes the events while another thread is waiting for them (i.e. you’re not porting an app using kqueue or IO Completion Ports).

epoll


Three APIs: epoll_create(), epoll_ctl() and epoll_wait()

For each loop

  • no need to pass the monitored fds each time: the kernel keeps them from the epoll_ctl() calls
  • if an event happens, no need to scan for which fd: epoll_wait() returns only the fds that have events

epoll has some significant advantages over select/poll both in terms of performance and functionality:

  • epoll returns only the list of descriptors which triggered the events. No need to iterate through 10,000 descriptors anymore to find that one which triggered the event!

  • You can attach meaningful context to the monitored event instead of socket file descriptors. In our example we attached the class pointers which could be called directly, saving you another lookup.

  • You can add sockets or remove them from monitoring anytime, even if another thread is in the epoll_wait function. You can even modify the descriptor events. Everything will work properly, and this behavior is supported and documented. This gives you much more flexibility in implementation.

  • Since the kernel knows all the monitoring descriptors, it can register the events happening on them even when nobody is calling epoll_wait. This allows implementing interesting features such as edge triggering

  • It is possible to have the multiple threads waiting on the same epoll queue with epoll_wait(), something you cannot do with select/poll. In fact it is not only possible with epoll, but the recommended method in the edge triggering mode.

epoll is not a “better poll”, and it also has disadvantages when comparing to poll:

  • Changing the event flags (i.e. from READ to WRITE) requires the epoll_ctl syscall, while when using poll this is a simple bitmask operation done entirely in userspace. Switching 5,000 sockets from reading to writing with epoll would require 5,000 syscalls and hence context switches (as of 2014 calls to epoll_ctl still could not be batched, and each descriptor must be changed separately), while in poll it would require a single loop over the pollfd structure.

  • Each accepted socket needs to be added to the set, and same as above, with epoll it has to be done by calling epoll_ctl which means there are two required syscalls per new connection socket instead of one for poll. If your server has many short-lived connections which send or receive little traffic, epoll will likely take longer than poll to serve them.

  • epoll is exclusive to Linux, and while other platforms have similar mechanisms, they are not exactly the same; edge triggering, for example, is pretty unique (FreeBSD’s kqueue supports it too though).

  • High performance processing logic is more complex and hence more difficult to debug, especially with edge triggering, which is prone to deadlocks if you miss an extra read/write.

Therefore you should only use epoll if all of the following are true:

  • Your application runs a thread pool which handles many network connections with a handful of threads. You would lose most of epoll's benefits in a single-threaded application, and most likely it won’t outperform poll.

  • You expect a reasonably large number of sockets to monitor (at least 1,000); with a smaller number epoll is not likely to have any performance benefit over poll and may actually worsen performance;

  • Your connections are relatively long-lived; as stated above, epoll will be slower than poll when a new connection sends a few bytes of data and immediately disconnects, because of the extra system call required to add the descriptor to the epoll set;

  • Your app depends on other Linux-specific features (so that if the portability question suddenly pops up, epoll wouldn’t be the only roadblock), or you can provide wrappers for other supported systems. In the latter case you should strongly consider libevent.

epoll Triggering modes

epoll provides both edge-triggered and level-triggered modes. In edge-triggered mode, a call to epoll_wait will return only when a new event is enqueued with the epoll object; you should consume all available data when the event fires, otherwise the next epoll_wait will block if no new data arrives. In level-triggered mode, epoll_wait will keep returning as long as the condition holds.

For instance, if a pipe registered with epoll has received data, a call to epoll_wait will return, signaling the presence of data to be read. Suppose the reader only consumed part of the data from the buffer. In level-triggered mode, further calls to epoll_wait will return immediately, as long as the pipe’s buffer contains data to be read. In edge-triggered mode, however, epoll_wait will return only once new data is written to the pipe.

In edge-triggered mode you must read all available data when a read event fires

  • how do I know all data has been read for this event?
n = recv(fd, buf, buflen, 0);
if (n == 0) {
    // end of file, peer closed
    close(fd);
}
if (n == -1) {
    if (errno == EAGAIN) {
        // no data left to read, !!!read them all!!!
    } else if (errno == EINTR) {
        // interrupted while blocked, retry
    }
}
if (n < buflen) {
    // short read, !!!read them all!!!
}

Multi-thread

pthread_cleanup_pop and pthread_cleanup_push

Threads share the address space, so if a resource is written by multiple threads, a mutex lock is needed, something like the snippet below in every thread. But in some cases one thread may call pthread_cancel() to abort another thread; if that thread has just taken the lock but has not yet run the unlock, it is terminated with no chance to unlock, so the other threads wait forever on a lock that will never be released: deadlock. With pthread_cleanup_push and pthread_cleanup_pop we can register a callback that is called even on cancellation.

pthread_mutex_lock(&mutex);
// do_something()
pthread_mutex_unlock(&mutex);

pthread_cleanup_push() registers a callback that is called even when the thread is cancelled, so we can unlock the mutex in the callback.

The callback executes in three cases.

  • The thread exits (that is, calls pthread_exit()).
  • The thread acts upon a cancellation request.
  • The thread calls pthread_cleanup_pop() with a non-zero execute argument.
// void pthread_cleanup_push(void (*routine)(void*), void *arg);
pthread_cleanup_push(some_clean_func, ...);
pthread_mutex_lock(&mutex);
// do_something
pthread_mutex_unlock(&mutex);
pthread_cleanup_pop(0); // remove the callback without calling it

pthread_join

In multi-threaded programs, one thread may depend on another; for instance, A must run after B quits, or before A quits we must first quit B. That’s where pthread_join() (wait for another thread to exit, blocking the caller) comes in. One typical case:

In the main thread, we create a new thread. Without pthread_join, the two threads run independently; if the main thread quits before the new thread (which still has work to do), the remaining work can’t be done, because when the main thread quits, all resources are freed by the OS and the new thread exits as well!

#include <pthread.h>
#include <unistd.h>
#include <stdio.h>

void *start(void *str)
{
    int i;
    for (i = 0; i < 10; ++i)
    {
        sleep(2);
        printf("This in the thread : %d\n", i);
    }
    return NULL;
}

int main()
{
    pthread_t pth;
    int i;
    int ret = pthread_create(&pth, NULL, start, NULL); //starts to run now
    pthread_join(pth, NULL); //wait until the thread exits
    for (i = 0; i < 10; ++i)
    {
        sleep(1);
        printf("This in the main : %d\n", i);
    }

    return 0;
}

// If without pthread_join(pth, NULL), the output is unpredictable!!!! Because you
// don’t know when the main thread will exit.

thread-safe vs multi-safe vs signal-safe

lockless queue/stack

In order to achieve locklessness, there are two main points.

  • use the CMPXCHG (x86) instruction for atomic operations
  • organize your code and data structures so that the critical section is touched in a single atomic statement, or check-and-set a flag in one instruction before entering the critical section.

what does CMPXCHG do

//simulate it in C code
int val_compare_and_swap(int *memory_location, int expected_value, int new_value)
{
    int old_value = *memory_location;
    if (old_value == expected_value)
        *memory_location = new_value;
    return old_value;
}

bool bool_compare_and_swap(int *memory_location, int expected_value, int new_value)
{
    int old_value = *memory_location;
    if (old_value == expected_value) {
        *memory_location = new_value;
        return true;
    }
    return false;
}

Thanks to GCC, we don't need to write assembly to do it, as it provides two built-in functions implemented with CMPXCHG.

GCC built-in

bool __sync_bool_compare_and_swap (type *ptr, type oldval, type newval, ...)
type __sync_val_compare_and_swap (type *ptr, type oldval, type newval, ...)

lockless stack

#include <stdlib.h>

struct node {
    int val;
    struct node *next;
};

struct list {
    struct node *head;
};

// C has no member functions, so push takes the list explicitly
void push(struct list *l, int val)
{
    struct node *p = malloc(sizeof(struct node));
    if (p) {
        p->val = val;
        while (1) {
            p->next = l->head;
            //check if head was changed by another thread; if not, swap it,
            //otherwise do another loop.
            if (__sync_bool_compare_and_swap(&l->head, p->next, p)) {
                break;
            }
        }
    }
}

glibc pthread mutex Semaphores and futex

Before introducing futex (fast userspace mutex), let’s see how pthread_mutex was implemented in the early days: when you called pthread_mutex_lock(), it made a system call which did the following in the kernel.

  • check if the lock available
  • if unavailable, sleep the caller.

What’s the problem with this solution? Even if the lock is not contended, we still have to go to the kernel to check and take it. But the check-and-take step can be moved to user space, so when contention is low we never enter the kernel at all, since the kernel is only needed to put the caller to sleep. This is the core idea of futex. A futex has two parts: one part in glibc that checks and takes the lock in user space, and one part in the kernel: when the lock is unavailable, glibc calls the futex() system call to put the caller to sleep; futex() can also wake a process depending on its parameters.

//the futex logic can be expressed roughly in C as follows

int val = 0;
void lock()
{
    int c;
    if ((c = cmpxchg(val, 0, 1)) != 0) {
        if (c != 2)
            c = xchg(val, 2);
        while (c != 0) {
            futex_wait(&val, 2);
            c = xchg(val, 2);
        }
    }
}

void unlock()
{
    if (atomic_dec(val) != 1)
        futex_wake(&val, 1);
}

futex(uaddr, FUTEX_WAKE, 1)

futex(uaddr, FUTEX_WAIT, 1)

Since kernel 2.5.7 (2002), glibc uses futex to implement pthread mutexes and semaphores; pthread_join() also calls futex to sleep the caller.

get thread id

#include <sys/types.h>
#include <sys/syscall.h>
#include <unistd.h>

pid_t tid = syscall(SYS_gettid);
printf("%d\n", tid);

condition

Condition variables are used like this:

//thread 1:
pthread_mutex_lock(&mutex);
while (!condition) // condition is a variable protected by mutex; cond is a pthread_cond_t
    pthread_cond_wait(&cond, &mutex); // always blocks!!! and the mutex is released;
                                      // when woken up, the mutex is re-acquired!!!

/* do something that requires holding the mutex while the condition is true */
pthread_mutex_unlock(&mutex);

//thread 2:
pthread_mutex_lock(&mutex);
/* do something that might make condition true */
pthread_cond_signal(&cond);
pthread_mutex_unlock(&mutex);

Multi-process

wait() vs waitpid()

Always call wait in the parent process; otherwise the exited child never fully dies. Without a wait() call, an exited child stays in the zombie state: its memory is freed, but its task struct is not.

zombie process

When a process exits, the OS frees all of its memory and closes all fds; only the task_struct is left, which holds metadata of the exited process such as process id, exit state, cpu usage etc. The OS then sends SIGCHLD to its parent. Before the parent calls wait or waitpid, the exited process is a zombie; if the parent never calls wait/waitpid, it stays a zombie forever. After the parent calls wait/waitpid, the remaining task_struct is freed.

The wait() system call suspends execution of the current process until one of its children terminates. The call wait(&status) is equivalent to waitpid(-1, &status, 0), but waitpid supports a non-blocking mode (returning even if no child has exited) by setting an option.

  • WNOHANG return immediately if no child has exited.

The waitpid() system call suspends execution of the current process until the child specified by the pid argument changes state (terminated or stopped).

waitpid(pid, &status, WNOHANG)

Note: WIFEXITED(status) and WIFSTOPPED(status) are macros for inspecting the returned status, not options to pass to waitpid().

Always use waitpid() as it’s more powerful: it can return when [any one] or [a particular one] exited, or [immediately if no one exited], while wait() always blocks and wakes up only when some child quits.

execve()

The exec() family executes the program referred to by pathname. This causes the program currently being run by the calling process to be replaced with a new program, with newly initialized stack, heap, and (initialized and uninitialized) data segments, except for the following:

  • File descriptors open except for those whose close-on-exec flag (FD_CLOEXEC) is set
  • Process ID
  • Parent process ID
  • Process group ID
  • Session membership
  • Real user ID
  • Real group ID
  • Current working directory
  • File mode creation mask
  • Process signal mask
  • Pending signals

In short

Roughly, you can think of everything as replaced except the file descriptors, signal dispositions, pid and uid, so make sure to close those opened files. There are two ways to do this.

  • One way

    Add flag when open a file(socket) with O_CLOEXEC

  • The other way

    after open, use fcntl() to set a flag FD_CLOEXEC

demo for each function

Mix process and thread, call fork in thread?

You should never call fork in a multi-threaded program, as there are lots of non-obvious problems that you can’t easily detect.

The non-obvious problem with this approach (mixing them) is that at the moment of the fork(2) call some threads may be in critical sections of code, doing non-atomic operations protected by mutexes. In the child process those threads just disappear and leave data half-modified, with no possibility to “fix” it: there is no way to tell what the other threads were doing and what should be done to make the data consistent. Moreover, the state of the mutexes is undefined; they might be unusable, and the only way to use them in the child is to call pthread_mutex_init() to reset them to a usable state. How mutexes behave after fork(2) is implementation dependent: on my Linux machine, locked mutexes stay locked in the child.

malloc, syslog(), printf() use locks internally; be careful with them when mixing threads and fork.

More details, refer to think-twice-before-using-tem

pipe vs pipe2 vs socketpair and dup vs dup2

pipe(fd[2])/pipe2(fd[2], flag) create a unidirectional channel with two fds (fd[0] for read, fd[1] for write); pipe2 gives more control when creating the channel, like setting non-blocking mode etc.

pipe2() avoids race conditions by taking O_CLOEXEC | O_NONBLOCK atomically at creation time, https://man7.org/linux/man-pages/man2/open.2.html

socketpair(domain, proto, fd[2]) is similar to pipe; it is effectively an extension of pipe that allows two-way communication.

dup(oldfd)/dup2(oldfd, newfd) duplicate an fd: dup2 lets you choose the new fd, while dup() picks the smallest unused fd.

dup(oldfd) duplicates oldfd; the return value is the new fd (dup always picks the smallest unused fd).

dup2(oldfd, newfd) takes two fds; oldfd must exist before dup2. It makes newfd refer to the same file as oldfd; if newfd is open, it is silently closed before being reused.

typical use for pipe

parent creates a pipe, then fork child, child/parent use pipe for one-way communication.

int fd[2];
char buf[128];
pipe(fd);
int pid = fork();
if (pid) {
    //parent reads
    close(fd[1]);
    read(fd[0], buf, sizeof(buf));
} else {
    //child writes
    close(fd[0]);
    write(fd[1], "hi", 2);
}

typical use for dup/dup2 along with pipe

int fd[2];
pipe(fd);
int pid = fork();
if (pid) {
    //parent reads
    char buf[128];
    close(fd[1]);
    read(fd[0], buf, sizeof(buf));
    close(fd[0]);
} else {
    //child writes
    close(fd[0]);
    //redirect STDOUT, hence the parent will get the output from execlp!!!
    dup2(fd[1], STDOUT_FILENO);
    execlp("grep", "grep", "pipe", NULL);
    //only reached if execlp fails
    close(fd[1]);
}

little-endian and big-endian

“Little Endian” means that the low-order byte of the number is stored in memory at the lowest address, and the high-order byte at the highest address. (The little end comes first.) For example, a 4-byte int

`Byte3 Byte2 Byte1 Byte0`

will be arranged in memory as follows:

Base Address+0   Byte0
Base Address+1 Byte1
Base Address+2 Byte2
Base Address+3 Byte3

Intel processors (those used in PC’s) use “Little Endian” byte order.
“Big Endian” means that the high-order byte of the number is stored in memory at the lowest address, and the low-order byte at the highest address. (The big end comes first.) Our 4-byte int would then be stored as:

Base Address+0   Byte3
Base Address+1 Byte2
Base Address+2 Byte1
Base Address+3 Byte0

Motorola processors (those used in Mac’s) use “Big Endian” byte order.

daemonize

To daemonize, there are two ways to do it.

  • Make itself a daemon
    1. parent -> fork (child)
    2. parent exits
    3. init adopts the orphaned child
  • An existing daemon wants to daemonize its child (forked from itself)
    1. parent -> fork (child)
    2. child -> fork (grandchild)
    3. child exits
    4. init adopts the orphaned grandchild

gcc

gcc provides lots of built-in keywords like inline, typeof etc.

Note: -ansi and the various -std options disable certain keywords (asm, inline, typeof); in that case, use __asm__, __inline__, __typeof__ when -ansi or -std is enabled during compiling.

typeof

Get the type of a variable, then define an internal variable in a macro; this is what typeof is typically used for.

//a and b are two pointers
typeof(int*) a, b;

//a is a pointer while b is int
typeof(int) *a, b;

inline

GCC does not inline any function when compiling without optimization (-O0) unless you specify the ‘always_inline’ attribute for the function.

  • suggest inline; whether it is actually inlined is decided by the compiler
    inline void foo() {}
  • force inline
    __attribute__((always_inline)) inline void foo() {}

Note: the inline keyword is added at the function definition, not the declaration

What does compiler do if it inlines that function?

For an inline function, the compiler replaces the call with the expanded function body; that means no function call happens for an inlined function: fast, but with more code. There is an exception: if we assign an inline function to a function pointer, the compiler will emit a real function (even though it’s inline) with an address; otherwise there is no function for it in the assembly code.

inline in dot c file

If you define an inline function in a .c file and want to call it from another .c file, the compiler will create a function with an address for it as well; the other .c file sees it as a normal function, but in the file where it is defined, it is an inline function.

inline in dot h file

If you define it in a .h file, the header is copied into every .c file that includes it, so there are several copies of this inline function, as if it were defined in each .c file, and it is inlined in each of them.

For a short shared function, define it as inline in a .h file with the static keyword.

static used in header

One important thing to know is that a header file is copied into each .c file that includes it. That means if you write static int b = 12 in a header included by two .c files, it has the same effect as each .c file defining its own static int b = 12.

//File q7a.h:
static int err_code = 3;
void printErrCode(void);

//File q7a.c:
#include <stdio.h>
#include "q7a.h"

void printErrCode(void)
{
printf ("%d\n", err_code);
}

//File q7main.c:
#include "q7a.h"

int main(void)
{
err_code = 5;
printErrCode();

return 0;
}
$ gcc -Wall -c q7a.c -o q7a.o
$ gcc -Wall -c q7main.c -o q7main.o
$ gcc q7main.o q7a.o -o q7
$ ./q7
the output is 3

attribute of function

__attribute__ can be used only in a declaration! Here we only list common attributes that may be useful in daily life.

  • deprecated: indicates you’d better not use me as I’m deprecated; if used, a warning appears.
  • constructor: run before main
  • destructor: run after main
  • section: specify the section where I’m placed
  • warn_unused_result: warn if the result is not used
  • noinline: prevent me from being considered for inlining.
  • always_inline: force inline
  • unused: suppress the warning if not used.

void fun() __attribute__((deprecated));

attribute of variable

  • aligned: this attribute specifies a minimum alignment for the variable or structure field, measured in bytes; if the type’s default alignment is larger than this value, the default is used. alignment = max(default, aligned)

Whenever you leave out the alignment factor in an aligned attribute specification, the compiler automatically sets the alignment for the declared variable or field to the largest alignment which is ever used for any data type on the target machine you are compiling for. Doing this can often make copy operations more efficient, because the compiler can use whatever instructions copy the biggest chunks of memory when performing copies to or from the variables or fields that you have aligned this way.
The aligned attribute can only increase the alignment; but you can decrease it by specifying packed as well

  • packed: The packed attribute specifies that a variable or structure field should have the smallest possible alignment–one byte for a variable, and one bit for a field, alignment=min(default, packed)
  • deprecated: see function
  • section: see function
  • unused: see function

attribute of type

  • aligned: same as variable
  • packed: specifying this attribute for struct and union types is equivalent to specifying the packed attribute on each of the structure or union members
struct T1{
char a;
short b;
char c;
int d __attribute__((packed));
};
//sizeof(struct T1)==10

struct T2{
char a;
short b;
char c;
int d;
} __attribute__((packed)); // like each member __attribute__((packed))
//sizeof(struct T2)==8
  • deprecated: see variable.

typedef int T1 __attribute__((deprecated));

built-in function

  • the most/less significant bit
#include <stdio.h>
#include <stdlib.h>

int main()
{
    unsigned int val = 0b1100; // binary
    printf("there are %d zeros before the first 1 bit of 1100\n", __builtin_clz(val));
    printf("there are %d zeros after the first 1 bit of 1100\n", __builtin_ctz(val));

    return 0;
}

tips

should we add void to the parameter list of a function that takes no parameters?

Yes, it’s better to do that: when compiling with -Werror=strict-prototypes, the compiler reports an error if a function that takes no parameters is declared without void.

void hello(void);

Eliminate unused warning

In some cases you do not want to comment an unused variable out but still want to avoid the compiler warning, or you want to suppress the warning for an unused return value.

  • (void)unused_var;
  • (void)function();

use do{}while(0) if you want to use a block (with variables) in a macro

#include <stdio.h>

// group the macro body in a do/while block to limit the scope of variables
// defined in the macro, to avoid conflicts with the caller.
#define hello(msg) \
do { \
    int a = 12; \
    printf("%d %s", a, msg); \
} while(0)

int main() {
    hello("boy");
    return 0;
}

dynamic argument in macro

When the macro is invoked, all the tokens in its argument list [...], including any commas, become the variable argument. This sequence of tokens replaces the identifier __VA_ARGS__ in the macro body wherever it appears.

#include <stdio.h>

#define ENUMS(name, ...) \
    static const char *name[] = {__VA_ARGS__};

int main() {
    ENUMS(tests, "a", "b")
    printf("%s, %s\n", tests[0], tests[1]);
}

see assembly with source code

gcc -c -g -Wa,-a,-ad a.c > test.lst

show conversion warnings when compiling code

gcc -Wconversion

cacheline aligned

In some cases we want a variable (struct) to be cache-line aligned so that the CPU can read it in one go (the CPU reads memory one cache line at a time); it’s also better to put related fields together in one cache line (a and b are related if accessing a means b will be accessed soon). Check struct layout with pahole.

# install it from rpm or from source with below link
$ yum install dwarves

# usage
$ cat test.c
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

struct T1{
    char a;
    short b;
    char c;
    int d __attribute__((packed));
};

int main() {
    struct T1 t;
    return 0;
}

# must compile it with -g option.
$ gcc -o test -g test.c

# show all structs
$ pahole -V test

# show particular one
$ pahole -V -C T1 test
struct T1 {
    char a;                  /*  0  1 */

    /* XXX 1 byte hole, try to pack */

    short int b;             /*  2  2 */
    char c;                  /*  4  1 */
    int d;                   /*  5  4 */

    /* size: 10, cachelines: 1, members: 4 */
    /* sum members: 8, holes: 1, sum holes: 1 */
    /* padding: 1 */
    /* last cacheline: 10 bytes */
};

Two typical cache line size: 32 bytes or 64 bytes.

# get cache line size of given cpu
$cat /sys/devices/system/cpu/cpu0/cache/index1/coherency_line_size
$getconf LEVEL1_DCACHE_LINESIZE

bit shift

Most of the time, bit shifts are applied to unsigned integers.

left shift: always pads with 0

right shift:

  • unsigned operand: pads with 0
  • signed operand: pads with the sign bit (0 for positive, 1 for negative)

send/recv API differences

pairs of APIs for sending/receiving data.

'connected' socket means [STREAM, or a DGRAM socket that has called connect()]

  • read/readv: read into one buffer/several buffers on an fd [file-like API]
  • write/writev: write from one buffer/several buffers on a ‘connected’ fd [file-like API]
  • recv/send: one buffer, on a ‘connected’ socket
  • recvfrom/sendto: one buffer, on an ‘unconnected’ socket
  • recvmsg/sendmsg: several buffers, on a ‘connected’ socket

On the kernel side, all of them map to the kernel versions of sendmsg and recvmsg.

STREAM vs Datagram

Stream communication implies several things. Communication takes place across a connection between two sockets; it is reliable and error-free, and no message boundaries are kept: reading from a stream may return the data sent by one or several calls to write(), or only part of the data from a single call if there is not enough room for the entire message or if not all of a large message has been transferred. A protocol implementing this style retransmits messages received with errors, and it returns an error if one tries to send a message after the connection has been broken.

Datagram communication doesn’t use a connection; each message is addressed individually. If the address is correct, it will generally be received, although this is not guaranteed. Individual datagrams are kept separate when they are read; that is, message boundaries (at the user level, not the kernel level) are preserved.

TCP --- byte ------ stream ----- send/receive --- in order ------------------ may be partial
UDP --- message --- datagram --- send/receive --- not guaranteed (may lose) -- full message
Like TCP: sender: AA BB ---- recver: A AB B
     UDP: sender: AA BB ---- recver: BB AA

When calling sendto(), there are two limitations: one is the socket send buffer, the other is

MAX_MESSAGE_SIZE (the UDP payload) == 65507 == 65535 (max IP length) - 20 (IP header) - 8 (UDP header). An IP packet may be fragmented due to the MTU (1500), so the maximum UDP message you can send (not including the UDP header) is 65507. For UDP, fragmentation happens at the IP layer if needed (except when GSO is enabled, in which case fragmentation happens when handing off to the driver after going through the network stack). For TCP, there is no IP fragmentation on the sender machine, as TCP already splits data into smaller segments before calling into the IP layer (GSO/TSO is another case). If the socket send buffer cannot hold the message, sendto() blocks until the kernel sends some data and frees up room.

TCP bi-direction

A TCP connection is bi-directional: both sides can send and receive data. But you can close one direction, making one side a reader and the other a writer only. shutdown() allows you to stop data transmission in one direction while transmission in the other direction continues; for example, you can close the write side of a socket and keep receiving data on it until all data is read.

More control about TCP close

int shutdown (int sockfd, int how);

There are three ways of how:
SHUT_RD (0): Turn off the read function on sockfd. This option will not allow sockfd to read.
SHUT_WR (1): Turn off the write function of sockfd, this option will not allow sockfd to write.
SHUT_RDWR (2): Turn off the read and write function of sockfd.

add a wrapper for a function who defined in dynamic library

This is used when you need to add a wrapper around a function defined in a dynamic library (so you don’t have its source code), such as the C library or any other dynamic library.

  • step 1: create your own dynamic library
    1.1 write the wrapper

    void *malloc(size_t size)
    {
        void *ret;
        static void *(*real_malloc)(size_t size) = NULL;
        if (real_malloc == NULL) {
            real_malloc = dlsym(RTLD_NEXT, "malloc"); // function that will be wrapped
        }
        ret = real_malloc(size);
        // !!!!add trace here!!!!
        return ret;
    }

    1.2 compile it into wrapper.so

  • step 2: create your program as before; nothing looks different
    ptr = malloc(64);

  • step 3: load it; your dynamic library must be loaded before any other dynamic library!!!
    LD_PRELOAD=./wrapper.so ./program
    (wrapper.so is your dynamic library that wraps the other dynamic library, e.g. the C library)

NULL vs 0

When evaluated, the value of NULL is 0; if you assign NULL to a non-pointer, you get a warning, but the variable holds 0!

int a = NULL? 1: 2; //now a is 2!!!
if (NULL == 0) {
printf("equal");//printed
}

function pointer

void hello(void(*h)())
{
// h is function pointer var
h();
}

// BY is a function pointer type
typedef void (*BY)();

sizeof vs strlen

strlen returns the number of bytes before the terminating '\0', while sizeof is the compile-time size in bytes that the variable occupies.

char *p = "a"; //sizeof(p)==8 (x86-64), strlen(p)==1
int p[] = {1, 2, 3}; //sizeof(p)==12: here p is an array, not a pointer!!!
// char *p and char p[] behave the same when used as a parameter.
void test(char p[]) // same as char *p!!!
{
    printf("%zu\n", sizeof(p)); // 8 on x86-64: p decays to a pointer
}

what about char p[0] in a struct

struct prot {
    int a;
    char p[0]; //a label for a dynamically sized buffer allocated after a!!! takes no memory
    // struct prot *p = malloc(sizeof(struct prot) + 100);
};

sizeof(struct prot) == 4!!!

what about function in a struct

Defining a function body inside a struct is not valid C (that’s C++). In C you store a function pointer instead, and it does take space:

struct prot {
    int a;
    int (*hello)(void); // function pointer member
};

sizeof(struct prot) == 16 on x86-64 (4 for a, 4 padding, 8 for the pointer)!!!

shared variable between source(global variable)

  • define it at xxx.c

int global_v = 10;

  • export it in xxx.h; other .c files can include this header, or write extern int global_v themselves.
    extern int global_v;

Can we return a local pointer variable from a function

Actually you should NOT do this if the local pointer points to memory on the stack: after the function returns, that memory is reclaimed!!!

char *f1()
{
    char *p = "abc";
    return p; // "abc" is a string literal, so its memory is not on the stack!!
}

char *f2()
{
    char p[] = "cde"; // array on the stack
    return p; // returning the address of a local: undefined behavior!
}

int main()
{
    char *str;
    str = f1();
    printf("%s\n", str); // fine: str points to memory that is not on the stack!!!
    str = f2();
    printf("%s\n", str); // undefined behavior: may print garbage or crash!
    return 0;
}

type length and overflow

char c = 0x40; //64
char x = 64;
char b = 2;

printf("%d\n", (char)(x*b)); // the highest bit is the sign bit (0: +, 1: -), result: -128
printf("%d\n", c<<3); // result: 512, as %d promotes to int, like int tmp = c<<3
printf("%d\n", (char)(c<<3)); // truncated result: 0

what does volatile really mean

volatile is a keyword that prevents the compiler from optimizing (caching the value in a register, reordering etc) accesses to the described variable; its value can be changed at any time by code outside the scope of the current code. The system always reads the current value of a volatile object from its memory location rather than keeping it in a temporary register, even if a previous instruction already asked for the value of the same object. Note that volatile does not make accesses atomic and is not a substitute for proper synchronization.

it may be used to

  • describe an object corresponding to a memory-mapped input/output port(driver also does)
  • global object accessed by an asynchronously interrupting function(ISR, signal handler), ISR(Signal handler) + Thread
  • global object accessed by two threads Thread + Thread
// test.c
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
#include <unistd.h>

volatile int flag = 1;

void* thread1_func(void* args)
{
    printf("child thread %lu is running\n", pthread_self());
    while (1) {
        if (!flag) {
            printf("break happens, thread %lu exits\n", pthread_self());
            break;
        }
    }
    return NULL;
}

void* thread2_func(void* args)
{
    printf("child thread %lu is running\n", pthread_self());
    while (1) {
        sleep(10);
        printf("change flag to 0 in thread\n");
        flag = 0;
        break;
    }
    return NULL;
}

void install_signal(int signo, void (*handler)(int signo))
{
    struct sigaction act;
    bzero(&act, sizeof act);
    sigfillset(&act.sa_mask);
    act.sa_handler = handler;

    if (sigaction(signo, &act, NULL) != 0) {
        printf("install signal failed\n");
    }
}

void sig_ctrl_c(int signo)
{
    printf("thread %lu received sigint\n", pthread_self());
    printf("change flag to 0 in signal handler\n");
    flag = 0;
}

int main()
{
    pthread_t id1;
    pthread_t id2;

    install_signal(SIGINT, sig_ctrl_c);

    if (pthread_create(&id1, NULL, thread1_func, NULL) != 0) {
        printf("create thread failed\n");
    }

    if (pthread_create(&id2, NULL, thread2_func, NULL) != 0) {
        printf("create thread failed\n");
    }

    // join until both children exit
    pthread_join(id1, NULL);
    pthread_join(id2, NULL);

    return 0;
}
// MUST compile with -O3: without volatile, the optimizer caches flag
// in a register and the loop in thread1_func never exits
// $ gcc -o test test.c -lpthread -O3

ISR

An ISR often accesses ports/IO/device registers exported by a device through memory mapping; we must fetch such a value from memory, not from a cached register, because it can be changed by DMA outside the processor's control.

how dynamic library is searched when running

Old way: when a program starts, the dynamic loader checks the paths below when searching for a dynamic library

  • directories from LD_LIBRARY_PATH;
  • directories from /etc/ld.so.conf (used for additional library directories)
  • /lib64;
  • /usr/lib64.

But checking these paths one by one is slow, so to improve speed a cache is used, located at /etc/ld.so.cache, which maps each library name to its path for fast lookup.

So the new way: a dynamic library is searched as below

  • $LD_LIBRARY_PATH
  • /etc/ld.so.cache

ld.so.cache is generated by scanning /etc/ld.so.conf, /lib64 and /usr/lib64; ldconfig is the tool that does this, and it is invoked after each dynamic library is installed by yum/apt.

# if you manually remove or copy a library, you need to rebuild the cache
(base) [root@dev ]# ldconfig

# check /etc/ld.so.cache content
(base) [root@dev ]# ldconfig -p
538 libs found in cache `/etc/ld.so.cache'
p11-kit-trust.so (libc6,x86-64) => /lib64/p11-kit-trust.so
libzstd.so.1 (libc6,x86-64) => /lib64/libzstd.so.1
libz.so.1 (libc6,x86-64) => /lib64/libz.so.1
libz.so (libc6,x86-64) => /lib64/libz.so
libyaml-0.so.2 (libc6,x86-64) => /lib64/libyaml-0.so.2
libxtables.so.10 (libc6,x86-64) => /lib64/libxtables.so.10
libxshmfence.so.1 (libc6,x86-64) => /lib64/libxshmfence.so.1
libxml2.so.2 (libc6,x86-64) => /lib64/libxml2.so.2
libxdot.so.4 (libc6,x86-64) => /lib64/libxdot.so.4
libxcb.so.1 (libc6,x86-64) => /lib64/libxcb.so.1
libxcb.so (libc6,x86-64) => /lib64/libxcb.so
libxcb-xvmc.so.0 (libc6,x86-64) => /lib64/libxcb-xvmc.so.0
libxcb-xvmc.so (libc6,x86-64) => /lib64/libxcb-xvmc.so
libxcb-xv.so.0 (libc6,x86-64) => /lib64/libxcb-xv.so.0

how dynamic library is searched when compiling

pkg-config is a tool that reports the build dependencies of a library: it outputs the library's version, header paths and link flags, so that whoever uses the library can pass these to the compiler when building.

pkg-config gets all this information by reading xxx.pc files from several paths, so if a library wants to be managed by pkg-config, it must provide a xxx.pc file in one of those paths.
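
For illustration, a hypothetical foo.pc (the library name and paths are made up) showing the fields pkg-config reads:

```
# /usr/lib64/pkgconfig/foo.pc  (hypothetical example)
prefix=/usr
exec_prefix=${prefix}
libdir=${exec_prefix}/lib64
includedir=${prefix}/include

Name: foo
Description: example library managed by pkg-config
Version: 1.2.3
# pkg-config --libs foo   ->  -L/usr/lib64 -lfoo
Libs: -L${libdir} -lfoo
# pkg-config --cflags foo ->  -I/usr/include/foo
Cflags: -I${includedir}/foo
```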

# get default search paths for pkg-config
$pkg-config --variable pc_path pkg-config

# change search paths for pkg-config
$ export PKG_CONFIG_PATH=/usr/lib64/pkgconfig:/usr/share/pkgconfig:/new/path

# list all known packages
$pkg-config --list-all


##################### example ###########################
$pkg-config --modversion gnutls
3.3.29
$pkg-config --libs gnutls
-L/usr/lib64 -lgnutls
$pkg-config --cflags gnutls
-I/usr/include/p11-kit-1

REF