c-language-basic

C

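  • Declaring and defining a variable are different things in C: a declaration allocates no storage and only tells the compiler that the variable is defined elsewhere, while a definition allocates storage for the variable
  • C has no reference types
  • C has const, but const cannot qualify a function itself; it can only qualify variables and parameters
  • C does not support function overloading: different functions must have different names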
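As a minimal sketch of the declaration/definition distinction described above (the names request_count and bump_request_count are hypothetical, chosen only for illustration):

```c
/* Declaration only: no storage is allocated; it promises a definition elsewhere. */
extern int request_count;

/* Declaration (prototype) of a function that is defined further down. */
int bump_request_count(void);

/* Definition: this is where the storage for request_count is allocated. */
int request_count = 0;

/* Definition of the function declared above. */
int bump_request_count(void)
{
    return ++request_count;
}
```

In a real project the two declarations would usually live in a header and the two definitions in exactly one .c file.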

printf

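%o    unsigned octal integer (char, short and int; use %lo for long)
%x    unsigned hexadecimal integer (char, short and int; use %lx for long)
%d    signed decimal (works for signed char, short and int)
%u    unsigned decimal (works for unsigned char, unsigned short and unsigned int)
%ld   signed long
%lu   unsigned long
%f    floating point (double; a float argument is promoted to double)
%c    character: prints the character itself, not its numeric value
%s    string

// There is no bare %l conversion: l is only a length modifier (%ld, %lu, ...).
// Printing fixed-width types such as uint64_t:
#define __STDC_FORMAT_MACROS   // only needed by some older C++ toolchains
#include <inttypes.h>
#include <stdio.h>

uint64_t i = 10;
int64_t n = 7;

printf("%" PRId64 "\n", n);
printf("%" PRIu64 "\n", i);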

C99 printf format
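The PRId64/PRIu64 macros work with the whole printf family. The sketch below uses snprintf so the result can be inspected; the helper names format_u64 and format_i64 are made up for this example.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Formats v as decimal into buf; returns the length that snprintf reports. */
int format_u64(char *buf, size_t size, uint64_t v)
{
    return snprintf(buf, size, "%" PRIu64, v);
}

/* Same for a signed 64-bit value. */
int format_i64(char *buf, size_t size, int64_t v)
{
    return snprintf(buf, size, "%" PRId64, v);
}
```

The macros expand to the correct length modifier for the platform, so the same source works whether int64_t is long or long long.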

pointer

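  • void* is an untyped pointer that converts to and from any other object pointer type, but standard C does not allow pointer arithmetic on it
  • Converting between different pointer types is dangerous: it can lose data and lead to unintended memory accesses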
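A small sketch of both points, assuming the caller really passes an array of int (sum_ints and advance_bytes are hypothetical helper names):

```c
#include <stddef.h>

/* Receives the data as void* and converts it back to its real type. */
long sum_ints(const void *data, size_t count)
{
    const int *values = data;   /* implicit void* -> int* conversion is allowed in C */
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* To step by bytes, convert to unsigned char* first, because standard C
 * does not define arithmetic on void*. */
const void *advance_bytes(const void *p, size_t nbytes)
{
    return (const unsigned char *)p + nbytes;
}
```

If the caller passed something other than an int array, sum_ints would read the memory with the wrong type, which is exactly the danger the second bullet warns about.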

pointer pointer

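int a, b, c;
int *p[3];   // array of three pointers to int
p[0] = &a;
p[1] = &b;
p[2] = &c;

typedef void (*ftype)(int a);   // ftype is a pointer-to-function type
void hello(int a);
ftype pfun = hello;

// Pointer to a function that accepts any arguments
#ifdef __cplusplus
typedef void (*PTR)(...);
#else
// In C (before C23) an empty parameter list means "unspecified arguments",
// so this looks the same as a pointer to a function with no parameters
typedef void (*PTR)();
#endif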
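To show a function pointer type in use, here is a minimal sketch; binop, add, mul, apply and apply_by_code are illustrative names, not part of the original notes.

```c
/* binop is a pointer-to-function type: takes two ints, returns int. */
typedef int (*binop)(int, int);

static int add(int a, int b) { return a + b; }
static int mul(int a, int b) { return a * b; }

/* Calls whichever function the pointer refers to. */
int apply(binop op, int a, int b)
{
    return op(a, b);
}

/* A table of function pointers indexed by an operation code (0 = add, 1 = mul). */
int apply_by_code(int code, int a, int b)
{
    static const binop table[] = { add, mul };
    return table[code](a, b);
}
```

A dispatch table like this is a common alternative to a long switch statement; the caller is responsible for passing a valid code.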

const with pointer

const int a = 12;      // defines a read-only variable
int b = 0;
const char *p1;        // same as "char const *p1": the chars p1 points to cannot be changed through p1, but p1 may point elsewhere
int *const p2 = &b;    // p2 cannot be changed after initialization (so it must be initialized at its definition), but *p2 can be changed

pointer operation

![](https://cyun.tech/images/C/pointer_ops.png)

int a[2];
int *p = a;      // p + 1 == (address of a) + 4, assuming sizeof(int) == 4
int **pp = &p;   // pp + 1 == (address of p) + 8, assuming sizeof(int *) == 8

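p + 1 is the address of the next element. Because p is an int *, the next element lives sizeof(int) bytes after the previous one.

Subtracting two pointers yields the number of elements of that type between the two addresses, not the number of bytes.

int a[3];
// &a[2] - &a[0] == 2, not 2 * sizeof(int)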
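The difference between counting elements and counting bytes can be sketched as follows (element_distance and byte_distance are hypothetical names; both pointers are assumed to point into the same array):

```c
#include <stddef.h>

/* Number of int elements between two pointers into the same array. */
ptrdiff_t element_distance(const int *from, const int *to)
{
    return to - from;
}

/* Number of bytes between the same two pointers: subtract them as char pointers. */
ptrdiff_t byte_distance(const int *from, const int *to)
{
    return (const char *)to - (const char *)from;
}
```

For &a[0] and &a[2] the first function returns 2 and the second returns 2 * sizeof(int).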

NULL and 0

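A null pointer and the integer 0 are different things, but when 0 is assigned to or compared with a pointer variable, the compiler converts it to a null pointer (NULL).
In other words, each pair of statements below compiles to the same result.

int *p = NULL;

// equivalent tests
if (p != NULL) { /* ... */ }
if (p != 0)    { /* ... */ }

// equivalent assignments
p = 0;
p = NULL;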

integer type

On major 32-bit platforms:

  • int is 32 bits
  • long is 32 bits as well
  • long long is 64 bits

On major 64-bit platforms:

  • int is 32 bits
  • long is either 32 or 64 bits
  • long long is 64 bits as well

Explicit type


  • int32_t, int64_t, int16_t, int8_t
  • uint32_t, uint64_t, uint16_t, uint8_t (the standard names from <stdint.h>; u_int32_t and friends are older BSD-style spellings)

range

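Signed types have a sign bit; unsigned types do not.
Types differ in size, so the range of values each can hold is limited, and for a given type the range also depends on whether it is signed. Typical sizes and ranges of some common types:

Type    Size (bytes)    Unsigned range      Signed range
char    1               0 to 255            -128 to 127
short   2               0 to 65535          -32768 to 32767
int     4               0 to 4294967295     -2147483648 to 2147483647

Signedness changes the range a type can represent. More seriously, the compiler may apply implicit type conversions in comparisons and assignments.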

what is the reason for explicitly declaring L or UL for long values

When no L or UL suffix is used, the compiler gives the constant the first type from a list that can hold its value (see the C99 standard, clause 6.4.4:5; for a decimal constant, the list is int, long, long long).

As a consequence, the suffix is unnecessary most of the time: it does not change the meaning of the program. It does not change the meaning of a plain initialization of a long variable on most architectures, although it would for a number that could not be represented as a long long.

There are a couple of circumstances when the programmer may want to set the type of the constant explicitly. One example is when using a variadic function:

printf("%lld", 1LL); // correct
printf("%lld", 1);   // undefined behavior: 1 has type int, but %lld expects long long

A common reason to use a suffix is ensuring that the result of a computation doesn’t overflow. Two examples are:

long x = 10000L * 4096L;
unsigned long long y = 1ULL << 36;

In both examples, without suffixes, the constants would have type int and the computation would be made as int. In each example this incurs a risk of overflow. Using the suffixes means that the computation will be done in a larger type instead, which has sufficient range for the result.

As Lightness Races in Orbit puts it, the literal’s suffix comes before the assignment. In the two examples above, simply declaring x as long and y as unsigned long long is not enough to prevent the overflow in the computation of the expressions assigned to them.
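The same idea wrapped in two small functions, as a sketch (bit_mask and pages_to_bytes are hypothetical names, and 4096 is just an example page size):

```c
/* Returns a value with only bit n set (n must be below 64).
 * 1ULL makes the shift happen in unsigned long long; a plain 1 would shift an int. */
unsigned long long bit_mask(unsigned n)
{
    return 1ULL << n;
}

/* The LL suffix forces the multiplication into long long, so the product
 * of two int-sized values cannot overflow int. */
long long pages_to_bytes(int pages)
{
    return pages * 4096LL;
}
```

Dropping either suffix would leave the return type unchanged but perform the computation in int, which is where the overflow would happen.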

Another example is the comparison x < 12U where variable x has type int. Without the U suffix, the compiler types the constant 12 as an int, and the comparison is therefore a comparison of signed ints.

int x = -3;
printf("%d\n", x < 12); // prints 1 because it's true that -3 < 12

With the U suffix, the comparison becomes a comparison of unsigned ints. “Usual arithmetic conversions” mean that -3 is converted to a large unsigned int:

printf("%d\n", x < 12U); // prints 0 because (unsigned int)-3 is large

In fact, the type of a constant may even change the result of an arithmetic computation, again because of the way “usual arithmetic conversions” work.

Integral Promotion

(char, short and enum are promoted to int or unsigned int)

A character, a short integer, or an integer bit-field, all either signed or not, or an object of enumeration type, may be used in an expression wherever an integer may be used. If an int can represent all the values of the original type, then the value is converted to int; otherwise the value is converted to unsigned int. This process is called integral promotion.

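In short: a signed or unsigned char, short, bit-field or enum is promoted whenever it is used where an integer is expected, and whether the result is int or unsigned int depends on whether int can represent every value of the original type.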

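short a = 0x7fff;
short b = 2;
short c = (a * b) / (short)2;   // a * b is computed as int
                                // (every intermediate value is int or unsigned int)
printf("%x\n", c);              // prints 7fff

// compile with: gcc -o test a.c -g -Wconversion
short a = 1;
short b = 2;
short c = a + b;   // a + b has type int; storing it in a short is an implicit conversion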
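A sketch of why the promotion matters, using unsigned char so the effect is visible with small numbers (add_wide and add_narrow are hypothetical names, and CHAR_BIT == 8 is assumed):

```c
/* a + b is evaluated as int, so 200 + 100 really is 300 here. */
int add_wide(unsigned char a, unsigned char b)
{
    return a + b;
}

/* Converting the int result back to unsigned char keeps only the low 8 bits:
 * 300 becomes 300 - 256 = 44. */
unsigned char add_narrow(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}
```

The arithmetic itself never overflows in either function; the value is lost only at the conversion back to the narrow type.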

Integral Conversions

Any integer is converted to a given unsigned type by finding the smallest non-negative value that is congruent to that integer, modulo one more than the largest value that can be represented in the unsigned type. In a two’s complement representation, this is equivalent to left-truncation if the bit pattern of the unsigned type is narrower, and to zero-filling unsigned values and sign-extending signed values if the unsigned type is wider.
When any integer is converted to a signed type, the value is unchanged if it can be represented in the new type and is implementation-defined otherwise.

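Integer (short, char, int, unsigned, long) conversions

  • Converting to a signed type: if the value can be represented in the new type, it is unchanged. If it is too large, the result is implementation-defined (not undefined);
    in practice the low-order bits are kept, the top bit is then treated as the sign bit, and the value follows from the two's complement rules.
  • Converting to an unsigned type: if the new type is wider, signed values are sign-extended and unsigned values are zero-filled. If the new type is narrower, the leftmost (high-order) bits are discarded and the low-order bytes are kept, regardless of big-endian or little-endian.
    For example:
    unsigned char var = 0xff12; // 0x12 on both big-endian and little-endian machines
    // the result has no sign bit; the remaining bits are read as the new value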
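The three cases for conversion to an unsigned type can be sketched like this (to_u8, widen_signed and widen_unsigned are hypothetical names; CHAR_BIT == 8 is assumed):

```c
/* Narrowing to unsigned: the value is reduced modulo 256,
 * so only the low-order byte survives. */
unsigned char to_u8(int value)
{
    return (unsigned char)value;
}

/* Widening a signed value to unsigned: -1 sign-extends to all one bits. */
unsigned int widen_signed(signed char value)
{
    return (unsigned int)value;
}

/* Widening an unsigned value: zero-filled, so the value is unchanged. */
unsigned int widen_unsigned(unsigned char value)
{
    return (unsigned int)value;
}
```

All three conversions are fully defined by the standard; only conversion of an out-of-range value to a signed type is implementation-defined.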

Arithmetic Conversions

Many operators cause conversions and yield result types in a similar way. The effect is to bring operands into a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions.

The target type is decided by applying the following rules in order:

  • First, if either operand is long double, the other is converted to long double.
  • Otherwise, if either operand is double, the other is converted to double.
  • Otherwise, if either operand is float, the other is converted to float.

Integer operands must be brought to the same type:

  • Otherwise, the integral promotions are performed on both operands; then, if either operand is unsigned long int, the other is converted to unsigned long int.
  • Otherwise, if one operand is long int and the other is unsigned int, the effect depends on whether a long int can represent all values of an unsigned int; if so, the unsigned int operand is converted to long int; if not, both are converted to unsigned long int.
  • Otherwise, if one operand is long int, the other is converted to long int.
  • Otherwise, if either operand is unsigned int, the other is converted to unsigned int.
  • Otherwise, both operands have type int.

examples

When a signed and an unsigned operand of the same width meet in an arithmetic operation or a comparison, the signed operand is converted to unsigned.

#include <stdio.h>

int main(void)
{
    int a = -1;
    if (a > 12u) {          // 12u has type unsigned int
        // a is converted to unsigned int and becomes a very large value
        printf("greater\n");
    }

    // arithmetic on two unsigned values yields an unsigned value
    unsigned int c = 1;
    unsigned int d = 2;
    if (c - d > 0) {        // 0 is converted to unsigned, and c - d wraps around
        // so c - d > 0 does not prove that c is greater than d
        printf("greater\n");
    }

    int e = -1;
    unsigned short f = 1;
    if (e < f) {            // f is promoted to int, so this is a signed comparison
        printf("e is smaller than f\n");
    }
    return 0;
}

Output:

greater
greater
e is smaller than f
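One way to avoid the surprises above is to handle the sign explicitly before comparing. This is only a sketch; naive_less, safe_less and unsigned_greater are hypothetical names.

```c
/* What the compiler does implicitly for "a < b" with an int and an unsigned int:
 * a is converted to unsigned int first, so -1 becomes a huge value. */
int naive_less(int a, unsigned int b)
{
    return (unsigned int)a < b;
}

/* Sign-aware comparison: a negative int is smaller than any unsigned value. */
int safe_less(int a, unsigned int b)
{
    return a < 0 || (unsigned int)a < b;
}

/* c > d is the reliable test; c - d > 0 is true whenever c != d. */
int unsigned_greater(unsigned int c, unsigned int d)
{
    return c > d;
}
```

With a == -1 and b == 12, naive_less returns 0 while safe_less returns 1.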

constant

Ways to define

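  • character constants
  • string constants
  • constants defined with enum (their type is int); can be used to size an array
  • constants defined with #define (numbers or strings); can be used to size an array
  • numbers and strings defined with const (these are really read-only variables, so they can NOT be used as a constant array size)
    const char *p = "hello";
    enum day { mon = 2, tus = 1, wed };
    #define A 12
    #define B "hello"
    enum { size = 10 };
    const int c_size = 10; // c_size is really a variable with a read-only attribute
    int main()
    {
        int a[size];   // right: size is an integer constant expression
        int b[c_size]; // wrong in C89; C99 accepts it only as a variable-length array
    }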

type of constant

  • integer constant: the type of an integer constant depends on its form, value and suffix. The lists below are the C89 rules; C99 extends each list with long long int and unsigned long long int, and no longer picks an unsigned type for a decimal constant that lacks a u suffix
    • If it is unsuffixed and decimal, it has the first of these types in which its value can be represented: int, long int, unsigned long int

    • If it is unsuffixed and octal or hexadecimal, it has the first possible of these types: int, unsigned int, long int, unsigned long int

    • If it is suffixed by u or U, then unsigned int, unsigned long int

    • If it is suffixed by l or L, then long int, unsigned long int

    • If it is suffixed by UL, it is unsigned long

  • An enumeration constant has type int
  • A character constant has type int in C (it is char only in C++)

variable

static var

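  • Initialized to 0 by default when no initializer is given
  • Storage for a static variable lives in the data segment if the user initializes it, otherwise in the BSS segment, and it is released only when the program exits
  • If it is defined inside a function, it is not re-initialized on repeated calls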
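A minimal sketch of the first and third points (next_id and always_one are hypothetical names):

```c
/* The static local is initialized once (to 0 by default) and keeps its value between calls. */
int next_id(void)
{
    static int id;   /* no initializer: starts at 0 */
    return ++id;
}

/* An ordinary (automatic) local is created afresh on every call. */
int always_one(void)
{
    int id = 0;
    return ++id;
}
```

Successive calls to next_id return 1, 2, 3 and so on, while always_one returns 1 every time.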

var address

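Variables have types, and different types have different sizes. When a variable occupies several bytes (which are necessarily contiguous), which byte's address stands for the address of the variable? The natural choice is the byte with the lowest address.

int a;

If the address of a is A, then to read the contents of every slot (one byte per slot) you have to access it byte by byte. The way to do that is to point a char* at a:

char *p = (char *)&a;

This makes it possible to access each byte of the int individually.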
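Byte-wise access is also the usual way to find out the byte order of the machine. The sketch below uses unsigned types to avoid sign surprises; byte_at and is_little_endian are hypothetical names.

```c
#include <stddef.h>

/* Returns byte i of an unsigned int, counting from its lowest address. */
unsigned char byte_at(const unsigned int *value, size_t i)
{
    const unsigned char *bytes = (const unsigned char *)value;
    return bytes[i];
}

/* Returns 1 on a little-endian machine, where the least significant byte
 * sits at the lowest address. */
int is_little_endian(void)
{
    unsigned int probe = 1;
    return byte_at(&probe, 0) == 1;
}
```

The caller must keep i below sizeof(unsigned int); the function does not check it.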

struct

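Intuitively, the size of a struct is the sum of the sizes of its members. For fast access, however, the hardware expects each type to live at a regular address: for example, the virtual address of a short must be a multiple of 2 and that of an int a multiple of 4. The compiler therefore pads the struct, which makes it larger; this wastes some space but speeds up memory access.

Every type has an alignment modulus k: the address of any variable of that type must be a multiple of k. For a basic type, k is usually the size of the type; for a struct, k is usually the largest alignment modulus among its (basic-type) members. Alignment and addresses are closely tied.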

x86-64

padding

struct ms {
    char c;
    // padding here
    int b;
};

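The address of a struct ms is guaranteed to be a multiple of max(sizeof(char), sizeof(int)) == 4, but each member must also meet its own alignment requirement. The compiler therefore inserts 3 bytes of padding between the two members so that the address of b is a multiple of 4, and sizeof(struct ms) == 8.

struct ms1 {
    int b;
    char c;   // the alignment modulus of each type is what matters
    // padding here
};

By the same analysis every member is already correctly aligned, so is sizeof(struct ms1) == 5? It is not. The standard requires that an array of any type (including user-defined struct types) occupies exactly the size of one element multiplied by the number of elements, so the elements of an array are tightly adjacent.

struct ms1 array[2];

With a 5-byte struct, b in the second element could not sit at a multiple of 4, so the struct needs padding at its tail.
How much? After the members have been aligned, the size is rounded up to the next multiple of the alignment modulus if it is not one already. The size of a struct is therefore always a multiple of its alignment modulus, and here sizeof(struct ms1) == 8.

The alignment modulus can be configured in the compiler, so the size of a struct depends on the operating system and the compiler.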
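The layout can be checked rather than assumed, using offsetof and sizeof. The sketch repeats the two structs from above; the accessor function names are hypothetical.

```c
#include <stddef.h>

struct ms  { char c; int b; };   /* padding expected between c and b */
struct ms1 { int b; char c; };   /* padding expected after c */

size_t ms_offset_of_b(void) { return offsetof(struct ms, b); }
size_t ms_size(void)        { return sizeof(struct ms); }
size_t ms1_size(void)       { return sizeof(struct ms1); }

/* The distance between consecutive array elements always equals sizeof the element. */
size_t ms1_array_stride(void)
{
    struct ms1 array[2];
    return (size_t)((char *)&array[1] - (char *)&array[0]);
}
```

On a typical x86-64 build these return 4, 8, 8 and 8, but the exact numbers depend on the platform and compiler settings.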

bit-field members

On common ABIs a bit-field may not straddle two storage units of its declared type (the exact layout is implementation-defined).

struct test {
    int a:30;
    int b:3;   // does not fit in the 2 bits left after a, so it starts in the next 4 bytes
};
// sizeof(struct test) == 8

struct test2 {
    int a:30;
    int b:1;   // fits in the same 4 bytes as a
};
// sizeof(struct test2) == 4

function

argument passing

Every parameter is a copy of the corresponding argument.

#include <stdio.h>

void print(char *str)
{
    str++;   // str is a separate variable (its address differs from p's) that also points to "hello"
    // *str = *(str + 1); would write to read-only memory and typically cause a segmentation fault
}

int main(void)
{
    char *p = "hello";   // p points to the string literal "hello"
    print(p);
    printf("%s\n", p);   // prints hello: p itself is unchanged
    return 0;
}

Output:

hello
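If the callee is meant to change the caller's pointer, it has to receive the address of that pointer. A minimal sketch (skip_copy and skip_in_place are hypothetical names):

```c
/* Receives a copy of the pointer: advancing it has no effect on the caller. */
void skip_copy(const char *s)
{
    s++;
    (void)s;   /* the change is local to this function */
}

/* Receives the address of the caller's pointer, so the caller's pointer itself moves. */
void skip_in_place(const char **s)
{
    (*s)++;
}
```

The pointer-to-pointer is still passed by value; what changes is that the copy now refers to the caller's variable.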

variable parameters

Whenever a function takes an indeterminate number of arguments, put an ellipsis ('...') in place of the last parameter. For example, int a_function(int x, ...); tells the compiler that the function accepts however many arguments the caller passes, as long as there is at least one: the first, x.

  • va_start, which initializes the list

    va_start is a macro that takes two arguments: a va_list and the name of the parameter that directly precedes the ellipsis ('...').

  • va_arg, which returns the next argument in the list

    va_arg takes a va_list and a type, and returns the next argument in the list as that type. It then moves on to the next argument. Note that you need to know the type of each argument; that is part of why printf requires a format string.

  • va_end, which cleans up the variable argument list.

#include <stdarg.h>
#include <stdio.h>

/* This function takes the number of values to average,
   followed by all of the numbers to average. */

// The first parameter can be of any type, as long as you have some way to know
// how many arguments were passed; see the second example below.
double average(int num, ...)
{
    va_list arguments;
    double sum = 0;

    /* Initialize arguments to walk all values after num */
    va_start(arguments, num);   // num is the last named parameter, not the name of the list
    /* Sum all the inputs; we still rely on the caller to tell us how many there are */
    for (int x = 0; x < num; x++) {
        sum += va_arg(arguments, double);   // the type of each argument must be known
    }
    va_end(arguments);   // cleans up the list

    return sum / num;
}

int main(void)
{
    /* computes the average of 12.2, 22.3 and 4.5 (3 is the number of values to average) */
    printf("%f\n", average(3, 12.2, 22.3, 4.5));
    /* computes the average of the 5 values 3.3, 2.2, 1.1, 5.5 and 3.3 */
    printf("%f\n", average(5, 3.3, 2.2, 1.1, 5.5, 3.3));
    return 0;
}

Output:

13.000000
3.080000

#include <stdarg.h>
#include <stdio.h>

void print(const char *format, ...)
{
    va_list arguments;
    va_start(arguments, format);

    int ival;
    char *sval;

    // supports two-character conversions only, such as %d and %s
    while (*format) {
        if (*format == '%') {
            format++;
            switch (*format) {
            case 'd':
                ival = va_arg(arguments, int);
                printf("%d\n", ival);
                break;
            case 's':
                sval = va_arg(arguments, char *);
                printf("%s\n", sval);
                break;
            }
        }
        format++;
    }
    va_end(arguments);   // cleans up the list
}

int main(void)
{
    print("%d %s", 12, "hello");
    return 0;
}

Output:

12
hello
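A common follow-up is forwarding the variable arguments to another function, which is what the v-prefixed printf variants are for. The sketch below uses the standard vsnprintf; format_into and sum_ints_va are hypothetical names.

```c
#include <stdarg.h>
#include <stdio.h>

/* Formats into buf like snprintf, forwarding the variable arguments as a va_list. */
int format_into(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);   /* fmt is the last named parameter */
    int written = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return written;
}

/* Sums count int arguments; the caller must pass exactly count ints. */
int sum_ints_va(int count, ...)
{
    va_list args;
    int total = 0;
    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}
```

A variadic function cannot pass its "..." on directly; it can only hand over the va_list, which is why wrappers like format_into call vsnprintf rather than snprintf.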

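operator precedence and associativity

Parentheses come first, unary operators second,
* / % third, + - fourth,
shifts fifth, relational operators sixth, equality operators seventh,
bitwise operators eighth (in the order & ^ |, followed by the logical && and ||), the conditional ?: ninth,
and assignment and the comma operator come last.
Unary operators associate right to left, binary operators left to right;
?: (like assignment) also associates right to left.

*s++, *s--: read the value first, then move the pointer forward or backward
*++s, *--s: move the pointer first, then read the value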

NOTE

When in doubt, use parentheses to make the intent clear.
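Two small cases where precedence decides the meaning, as a sketch (string_length and has_no_flags are hypothetical names):

```c
#include <stddef.h>

/* *s++ parses as *(s++): read the current character, then advance the pointer. */
size_t string_length(const char *s)
{
    size_t n = 0;
    while (*s++) {
        n++;
    }
    return n;
}

/* == binds tighter than &, so the parentheses around (flags & mask) are required. */
int has_no_flags(unsigned flags, unsigned mask)
{
    return (flags & mask) == 0;
}
```

Without the parentheses, flags & mask == 0 would compare mask with 0 first and then AND the result with flags.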

bit operation

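Left shift << and right shift >> move the sign bit along with every other bit, so a positive signed number can turn negative. &, | and ~ apply to every bit of a number, so they do not treat the sign bit specially.

To use &, | and ~ you therefore need to know how numbers are represented in the machine:

  • For an unsigned number, the two's complement representation is simply its plain binary value
  • For a positive signed integer, the two's complement representation is also its plain binary value
  • For a negative integer, take the plain binary value of its absolute value, invert every bit, then add 1; the result is the two's complement representation of the negative number

To keep things simple, in practice apply &, | and ~ only to unsigned numbers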

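bit shift

  • The operands must be integers; integral promotion is applied to them first.

The result is undefined if the right operand is negative, or greater than or equal to the number of bits in the left expression's type. The type of the result is that of the promoted left operand.

A shift is a bit operation: every bit moves, including the sign bit. A left shift always fills with 0. A right shift depends on signedness: for an unsigned number it fills with 0; for a signed number it may fill with the sign bit (what most implementations do) or with 0. The type of a shift expression is the promoted type of its left operand.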

& and |

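  • The operands must be integers; integral promotion is applied to them first.

  • & is bitwise AND; its most common use is clearing selected bits to 0

  • | is bitwise OR; its most common use is setting selected bits to 1

  • ~ is bitwise NOT: it inverts every bit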
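The usual set/clear/test idioms, sketched on unsigned values as recommended above. All function names are hypothetical, and the first three assume n is below the width of unsigned int.

```c
#include <limits.h>

/* | with a one-bit mask sets bit n. */
unsigned set_bit(unsigned x, unsigned n)   { return x | (1u << n); }

/* & with the inverted mask clears bit n. */
unsigned clear_bit(unsigned x, unsigned n) { return x & ~(1u << n); }

/* Shift bit n down to position 0 and mask everything else off. */
int test_bit(unsigned x, unsigned n)       { return (int)((x >> n) & 1u); }

/* Guards against an out-of-range shift count, which would otherwise be undefined. */
unsigned shift_left_checked(unsigned x, unsigned n)
{
    return n < sizeof(unsigned) * CHAR_BIT ? x << n : 0u;
}
```

Returning 0 for an oversized count is a design choice made for this sketch; other callers may prefer to treat it as an error.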

FAQ

what does include header do

#include copies the entire contents of the header into the .c file that includes it at build time. To avoid copying the same header more than once into one .c file, write the header with an include guard like this:

#ifndef IPNET_CONFIG_H
#define IPNET_CONFIG_H
/* contents */
#endif

Can we define (not just declare) variables and functions in a header file, given that it is copied into each .c file?

Yes, you can, but it is unusual, because it causes conflicting or multiple definitions unless you handle it carefully.

example which causes multiple definition

/* test.h */
#ifndef TEST_H
#define TEST_H
void say() {
}
#endif

/* a.c */
#include "test.h"
/* test.h is copied in when building a.o, because TEST_H is not yet defined in this file */

/* b.c */
#include "test.h"
/* test.h is copied in again when building b.o, for the same reason */

Makefile

program: a.o b.o
# linking fails with a multiple definition error, because both a.o and b.o define say()

Another example

/* test.h */
#ifndef TEST_H
#define TEST_H
void say() {
}
#endif

/* a.c */
void say(int b) {
}

/* b.c */
#include "test.h"   /* test.h is copied in here */

Makefile

program: a.o b.o
# linking fails: say() is defined in both a.o and b.o, and the two definitions even differ

Guideline

  • If you define a variable or function in a header, make it static and keep the function short

evaluation order

Function calls, nested assignment statements, and increment and decrement operators cause side effects: some variable is changed as a by-product of evaluating an expression. In any expression involving side effects, there can be subtle dependencies on the order in which the variables taking part in the expression are updated.

The order of evaluation is unspecified for the operands of arithmetic operators (such as +, -, *, /) and for function arguments.

operand

C does not specify the order in which the operands of an operator are evaluated (the exceptions are &&, ||, ?: and ','). For example, in a statement like

x = f() + g();

f may be evaluated before g or vice versa. If either f or g alters a variable on which the other depends, the value of x depends on the order of evaluation and cannot be predicted.

function argument

The order in which function arguments are evaluated is not specified either, so the statement below is unreliable.

printf("%d %d\n", ++n, power(2, n)); /* WRONG */

It can produce different results with different compilers, depending on whether n is incremented before power is called.
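The usual fix is to give the side effect its own statement. A minimal sketch, where power is a simple stand-in for the function used in the example above and increment_then_power is a hypothetical helper:

```c
/* Raises base to the n-th power (n >= 0). */
int power(int base, int n)
{
    int result = 1;
    for (int i = 0; i < n; i++) {
        result *= base;
    }
    return result;
}

/* Increment first, in its own statement, then use the new value:
 * the result no longer depends on how arguments are evaluated. */
int increment_then_power(int *n)
{
    ++*n;
    return power(2, *n);
}
```

Applied to the printf call, this means writing ++n; on its own line and then passing n and power(2, n) as arguments.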

++i and i++

i++ is post-increment and ++i is pre-increment. Post-increment yields the previous value of the object; pre-increment yields the incremented value. Either way, the increment has taken effect by the time the full expression has been evaluated, before any following statement runs.

, and ? expression

The value of a comma expression is the value of its last operand. The value of a ?: expression is its second operand if the condition is true, and its third operand otherwise.

d = (a, b, c);
e = a ? b : c;

NOTE

  • The comma operator has the lowest precedence of all, followed by assignment and then ?:; all three bind less tightly than + and -, which in turn bind less tightly than *, / and %
  • The comma operator evaluates its operands from left to right

int a = 1;
int b = 2;
int c;
c = a, b;     // parsed as (c = a), b; so c == a
c = (a, b);   // c == b

c = a > b ? 10 : 20;

c = a > b ? 10 : 20 + 30;     // + binds tighter than ?:, so c == 20 + 30
c = a > b ? 10 : (20 + 30);   // c == 20 + 30, same as above

c = 2 + a > b ? 10 : 20;      // + binds tighter than >, so this is NOT 2 + (a > b ? 10 : 20)
c = (2 + a) > b ? 10 : 20;    // same as the line above
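The same rules expressed as functions, so each case can be checked in isolation (pick and last_of_three are hypothetical names):

```c
/* + binds tighter than ?:, so this is a > b ? 10 : (20 + 30). */
int pick(int a, int b)
{
    return a > b ? 10 : 20 + 30;
}

/* The comma operator evaluates left to right and yields its last operand. */
int last_of_three(int a, int b, int c)
{
    return ((void)a, (void)b, c);
}
```

The (void) casts only mark the discarded operands as intentional; they do not change the result.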
