http-protocol-faq
FAQs
Here is a list of FAQs that web beginners often ask, such as what Vary is, the difference between Content-Encoding and Transfer-Encoding, and what a conditional request is.
How the Vary response header is used
In some cases a server returns different responses for the same URI depending on information in the request headers. The names of those request headers are listed in the Vary response header. A typical example is a web server that returns different pages for mobile and desktop clients at the same URI: the response depends on the User-Agent request header, so the server puts User-Agent in Vary. When a caching proxy (nginx, for example, when it is configured to cache) receives the response, it stores it under the key (URI, User-Agent). When a later request arrives for the same URI but with a different User-Agent, the proxy must fetch a fresh response from the origin server rather than use the cached one, because the secondary key (User-Agent) does not match.
response depends on User-Agent and Cookie
Vary: User-Agent, Cookie
In short, it tells caches such as downstream proxies which request headers must match before a cached response can be reused instead of requesting a fresh one from the origin server.
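To make the secondary-key idea concrete, here is a minimal Python sketch of how a cache might build its lookup key. The function name `cache_key`, the in-memory `cache` dict and the browser names are illustrative assumptions, not how nginx or any real proxy is implemented, and the special value `Vary: *` is not handled.

```python
def cache_key(method, uri, request_headers, vary):
    """Build a cache key from the URI plus the request headers named in Vary."""
    # Header names are case-insensitive, so normalise them first.
    headers = {name.lower(): value.strip() for name, value in request_headers.items()}
    varied_names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    varied = tuple((name, headers.get(name, "")) for name in varied_names)
    return (method, uri, varied)


cache = {}  # hypothetical in-memory cache

# The origin answered the first request with "Vary: User-Agent".
mobile = {"User-Agent": "MobileBrowser/1.0", "Accept": "text/html"}
cache[cache_key("GET", "/index.html", mobile, "User-Agent")] = "<mobile page>"

# Same URI, different User-Agent: cache miss, the proxy must ask the origin.
desktop = {"User-Agent": "DesktopBrowser/1.0", "Accept": "text/html"}
print(cache_key("GET", "/index.html", desktop, "User-Agent") in cache)  # False

# Same URI and same User-Agent (headers not listed in Vary may differ): cache hit.
mobile_again = {"user-agent": "MobileBrowser/1.0", "Accept": "*/*"}
print(cache_key("GET", "/index.html", mobile_again, "User-Agent") in cache)  # True
```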
What q=0.6 means in a header value
Some request headers, such as Accept-Language and Accept, can carry multiple values. To tell the server which value to prefer, each value can carry a q parameter (relative quality factor). The server should pick the supported value with the highest q when it builds the response. Here is an example:
Accept-Language: en-us,en;q=0.5
en-us has no explicit q, so it uses the default q=1, which is higher than en (q=0.5). The server should therefore respond in en-us if it supports it, and fall back to en otherwise.
Rules
- Without an explicit q, the default is q=1
- A higher value means more preferred
- Value range is [0, 1]
- Format: value;q=0.5
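The rules above can be sketched in a few lines of Python. The helper names `parse_quality_list` and `pick_language` are hypothetical, and treating a malformed q as 0 is an assumption of this sketch rather than something the rules prescribe.

```python
def parse_quality_list(header_value):
    """Return (value, q) pairs sorted from most to least preferred."""
    items = []
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        value, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # default when no q parameter is given
        for param in params:
            name, _, raw = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(raw)
                except ValueError:
                    q = 0.0  # assumption: a malformed q means "not acceptable"
        items.append((value, max(0.0, min(1.0, q))))  # clamp to [0, 1]
    # sorted() is stable, so values with equal q keep their original order.
    return sorted(items, key=lambda item: item[1], reverse=True)


def pick_language(header_value, supported):
    """Pick the most preferred value that the server supports (lower-case set)."""
    for value, q in parse_quality_list(header_value):
        if q > 0 and value.lower() in supported:
            return value
    return None


print(parse_quality_list("en-us,en;q=0.5"))              # [('en-us', 1.0), ('en', 0.5)]
print(pick_language("en-us,en;q=0.5", {"en-us", "en"}))  # en-us
print(pick_language("en-us,en;q=0.5", {"en"}))           # en
```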
How the ETag is generated
ETag (a response header) is an identifier for a specific version of a resource (mostly used for static resources). It is often a hash of the resource, or a value the web server derives from attributes of the resource (size, inode, modification time, etc.).
ETag: "737060cd8c284d8af7ad3082f209582d"
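Both approaches can be sketched in Python. The function names and the exact tag layout are assumptions for illustration; real servers each use their own format, and SHA-256 is chosen here only because it is always available in `hashlib`.

```python
import hashlib
import os
import tempfile


def etag_from_attributes(path):
    """Cheap ETag built from file size and modification time (no file read)."""
    st = os.stat(path)
    return '"{:x}-{:x}"'.format(st.st_size, st.st_mtime_ns)


def etag_from_content(path):
    """ETag built from a hash of the bytes; changes only when the content changes."""
    with open(path, "rb") as f:
        return '"{}"'.format(hashlib.sha256(f.read()).hexdigest())


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "page.html")
    with open(path, "wb") as f:
        f.write(b"<h1>version 1</h1>")
    before = etag_from_content(path)

    with open(path, "wb") as f:
        f.write(b"<h1>version 2</h1>")
    after = etag_from_content(path)

    print(before != after)             # True: new content, new ETag
    print(etag_from_attributes(path))  # value depends on the local file system
```

The attribute-based tag is fast but changes whenever the file is touched or moved; the content-based tag costs a full read but stays stable across such moves.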
Headers related to resuming from a break point
To support resuming, the client must be able to request a partial range of the content, and the server must support partial requests. Three headers are involved: two response headers, Accept-Ranges and Content-Range, and one request header, Range.
- Accept-Ranges indicates whether the server supports partial requests
- Range indicates which range the client wants to get
- Content-Range indicates which range the server is sending to the client
Server declares that it supports byte ranges
Accept-Ranges: bytes
Client requests a range
Range: bytes=500-999
Server sends the content of that range (with status 206 Partial Content)
Content-Range: bytes 500-999/1234
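The exchange above can be sketched as a small Python function. `serve_range` is a hypothetical helper that handles only a single byte range; falling back to the full body for anything it does not understand is an assumption of this sketch.

```python
def serve_range(body, range_header):
    """Return (status, headers, payload) for a single 'bytes=first-last' range."""
    total = len(body)
    unit, _, spec = range_header.partition("=")
    first, _, last = spec.strip().partition("-")
    try:
        if unit.strip().lower() != "bytes" or "," in spec:
            raise ValueError("unsupported unit or multi-range request")
        if first == "":
            # Suffix form "bytes=-N": the last N bytes of the resource.
            start, end = max(total - int(last), 0), total - 1
        else:
            start = int(first)
            # An open-ended or too-large end is capped at the last byte.
            end = min(int(last), total - 1) if last else total - 1
    except ValueError:
        # Assumption: ignore a Range header we cannot handle and send everything.
        return 200, {"Accept-Ranges": "bytes"}, body
    if start >= total or start > end:
        return 416, {"Content-Range": f"bytes */{total}"}, b""
    headers = {
        "Accept-Ranges": "bytes",
        "Content-Range": f"bytes {start}-{end}/{total}",
    }
    return 206, headers, body[start:end + 1]


resource = b"x" * 1234
status, headers, payload = serve_range(resource, "bytes=500-999")
print(status, headers["Content-Range"], len(payload))  # 206 bytes 500-999/1234 500
```

A client that already has the first 500 bytes would resume with `Range: bytes=500-` and append the payload to what it has.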
Expires vs Cache-Control (max-age value) header
Cache-Control was introduced in HTTP/1.1 and offers more options than Expires. They can be used to accomplish the same thing.
The value of Expires is an HTTP date, whereas Cache-Control max-age lets you specify a relative amount of time, so you could specify 'X hours after the page was requested'.
If a response includes a Cache-Control field with the max-age directive, a recipient MUST ignore the Expires field.
Cache-Control: max-age=3600
Expires: Tue, 18 Jul 2017 16:07:23 GMT
Prefer Cache-Control, as it offers more options.
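The precedence rule can be sketched in Python as follows. `freshness_lifetime` is a hypothetical helper; it assumes headers arrive as a plain dict and treats an unparseable date as already expired.

```python
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Seconds a response may be served from cache, or None if unspecified."""
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        value = value.strip().strip('"')
        if name.strip().lower() == "max-age" and value.isdigit():
            return int(value)  # max-age wins, Expires is ignored
    if "Expires" in headers and "Date" in headers:
        try:
            expires = parsedate_to_datetime(headers["Expires"])
            date = parsedate_to_datetime(headers["Date"])
        except (TypeError, ValueError):
            return 0  # assumption: an invalid date means "already expired"
        return max(0, int((expires - date).total_seconds()))
    return None


both = {
    "Date": "Tue, 18 Jul 2017 15:07:23 GMT",
    "Cache-Control": "max-age=600",
    "Expires": "Tue, 18 Jul 2017 16:07:23 GMT",
}
print(freshness_lifetime(both))  # 600

expires_only = {
    "Date": "Tue, 18 Jul 2017 15:07:23 GMT",
    "Expires": "Tue, 18 Jul 2017 16:07:23 GMT",
}
print(freshness_lifetime(expires_only))  # 3600
```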
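Ways to make a conditional request
A conditional request means the client sends a condition to the server. The server checks the condition: if it holds, the server processes the request as usual and sends the resource; otherwise it sends only headers with a special status code (304 Not Modified when a cached copy is still valid, 412 Precondition Failed when If-Match or If-Unmodified-Since fails).
Date-based (the older way)
If-Modified-Since (already in HTTP/1.0) and If-Unmodified-Since (added in HTTP/1.1), where the client sends a timestamp of the resource.
ETag-based (HTTP/1.1)
If-Match and If-None-Match, where the client sends an ETag of the resource.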
Difference:
Dates can be ordered, ETags cannot.
This means that if a resource was modified a year ago but never since, and we know it, then we can correctly answer an If-Unmodified-Since request for arbitrary dates within the last year and agree that sure… it has been unmodified since that date.
An ETag is only comparable for identity: either it is the same or it is not. Take the same resource as above, and suppose that during the year the docroot was moved to a new disk and filesystem, giving all files new inodes but preserving modification dates, and that someone had based the ETags on the file's inode number. Then we can't say that the old ETag is still okay without keeping a log of past still-okay ETags.
So I don't see one as obsoleting the other. They are for different situations. Either you can easily get a Last-Modified date for all the data in the page you're about to serve, or you can easily get an ETag for what you will serve.
If you have a dynamic web page with data from lots of db lookups, it might be difficult to tell what the Last-Modified date is without making your database contain lots of modification dates. But you can always make an md5 checksum of the rendered page.
When supporting these cache protocols I definitely go for only one of them, never both.
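As a rough illustration of the two validators, the Python sketch below decides between 200 and 304 for a GET. `evaluate_get` is a hypothetical function; it splits the ETag list on commas (an assumption that no tag contains one) and lets If-None-Match take precedence when both headers are present, as the HTTP specification requires.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def strip_weak(tag):
    """Drop the W/ prefix so weak and strong tags compare equal."""
    return tag[2:] if tag.startswith("W/") else tag


def evaluate_get(request_headers, current_etag, last_modified):
    """Return 304 if the client's cached copy is still valid, else 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        candidates = [strip_weak(t.strip()) for t in if_none_match.split(",")]
        # ETags are compared for identity only: same or not the same.
        if "*" in candidates or strip_weak(current_etag) in candidates:
            return 304
        return 200  # If-Modified-Since is ignored when If-None-Match is present
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        # Dates can be ordered: anything not newer than the client's copy is fine.
        if last_modified <= since:
            return 304
    return 200


modified = datetime(2017, 7, 18, 16, 7, 23, tzinfo=timezone.utc)
etag = '"737060cd8c284d8af7ad3082f209582d"'

print(evaluate_get({"If-None-Match": etag}, etag, modified))       # 304
print(evaluate_get({"If-None-Match": '"stale"'}, etag, modified))  # 200
print(evaluate_get({"If-Modified-Since": "Wed, 19 Jul 2017 00:00:00 GMT"}, etag, modified))  # 304
```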
TE and Transfer-Encoding header
The TE request header specifies the transfer encodings the user agent is willing to accept. (You could informally call it Accept-Transfer-Encoding, which would be more intuitive.) chunked is always acceptable to an HTTP/1.1 recipient, so it does not need to be listed.
TE: trailers
The Transfer-Encoding response header specifies the form of encoding used to safely transfer the payload body to the user.
# chunked, only for HTTP/1.1
When chunked is used
With regard to chunked encoding, there is one important response header, Trailer. It allows the sender to include additional header fields at the end of a chunked message in order to supply metadata that might be dynamically generated while the message body is sent, such as a message integrity check, digital signature, or post-processing status.
Note: The TE request header needs to be set to "trailers" to allow trailer fields.
Chunked encoding is useful when larger amounts of data are sent to the client and the total size of the response may not be known until the request has been fully processed. For example, when generating a large HTML table resulting from a database query or when transmitting large images.
Data is sent in a series of chunks. The Content-Length header is omitted in this case. Each chunk starts with the length of the current chunk in hexadecimal format, followed by '\r\n', then the chunk itself, followed by another '\r\n'. The terminating chunk is a regular chunk, with the exception that its length is zero. It is followed by the trailer, which consists of a (possibly empty) sequence of entity header fields, and a final '\r\n'.
A chunked response looks like this:
HTTP/1.1 200 OK
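The framing described in this section is easier to follow in code. Below is a minimal Python sketch of encoding and decoding a chunked body; `encode_chunked` and `decode_chunked` are hypothetical helpers, the decoder skips trailer fields instead of returning them, and the `Checksum` trailer name is made up for the demo.

```python
def encode_chunked(chunks, trailers=None):
    """Frame an iterable of byte strings as a chunked message body."""
    out = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would end the body early
        out += b"%x\r\n" % len(chunk)  # chunk size in hexadecimal
        out += chunk + b"\r\n"
    out += b"0\r\n"  # terminating chunk
    for name, value in (trailers or {}).items():
        out += f"{name}: {value}\r\n".encode("ascii")
    out += b"\r\n"  # blank line that ends the (possibly empty) trailer
    return bytes(out)


def decode_chunked(data):
    """Recover the payload from a chunked body (trailer fields are skipped)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Ignore optional chunk extensions after ';'.
        size = int(data[pos:line_end].split(b";")[0].decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


wire = encode_chunked([b"Wiki", b"pedia"])
print(wire)                  # b'4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n'
print(decode_chunked(wire))  # b'Wikipedia'
print(encode_chunked([b"hi"], {"Checksum": "abc"}))  # b'2\r\nhi\r\n0\r\nChecksum: abc\r\n\r\n'
```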
Accept-Encoding and Content-Encoding
The Content-Encoding entity header indicates that the body has been encoded, usually compressed. When present, its value indicates which encodings were applied to the entity-body. It lets the client know how to decode the body in order to obtain the media type referenced by the Content-Type header.
The recommendation is to compress data as much as possible and therefore to use this field, but some types of resources, such as JPEG images, are already compressed.
The Accept-Encoding request HTTP header advertises which content encodings, usually compression algorithms, the client is able to understand.
Accept-Encoding: gzip
Note: the browser decompresses the payload and shows the uncompressed web page to the user.
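A minimal Python sketch of this negotiation follows. `encode_body` is a hypothetical server-side helper that only knows gzip; anything else is sent unencoded.

```python
import gzip


def encode_body(body, accept_encoding):
    """Compress with gzip when the client lists it with q > 0, else send as is."""
    for item in accept_encoding.split(","):
        coding, _, params = item.strip().partition(";")
        q = 1.0
        name, _, raw = params.strip().partition("=")
        if name.strip().lower() == "q":
            try:
                q = float(raw)
            except ValueError:
                q = 0.0  # assumption: a malformed q means "not acceptable"
        if coding.strip().lower() == "gzip" and q > 0:
            return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}  # no Content-Encoding header: the body is not encoded


page = b"<p>hello</p>" * 500

payload, headers = encode_body(page, "gzip, deflate")
print(headers, len(payload) < len(page))  # {'Content-Encoding': 'gzip'} True

# The client (a browser, say) reverses the encoding named in Content-Encoding.
print(gzip.decompress(payload) == page)   # True

payload, headers = encode_body(page, "identity")
print(headers, payload == page)           # {} True
```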
Content-Encoding vs Transfer-Encoding
Content-Encoding describes how the content itself is encoded. For example, if a web page is 100 KB, it is better to compress it with gzip to reduce the payload. Once the server has the encoded data (either stored that way or compressed by the server itself), it may decide to transfer the gzip data (Content-Encoding: gzip) in chunked format (Transfer-Encoding: chunked). The two apply at different levels: Transfer-Encoding is hop-by-hop, so it may change as the message passes through proxies, while Content-Encoding is end-to-end, so proxies normally leave the payload untouched.
Without Content-Encoding, assume the body is uncompressed. Without Transfer-Encoding, assume the body is not chunked; in that case a message that has a body should carry Content-Length.
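The two layers can be demonstrated with a short Python sketch: the gzip payload is produced once at the origin, while the chunk framing is removed and re-applied with a different chunk size at a proxy. The helper names `chunk` and `unchunk` and the chunk sizes are illustrative assumptions.

```python
import gzip


def chunk(data, size):
    """Apply chunked framing with a fixed chunk size."""
    pieces = [data[i:i + size] for i in range(0, len(data), size)]
    framed = b"".join(b"%x\r\n%s\r\n" % (len(p), p) for p in pieces)
    return framed + b"0\r\n\r\n"


def unchunk(framed):
    """Remove chunked framing (no trailer support in this sketch)."""
    body, pos = bytearray(), 0
    while True:
        eol = framed.index(b"\r\n", pos)
        size = int(framed[pos:eol].decode("ascii"), 16)
        if size == 0:
            return bytes(body)
        start = eol + 2
        body += framed[start:start + size]
        pos = start + size + 2  # skip the data and its trailing CRLF


page = b"<p>hello</p>" * 500

# Origin: Content-Encoding: gzip (end-to-end), Transfer-Encoding: chunked (this hop).
compressed = gzip.compress(page)
hop1 = chunk(compressed, 16)

# Proxy: strips the framing and re-chunks for the next hop; the payload is untouched.
at_proxy = unchunk(hop1)
hop2 = chunk(at_proxy, 7)

# Client: removes the framing of its own hop, then undoes the content encoding.
received = unchunk(hop2)
print(hop1 != hop2)                       # True: the framing differs per hop
print(received == compressed)             # True: the encoded payload is identical
print(gzip.decompress(received) == page)  # True
```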
Methods: PUT (whole update) vs POST (new) vs PATCH (partial update)
The POST method is used to submit an entity to the specified resource, often causing a change in state or side effects on the server. Use it when you plan to create a new object. If you send it many times to the same URI, many new objects may be created with the same value.
POST /questions
The PUT method replaces all current representations of the target resource with the request payload, or creates a new one if none is found. Use it when you plan to replace. If you send it many times to the same URI, only one object is created, because a PUT must provide an identity.
PUT /questions/{question-id}
The PATCH method is used to apply partial modifications to a resource.
method PATCH POST PUT
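The difference in repeat behaviour can be shown with a toy in-memory store in Python. `QuestionStore` and its methods are hypothetical stand-ins for handlers behind POST /questions, PUT /questions/{question-id} and PATCH /questions/{question-id}.

```python
import itertools


class QuestionStore:
    """Toy resource collection illustrating POST, PUT and PATCH semantics."""

    def __init__(self):
        self.items = {}
        self._ids = itertools.count(1)

    def post(self, body):
        """POST /questions: the server picks a new id on every call."""
        new_id = next(self._ids)
        self.items[new_id] = dict(body)
        return new_id

    def put(self, question_id, body):
        """PUT /questions/{id}: replace the whole object, or create it."""
        self.items[question_id] = dict(body)

    def patch(self, question_id, changes):
        """PATCH /questions/{id}: change only the given fields."""
        self.items[question_id].update(changes)


store = QuestionStore()

# Repeating a POST creates a new object each time.
store.post({"title": "What is Vary?", "votes": 0})
store.post({"title": "What is Vary?", "votes": 0})
print(len(store.items))  # 2

# Repeating a PUT leaves exactly one object under the given id.
store.put(42, {"title": "What is ETag?", "votes": 0})
store.put(42, {"title": "What is ETag?", "votes": 0})
print(len(store.items))  # 3

# PATCH touches only the fields it names.
store.patch(42, {"votes": 5})
print(store.items[42])   # {'title': 'What is ETag?', 'votes': 5}
```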
A POST request is typically sent via an HTML form and results in a change on the server. In this case, only three content types are supported:
- application/x-www-form-urlencoded: the keys and values are encoded as key-value tuples separated by '&', with a '=' between the key and the value. Non-alphanumeric characters in both keys and values are percent-encoded, which is why this type is not suitable for binary data (use multipart/form-data instead).
POST /test HTTP/1.1
- multipart/form-data: each value is sent as a block of data ("body part"), with a user agent-defined delimiter ("boundary") separating each part. The keys are given in the Content-Disposition header of each part.
- text/plain
When the POST request is sent by a method other than an HTML form, such as an XMLHttpRequest from a script, the body can take any type.
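The urlencoded format is easy to inspect with Python's standard library. The field names and values below are made up for illustration.

```python
from urllib.parse import parse_qs, urlencode

fields = {"name": "Ann Lee", "note": "a&b=c"}

# Keys and values are joined with '=' and pairs with '&'; reserved characters
# inside a value are percent-encoded so they cannot be mistaken for separators.
body = urlencode(fields)
print(body)            # name=Ann+Lee&note=a%26b%3Dc
print(parse_qs(body))  # {'name': ['Ann Lee'], 'note': ['a&b=c']}

# Binary data works but is wasteful: each of these three bytes becomes three
# characters, which is why multipart/form-data is preferred for files.
print(urlencode({"blob": bytes([0, 255, 16])}))  # blob=%00%FF%10
```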
GET VS HEAD
The GET method requests a representation of the specified resource. Requests using GET should only retrieve data.
The HEAD method asks for a response identical to that of a GET request, but without the response body.
In short, HEAD is the same as GET but no body is returned.
OPTIONS method
The OPTIONS method is used to describe the communication options for the target resource.
The client can specify a URL for the OPTIONS method, or an asterisk (*) to refer to the entire server.
Identifying allowed request methods
$ curl -X OPTIONS http://example.org -i
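The same check can be done from Python. The sketch below starts a throwaway local server so that it runs without network access; the `Handler` class, the `allowed_methods` helper and the Allow value it returns are assumptions for the demo, not what example.org actually answers.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    """Demo server that answers OPTIONS with a hypothetical Allow header."""

    def do_OPTIONS(self):
        self.send_response(204)
        self.send_header("Allow", "OPTIONS, GET, HEAD, POST")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


def allowed_methods(host, port, target="/"):
    """Send OPTIONS and return the methods listed in the Allow header."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("OPTIONS", target)
        response = conn.getresponse()
        response.read()
        allow = response.getheader("Allow", "")
    finally:
        conn.close()
    return [m.strip() for m in allow.split(",") if m.strip()]


server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
print(allowed_methods("127.0.0.1", server.server_port))  # ['OPTIONS', 'GET', 'HEAD', 'POST']
server.shutdown()
server.server_close()
```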
How to create a long-lived HTTP connection
To open a connection that stays open until it is closed explicitly:
- WebSocket
- Send an HTTP request with Transfer-Encoding: chunked, but never send the terminating chunk (0\r\n followed by a final \r\n)
In both cases, the server does not close the connection after the client gets a response. The connection closes only when the client closes it or, in the chunked case, when the client sends the terminating chunk 0\r\n\r\n to the server.