HTTP/1.1

来自WHY42

Http1.1最初定义在Hypertext Transfer Protocol -- HTTP/1.1中,后面被Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing废除,因此当前HTTP1.1协议实际上包括:

  • RFC7230:Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing
  • RFC7231:Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content
  • RFC7232:Hypertext Transfer Protocol (HTTP/1.1): Conditional Requests
  • RFC7233:Hypertext Transfer Protocol (HTTP/1.1): Range Requests
  • RFC7234:Hypertext Transfer Protocol (HTTP/1.1): Caching
  • RFC7235:Hypertext Transfer Protocol (HTTP/1.1): Authentication

HTTP 1.1 vs HTTP 1.0

Version 1.1 of HTTP was released in 1997, only one year after the previous version 1.0. HTTP 1.1 is an enhancement of HTTP 1.0, providing several extensions.

Among the most relevant enhancements, we can cite the following[1]:

Host header

HTTP 1.0 does not officially require the host header. HTTP 1.1 requires it by the specification. The host header is specially important to route messages through proxy servers, allowing to distinguish domains that point to the same IP

GET /hello.txt HTTP/1.1 Host: www.example.com

This header is useful because it allows you to route a message through proxy servers, and also because your web server can distinguish between different sites on the same server.

Persistent connections

In HTTP 1.0 you had to open a new connection for each request/response pair. And after each response the connection would be closed. This lead to some big efficiency problems because of TCP Slow Start.

In HTTP 1.1, it is possible to execute several requests using a single connection. HTTP persistent connection, also called HTTP keep-alive, or HTTP connection reuse, is the idea of using a single TCP connection to send and receive multiple HTTP requests/responses, as opposed to opening a new connection for every single request/response pair. The newer HTTP/2 protocol uses the same idea and takes it further to allow multiple concurrent requests/responses to be multiplexed over a single connection.

In HTTP 1.1, all connections are considered persistent unless declared otherwise.

New methods

besides the already available methods of HTTP 1.0, the 1.1 version added six extra methods: PUT, PATCH, DELETE, CONNECT, TRACE, and OPTIONS.

HTTP/1.1 introduces the OPTIONS method. An HTTP client can use this method to determine the abilities of the HTTP server. It's mostly used for Cross Origin Resource Sharing in web applications.

CONNECT

The HTTP CONNECT method starts two-way communications with the requested resource. It can be used to open a tunnel.

For example, the CONNECT method can be used to access websites that use TLS (HTTPS). The client asks an HTTP Proxy server to tunnel the TCP connection to the desired destination. The server then proceeds to make the connection on behalf of the client. Once the connection has been established by the server, the Proxy server continues to proxy the TCP stream to and from the client[2].

a web client uses CONNECT only when it knows it talks to a proxy and the final URI begins with https://.

CONNECT www.example.com:443 HTTP/1.1

OPTIONS

The HTTP OPTIONS method requests permitted communication options for a given URL or server. A client can specify a URL with this method, or an asterisk (*) to refer to the entire server

OPTIONS /index.html HTTP/1.1
OPTIONS * HTTP/1.1
curl -X OPTIONS https://riguz.com -i
HTTP/1.1 200 OK
Date: Tue, 05 Dec 2023 14:05:01 GMT
Server: Apache/2.4.41 (Ubuntu)
Allow: POST,OPTIONS,HEAD,GET
Content-Length: 0
Content-Type: text/html

In CORS, a preflight request is sent with the OPTIONS method so that the server can respond if it is acceptable to send the request.

OPTIONS /resources/post-here/ HTTP/1.1
Host: bar.example
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Connection: keep-alive
Origin: https://foo.example
Access-Control-Request-Method: POST
Access-Control-Request-Headers: X-PINGOTHER, Content-Type
HTTP/1.1 200 OK
Date: Mon, 01 Dec 2008 01:15:39 GMT
Server: Apache/2.0.61 (Unix)
Access-Control-Allow-Origin: https://foo.example
Access-Control-Allow-Methods: POST, GET, OPTIONS
Access-Control-Allow-Headers: X-PINGOTHER, Content-Type
Access-Control-Max-Age: 86400
Vary: Accept-Encoding, Origin
Keep-Alive: timeout=2, max=100
Connection: Keep-Alive

100 Continue status

There is a new return code in HTTP/1.1 100 Continue. This is to prevent a client from sending a large request when that client is not even sure if the server can process the request, or is authorized to process the request. In this case the client sends only the headers, and the server will tell the client 100 Continue, go ahead with the body.

Caching

HTTP 1.0 had support for caching via the header: If-Modified-Since.

HTTP 1.1 expands on the caching support a lot by using something called 'entity tag'. If 2 resources are the same, then they will have the same entity tags.

HTTP 1.1 also adds the If-Unmodified-Since, If-Match, If-None-Match conditional headers.

There are also further additions relating to caching like the Cache-Control header.


In addition to the highlighted enhancements, there are many others introduced in version 1.1 of HTTP, such as compression and decompression, multi-language support, and byte-range transfers.

HTTP消息格式

GET /hello.txt HTTP/1.1
User-Agent: curl/7.16.3 libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3
Host: www.example.com
Accept-Language: en, mi
HTTP/1.1 200 OK
Date: Mon, 27 Jul 2009 12:28:53 GMT
Server: Apache
Last-Modified: Wed, 22 Jul 2009 19:15:56 GMT
ETag: "34aa387-d-1568eb00"
Accept-Ranges: bytes
Content-Length: 51
Vary: Accept-Encoding
Content-Type: text/plain

Hello World! My payload includes a trailing CRLF.
HTTP-message   = start-line
                 *( header-field CRLF )
                 CRLF
                 [ message-body ]

程序首先读取start-line以及header,然后根据header里面的内容决定是否含有body。body则按照长度读取定长字节。

start-line

start-line     = request-line / status-line
request-line   = method SP request-target SP HTTP-version CRLF
status-line    = HTTP-version SP status-code SP reason-phrase CRLF

其中:

  • HTTP中并未直接定义request-line的长度限制,倘若超过服务端实现所支持的长度,则建议返回501(Not implemented)
  • HTTP建议request-line支持至少8000字符长
  • 若request-target超过服务器预期则必须返回414 (URI Too Long)

header-field

header-field   = field-name ":" OWS field-value OWS

field-name     = token
field-value    = *( field-content / obs-fold )
field-content  = field-vchar [ 1*( SP / HTAB ) field-vchar ]
field-vchar    = VCHAR / obs-text

obs-fold       = CRLF 1*( SP / HTAB )

通常header-field是不应该重复,但有一些例外:

  • 若该header有多个值,则可以定义为多个或是按照逗号分隔值
  • 通常会出现多次,但因无法合并到一起,是一个例外

field-value通常只包含US-ASCII 字符集。

对于header的长度HTTP并未直接限制,实际通常根据header的含义在实现上有对应的限制。对于超出预期的请求,规定必须返回4xx错误而不是忽略,以防止 攻击。

http body

因为http的body并不是必须的,具体什么时候有body按照如下的规则:

  • 在request中,通过或者定义;
  • 在response中,通过请求的method以及响应的状态码确定。HEAD请求不包含body;对于CONNECT请求的2xx响应也没有body;1xx (Informational), 204 (No Content), 以及 304 (Not Modified)也均无body。所有其他response都具有body,即使其长度为0。

Transfer-Encoding

Transfer-Encoding用来表明body中的消息是如何编码的,例如

Transfer-Encoding: gzip, chunked

表示对body进行了gzip压缩并进行了分组。

transfer-coding    = "chunked" 
                   / "compress"
                   / "deflate"
                   / "gzip"
                   / transfer-extension
     transfer-extension = token *( OWS ";" OWS transfer-parameter )

Content-Length

如果不存在头,则可通过表明消息的长度。注意两者不允许同时出现。

Content-Length: 3495

Chunked Transfer Coding

Chunked enables content streams of unknown size to be transferred as a sequence of length-delimited buffers, which enables the sender to retain connection persistence and the recipient to know when it has received the entire message.

chunked-body   = *chunk
                 last-chunk
                 trailer-part
                 CRLF
chunk          = chunk-size [ chunk-ext ] CRLF
                 chunk-data CRLF
chunk-size     = 1*HEXDIG
last-chunk     = 1*("0") [ chunk-ext ] CRLF
chunk-data     = 1*OCTET ; a sequence of chunk-size octets

The chunk-size field is a string of hex digits indicating the size of the chunk-data in octets. The chunked transfer coding is complete when a chunk with a chunk-size of zero is received, possibly followed by a trailer, and finally terminated by an empty line.

其解码过程如下

length := 0
read chunk-size, chunk-ext (if any), and CRLF
while (chunk-size > 0) {
   read chunk-data and CRLF
   append chunk-data to decoded-body
   length := length + chunk-size
   read chunk-size, chunk-ext (if any), and CRLF
}
read trailer field
while (trailer field is not empty) {
   if (trailer field is allowed to be sent in a trailer) {
       append trailer field to existing header fields
   }
   read trailer-field
}
Content-Length := length
Remove "chunked" from Transfer-Encoding
Remove Trailer from existing header fields