HTTP/1.1

来自WHY42
Riguz留言 | 贡献2023年12月5日 (二) 09:36的版本 →‎概要

Http1.1最初定义在Hypertext Transfer Protocol -- HTTP/1.1中,后面被Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing废除,因此当前HTTP1.1协议实际上包括:

  • RFC7230:Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing
  • RFC7231:Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content
  • RFC7232:Hypertext Transfer Protocol (HTTP/1.1): Conditional Requests
  • RFC7233:Hypertext Transfer Protocol (HTTP/1.1): Range Requests
  • RFC7234:Hypertext Transfer Protocol (HTTP/1.1): Caching
  • RFC7235:Hypertext Transfer Protocol (HTTP/1.1): Authentication

概要

Version 1.1 of HTTP was released in 1997, only one year after the previous version 1.0. HTTP 1.1 is an enhancement of HTTP 1.0, providing several extensions.

Among the most relevant enhancements, we can cite the following:

  • Host header: HTTP 1.0 does not officially require the host header. HTTP 1.1 requires it by the specification. The host header is specially important to route messages through proxy servers, allowing to distinguish domains that point to the same IP
  • Persistent connections: in HTTP 1.0, each request/response pair requires opening a new connection. In HTTP 1.1, it is possible to execute several requests using a single connection
  • Continue status: to avoid servers refusing unprocessable requests, now clients can first send only the request headers and check if they receive a continue status code (100)
  • New methods: besides the already available methods of HTTP 1.0, the 1.1 version added six extra methods: PUT, PATCH, DELETE, CONNECT, TRACE, and OPTIONS

In addition to the highlighted enhancements, there are many others introduced in version 1.1 of HTTP, such as compression and decompression, multi-language support, and byte-range transfers.

GET /hello.txt HTTP/1.1
User-Agent: curl/7.16.3 libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3
Host: www.example.com
Accept-Language: en, mi
HTTP/1.1 200 OK
Date: Mon, 27 Jul 2009 12:28:53 GMT
Server: Apache
Last-Modified: Wed, 22 Jul 2009 19:15:56 GMT
ETag: "34aa387-d-1568eb00"
Accept-Ranges: bytes
Content-Length: 51
Vary: Accept-Encoding
Content-Type: text/plain

Hello World! My payload includes a trailing CRLF.

HTTP消息格式

HTTP-message   = start-line
                 *( header-field CRLF )
                 CRLF
                 [ message-body ]

程序首先读取start-line以及header,然后根据header里面的内容决定是否含有body。body则按照长度读取定长字节。

start-line

start-line     = request-line / status-line
request-line   = method SP request-target SP HTTP-version CRLF
status-line    = HTTP-version SP status-code SP reason-phrase CRLF

其中:

  • HTTP中并未直接定义request-line的长度限制,倘若超过服务端实现所支持的长度,则建议返回501(Not implemented)
  • HTTP建议request-line支持至少8000字符长
  • 若request-target超过服务器预期则必须返回414 (URI Too Long)

header-field

header-field   = field-name ":" OWS field-value OWS

field-name     = token
field-value    = *( field-content / obs-fold )
field-content  = field-vchar [ 1*( SP / HTAB ) field-vchar ]
field-vchar    = VCHAR / obs-text

obs-fold       = CRLF 1*( SP / HTAB )

通常header-field是不应该重复,但有一些例外:

  • 若该header有多个值,则可以定义为多个或是按照逗号分隔值
  • 通常会出现多次,但因无法合并到一起,是一个例外

field-value通常只包含US-ASCII 字符集。

对于header的长度HTTP并未直接限制,实际通常根据header的含义在实现上有对应的限制。对于超出预期的请求,规定必须返回4xx错误而不是忽略,以防止 攻击。

http body

因为http的body并不是必须的,具体什么时候有body按照如下的规则:

  • 在request中,通过或者定义;
  • 在response中,通过请求的method以及响应的状态码确定。HEAD请求不包含body;对于CONNECT请求的2xx响应也没有body;1xx (Informational), 204 (No Content), 以及 304 (Not Modified)也均无body。所有其他response都具有body,即使其长度为0。

Transfer-Encoding

Transfer-Encoding用来表明body中的消息是如何编码的,例如

Transfer-Encoding: gzip, chunked

表示对body进行了gzip压缩并进行了分组。

transfer-coding    = "chunked" 
                   / "compress"
                   / "deflate"
                   / "gzip"
                   / transfer-extension
     transfer-extension = token *( OWS ";" OWS transfer-parameter )

Content-Length

如果不存在头,则可通过表明消息的长度。注意两者不允许同时出现。

Content-Length: 3495

Chunked Transfer Coding

Chunked enables content streams of unknown size to be transferred as a sequence of length-delimited buffers, which enables the sender to retain connection persistence and the recipient to know when it has received the entire message.

chunked-body   = *chunk
                 last-chunk
                 trailer-part
                 CRLF
chunk          = chunk-size [ chunk-ext ] CRLF
                 chunk-data CRLF
chunk-size     = 1*HEXDIG
last-chunk     = 1*("0") [ chunk-ext ] CRLF
chunk-data     = 1*OCTET ; a sequence of chunk-size octets

The chunk-size field is a string of hex digits indicating the size of the chunk-data in octets. The chunked transfer coding is complete when a chunk with a chunk-size of zero is received, possibly followed by a trailer, and finally terminated by an empty line.

其解码过程如下

length := 0
read chunk-size, chunk-ext (if any), and CRLF
while (chunk-size > 0) {
   read chunk-data and CRLF
   append chunk-data to decoded-body
   length := length + chunk-size
   read chunk-size, chunk-ext (if any), and CRLF
}
read trailer field
while (trailer field is not empty) {
   if (trailer field is allowed to be sent in a trailer) {
       append trailer field to existing header fields
   }
   read trailer-field
}
Content-Length := length
Remove "chunked" from Transfer-Encoding
Remove Trailer from existing header fields