
Notes About HTTP

Guest on 19th August 2022 04:40:06 PM

  1. Notes About HTTP
  2. HTTP Basics
  3. HTTP is the Hypertext Transfer Protocol.  The basic scenario is as follows.
  4. Server opens a socket and waits for a connection at some well-known port number (generally 80).
  5. Client connects to the server at that port on that host.
  6. Client sends a request down the socket to the server.
  7. Server sends a reply.
  8. Server closes the connection.
  9. If the client wants more than one item from a server, it must open a new connection for each item requested.  HTTP/1.1 allows multiple requests per connection, but was not yet widely implemented when these notes were written.  Most clients can still fetch several items at once, even with HTTP versions earlier than 1.1, by opening multiple simultaneous connections.
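The connect / request / reply / close sequence above can be sketched with plain sockets. This is only an illustration, not a real web server: the names `serve_once`, `fetch` and `demo` are made up for this example, the server listens on a port chosen by the operating system rather than port 80, and it always answers with the same tiny body.

```python
import socket
import threading


def serve_once(listener):
    """Accept one connection, read the request, send a reply, then close."""
    conn, _addr = listener.accept()
    with conn:
        conn.recv(4096)  # the request is small, so a single recv is enough here
        body = b"hello\n"
        head = b"HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body)
        conn.sendall(head + body)
    # Leaving the with-block closes the connection, as an HTTP/1.0 server does.


def fetch(host, port, path):
    """Connect, send one GET request, and read until the server closes."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    """Run one server and one client on the local machine."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()
    reply = fetch("127.0.0.1", port, "/index.html")
    server.join()
    listener.close()
    return reply
```

Calling `demo()` returns the raw bytes of the reply. A client that wants a second item would have to call `fetch` again, which opens a second connection.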
  10.  
  11. HTTP Caching
  12. Clients don't normally just request files.  Instead, they check the cache first, and request a file from the server only if it is not found in the cache.  The cache is a store of recently requested files.  Sometimes the client will verify the cache contents with the server.  This incurs the latency penalty, but not the transfer penalty.  Verification is done with the HEAD method or with a GET carrying the If-Modified-Since modifier (see below).
  13. When caching works it has several advantages:
  14.  
  15. Improves response time
  16. Reduces network load
  17. Reduces server load
  18. Improves performance of OTHER clients and OTHER requests
  19. Caching does have several disadvantages:
  20. Slows response time when it fails
  21. Makes hit counts hard to measure
  22. Takes substantial disk/memory resources
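The check-the-cache-first behaviour can be sketched as follows. This is a minimal illustration with made-up names (`SimpleCache`, and a `fetch` callable standing in for the real network request); it never expires or verifies entries, which a real client cache would need to do.

```python
class SimpleCache:
    """Keep recently requested files in memory, keyed by path."""

    def __init__(self, fetch):
        self.fetch = fetch  # callable that really asks the server (assumed)
        self.store = {}     # path -> file contents
        self.hits = 0
        self.misses = 0

    def get(self, path):
        if path in self.store:
            # Cache hit: no network traffic, and the server never sees it.
            self.hits += 1
            return self.store[path]
        # Cache miss: pay the full latency and transfer cost once.
        self.misses += 1
        data = self.fetch(path)
        self.store[path] = data
        return data
```

If the same path is requested twice, `fetch` runs only once. That is the saving in network and server load, and also the reason hit counts measured at the server come out too low.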
  23. HTTP Proxies
  24. A proxy is a server/client combination that sits between the original server and the original client.  In other words, the picture changes from the diagram on the left below to the one on the right.  Proxies are useful for implementing network security, for (sometimes) improving performance, and for solving some network addressing/routing problems.  Most clients do not use proxies, however.
  25.         Client <----> Server         Client <----> Proxy <-----> Server
  26. One common use of proxies is to put one at each gateway to cache files and so reduce traffic across the network.  One study I read said that if all interior Internet gateways had a proxy server, total Internet traffic could be reduced by 30%.
  27. HTTP Requests
  28. Requests go from the client to the server, and ask the server to perform some service.  Each request starts with a method, followed by a resource-indicator (generally a filename) and a protocol-version.  Optionally, there can be one or more modifiers.  There are three main methods, used in the examples below.
  29. GET /index.html HTTP/1.0          retrieve the meta-data and the body of /index.html
  30. HEAD /robots.txt HTTP/1.0         retrieve only the meta-data of /robots.txt
  31. PUT /my/secret/file HTTP/1.0      create or modify the file on the server
  32. All requests can have one or more modifiers.  Examples include...
  33. If-Modified-Since: Sat, 29 Oct 1994 19:43:21 GMT
  34. Content-Length: 3472
  35. Authorization: Basic Qwxyehsuzjehgsoiznshyebsn
  36. The If-Modified-Since modifier tells the server to send the data only if the data has changed since the given date.  This is most useful for clients that wish to cache.
  37. Of the three methods above, only PUT uses the Content-Length modifier in a request.  It gives the length of the file body that follows.  All PUT requests must have a body.
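As a sketch of how a request line, its modifiers and an optional body fit together, the hypothetical helper below (`build_request` is not part of any library) assembles the bytes a client would send. It assumes HTTP/1.0, CRLF line endings, and a blank line between the modifiers and the body.

```python
def build_request(method, path, modifiers=None, body=b""):
    """Return the raw bytes of an HTTP/1.0 request."""
    lines = [f"{method} {path} HTTP/1.0"]
    headers = dict(modifiers or {})
    if body:
        # Tell the server how many body bytes follow the blank line.
        headers["Content-Length"] = str(len(body))
    lines.extend(f"{name}: {value}" for name, value in headers.items())
    head = "\r\n".join(lines) + "\r\n\r\n"  # the blank line ends the modifiers
    return head.encode("ascii") + body
```

For example, `build_request("GET", "/index.html", {"If-Modified-Since": "Sat, 29 Oct 1994 19:43:21 GMT"})` produces a conditional GET, and `build_request("PUT", "/my/secret/file", body=b"abc")` adds the Content-Length modifier automatically.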
  38.  
  39. The Authorization modifier encodes the user's name and password using base-64 encoding.  This scheme protects against only the most casual snooping attempts, since base-64 encoding can be decoded by anyone without knowing any secret password.
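To see how little protection this gives, here is a short sketch using Python's standard `base64` module. The function names and the user name and password in the usage note are invented for illustration, and the code assumes the usual "name:password" layout inside the encoded token.

```python
import base64


def encode_basic(user, password):
    """Build an Authorization modifier line from a name and password."""
    raw = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Authorization: Basic {token}"


def decode_basic(header_line):
    """Recover the name and password from the line; no secret is needed."""
    value = header_line.split(":", 1)[1].strip()
    scheme, token = value.split(" ", 1)
    if scheme != "Basic":
        raise ValueError("not a Basic authorization line")
    user, password = base64.b64decode(token).decode("utf-8").split(":", 1)
    return user, password
```

Anyone who sees the line produced by `encode_basic("alice", "secret")` on the wire can pass it to `decode_basic` and get the name and password back.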
  40.  
  41.  
  42. HTTP Responses
  43. Responses come from the server to the client in response to client requests.  Each response is a series of lines describing the status (success or failure) of the request, followed optionally by the meta-data for the requested object and optionally by the body of the file.
  44. GET requests return a status code, and if successful the file meta-data and the file data.  The status code is the first line returned by the server, the meta-data are the next few lines, and the body of the file starts after the first blank line.  For example,
  45.  
  46.  
  47. HTTP/1.0 200 OK
  48. Date: Wed, 22 Oct 1997 04:02:44 GMT
  49. Server: Apache/1.1.1
  50. Content-type: text/html
  51. Content-length: 2919
  52. Last-modified: Wed, 15 Oct 1997 18:14:24 GMT
  53.  
  54. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
  55. <HTML>
  56. <HEAD> (Body continues here)
  57. Responses to HEAD and PUT requests look just like the response to a GET request, but without the body.
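The layout of a response (status line, then meta-data lines, then a blank line, then the body) can be taken apart with a few lines of code. The sketch below uses a made-up function name, `parse_response`, and assumes the whole response is already in memory and uses CRLF line endings.

```python
def parse_response(raw):
    """Split raw response bytes into (status code, reason, meta-data, body)."""
    # Everything before the first blank line is status plus meta-data.
    head, _sep, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = lines[0].split(" ", 2)
    metadata = {}
    for line in lines[1:]:
        name, _colon, value = line.partition(":")
        metadata[name.strip().lower()] = value.strip()
    return int(code), reason, metadata, body
```

For the response to a HEAD request the same function applies; the returned body is simply empty.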
  58.  
  59.  
  60. HTTP Performance
  61. HTTP performance can be divided into several parts.
  62. Network latency
  63. Network bandwidth
  64. Server latency
  65. Server bandwidth
  66. Latency means the time after the request is issued until the first byte of the answer is received.  Bandwidth is the rate at which data flows after the first byte is received.  For large files, bandwidth across the internet dominates total time (normal internet bandwidth is 4 to 40 KB/sec).  Network latency is typically in the hundred millisecond range.  Server latency and bandwidth are hard to quantify, but depend on many factors:
  67. Server load
  68. File type (cgi-bin files and database requests are slow)
  69. File location (across network and deep inside subdirectories are slow)
  70. Reference frequency (files that are recently accessed are fast)
  71. Access to small files across the local net to our server (Euclid) can take about 100 ms.  Full downloads of very large files across the whole internet can take hours.
  72. If there is a modem anywhere in the download path, then modem performance normally dominates over other considerations.
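A rough way to combine latency and bandwidth is total time = latency + size / bandwidth. The sketch below is only a back-of-the-envelope model: it lumps network and server effects into one latency and one bandwidth figure, and the usage note treats 1 KB as 1000 bytes.

```python
def estimated_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Seconds until the last byte arrives: wait for the first byte, then stream the rest."""
    return latency_s + size_bytes / bandwidth_bytes_per_s
```

With 100 ms of latency and 4 KB/sec of bandwidth, the 2919-byte page from the earlier example takes about 0.83 seconds, most of it transfer time. At the same rate a 100 MB file takes roughly 25,000 seconds, or about 7 hours, and the 100 ms of latency no longer matters.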
