
[Browser]: Caching

WEBJ2EE
Published 2020-11-20 10:36:33
Collected in the column: WebJ2EE
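Contents
1. How do others do it?
    1.1. Zhihu
    1.2. JD.com
2. What is caching for?
3. Types of caches
4. How does caching work?
    4.1. Core vocabulary
    4.2. Overall flow
    4.3. Cache control
        4.3.1. Cache-Control
            4.3.1.1. No caching
            4.3.1.2. Cache but revalidate
            4.3.1.3. Expiration
            4.3.1.4. Cache-Control configuration recommendations
        4.3.2. Pragma
    4.4. Cache validation
        4.4.1. Last-Modified
        4.4.2. ETags
    4.5. One way to update resources
5. User behavior and browsers
    5.1. Navigating from the address bar
    5.2. F5 refresh
    5.3. Ctrl+F5 refresh
6. FAQ
    6.1. HTML Meta Tags and HTTP Headers
    6.2. Pragma HTTP Headers (and why they don’t work)
    6.3. How do you clear the browser cache?
    6.4. Targets of caching operations?
    6.5. How does Tomcat generate an ETag?
    6.6. How does Tomcat handle request headers such as "If-None-Match"?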

1. How do others do it?

1.1. Zhihu

1.2. JD.com

2. What is caching for?

  • Lower latency
    • Caching enables content to be retrieved faster because an entire network round trip is not necessary. Caches maintained close to the user, like the browser cache, can make this retrieval nearly instantaneous.
  • Lower network consumption
    • Content can be cached at various points in the network path between the content consumer and content origin. When the content is cached closer to the consumer, requests will not cause much additional network activity beyond the cache.
  • Lower load on the server

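3. Types of caches

Caching is a technique that stores a copy of a resource and serves that copy directly the next time it is requested. When a web cache finds that a requested resource is already stored, it intercepts the request and returns its copy instead of downloading the resource again from the origin server. There are many kinds of caches, which fall roughly into two categories:

  • Private caches (for example, the browser cache)
  • Shared caches (for example, proxy caches)

4. How does caching work?

4.1. Core vocabulary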

  • Origin server
    • The origin server is the original location of the content. If you are acting as the web server administrator, this is the machine that you control. It is responsible for serving any content that could not be retrieved from a cache along the request route and for setting the caching policy for all content.
  • Freshness
    • Freshness is a term used to describe whether an item within a cache is still considered a candidate to serve to a client. Content in a cache will only be used to respond if it is within the freshness time frame specified by the caching policy.
    • Once a resource is stored in a cache, it could theoretically be served by the cache forever. Caches have finite storage, so items are periodically removed from storage. This process is called cache eviction. On the other hand, some resources may change on the server, so the cache should be updated. As HTTP is a client-server protocol, servers can't contact caches and clients when a resource changes; they have to communicate an expiration time for the resource. Before this expiration time, the resource is fresh; after the expiration time, the resource is stale. Eviction algorithms often privilege fresh resources over stale resources. Note that a stale resource is not evicted or ignored; when the cache receives a request for a stale resource, it forwards this request with an If-None-Match header to check whether it is in fact still fresh. If so, the server returns a 304 (Not Modified) response without sending the body of the requested resource, saving some bandwidth.
  • Stale content
    • Items in the cache expire according to the cache freshness settings in the caching policy. Expired content is “stale”. In general, expired content cannot be used to respond to client requests. The origin server must be re-contacted to retrieve the new content or at least verify that the cached content is still accurate.
  • Validation
    • Stale items in the cache can be validated in order to refresh their expiration time. Validation involves checking in with the origin server to see if the cached content still represents the most recent version of the item.
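
The vocabulary above can be tied together in a small sketch. This is only an illustration under simplifying assumptions: the class `FreshnessCheck`, the enum `Action`, and the parameters are hypothetical names that do not come from any browser or library, and real caches take more inputs into account.

```java
import java.time.Duration;

/** Hypothetical illustration of the fresh / stale / validate decision. */
final class FreshnessCheck {

    enum Action { SERVE_FROM_CACHE, REVALIDATE, FETCH_FROM_ORIGIN }

    /**
     * @param stored            whether a copy of the resource is in the cache
     * @param age               how long ago the copy was obtained or last validated
     * @param freshnessLifetime how long the caching policy allows the copy to be reused
     * @param hasValidator      whether the copy carries an ETag or Last-Modified value
     */
    static Action decide(boolean stored, Duration age,
                         Duration freshnessLifetime, boolean hasValidator) {
        if (!stored) {
            return Action.FETCH_FROM_ORIGIN;          // nothing cached yet
        }
        if (age.compareTo(freshnessLifetime) < 0) {
            return Action.SERVE_FROM_CACHE;           // still fresh
        }
        // Stale: ask the origin server to validate if we can, else fetch again.
        return hasValidator ? Action.REVALIDATE : Action.FETCH_FROM_ORIGIN;
    }
}
```

For instance, a copy that is 10 seconds old with a 60-second lifetime is served from the cache, while the same copy at 90 seconds is revalidated if it has a validator.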

4.2. Overall flow

4.3. Cache control

4.3.1. Cache-Control

The Cache-Control HTTP/1.1 general-header field is used to specify directives for caching mechanisms in both requests and responses. Use this header to define your caching policies with the variety of directives it provides.

4.3.1.1. No caching

The cache should not store anything about the client request or server response. A request is sent to the server and a full response is downloaded each and every time.

Cache-Control: no-store

Example: preventing the browser from caching CSS, JS, PNG, and HTML files

4.3.1.2. Cache but revalidate

A cache will send the request to the origin server for validation before releasing a cached copy.

Cache-Control: no-cache
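
The difference between no-store and no-cache can be sketched as two questions a cache asks about a response: may it be stored at all, and must it be revalidated before reuse. The class and method names below are hypothetical, and the parser is deliberately naive: it splits on commas and does not handle quoted directive arguments.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper that reads the directives of a Cache-Control value. */
final class CacheControlDirectives {

    /** Splits a header value into lower-cased directive names and optional arguments. */
    static Map<String, String> parse(String headerValue) {
        Map<String, String> directives = new LinkedHashMap<>();
        for (String part : headerValue.split(",")) {
            String token = part.trim();
            if (token.isEmpty()) {
                continue;
            }
            int eq = token.indexOf('=');
            String name = (eq < 0 ? token : token.substring(0, eq))
                    .trim().toLowerCase(Locale.ROOT);
            String value = eq < 0 ? "" : token.substring(eq + 1).trim();
            directives.put(name, value);
        }
        return directives;
    }

    /** no-store: the cache should not keep any copy. */
    static boolean mayStore(String headerValue) {
        return !parse(headerValue).containsKey("no-store");
    }

    /** no-cache: a copy may be kept, but must be validated before each reuse. */
    static boolean mustRevalidateBeforeReuse(String headerValue) {
        return parse(headerValue).containsKey("no-cache");
    }
}
```

With this sketch, `no-cache` yields a response that may be stored but must be revalidated, while `no-store` yields one that may not be stored at all.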

4.3.1.3. Expiration

The most important directive here is max-age=<seconds>, which is the maximum amount of time in which a resource will be considered fresh. This directive is relative to the time of the request, and overrides the Expires header (if set).

Cache-Control: max-age=31536000

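Note: if the response contains Cache-Control: max-age=0, Cache-Control: no-cache, or Pragma: no-cache, the browser checks with the server that the cached copy is still usable every time before using it.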
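
As a rough sketch of how max-age takes precedence over Expires, the freshness lifetime could be computed as follows. The class `FreshnessLifetime` and its method are hypothetical, and the sketch ignores other inputs such as s-maxage or heuristic freshness; the date strings are assumed to be in the usual HTTP date format.

```java
import java.time.Duration;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: how long may a response be reused without validation? */
final class FreshnessLifetime {

    // Matches "max-age=<seconds>" at the start of the value or after a comma.
    private static final Pattern MAX_AGE =
            Pattern.compile("(?i)(?:^|,)\\s*max-age\\s*=\\s*(\\d+)");

    /**
     * @param cacheControl value of the Cache-Control response header, or null
     * @param expires      value of the Expires response header, or null
     * @param date         value of the Date response header, or null
     */
    static Duration of(String cacheControl, String expires, String date) {
        if (cacheControl != null) {
            Matcher m = MAX_AGE.matcher(cacheControl);
            if (m.find()) {
                // max-age wins, even if Expires is also present.
                return Duration.ofSeconds(Long.parseLong(m.group(1)));
            }
        }
        if (expires != null && date != null) {
            DateTimeFormatter httpDate = DateTimeFormatter.RFC_1123_DATE_TIME;
            ZonedDateTime expiresAt = ZonedDateTime.parse(expires, httpDate);
            ZonedDateTime generatedAt = ZonedDateTime.parse(date, httpDate);
            Duration lifetime = Duration.between(generatedAt, expiresAt);
            return lifetime.isNegative() ? Duration.ZERO : lifetime;
        }
        return Duration.ZERO; // no explicit freshness information
    }
}
```

A lifetime of zero corresponds to the case in the note above: the copy is stale immediately, so it has to be checked with the server before every use.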

4.3.1.4. Cache-Control configuration recommendations

4.3.2. Pragma

  • Pragma is an HTTP/1.0 header. Pragma: no-cache is like Cache-Control: no-cache in that it forces caches to submit the request to the origin server for validation, before releasing a cached copy. However, Pragma is not specified for HTTP responses and is therefore not a reliable replacement for the general HTTP/1.1 Cache-Control header.
  • Pragma should only be used for backwards compatibility with HTTP/1.0 caches where the Cache-Control HTTP/1.1 header is not yet present.

4.4. Cache validation

When a cached document's expiration time has been reached, it is either validated or fetched again. Validation can only occur if the server provided either a strong validator or a weak validator.

4.4.1. Last-Modified

  • The Last-Modified response header can be used as a weak validator. It is considered weak because it only has 1-second resolution. If the Last-Modified header is present in a response, then the client can issue an If-Modified-Since request header to validate the cached document.
  • When a validation request is made, the server can either ignore the validation request and respond with a normal 200 OK, or it can return 304 Not Modified (with an empty body) to instruct the browser to use its cached copy. The latter response can also include headers that update the expiration time of the cached document.
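
A minimal server-side sketch of this check might look like the following. The class `LastModifiedValidation` is a hypothetical name, and the logic is a simplification: it only decides between 200 and 304 and compares at the 1-second resolution that makes Last-Modified a weak validator.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoUnit;

/** Hypothetical helper: choose the status code for an If-Modified-Since request. */
final class LastModifiedValidation {

    /**
     * @param ifModifiedSince value of the If-Modified-Since request header, or null
     * @param lastModified    when the resource was last changed on the server
     * @return 304 if the client's copy is still current, otherwise 200
     */
    static int statusFor(String ifModifiedSince, Instant lastModified) {
        if (ifModifiedSince == null) {
            return 200; // not a validation request
        }
        Instant since;
        try {
            since = ZonedDateTime
                    .parse(ifModifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
        } catch (DateTimeParseException e) {
            return 200; // unparseable date: ignore the condition
        }
        // HTTP dates carry whole seconds only, so drop sub-second precision.
        Instant modified = lastModified.truncatedTo(ChronoUnit.SECONDS);
        return modified.isAfter(since) ? 200 : 304;
    }
}
```

Because of the truncation, two changes within the same second cannot be told apart by this validator.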

4.4.2. ETags

  • The ETag response header is a value that is opaque to the user agent and can be used as a strong validator. That means that an HTTP user agent, such as the browser, does not know what this string represents and can't predict what its value would be. If the ETag header was part of the response for a resource, the client can issue an If-None-Match header in future requests in order to validate the cached resource.
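
The comparison behind If-None-Match can be sketched as below. `ETagMatcher` is a hypothetical name; the sketch uses weak comparison (a `W/` prefix is ignored) and assumes entity tags contain no commas, which keeps the splitting simple but is not guaranteed in general.

```java
/** Hypothetical helper: does an If-None-Match header match the current ETag? */
final class ETagMatcher {

    /** Strips an optional weak-validator prefix, leaving the quoted opaque tag. */
    static String opaqueTag(String etag) {
        String trimmed = etag.trim();
        return trimmed.startsWith("W/") ? trimmed.substring(2) : trimmed;
    }

    /**
     * @param ifNoneMatch value of the If-None-Match request header
     * @param currentETag ETag of the resource as it exists on the server now
     * @return true if the client's copy is still current (server may answer 304)
     */
    static boolean matchesIfNoneMatch(String ifNoneMatch, String currentETag) {
        if (ifNoneMatch.trim().equals("*")) {
            return true; // "*" matches any existing representation
        }
        String current = opaqueTag(currentETag);
        for (String candidate : ifNoneMatch.split(",")) {
            if (opaqueTag(candidate).equals(current)) {
                return true;
            }
        }
        return false;
    }
}
```

When the method returns true for a GET request, the server can answer 304 Not Modified without a body.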

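4.5. One way to update resources

5. User behavior and browsers

5.1. Navigating from the address bar

  • IE11, Edge, Chrome, Firefox: apply the full caching policy

5.2. F5 refresh

  • IE11, Chrome: apply the full caching policy
  • Edge, Firefox: revalidate freshness with the server

5.3. Ctrl+F5

When you press Ctrl+F5, the browser bypasses its own cache and fetches the resource directly from the server, without first asking the server to confirm freshness.

  • IE11: still revalidates freshness
    • If-Modified-Since
  • Edge: fetches the resource directly
    • cache-control: no-cache
  • Chrome, Firefox: fetch the resource directly
    • cache-control: no-cache
    • pragma: no-cache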
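
To reproduce the Chrome and Firefox Ctrl+F5 request outside a browser, one option is to send the same two headers from a client. The sketch below only builds the request with the JDK's `java.net.http` API (Java 11 or later) and does not send it; the class name `HardReloadRequest` is hypothetical, and any URL passed in is a placeholder.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper that mimics the headers of a Ctrl+F5 reload. */
final class HardReloadRequest {

    static HttpRequest build(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                // HTTP/1.1 way of asking caches not to answer from a stored copy
                .header("Cache-Control", "no-cache")
                // HTTP/1.0 fallback with the same intent
                .header("Pragma", "no-cache")
                .GET()
                .build();
    }
}
```

Sending such a request through `java.net.http.HttpClient` and inspecting the response status is one way to check how a server or an intermediate proxy reacts to a forced reload.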

6. FAQ

6.1. HTML Meta Tags and HTTP Headers

  • HTML authors can put tags in a document’s <HEAD> section that describe its attributes. These meta tags are often used in the belief that they can mark a document as uncacheable, or expire it at a certain time.
  • Meta tags are easy to use, but aren’t very effective. That’s because they’re only honored by a few browser caches, not proxy caches (which almost never read the HTML in the document). While it may be tempting to put a Pragma: no-cache meta tag into a Web page, it won’t necessarily cause it to be kept fresh.
  • On the other hand, true HTTP headers give you a lot of control over how both browser caches and proxies handle your representations. They can’t be seen in the HTML, and are usually automatically generated by the Web server. However, you can control them to some degree, depending on the server you use. In the following sections, you’ll see what HTTP headers are interesting, and how to apply them to your site.
  • HTTP headers are sent by the server before the HTML, and only seen by the browser and any intermediate caches.

6.2. Pragma HTTP Headers (and why they don’t work)

  • Many people believe that assigning a Pragma: no-cache HTTP header to a representation will make it uncacheable. This is not necessarily true; the HTTP specification does not set any guidelines for Pragma response headers; instead, Pragma request headers (the headers that a browser sends to a server) are discussed. Although a few caches may honor this header, the majority won’t, and it won’t have any effect.
  • The Pragma HTTP/1.0 general header is an implementation-specific header that may have various effects along the request-response chain. It is used for backwards compatibility with HTTP/1.0 caches where the Cache-Control HTTP/1.1 header is not yet present.
  • Note: Pragma is not specified for HTTP responses and is therefore not a reliable replacement for the general HTTP/1.1 Cache-Control header, although it does behave the same as Cache-Control: no-cache, if the Cache-Control header field is omitted in a request. Use Pragma only for backwards compatibility with HTTP/1.0 clients.

6.3. How do you clear the browser cache?

In each of these browsers, the cache can be cleared with the [Ctrl+Shift+Delete] shortcut.

6.4. Targets of caching operations?

HTTP caching is optional but usually desirable. HTTP caches are typically limited to caching responses to GET; they may decline other methods. The primary cache key consists of the request method and target URI (often only the URI is used — this is because only GET requests are caching targets).
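
A toy in-memory cache illustrates the primary cache key. `SimpleHttpCache` and its methods are hypothetical names; the sketch stores only GET responses and ignores everything else a real cache considers, such as freshness or the Vary header.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical toy cache keyed by request method and target URI. */
final class SimpleHttpCache {

    private final Map<String, byte[]> entries = new HashMap<>();

    /** Primary cache key: request method plus normalized target URI. */
    static String primaryKey(String method, URI uri) {
        return method.toUpperCase(Locale.ROOT) + " " + uri.normalize();
    }

    /** Stores a response body; declines every method other than GET. */
    boolean store(String method, URI uri, byte[] body) {
        if (!"GET".equalsIgnoreCase(method)) {
            return false;
        }
        entries.put(primaryKey(method, uri), body.clone());
        return true;
    }

    /** Returns the stored body for this method and URI, if any. */
    Optional<byte[]> lookup(String method, URI uri) {
        return Optional.ofNullable(entries.get(primaryKey(method, uri)));
    }
}
```

Since only GET is ever stored here, the method part of the key is constant, which is why many caches key on the URI alone.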

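6.5. How does Tomcat generate an ETag?

package org.apache.catalina.webresources;

import org.apache.catalina.WebResource;

// Excerpt: only the ETag-related members are shown.
public abstract class AbstractResource implements WebResource {
    // Lazily computed weak ETag; volatile so the double-checked locking
    // in getETag() is safe.
    private volatile String weakETag;
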
    @Override
    public final String getETag() {
        if (weakETag == null) {
            synchronized (this) {
                if (weakETag == null) {
                    long contentLength = getContentLength();
                    long lastModified = getLastModified();
                    if ((contentLength >= 0) || (lastModified >= 0)) {
                        weakETag = "W/\"" + contentLength + "-" +
                                   lastModified + "\"";
                    }
                }
            }
        }
        return weakETag;
    }
}
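
To see what kind of value the excerpt above yields, the same string concatenation can be run in isolation. `WeakETagFormat` is a hypothetical stand-alone class, not part of Tomcat, and the sample numbers used with it are made up.

```java
/** Hypothetical stand-alone copy of the weak ETag format used in the excerpt. */
final class WeakETagFormat {

    /**
     * @param contentLength      size of the resource in bytes
     * @param lastModifiedMillis last modification time in epoch milliseconds
     * @return a weak ETag, or null when neither value is known (both negative)
     */
    static String weakETag(long contentLength, long lastModifiedMillis) {
        if (contentLength < 0 && lastModifiedMillis < 0) {
            return null;
        }
        return "W/\"" + contentLength + "-" + lastModifiedMillis + "\"";
    }
}
```

For a 1024-byte file modified at 1605839793000, this yields `W/"1024-1605839793000"`. The tag changes whenever the size or the modification time changes, but it is not derived from the content itself, which fits its weak `W/` marking.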

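6.6. How does Tomcat handle request headers such as "If-None-Match"?

package org.apache.catalina.servlets;

import java.io.IOException;
import java.util.StringTokenizer;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.catalina.WebResource;

// Excerpt: only the conditional-request checks are shown.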
public class DefaultServlet extends HttpServlet {
    /**
     * Check if the conditions specified in the optional If headers are
     * satisfied.
     *
     * @param request   The servlet request we are processing
     * @param response  The servlet response we are creating
     * @param resource  The resource
     * @return <code>true</code> if the resource meets all the specified
     *  conditions, and <code>false</code> if any of the conditions is not
     *  satisfied, in which case request processing is stopped
     * @throws IOException an IO error occurred
     */
    protected boolean checkIfHeaders(HttpServletRequest request,
                                     HttpServletResponse response,
                                     WebResource resource)
        throws IOException {

        return checkIfMatch(request, response, resource)
            && checkIfModifiedSince(request, response, resource)
            && checkIfNoneMatch(request, response, resource)
            && checkIfUnmodifiedSince(request, response, resource);

    }
    
    /**
     * Check if the if-match condition is satisfied.
     *
     * @param request   The servlet request we are processing
     * @param response  The servlet response we are creating
     * @param resource  The resource
     * @return <code>true</code> if the resource meets the specified condition,
     *  and <code>false</code> if the condition is not satisfied, in which case
     *  request processing is stopped
     * @throws IOException an IO error occurred
     */
    protected boolean checkIfMatch(HttpServletRequest request,
            HttpServletResponse response, WebResource resource)
            throws IOException {

        String eTag = resource.getETag();
        String headerValue = request.getHeader("If-Match");
        if (headerValue != null) {
            if (headerValue.indexOf('*') == -1) {

                StringTokenizer commaTokenizer = new StringTokenizer
                    (headerValue, ",");
                boolean conditionSatisfied = false;

                while (!conditionSatisfied && commaTokenizer.hasMoreTokens()) {
                    String currentToken = commaTokenizer.nextToken();
                    if (currentToken.trim().equals(eTag))
                        conditionSatisfied = true;
                }

                // If none of the given ETags match, 412 Precondition failed is
                // sent back
                if (!conditionSatisfied) {
                    response.sendError
                        (HttpServletResponse.SC_PRECONDITION_FAILED);
                    return false;
                }

            }
        }
        return true;
    }


    /**
     * Check if the if-modified-since condition is satisfied.
     *
     * @param request   The servlet request we are processing
     * @param response  The servlet response we are creating
     * @param resource  The resource
     * @return <code>true</code> if the resource meets the specified condition,
     *  and <code>false</code> if the condition is not satisfied, in which case
     *  request processing is stopped
     */
    protected boolean checkIfModifiedSince(HttpServletRequest request,
            HttpServletResponse response, WebResource resource) {
        try {
            long headerValue = request.getDateHeader("If-Modified-Since");
            long lastModified = resource.getLastModified();
            if (headerValue != -1) {

                // If an If-None-Match header has been specified, if modified since
                // is ignored.
                if ((request.getHeader("If-None-Match") == null)
                    && (lastModified < headerValue + 1000)) {
                    // The entity has not been modified since the date
                    // specified by the client. This is not an error case.
                    response.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
                    response.setHeader("ETag", resource.getETag());

                    return false;
                }
            }
        } catch (IllegalArgumentException illegalArgument) {
            return true;
        }
        return true;
    }


    /**
     * Check if the if-none-match condition is satisfied.
     *
     * @param request   The servlet request we are processing
     * @param response  The servlet response we are creating
     * @param resource  The resource
     * @return <code>true</code> if the resource meets the specified condition,
     *  and <code>false</code> if the condition is not satisfied, in which case
     *  request processing is stopped
     * @throws IOException an IO error occurred
     */
    protected boolean checkIfNoneMatch(HttpServletRequest request,
            HttpServletResponse response, WebResource resource)
            throws IOException {

        String eTag = resource.getETag();
        String headerValue = request.getHeader("If-None-Match");
        if (headerValue != null) {

            boolean conditionSatisfied = false;

            if (!headerValue.equals("*")) {

                StringTokenizer commaTokenizer =
                    new StringTokenizer(headerValue, ",");

                while (!conditionSatisfied && commaTokenizer.hasMoreTokens()) {
                    String currentToken = commaTokenizer.nextToken();
                    if (currentToken.trim().equals(eTag))
                        conditionSatisfied = true;
                }

            } else {
                conditionSatisfied = true;
            }

            if (conditionSatisfied) {

                // For GET and HEAD, we should respond with
                // 304 Not Modified.
                // For every other method, 412 Precondition Failed is sent
                // back.
                if ( ("GET".equals(request.getMethod()))
                     || ("HEAD".equals(request.getMethod())) ) {
                    response.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
                    response.setHeader("ETag", eTag);

                    return false;
                }
                response.sendError(HttpServletResponse.SC_PRECONDITION_FAILED);
                return false;
            }
        }
        return true;
    }

    /**
     * Check if the if-unmodified-since condition is satisfied.
     *
     * @param request   The servlet request we are processing
     * @param response  The servlet response we are creating
     * @param resource  The resource
     * @return <code>true</code> if the resource meets the specified condition,
     *  and <code>false</code> if the condition is not satisfied, in which case
     *  request processing is stopped
     * @throws IOException an IO error occurred
     */
    protected boolean checkIfUnmodifiedSince(HttpServletRequest request,
            HttpServletResponse response, WebResource resource)
            throws IOException {
        try {
            long lastModified = resource.getLastModified();
            long headerValue = request.getDateHeader("If-Unmodified-Since");
            if (headerValue != -1) {
                if ( lastModified >= (headerValue + 1000)) {
                    // The entity has been modified since the date specified
                    // by the client, so the precondition fails (412).
                    response.sendError(HttpServletResponse.SC_PRECONDITION_FAILED);
                    return false;
                }
            }
        } catch(IllegalArgumentException illegalArgument) {
            return true;
        }
        return true;
    }
    
}



Originally published 2020-11-15 on the WebJ2EE WeChat official account.
