Tomcat: Configuring URL Encoding for Chinese Characters

From WHY42

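In Tomcat, if the URL encoding is not configured, Chinese characters in a URL come out garbled. I run into this problem often, so today I wrote a small program to test it.

Testing Tomcat's encoding

Here is a small servlet for the test: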

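import java.util.Map;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Prints the request URI and every request parameter to the console.
public class HelloServlet extends HttpServlet{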
    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response){
        System.out.println("=>GET");
        System.out.println(request.getRequestURI());
        this.printParam(request);
    }
    
    void printParam(HttpServletRequest request){
        Map<String, String[]> params = request.getParameterMap();
        for(String k:params.keySet()){
            System.out.print("\t?" + k + "=");
            for(String v:params.get(k))
                System.out.println("(" + v + ")");
        }
    }

    @Override
    public void doPost(HttpServletRequest request, HttpServletResponse response){
        System.out.println("=>POST");
        this.printParam(request);
    }
}

Configuration in web.xml:

<servlet>
	<servlet-name>HelloWorldServlet</servlet-name>
	<servlet-class>com.riguz.tc.HelloServlet</servlet-class>
</servlet>

<servlet-mapping>
	<servlet-name>HelloWorldServlet</servlet-name>
	<url-pattern>/*</url-pattern>
</servlet-mapping>

Then a small test client:

public static void main(String[] args) {
    final String url = "http://localhost:8080/tc/hello";
    HttpKit.get(url + "?id=COM_RIGUZ_测试_0001");
    HttpKit.post(url + "?id=COM_RIGUZ_测试_0001", "name=中华人民共和国&id=COM_RIGUZ_测试_0001");
}

public static String post(String url, Map<String, String> queryParas, String data, Map<String, String> headers) {
    HttpURLConnection conn = null;
    try {
        conn = getHttpConnection(buildUrlWithQueryString(url, queryParas), POST, headers);
        conn.connect();
        OutputStream out = conn.getOutputStream();
        out.write(data.getBytes(CHARSET));
        out.flush();
        out.close();

        return readResponseString(conn);
    }
    catch (Exception e) {
        throw new RuntimeException(e);
    }
    finally {
        if (conn != null) {
            conn.disconnect();
        }
    }
}

Garbled GET parameters

With no extra configuration, a freshly started Tomcat prints this:

=>GET
/tc/hello
        ?id=(COM_RIGUZ_???è??_0001)
=>POST
        ?id=(COM_RIGUZ_???è??_0001)
(COM_RIGUZ_???è??_0001)
        ?name=(??????????°???±??????)

Now add the URL encoding to the Connector in Tomcat's server.xml and look at the output again:

<Connector port="8080" protocol="HTTP/1.1"
		   connectionTimeout="20000"
		   redirectPort="8443" 
		   URIEncoding="UTF-8"/>

As the following output shows, the parameters in the URL are now correct, but the ones in the POST body are still wrong.

=>GET
/tc/hello
        ?id=(COM_RIGUZ_测试_0001)
=>POST
        ?id=(COM_RIGUZ_测试_0001)
(COM_RIGUZ_???è??_0001)
        ?name=(??????????°???±??????)
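One way to read this result: URIEncoding only covers the URI and its query string, while the body is decoded with the charset named in the Content-Type header, and the Servlet specification falls back to ISO-8859-1 when none is given. The sketch below illustrates that rule only. It is not Tomcat's actual implementation, and BodyCharset is a hypothetical helper:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BodyCharset {
    // Return the charset named in a Content-Type value,
    // or ISO-8859-1 when no usable charset parameter is present.
    public static Charset of(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                // Parameter names are case-insensitive, so compare ignoring case.
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        // Unknown or malformed charset name: keep looking, then fall back.
                    }
                }
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        System.out.println(of("application/x-www-form-urlencoded"));               // ISO-8859-1
        System.out.println(of("application/x-www-form-urlencoded;charset=utf-8")); // UTF-8
    }
}
```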

Garbled POST parameters

Let's try adding a line to the post method:

conn.setRequestProperty("Content-type", "application/x-www-form-urlencoded;charset=utf-8");

And now everything is correct!

=>GET
/tc/hello
        ?id=(COM_RIGUZ_测试_0001)
=>POST
        ?id=(COM_RIGUZ_测试_0001)
(COM_RIGUZ_测试_0001)
        ?name=(中华人民共和国)
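The test client writes the raw UTF-8 bytes of the body. A form body is normally percent-encoded as well, which keeps it pure ASCII on the wire; the charset then tells the server how to decode the escaped bytes. A minimal sketch using the JDK's URLEncoder (FormBody and its encode method are hypothetical names, not part of HttpKit):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class FormBody {
    // Build an application/x-www-form-urlencoded body, escaping keys and values as UTF-8.
    public static String encode(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // every JVM is required to support UTF-8
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("name", "\u4e2d"); // first character of the name used in the test
        params.put("id", "COM_RIGUZ_0001");
        System.out.println(encode(params)); // name=%E4%B8%AD&id=COM_RIGUZ_0001
    }
}
```

The resulting string could be passed as the data argument of the post method shown earlier, together with the Content-type header above.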

In my tests, as long as some Content-type was specified, its value did not seem to matter, for example:

  • application/json;charset=utf-8
  • text/html

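There is one pitfall, though: JFinal only seems to pick up the parameters when the header is application/x-www-form-urlencoded;charset=utf-8. This is probably not specific to JFinal, since the Servlet specification only exposes POST body data as request parameters when the content type is application/x-www-form-urlencoded. The parameters can be printed like this to check: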

StringBuilder sb = new StringBuilder();
Enumeration<String> e = request.getParameterNames();
if (e.hasMoreElements()) {
    sb.append("Parameter   : ");
    while (e.hasMoreElements()) {
        String name = e.nextElement();
        String[] values = request.getParameterValues(name);
        if (values.length == 1) {
            sb.append(name).append("=").append(values[0]);
        }
        else {
            sb.append(name).append("[]={");
            for (int i=0; i<values.length; i++) {
                if (i > 0)
                    sb.append(",");
                sb.append(values[i]);
            }
            sb.append("}");
        }
        sb.append("  ");
    }
    sb.append("\n");
}
System.out.println(sb.toString());