Handling KOI8-R <FORM>s

In your HTML document:

  1. Use the ACCEPT-CHARSET attribute of the <FORM> tag, as prescribed by RFC 2070. It must contain a comma-separated list of the character sets acceptable to the server (in the same format as the Accept-Charset header field, but without any q= quality parameters; see How to request KOI8-R documents).

  2. Use the POST method; the character set of GET method arguments cannot be determined.

  3. The ACCEPT-CHARSET attribute affects all <INPUT> and <TEXTAREA> elements of the <FORM>. If you want a different character set for each element, you must use an ENCTYPE=multipart/form-data form; see Form-based File Upload in HTML (RFC 1867) for more information.

For example:

<FORM METHOD=POST ACCEPT-CHARSET="koi8-r,us-ascii"
ACTION="cgi-bin/guestbook.cgi">

In your CGI script:

  1. A conformant browser must supply the charset=name parameter in the Content-Type header field. For example:

    Content-Type: application/x-www-form-urlencoded; charset=KOI8-R

    The value of this header field is accessible in a CGI script via the CONTENT_TYPE environment variable. You can check your browser with this <FORM> input test. If a character set is present in the variable, extract it and pass it as an argument to your external character set converter.

    Another standard variant is to use ENCTYPE=multipart/form-data, but in this case the browser must accompany each part of the multipart form data with the correct charset=name in its Content-Type field. I don't know of any browser that complies with this requirement, so you should avoid using this ENCTYPE value.

  2. If the character set is not provided by the variable, assume it is the same as the one specified in the <FORM>'s ACCEPT-CHARSET attribute and pass that to the character set converter.
    WARNING: this works only if the ACCEPT-CHARSET attribute lists a single character set; with several, the script cannot tell which one the browser actually used.
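
The two CGI steps above can be sketched in Python roughly as follows. This is only an illustration, not a reference implementation: the names FORM_CHARSET, charset_from_content_type and decode_form are hypothetical, and the sketch assumes that the form lists a single character set (koi8-r) in ACCEPT-CHARSET and that the browser sends a percent-encoded, pure ASCII request body.

```python
from urllib.parse import parse_qsl

# Assumption: the single character set named in the form's ACCEPT-CHARSET.
FORM_CHARSET = "koi8-r"


def charset_from_content_type(content_type, default=FORM_CHARSET):
    """Return the charset parameter of a Content-Type value.

    Falls back to `default` when the value is missing or carries no
    charset parameter (step 2 above).
    """
    # Skip the media type itself and look only at the ";"-separated parameters.
    for param in (content_type or "").split(";")[1:]:
        name, _, value = param.partition("=")
        value = value.strip().strip('"')
        if name.strip().lower() == "charset" and value:
            return value.lower()
    return default


def decode_form(body, content_type):
    """Decode an application/x-www-form-urlencoded body into (name, value) pairs.

    `body` is the raw request body as bytes. Percent-encoded octets are
    interpreted in the character set announced by the browser, or in the
    fallback character set if none was announced.
    """
    charset = charset_from_content_type(content_type)
    # Assumption: the body is pure ASCII, with every 8-bit octet sent as %XX.
    text = body.decode("ascii")
    return parse_qsl(text, keep_blank_values=True, encoding=charset)
```

In an actual CGI script the content type would come from os.environ.get("CONTENT_TYPE") and the body from the first CONTENT_LENGTH bytes of standard input. Here the decoding step stands in for the external character set converter mentioned above.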