Understanding boundary in multipart/form-data

Introduction

In this post I discuss what the boundary is in multipart/form-data. The boundary separates the name/value pairs in a multipart/form-data body: it acts as a marker between one pair and the next. When a browser submits a form, it adds the boundary parameter to the Content-Type request header automatically.

What is multipart/form-data?

It is one of the encoding methods available for HTML form data. An HTML form supports three encoding methods:

  • application/x-www-form-urlencoded (default)
  • multipart/form-data
  • text/plain

Outside of an HTML form submission, you can send other encodings over HTTP. For example, a REST service over HTTP/HTTPS can use JSON.

Generally you use multipart/form-data when your HTML form contains an input of type file. You can also use this encoding when the form has no file input, although application/x-www-form-urlencoded is more appropriate in that case. Do not use text/plain.

In short, when you make a POST request, the data must be encoded in the request body in some way, and this is where the encoding method comes into the picture.

application/x-www-form-urlencoded is similar to the query string at the end of a URL. text/plain is useful only for debugging. multipart/form-data is significantly more complex, but it allows entire file contents to be included in the request body.
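
To make the comparison with a query string concrete, here is a minimal Python sketch of what application/x-www-form-urlencoded encoding produces. The field names and values are made up for illustration only.

```python
from urllib.parse import urlencode

# Hypothetical form fields, chosen only for illustration.
fields = {"foo": "hello world", "bar": "a&b=c"}

# urlencode joins the pairs as key=value&key=value, the same shape as a
# URL query string, and escapes characters that would be ambiguous.
body = urlencode(fields)
print(body)  # foo=hello+world&bar=a%26b%3Dc
```

Every value has to be escaped this way, which is one reason this encoding is a poor fit for binary file content.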

Where do the name/value pairs come from?

Each name/value pair corresponds to the name and value of an input field in the HTML form defined on the web page.

The name/value pairs are sent when you submit the HTML form, and the Content-Type header with its boundary parameter is added automatically on submission.

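Is an arbitrary value allowed in boundary?

Yes, an arbitrary value is allowed in the boundary parameter. Make sure that the value is no longer than 70 characters and consists only of 7-bit US-ASCII characters. RFC 2046 narrows this further to letters, digits, spaces and a small set of punctuation, and the value must not end with a space. A value that contains a space or a character such as a colon must be enclosed in quotes in the header.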
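
As a rough illustration, the length and character rules can be checked with a regular expression. This is a sketch based on the boundary grammar in RFC 2046; the function name is_valid_boundary is my own and not part of any library.

```python
import re

# RFC 2046 grammar: up to 69 "bchars" (which include space) followed by
# one final character that is not a space, so 1 to 70 characters in total.
_BOUNDARY_RE = re.compile(
    r"[0-9A-Za-z'()+_,\-./:=? ]{0,69}[0-9A-Za-z'()+_,\-./:=?]"
)

def is_valid_boundary(value: str) -> bool:
    """Return True if value is usable as a multipart boundary."""
    return _BOUNDARY_RE.fullmatch(value) is not None

print(is_valid_boundary("----WebKitFormBoundarydMIgtiA2YeB1Z0kl"))  # True
print(is_valid_boundary("x" * 71))                                  # False
```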

Is boundary parameter mandatory in multipart/form-data?

Yes, and not only in multipart/form-data but in all multipart/* content types.

If you do not specify the boundary parameter, the server will not be able to parse the request payload.
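
To see why the parameter matters, here is a small sketch of how a receiver might pull the boundary out of a Content-Type header using Python's standard email package. The helper name boundary_from_content_type is hypothetical; real frameworks have their own parsers.

```python
from email.message import Message

def boundary_from_content_type(header_value: str):
    """Return the boundary parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_boundary() returns None when the parameter is missing.
    return msg.get_boundary()

print(boundary_from_content_type(
    "multipart/form-data; boundary=----WebKitFormBoundarydMIgtiA2YeB1Z0kl"
))
print(boundary_from_content_type("multipart/form-data"))  # None
```

When the result is None, the receiver has nothing to split the body on, so it cannot tell where one part ends and the next begins.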

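Is a charset other than US-ASCII allowed?

Yes. The boundary value itself must stay within US-ASCII, but the payload can use another charset such as UTF-8. Set the charset parameter unless you are absolutely sure that the payload contains only US-ASCII, which is the default when no charset parameter is present. Note that RFC 7578 defines charset on the Content-Type of an individual part rather than on the top-level multipart/form-data header, so a server may ignore it on the top-level header.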

Boundary delimiter

According to RFC 2046, the Content-Type field for multipart entities requires one parameter: boundary.

The boundary delimiter line is then defined as a line consisting entirely of two hyphen characters ("-", decimal value 45) followed by the boundary parameter value from the Content-Type header field, optional linear whitespace, and a terminating CRLF.

Boundary delimiters must not appear within the encapsulated material, and must be no longer than 70 characters, not counting the two leading hyphens.

The boundary delimiter line following the last body part is a distinguished delimiter that indicates that no further body parts will follow. Such a delimiter line is identical to the previous delimiter lines, with the addition of two more hyphens after the boundary parameter value.
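
The delimiter rules above can be sketched in a few lines of Python. This is a simplified illustration for plain text fields only; the function name build_multipart is my own, and it does not escape special characters in field names or handle file parts.

```python
def build_multipart(fields: dict, boundary: str) -> bytes:
    """Encode simple text fields as a multipart/form-data body."""
    lines = []
    for name, value in fields.items():
        # Delimiter line: two hyphens followed by the boundary value.
        lines.append("--" + boundary)
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")  # empty line separates part headers from content
        lines.append(value)
    # Closing delimiter: same as above plus two trailing hyphens.
    lines.append("--" + boundary + "--")
    lines.append("")
    # Lines are terminated with CRLF, as the RFC requires.
    return "\r\n".join(lines).encode("utf-8")

print(build_multipart({"foo": "1"}, "XyZ").decode("utf-8"))
```

The matching request header for this call would be Content-Type: multipart/form-data; boundary=XyZ, without the two leading hyphens.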

Examples

Enough talk about the boundary parameter; let’s look at some examples.

File Upload

If you run the example from the post Python Flask File Upload, you will see data similar to what is shown below.

Here I uploaded an image file using the Mozilla Firefox browser (you can use any browser).

In the Network tab of the browser's developer tools you will find the following information.

Request method: POST

URL: http://localhost:5000

Request Headers: Content-Type:multipart/form-data; boundary=---------------------------293582696224464

Params:

-----------------------------293582696224464
Content-Disposition: form-data; name="file"; filename="roytuts.jpg"
Content-Type: image/jpeg

<content of the file>
-----------------------------293582696224464--

In the above example the boundary value is ---------------------------293582696224464. Each delimiter line is this value prefixed with two extra hyphens, and the content of each part is written between delimiter lines.

The final delimiter line ends with --, which indicates that no further parts follow.

If you run the file upload example using the Restlet client, you will see a value similar to the one below for the boundary parameter in Content-Type.

Content-Type: multipart/form-data; boundary=----WebKitFormBoundarydMIgtiA2YeB1Z0kl

Arbitrary Boundary

Here is an example of an arbitrary boundary in multipart/form-data:

Content-Type: multipart/form-data; charset=utf-8; boundary="----arbitrary boundary"

------arbitrary boundary
Content-Disposition: form-data; name="foo"

foo
------arbitrary boundary
Content-Disposition: form-data; name="bar"

bar
------arbitrary boundary--
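
Going the other way, a receiver splits the body on the delimiter, which is the boundary value with two hyphens in front. The sketch below is a deliberately naive illustration for small text-only bodies of this shape; the function name parse_simple_multipart is my own, and production code should rely on the parser of the web framework in use.

```python
import re

def parse_simple_multipart(body: bytes, boundary: str) -> dict:
    """Split a text-only multipart/form-data body into name/value pairs."""
    delimiter = b"--" + boundary.encode("ascii")
    fields = {}
    # The piece before the first delimiter is preamble, so skip it.
    for chunk in body.split(delimiter)[1:]:
        if chunk.startswith(b"--"):
            break  # closing delimiter: no further parts follow
        headers, _, value = chunk.partition(b"\r\n\r\n")
        match = re.search(rb'name="([^"]*)"', headers)
        if match:
            # Drop the CRLF that precedes the next delimiter line.
            if value.endswith(b"\r\n"):
                value = value[:-2]
            fields[match.group(1).decode("utf-8")] = value.decode("utf-8")
    return fields

boundary = "----arbitrary boundary"
body = (
    b'------arbitrary boundary\r\n'
    b'Content-Disposition: form-data; name="foo"\r\n\r\nfoo\r\n'
    b'------arbitrary boundary\r\n'
    b'Content-Disposition: form-data; name="bar"\r\n\r\nbar\r\n'
    b'------arbitrary boundary--\r\n'
)
print(parse_simple_multipart(body, boundary))  # {'foo': 'foo', 'bar': 'bar'}
```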

That’s all about the boundary parameter in multipart/form-data.

Thanks for reading.
