Boundary in Form Data
This article explains what the boundary is in multipart/form-data, the encoding most often used when an HTML form contains an input of type file. The boundary separates the name/value pairs in a multipart/form-data body: it acts as a marker between one pair and the next. When a browser submits the form, it automatically adds the boundary parameter to the Content-Type header of the HTTP (Hypertext Transfer Protocol) request.
What is multipart/form-data?
It is one of the encoding methods that an HTML (HyperText Markup Language) form can use for its data. An HTML form provides three encoding methods:
- application/x-www-form-urlencoded (default)
- multipart/form-data
- text/plain
You generally use multipart/form-data when your HTML form has an input of type file. You can use this encoding even if the form has no file input, but application/x-www-form-urlencoded is more appropriate in that case. Do not use text/plain as the form encoding.
In short, when you make a POST request, the data has to be encoded in the request body somehow, and this is where one of these encoding methods comes into the picture.
application/x-www-form-urlencoded produces a body similar to the query string at the end of a URL. text/plain should be used only for debugging. multipart/form-data is significantly more complex, but it allows entire files to be included in the body of the request.
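To make the comparison with a query string concrete, here is a minimal sketch using Python's standard urllib.parse module. The field names and values are made up for illustration and do not come from any real form.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields, used only for illustration.
fields = {"name": "John Doe", "city": "New York"}

# The body of an application/x-www-form-urlencoded POST request
# has the same shape as a URL query string.
body = urlencode(fields)
print(body)  # name=John+Doe&city=New+York

# A server reverses the encoding to recover the name/value pairs.
decoded = parse_qs(body)
print(decoded)  # {'name': ['John Doe'], 'city': ['New York']}
```

This works well for short text fields, but a format like this has no natural place for a file name or a per-field content type, which is what multipart/form-data provides.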
Where do the name/value pairs come from?
Each name/value pair corresponds to the name and value of an input field that you define in the HTML form on the web page.
The name/value pairs are sent when you submit the HTML form, and the Content-Type header with its boundary parameter is added automatically on submission.
Is an arbitrary value allowed in the boundary?
Yes, you can choose the value of the boundary parameter freely, within limits. The value must be between 1 and 70 characters long, must consist only of 7-bit US-ASCII characters (RFC 2046 restricts it to letters, digits, spaces and a small set of punctuation), and must not end with a space.
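The length and character rules can be checked with a short regular expression. The sketch below is one possible way to do it; the helper names is_valid_boundary and make_boundary, and the "----FormBoundary" prefix, are assumptions made for this example and are not part of any library.

```python
import re
import uuid

# Characters RFC 2046 permits in a boundary: letters, digits, space and
# the punctuation '()+_,-./:=? . A space may not be the last character,
# and the total length must be 1 to 70 characters.
_BOUNDARY_RE = re.compile(
    r"[0-9A-Za-z'()+_,\-./:=? ]{0,69}[0-9A-Za-z'()+_,\-./:=?]"
)


def is_valid_boundary(value: str) -> bool:
    """Return True if value is usable as a multipart boundary."""
    return _BOUNDARY_RE.fullmatch(value) is not None


def make_boundary() -> str:
    """Build a random boundary that is unlikely to occur in the payload."""
    # Hypothetical prefix; any valid characters would do.
    return "----FormBoundary" + uuid.uuid4().hex


print(is_valid_boundary("-293582696224464"))  # True
print(is_valid_boundary("x" * 71))            # False (too long)
print(is_valid_boundary("ends with space "))  # False
```

Using a random component is a common design choice because the boundary must not appear anywhere inside the data being sent.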
Is the boundary parameter mandatory in multipart/form-data?
Yes, it is mandatory not only in multipart/form-data but in every multipart/* content type.
If the boundary parameter is missing, the server will not be able to parse the request payload.
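As a rough illustration of what a server has to do first, the sketch below pulls the boundary out of a Content-Type header value with Python's standard email.message.Message class. The function name boundary_from_content_type is an assumption for this example; real web frameworks use their own parsers.

```python
from email.message import Message
from typing import Optional


def boundary_from_content_type(content_type: str) -> Optional[str]:
    """Return the boundary parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_boundary() unquotes the value and returns None when it is absent.
    return msg.get_boundary()


print(boundary_from_content_type("multipart/form-data; boundary=-293582696224464"))
# -293582696224464
print(boundary_from_content_type("multipart/form-data"))
# None
```

When the result is None there is nothing to split the payload on, which is why a request without the boundary parameter cannot be parsed.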
Is a charset other than US-ASCII allowed?
Yes, you can set the charset parameter in the Content-Type header, for example to UTF-8. Do so unless you are absolutely sure that the payload will contain only US-ASCII, which is the default when the charset parameter is absent.
According to RFC 2046, the Content-Type field for multipart entities requires one parameter: boundary.
The boundary delimiter line is then defined as a line consisting entirely of two hyphen characters ("-", decimal value 45) followed by the boundary parameter value from the Content-Type header field, optional linear whitespace, and a terminating CRLF (Carriage Return Line Feed).
Boundary delimiters must not appear within the encapsulated material, and must be no longer than 70 characters, not counting the two leading hyphens.
The boundary delimiter line following the last body part is a distinguished delimiter that indicates that no further body parts will follow. Such a delimiter line is identical to the previous delimiter lines, with the addition of two more hyphens after the boundary parameter value.
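The delimiter rules above can be sketched in a few lines of Python. This is a simplified illustration for text fields only, not a complete encoder; the function name build_multipart and the boundary "example-boundary" are assumptions made for this example.

```python
CRLF = b"\r\n"


def build_multipart(fields: dict, boundary: str) -> bytes:
    """Encode simple text fields as a multipart/form-data body."""
    # Every delimiter line is two hyphens followed by the boundary value.
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines.append(delimiter)
        lines.append(
            f'Content-Disposition: form-data; name="{name}"'.encode("utf-8")
        )
        lines.append(b"")  # blank line separates part headers from content
        lines.append(value.encode("utf-8"))
    # The closing delimiter carries two extra hyphens after the boundary.
    lines.append(delimiter + b"--")
    return CRLF.join(lines) + CRLF


body = build_multipart({"foo": "foo", "bar": "bar"}, "example-boundary")
print(body.decode("utf-8"))
```

The matching request header would be Content-Type: multipart/form-data; boundary=example-boundary, without the two leading hyphens.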
Examples – Boundary in multipart/form-data
Enough talk about the boundary parameter; let’s look at some examples.
If you run the example in Python Flask File Upload, you will see data similar to what is shown below.
Here I uploaded an image file using the Mozilla Firefox browser (you can use any browser).
In the Network tab of the browser's developer tools you will find the following information.
Request Method: POST
Request Headers:
Content-Type: multipart/form-data; boundary=-293582696224464

Request Body:
---293582696224464
Content-Disposition: form-data; name="file"; filename="roytuts.jpg"
Content-Type: image/jpeg

<content of the file>
---293582696224464--

In the above example the boundary value is -293582696224464. Each part of the body begins with a delimiter line, which is two hyphens followed by the boundary value, and the part's headers and content follow that line.
The delimiter line after the last part ends with two additional hyphens (--), which indicate that no further parts follow.
If you run the file upload example using the Restlet client, you will see a boundary value in Content-Type similar to the one below.
Content-Type: multipart/form-data; boundary=-WebKitFormBoundarydMIgtiA2YeB1Z0kl
Here is an example of an arbitrary boundary in multipart/form-data:
Content-Type: multipart/form-data; charset=utf-8; boundary="-arbitrary boundary"

---arbitrary boundary
Content-Disposition: form-data; name="foo"

foo
---arbitrary boundary
Content-Disposition: form-data; name="bar"

bar
---arbitrary boundary--
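To see the boundary at work on the receiving side, the sketch below parses a small multipart/form-data body with Python's standard email parser. This is only an illustration under simplifying assumptions (text fields, a made-up boundary "sample boundary", and a hypothetical helper named parse_form_data); production servers normally rely on their framework's own parser.

```python
from email import message_from_bytes


def parse_form_data(content_type: str, body: bytes) -> dict:
    """Split a multipart/form-data body into a {name: value} dict."""
    # Prepend the Content-Type header so the parser sees a full MIME entity.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    msg = message_from_bytes(raw)
    if not msg.is_multipart():
        raise ValueError("missing or unusable boundary parameter")
    fields = {}
    for part in msg.get_payload():
        name = part.get_param("name", header="content-disposition")
        fields[name] = part.get_payload(decode=True).decode("utf-8")
    return fields


sample_type = 'multipart/form-data; boundary="sample boundary"'
sample_body = (
    b"--sample boundary\r\n"
    b'Content-Disposition: form-data; name="foo"\r\n'
    b"\r\n"
    b"foo\r\n"
    b"--sample boundary\r\n"
    b'Content-Disposition: form-data; name="bar"\r\n'
    b"\r\n"
    b"bar\r\n"
    b"--sample boundary--\r\n"
)
print(parse_form_data(sample_type, sample_body))
```

Note that the boundary is quoted in the header because it contains a space, while the delimiter lines in the body use the unquoted value.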
That’s all about how the boundary parameter works and why it is needed in multipart/form-data.