.description = "I built a web application with file upload functionality. Some Vue.js in the front and a CouchDB in the back. Everything should be pretty simple and straightforward. But…",
When I uploaded image files, they somehow got mangled. The uploaded file was bigger than the original, and the new "file format" was not readable by any means. I got intrigued: what was happening to the files? The changes seemed random but were reproducible, so I created a few test files to see exactly what changed and when.
My first file looked like this:
```
0123456789
ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz
```
To my surprise, the file stayed the same! My curiosity grew. In the meantime, I had found a very intriguing pattern in the upload's hexdump: `C3 BF C3`. It was everywhere. In another file, I found similar patterns with `C2`. So I wrote my next test file, this time a binary one:
**EDIT**: As you probably already noticed, I counted up as if in base 10, but the values are actually base 16. So I skipped A-F until reaching A0. This might look weird, but it didn't affect the test.
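A minimal Python sketch of how a similar binary test file could be built and inspected. Unlike the file above, it simply covers every byte value from 0x00 to 0xFF in order, and the helper names `make_test_bytes` and `hexdump` are hypothetical, chosen only for illustration.

```python
def make_test_bytes() -> bytes:
    """Return 256 bytes, one for each possible byte value (0x00..0xFF)."""
    return bytes(range(256))


def hexdump(data: bytes, width: int = 16) -> str:
    """Format data as lines of an offset followed by hex byte values."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        lines.append(f"{offset:08X}  {hex_part}")
    return "\n".join(lines)


test_bytes = make_test_bytes()
dump = hexdump(test_bytes)
```

Writing `test_bytes` to disk, uploading it, and comparing the hexdump of the original with the hexdump of the stored copy should make any systematic byte changes easy to spot.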
So all bytes with a value higher than *0x79* got followed by a *0xC2*. *0x79* is the ASCII code for *y*. At least, that is what I thought at first. It is actually the other way around: all bytes with a value of *0x80* or higher got prefixed by a *0xC2*! That is when the scales fell from my eyes: **UTF-8 encoding**!
In *UTF-8*, all characters above *0x7F* are at least two bytes long. Characters from *0x80* to *0xBF* get prefixed with *0xC2*, ending at *0xC2BF* (which is the inverted question mark `¿`), which is then followed by *0xC380*. So what happened is that, on the way to the server, every byte of the file was treated as a character and the file got encoded to UTF-8 ¯\\\_(ツ)\_/¯
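The mangling can be reproduced in a few lines of Python. This is only a sketch, under the assumption that each byte was read as a Latin-1 (ISO-8859-1) character and then written out as UTF-8, which matches the hexdump patterns described above. It does not show where in the upload path the conversion actually happened, and the function names `mangle` and `unmangle` are hypothetical.

```python
def mangle(data: bytes) -> bytes:
    """Read every byte as a Latin-1 character, then encode the text as UTF-8.

    Assumption: this mirrors what happened to the upload. Bytes below 0x80
    stay single bytes; bytes 0x80..0xFF each become two bytes.
    """
    return data.decode("latin-1").encode("utf-8")


def unmangle(data: bytes) -> bytes:
    """Reverse mangle(); only valid for data that was mangled this way."""
    return data.decode("utf-8").encode("latin-1")


# Plain ASCII survives unchanged, which is why the text test file stayed the same.
assert mangle(b"0123456789ABCxyz") == b"0123456789ABCxyz"

# 0x80..0xBF get a 0xC2 prefix; from 0xC0 on the prefix becomes 0xC3.
assert mangle(b"\x7f\x80\xbf\xc0\xff") == b"\x7f\xc2\x80\xc2\xbf\xc3\x80\xc3\xbf"

# The first three bytes of a JPEG (FF D8 FF) produce the C3 BF C3 pattern.
print(mangle(b"\xff\xd8\xff").hex(" "))  # c3 bf c3 98 c3 bf

# A file containing every byte value grows from 256 to 384 bytes.
every_byte = bytes(range(256))
assert len(mangle(every_byte)) == 384
assert unmangle(mangle(every_byte)) == every_byte
```

This also explains why the uploaded images were bigger than the originals: every byte at or above *0x80* takes up two bytes after the conversion.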
**EDIT:** Corrected some mistakes after some comments on [Hacker News](https://news.ycombinator.com/item?id=14089827)