Your hex editor should color-code bytes

178 points - last Tuesday at 9:52 AM

Comments

fleebee today at 12:33 PM
> having more colors makes it possible to recognize more complex patterns

The implicit cost here is that the simple patterns become harder to recognize when every byte is only subtly differently colored. Rather than give everything a different color, I'd rather have the important stuff highlighted.

In the comparisons given, I think hexyl's highlighting scheme is significantly more useful.

dspillett today at 9:43 AM
Everything should try to do some basic syntax highlighting IMO. Not too much, or it just becomes a sea of formatting that doesn't help at all. It is surprising how much difference just a little splash of colour can make if it isn't overdone. If possible, always include configuration options for the user though, so those with colour-blindness issues can tweak things to their needs, those who are just fussy can make the output fit with their finely adjusted system-wide colour schemes¹, and even better, where you can, allow bold/italic/other as well as colours so that those who barely see colour at all can play too.

Of course none of this helps those using screen-readers and other tech, so make sure that all your fancy colouring & such is additive so if it is all “lost” no meaning is absolutely lost with it.

--------

[1] Some people can be very vocal about this, more so than if highlighting isn't possible at all. If you give any output formatting they'll expect you to match, or be able to be made to match, their preferred style.
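
A minimal sketch of the "additive, user-overridable" idea, assuming a terminal that understands ANSI SGR escapes. The helper names `use_color` and `emphasize` are hypothetical; the `NO_COLOR` environment variable is the informal convention many CLI tools honour. Bold is combined with colour so the emphasis survives when colour is hard to see, and the text is returned unchanged when styling is off.

```python
import os
import sys


def use_color(stream=sys.stdout) -> bool:
    """Decide whether to emit escape codes at all (hypothetical helper)."""
    # NO_COLOR convention: any non-empty value means "no colour, please".
    if os.environ.get("NO_COLOR"):
        return False
    # Piped or redirected output gets plain text.
    return stream.isatty()


def emphasize(text: str, enabled: bool, sgr: str = "1;31") -> str:
    """Wrap text in an SGR sequence (default: bold + red) only when enabled."""
    if not enabled:
        return text  # nothing is lost when styling is off
    return f"\x1b[{sgr}m{text}\x1b[0m"


if __name__ == "__main__":
    print(emphasize("c0", use_color()))
```

The `sgr` parameter is where a user-supplied theme would plug in, e.g. `"4"` for underline only.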

cuechan today at 10:57 AM
For anyone who regularly has to look at/analyze binary files, I highly recommend ImHex [1].

It's a hex editor built with imgui and has a lot of built-in tools. Imo the best feature is the data structure editor. You can write a data type definition similar to C and it overlays it on the hexdump and parses it in a structured way while you type.

It also has a node based editor.

1: https://github.com/WerWolv/ImHex

Someone today at 12:17 PM
The first example is “go ahead, try to find the single C0 in these bytes” and then argues one should highlight C0 bytes.

If that’s true, how does the tool know I will be looking for C0 bytes and not for 03, D3, etc? The logical conclusion of that would be that the hex editor should uniquely color code every byte. And following the other examples even that’s not enough.

The proposed solution is to create groups of byte values that each get their unique color. I think that helps, but we can do better: add a search feature. That tells your editor what you are looking for. Once you enter a search string, it can highlight all hits.
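
As a rough illustration of "tell the editor what you are looking for, then highlight the hits", here is a minimal sketch. The function names are made up for this example, and reverse video (SGR 7) is just one possible way to mark a hit; everything else stays uncoloured.

```python
REVERSE = "\x1b[7m"  # SGR 7: reverse video
RESET = "\x1b[0m"    # SGR 0: reset attributes


def find_hits(data: bytes, needle: bytes) -> set[int]:
    """Return the offsets of every byte that is part of a match."""
    hits: set[int] = set()
    if not needle:
        return hits
    start = data.find(needle)
    while start != -1:
        hits.update(range(start, start + len(needle)))
        # Advance by one so overlapping matches are found too.
        start = data.find(needle, start + 1)
    return hits


def dump_with_hits(data: bytes, needle: bytes, width: int = 16) -> str:
    """Render a plain hex dump, reversing only the bytes that matched."""
    hits = find_hits(data, needle)
    lines = []
    for base in range(0, len(data), width):
        cells = []
        for i in range(base, min(base + width, len(data))):
            cell = f"{data[i]:02x}"
            cells.append(f"{REVERSE}{cell}{RESET}" if i in hits else cell)
        lines.append(f"{base:08x}  " + " ".join(cells))
    return "\n".join(lines)


if __name__ == "__main__":
    sample = bytes(range(0xB8, 0xC8)) * 2
    print(dump_with_hits(sample, b"\xc0"))
```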

Yes, “colorful output in a hexdump is useful for the same reason that syntax highlighting for code is useful”, but do you know what syntax highlighting needs? Knowledge of the expected content of a file. Without that, a hex editor at best can guess at how to color-code stuff.

IMO, if you want to add syntax coloring to a hex editor, give it pluggable syntax coloring and heuristics for deciding which one to use when.

While at it, also let those plugins control where to break lines, whether to show hex at all (why show it at all if a file has a few paragraphs of English text or an array of IEEE doubles?), etc.

Those plug-ins will make errors and sometimes, users will want to see all byte values, so you’ll need a way for the user to override them.

NooneAtAll3 today at 10:54 AM
Why did the author decide that the best way to demonstrate his idea would be by cutting contrast in half?

Color-coding might be a great solution, but you don't really know beforehand which byte values are important. Manually selecting C0 to make it stand out is just ctrl+f with extra steps. (But I wouldn't mind something like "color 00 separate from ascii separate from the rest".)

roelschroeven today at 11:03 AM
When you're going to color-code bytes in a hex dump, I would expect each ASCII character in the right column to have the same color as the hex byte in the left column, making it easier to pair them. I wonder why that wasn't done here.
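
A minimal sketch of that pairing, assuming three coarse byte classes and an arbitrary ANSI palette (both are illustrative choices, not what the article or any particular tool uses). The same colour is looked up once per byte and applied to both the hex cell and the character cell.

```python
RESET = "\x1b[0m"

# Hypothetical palette: one SGR colour per byte class.
PALETTE = {
    "null": "\x1b[90m",       # grey for 00
    "printable": "\x1b[36m",  # cyan for printable ASCII
    "other": "\x1b[33m",      # yellow for everything else
}


def byte_class(b: int) -> str:
    """Classify a byte value into one of the three palette keys."""
    if b == 0x00:
        return "null"
    if 0x20 <= b <= 0x7E:
        return "printable"
    return "other"


def render_line(offset: int, chunk: bytes, width: int = 16) -> str:
    """Render one dump line with matching colours in both columns."""
    hex_cells, text_cells = [], []
    for b in chunk:
        kind = byte_class(b)
        color = PALETTE[kind]
        glyph = chr(b) if kind == "printable" else "."
        hex_cells.append(f"{color}{b:02x}{RESET}")
        text_cells.append(f"{color}{glyph}{RESET}")
    # Pad short final lines so the text column stays aligned.
    padding = "   " * (width - len(chunk))
    return f"{offset:08x}  {' '.join(hex_cells)}{padding}  |{''.join(text_cells)}|"


if __name__ == "__main__":
    data = b"Hello\x00\x00\xc0\xffworld!\x00\x01\x02\x03tail"
    for base in range(0, len(data), 16):
        print(render_line(base, data[base:base + 16]))
```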

delta_p_delta_x today at 10:30 AM
  > Your hex editor should colour-code bytes so it is easier for users to distinguish patterns
  > Article is fully in lowercase, which makes it harder for readers to make out sentences and the flow of the article
  > mfw the irony

orphea today at 11:50 AM
I get the idea but those specific examples are awful - not enough contrast.

nticompass today at 11:23 AM
I used to use wxHexEditor and that had a feature where I could select a section of the file and highlight it in a color. When I was working to decode a certain file format, I used that to color-code different sections of the file and it was super useful. Those color-codes were stored in a separate file so you could load them back in.

ChrisRR today at 11:35 AM
What a bad way to illustrate your point, by using such similar-looking pastel colours.

bandrami today at 8:31 AM
Emacs's hexl-mode does this, incidentally, though annoyingly by default it makes all faces the same color. I never understood why it defines the faces but then doesn't customize them.

Archelaos today at 9:24 AM
This article made me think about how I could use similar techniques to colour-code the data in database tables. Has anyone here tried that, and do you have recommendations on where to start?

kokakiwi today at 10:58 AM
ImHex (https://imhex.werwolv.net/) is also a really nice hex editor with tons of plugins (patterns, file support, etc.) and even an embedded language for adding more patterns easily.

psychoslave today at 9:01 AM
That said, even colored, these dumps still feel unappealing to me, so yes, this is admittedly a subjective gut reaction jumping into the conversation. I get that occult form can also be an attractive force.

The post puts an interesting point on the table: how to improve the presentation layer to fit what human cognition is good at spotting (in general, or at least for the expected audience with some training). And it does start proposing something with these color schemes. But isn’t it kind of missing the forest for the trees? Actually, why do we even render with [0123456789ABCDEF], when a specific set of (colored/imaged?) glyphs would be able to make more obvious what’s on the table? Or even, beyond the hexadecimal grouping, wouldn’t it be more relevant to render something "intuitively" far easier to grasp, without several layers of internalized interpretation through acculturation?

red_admiral today at 11:00 AM
My hex editor should let me turn syntax highlighting on and off; follow my personal color theme (and not produce light gray on white in the terminal); and let me highlight specific things I'm searching for like 0D 0A or FF FE.

PunchyHamster today at 11:46 AM
I wonder how hard it would be to color code repeating sequences
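
For what it's worth, a naive version is short. The sketch below is one possible approach, not an established algorithm from any hex editor: it indexes every fixed-length window, keeps the ones seen more than once, and gives each repeated sequence its own 256-colour palette index. The window length `n`, the function names, and the colour spread are all arbitrary assumptions, and memory grows with file size, so this only suits small inputs.

```python
from collections import defaultdict


def repeated_runs(data: bytes, n: int = 4) -> dict[bytes, list[int]]:
    """Map each n-byte sequence that occurs more than once to its offsets."""
    seen: dict[bytes, list[int]] = defaultdict(list)
    for i in range(len(data) - n + 1):
        seen[data[i:i + n]].append(i)
    return {seq: offs for seq, offs in seen.items() if len(offs) > 1}


def color_map(data: bytes, n: int = 4) -> dict[int, int]:
    """Assign a 256-colour index to each offset inside a repeated sequence."""
    colors: dict[int, int] = {}
    for k, (_seq, offsets) in enumerate(sorted(repeated_runs(data, n).items())):
        color = 17 + (k * 37) % 214  # arbitrary spread over indices 17..230
        for start in offsets:
            for i in range(start, start + n):
                colors.setdefault(i, color)  # first sequence to claim a byte wins
    return colors


def render(data: bytes, n: int = 4) -> str:
    """Hex-print data, colouring only bytes that belong to a repeat."""
    colors = color_map(data, n)
    cells = []
    for i, b in enumerate(data):
        cell = f"{b:02x}"
        if i in colors:
            cell = f"\x1b[38;5;{colors[i]}m{cell}\x1b[0m"
        cells.append(cell)
    return " ".join(cells)


if __name__ == "__main__":
    print(render(b"ABCDxxABCD--ABCD"))
```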

js8 today at 8:32 AM
I think semantic coloring (based on structure) is more useful. Also (can't help as someone working with z/OS), if you really want to make hex output readable, I recommend using a big-endian machine.

greatgib today at 10:13 AM
To me, the random colors on each byte mess with my brain, making it hard to quickly identify C0 or any other value that I could more easily identify in all black.

But color would be nice if it were based more on the logic of the bytes.

Possibly the 00 in a shaded grey instead of black; in the best-case scenario, color by logical unit based on your protocol, and in the worst case, by groups of words or so.

azalemeth today at 9:01 AM
I really like hexyl [1], which does this by default.

[1] https://github.com/sharkdp/hexyl

xyx0826 today at 9:30 AM
If you analyze binary files often, I highly recommend binvis - http://binvis.io/. It creates a colored minimap for files it loads and has two available arrangements. Pixel color is based on range of bytes, e.g. ASCII/null bytes/FF bytes. Besides, it’s a pretty basic hex viewer that runs in your browser. The minimap is extremely powerful for identifying interesting areas and patterns in unknown data.

asibahi today at 8:52 AM
When I read this article a few days ago it inspired me to create my own hex viewer: https://ar-ms.me/thoughts/3sl-a-sweet-hex-utility/

The cool thing about it imo (outside of colors) is a `--windows` flag, which separates the hex view into partitions: so `-w 2:-3:5` shows the first two bytes on a line, then skips three bytes, then shows the next 5 bytes on a line, then the rest of the file. Easy to use combined with a terminal's up arrow.
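
To make the partition semantics concrete, here is a small sketch of how such a spec could be interpreted. It is only a reading of the description above, not the tool's actual code; the function name `split_windows` and the handling of edge cases (zero fields, specs longer than the data) are assumptions.

```python
def split_windows(data: bytes, spec: str) -> list[bytes]:
    """Split data according to a colon-separated window spec.

    A positive field N shows the next N bytes as one partition,
    a negative field -N skips N bytes, and whatever is left after
    the last field becomes the final partition.
    """
    parts: list[bytes] = []
    pos = 0
    for field in spec.split(":"):
        n = int(field)
        if n > 0 and pos < len(data):
            parts.append(data[pos:pos + n])
        pos += abs(n)
    if pos < len(data):
        parts.append(data[pos:])
    return parts


if __name__ == "__main__":
    for part in split_windows(bytes(range(12)), "2:-3:5"):
        print(part.hex(" "))
```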

a_t48 today at 8:32 AM
I've started doing this with hashes in a CLI I'm working on. For slow prints, it's somewhat helpful (https://asciinema.org/a/aD38Pk88CZgSZqtq), but for debug dumps with many, many hashes it really helps readability and tracking hashes across lines.
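
One simple way to get that effect, sketched under the assumption of a 256-colour terminal (this is not the code from the CLI mentioned above, and the names `hash_color` and `paint` are hypothetical): derive the colour from the hash itself, so the same hash always prints in the same colour on every line.

```python
import hashlib


def hash_color(hex_digest: str) -> int:
    """Map a hex digest to a stable index in the 6x6x6 colour cube (16..231)."""
    seed = int(hex_digest[:6], 16)  # the first three bytes are plenty
    return 16 + seed % 216


def paint(hex_digest: str, shown: int = 12) -> str:
    """Return an abbreviated digest wrapped in its own colour."""
    color = hash_color(hex_digest)
    return f"\x1b[38;5;{color}m{hex_digest[:shown]}\x1b[0m"


if __name__ == "__main__":
    for payload in (b"alpha", b"beta", b"alpha"):
        print(paint(hashlib.sha256(payload).hexdigest()))
```

Some cube entries are very dark or very light, so a real tool would probably restrict the range to colours that stay readable on the user's background.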

adv_zxy today at 10:18 AM
radare2 also has excellent hex viewing/editing support, if one manages to grok the usage of it.

7bit today at 10:17 AM
> it’s much easier to pick out the unique byte when it’s a different color! human brains are really good at spotting visual patterns—given the right format

Don't really see the advantage. Unique bytes have no unique meaning across data types.

The only good syntax highlight to me is 00 and perhaps FF. But that's my opinion of course.

Anything else that has no direct relation to what you're looking at is meaningless.

ralferoo today at 11:00 AM
I actually stopped reading after the intro because I fundamentally disagreed with its premise. The "find the C0" took me about 1/4 second with uncoloured. Looking at the coloured took my eyes about 3 seconds to recover from the colour overload, then I was scanning down and found the colours so distracting with the constant switching between orange, pink and yellows that it took me a total of about 5 seconds to scan down as far as the blue C0. Maybe if it was all uncoloured and blue just for that, I might have actually noticed it looking different earlier.

It's been a while since I used hexedit on Linux, but I think that highlighted search results in reverse colours, just like less does for text search. Personally, I'd prefer that to colours.