---
template=post
title=Gif Selfies and Colour Quantization
style=/styles/post.css
style=/styles/writing.css
---
<!-- https://cohost.org/gen/post/600484-gif-selfies-and-colo -->
<!-- Monday, December 22nd, 2022 3:50pm -->
I'm working on a project with gifs because I really do like them very much. The
beginning of this project is reading from my webcam, downscaling the image, and
reducing to 256 colors so it can fit in the gif. So that's what this is about!
We'll start at a grayscale selfie. After helping to fix a bug in the webcam
crate I was using <i>(I don't contribute a lot so it was nice! i liked it)</i>,
it was pretty easy to shove it into a gif. I used neam to downscale the image
and gifed to do all the gif related things. Grayscale isn't why we're here
though, let's get on with colour!
<aside>
<code>neam</code> and <code>gifed</code> are two of my image-related projects.
neam is used here as a library for downscaling images using Nearest Neighbour
and gifed is for encoding gifs. that's all you really need to know about them
to make it through :)
</aside>
<img src="grayscale.gif" style="max-width: 20rem;"
alt="A short GIF of me holding a candle close to my face and waving. The candle provides the only light so it and I are all you can see. The apparent framerate is only 1 frame per second." />
<h2>Colour, oh no!</h2>
I'll keep this brief 'cause I don't think I am particularly qualified to talk
about it, but there are images from it I want to show, so it gets a brief mention.
For a while I thought the colour format was NV12, which is a kind-of-weird
version of YUV420. This is a good <a href="http://paulbourke.net/dataformats/nv12">explanation of NV12.</a>
What I was getting instead was YUV422 in the UYVY format
(<a href="http://paulbourke.net/dataformats/yuv/">explanation of yuv</a>). Deriving RGB from
this was not easy, but that's only because I thought it was NV12. Once I figured
out what I was working with it got a lot easier. Here are a few images from working
through it. I think they're rather nice.
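For the curious, here's a rough sketch of what unpacking UYVY into RGB can look like. It isn't the code from this project: the function names are made up for illustration, and the full-range BT.601 coefficients are an assumption, since cameras differ on which variant they use.

```rust
/// Convert one luma/chroma sample to RGB.
/// Assumption: full-range BT.601 coefficients; a real webcam may want a different variant.
fn yuv_to_rgb(y: u8, u: u8, v: u8) -> [u8; 3] {
    let y = y as f32;
    let u = u as f32 - 128.0; // chroma is stored offset by 128
    let v = v as f32 - 128.0;
    let r = y + 1.402 * v;
    let g = y - 0.344136 * u - 0.714136 * v;
    let b = y + 1.772 * u;
    [r, g, b].map(|c| c.round().clamp(0.0, 255.0) as u8)
}

/// Unpack a UYVY (YUV 4:2:2) buffer into packed RGB.
/// Every 4 input bytes are `U, Y0, V, Y1`: two pixels that share one chroma pair.
fn uyvy_to_rgb(uyvy: &[u8]) -> Vec<u8> {
    let mut rgb = Vec::with_capacity(uyvy.len() / 4 * 6);
    for chunk in uyvy.chunks_exact(4) {
        let (u, y0, v, y1) = (chunk[0], chunk[1], chunk[2], chunk[3]);
        rgb.extend_from_slice(&yuv_to_rgb(y0, u, v));
        rgb.extend_from_slice(&yuv_to_rgb(y1, u, v));
    }
    rgb
}
```

Reading the bytes in the wrong order, or treating the buffer as planar like NV12, puts luma and chroma in each other's place, which is roughly where images like the ones here come from.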
<div style="display: grid; grid-template-columns: 1fr 1fr; gap: 0px; margin: 1em 0px; width: 100%">
<img style="width: 100%; margin: 0px;" src="pngd_nv12_2.webp"
alt="A set of four images in a 2x2 grid showing the progress from &quot;entirely wrong&quot; to &quot;almost correct&quot;. We start at a very magenta, nonsensical image. There is a severe screen-door, moire effect covering the entire image, but some semblance of shape can still be made out. The next image looks more correct. It's still highly magenta but now there are islands of darker magenta that seem to want to form objects. The third image fixes the magenta, but now we can see that things are very misaligned and the colour is all in the wrong place. There's patches of yellow and red and green and blue that don't align with any shape. We can see that these are pictures of me and I'm in a room. There's a book in front of half my face and the text is even readable! It's &quot;Untethering The Web&quot; which is a book in the Software For Artists series. In the final image the full frame is finally visible. The colours are still misaligned, but the luma is in the right spot so all the shapes are correct. In addition to the book you can see I'm holding a retractable dog leash and an old RCA vacuum tube box. The colours of these items can't be made out, but I can tell you that they're green, blue, and red. An object for each RGB channel." />
<img style="width: 100%; margin: 0px;" src="pngd_nv12_3.webp" alt="" />
<img style="width: 100%; margin: 0px;" src="pngd_nv12_4.webp" alt="" />
<img style="width: 100%; margin: 0px;" src="pngd_nv12_5.webp" alt="" />
</div>
<h2>Colour Quantization</h2>
Before we get into the GIFs, we have to stop at
<a href="https://en.wikipedia.org/wiki/Color_quantization">colour quantization</a>. It's the
process of reducing the unique colour count of an image. We
have to do this because the maximum size of a GIF palette is 256 colours.
Which is a lot less than images typically have!
The most popular way to do this, I think, is to use an algorithm like k-means
clustering. I didn't use k-means, so I won't talk about it. But it's nice to
know what's out there. If you want to know more, the
<a href="https://github.com/okaneco/kmeans-colors#1-basic-usage">kmeans-colors crate readme</a>
is pretty good. Also
<a href="https://en.wikipedia.org/wiki/Image_segmentation#Clustering_methods">this</a>
bit of Wikipedia.
<h2>colorsquash</h2>
I didn't use k-means. What I did is probably worse, but it was more fun!
I use a thing I wrote called <a href="https://github.com/mademast/colorsquash">colorsquash</a>.
Maybe the hardest part of quantization is picking the palette. What 256 colours
out of the ~16.8 million would be good to represent this image?
I don't know, which ones are most common? Yeah, really.
colorsquash counts up how many times each color occurs and then sorts them most common to least.
We don't want to just take the most common 256, that'd get a lot of very
similar colours, so it uses a tolerance to exclude some of them.
It will only add another colour to
the palette if it differs by more than 1.3% from every colour already in the palette. There
was no method to choosing this number. I slowly adjusted it until it either
stopped picking 6 colours, or stopped making the entire image barely different shades of
the same colour. Those are the two extremes that, I guess, 1.3% sits between.
<aside>
1.3% is a specific number to this application. it seems to give different
performance based on the contents of the image you're squashing.
</aside>
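The count, sort, and filter steps above can be sketched roughly like this. It's a minimal sketch and not colorsquash's actual code: <code>pick_palette</code> and <code>difference_percent</code> are hypothetical names, and measuring difference as Euclidean distance in RGB, as a percentage of the black-to-white distance, is an assumption.

```rust
use std::collections::HashMap;

type Rgb = [u8; 3];

/// Difference between two colours as a percentage of the largest possible
/// RGB distance (black to white). Assumption: plain Euclidean distance;
/// colorsquash may measure difference another way.
fn difference_percent(a: Rgb, b: Rgb) -> f32 {
    let sum_sq: f32 = a
        .iter()
        .zip(b.iter())
        .map(|(&x, &y)| {
            let d = x as f32 - y as f32;
            d * d
        })
        .sum();
    let max = (3.0f32 * 255.0 * 255.0).sqrt();
    sum_sq.sqrt() / max * 100.0
}

/// Pick up to `max_colours` colours, most common first, skipping any colour
/// that is within `tolerance` percent of one already picked.
fn pick_palette(pixels: &[Rgb], max_colours: usize, tolerance: f32) -> Vec<Rgb> {
    // Count how often each colour occurs.
    let mut counts: HashMap<Rgb, usize> = HashMap::new();
    for &p in pixels {
        *counts.entry(p).or_insert(0) += 1;
    }

    // Sort most common to least; ties break on the colour so the result is deterministic.
    let mut sorted: Vec<(Rgb, usize)> = counts.into_iter().collect();
    sorted.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));

    let mut palette: Vec<Rgb> = Vec::new();
    for (colour, _) in sorted {
        if palette.len() == max_colours {
            break;
        }
        // Only keep the colour if it is different enough from everything picked so far.
        if palette.iter().all(|&p| difference_percent(colour, p) > tolerance) {
            palette.push(colour);
        }
    }
    palette
}
```

Under that assumed metric, a tolerance that's too high rejects nearly everything after the first few picks, and one that's too low lets near-duplicates fill the palette, which matches the two extremes described above.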
There's more work to do after we pick the palette; the image itself still has to
be mapped to the new, very reduced colourspace. Initially, colorsquash looked
through the entire palette for every pixel and chose the most similar colour.
It was written to play with images from my DSLR which're roughly 6000x4000,
or 24 million pixels. If it checked all 256 selected colours with all those
pixels then it ends up running 6.1 billion calculations. That's too many!
Good, then, that a lot of the colours are the same. The most common
colour may occur tens of thousands of times. That's a lot of wasted
computation that doesn't need to happen; it only needs to compute the color
difference once really.
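Here's a small sketch of that idea: search the palette once per unique colour and reuse the answer for every repeat. The names are hypothetical, the <code>HashMap</code> is only a stand-in cache for illustration, and squared Euclidean distance in RGB is an assumption about what "most similar" means.

```rust
use std::collections::HashMap;

type Rgb = [u8; 3];

/// Index of the palette colour closest to `colour`.
/// Assumptions: squared Euclidean distance in RGB, and a palette of at most
/// 256 entries so the index fits in a `u8`.
fn nearest_index(palette: &[Rgb], colour: Rgb) -> u8 {
    let mut best = 0usize;
    let mut best_dist = u32::MAX;
    for (i, p) in palette.iter().enumerate() {
        let dist: u32 = p
            .iter()
            .zip(colour.iter())
            .map(|(&a, &b)| {
                let d = a as i32 - b as i32;
                (d * d) as u32
            })
            .sum();
        if dist < best_dist {
            best = i;
            best_dist = dist;
        }
    }
    best as u8
}

/// Map every pixel to a palette index, searching the palette only the first
/// time each unique colour shows up.
fn map_pixels(palette: &[Rgb], pixels: &[Rgb]) -> Vec<u8> {
    let mut seen: HashMap<Rgb, u8> = HashMap::new();
    pixels
        .iter()
        .map(|&p| *seen.entry(p).or_insert_with(|| nearest_index(palette, p)))
        .collect()
}
```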
So I traded space for time and allocated a <code>Vec&lt;u8&gt;</code> that was 256^3
in length and then stored the index of the selected colour at a location equal to
<code>red + green * 256 + blue * 256 * 256</code>. I'm more than happy to take
~16MB of ram to get an 11x speedup. at least that's what I think it ended up being?
I did all that a while ago ^^;;
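A minimal sketch of that table, using the indexing formula from above. <code>ColourMap</code> and its method names are made up for illustration and aren't colorsquash's API.

```rust
/// One palette index for every possible 24-bit colour: 256^3 bytes, about 16MB.
/// Hypothetical type for illustration only.
struct ColourMap {
    table: Vec<u8>,
}

impl ColourMap {
    /// Every slot starts out as palette index 0.
    fn new() -> Self {
        ColourMap {
            table: vec![0; 256 * 256 * 256],
        }
    }

    /// Where a colour lives in the table: red + green * 256 + blue * 256 * 256.
    fn position(colour: [u8; 3]) -> usize {
        colour[0] as usize + colour[1] as usize * 256 + colour[2] as usize * 256 * 256
    }

    /// Remember which palette index a colour maps to.
    fn set(&mut self, colour: [u8; 3], palette_index: u8) {
        self.table[Self::position(colour)] = palette_index;
    }

    /// Look up a colour's palette index with a single array read.
    fn get(&self, colour: [u8; 3]) -> u8 {
        self.table[Self::position(colour)]
    }
}
```

The lookup is one multiply-and-add plus one read, no matter how big the palette is.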
<h2>There's a problem</h2>
Okay okay I swear there was a point. Every frame from the webcam is decoded
into RGB and then sent through colorsquash to reduce the color palette. Only
one palette is used for the entire GIF. It's called the Global Color Table <i>(and
I'm mad you can't modify or swap it out on the fly, but whatever)</i>.
If I didn't use a Global Color Table I'd have
to pass a colour table on every image and that's 768 bytes I don't want to spend.
At 10fps that's an extra 7.68KB every second. gif is already inefficient so I really don't need to
help it use <i>more</i> data.
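The arithmetic, as a tiny sketch (the names are made up; the numbers are just 256 entries of 3 bytes each):

```rust
/// A full GIF colour table: 256 entries of 3 bytes (red, green, blue).
const COLOUR_TABLE_BYTES: usize = 256 * 3;

/// Extra bytes spent every second if each frame carried its own full colour table.
fn local_table_bytes_per_second(fps: usize) -> usize {
    fps * COLOUR_TABLE_BYTES
}
```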
The palette is picked once on the first frame because it's an expensive operation.
At my target framerate of 5fps we only get 200ms to do everything we need. That's a
pretty good bit of time. "Pick a palette", however, is not all that needs to happen, so every frame after the
first would <i>only</i> map the image <i>(not picking a whole new palette)</i>.
<details>
<summary>CW for body horror? The area around my eyes and mouth is just a noisy black, so</summary>
<img src="close_candle.gif" style="margin: 0px auto;"
alt="A very noisy gif of myself looking at and then blowing out a candle. The candle itself is emanating light; it's casting a sphere of light around itself, which you can see in the gif. My white face is lit pretty well, but there is a lot of pure black noise covering the frame especially concentrated around my eyes and mouth. It almost looks like it's coming from the candle. When I blow out the candle, the entire frame goes out." />
</details>
Well thaaaaat's not right. I fought with this for a while and produced a number
of gifs in the same style. It's kind of beautiful! Image bugs are my favorite.
Here's one I liked quite a bit.
<i>( a friend suggested I try black lipstick after seeing this one :3 )</i>
<details>
<summary>CW for body horror for the same reason.</summary>
<img src="goth.gif" style="margin: 0px auto;"
alt="a similarly broken gif as the last. I have my eyes closed facing the camera, head slowly angling downward. I think I was singing? The black noise appears and covers my eyes and mouth and nose." />
</details>
I don't know if you can see. Can you? The first frame? It looks fine. Great! in fact. What on earth.
Remember how I said it wasn't picking the palette on every frame?
Well <i>that</i> has come back to bite! When the colorspace is mapped into the
really-kind-of-quite-large <code>Vec</code>, it <i>only</i> maps the colors it's
seen already; it only assigns a palette-index to any colour seen in the first
frame. The rest are left just as they started: index 0. I set index 0 as
black myself so I could have the top/bottom bars because then the gif could be
square <i>(and i like squares)</i>!
So I fixed it! For every unique color in every image, it finds the closest colour
in the palette and adds it to the colour map. Maybe I'll do the entire
16.8 million colours on the first frame, it might be smart, but that's how it is
right now. So here's my good face in glorious 256 colours.
<img src="yay_color.gif" style="margin: 1em auto; width: 100%; max-width: 20em;"
alt="My smiling face looking at the camera, waving, and looking back down at my integrated terminal in VS Code (unseen). It actually loops pretty well. The colours are finally correct. There is no kind of disturbing black noise and it looks like a regular, short-fps video. just in 256 colours." />
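If you'd like to see the shape of that fix, here's a rough sketch. It is not the real colorsquash code: the names are hypothetical, the palette search is passed in as a closure so the sketch stays short, and marking unmapped slots with <code>None</code> is an assumption (it doubles the table to about 32MB, but a slot that was never filled can no longer pass for index 0).

```rust
type Rgb = [u8; 3];

/// Where a colour lives in the table: red + green * 256 + blue * 256 * 256.
fn position(colour: Rgb) -> usize {
    colour[0] as usize + colour[1] as usize * 256 + colour[2] as usize * 256 * 256
}

/// A slot for every 24-bit colour. Assumption: `None` means "not mapped yet".
fn new_colour_map() -> Vec<Option<u8>> {
    vec![None; 256 * 256 * 256]
}

/// Map one frame to palette indices. A colour that hasn't been seen in any
/// earlier frame gets its closest palette entry computed and stored first.
/// `nearest` is whatever palette search you already have.
fn map_frame(map: &mut [Option<u8>], pixels: &[Rgb], nearest: impl Fn(Rgb) -> u8) -> Vec<u8> {
    pixels
        .iter()
        .map(|&p| {
            let slot = &mut map[position(p)];
            *slot.get_or_insert_with(|| nearest(p))
        })
        .collect()
}
```

The same map is reused across frames, so each colour only costs a palette search the first time the webcam produces it.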