Gimp-Forum.net
Get list of distinct pixel RGB values

Get list of distinct pixel RGB values - thetalkietoaster - 03-16-2023

Hi all,

I'm attempting to write a python script that palette swaps an un-indexed image, as sample colorize doesn't quite seem to have a way to force zero interpolation between colours.

I've got code that iterates across a pixel region pixel-by-pixel using the RowIterator from the colorxhtml.py script, finds the unique colours, arranges them by brightness, then uses gimp_image_select_color and gimp_edit_bucket_fill to replace the colours from drawable 1 with the colours from drawable 2, exactly, in order. All good.

The problem is iterating across the drawables takes way longer than it should - about 30 seconds for a ~250x250px layer. That's a bit much. Is there a way to get a list of unique pixel RGB values easily (ignoring transparency)? The histogram functions seem to work on individual channels.

I guess I could do a nested histogram in 1 channel, select areas for each of the values it took, then histogram each of those sub-areas in the next channel, then repeat again, but that seems like it'd be even less efficient!

Edit: Argh, sorry, I meant to post this in scripting! I don't seem to have the rights to delete it and repost, apologies.


RE: Get list of distinct pixel RGB values - Ofnuts - 03-17-2023

I would like to see your code, because even though the colorxhtml script isn't that optimized, it cannot be that bad.

This code:
Code:
image=gimp.image_list()[0]
layer=image.active_layer
region=layer.get_pixel_rgn(0, 0, 400,400)

from collections import defaultdict
colors=defaultdict(int)
for x in range(400):
   for y in range(400):
       colors[region[x,y]]+=1
len(colors)

runs in 5 seconds and reports the same number of colors as the color cube analysis.

It runs even faster (a couple of seconds) with an image reduced to 256 colors...
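
As a possible variation (a sketch, not something timed in this thread): the whole region can be read as one byte string and split in plain Python, so there is only one pixel-region access instead of one per pixel. The helper name `count_colours` is hypothetical, and the sketch assumes that `region[0:width, 0:height]` returns the raw pixel data with `bpp` bytes per pixel; the literal below merely stands in for that data.

```python
from collections import Counter

def count_colours(raw, bpp):
    """Count distinct RGB triples in a raw pixel buffer (bpp bytes per pixel)."""
    data = bytearray(raw)  # iterating a bytearray yields ints on Python 2 and 3
    counts = Counter()
    for start in range(0, len(data), bpp):
        # Keep only R, G, B; a trailing alpha byte, if present, is ignored.
        counts[tuple(data[start:start + 3])] += 1
    return counts

# Stand-in for region[0:width, 0:height] on a 2x2 RGB layer (assumption).
raw = b"x\x89|" + b"x\x89|" + b"\x80\x80\x80" + b"x\x89|"
colours = count_colours(raw, 3)
print(len(colours))
```

The keys are `(r, g, b)` tuples of integers, so they can be summed or sorted directly afterwards.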

Each color you obtain is a 3-byte string, such as:
Code:
> region[0,0]
'x\x89|'

which means:
  • Red is 0x78 = 120 ("x" in ASCII)
  • Green is 0x89 = 137
  • Blue is 0x7C = 124 ("|" in ASCII)
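
For illustration, the same decoding can be done with the standard `struct` module. This is only a small sketch; the `b` prefix is there so the snippet behaves the same outside the GIMP console.

```python
import struct

pixel = b"x\x89|"  # the 3-byte value shown above

# "BBB" means three unsigned bytes: red, green, blue.
red, green, blue = struct.unpack("BBB", pixel)
print("%d %d %d" % (red, green, blue))  # 120 137 124
```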

(To get a representative number of colors, my test layer was a flat gray (0x80) to which I added some low RGB noise.)

You can of course write into the pixel region and return it as a new layer.
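
A rough sketch of what the writing side could look like for a palette swap, working on the raw bytes only. The helper `remap_pixels` and its `mapping` argument are hypothetical names. Inside GIMP, the result would presumably be assigned back to a slice of the pixel region; that step is an assumption and is not shown here.

```python
def remap_pixels(raw, bpp, mapping):
    """Return a copy of raw pixel data with RGB triples replaced per mapping."""
    data = bytearray(raw)
    for start in range(0, len(data), bpp):
        rgb = tuple(data[start:start + 3])
        if rgb in mapping:
            # Overwrite R, G, B in place; any alpha byte is left untouched.
            data[start:start + 3] = bytearray(mapping[rgb])
    return bytes(data)

# Two RGB pixels: black becomes pure red, white is not in the mapping.
swapped = remap_pixels(b"\x00\x00\x00\xff\xff\xff", 3, {(0, 0, 0): (255, 0, 0)})
```

Because each colour is looked up exactly, there is no interpolation between palette entries.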

If you want really fast processing you can use numpy (so iterations are done by C code), but of course it isn't part of your regular Python runtime, and adding it to the Gimp Python runtime on Windows may be an ordeal for your prospective users.


RE: Get list of distinct pixel RGB values - thetalkietoaster - 03-23-2023

Sorry for the very late reply - managed to completely forget about this after I fixed it :/. The code's on GitHub, and it's been revised quite a bit since I posted this. Originally, I was calculating the sum RGB for each pixel and then using that as the dict key, but that was clearly a daft move when I could just do the summing afterwards.

The code I ended up with is below, with changes to account for 1. some palettes having multiple colours of the same sum RGB, and 2. some of the images having 1-2 stray pixels out of the palette. Fixing those meant switching to using pixel RGB as key like you, which then solved the performance issues.
Code:
def extract_sorted_palette(
    layer, include_transparent, count_threshold,
    current_progress, progress_fraction,
):
    """
    Extracts a palette from an image, by finding the discrete RGB values
    and then sorting them by total R+G+B value.
    """
    palette_counts = {}    
    progress_step = progress_fraction / layer.height

    region = layer.get_pixel_rgn(
        0, 0, layer.width, layer.height
    )

    for index_row in range(0, layer.height):
        for pixel in RowIterator(region[0:layer.width, index_row], layer.bpp):
            colour_rgb = pixel[0:3]

            if layer.has_alpha and pixel[3] == 0 and not include_transparent:
                continue

            elif colour_rgb not in palette_counts:
                palette_counts[colour_rgb] = 1

            else:
                palette_counts[colour_rgb] += 1

        gimp.progress_update(current_progress + progress_step * index_row)

    # Now we've counted all the pixel colours, discard outliers and sort
    palette = {}
    for colour_rgb, colour_count in palette_counts.items():
        colour_sum = sum(colour_rgb)

        if colour_count > count_threshold:
            if colour_sum in palette:
                if colour_rgb != palette[colour_sum]:
                    colour_duplicate = palette[colour_sum]
                    raise KeyError(
                        "Multiple colours in layer with same total RGB values: " +
                        str(colour_rgb) + " (" + str(colour_count) + " pixels) and " +
                        str(colour_duplicate) + " (" + str(palette_counts[colour_duplicate]) + " pixels). " +
                        "Cannot automatically sort colours by brightness. " +
                        "Try increasing the 'ignore colours with less than this many pixels' setting " +
                        "to drop stray pixels."
                    )
            else:
                palette[colour_sum] = colour_rgb

    sorted_palette = [
        palette[key] for key in sorted(list(palette.keys()))
    ]
    return sorted_palette
Though looking at your example, I can't believe I forgot about defaultdict; I'll fix that. I should also be sorting by actual perceived brightness rather than the plain R+G+B sum.
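
One way the brightness idea might look, as a sketch only: sort by a weighted luma instead of the plain sum. The weights below are the Rec. 601 luma coefficients, one common approximation of perceived brightness; the function name `sort_by_luma` is hypothetical, and the input is assumed to be a mapping from `(r, g, b)` tuples to pixel counts like `palette_counts` above.

```python
from collections import Counter

def sort_by_luma(colour_counts, count_threshold=0):
    """Return colours seen more than count_threshold times, darkest first."""
    def luma(rgb):
        r, g, b = rgb
        return 0.299 * r + 0.587 * g + 0.114 * b  # Rec. 601 weights

    kept = [rgb for rgb, n in colour_counts.items() if n > count_threshold]
    return sorted(kept, key=luma)

# Three colours with the same R+G+B sum, plus one stray pixel to discard.
counts = Counter({(255, 0, 0): 10, (0, 255, 0): 10, (0, 0, 255): 10, (9, 9, 9): 1})
palette = sort_by_luma(counts, count_threshold=1)
```

Sorting a list with a key function also sidesteps the dictionary keyed by sum, so two colours with the same total no longer collide, although two colours can still tie on luma.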


RE: Get list of distinct pixel RGB values - teapot - 03-24-2023

I just skimmed the function extract_sorted_palette that you pasted into your post and compared it to your recent GitHub version. Did you mean to change the logic when you reversed the if statement?

From:

if layer.has_alpha and pixel[3] == 0 and not include_transparent:
    continue

To:

if include_transparent or layer.has_alpha and pixel[3] > 0:
    up the count

On a brief look, that doesn't seem right when include_transparent is False: for a layer without an alpha channel, the condition is never true, so nothing gets counted. Would you want something more like this:

if include_transparent or not layer.has_alpha or pixel[3] > 0:
    up the count
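
To double-check the three conditions, here is a small brute-force comparison over every combination of inputs. It is only a sketch: the function names are made up for the comparison, and `has_alpha` stands in for `layer.has_alpha`.

```python
from itertools import product

def counted_original(has_alpha, pixel, include_transparent):
    # Posted version: skip the pixel when this is true, count it otherwise.
    skip = has_alpha and pixel[3] == 0 and not include_transparent
    return not skip

def counted_reversed(has_alpha, pixel, include_transparent):
    # Condition as it appears after the if statement was reversed.
    return bool(include_transparent or has_alpha and pixel[3] > 0)

def counted_suggested(has_alpha, pixel, include_transparent):
    # Condition proposed in this post.
    return bool(include_transparent or not has_alpha or pixel[3] > 0)

mismatches = set()
for has_alpha, include_transparent, alpha in product([False, True], [False, True], [0, 255]):
    # Layers without alpha have 3-component pixels; short-circuiting avoids pixel[3].
    pixel = (10, 20, 30, alpha) if has_alpha else (10, 20, 30)
    args = (has_alpha, pixel, include_transparent)
    assert counted_suggested(*args) == counted_original(*args)
    if counted_reversed(*args) != counted_original(*args):
        mismatches.add((has_alpha, include_transparent))

print(sorted(mismatches))
```

The only disagreement is the case without an alpha channel and with include_transparent set to False, where the reversed condition counts nothing.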