Posts: 3
Threads: 1
Joined: Jun 2019
Reputation:
0
Operating system(s):
- Windows (Vista and later)
Gimp version: 2.10
Hi
I'm writing a Python script to clean up scanned text documents (I'll have to clean up thousands of pages).
Is there a way to determine the darkest pixel in the image?
I want to use "pdb.gimp_image_select_color" with the darkest pixel, which is not always pure black (0, 0, 0).
Posts: 6,336
Threads: 271
Joined: Oct 2016
Reputation:
562
Operating system(s):
Gimp version: 2.10
06-10-2019, 09:09 AM
(This post was last modified: 06-10-2019, 09:57 AM by Ofnuts.)
No direct way as far as I can tell. You can make several calls to the Gimp histogram, adjusting the upper limit of the range until you no longer get any pixels in the range.
The call is:
Code:
mean, std_dev, median, pixels, count, percentile = pdb.gimp_drawable_histogram(drawable, channel, start_range, end_range)
The values you are interested in are count and/or percentile (as far as I can tell, count=pixels*percentile).
You use:
Code:
_,_,_,_, count,_ = pdb.gimp_drawable_histogram(drawable, HISTOGRAM_VALUE, 0.,max)
and you try successive max values (with a dichotomic, i.e. binary, search you'll never need more than eight calls to separate the 256 levels of an 8-bit channel), something like:
Code:
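def blackest(drawable):
    # Binary search for the lowest value whose range [0, value] contains pixels
    bot, top = 0., 1.
    while top - bot > .001:
        print "%5.3f < x < %5.3f" % (bot, top)
        threshold = (top + bot) / 2.
        _, _, _, _, count, _ = pdb.gimp_drawable_histogram(drawable, HISTOGRAM_VALUE, 0., threshold)
        print "%5.3f px @ %5.3f" % (count, threshold)
        if count:
            # Some pixels are at or below the threshold: look lower
            top = threshold
        else:
            # No pixels that dark: look higher
            bot = threshold
    # top is the lowest limit known to contain pixels
    return top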
However, doing a color selection on the result may select only a single pixel and may not give you the result you want. What is the whole process?
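The bisection idea above can be tried outside Gimp. The following is a minimal sketch, not the code from the post: `darkest_value` and `count_up_to` are hypothetical names, and `count_up_to` stands in for the `pdb.gimp_drawable_histogram` call by counting values in a plain list of assumed pixel values.

```python
def darkest_value(count_up_to, tolerance=0.001):
    """Bisect [0, 1] for the lowest threshold whose range [0, threshold] holds pixels.

    count_up_to is a hypothetical callable: it takes a threshold and returns
    how many pixels have a value at or below it.
    """
    bot, top = 0.0, 1.0
    while top - bot > tolerance:
        threshold = (top + bot) / 2.0
        if count_up_to(threshold) > 0:
            top = threshold   # pixels found: the darkest one is at or below threshold
        else:
            bot = threshold   # nothing that dark: the darkest one is above threshold
    return top                # always a limit known to contain pixels


# Stand-in for the histogram call: made-up pixel values in the 0..1 range.
pixels = [0.12, 0.5, 0.93, 0.3]


def count_up_to(threshold):
    return sum(1 for p in pixels if p <= threshold)


estimate = darkest_value(count_up_to)
```

Here `estimate` ends up within the tolerance above the darkest value in the list (0.12). Returning `top` rather than the last midpoint guarantees the returned range is not empty.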
Hi Ofnuts, thanks for taking the time to respond, and for moving this to the right board.
We are scanning very old documents that were typed on a typewriter. We scan them at 600 dpi in grayscale. The mission now is to clean up the scans and reduce the file size as much as possible (300 dpi, black and white) so that we can share these documents. Some documents are in very bad shape, with a lot of noise, etc.
Pointing me to the histogram solved my problem. Based on the percentile, I decided to find the range that is used the most in the lower part of the histogram, and to set all pixels up to that range to black.
Code:
def SetBlack(image, drawable):
    # Find the "color" that is used the most in the "darker" side of the histogram - this is the black of the text
    MaxPercentile = 0.0
    MaxEndRange = 0.0
    Increment = 0.025  # 2.5% increase
    start_range = 0.0
    end_range = Increment
    # Channel 0 is the value channel (HISTOGRAM_VALUE)
    mean, std_dev, median, pixels, count, percentile = pdb.gimp_drawable_histogram(drawable, 0, start_range, end_range)
    if percentile > MaxPercentile:
        MaxPercentile = percentile
        MaxEndRange = end_range
    for x in range(0, 20):
        start_range = end_range
        end_range = end_range + Increment
        mean, std_dev, median, pixels, count, percentile = pdb.gimp_drawable_histogram(drawable, 0, start_range, end_range)
        if percentile > MaxPercentile:
            MaxPercentile = percentile
            MaxEndRange = end_range
        #pdb.gimp_message(percentile)
    MaxEndRange = MaxEndRange * 1.2  # Add 20%
    #pdb.gimp_message(MaxPercentile)
    #pdb.gimp_message(MaxEndRange)
    # Curve through (0, 0), (MaxEndRange, 0) and (0.9, 1): everything up to MaxEndRange becomes black
    pdb.gimp_drawable_curves_spline(drawable, 0, 6, (0.0, 0.0, MaxEndRange, 0.0, 0.9, 1.0))
Apologies for my coding style; I'm new to Gimp and Python.
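The peak search in SetBlack can be illustrated without Gimp. This is a rough sketch under stated assumptions, not the script above: `dark_peak_cutoff` is a hypothetical name, the histogram call is replaced by counting values in a plain list, and the bins are half-open (`start <= v < end`), whereas consecutive ranges passed to the Gimp histogram share their boundary value.

```python
def dark_peak_cutoff(values, increment=0.025, bins=21, margin=1.2):
    """Find the most populated bin in the dark part of the range, plus a margin.

    values: pixel values in 0..1 (made-up data standing in for the drawable).
    Returns the upper end of the busiest bin multiplied by margin.
    """
    best_count = 0
    best_end = 0.0
    for i in range(bins):
        start = i * increment
        end = start + increment
        # Count the values falling in this bin (half-open interval).
        count = sum(1 for v in values if start <= v < end)
        if count > best_count:
            best_count = count
            best_end = end
    return best_end * margin


# Made-up scan: a cluster of dark "ink" values and a lot of bright "paper".
sample = [0.11] * 5 + [0.12] * 4 + [0.31] * 2 + [0.9] * 50
cutoff = dark_peak_cutoff(sample)
```

With 21 bins of 0.025 the search stops at 0.525, like the original loop, so the bright paper values never win. For this sample the busiest dark bin ends at 0.125 and the cutoff comes out at about 0.15.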
06-10-2019, 01:33 PM
(This post was last modified: 06-10-2019, 01:34 PM by Ofnuts.)
Looks like you are re-inventing the automatic contrast stretch, which is available as either of:
Code:
pdb.gimp_drawable_levels_stretch(drawable)
pdb.plug_in_autostretch_hsv(image, drawable)
Also, if you batch-process, using ImageMagick instead of Gimp is likely to be a better idea.
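For readers unfamiliar with the term, the general idea of a linear contrast stretch can be sketched as below. This is a simplified illustration on a plain list of values, not Gimp's actual algorithm (which may, for instance, treat channels separately or ignore outliers); `stretch_contrast` is a hypothetical name.

```python
def stretch_contrast(values):
    """Linearly remap values so the darkest becomes 0.0 and the brightest 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Flat input: nothing to stretch, return a copy unchanged.
        return list(values)
    scale = float(hi - lo)
    return [(v - lo) / scale for v in values]


# Made-up values using only the middle of the range.
stretched = stretch_contrast([0.25, 0.5, 0.75])
```

After the stretch the sample values span the full 0..1 range, which is why the darkest text pixels end up at or near black without searching for them explicitly.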
Thanks, will look into these. I don't want to re-invent something.
Yes, I thought that ImageMagick might be the way to go when I came across it while looking for an auto de-skew routine.
Now for a few more sleepless nights, playing with a new program.