Let's make some observations of the effects of changing the DPI:
DPI 1000 Height=1970 Width=1970 # Spots=140625 Raw pixels: 3880900
DPI 10000 Height=19690 Width=19690 # Spots=140625 Raw pixels: 387696100
We can see that while the number of spots drawn stays essentially constant (it varies slightly because of the rounding in your calculations, but for all intents and purposes we can treat it as constant), the raw pixel count of the generated raster image grows quadratically with the DPI. A vector representation therefore seems desirable, since it is freely scalable (with quality depending on the capabilities of the renderer).
Unfortunately, the way you generate the SVG is flawed: you've basically turned it into an extremely inefficient raster representation, because you generate a rectangle for each individual pixel (even for those that are technically background). Consider that an 8-bit grayscale image, such as the PNGs you generate, needs 1 byte to represent a raw pixel. Your SVG representation of a single pixel, on the other hand, looks something like this:
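To make the quadratic growth concrete, here is a small sketch that recomputes the raster dimensions of the 50 mm x 50 mm sample at both DPI values. It mirrors the rounding used in the code further down but relies only on the standard library; the helper name `raster_side_px` is made up for this illustration.

```python
import math

INCHES_PER_MM = 0.03937  # same conversion factor as in the code further down


def raster_side_px(side_mm, dpi):
    """Side length in pixels, bumped up to the next multiple of 10."""
    side = math.ceil(side_mm * INCHES_PER_MM * dpi)
    return side + 10 - side % 10


for dpi in (1000, 10000):
    side = raster_side_px(50, dpi)
    print(f"DPI {dpi}: {side} x {side} = {side * side} raw pixels")
```

A tenfold increase in DPI multiplies the pixel count by roughly a hundred, while the spot count stays put.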
<rect fill="rgb(255,255,255)" height="1" width="1" x="12345" y="15432" />
At roughly 70 bytes per pixel, for images that run to several (or several hundred) megapixels, this is clearly not the way to go.
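A rough back-of-the-envelope estimate shows how bad this gets. The sketch below simply multiplies the length of the `<rect>` element shown above by the pixel counts observed earlier. It ignores the file header and the varying number of digits in `x` and `y`, so treat the result as an order-of-magnitude figure only.

```python
RECT = '<rect fill="rgb(255,255,255)" height="1" width="1" x="12345" y="15432" />'

# Size of one per-pixel element, in bytes
bytes_per_pixel = len(RECT.encode("utf-8"))

for dpi, raw_pixels in ((1000, 3880900), (10000, 387696100)):
    svg_bytes = bytes_per_pixel * raw_pixels
    print(f"DPI {dpi}: ~{svg_bytes / 1e6:.0f} MB as per-pixel SVG "
          f"vs ~{raw_pixels / 1e6:.0f} MB as raw 8-bit grayscale")
```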
However, let's recall that the number of spots doesn't depend on DPI. Can we just represent the spots in some efficient way? Well, the spots are actually circles, parametrized by position, radius and colour. SVG supports circles, and their representation looks like this:
<circle cx="84" cy="108" fill="rgb(0,0,0)" r="2" />
Let's look at the effects of changing the DPI now.
DPI 1000 # Spots=140625 Raw pixels: 3880900 SVG size: 7435966
DPI 10000 # Spots=140625 Raw pixels: 387696100 SVG size: 7857942
The slight increase in size is due to the increased range of position/radius values (more digits per number).
I somewhat refactored your code example. Here's the result that demonstrates the SVG output.

    import numpy as np
    import cv2
    import svgwrite

    MM_IN_INCH = 0.03937  # conversion factor: 1 mm = 0.03937 in


    def round_int_to_10s(value):
        # Move up to the next multiple of 10 (exact multiples also move up by 10)
        int_value = int(value)
        return int_value + 10 - int_value % 10


    def get_sizes_pixels(height_mm, width_mm, pattern_size_mm, dpi):
        dpmm = MM_IN_INCH * dpi  # dots per mm
        width_px = round_int_to_10s(np.ceil(width_mm * dpmm))
        height_px = round_int_to_10s(np.ceil(height_mm * dpmm))
        pattern_size_px = pattern_size_mm * dpmm
        return height_px, width_px, pattern_size_px


    def get_grid_positions(size, pattern_size, density):
        count = int(density * size / pattern_size)  # get number of patterns possible
        if count == 1:
            return [size // 2]
        return [int(i * size / (count + 1)) for i in range(1, count + 1)]


    def get_spot_grid(height_px, width_px, pattern_size_px, density):
        vertical = get_grid_positions(height_px, pattern_size_px, density)
        horizontal = get_grid_positions(width_px, pattern_size_px, density)
        return vertical, horizontal


    def generate_spots(vertical, horizontal, pattern_size, density, variation):
        spots = []
        noise_halfspan = 2 * pattern_size / density
        noise_min, noise_max = (-noise_halfspan, noise_halfspan)
        for i in vertical:
            for j in horizontal:
                # generate the noisy information
                center = tuple(map(int, (j, i) + variation * np.random.randint(noise_min, noise_max, 2)))
                d = int(pattern_size + pattern_size * variation * (np.random.rand() - 0.5) / 2)
                spots.append((center, d // 2))  # add circle params
        return spots


    def render_raster(height, width, spots):
        im = 255 * np.ones((height, width), dtype=np.uint8)  # white background
        for center, radius in spots:
            cv2.circle(im, center, radius, 0, -1)  # add filled black circle
        return im


    def render_svg(height, width, spots):
        dwg = svgwrite.Drawing(profile='tiny', size=(width, height))
        fill_color = svgwrite.utils.rgb(0, 0, 0)
        for center, radius in spots:
            dwg.add(dwg.circle(center, radius, fill=fill_color))  # add circle
        return dwg.tostring()


    # INPUTS #
    ############
    dpi = 100  # dots per inch
    WidthOfSample_mm = 50  # mm
    HeightOfSample_mm = 50  # mm
    PatternSize_mm = 1  # mm
    density = 0.75  # 1 is very dense, 0 is not fine at all
    Variation = 0.75  # 1 is very bad, 0 is very good
    ############

    height, width, pattern_size = get_sizes_pixels(HeightOfSample_mm, WidthOfSample_mm, PatternSize_mm, dpi)
    vertical, horizontal = get_spot_grid(height, width, pattern_size, density)
    spots = generate_spots(vertical, horizontal, pattern_size, density, Variation)

    img = render_raster(height, width, spots)
    svg = render_svg(height, width, spots)

    print(f"Height={height} Width={width} # Spots={len(spots)}")
    print(f"Raw pixels: {img.size}")
    print(f"SVG size: {len(svg)}")

    cv2.imwrite("timo.png", img)
    with open("timo.svg", "w") as f:
        f.write(svg)

This generates the following output:
[Images: PNG output | rendered SVG]
Note: Since it's not possible to upload SVGs here, I put it on pastebin, and provide a capture of it as rendered by Firefox.
Further improvements to the size of the SVG are possible. For example, we're currently repeating the same colour over and over. Styling or grouping should help remove this redundancy.
Here's an example that groups all the spots in one group with constant fill colour:

    def render_svg(height, width, spots):
        dwg = svgwrite.Drawing(profile='tiny', size=(width, height))
        # One group carries the fill colour for all the circles inside it
        dwg_spots = dwg.add(dwg.g(id='spots', fill='black'))
        for center, radius in spots:
            dwg_spots.add(dwg.circle(center, radius))  # add circle
        return dwg.tostring()

The output looks the same, but the file is now 4904718 bytes instead of 7435966 bytes.
Alternatively (as pointed out by AKX), if you only want to draw in black, you can omit the fill specification as well as the grouping, since the default SVG fill colour is black.
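As a minimal illustration of that alternative, the sketch below writes the circles with no `fill` attribute at all. To keep it dependency-free it builds the markup by hand rather than through svgwrite; the function name `circles_without_fill` and the sample spots are made up for this example.

```python
def circles_without_fill(height, width, spots):
    """Build an SVG string by hand; the circles carry no fill attribute."""
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
    ]
    for (cx, cy), r in spots:
        parts.append(f'<circle cx="{cx}" cy="{cy}" r="{r}"/>')
    parts.append("</svg>")
    return "".join(parts)


# Made-up sample spots in the ((x, y), radius) form used above
demo_spots = [((84, 108), 2), ((12, 30), 1)]
svg = circles_without_fill(200, 200, demo_spots)
print(svg)
```

With svgwrite, the equivalent change would be to add each `dwg.circle(center, radius)` directly to the drawing, without the group.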
The next thing to notice is that most of the spots have the same radius. In fact, using your settings, at a DPI of 1000 the unique radii are [1, 2], and at a DPI of 10000 they are [15, 16, 17, 18, 19, 20, 21, 22, 23].
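Radii like these can be tallied with a couple of lines. The sketch below assumes the `((x, y), radius)` tuples that `generate_spots` returns and uses a made-up sample so that it runs on its own.

```python
from collections import Counter

# Made-up sample in the same ((x, y), radius) form that generate_spots returns
spots = [((84, 108), 2), ((12, 30), 1), ((40, 77), 2), ((63, 15), 2)]

radius_counts = Counter(radius for _, radius in spots)
print(sorted(radius_counts))        # the distinct radii
print(radius_counts.most_common())  # how many spots share each radius
```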
How could we avoid repeatedly specifying the same radius? (As far as I can tell, we can't use groups to specify it.) In fact, how can we avoid repeatedly specifying that it's a circle? Ideally we'd just tell it "Draw this mark at all of those positions" and just provide a list of points.
Turns out there are two features of SVG that let us do exactly that. First of all, we can specify custom markers, and later refer to them by an ID.

    <marker id="id1" markerHeight="2" markerWidth="2" refX="1" refY="1">
      <circle cx="1" cy="1" fill="black" r="1" />
    </marker>

Second, the polyline element can optionally draw markers at every vertex of the polyline. If we draw the polyline with no stroke and no fill, all we end up with is the markers.
<polyline fill="none" marker-end="url(#id1)" marker-mid="url(#id1)" marker-start="url(#id1)"
points="2,5 8,22 11,26 9,46 8,45 2,70 ... and so on" stroke="none" />
Here's the code:

    def group_by_radius(spots):
        # Map each distinct radius to the list of centers that use it
        radii = {r for _, r in spots}
        groups = {r: [] for r in radii}
        for c, r in spots:
            groups[r].append(c)
        return groups


    def render_svg_v2(height, width, spots):
        dwg = svgwrite.Drawing(profile='full', size=(width, height))

        by_radius = group_by_radius(spots)
        dwg_grp = dwg.add(dwg.g(stroke='none', fill='none'))
        for r, centers in by_radius.items():
            # One marker per radius: a circle centered in a 2r x 2r marker box
            dwg_marker = dwg.marker(id=f'r{r}', insert=(r, r), size=(2*r, 2*r))
            dwg_marker.add(dwg.circle((r, r), r=r))
            dwg.defs.add(dwg_marker)
            # Invisible polyline whose vertices are the spot centers
            dwg_line = dwg_grp.add(dwg.polyline(centers))
            dwg_line.set_markers((dwg_marker, dwg_marker, dwg_marker))

        return dwg.tostring()

The output SVG still looks the same, but now the filesize at DPI of 1000 is down to 1248852 bytes.
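To get a feel for the structure of such a file, here is a tiny hand-written document using the same marker-plus-polyline idea. The id `dot`, the three points and the radius of 3 are made up for this sketch, and the Python wrapper only checks that the markup is well-formed.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Three made-up spot centres, all drawn with a radius-3 marker
SVG_DOC = """<svg xmlns="http://www.w3.org/2000/svg" width="60" height="40">
  <defs>
    <marker id="dot" markerWidth="6" markerHeight="6" refX="3" refY="3">
      <circle cx="3" cy="3" r="3" fill="black"/>
    </marker>
  </defs>
  <polyline points="10,10 30,25 50,12" fill="none" stroke="none"
            marker-start="url(#dot)" marker-mid="url(#dot)" marker-end="url(#dot)"/>
</svg>
"""

root = ET.fromstring(SVG_DOC)  # raises ParseError if the markup is malformed
polyline = root.find(f"{{{SVG_NS}}}polyline")
print(len(polyline.get("points").split()), "marker positions")
```

Saving `SVG_DOC` to a file and opening it in a browser should show three dots. Note that the marker box is 2r by 2r with its reference point at the centre; marker content outside that box is clipped by default, which is why the marker size has to match the circle.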
With high enough DPI, a lot of the coordinates will be 3, 4 or even 5 digits. If we bin the coordinates into tiles of 100 or 1000 pixels, we can then take advantage of the use element, which lets us apply an offset to the referenced object. Thus, we can limit the polyline coordinates to 2 or 3 digits at the cost of some extra overhead (which is generally worth it).
Here's an initial (clumsy) implementation of that:

    def bin_points(points, bin_size):
        # Map each tile origin to the offsets of the points inside that tile
        bins = {}
        for x, y in points:
            # Clamp at 0 so slightly negative coordinates land in the first tile
            tile = (max(0, x // bin_size), max(0, y // bin_size))
            base = (tile[0] * bin_size, tile[1] * bin_size)
            offset = (x - base[0], y - base[1])
            if base not in bins:
                bins[base] = []
            bins[base].append(offset)
        return bins


    def render_svg_v3(height, width, spots, bin_size):
        dwg = svgwrite.Drawing(profile='full', size=(width, height))

        by_radius = group_by_radius(spots)
        dwg_grp = dwg.add(dwg.g(stroke='none', fill='none'))
        polyline_counter = 0
        for r, centers in by_radius.items():
            dwg_marker = dwg.marker(id=f'm{r}', insert=(r, r), size=(2*r, 2*r))
            dwg_marker.add(dwg.circle((r, r), r=r, fill='black'))
            dwg.defs.add(dwg_marker)

            # Set the markers once on a group instead of on every polyline
            dwg_marker_grp = dwg_grp.add(dwg.g())
            marker_iri = dwg_marker.get_funciri()
            for kind in ['start', 'end', 'mid']:
                dwg_marker_grp[f'marker-{kind}'] = marker_iri

            bins = bin_points(centers, bin_size)
            for base, offsets in bins.items():
                # Polyline with tile-relative coordinates, placed via <use>
                dwg_line = dwg.defs.add(dwg.polyline(id=f'p{polyline_counter}', points=offsets))
                polyline_counter += 1
                dwg_marker_grp.add(dwg.use(dwg_line, insert=base))

        return dwg.tostring()

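To see what the binning does to the coordinates, here is a small worked example. It uses a condensed restatement of `bin_points` with the same logic so that the snippet runs on its own; the input points are made up.

```python
def bin_points(points, bin_size):
    """Group points by tile; return {tile origin: [offsets within the tile]}."""
    bins = {}
    for x, y in points:
        # Clamp at 0 so slightly negative coordinates stay in the first tile
        base = (max(0, x // bin_size) * bin_size, max(0, y // bin_size) * bin_size)
        bins.setdefault(base, []).append((x - base[0], y - base[1]))
    return bins


# Made-up centres, including one that noise pushed just off the canvas
centres = [(123, 456), (178, 402), (5, 7), (-3, 250)]
for base, offsets in bin_points(centres, 100).items():
    print(base, offsets)
```

Each tile origin becomes the offset of one `use` element, and the polyline it references only needs the short tile-relative coordinates.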
With the bin size set to 100 and a DPI of 1000, we get to a file size of 875012 bytes, which means about 6.22 bytes per spot. That's not so bad for an XML-based format. With a DPI of 10000 we need a bin size of 1000 to make a meaningful improvement, which yields something like 1349325 bytes (~9.6 B/spot).
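For comparison, the file sizes reported throughout this answer at a DPI of 1000 can be turned into bytes per spot. The snippet below uses only those reported numbers; the labels are just shorthand for the variants described above.

```python
SPOTS = 140625  # spot count reported above

# SVG sizes in bytes at DPI 1000, as reported in this answer
sizes = {
    "one <circle> per spot, fill on each": 7435966,
    "circles in a group with shared fill": 4904718,
    "markers on a polyline": 1248852,
    "markers + binned polylines via <use>": 875012,
}

for label, size in sizes.items():
    print(f"{label:40s} {size / SPOTS:6.2f} bytes/spot")
```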