Question

Plotly scatter map - How do I prevent marker text from overlapping?

I'm trying to plot a USA map using Scattergeo. Many of my markers are close to each other, so their text labels overlap and the map looks very messy. Are there workarounds for this? I'm trying to make the map look as clean as possible.

I don’t necessarily need the markers to be at their exact coordinates, but I don’t want to manually tweak each geo-coordinate since there would be too many coordinates. So something programmatic would be best.



Solution


This is a tough problem. Besides the obvious idea of making the font smaller, one possibility is to create a few buttons, each of which toggles an equal share of the marker text labels. If you have 50 marker text labels, you could have 5 buttons that each toggle 10 of them, reducing the chance that the text overlaps. You can adjust the number of buttons as needed. I assigned the clusters at random, but depending on your use case you could assign them by some pattern instead (for example, toggling text for points that belong to the same category).
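As one possible non-random pattern, here is a minimal sketch that sorts points by longitude and deals them out to clusters in turn, so that neighbours in longitude order never share a cluster. The function name `assign_clusters_round_robin` is hypothetical (it is not part of Plotly or pandas), and the heuristic is crude: it ignores latitude entirely.

```python
def assign_clusters_round_robin(lons, n_clusters):
    """Return one cluster label per point (hypothetical helper).

    Points are ranked by longitude and labelled in rotation, so two points
    that are adjacent in longitude order always get different labels.
    """
    # indices of the points, ordered from west to east
    order = sorted(range(len(lons)), key=lambda i: lons[i])
    labels = [None] * len(lons)
    for rank, idx in enumerate(order):
        labels[idx] = f'Cluster_{rank % n_clusters}'
    return labels

# made-up longitudes, two of them nearly identical
lons = [-122.4, -73.8, -122.3, -87.9, -73.9, -118.4]
labels = assign_clusters_round_robin(lons, n_clusters=3)
print(labels)
```

With a DataFrame that has a `long` column, this could replace the random assignment via `df['cluster'] = assign_clusters_round_robin(df['long'].tolist(), n_clusters)`.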

Below is an example with airport data that would otherwise have a lot of overlapping text. I found it useful to build the figure by adding the text traces first (with only one cluster's text trace visible) and then adding the markers, which are always visible. So with 5 clusters there are 5 text traces, only one of which is visible at a time, followed by a single marker trace that contains all the points.

import numpy as np
import pandas as pd
import plotly.graph_objects as go

df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/2011_february_us_airport_traffic.csv')
# df['text'] = df['airport'] + '' + df['city'] + ', ' + df['state'] + '' + 'Arrivals: ' + df['cnt'].astype(str)

## assign numbers to each airport
df['text'] = list(range(len(df)))

## assign clusters to data at random
np.random.seed(42)
n_clusters = 10
cluster_choices = [f'Cluster_{i}' for i in range(n_clusters)]
df['cluster'] = np.random.choice(cluster_choices, size=len(df))

fig = go.Figure()

## initially, only show the text for one cluster of markers (Cluster_1)
for group, df_group in df.groupby('cluster'):
    if group == 'Cluster_1':
        print("cluster 1 is visible")
        visible = True
    else:
        visible = False
    
    fig.add_trace(go.Scattergeo(
        lon = df_group['long'],
        lat = df_group['lat'],
        text = df_group['text'],
        mode = 'text',
        visible = visible
    ))

## always show markers
fig.add_trace(go.Scattergeo(
    lon = df['long'],
    lat = df['lat'],
    text = df['text'],
    mode = 'markers',
    marker_color = df['cnt'],
    marker = dict(size=30, opacity=0.4),
    visible = True,
))

## there are n_clusters text traces, followed by one marker trace
## so each button shows exactly one text trace and always shows the marker trace
visible_array = [[i == j for j in range(n_clusters)] + [True] for i in range(n_clusters)]

updatemenus = [
    dict(
        buttons=[
            dict(
                label=f'Cluster {i}',
                method='update',
                args=[{'visible': visible_array[i]},
                      {'title': f'Cluster {i}'}]
            ) for i in range(n_clusters)
        ]
    )
]



fig.update_layout(
    title = 'Most trafficked US airports<br>(Hover for airport names)',
    geo_scope='usa',
    showlegend=False,
    updatemenus=updatemenus
)

fig.show()
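To make the visibility logic easier to check, the masks can be built by a small helper and inspected on their own. This is only a sketch: `build_visibility_masks` is a hypothetical name, and it assumes the text traces come first in the figure, followed by the always-visible marker trace(s).

```python
def build_visibility_masks(n_text_traces, n_marker_traces=1):
    """Return one list of booleans per button (hypothetical helper).

    Mask i shows only text trace i and keeps every marker trace visible.
    Assumes the text traces were added to the figure before the marker traces.
    """
    masks = []
    for i in range(n_text_traces):
        text_part = [j == i for j in range(n_text_traces)]
        masks.append(text_part + [True] * n_marker_traces)
    return masks

masks = build_visibility_masks(3)
for mask in masks:
    print(mask)
```

Each mask should have the same length as the number of traces in the figure, which can be compared against `len(fig.data)` before the buttons are wired up.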


2024-07-01
Derek O