Using Plotly to Explore Global Oil and Gas Pipelines
Intro
I was drawn to the plots in Al Jazeera’s interactive piece “Mapping the world’s oil and gas pipelines”:
https://www.aljazeera.com/news/2021/12/16/mapping-world-oil-gas-pipelines-interactive
To be precise, the plots looked interesting to me, so my curiosity drove me to recreate them in Python.
First we need to collect the data. Luckily, the picture credits its data source: Global Energy Monitor | 2000, so the data is easy to get.
You can find all datasets and code in my GitHub repo, linked at the end of this post.
Import Data
I plan to use Plotly for all the drawings. Why Plotly? Because it is fancy and powerful, and it also supports interactive charts.
First, a quick look at the datasets. Two datasets are shared in the repo: one holds information on LNG (liquefied natural gas) terminals, and the other holds information on gas pipelines (Gas Pipe). In addition, geo data for all countries is collected as well.
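import json
import pandas as pd

# country boundaries (GeoJSON), used later for the maps
file = 'data/countries.geojson'
with open(file) as f:
    j = json.load(f)

# LNG terminal dataset
lng_file = 'data/GGIT-LNG-Terminals-June-2021.xlsx'
lng_df = pd.read_excel(lng_file)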
Previewing the dataframe, these columns are the most interesting to us:
- Country
- Region
- CapacityInMtpa: capacity in Mtpa (million tonnes per annum)
- Latitude
- Longitude
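As a rough sketch of that preview step, the snippet below builds a tiny stand-in frame and selects the columns listed above. The rows and the name sample_df are made up for illustration and are not taken from the dataset. It also shows that a single “--” placeholder is enough to stop pandas from treating CapacityInMtpa as a numeric column.

```python
import pandas as pd

# Made-up rows standing in for lng_df; the real file has many more columns.
sample_df = pd.DataFrame({
    'Country': ['USA', 'Japan', 'Russia'],
    'Region': ['North America', 'East Asia', 'Eurasia'],
    'CapacityInMtpa': [10.0, '--', 5.5],   # '--' marks a missing value
    'Latitude': [29.7, 35.5, 46.6],
    'Longitude': [-93.3, 139.8, 142.8],
})

cols = ['Country', 'Region', 'CapacityInMtpa', 'Latitude', 'Longitude']
preview = sample_df[cols].head()
print(preview)

# A column mixing floats and strings is stored with the generic object dtype.
print(sample_df['CapacityInMtpa'].dtype)
```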
Get Started with Bar Chart
We would like to draw a bar chart first; the code follows below. Plotly has a fantastic tutorial, so we can simply refer to the official examples.
Usually only one line of code is needed to plot a bar chart, nothing more complicated than
go.Bar or px.bar
But that is for a toy dataset. With this real-world dataset we have to do some preprocessing.
There are a few highlights:
- The raw data has some “--” entries, indicating that the value is missing; we need to drop those rows
- Raw data needs to be aggregated, for example per country
- To plot charts like the ones in the news article, we need to add “dots” to the end of each bar. That is why a scatter trace is overlaid.
- Other detail adjustments; for example, Russia is colored gold
import plotly.express as px
import plotly.graph_objects as go

# clean data: drop rows whose capacity is the '--' placeholder, then convert to numbers
clean_lng_df = lng_df.drop(lng_df[lng_df['CapacityInMtpa'] == '--'].index)
clean_lng_df['CapacityInMtpa'] = pd.to_numeric(clean_lng_df['CapacityInMtpa'])
# aggregate per country and keep the 10 largest (ascending, so the largest bar is drawn on top)
lng_df_sum = clean_lng_df.groupby('Country').agg({'CapacityInMtpa': 'sum', 'Latitude': 'first', 'Longitude': 'first'}).reset_index()
lng_top_10_length = lng_df_sum.sort_values('CapacityInMtpa', ascending=True).iloc[-10:, :].copy()
# plot
fig = go.Figure()
lng_top_10_length['color'] = ['gold' if c == 'Russia' else 'deeppink' for c in lng_top_10_length['Country'].to_list()]
# add bar
fig.add_trace(go.Bar(x=lng_top_10_length['CapacityInMtpa'], y=lng_top_10_length['Country'], orientation='h', marker=dict(color=lng_top_10_length['color'])))
# add dot scatter on the end of each bar
fig.add_trace(go.Scatter(x=lng_top_10_length['CapacityInMtpa'], y=lng_top_10_length['Country'], marker=dict(color=lng_top_10_length['color']), mode="markers"))
# adjust the detail
fig.update_layout(
    paper_bgcolor='black',
    plot_bgcolor='black',
    title=dict(text='LNG capacity of each country (top10)',
               font_size=14,
               font_color='silver'),
    xaxis=dict(
        # axis title font is set via title.font (the older titlefont_* shortcuts are deprecated)
        title=dict(text='capacity / Mtpa', font=dict(color='silver', size=12)),
        tickfont_size=10,
        tickfont_color='silver'
    ),
    yaxis=dict(
        title=dict(text=' ', font=dict(size=12)),
        tickfont_size=10,
        tickfont_color='silver'
    ),
    bargap=0.9,  # gap between bars of adjacent location coordinates
    height=400,
    showlegend=False
)
fig.show()
It can be seen that Russia’s LNG capacity is not that large, close to half of Japan’s. The US is far ahead, with almost 10 times Russia’s capacity.
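The preprocessing steps in this section can also be written more compactly. The sketch below is an alternative, not the code used for the chart: it relies on pd.to_numeric with errors='coerce' to turn “--” into NaN, and on nlargest to pick the top entries. The rows are made up for illustration.

```python
import pandas as pd

# Made-up rows for illustration; '--' marks a missing capacity.
raw = pd.DataFrame({
    'Country': ['USA', 'USA', 'Japan', 'Russia', 'Qatar'],
    'CapacityInMtpa': ['10', '5', '8', '--', '7'],
})

# Coerce non-numeric placeholders to NaN, then drop those rows.
raw['CapacityInMtpa'] = pd.to_numeric(raw['CapacityInMtpa'], errors='coerce')
clean = raw.dropna(subset=['CapacityInMtpa'])

# Sum per country and keep the two largest totals.
top = (clean.groupby('Country', as_index=False)['CapacityInMtpa']
            .sum()
            .nlargest(2, 'CapacityInMtpa'))
print(top)
```

Note that nlargest returns the rows in descending order, so for a horizontal bar chart the result would still need to be sorted ascending to place the largest bar on top.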
Sunburst Chart
We can also plot a sunburst chart, which is designed for data with a hierarchical structure, such as regions broken down into countries.
# aggregate per (region, country) and keep the 20 largest entries
lng_df_region_sum = clean_lng_df.groupby(['Region', 'Country']).agg({'CapacityInMtpa': 'sum', 'Latitude': 'first', 'Longitude': 'first'}).reset_index()
lng_df_region_sum_top = lng_df_region_sum.sort_values('CapacityInMtpa', ascending=True).iloc[-20:, :]
fig = px.sunburst(lng_df_region_sum_top, path=['Region', 'Country'], values='CapacityInMtpa')
fig.update_layout(
    paper_bgcolor='black',
    plot_bgcolor='black',
    title='LNG capacity in region',
)
fig.show()
It can be seen that North America and East Asia account for more than 50% of the LNG capacity among the top 20 countries plotted.
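A share like this can be checked numerically rather than read off the chart. The sketch below shows one way to do that with a groupby; the capacities are made-up numbers, so the result says nothing about the real dataset.

```python
import pandas as pd

# Made-up capacities for illustration only.
df = pd.DataFrame({
    'Region': ['North America', 'North America', 'East Asia', 'Europe'],
    'Country': ['USA', 'Canada', 'Japan', 'Spain'],
    'CapacityInMtpa': [40.0, 10.0, 30.0, 20.0],
})

# Share of each region in the total capacity of the frame.
region_share = (df.groupby('Region')['CapacityInMtpa'].sum()
                / df['CapacityInMtpa'].sum())
print(region_share.sort_values(ascending=False))

# Combined share of the two regions of interest.
combined = region_share['North America'] + region_share['East Asia']
print(combined)
```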
Choroplethmapbox
Since the dataset contains geographic information, the best way to present it is on a map.
Choroplethmapbox colors each area according to a value. Here GeoJSON data is used to define the boundary of each country.
The GeoJSON contains the outline of each country, so the country names in the dataset must match the key field of the GeoJSON (here properties.ADMIN).
fig = go.Figure()
# rename so the country name matches the ADMIN field in the GeoJSON
lng_df_region_sum_top = lng_df_region_sum_top.assign(Country=lng_df_region_sum_top['Country'].replace({'USA': 'United States of America'}))
trace1 = go.Choroplethmapbox(locations=lng_df_region_sum_top.Country, z=lng_df_region_sum_top.CapacityInMtpa,
                             geojson=j, featureidkey="properties.ADMIN", colorscale="sunsetdark")
fig.add_trace(trace1)
fig.update_layout(
    # paper_bgcolor='black',
    # plot_bgcolor='black',
    title='LNG capacity',
    mapbox_style="carto-darkmatter", mapbox_zoom=1, showlegend=False, height=800, width=1200)
fig.show()
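Because a name that does not match any GeoJSON feature is simply not drawn, it can help to check for mismatches before plotting. The sketch below does this with plain Python on a tiny stand-in for the parsed GeoJSON; the two features and the country list are made up for illustration.

```python
# Tiny stand-in for the parsed countries.geojson (geometry omitted).
geojson = {
    'type': 'FeatureCollection',
    'features': [
        {'type': 'Feature', 'properties': {'ADMIN': 'United States of America'}},
        {'type': 'Feature', 'properties': {'ADMIN': 'Japan'}},
    ],
}

# Country names as they might appear in the dataset.
countries = ['USA', 'Japan']

# Names the map knows about, read from the same key as featureidkey.
known = {f['properties']['ADMIN'] for f in geojson['features']}

# Names listed here have no matching feature and would not be drawn.
unmatched = [c for c in countries if c not in known]
print(unmatched)

# Renaming fixes the mismatch.
rename = {'USA': 'United States of America'}
fixed = [rename.get(c, c) for c in countries]
print(fixed)
```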
Gas Pipeline Map
With the above code ready, we can take on a more advanced dataset: the natural gas pipelines.
We can plot the ranking of each country’s natural gas pipelines by length.
And the ranking by annual delivery volume.
We can also mark the route of each pipeline on the map; here the top 10 countries are marked with different colors.
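The map code in this post uses a top_10_length frame with StartCountry and LengthEstimateKm columns, but the step that builds it is not shown. The sketch below is one plausible way to derive it, assuming one row per pipeline. The name pipe_df and all rows are hypothetical.

```python
import pandas as pd

# Made-up rows; pipe_df is a hypothetical name for the pipeline dataset.
pipe_df = pd.DataFrame({
    'StartCountry': ['USA', 'USA', 'Russia', 'China', 'Russia'],
    'LengthEstimateKm': [1200.0, 800.0, 1500.0, 900.0, 400.0],
})

# Total estimated length per start country, sorted ascending so that a
# horizontal bar chart shows the longest bar on top; keep at most 10 rows.
top_10_length = (pipe_df.groupby('StartCountry', as_index=False)['LengthEstimateKm']
                        .sum()
                        .sort_values('LengthEstimateKm')
                        .tail(10))
print(top_10_length)
```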
Why Gas Pipelines Are Crucial for Russia
It can be seen that although Russia’s total pipeline mileage is not far ahead, its pipelines cover the widest area.
The first reason is that Russia itself is vast; the second is that its surrounding countries are indeed short of natural gas.
We can further analyze which countries Russia’s natural gas is exported to.
Surprisingly, first place goes to Ukraine, with as many as 15 pipelines. You never know, maybe gas and pipelines played some role in triggering the war.
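A count like this could be produced along the following lines. This is only a sketch: it assumes the Countries column holds a comma-separated list of the countries a pipeline passes through, which the post does not confirm, and the rows are made up.

```python
import pandas as pd

# Made-up rows; the comma-separated format of 'Countries' is an assumption.
pipes = pd.DataFrame({
    'PipelineName': ['P1', 'P2', 'P3', 'P4'],
    'StartCountry': ['Russia', 'Russia', 'Russia', 'USA'],
    'Countries': ['Russia, Ukraine', 'Russia, Ukraine, Slovakia',
                  'Russia, Belarus', 'USA, Mexico'],
})

russian = pipes[pipes['StartCountry'] == 'Russia']

# One entry per (pipeline, country), with surrounding spaces removed.
destinations = (russian['Countries'].str.split(',')
                .explode()
                .str.strip())
# Drop the start country itself so only destinations remain.
destinations = destinations[destinations != 'Russia']

# Number of pipelines reaching each destination country.
counts = destinations.value_counts()
print(counts)
```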
import numpy as np
import shapely.geometry

lats = []
lons = []
names = []
colors = []
color_maps = {}
# rename so the country name matches the ADMIN field in the GeoJSON
top_10_length = top_10_length.assign(StartCountry=top_10_length['StartCountry'].replace({'USA': 'United States of America'}))
# one color per top-10 start country
for i, c in enumerate(top_10_length.StartCountry.to_list()):
    color_maps[c] = px.colors.qualitative.Safe[i]
for feature, name, startcountry, pipe in zip(linestring_df.ls, linestring_df.Countries, linestring_df.StartCountry, linestring_df.PipelineName):
    if isinstance(feature, shapely.geometry.linestring.LineString):
        linestrings = [feature]
    elif isinstance(feature, shapely.geometry.multilinestring.MultiLineString):
        linestrings = feature.geoms
    else:
        continue
    for linestring in linestrings:
        x, y = linestring.xy
        lats = np.append(lats, y)
        lons = np.append(lons, x)
        colors = np.append(colors, [color_maps.get(startcountry, px.colors.qualitative.Safe[-1])] * len(y))
        names = np.append(names, [name + '*' + pipe] * len(y))
        # a None entry separates this line from the next one
        lats = np.append(lats, None)
        lons = np.append(lons, None)
        names = np.append(names, None)
        colors = np.append(colors, color_maps.get(startcountry, px.colors.qualitative.Safe[-1]))
fig = go.Figure()
# country fill by total pipeline length, pipeline routes drawn on top
trace1 = go.Choroplethmapbox(locations=top_10_length.StartCountry, z=top_10_length.LengthEstimateKm,
                             geojson=j, featureidkey="properties.ADMIN", colorscale="sunsetdark")
trace2 = go.Scattermapbox(mode='lines', lat=lats, lon=lons,
                          # marker=dict(color=colors),
                          line=dict(color='lime'))
fig.add_trace(trace1)
fig.add_trace(trace2)
fig.update_layout(
    # paper_bgcolor='black',
    # plot_bgcolor='black',
    mapbox_style="carto-darkmatter", mapbox_zoom=1, showlegend=False, height=800, width=1200)
fig.show()
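The key trick in the loop above is that all pipelines go into a single trace, with a None entry between consecutive lines so that they are not joined. The sketch below isolates that idea in plain Python. The helper name flatten_lines and the coordinates are made up for illustration, and it uses list appends instead of np.append, which allocates a new array on every call.

```python
def flatten_lines(lines):
    """Join several polylines into flat lat/lon lists separated by None.

    `lines` is a list of polylines, each a list of (lon, lat) pairs.
    """
    lats, lons = [], []
    for line in lines:
        for lon, lat in line:
            lons.append(lon)
            lats.append(lat)
        # None breaks the line so the next polyline starts fresh.
        lons.append(None)
        lats.append(None)
    return lats, lons


# Two made-up polylines.
lines = [[(30.0, 50.0), (31.0, 51.0)],
         [(10.0, 45.0), (12.0, 46.0), (13.0, 47.0)]]
lats, lons = flatten_lines(lines)
print(lats)
print(lons)
```

The resulting lists could then be passed as lat and lon to a go.Scattermapbox trace with mode='lines'.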
In the end
We explored the gas pipelines of countries around the world and saw that gas from Russia does play a big role globally.
Hope the war ends soon.
If you like the plots, please check out my GitHub repo:
https://github.com/bingblackbean/data_amber_post/tree/main/gas_pipe