Streamlit 2.0: Build Data Apps Faster Than Ever
Streamlit has become the go‑to framework for turning Python scripts into sleek, shareable data apps, and the arrival of Streamlit 2.0 raises the bar even higher. In this post we’ll explore the most impactful upgrades, walk through three end‑to‑end examples, and sprinkle in pro‑tips that help you ship faster and keep your codebase clean. Whether you’re a data scientist who wants a quick UI for a model or a full‑stack engineer building a client‑facing dashboard, Streamlit 2.0 gives you the tools to do it in minutes instead of days.
Why Streamlit 2.0 Matters
Version 2.0 is more than a collection of minor bug fixes; it re‑architects the core runtime to be truly reactive. The new dependency‑graph engine tracks every widget, function, and data object, updating only the parts of the UI that actually changed. This means smoother interactions, lower memory footprints, and a noticeable speed boost for large datasets.
Another game‑changing addition is the built‑in theming system, which lets you switch between light, dark, or custom palettes with a single line of code. No more fiddling with CSS files or external libraries—Streamlit now handles contrast, font scaling, and responsive layouts out of the box.
Finally, the revamped session state API eliminates the “global variable” hacks that many early adopters relied on. You can now store complex objects (like a trained model or a pandas DataFrame) safely across reruns, making multi‑page apps feel like native desktop software.
Key New Features
Native Data Caching 2.0
Cache decorators have been upgraded to support async functions, fine‑grained invalidation, and automatic serialization of common data types. The new @st.cache_data decorator replaces the older @st.cache and is optimized for large tabular data, image arrays, and even Spark DataFrames.
import streamlit as st
import pandas as pd
@st.cache_data(ttl=3600)  # cache for one hour
def load_sales_data(path: str) -> pd.DataFrame:
    df = pd.read_csv(path)
    # Heavy preprocessing steps
    df['date'] = pd.to_datetime(df['date'])
    df = df.set_index('date')
    return df

sales_df = load_sales_data('data/sales_2023.csv')
st.dataframe(sales_df.head())
Session State Enhancements
The new st.session_state API behaves like a mutable dictionary with built‑in type safety. You can declare default values, watch for changes, and even persist state across page reloads using the optional persist=True flag.
if 'selected_region' not in st.session_state:
    st.session_state.selected_region = 'North America'

region = st.selectbox(
    "Choose a region",
    options=['North America', 'Europe', 'Asia'],
    index=['North America', 'Europe', 'Asia'].index(st.session_state.selected_region)
)
st.session_state.selected_region = region
st.write(f"You are looking at data for **{region}**.")
Built‑in Theming & Layout
Streamlit 2.0 introduces a declarative theme.toml that lives alongside your script. You can define primary colors, font families, and even custom CSS snippets that are safely sandboxed. The framework also adds new layout primitives like st.columns with fractional sizing and st.tabs that remember the last active tab per user session.
- Primary color: Controls button fills, progress bars, and accent borders.
- Background: Light, dark, or a custom hex code.
- Font: Choose from system defaults or load a Google Font via URL.
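As a sketch of what a custom palette might look like — the key names below follow the [theme] table that current Streamlit releases read from .streamlit/config.toml, and are assumed to carry over to theme.toml:

```toml
[theme]
primaryColor = "#F63366"              # buttons, progress bars, accent borders
backgroundColor = "#FFFFFF"
secondaryBackgroundColor = "#F0F2F6"  # sidebar and widget backgrounds
textColor = "#262730"
font = "sans serif"
```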
Fast Deployment with Streamlit Cloud
Deploying a Streamlit app used to require a Dockerfile or a separate CI pipeline. With Streamlit 2.0, the platform automatically detects requirements, caches the environment, and spins up a scalable instance in seconds. The new “One‑Click Share” button generates a permanent URL and QR code, making collaboration with non‑technical stakeholders painless.
Getting Started – A Minimal App
Before diving into complex dashboards, let’s build the classic “Hello, World!” app that showcases the new session state and theming features. Save the snippet below as app.py and run streamlit run app.py.
import streamlit as st
# Theme configuration (optional)
st.set_page_config(
    page_title="Quick Demo",
    page_icon="🚀",
    layout="centered",
    initial_sidebar_state="expanded"
)

st.title("🚀 Streamlit 2.0 Quick Demo")

if 'counter' not in st.session_state:
    st.session_state.counter = 0

def increment():
    st.session_state.counter += 1

st.button("Add One", on_click=increment)
st.metric(label="Current Count", value=st.session_state.counter)
This tiny app demonstrates three core ideas: page configuration, persistent session state, and the new st.metric widget that automatically formats numbers with units. Feel free to experiment with the sidebar, add a markdown description, or switch the theme by creating a theme.toml file next to app.py.
Real‑World Use Case #1: Interactive Sales Dashboard
Imagine a sales analyst who needs to explore quarterly performance across regions, product lines, and marketing channels. With Streamlit 2.0 you can load a 2‑million‑row dataset in under a second, let users slice the data with dropdowns, and instantly see a line chart, a heat map, and a downloadable report.
Step‑by‑Step Implementation
- Cache the heavy data load with @st.cache_data.
- Use st.sidebar for the filter widgets.
- Render charts with altair (or plotly) that react to the filtered DataFrame.
- Provide a "Download CSV" button that respects the current filters.
import streamlit as st
import pandas as pd
import altair as alt
@st.cache_data
def load_data():
    df = pd.read_parquet('data/sales_2023.parquet')
    df['date'] = pd.to_datetime(df['date'])
    return df

df = load_data()

st.sidebar.header("Filter Options")
region = st.sidebar.multiselect(
    "Region", options=df['region'].unique(), default=df['region'].unique()
)
product = st.sidebar.multiselect(
    "Product Line", options=df['product_line'].unique(), default=df['product_line'].unique()
)
date_range = st.sidebar.date_input(
    "Date Range",
    value=[df['date'].min(), df['date'].max()]
)

filtered = df[
    (df['region'].isin(region)) &
    (df['product_line'].isin(product)) &
    # date_input returns datetime.date objects; convert before comparing
    (df['date'].between(pd.Timestamp(date_range[0]), pd.Timestamp(date_range[1])))
]

st.subheader("🗓️ Sales Over Time")
line_chart = alt.Chart(filtered).mark_line(point=True).encode(
    x='date:T',
    y='revenue:Q',
    color='region:N',
    tooltip=['date', 'revenue', 'region']
).interactive()
st.altair_chart(line_chart, use_container_width=True)

st.subheader("🔥 Top Products")
top_products = (
    filtered.groupby('product_line')['revenue']
    .sum()
    .reset_index()
    .sort_values('revenue', ascending=False)
    .head(10)
)
bar_chart = alt.Chart(top_products).mark_bar().encode(
    x='revenue:Q',
    y=alt.Y('product_line:N', sort='-x'),
    tooltip=['product_line', 'revenue']
)
st.altair_chart(bar_chart, use_container_width=True)

st.download_button(
    label="💾 Download Filtered Data",
    data=filtered.to_csv(index=False).encode('utf-8'),
    file_name='filtered_sales.csv',
    mime='text/csv'
)
The app automatically updates the charts whenever a sidebar widget changes, thanks to Streamlit’s reactive graph. Because the data load is cached, subsequent interactions are near‑instant, even on modest hardware.
Real‑World Use Case #2: Machine‑Learning Model Explorer
Data scientists often struggle to share model insights with non‑technical teammates. Streamlit 2.0 lets you embed an interactive model explorer where users can tweak hyperparameters, view performance metrics, and visualize predictions on a sample dataset—all without writing a single line of JavaScript.
Core Components
- Sidebar sliders: Adjust learning rate, regularization strength, or tree depth.
- Cached training function: Retrains only when hyperparameters change.
- Metrics & plots: Show accuracy, confusion matrix, and ROC curve.
- Explainability: Integrate SHAP values for feature importance.
import streamlit as st
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score, confusion_matrix
import matplotlib.pyplot as plt
import shap
@st.cache_data
def load_dataset():
    df = pd.read_csv('data/heart.csv')
    X = df.drop('target', axis=1)
    y = df['target']
    return train_test_split(X, y, test_size=0.2, random_state=42)

X_train, X_test, y_train, y_test = load_dataset()

st.sidebar.header("🔧 Model Hyperparameters")
n_estimators = st.sidebar.slider("Number of Trees", 10, 200, 100, step=10)
max_depth = st.sidebar.slider("Max Depth", 2, 20, 5)
min_samples_leaf = st.sidebar.slider("Min Samples per Leaf", 1, 10, 2)

@st.cache_resource
def train_model(n_estimators, max_depth, min_samples_leaf):
    model = RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        min_samples_leaf=min_samples_leaf,
        random_state=42
    )
    model.fit(X_train, y_train)
    return model
model = train_model(n_estimators, max_depth, min_samples_leaf)
y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)[:, 1]
st.subheader("📊 Performance Metrics")
col1, col2 = st.columns(2)
col1.metric("Accuracy", f"{accuracy_score(y_test, y_pred):.2%}")
col2.metric("ROC‑AUC", f"{roc_auc_score(y_test, y_proba):.2%}")
st.subheader("🧭 Confusion Matrix")
cm = confusion_matrix(y_test, y_pred)
fig, ax = plt.subplots()
ax.matshow(cm, cmap='Blues')
for (i, j), val in np.ndenumerate(cm):
    ax.text(j, i, f"{val}", ha='center', va='center')
ax.set_xlabel('Predicted')
ax.set_ylabel('Actual')
st.pyplot(fig)

st.subheader("🔎 SHAP Feature Importance")
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
# Binary classifiers may return one array per class; plot the positive class
class_values = shap_values[1] if isinstance(shap_values, list) else shap_values
shap.summary_plot(class_values, X_test, plot_type="bar", show=False)
st.pyplot(plt.gcf())  # st.pyplot needs an explicit figure
Notice the use of @st.cache_resource for the model object—this ensures the heavy training step runs only when a hyperparameter actually changes. The SHAP plot renders instantly because the explainer is also cached.
Performance Tips & Pro Tricks
Even with the new engine, large apps can still suffer from unnecessary reruns. Below are three proven strategies to keep your app snappy.
Tip 1 – Scope Your Caches. Cache at the highest level where data is immutable. If you cache inside a loop, Streamlit will treat each iteration as a separate cache entry, blowing up memory usage.
Tip 2 – Use Stale‑While‑Revalidate. Set attl(time‑to‑live) on@st.cache_dataso stale data is shown instantly while a background thread refreshes the cache. Users perceive zero latency.
Tip 3 – Minimize Widget Dependencies. Every widget interaction triggers a rerun of your script, so keep expensive computations behind cache decorators and avoid wiring widgets directly into uncached work. Grouping filters in the sidebar also keeps the main layout stable while users adjust them.
Testing & Debugging in Streamlit 2.0
Streamlit now ships with a built‑in st.test module that lets you write unit tests for UI callbacks without launching a browser. The API mirrors pytest fixtures, enabling you to assert that a button click updates the session state as expected.
def test_counter_increment():
    from streamlit.testing import ScriptRunContext, TestSessionState
    ctx = ScriptRunContext()
    state = TestSessionState()
    # Simulate a click on the "Add One" button
    ctx.run_script("app.py", state=state, widget="Add One")
    assert state.counter == 1
When debugging, use st.sidebar.debug=True to expose a collapsible panel that prints variable snapshots after each rerun. This is especially handy for tracing data leaks in complex multi‑page apps.
Migration Checklist: From 1.x to 2.0
- Replace @st.cache with @st.cache_data or @st.cache_resource, depending on whether you cache data or long‑lived objects.
- Update any manual session‑state hacks to the new dictionary‑style st.session_state API.
- Remove external CSS files; migrate colors and fonts to theme.toml.
- Test your app with streamlit run app.py --server.enableCORS false to ensure the new reactive graph respects your CORS settings.
- Run streamlit hello after upgrading to verify the local environment is healthy.
Following this checklist reduces the risk of hidden bugs and lets you take full advantage of the performance gains in Streamlit 2.0.
Conclusion
Streamlit 2.0 transforms the “Python script → web app” pipeline into a truly production‑ready workflow. With smarter caching, robust session state, declarative theming, and seamless cloud deployment, you can build data‑driven applications faster than ever before. The examples above illustrate how a few lines of code can evolve into interactive dashboards, model explorers, and shareable reports—all while keeping the codebase readable and maintainable. Dive in, experiment with the new APIs, and let Streamlit 2.0 handle the heavy lifting so you can focus on the insights that matter.