How to Install RDKit in Jupyter Notebook with Python
Installing RDKit inside a Jupyter Notebook can feel like trying to assemble furniture without instructions. You know it should work, but one wrong step and suddenly nothing fits. In this article, you will learn exactly how to install RDKit in Jupyter Notebook with Python, from the very first check to a full working coding example that you can run and modify.
Why Installing RDKit Deserves a Proper Guide
RDKit is one of the most popular toolkits for cheminformatics in Python. People use it to draw molecules, calculate properties, search substructures, and build machine learning models. It is powerful, free, and widely trusted.
But here is the problem. RDKit is not easy to install if you treat it like a normal Python library. Many guides online show one or two commands and move on. When those commands fail, the reader is left staring at an error message and wondering what just happened.
In this guide, you will not only install RDKit. You will also learn how Jupyter chooses its Python environment, how to avoid the most common mistakes, and how to verify that everything works with a complete coding example.
If you have struggled with RDKit before, this article is written for you.
Why RDKit and Jupyter Often Disagree
Before we touch any commands, it helps to understand the real source of most errors.
Jupyter does not automatically use the same Python that your terminal uses. It runs inside a kernel that points to a specific environment. If RDKit installs into one environment and Jupyter uses another, you will see import errors even though RDKit is technically installed.
Many guides skip this point. They show how to install RDKit but never explain how to connect that installation to Jupyter. This guide closes that gap first.
Check Your Current Python and Jupyter Setup
Start by opening a terminal and checking your Python version.
python --version
You should see Python 3.8 or newer. If you see Python 2, stop here and install a modern Python distribution.
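The same check can be made from Python itself, which is handy inside a notebook where you have no terminal open. This is a minimal sketch that assumes the 3.8 floor mentioned above; the helper name python_is_supported is made up for illustration.

```python
import sys

MINIMUM = (3, 8)  # assumption: the minimum version quoted in this guide


def python_is_supported(version_info=None, minimum=MINIMUM):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


# Report the running interpreter's version and whether it meets the floor
print(sys.version.split()[0], "supported:", python_is_supported())
```

If this prints False, fix the Python installation before going any further.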
Next, open Jupyter Notebook and run this in a cell.
import sys
print(sys.executable)
This prints the exact Python executable that Jupyter is using. Keep this path in mind. Later, we will make sure RDKit installs into the same environment.
This small check already puts you ahead of most quick tutorials.
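To see the mismatch described earlier with your own eyes, you can compare the kernel's interpreter with the python that would be found on PATH. This is only a sketch: same_interpreter is a hypothetical helper, it does not resolve symlinks, and inside Jupyter the PATH is inherited from wherever the notebook server was started, so treat the result as a hint rather than proof.

```python
import os
import shutil
import sys


def same_interpreter(path_a, path_b):
    """Compare two interpreter paths after normalising them (symlinks are not resolved)."""
    if not path_a or not path_b:
        return False

    def normalise(path):
        return os.path.normcase(os.path.abspath(path))

    return normalise(path_a) == normalise(path_b)


kernel_python = sys.executable
path_python = shutil.which("python")  # None if no "python" is on PATH

print("Kernel Python: ", kernel_python)
print("Python on PATH:", path_python)
print("Same interpreter:", same_interpreter(kernel_python, path_python))
```

When the last line says False, a package installed from the terminal may well land somewhere the notebook never looks.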
Create a Clean Conda Environment for RDKit
RDKit works best with conda because conda handles compiled libraries properly. Using a clean environment avoids conflicts with other packages.
Open your terminal and create a new environment.
conda create -n rdkit_env python=3.10
Activate the environment.
conda activate rdkit_env
From now on, package installs in this terminal session affect only this environment. This keeps the rest of your system stable and your setup predictable.
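If you want to confirm the activation from Python, conda records the active environment in two environment variables, CONDA_DEFAULT_ENV and CONDA_PREFIX. The sketch below simply reads them; active_conda_env is an illustrative name, not part of conda.

```python
import os


def active_conda_env(environ=None):
    """Return (name, prefix) of the active conda environment, or (None, None)."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV"), environ.get("CONDA_PREFIX")


name, prefix = active_conda_env()
if name is None:
    print("No conda environment appears to be active in this process.")
else:
    print(f"Active conda environment: {name} ({prefix})")
```

Run it with the python of the activated terminal. Inside a notebook these variables describe the environment Jupyter was launched from, which is not necessarily the kernel's environment.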
Installing RDKit from the Right Source
This is the most important step in the entire process.
Inside your activated environment, run:
conda install -c conda-forge rdkit
Conda-forge provides official precompiled builds of RDKit for Windows, macOS, and Linux. This avoids building from source, which is slow and often fails.
Wait until the installation finishes. Do not interrupt it. Conda may take a few minutes to resolve dependencies.
Once done, test RDKit directly in the terminal.
python
Then type:
from rdkit import Chem
print(Chem.MolFromSmiles("CCO"))
If you see a molecule object printed, RDKit works in this environment.
But Jupyter still does not know about this environment yet.
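It also helps to note which RDKit version you ended up with, since error reports and forum answers usually ask for it. A small sketch, with rdkit_version as a made-up helper name, that returns None instead of crashing when RDKit cannot be imported:

```python
def rdkit_version():
    """Return the installed RDKit version string, or None if RDKit cannot be imported."""
    try:
        import rdkit
    except ImportError:
        return None
    return rdkit.__version__


version = rdkit_version()
print("RDKit version:", version if version else "not importable from this interpreter")
```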
Connecting the RDKit Environment to Jupyter
Now we make Jupyter aware of this environment.
Still inside the activated environment, install ipykernel.
conda install ipykernel
Then register the environment as a Jupyter kernel.
python -m ipykernel install --user --name rdkit_env --display-name "Python (RDKit)"
Start Jupyter.
jupyter notebook
When you create a new notebook, select the kernel named Python (RDKit).
This single step solves the most common “RDKit is not found” error.
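If you are curious what the registration actually did, it wrote a small kernel.json file whose argv entry starts with the interpreter Jupyter will launch. The sketch below reads that file. The path is an assumption (a typical Linux location for --user kernels); running jupyter kernelspec list shows the real directory on your machine. parse_kernel_spec is a hypothetical helper.

```python
import json
from pathlib import Path


def parse_kernel_spec(text):
    """Return (display_name, interpreter) from the contents of a kernel.json file."""
    spec = json.loads(text)
    argv = spec.get("argv", [])
    return spec.get("display_name"), (argv[0] if argv else None)


# Assumption: typical Linux location for kernels installed with --user.
candidate = (
    Path.home() / ".local" / "share" / "jupyter" / "kernels" / "rdkit_env" / "kernel.json"
)

if candidate.exists():
    print(parse_kernel_spec(candidate.read_text(encoding="utf-8")))
else:
    print("kernel.json not found at", candidate)
```

The interpreter it prints should live inside the rdkit_env environment. If it points anywhere else, the kernel was registered from the wrong environment.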
Verifying RDKit Inside Jupyter Notebook
Open a new notebook with the RDKit kernel and run this code.
from rdkit import Chem
from rdkit.Chem import Draw
mol = Chem.MolFromSmiles("CCO")
mol
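Depending on your RDKit version, the cell shows either a drawing of the molecule or the text representation of a Mol object. Both mean the molecule was parsed.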
Now try to draw it.
Draw.MolToImage(mol)
If an image of ethanol appears, your installation is complete and healthy.
Do not skip this test. Importing RDKit is not enough. Rendering molecules confirms that all graphical dependencies work.
A Complete, More Realistic Coding Example
Now we move beyond testing and build a small but realistic RDKit workflow. This example parses several molecules from SMILES strings, computes properties, generates fingerprints, compares them, and displays the structures.
This is the kind of code you might actually use in a project.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem, Descriptors, Draw

# Step 1: Define a small dataset of molecules
smiles_list = [
    "CCO",        # Ethanol
    "CC(=O)O",    # Acetic acid
    "c1ccccc1",   # Benzene
    "CCN(CC)CC",  # Triethylamine
    "CCOC(=O)C",  # Ethyl acetate
]
# MolFromSmiles returns None for a string it cannot parse; all five above are valid
molecules = [Chem.MolFromSmiles(s) for s in smiles_list]

# Step 2: Compute basic properties
print("Basic molecular properties:\n")
for smi, mol in zip(smiles_list, molecules):
    mw = Descriptors.MolWt(mol)
    logp = Descriptors.MolLogP(mol)
    hbd = Descriptors.NumHDonors(mol)
    hba = Descriptors.NumHAcceptors(mol)
    print(f"SMILES: {smi}")
    print(f"  Molecular Weight: {mw:.2f}")
    print(f"  LogP: {logp:.2f}")
    print(f"  H-Bond Donors: {hbd}")
    print(f"  H-Bond Acceptors: {hba}")
    print()

# Step 3: Generate Morgan fingerprints (1024-bit vectors, radius 2)
# Newer RDKit releases may print a deprecation warning for this call
fingerprints = [
    AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=1024)
    for mol in molecules
]

# Step 4: Compute similarity between the first molecule and every molecule
print("Similarity to the first molecule:\n")
ref_fp = fingerprints[0]
for smi, fp in zip(smiles_list, fingerprints):
    sim = DataStructs.TanimotoSimilarity(ref_fp, fp)
    print(f"{smi} Similarity: {sim:.2f}")

# Step 5: Visualize all molecules (keep this as the last line of the cell so it displays)
Draw.MolsToGridImage(molecules, molsPerRow=3, subImgSize=(200, 200))
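If the similarity numbers in Step 4 feel like a black box, it helps to see what Tanimoto similarity means on plain data: the number of bits switched on in both fingerprints divided by the number switched on in either. The sketch below is a pure-Python illustration on sets of bit positions, not RDKit's implementation, and returning 0.0 for two empty fingerprints is a convention chosen here.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two collections of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen for this sketch when both are empty
    return len(a & b) / len(union)


# 2 shared bits (5 and 9) out of 4 distinct bits (1, 5, 9, 12)
print(tanimoto({1, 5, 9}, {5, 9, 12}))  # 0.5
```

This is also why the first line of the Step 4 output is 1.00: the reference molecule is compared with itself.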
Handling Common Errors with Calm and Confidence
One frequent error is ModuleNotFoundError: No module named 'rdkit'.
This almost always means the wrong kernel is selected. Go to the kernel menu and choose Python (RDKit).
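Another error is missing shared libraries, often related to Boost. This usually happens when pip and conda installations of RDKit are mixed in the same environment. The fix is to uninstall the pip copy of RDKit and reinstall from conda-forge only. Older pip wheels were published under the name rdkit-pypi, so uninstall that name as well if it is present.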
pip uninstall rdkit
conda install -c conda-forge rdkit
Restart Jupyter and test again.
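Before uninstalling anything, you can ask Python which installer left its mark on a package. The sketch below reads the INSTALLER record that pip writes next to a package's metadata. Treat it as a hint only: installer_of is a hypothetical helper, and a conda build may ship no such metadata at all, so None does not prove the package is absent.

```python
from importlib import metadata


def installer_of(dist_name):
    """Return the tool recorded as having installed `dist_name`, or None if unknown."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    recorded = dist.read_text("INSTALLER")  # None when the file is missing
    return recorded.strip() if recorded else None


# Check both names under which pip wheels of RDKit have been published
for name in ("rdkit", "rdkit-pypi"):
    print(name, "->", installer_of(name))
```

If either line reports pip inside your conda environment, that is the copy to remove.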
A small habit helps avoid many issues. In every new notebook, run:
import sys
print(sys.executable)
This tells you exactly which Python you are using.
Practical Advice You Rarely See
Here is one tip that saves hours in real projects.
Freeze your working environment once RDKit works.
conda env export > rdkit_env.yml
Later, you can recreate the exact setup with:
conda env create -f rdkit_env.yml
This protects you from future dependency changes.
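A quick way to confirm the export really captured RDKit is to look for its entry in the file. The sketch below scans the exported text for a dependency line without needing a YAML library. find_pin is an illustrative helper, and it assumes the usual name=version=build layout of conda's export.

```python
from pathlib import Path


def find_pin(yaml_text, package):
    """Return the dependency entry for `package` (e.g. 'name=version=build'), or None."""
    for line in yaml_text.splitlines():
        entry = line.strip()
        if entry.startswith("- "):
            spec = entry[2:].strip()
            if spec.split("=")[0] == package:
                return spec
    return None


export_file = Path("rdkit_env.yml")  # file name used in the export command above
if export_file.exists():
    print(find_pin(export_file.read_text(encoding="utf-8"), "rdkit"))
else:
    print(f"{export_file} not found; run the export command first.")
```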
Another useful habit is to restart the kernel after installing any package. Jupyter does not always pick up new libraries until you restart.