{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Tutorial 01 - Getting started with PyQInt\n", "\n", ":::{important}\n", "This is a JupyterLab notebook. We assume you execute all the cells in this notebook in sequential order. This is relevant as some cells depend on the output as generated in a previous cell. This notebook assumes that the following packages have been installed:\n", "\n", "* pyqint (`conda install -c ifilot pyqint`)\n", "* pytessel (`conda install -c ifilot pytessel`)\n", "* jupyterlab-myst (`pip install jupyterlab-myst`)\n", ":::\n", "\n", "## Introduction\n", "\n", "PyQInt is a Python package for calculating one- and two-electron integrals as encountered in electronic structure calculations. Since integral evaluation can be quite computationally intensive, the evaluation is programmed in C++ and connected to Python using Cython.\n", "\n", "PyQInt mainly serves as an educational package to teach students how to perform (simple) electronic structure calculations wherein the most difficult task, i.e. the integral evaluation, is already encapsulated in a handy set of routines. With PyQInt, the student can for example build their own Hartree-Fock routine. Some common electronic structure routine, most notably the Hartree-Fock algorithm, is also readily available.\n", "\n", "PyQInt is fully open-source. Its source code can be found [here](https://github.com/ifilot/pyqint) and detailed documentation is available [here](https://pyqint.imc-tue.nl/).\n", "\n", "## Learning goals\n", "\n", "In this notebook, you will learn\n", "\n", "* How to build a molecule using Python\n", "* How to automatically build a basis set for a given molecule\n", "* How to perform a Hartree-Fock calculation\n", "* How to analyze and visualize the molecular orbitals\n", "\n", "## Building a molecule\n", "\n", "PyQInt offers two ways for constructing molecules\n", "\n", "* Building the molecule 'by hand' by providing a list of elements and their positions\n", "* Loading one of the pre-defined molecules\n", "\n", "Here, we will choose the latter and this is by the most convenient method for this tutorial." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "tags": [] }, "outputs": [], "source": [ "from pyqint import MoleculeBuilder\n", "mol = MoleculeBuilder().from_name('co') # load molecule directly from the molecule class\n", "mol.name = 'CO' # give the molecule an appropriate name\n", "print(mol) # print the atoms and their positions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 01\n", "\n", "Repeat the same process for the CH4 molecule. Note that the CH4 molecule is also available as one of the pre-defined molecules." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "tags": [] }, "outputs": [], "source": [ "# Objective: repeat the same process for the NH3 molecule\n", "from pyqint import MoleculeBuilder\n", "mol = MoleculeBuilder().from_name('ch4') # load molecule directly from the molecule class\n", "mol.name = 'CH4' # give the molecule an appropriate name\n", "print(mol) # print the atoms and their positions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Building the basis set\n", "\n", "To perform an electronic structure calculation, we often use a basis set. PyQInt uses a so-called localized basis set composed of contracted Gaussian functions. There are many ways to construct these contracted Gaussian functions, each balancing at the interface between computational expense and accuracy. A relatively minimalistic though perfectly fine basis set for **educational** purposes is the `sto3g` basis set.\n", "\n", "Before I show you how to build a basis set, let me briefly explain what a Gaussian Type Orbital (GTO) and a Contracted Gaussian Function (CGF) is. The GTO is defined via the formula as found below\n", "\n", "$$\n", "\\Phi(\\alpha,l,m,n,\\vec{R}) = N (x - X)^{l} (y - Y)^{m} (z - Z)^{n} \\exp \\left(- \\alpha |\\vec{r} - \\vec{R}|^{2} \\right)\n", "$$\n", "\n", "wherein $N$ is a normalization constant, $X,Y,Z$ correspond to the Cartesian coordinates of the center of the Gaussian, $\\alpha$ is related to the width of the Gaussian (larger $\\alpha$ is smaller width) and $\\vec{R}$ is the (Cartesian) vector position of the GTO. A contracted Gaussian function (CGF) can be constructed using a linear combination of GTOs via\n", "\n", "$$\n", "\\phi = \\sum_{i} c_{i} \\Phi_{i}(\\alpha,l,m,n,\\vec{R})\n", "$$\n", "\n", "In the script below, we will construct an `sto3g` basis set for the CO molecule. We can explicitly print out all the coefficients present in this set. Have a brief look at the output and convince yourself that we are essentially looping over the CGFs of C and of O. Both C and O each have 5 CGFs, corresponding to the 1s, 2s, 2px, 2py and 2pz atomic orbitals. Note that the coefficients for C and O differ. \n", "\n", ":::{tip}\n", "* To test your understanding: which atom has a more contracted 1s atomic orbital, C or O? \n", "* Can you rationalize this result based on the chemo-physical properties of C and O? (maybe the amount of protons has something to do with it...)\n", ":::" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "tags": [] }, "outputs": [], "source": [ "from pyqint import MoleculeBuilder\n", "mol = MoleculeBuilder().from_name('co') # load molecule directly from the molecule class\n", "mol.name = 'CO' # give the molecule an appropriate name\n", "print(mol) # print the atoms and their positions\n", "\n", "cgfs, nuclei = mol.build_basis('sto3g') # construct the basis set for the CO molecule\n", "for cgf in cgfs: # loop over the contracted gaussian functions and print their coefficient\n", " print(cgf)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 02\n", "\n", "Repeat on the procedure and provide an output of the CGFs for CH4. How many CGFs are there per H atom?" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "tags": [] }, "outputs": [], "source": [ "from pyqint import MoleculeBuilder\n", "\n", "mol = MoleculeBuilder().from_name('ch4') # load molecule directly from the molecule class\n", "mol.name = 'CH4' # give the molecule an appropriate name\n", "print(mol) # print the atoms and their positions\n", "\n", "cgfs, nuclei = mol.build_basis('sto3g') # construct the basis set for the CO molecule\n", "for cgf in cgfs: # loop over the contracted gaussian functions and print their coefficient\n", " print(cgf)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Performing a Hartree-Fock calculation\n", "\n", "Performing a Hartree-Fock calculation using `PyQInt` is done using the `HF` class. The procedure is fairly straightforward and shown in the script below. We essentially invoke a HF object, attach a molecule and a basis set to it and immediately perform the restricted Hartree-Fock calculation. The result is stored in a dictionary (here called `res`) which contains a relatively large set of data corresponding to the calculation." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "tags": [] }, "outputs": [], "source": [ "from pyqint import MoleculeBuilder, HF\n", "\n", "mol = MoleculeBuilder().from_name('co') # load molecule directly from the molecule class\n", "mol.name = 'CO' # give the molecule an appropriate name\n", "\n", "res = HF().rhf(mol, 'sto3g')\n", "print('Total electronic energy: ', res['energy'])" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "Besides the total electronic energy, the `res` dictionary actually contains many other elements. To get an overview of all the elements, we can run the following." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "tags": [] }, "outputs": [], "source": [ "print(res.keys())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Perhaps the most relevant items are the orbital energies and the coefficient matrix, as these essentially define the molecular orbitals. These two pieces of information are contained in `orbe` and `orbc`. Note that `orbe` is essentially a list of energies which is always ordered from low to high. The coefficient matrix is constructed such that each column corresponds to a separate molecular orbital. The rows on the other hand loop over the basis functions and are in the same order as the list of CGFs objects. In other words, the order of the basis function in the rows is as follows:\n", "\n", "1. O 1s\n", "2. O 2s\n", "3. O 2px\n", "4. O 2py\n", "5. O 2pz\n", "6. C 1s\n", "7. C 2s\n", "8. C 2px\n", "9. C 2py\n", "10. C 2pz\n", "\n", "Below, we will print the orbital energies and the coefficient matrix." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "tags": [] }, "outputs": [], "source": [ "print('Orbital energies: ', res['orbe'])\n", "print('Coefficient matrix: ', res['orbc'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As an example, let us explore the first molecular orbital and see how its coefficients look like." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "tags": [] }, "outputs": [], "source": [ "print(res['orbc'][:,0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From the result as shown above, we can readily observe that only the first element holds a signifant value while all other values are quite close to zero. The interpretation of this result is that the molecular orbital is nearly synonymous to the 1s atomic orbital on O. This result should not surprise us. We (should) know from our chemical intuition that the core atomic orbitals are not involved in chemical bonding; they are so incredibly contracted that their overlap with the other atomic orbitals is negligible.\n", "\n", "To get an easy overview of the coefficient matrix, hopefully aiding in your interpretation of the coefficients, one can use the `plot_matrix` function as provided below." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "tags": [] }, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "def plot_matrix(ax, mat):\n", " \"\"\"\n", " Produce plot of a matrix\n", " \n", " ax: axis object\n", " mat: matrix object\n", " \"\"\"\n", " ax.imshow(mat, vmin=-1, vmax=1, cmap='PiYG')\n", " for i in range(mat.shape[0]):\n", " for j in range(mat.shape[1]):\n", " ax.text(i, j, '%.2f' % mat[j,i], ha='center', va='center',\n", " fontsize=7)\n", " ax.set_xticks([])\n", " ax.set_yticks([])\n", " ax.hlines(np.arange(1, mat.shape[0])-0.5, -0.5, mat.shape[0] - 0.5,\n", " color='black', linestyle='--', linewidth=1)\n", " ax.vlines(np.arange(1, mat.shape[0])-0.5, -0.5, mat.shape[0] - 0.5,\n", " color='black', linestyle='--', linewidth=1)\n", " \n", "# create figure and plot object\n", "fig, ax = plt.subplots(1,1)\n", "plot_matrix(ax, res['orbc'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When using a localized and minimal basis set such as `sto3g`, we are in the unique position that we can easily interpret the coefficient values because each basis function corresponds to a single atomic orbital. For larger and more complex basis sets, such a one-to-one association is not always possible. For example, if we consider the third molecular orbital (i.e. the third **column**), we can readily see that the dominant contributions are from the 2s and 2pz atomic orbitals on O and the 2s and 2pz atomic orbitals on C. There are also some minor contributions from the 1s atomic orbitals on both C and O. The contribution of 2px and 2py is negligible for both C and O." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 03\n", "\n", "Perform the following tasks\n", "\n", "1. Perform a Hartree-Fock calculation of the CH4 molecule\n", "2. Visualize the coefficient matrix\n", "3. List the orbital energies\n", "\n", "and use the output to answer the following questions\n", "\n", "1. Which molecular orbitals are core orbitals? How can you tell this?\n", "2. Among the occupied molecular orbitals, you will find 2 non-degenerate orbitals and 1 triple-degenerate set. Identify these molecular orbitals.\n", "\n", ":::{tip}\n", "If you executed the previous cell, the `plot_matrix()` function is still loaded in memory and you can readily recycle it here.\n", ":::" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "tags": [] }, "outputs": [], "source": [ "from pyqint import MoleculeBuilder, HF\n", "\n", "mol = MoleculeBuilder().from_name('ch4') # load molecule directly from the molecule class\n", "mol.name = 'CH4' # give the molecule an appropriate name\n", "\n", "# Objective, fill out the remainder of the instructions below:\n", "res = HF().rhf(mol, 'sto3g')\n", "\n", "fig, ax = plt.subplots(1,1)\n", "plot_matrix(ax, res['orbc'])\n", "print('Orbital energies: ', res['orbe'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualizing molecular orbitals\n", "\n", "Orbital visualization is a semi-complex process, mainly because molecular orbitals are inherently three-dimensional objects. The most common visualization methods are using contour plots or using isosurfaces. We will first explore using contour plots as this can be fully done within this notebook.\n", "\n", "### Contour plots\n", "\n", "For building a contour plot, we first need to produce a scalar field mapped to a plain. The PyQInt library offers a few convenience functions for this. We will demonstrate the procedure using our example for the CO molecule." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "tags": [] }, "outputs": [], "source": [ "from pyqint import PyQInt\n", "\n", "mol = MoleculeBuilder().from_name('co') # load molecule directly from the molecule class\n", "mol.name = 'CO' # give the molecule an appropriate name\n", "res = HF().rhf(mol, 'sto3g')\n", "\n", "# we extract some useful data from the res object\n", "cgfs = res['cgfs'] # list of contracted Gaussian functions\n", "C = res['orbc'] # coefficient matrix\n", "coeff = C[:,2] # we take the coefficients of the 3rd molecular orbital (remember, Python has zero-based indexing)\n", "\n", "# build \"integrator\" object (the name is a bit misleading, but this class is responsible for the evaluation of the molecular integrals, hence the name)\n", "integrator = PyQInt()\n", "\n", "# we here manually construct a grid, composed of 100x100 points, located on the xz-plane\n", "x = np.linspace(-2, 2, 100)\n", "z = np.linspace(-2, 2, 100)\n", "xx, zz = np.meshgrid(x,z)\n", "yy = np.zeros(len(x) * len(z))\n", "grid = np.vstack([xx.flatten(), yy, zz.flatten()]).reshape(3,-1).T\n", "sf = integrator.plot_wavefunction(grid, coeff, cgfs).reshape((len(z), len(x)))\n", "\n", "# plot wave function; note that I have hardcoded the lower and upper limits of the plot to +/- 0.5;\n", "# these values are a bit arbitrary and consider adjusting them for your situation\n", "plt.figure()\n", "plt.imshow(sf, origin='lower', extent=[-2,2,-2,2], cmap='PiYG', \n", " vmin=-0.5, vmax=0.5)\n", "plt.colorbar()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The gods of programming dictate that we should consider encapsulating common functionality in functions (hence the name), so let me drop this nice function such that you can easily produce plots of molecular orbitals on the xz plane." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "tags": [] }, "outputs": [], "source": [ "def produce_plot_plane_xz(cgfs, coeff, dim=-2, nrpoints=100, title=None):\n", " \"\"\"\n", " Produce a contour plot of a molecular orbital\n", " \n", " cgfs: list of contracted gaussian function objects\n", " coeff: list of linear expansion coefficients\n", " dim: dimension of the plane\n", " nrpoints: number of sampling points per cartesian direction\n", " \"\"\"\n", " # build \"integrator\" object (the name is a bit misleading, but this class is responsible for the evaluation of the molecular integrals, hence the name)\n", " integrator = PyQInt()\n", "\n", " # we here manually construct a grid, composed of 100x100 points, located on the xz-plane\n", " x = np.linspace(-dim, dim, 100)\n", " z = np.linspace(-dim, dim, 100)\n", " xx, zz = np.meshgrid(x,z)\n", " yy = np.zeros(len(x) * len(z))\n", " grid = np.vstack([xx.flatten(), yy, zz.flatten()]).reshape(3,-1).T\n", " sf = integrator.plot_wavefunction(grid, coeff, cgfs).reshape((len(z), len(x)))\n", "\n", " # plot wave function; note that I have hardcoded the lower and upper limits of the plot to +/- 0.5;\n", " # these values are a bit arbitrary and consider adjusting them for your situation\n", " plt.figure()\n", " plt.imshow(sf, origin='lower', extent=[-dim,dim,-dim,dim], cmap='PiYG', \n", " vmin=-0.5, vmax=0.5)\n", " plt.colorbar()\n", " plt.xlabel('x [a.u.]')\n", " plt.ylabel('z [a.u.]')\n", " \n", " if title is not None:\n", " plt.title(title)\n", " \n", "# create a contour plot of the 4th molecular orbital\n", "produce_plot_plane_xz(cgfs, C[:,3])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ":::{warning}\n", "Building contour plots might seem rather straightforward, but let me emphasize that it is only simple if you can easily project the wave function onto a plane. The xz-plane is not always the best plane for the projection and if you have a fairly complex molecule such as CH$_{4}$, we might not even expect that one single plane is going to produce a good overview of the wave function distribution.\n", ":::" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise 04\n", "\n", "* Open a browser and load this picture: https://pyqint.imc-tue.nl/_images/co_canonical_isosurfaces.jpg. Compare the contour plots previously generated to this picture and convince yourself that the three-dimensional representation matches those found in the contour plots.\n", "* Construct the contour plots of the 6th molecular orbital of CO.\n", "* Try to construct the contour plots of the 4th and 5th molecular orbital. You will run into a problem for the 5th molecular orbital. Why is that? If you are a bit familiar with the molecular orbitals of CO, note that the 4th and 5th orbitals are a double-degenerate pair connected via a 90 degree rotation. Why is that problematic when plotting on the same plane? How would you resolve this?\n", "\n", ":::{tip}\n", "If you executed the cells in order, the result object of the Hartree-Fock calculation for CO should still be in memory, just like the nice plotting function.\n", ":::" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "tags": [] }, "outputs": [], "source": [ "# Loop over orbitals 4,5 and 6. 4 and 5 are a degenerate pair.\n", "for i in range(4,7):\n", " produce_plot_plane_xz(cgfs, C[:,i], dim=3, title='Orbital: %i' % i)" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "### Isosurfaces\n", "\n", "Finally a few words on plotting isosurfaces. Isosurfaces are fully three-dimensional objects. An isosurface is constructed by first building a three-dimensional scalar field in some unit cell. Next, all points within that unit cell that have the same value are stitched together, yielding a two-dimensional surface in three-dimensional space. Since molecular orbitals are wave functions, they have a positive and a negative lobe. As such, we need to execute an isosurface construction algorithm twice.\n", "\n", "For constructing isosurfaces, we can make use of PyTessel which is a sister-package of PyQInt and both play along nicely. Let me again demontrate the procedure by means of an example script. We are of course going to recycle our CO calculation as this is still in memory.\n", "\n", ":::{important}\n", "The isosurface calculations will generate `.ply` objects. These objects **cannot** be easily visualized in JupyterLab and hence these will be stored on the hard drive. You will have to open these files in another program for the visualization. The `.ply` files will be placed in the same folder as where the notebook file resides.\n", ":::" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "tags": [] }, "outputs": [], "source": [ "import os\n", "from pytessel import PyTessel\n", "\n", "def build_isosurface(filename, cgfs, coeff, isovalue):\n", " \"\"\"\n", " Build the isosurface of a molecular orbital. Note that the dimensions of \n", " the unit cell are hardcoded to be 10x10x10 centered at the origin.\n", " \n", " Good trial isovalues are +/- 0.1\n", " \n", " filename: output filename, should end in .ply\n", " cgfs: list of CGF objects\n", " coeff: list of linear expansion coefficients\n", " isovalue: isovalue of the isosurface\n", " \"\"\"\n", " sz = 100\n", " integrator = PyQInt()\n", " grid = integrator.build_rectgrid3d(-5, 5, sz)\n", " scalarfield = np.reshape(integrator.plot_wavefunction(grid, coeff, cgfs), (sz, sz, sz))\n", " unitcell = np.diag(np.ones(3) * 10.0)\n", "\n", " pytessel = PyTessel()\n", " vertices, normals, indices = pytessel.marching_cubes(scalarfield.flatten(), scalarfield.shape, unitcell.flatten(), isovalue)\n", " pytessel.write_ply(filename, vertices, normals, indices)\n", " \n", "# build positive and negative lobe of the 5th molecular orbital\n", "build_isosurface('co_04_pos.ply', cgfs, C[:,4], 0.1)\n", "build_isosurface('co_04_neg.ply', cgfs, C[:,4], -0.1)\n", "\n", "# print storage location\n", "print('You can find the .ply files here: ', os.getcwd())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Browse to the path as indicated above. In the folder, you should find two `.ply` files. Try to open these files. I recommend loading both in a visualization program such as [Blender](https://www.blender.org/) for a nice rendition." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If everything went according to plan, you should see an isosurface very similar to the 5th MO (top row, right-most image) as shown in the picture below below.\n", "\n", "![isosurfaces of the canonical molecular orbitals of CO](https://ifilot.pages.tue.nl/pyqint-han-sur-lesse-jupyterlab/_images/MO_CO_CAN.png)\n", "\n", ":::{note}\n", "The image above is grabbed from the internet and might require an active internet connection.\n", ":::" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "#### Exercise 05\n", "\n", "* Perform a Hartree-Fock calculation of the CH4 molecule.\n", "* Build the isosurfaces for all occupied molecular orbitals.\n", "* Visualize all isosurfaces in a program such as Blender.\n", "\n", ":::{hint}\n", "* Take care defining the names for the files. You ideally want to use the word 'CH4' in the filename.\n", "* You have to use the `build_isosurface` method twice per molecular orbital. Once to render the positive lobe, once to render the negative lobe.\n", ":::" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "tags": [] }, "outputs": [], "source": [ "mol = MoleculeBuilder().from_name('ch4') # load molecule directly from the molecule class\n", "mol.name = 'CH4' # give the molecule an appropriate name\n", "res = HF().rhf(mol, 'sto3g')\n", "\n", "# let me already give you the for loop ;-)\n", "for i in range(0,5):\n", "print('Building MO %i' % (i+1))\n", " build_isosurface('mo_ch4_%02i_pos.ply' % (i+1), cgfs, res['orbc'][:,i], 0.1)\n", " build_isosurface('mo_ch4_%02i_neg.ply' % (i+1), cgfs, res['orbc'][:,i], -0.1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If everything went according to plan, you should see isosurfaces very similar to the picture as shown below.\n", "\n", "![isosurfaces of the canonical molecular orbitals of CH4](https://ifilot.pages.tue.nl/pyqint-han-sur-lesse-jupyterlab/_images/MO_CH4_CAN.png)\n", "\n", ":::{note}\n", "The image above is grabbed from the internet and might require an active internet connection.\n", ":::" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 5 }