Stack Abuse: Getting Started with Python’s Wikipedia API

Introduction

In this article, we will be using the Wikipedia API to retrieve data from Wikipedia. Data scraping has seen a rapid surge owing to the increasing use of data analytics and machine learning tools. The Internet is the single largest source of information, and therefore it is important to know how to fetch data from various sources. And with Wikipedia being one of the largest and most popular sources for information on the Internet, this is a natural place to start.

In this article, we will see how to use Python’s Wikipedia API to fetch a variety of information from the Wikipedia website.

Installation

In order to extract data from Wikipedia, we must first install the Python Wikipedia library, which wraps the official Wikipedia API. This can be done by entering the command below in your command prompt or terminal:

$   pip install wikipedia 

Once the installation is done, we can use the Wikipedia API in Python to extract information from Wikipedia. In order to call the methods of the Wikipedia module in Python, we need to import it using the following command.

import wikipedia   

Searching Titles and Suggestions

The search() method runs a Wikipedia search for the query passed to it as an argument and returns a list of article titles that match the query. For example:

import wikipedia
print(wikipedia.search("Bill"))

Output:

['Bill', 'The Bill', 'Bill Nye', 'Bill Gates', 'Bills, Bills, Bills', 'Heartbeat bill', 'Bill Clinton', 'Buffalo Bill', 'Bill & Ted', 'Kill Bill: Volume 1'] 

As you see in the output, the searched title along with the related search suggestions are displayed. You can configure the number of search titles returned by passing a value for the results parameter, as shown here:

import wikipedia
print(wikipedia.search("Bill", results=2))

Output:

['Bill', 'The Bill'] 

The above code prints only 2 search results of the query since that is how many we requested to be returned.

Let’s say we need to get the Wikipedia search suggestion for a search title, “Bill Cliton”, that is incorrectly entered or has a typo. The suggest() method returns the suggested title for the search query passed to it, or None if no suggestion was found.

Let’s try it out here:

import wikipedia
print(wikipedia.suggest("Bill cliton"))

Output:

bill clinton   

You can see that it took our incorrect entry, “Bill cliton”, and returned the correct suggestion of “bill clinton”.
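Since the wikipedia package is a wrapper around the MediaWiki web API, it can help to see roughly what a search request looks like on the wire. The sketch below only builds the URL with the standard library and sends nothing. Note that build_search_url is a hypothetical helper written for this illustration, and the exact parameters the package sends may differ in detail.

```python
from urllib.parse import urlencode


def build_search_url(query, results=10, lang="en"):
    """Return the URL of a MediaWiki full-text search request (nothing is sent)."""
    params = {
        "action": "query",
        "list": "search",      # full-text search module of the MediaWiki API
        "srsearch": query,     # the search string
        "srlimit": results,    # maximum number of titles to return
        "format": "json",
    }
    return f"https://{lang}.wikipedia.org/w/api.php?" + urlencode(params)


print(build_search_url("Bill", results=2))
# https://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=Bill&srlimit=2&format=json
```

Opening that URL in a browser returns the raw JSON that a wrapper like this one would turn into a plain list of titles.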

Extracting Wikipedia Article Summary

We can extract the summary of a Wikipedia article using the summary() method. The article for which the summary needs to be extracted is passed as a parameter to this method.

Let’s extract the summary for “Ubuntu”:

print(wikipedia.summary("Ubuntu"))   

Output:

Ubuntu ( (listen)) is a free and open-source Linux distribution based on Debian. Ubuntu is officially released in three editions: Desktop, Server, and Core (for the internet of things devices and robots). Ubuntu is a popular operating system for cloud computing, with support for OpenStack.Ubuntu is released every six months, with long-term support (LTS) releases every two years. The latest release is 19.04 ("Disco Dingo"), and the most recent long-term support release is 18.04 LTS ("Bionic Beaver"), which is supported until 2028. Ubuntu is developed by Canonical and the community under a meritocratic governance model. Canonical provides security updates and support for each Ubuntu release, starting from the release date and until the release reaches its designated end-of-life (EOL) date. Canonical generates revenue through the sale of premium services related to Ubuntu. Ubuntu is named after the African philosophy of Ubuntu, which Canonical translates as "humanity to others" or "I am what I am because of who we all are".   

The whole summary is printed in the output. We can customize the number of sentences in the summary text to be displayed by configuring the sentences argument of the method.

print(wikipedia.summary("Ubuntu", sentences=2))   

Output:

Ubuntu ( (listen)) is a free and open-source Linux distribution based on Debian. Ubuntu is officially released in three editions: Desktop, Server, and Core (for the internet of things devices and robots).   

As you can see, only 2 sentences of Ubuntu’s text summary are printed.

However, keep in mind that wikipedia.summary will raise a DisambiguationError if the title is ambiguous, and a PageError if no matching page exists. Let’s see an example.

print(wikipedia.summary("key"))   

The above code throws a DisambiguationError since there are many articles that would match “key”.

Output:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/wikipedia/util.py", line 28, in __call__
    ret = self._cache[key] = self.fn(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/wikipedia/wikipedia.py", line 231, in summary
    page_info = page(title, auto_suggest=auto_suggest, redirect=redirect)
  File "/Library/Python/2.7/site-packages/wikipedia/wikipedia.py", line 276, in page
    return WikipediaPage(title, redirect=redirect, preload=preload)
  File "/Library/Python/2.7/site-packages/wikipedia/wikipedia.py", line 299, in __init__
    self.__load(redirect=redirect, preload=preload)
  File "/Library/Python/2.7/site-packages/wikipedia/wikipedia.py", line 393, in __load
    raise DisambiguationError(getattr(self, 'title', page['title']), may_refer_to)
wikipedia.exceptions.DisambiguationError: "Key" may refer to:
Key (cryptography)
Key (lock)
Key (map)
...

If you had wanted the summary on a “cryptography key”, for example, then you’d have to enter it as the following:

print(wikipedia.summary("Key (cryptography)"))   

With the more specific query we now get the correct summary in the output.
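In a script you usually don't want an ambiguous title to crash the program. Here is a minimal sketch of one way to handle it, assuming the package's DisambiguationError carries the candidate titles in its options attribute. The helper name summary_or_options and the optional wiki argument (which lets you pass a stand-in module when testing offline) are made up for this example.

```python
def summary_or_options(title, wiki=None, sentences=2):
    """Return ("summary", text), ("ambiguous", [titles]) or ("missing", None)."""
    if wiki is None:
        import wikipedia as wiki  # the real package, imported only when needed
    try:
        return "summary", wiki.summary(title, sentences=sentences)
    except wiki.exceptions.DisambiguationError as err:
        # The title matched several articles; hand the candidates back to the caller.
        return "ambiguous", list(err.options)
    except wiki.exceptions.PageError:
        # No article matched the title at all.
        return "missing", None
```

With the real package installed, summary_or_options("key") should come back as an "ambiguous" result listing titles such as "Key (cryptography)", which you can then pass to summary() one at a time.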

Retrieving Full Wikipedia Page Data

In order to get the contents, categories, coordinates, images, links and other metadata of a Wikipedia page, we must first get the Wikipedia page object or the page ID for the page. To do this, the page() method is used with the page title passed as an argument to the method.

Look at the following example:

wikipedia.page("Ubuntu")   

This method call will return a WikipediaPage object, which we’ll explore more in the next few sections.

Extracting Metadata of a Page

To get the complete plain text content of a Wikipedia page (excluding images, tables, etc.), we can use the content attribute of the page object.

print(wikipedia.page("Python").content)   

Output:

Python is an interpreted, high-level, general-purpose programming language. Created by Guido van Rossum and first released in 1991, Python's design philosophy emphasizes code readability with its notable use of significant whitespace. Its language constructs and object-oriented approach aims to help programmers write clear, logical code for small and large-scale projects.Python is dynamically typed and garbage-collected. It supports multiple programming paradigms, including procedural, object-oriented, and  functional programming. Python is often described as a "batteries included" language due to its comprehensive standard library.Python was conceived in the late 1980s as a successor to the ABC language. Python 2.0, released 2000, introduced features like list comprehensions and a garbage collection system capable of collecting reference cycles.   ... 

Similarly, we can get the URL of the page using the url attribute:

print(wikipedia.page("Python").url)   

Output:

https://en.wikipedia.org/wiki/Python_(programming_language)   

We can get the URLs of external links on a Wikipedia page by using the references property of the WikipediaPage object.

print(wikipedia.page("Python").references)   

Output:

[u'http://www.computerworld.com.au/index.php/id;66665771', u'http://neopythonic.blogspot.be/2009/04/tail-recursion-elimination.html', u'http://www.amk.ca/python/writing/gvr-interview', u'http://cdsweb.cern.ch/journal/CERNBulletin/2006/31/News%20Articles/974627?ln=en', u'http://www.2ality.com/2013/02/javascript-influences.html', ...] 

The title property of the WikipediaPage object can be used to extract the title of the page.

print(wikipedia.page("Python").title)   

Output:

Python (programming language)   

Similarly, the categories attribute can be used to get the list of categories of a Wikipedia page:

print(wikipedia.page("Python").categories)   

Output

['All articles containing potentially dated statements', 'Articles containing potentially dated statements from August 2016', 'Articles containing potentially dated statements from December 2018', 'Articles containing potentially dated statements from March 2018', 'Articles with Curlie links', 'Articles with short description', 'Class-based programming languages', 'Computational notebook', 'Computer science in the Netherlands', 'Cross-platform free software', 'Cross-platform software', 'Dutch inventions', 'Dynamically typed programming languages', 'Educational programming languages', 'Good articles', 'High-level programming languages', 'Information technology in the Netherlands', 'Object-oriented programming languages', 'Programming languages', 'Programming languages created in 1991', 'Python (programming language)', 'Scripting languages', 'Text-oriented programming languages', 'Use dmy dates from August 2015', 'Wikipedia articles with BNF identifiers', 'Wikipedia articles with GND identifiers', 'Wikipedia articles with LCCN identifiers', 'Wikipedia articles with SUDOC identifiers'] 

The links property of the WikipediaPage object can be used to get the list of titles of the pages that are linked from the page.

print(wikipedia.page("Ubuntu").links)   

Output

[u'/e/ (operating system)', u'32-bit', u'4MLinux', u'ALT Linux', u'AMD64', u'AOL', u'APT (Debian)', u'ARM64', u'ARM architecture', u'ARM v7', ...] 
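The examples above call wikipedia.page() once per attribute, and each call looks the page up again. A small sketch of the alternative is to fetch the page object once and read several attributes from it. The helper page_overview is hypothetical; it relies only on the attribute names shown above (title, url, categories, references, links).

```python
def page_overview(page):
    """Summarize a WikipediaPage-like object as a plain dict."""
    return {
        "title": page.title,
        "url": page.url,
        "n_categories": len(page.categories),
        "n_references": len(page.references),
        "first_links": list(page.links[:5]),  # only the first few linked titles
    }


# Intended use with the real package (this performs network requests):
#   import wikipedia
#   ubuntu = wikipedia.page("Ubuntu")   # fetch once
#   print(page_overview(ubuntu))        # reuse the same object
```

Because the function only reads attributes, it also works with any stand-in object that has the same fields, which is handy for testing without network access.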

Finding Pages Based on Coordinates

The geosearch() method is used to do a Wikipedia geo search using latitude and longitude arguments supplied as float or decimal numbers to the method.

print(wikipedia.geosearch(37.787, -122.4))   

Output:

['140 New Montgomery', 'New Montgomery Street', 'Cartoon Art Museum', 'San Francisco Bay Area Planning and Urban Research Association', 'Academy of Art University', 'The Montgomery (San Francisco)', 'California Historical Society', 'Palace Hotel Residential Tower', 'St. Regis Museum Tower', 'Museum of the African Diaspora'] 

As you see, the above method returns articles based on the coordinates provided.

A WikipediaPage object also exposes a coordinates property, which returns the latitude and longitude of an article that has a geolocation. Note that page() expects a title or page ID rather than coordinates, so geosearch() remains the way to go from coordinates to articles. For example:

print(wikipedia.page("140 New Montgomery").coordinates)

This prints a tuple of two Decimal values, the latitude followed by the longitude.

Language Settings

You can customize the language of a Wikipedia page to your native language, provided the page exists in your native language. To do so, you can use the set_lang() method. Each language has a standard prefix code which is passed as an argument to the method. For example, let’s get the first 2 sentences of the summary text of the “Ubuntu” wiki page in German.

wikipedia.set_lang("de")
print(wikipedia.summary("ubuntu", sentences=2))

Output

Ubuntu (auch Ubuntu Linux) ist eine Linux-Distribution, die auf Debian basiert. Der Name Ubuntu bedeutet auf Zulu etwa „Menschlichkeit“ und bezeichnet eine afrikanische Philosophie.   

You can check the list of currently supported languages along with their prefixes, as follows:

print(wikipedia.languages())   
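One thing to watch out for is that set_lang() changes the language for every later call, not just the next one. A possible way to keep that contained is a small context manager that switches the language and switches it back afterwards. This is only a sketch: wiki_language is a made-up helper, and because I'm not aware of a getter for the current language, it restores to "en" by default, which is an assumption.

```python
from contextlib import contextmanager


@contextmanager
def wiki_language(wiki, prefix, restore_to="en"):
    """Temporarily switch the language of a wikipedia-like module."""
    wiki.set_lang(prefix)
    try:
        yield wiki
    finally:
        # Always switch back, even if the body raised an exception.
        wiki.set_lang(restore_to)


# Intended use with the real package (performs network requests):
#   import wikipedia
#   with wiki_language(wikipedia, "de"):
#       print(wikipedia.summary("ubuntu", sentences=2))
```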

Retrieving Images in a Wikipedia Page

The images list of the WikipediaPage object can be used to fetch images from a Wikipedia page. For instance, the following script returns the first image from Wikipedia’s Ubuntu page:

print(wikipedia.page("ubuntu").images[0])   

Output

https://upload.wikimedia.org/wikipedia/commons/1/1d/Bildschirmfoto_zu_ubuntu_704.png   

The above code returns the URL of the image present at index 0 in the Wikipedia page.

To see the image, you can copy and paste the above URL into your browser.

Retrieving Full HTML Page Content

To get the full Wikipedia page in HTML format, you can use the following script:

print(wikipedia.page("Ubuntu").html())   

Output

<div class="mw-parser-output"><div role="note" class="hatnote navigation-not-searchable">For the African philosophy, see <a href="/wiki/Ubuntu_philosophy" title="Ubuntu philosophy">Ubuntu philosophy</a>. For other uses, see <a href="/wiki/Ubuntu_(disambiguation)" class="mw-disambig" title="Ubuntu (disambiguation)">Ubuntu (disambiguation)</a>.</div>   <div class="shortdescription nomobile noexcerpt noprint searchaux" style="display:none">Linux distribution based on Debian</div>   ... 

As seen in the output, the entire page in HTML format is displayed. This can take a bit longer to load if the page size is large, so keep in mind that it can raise an HTTPTimeoutError when a request to the server times out.
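If timeouts are a concern, one common approach is to retry the call a few times before giving up. The following is a generic sketch rather than anything built into the package: call_with_retries is a hypothetical helper, and which exception you pass as retry_on (for example wikipedia.exceptions.HTTPTimeoutError) is up to you.

```python
import time


def call_with_retries(fn, retry_on, attempts=3, delay=1.0):
    """Call fn(); if it raises retry_on, wait and try again up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the last error propagate
            time.sleep(delay * attempt)  # wait a little longer after each failure


# Intended use with the real package (performs network requests):
#   import wikipedia
#   html = call_with_retries(
#       lambda: wikipedia.page("Ubuntu").html(),
#       wikipedia.exceptions.HTTPTimeoutError,
#   )
```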

Conclusion

In this tutorial, we had a glimpse of using the Wikipedia API for extracting data from the web. We saw how to get a variety of information such as a page’s title, category, links, images, and retrieve articles based on geo-locations.

Planet Python

Stack Abuse: Python for NLP: Getting Started with the StanfordCoreNLP Library

This is the ninth article in my series of articles on Python for NLP. In the previous article, we saw how Python’s Pattern library can be used to perform a variety of NLP tasks ranging from tokenization to POS tagging, and text classification to sentiment analysis. Before that we explored the TextBlob library for performing similar natural language processing tasks.

In this article, we will explore the StanfordCoreNLP library, which is another extremely handy library for natural language processing. We will see different features of StanfordCoreNLP with the help of examples. So before wasting any further time, let’s get started.

Setting up the Environment

The installation process for StanfordCoreNLP is not as straightforward as for other Python libraries. As a matter of fact, StanfordCoreNLP is a library that’s actually written in Java. Therefore, make sure you have Java installed on your system. You can download the latest version of Java freely.

Once you have Java installed, you need to download the JAR files for the StanfordCoreNLP libraries. The JAR file contains models that are used to perform different NLP tasks. To download the JAR files for the English models, download and unzip the folder located at the official StanfordCoreNLP website.

The next thing you have to do is run the server that will serve the requests sent by the Python wrapper to the StanfordCoreNLP library. Navigate to the folder where you unzipped the JAR files and execute the following command on the command prompt:

$   java -mx6g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -timeout 10000 

The above command starts the StanfordCoreNLP server. The parameter -mx6g specifies that the memory used by the server should not exceed 6 gigabytes. It is important to mention that you should be running a 64-bit system in order to have a heap as big as 6GB. If you are running a 32-bit system, you might have to reduce the memory size dedicated to the server.

Once you run the above command, you should see the following output:

[main] INFO CoreNLP - --- StanfordCoreNLPServer#main() called ---
[main] INFO CoreNLP - setting default constituency parser
[main] INFO CoreNLP - warning: cannot find edu/stanford/nlp/models/srparser/englishSR.ser.gz
[main] INFO CoreNLP - using: edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz instead
[main] INFO CoreNLP - to use shift reduce parser download English models jar from:
[main] INFO CoreNLP - http://stanfordnlp.github.io/CoreNLP/download.html
[main] INFO CoreNLP -     Threads: 8
[main] INFO CoreNLP - Starting server...
[main] INFO CoreNLP - StanfordCoreNLPServer listening at /0:0:0:0:0:0:0:0:9000

The server is running at port 9000.

Now the final step is to install the Python wrapper for the StanfordCoreNLP library. The wrapper we will be using is pycorenlp. The following script downloads the wrapper library:

$   pip install pycorenlp 

Now we are all set to connect to the StanfordCoreNLP server and perform the desired NLP tasks.

To connect to the server, we have to pass the address of the StanfordCoreNLP server that we initialized earlier to the StanfordCoreNLP class of the pycorenlp module. The object returned can then be used to perform NLP tasks. Look at the following script:

from pycorenlp import StanfordCoreNLP

nlp_wrapper = StanfordCoreNLP('http://localhost:9000')
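Creating the wrapper object does not by itself prove that the Java server is reachable, so a quick connectivity check can save some head-scratching later. Here is a minimal sketch using only the standard library; server_is_up is a hypothetical helper, and the host and port defaults simply mirror the address used above.

```python
import socket


def server_is_up(host="localhost", port=9000, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and name resolution failures.
        return False
```

If server_is_up() returns False, check that the java command from the previous step is still running in its terminal.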

Performing NLP Tasks

In this section, we will briefly explore the use of the StanfordCoreNLP library for performing common NLP tasks.

Lemmatization, POS Tagging and Named Entity Recognition

Lemmatization, parts of speech tagging, and named entity recognition are the most basic NLP tasks. The StanfordCoreNLP library supports pipeline functionality that can be used to perform these tasks in a structured way.

In the following script, we will create an annotator which first splits a document into sentences and then further splits the sentences into words or tokens. The words are then annotated with the POS and named entity recognition tags.

doc = "Ronaldo has moved from Real Madrid to Juventus. While messi still plays for Barcelona"

annot_doc = nlp_wrapper.annotate(doc,
    properties={
        'annotators': 'ner, pos',
        'outputFormat': 'json',
        'timeout': 1000,
    })

In the script above we have a document with two sentences. We use the annotate method of the StanfordCoreNLP wrapper object that we initialized earlier. The method takes the text to annotate and a properties dictionary, which here contains three entries. The annotators property takes the type of annotation we want to perform on the text. We pass 'ner, pos' as its value, which specifies that we want to annotate our document for POS tags and named entities.

The outputFormat property defines the format in which you want the annotated text. The possible values include json for JSON objects, xml for XML format, text for plain text, and serialized for serialized data.

The final property is the timeout in milliseconds, which defines how long the wrapper should wait for the response from the server before timing out.
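Under the hood, a wrapper like this talks to the server over plain HTTP: the text goes in the POST body and the properties travel as a JSON string in the URL. The sketch below builds such a request with the standard library without sending it. build_annotate_request is a hypothetical helper, and the exact request that pycorenlp produces may differ in detail.

```python
import json
from urllib.parse import parse_qs, urlencode, urlparse
from urllib.request import Request


def build_annotate_request(text, annotators, server="http://localhost:9000"):
    """Build (but do not send) a POST request for a CoreNLP server."""
    properties = {"annotators": annotators, "outputFormat": "json"}
    query = urlencode({"properties": json.dumps(properties)})
    return Request(f"{server}/?{query}", data=text.encode("utf-8"), method="POST")


req = build_annotate_request("Ronaldo has moved to Juventus.", "ner,pos")
print(req.get_method())
# Decode the query string again to show the properties the server would receive.
print(parse_qs(urlparse(req.full_url).query)["properties"][0])
```

With the server running, passing the request to urllib.request.urlopen() should return the same kind of JSON document that the wrapper hands back as a Python dictionary.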

In the output, you should see a JSON object as follows:

 
{'sentences': [{'index': 0, 'entitymentions': [{'docTokenBegin': 0, 'docTokenEnd': 1, 'tokenBegin': 0, 'tokenEnd': 1, 'text': 'Ronaldo', 'characterOffsetBegin': 0, 'characterOffsetEnd': 7, 'ner': 'PERSON'}, {'docTokenBegin': 4, 'docTokenEnd': 6, 'tokenBegin': 4, 'tokenEnd': 6, 'text': 'Real Madrid', 'characterOffsetBegin': 23, 'characterOffsetEnd': 34, 'ner': 'ORGANIZATION'}, {'docTokenBegin': 7, 'docTokenEnd': 8, 'tokenBegin': 7, 'tokenEnd': 8, 'text': 'Juventus', 'characterOffsetBegin': 38, 'characterOffsetEnd': 46, 'ner': 'ORGANIZATION'}], 'tokens': [{'index': 1, 'word': 'Ronaldo', 'originalText': 'Ronaldo', 'lemma': 'Ronaldo', 'characterOffsetBegin': 0, 'characterOffsetEnd': 7, 'pos': 'NNP', 'ner': 'PERSON', 'before': '', 'after': ' '}, {'index': 2, 'word': 'has', 'originalText': 'has', 'lemma': 'have', 'characterOffsetBegin': 8, 'characterOffsetEnd': 11, 'pos': 'VBZ', 'ner': 'O', 'before': ' ', 'after': ' '}, {'index': 3, 'word': 'moved', 'originalText': 'moved', 'lemma': 'move', 'characterOffsetBegin': 12, 'characterOffsetEnd': 17, 'pos': 'VBN', 'ner': 'O', 'before': ' ', 'after': ' '}, {'index': 4, 'word': 'from', 'originalText': 'from', 'lemma': 'from', 'characterOffsetBegin': 18, 'characterOffsetEnd': 22, 'pos': 'IN', 'ner': 'O', 'before': ' ', 'after': ' '}, {'index': 5, 'word': 'Real', 'originalText': 'Real', 'lemma': 'real', 'characterOffsetBegin': 23, 'characterOffsetEnd': 27, 'pos': 'JJ', 'ner': 'ORGANIZATION', 'before': ' ', 'after': ' '}, {'index': 6, 'word': 'Madrid', 'originalText': 'Madrid', 'lemma': 'Madrid', 'characterOffsetBegin': 28, 'characterOffsetEnd': 34, 'pos': 'NNP', 'ner': 'ORGANIZATION', 'before': ' ', 'after': ' '}, {'index': 7, 'word': 'to', 'originalText': 'to', 'lemma': 'to', 'characterOffsetBegin': 35, 'characterOffsetEnd': 37, 'pos': 'TO', 'ner': 'O', 'before': ' ', 'after': ' '}, {'index': 8, 'word': 'Juventus', 'originalText': 'Juventus', 'lemma': 'Juventus', 'characterOffsetBegin': 38, 'characterOffsetEnd': 46, 'pos': 'NNP', 
'ner': 'ORGANIZATION', 'before': ' ', 'after': ''}, {'index': 9, 'word': '.', 'originalText': '.', 'lemma': '.', 'characterOffsetBegin': 46, 'characterOffsetEnd': 47, 'pos': '.', 'ner': 'O', 'before': '', 'after': ' '}]}, {'index': 1, 'entitymentions': [{'docTokenBegin': 14, 'docTokenEnd': 15, 'tokenBegin': 5, 'tokenEnd': 6, 'text': 'Barcelona', 'characterOffsetBegin': 76, 'characterOffsetEnd': 85, 'ner': 'ORGANIZATION'}], 'tokens': [{'index': 1, 'word': 'While', 'originalText': 'While', 'lemma': 'while', 'characterOffsetBegin': 48, 'characterOffsetEnd': 53, 'pos': 'IN', 'ner': 'O', 'before': ' ', 'after': ' '}, {'index': 2, 'word': 'messi', 'originalText': 'messi', 'lemma': 'messus', 'characterOffsetBegin': 54, 'characterOffsetEnd': 59, 'pos': 'NNS', 'ner': 'O', 'before': ' ', 'after': ' '}, {'index': 3, 'word': 'still', 'originalText': 'still', 'lemma': 'still', 'characterOffsetBegin': 60, 'characterOffsetEnd': 65, 'pos': 'RB', 'ner': 'O', 'before': ' ', 'after': ' '}, {'index': 4, 'word': 'plays', 'originalText': 'plays', 'lemma': 'play', 'characterOffsetBegin': 66, 'characterOffsetEnd': 71, 'pos': 'VBZ', 'ner': 'O', 'before': ' ', 'after': ' '}, {'index': 5, 'word': 'for', 'originalText': 'for', 'lemma': 'for', 'characterOffsetBegin': 72, 'characterOffsetEnd': 75, 'pos': 'IN', 'ner': 'O', 'before': ' ', 'after': ' '}, {'index': 6, 'word': 'Barcelona', 'originalText': 'Barcelona', 'lemma': 'Barcelona', 'characterOffsetBegin': 76, 'characterOffsetEnd': 85, 'pos': 'NNP', 'ner': 'ORGANIZATION', 'before': ' ', 'after': ''}]}]}

If you look at the above output carefully, you can find the POS tags, named entities and lemmatized version of each word.

Lemmatization

Let’s now explore the annotated results. We’ll first print the lemmatizations for the words in the two sentences in our dataset:

for sentence in annot_doc["sentences"]:
    for word in sentence["tokens"]:
        print(word["word"] + "=>" + word["lemma"])

In the script above, the outer loop iterates through each sentence in the document and the inner loop iterates through each word in the sentence. Inside the inner loop, the word and its corresponding lemmatized form are printed on the console. The output looks like this:

Ronaldo=>Ronaldo
has=>have
moved=>move
from=>from
Real=>real
Madrid=>Madrid
to=>to
Juventus=>Juventus
.=>.
While=>while
messi=>messus
still=>still
plays=>play
for=>for
Barcelona=>Barcelona

For example, you can see that the word moved has been lemmatized to move. Similarly, the word plays has been lemmatized to play.

POS Tagging

In the same way, we can find the POS tags for each word. Look at the following script:

for sentence in annot_doc["sentences"]:
    for word in sentence["tokens"]:
        print(word["word"] + "=>" + word["pos"])

In the output, you should see the following results:

Ronaldo=>NNP
has=>VBZ
moved=>VBN
from=>IN
Real=>JJ
Madrid=>NNP
to=>TO
Juventus=>NNP
.=>.
While=>IN
messi=>NNS
still=>RB
plays=>VBZ
for=>IN
Barcelona=>NNP

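The tag set used for POS tags is the Penn Treebank tagset. For example, NNP marks a singular proper noun, VBZ a third-person singular present-tense verb, and IN a preposition or subordinating conjunction.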
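Printing one tag per word is fine for a quick look, but it is often handier to group the words by tag. Here is a small sketch that does this for the annotated structure shown earlier. The function words_by_tag is made up for this example, and the sample dictionary is a hand-trimmed copy of the first sentence's tokens so the snippet runs without a server.

```python
from collections import defaultdict


def words_by_tag(annotated):
    """Group the words of an annotated document by their POS tag."""
    groups = defaultdict(list)
    for sentence in annotated["sentences"]:
        for token in sentence["tokens"]:
            groups[token["pos"]].append(token["word"])
    return dict(groups)


# Hand-trimmed sample with the same shape as the server's JSON output.
sample = {"sentences": [{"tokens": [
    {"word": "Ronaldo", "pos": "NNP"},
    {"word": "has", "pos": "VBZ"},
    {"word": "moved", "pos": "VBN"},
    {"word": "from", "pos": "IN"},
    {"word": "Real", "pos": "JJ"},
    {"word": "Madrid", "pos": "NNP"},
]}]}

print(words_by_tag(sample))
```

With a live server you would pass annot_doc instead of sample.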

Named Entity Recognition

To find named entities in our document, we can use the following script:

for sentence in annot_doc["sentences"]:
    for word in sentence["tokens"]:
        print(word["word"] + "=>" + word["ner"])

The output looks like this:

Ronaldo=>PERSON
has=>O
moved=>O
from=>O
Real=>ORGANIZATION
Madrid=>ORGANIZATION
to=>O
Juventus=>ORGANIZATION
.=>O
While=>O
messi=>O
still=>O
plays=>O
for=>O
Barcelona=>ORGANIZATION

We can see that Ronaldo has been identified as a PERSON while Barcelona has been identified as an ORGANIZATION, which in this case is correct. Note that the lowercase messi was not recognized as an entity (it is tagged O).
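The token-level loop reports Real and Madrid separately, but the JSON output shown earlier also contains an entitymentions list per sentence in which multi-word entities are already merged. A short sketch that reads it is given below. The helper entity_mentions is hypothetical, and sample is a hand-trimmed copy of the output above so the snippet runs offline.

```python
def entity_mentions(annotated):
    """Return (text, label) pairs for every entity mention in the document."""
    return [
        (mention["text"], mention["ner"])
        for sentence in annotated["sentences"]
        for mention in sentence.get("entitymentions", [])
    ]


# Hand-trimmed sample with the same shape as the server's JSON output.
sample = {"sentences": [
    {"entitymentions": [
        {"text": "Ronaldo", "ner": "PERSON"},
        {"text": "Real Madrid", "ner": "ORGANIZATION"},
        {"text": "Juventus", "ner": "ORGANIZATION"},
    ]},
    {"entitymentions": [
        {"text": "Barcelona", "ner": "ORGANIZATION"},
    ]},
]}

for text, label in entity_mentions(sample):
    print(text + "=>" + label)
```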

Sentiment Analysis

To find the sentiment of a sentence, all you have to do is pass sentiment as the value for the annotators property. Look at the following script:

doc = "I like this chocolate. This chocolate is not good. The chocolate is delicious. Its a very tasty chocolate. This is so bad"

annot_doc = nlp_wrapper.annotate(doc,
    properties={
        'annotators': 'sentiment',
        'outputFormat': 'json',
        'timeout': 1000,
    })

To find the sentiment, we can iterate over each sentence and then use the sentimentValue property to find the sentiment. The sentimentValue property holds a value between 0 and 4, where 0 corresponds to highly negative sentiment and 4 corresponds to highly positive sentiment; 2 is neutral, as the output below shows. The sentiment property can be used to get the sentiment in verbal form, e.g. Positive, Negative, or Neutral.

The following script finds the sentiment for each sentence in the document we defined above.

for sentence in annot_doc["sentences"]:
    print(" ".join([word["word"] for word in sentence["tokens"]]) + " => "
          + str(sentence["sentimentValue"]) + " = " + sentence["sentiment"])

Output:

I like this chocolate . => 2 = Neutral
This chocolate is not good . => 1 = Negative
The chocolate is delicious . => 3 = Positive
Its a very tasty chocolate . => 3 = Positive
This is so bad => 1 = Negative
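Since the sentiment is reported per sentence, a rough document-level score can be had by averaging the numeric values. This is just one simple way to aggregate, not something the library does for you. The helper names are made up, and sample copies the values from the output above so the sketch runs without a server. The int() call is there because the value may arrive as a string in the JSON.

```python
def sentiment_scores(annotated):
    """Return the numeric sentiment of every sentence as a list of ints."""
    return [int(sentence["sentimentValue"]) for sentence in annotated["sentences"]]


def average_sentiment(annotated):
    """Average the per-sentence scores (0 = very negative, 4 = very positive)."""
    scores = sentiment_scores(annotated)
    return sum(scores) / len(scores)


# Values copied from the output above, in sentence order.
sample = {"sentences": [
    {"sentimentValue": "2", "sentiment": "Neutral"},
    {"sentimentValue": "1", "sentiment": "Negative"},
    {"sentimentValue": "3", "sentiment": "Positive"},
    {"sentimentValue": "3", "sentiment": "Positive"},
    {"sentimentValue": "1", "sentiment": "Negative"},
]}

print(sentiment_scores(sample))
print(average_sentiment(sample))
```

For the five chocolate sentences the average lands exactly on 2, i.e. neutral overall, which matches the mix of positive and negative statements.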

Conclusion

StanfordCoreNLP is another extremely handy library for natural language processing. In this article, we studied how to set up the environment to run StanfordCoreNLP. We then explored the use of the StanfordCoreNLP library for common NLP tasks such as lemmatization, POS tagging and named entity recognition. Finally, we rounded off the article with sentiment analysis using StanfordCoreNLP.

Real Python: Get Started With Django Part 1: Build a Portfolio App

Django is a fully featured Python web framework that can be used to build complex web applications. In this tutorial, you’ll jump in and learn Django by example. You’ll follow the steps to create a fully functioning web application and, along the way, learn some of the most important features of the framework and how they work together.

In later posts in this series, you’ll see how to build more complex websites using even more of Django’s features than you’ll cover in this tutorial.

By the end of this tutorial, you will be able to:

  • Understand what Django is and why it’s a great web framework
  • Understand the architecture of a Django site and how it compares with other frameworks
  • Set up a new Django project and app
  • Build a Personal Portfolio Website with Django

Why You Should Learn Django

There are endless web development frameworks out there, so why should you learn Django over any of the others? First of all, it’s written in Python, one of the most readable and beginner-friendly programming languages out there.

Note: This tutorial assumes an intermediate knowledge of the Python language. If you’re new to programming with Python, check out some of our beginner tutorials or the introductory course.

The second reason you should learn Django is the scope of its features. If you need to build a website, you don’t need to rely on any external libraries or packages if you choose Django. This means that you don’t need to learn how to use anything else, and the syntax is seamless as you’re using only one framework.

There’s also the added benefit that you don’t need to worry that updating one library or framework will render others that you’ve installed useless.

If you do find yourself needing to add extra features, there are a range of external libraries that you can use to enhance your site.

One of the great things about the Django framework is its in-depth documentation. It has detailed documentation on every aspect of Django and also has great examples and even a tutorial to get you started.

There’s also a fantastic community of Django developers, so if you get stuck there’s almost always a way forward by either checking the docs or asking the community.

Django is a high-level web application framework with loads of features. It’s great for anyone new to web development due to its fantastic documentation, and particularly if you’re also familiar with Python.

The Structure of a Django Website

A Django website consists of a single project that is split into separate apps. The idea is that each app handles a self-contained function that the site needs to perform. As an example, imagine an application like Instagram. There are several different functions that need to be performed:

  • User management: Login, logout, register, and so on
  • The image feed: Uploading, editing, and displaying images
  • Private messaging: Private messages between users and notifications

These are each separate pieces of functionality, so if this were a Django site, then each piece of functionality should be a different Django app inside a single Django project.

The Django project holds some configurations that apply to the project as a whole, such as project settings, URLs, shared templates and static files. Each application can have its own database and has its own functions to control how the data is displayed to the user in HTML templates.

Each application also has its own URLs as well as its own HTML templates and static files, such as JavaScript and CSS.

Django apps are structured so that there is a separation of logic. It supports the Model-View-Controller Pattern, which is the architecture on which most web frameworks are built. The basic principle is that in each application there are three separate files that handle the three main pieces of logic separately:

  • Model defines the data structure. This is usually a database and is the base layer to an application.
  • View displays some or all of the data to the user with HTML and CSS.
  • Controller handles how the database and the view interact.

If you want to learn more about the MVC pattern, then check out Model-View-Controller (MVC) Explained – With Legos.

In Django, the architecture is slightly different. Although based upon the MVC pattern, Django handles the controller part itself. There’s no need to define how the database and views interact. It’s all done for you!

The pattern Django utilizes is called the Model-View-Template (MVT) pattern. The view and template in the MVT pattern make up the view in the MVC pattern. All you need to do is add some URL configurations to map the views to, and Django handles the rest!

A Django site starts off as a project and is built up with a number of applications that each handle separate functionality. Each app follows the Model-View-Template pattern. Now that you’re familiar with the structure of a Django site, let’s have a look at what you’re going to build!
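To make the division of labor concrete, here is a toy sketch in plain Python. It is not Django code and uses none of Django's APIs; every name in it (Project, project_list, URLPATTERNS, handle) is invented for illustration. It only shows the roles: a model that describes the data, a template that describes the markup, a view that combines the two, and a URL configuration that maps paths to views.

```python
from dataclasses import dataclass
from string import Template


@dataclass
class Project:
    """Model: the shape of the data."""
    title: str
    description: str


PROJECTS = [
    Project("Blog", "A simple blog"),
    Project("Portfolio", "A showcase of past work"),
]

# Template: the markup, with a placeholder for the dynamic part.
PROJECT_LIST_TEMPLATE = Template("<h1>Projects</h1><ul>$items</ul>")


def project_list(path):
    """View: fetch the data and render it with the template."""
    items = "".join(f"<li>{p.title}: {p.description}</li>" for p in PROJECTS)
    return PROJECT_LIST_TEMPLATE.substitute(items=items)


# URL configuration: which view answers which path.
URLPATTERNS = {"/projects/": project_list}


def handle(path):
    """The part a framework takes care of: route the request to a view."""
    view = URLPATTERNS.get(path)
    return view(path) if view else "404 Not Found"


print(handle("/projects/"))
```

In a real Django project the routing function is the piece you never write yourself, which is what "Django handles the controller part" means in practice.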

What You’re Going to Build

Before you get started with any web development project, it’s a good idea to come up with a plan of what you’re going to build. In this tutorial, we are going to build an application with the following features:

  • A fully functioning blog: If you’re looking to demonstrate your coding ability, a blog is a great way to do that. In this application, you will be able to create, update, and delete blog posts. Posts will have categories that can be used to sort them. Finally, users will be able to leave comments on posts.

  • A portfolio of your work: You can showcase previous web development projects here. You’ll build a gallery style page with clickable links to projects that you’ve completed.

Note: Before you get started, you can pull down the source code and follow along with the tutorial.

If you prefer to follow along by writing the code yourself, don’t worry. I’ve referenced the relevant parts of the source code throughout so you can refer back to it.

We won’t be using any external Python libraries in this tutorial. One of the great things about Django is that it has so many features that you don’t need to rely on external libraries. However, we will add Bootstrap 4 styling in the templates.

By building these two apps, you’ll learn the basics of Django models, view functions, forms, templates, and the Django admin page. With knowledge of these features, you’ll be able to go away and build loads more applications. You’ll also have the tools to learn even more and build sophisticated Django sites.

Hello, World!

Now that you know the structure of a Django application, and what you are about to build, we’re going to go through the process of creating an application in Django. You’ll extend this later into your personal portfolio application.

Set Up Your Development Environment

Whenever you are starting a new web development project, it’s a good idea to first set up your development environment. Create a new directory for your project to live in, and cd into it:

$ mkdir rp-portfolio
$ cd rp-portfolio

Once you’re inside the main directory, it’s a good idea to create a virtual environment to manage dependencies. There are many different ways to set up virtual environments, but here you’re going to use venv:

$   python3 -m venv venv 

This command will create a folder venv in your working directory. Inside this directory, you’ll find several files including a copy of the Python standard library. Later, when you install new dependencies, they will also be stored in this directory. Next, you need to activate the virtual environment by running the following command:

$   source venv/bin/activate 

Note: If you’re not using the bash shell, you might need to use a different command to activate your virtual environment. For example, on Windows you need this command:

C:\> venv\Scripts\activate.bat 

You’ll know that your virtual environment has been activated, because your console prompt in the terminal will change. It should look something like this:

(venv) $   

Note: Your virtual environment directory doesn’t have to be called venv. If you want to create one under a different name, for example my_venv, just replace the second venv in the command with my_venv.

Then, when activating your virtual environment, replace venv with my_venv again. The prompt will also now be prefixed with (my_venv).
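If you’re ever unsure whether the environment is active, you can also ask Python itself. The sketch below relies on the standard library only; in_virtualenv() is a hypothetical helper name, and the check assumes an environment created with venv, as in this tutorial.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
```

Run this with the python command from your activated shell. If it reports False, the activation step most likely didn’t take effect in that terminal session.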

Now that you’ve created a virtual environment, it’s time to install Django. You can do this using pip:

(venv) $   pip install Django 

Once you’ve set up the virtual environment and installed Django, you can now dive in to creating the application.

Create a Django Project

As you saw in the previous section, a Django web application is made up of a project and its constituent apps. Making sure you’re in the rp-portfolio directory, and you’ve activated your virtual environment, run the following command to create the project:

$   django-admin startproject personal_portfolio 

This will create a new directory personal_portfolio. If you cd into this new directory you’ll see another directory called personal_portfolio and a file called manage.py. Your directory structure should look something like this:

rp-portfolio/
│
├── personal_portfolio/
│   ├── personal_portfolio/
│   │   ├── __init__.py
│   │   ├── settings.py
│   │   ├── urls.py
│   │   └── wsgi.py
│   │
│   └── manage.py
│
└── venv/

Most of the work you do will be in that first personal_portfolio directory. To save having to cd through several directories each time you come to work on your project, it can be helpful to reorder this slightly by moving all the files up a directory. While you’re in the rp-portfolio directory, run the following commands:

$ mv personal_portfolio/manage.py ./
$ mv personal_portfolio/personal_portfolio/* personal_portfolio
$ rm -r personal_portfolio/personal_portfolio/

You should end up with something like this:

rp-portfolio/
│
├── personal_portfolio/
│   ├── __init__.py
│   ├── settings.py
│   ├── urls.py
│   └── wsgi.py
│
├── venv/
│
└── manage.py

Once your file structure is set up, you can start the server and check that your setup was successful. In the console, run the following command:

$   python manage.py runserver 

Then, in your browser go to localhost:8000, and you should see the following:

Initial view of Django site

Congratulations, you’ve created a Django site! The source code for this part of the tutorial can be found on GitHub. The next step is to create apps so that you can add views and functionality to your site.

Create a Django Application

For this part of the tutorial, we’ll create an app called hello_world, which you’ll subsequently delete as it’s not necessary for our personal portfolio site.

To create the app, run the following command:

$   python manage.py startapp hello_world 

This will create another directory called hello_world with several files:

  • __init__.py tells Python to treat the directory as a Python package.
  • admin.py contains settings for the Django admin pages.
  • apps.py contains settings for the application configuration.
  • models.py contains a series of classes that Django’s ORM converts to database tables.
  • tests.py contains test classes.
  • views.py contains functions and classes that handle what data is displayed in the HTML templates.

Once you’ve created the app, you need to install it in your project. In personal_portfolio/settings.py, add the following line of code under INSTALLED_APPS:

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'hello_world',
]

That line of code means that your project now knows that the app you just created exists. The next step is to create a view so that you can display something to a user.

Create a View

Views in Django are a collection of functions or classes inside the views.py file in your app directory. Each function or class handles the logic that gets processed each time a different URL is visited.

Navigate to the views.py file in the hello_world directory. There’s already a line of code in there that imports render(). Add the following code:

from django.shortcuts import render

def hello_world(request):
    return render(request, 'hello_world.html', {})

In this piece of code, you’ve defined a view function called hello_world(). When this function is called, it will render an HTML file called hello_world.html. That file doesn’t exist yet, but we’ll create it soon.

The view function takes one argument, request. This is an HttpRequest object that is created whenever a page is loaded. It contains information about the request, such as the method, which can take several values including GET and POST.

Now that you’ve created the view function, you need to create the HTML template to display to the user. render() looks for HTML templates inside a directory called templates inside your app directory. Create that directory and subsequently a file named hello_world.html inside it:

$ mkdir hello_world/templates/
$ touch hello_world/templates/hello_world.html

Add the following lines of HTML to your file:

<h1>Hello, World!</h1> 

You’ve now created a function to handle your views and templates to display to the user. The final step is to hook up your URLs so that you can visit the page you’ve just created. Your project has a module called urls.py in which you need to include a URL configuration for the hello_world app. Inside personal_portfolio/urls.py, add the following:

from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', include('hello_world.urls')),
]

This looks for a module called urls.py inside the hello_world application and registers any URLs defined there. Whenever you visit the root path of your site (localhost:8000), the request is matched against the hello_world application’s URLs. The hello_world.urls module doesn’t exist yet, so you’ll need to create it:

$   touch hello_world/urls.py 

Inside this module, we need to import the path object as well as our app’s views module. Then we want to create a list of URL patterns that correspond to the various view functions. At the moment, we have only created one view function, so we need only create one URL:

from django.urls import path
from hello_world import views

urlpatterns = [
    path('', views.hello_world, name='hello_world'),
]

Now, when you restart the server and visit localhost:8000, you should be able to see the HTML template you created:

Hello, World! view of Django site

Congratulations, again! You’ve created your first Django app and hooked it up to your project. Don’t forget to check out the source code for this section and the previous one. The only problem now is that it doesn’t look very nice. In the next section, we’re going to add Bootstrap styles to your project to make it prettier!

Add Bootstrap to Your App

If you don’t add any styling, then the app you create isn’t going to look too nice. Instead of going into CSS styling with this tutorial, we’ll just cover how to add Bootstrap styles to your project. This will allow us to improve the look of the site without too much effort.

Before we get started with the Bootstrap styles, we’ll create a base template that we can import to each subsequent view. This template is where we’ll subsequently add the Bootstrap style imports.

Create another directory called templates, this time inside personal_portfolio, and a file called base.html, inside the new directory:

$   mkdir personal_portfolio/templates/ $   touch personal_portfolio/templates/base.html 

We create this additional templates directory to store HTML templates that will be used in every Django app in the project. As you saw previously, each Django project can consist of multiple apps that handle separated logic, and each app contains its own templates directory to store HTML templates related to the application.

This application structure works well for the back end logic, but we want our entire site to look consistent on the front end. Instead of having to import Bootstrap styles into every app, we can create a template or set of templates that are shared by all the apps. As long as Django knows to look for templates in this new, shared directory it can save a lot of repeated styles.

Inside this new file (personal_portfolio/templates/base.html), add the following lines of code:

{% block page_content %}{% endblock %} 

Now, in hello_world/templates/hello_world.html, we can extend this base template:

{% extends "base.html" %}

{% block page_content %}
<h1>Hello, World!</h1>
{% endblock %}

What happens here is that any HTML inside the page_content block gets added inside the same block in base.html.
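If the block mechanism feels abstract, the toy sketch below imitates it with a regular expression. This is not how Django’s template engine is implemented. BASE and CHILD are hypothetical in-memory strings standing in for base.html and hello_world.html, and render_child() is a made-up helper that only understands block tags.

```python
import re

# Toy stand-ins for base.html and hello_world.html (plain strings, not files).
BASE = "<main>{% block page_content %}{% endblock %}</main>"
CHILD = (
    '{% extends "base.html" %}'
    "{% block page_content %}<h1>Hello, World!</h1>{% endblock %}"
)

# Captures the block name and whatever sits between the block tags.
BLOCK_RE = re.compile(r"\{% block (\w+) %\}(.*?)\{% endblock %\}", re.DOTALL)


def render_child(base: str, child: str) -> str:
    """Fill each block in `base` with the same-named block from `child`."""
    child_blocks = dict(BLOCK_RE.findall(child))

    def fill(match):
        name, default = match.group(1), match.group(2)
        # Keep the base template's own content if the child
        # does not override this block.
        return child_blocks.get(name, default)

    return BLOCK_RE.sub(fill, base)


print(render_child(BASE, CHILD))
```

Anything outside the block in the base template, such as the main element in this sketch, is kept as is. That is why shared markup and style imports belong in base.html.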

To install Bootstrap in your app, you’ll use the Bootstrap CDN. This is a really simple way to install Bootstrap that just involves adding a few lines of code to base.html. Check out the source code to see how to add the CDN links to your project.

All future templates that we create will extend base.html so that we can include Bootstrap styling on every page without having to import the styles again.

Before we can see our new styled application, we need to tell our Django project that base.html exists. The default settings register template directories in each app, but not in the project directory itself. In personal_portfolio/settings.py, update TEMPLATES:

TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "DIRS": ["personal_portfolio/templates/"],
        "APP_DIRS": True,
        "OPTIONS": {
            "context_processors": [
                "django.template.context_processors.debug",
                "django.template.context_processors.request",
                "django.contrib.auth.context_processors.auth",
                "django.contrib.messages.context_processors.messages",
            ]
        },
    }
]

Now, when you visit localhost:8000, you should see that the page has been formatted with slightly different styling:

Hello, World! view of Django site with Bootstrap styles

Whenever you want to create templates or import scripts that you intend to use in all your Django apps inside a project, you can add them to this project-level directory and extend them inside your app templates.
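The lookup that makes this work can be pictured as a search through an ordered list of directories. The following is a simplified sketch using only the standard library, not Django’s loader code: find_template() is a hypothetical helper, the directories are created in a temporary folder, and the ordering (project-level directories first, then app directories) is the assumption being illustrated.

```python
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import Iterable, Optional


def find_template(name: str, search_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the first existing file called `name`, searching directories in order."""
    for directory in search_dirs:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


with TemporaryDirectory() as tmp:
    root = Path(tmp)
    project_templates = root / "personal_portfolio" / "templates"
    app_templates = root / "hello_world" / "templates"
    for directory in (project_templates, app_templates):
        directory.mkdir(parents=True)
    (project_templates / "base.html").write_text("{% block page_content %}{% endblock %}")
    (app_templates / "hello_world.html").write_text('{% extends "base.html" %}')

    # Project-level directories (the "DIRS" setting) come first,
    # followed by each app's own templates directory.
    search_dirs = [project_templates, app_templates]
    base_found = find_template("base.html", search_dirs)
    page_found = find_template("hello_world.html", search_dirs)
    missing = find_template("missing.html", search_dirs)

    # Record which top-level folder each template was found in.
    found_names = [p.parent.parent.name for p in (base_found, page_found)]

print(found_names)
```

If a template can’t be found in any directory, the sketch returns None. Django raises an error in that situation, which is usually a sign that a directory is missing from the settings.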

Adding templates is the last stage to building your Hello, World! Django site. You learned how the Django templating engine works and how to create project-level templates that can be shared by all the apps inside your Django project.

In this section, you learned how to create a simple Hello, World! Django site by creating a project with a single app. In the next section, you’ll create another application to showcase web development projects, and you’ll learn all about models in Django!

The source code for this section can be found on GitHub.

Showcase Your Projects

Any web developer looking to create a portfolio needs a way to show off projects they have worked on. That’s what you’ll be building now. You’ll create another Django app called projects that will hold a series of sample projects that will be displayed to the user. Users can click on projects and see more information about your work.

Before we build the projects app, let’s first delete the hello_world application. All you need to do is delete the hello_world directory and remove the line "hello_world", from INSTALLED_APPS in settings.py:

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'hello_world',  # Delete this line
]

Finally, you need to remove the URL path created in personal_portfolio/urls.py:

from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', include('hello_world.urls')),  # Delete this line
]

Now that you’ve removed the hello_world app, we can create the projects app. Making sure you’re in the rp-portfolio directory, run the following command in your console:

$   python manage.py startapp projects 

This will create a directory named projects. The files created are the same as those created when we set up the hello_world application. In order to hook up our app, we need to add it into INSTALLED_APPS in settings.py:

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'projects',
]

Check out the source code for this section on GitHub. We’re not going to worry about URLs for this application just yet. Instead, we’re going to focus on building a Project model.

Projects App: Models

If you want to store data to display on a website, then you’ll need a database. Typically, if you want to create a database with tables and columns within those tables, you’ll need to use SQL to manage the database. But when you use Django, you don’t need to learn a new language because it has a built-in Object Relational Mapper (ORM).

An ORM is a program that allows you to create classes that correspond to database tables. Class attributes correspond to columns, and instances of the classes correspond to rows in the database. So, instead of learning a whole new language to create our database and its tables, we can just write some Python classes.
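To see that mapping without any framework involved, here is a minimal sketch built on the standard library’s sqlite3 and dataclasses modules. It is not Django’s ORM: the Project dataclass and the create_table(), save(), and all_projects() helpers are hypothetical, and every column is stored as plain text to keep the example short.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields


@dataclass
class Project:
    # Each attribute corresponds to a column in the table.
    title: str
    description: str
    technology: str


def create_table(conn: sqlite3.Connection) -> None:
    """Create one TEXT column per dataclass field."""
    columns = ", ".join(f"{field.name} TEXT" for field in fields(Project))
    conn.execute(f"CREATE TABLE project ({columns})")


def save(conn: sqlite3.Connection, project: Project) -> None:
    """Insert one instance as one row."""
    placeholders = ", ".join("?" for _ in fields(Project))
    conn.execute(f"INSERT INTO project VALUES ({placeholders})", astuple(project))


def all_projects(conn: sqlite3.Connection) -> list:
    """Read every row back and rebuild the instances."""
    rows = conn.execute("SELECT title, description, technology FROM project")
    return [Project(*row) for row in rows]


conn = sqlite3.connect(":memory:")
create_table(conn)
save(conn, Project("My First Project", "A web development project.", "Django"))
print(all_projects(conn))
```

Django’s ORM does the same kind of translation for you, along with field types, validation, and migrations, so you never write the SQL strings yourself.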

When you’re using an ORM, the classes you build that represent database tables are referred to as models. In Django, they live in the models.py module of each Django app.

In your projects app, you’ll only need one table to store the different projects you’ll display to the user. That means you’ll only need to create one model in models.py.

The model you’ll create will be called Project and will have the following fields:

  • title will be a short string field to hold the name of your project.
  • description will be a larger string field to hold a longer piece of text.
  • technology will be a string field, but its contents will be limited to a select number of choices.
  • image will be an image field that holds the file path where the image is stored.

To create this model, we’ll create a new class in models.py and add the following in our fields:

from django.db import models

class Project(models.Model):
    title = models.CharField(max_length=100)
    description = models.TextField()
    technology = models.CharField(max_length=20)
    image = models.FilePathField(path="/img")

Django models come with many built-in model field types. We’ve only used three in this model. CharField is used for short strings and specifies a maximum length.

TextField is similar to CharField but can be used for longer form text as it doesn’t have a maximum length limit. Finally, FilePathField also holds a string but must point to a file path name.

Now that we’ve created our Project class, we need Django to create the database. By default, the Django ORM creates databases in SQLite, but you can use other databases that use the SQL language, such as PostgreSQL or MySQL, with the Django ORM.

To start the process of creating our database, we need to create a migration. A migration is a file containing a Migration class with rules that tell Django what changes need to be made to the database. To create the migration, type the following command in the console, making sure you’re in the rp-portfolio directory:

$ python manage.py makemigrations projects
Migrations for 'projects':
  projects/migrations/0001_initial.py
    - Create model Project

You should see that a file projects/migrations/0001_initial.py has been created in the projects app. Check out that file in the source code to make sure your migration is correct.

Now that you’ve created a migration file, you need to apply the migrations set out in the migrations file and create your database using the migrate command:

$ python manage.py migrate projects
Operations to perform:
  Apply all migrations: projects
Running migrations:
  Applying projects.0001_initial... OK

Note: When running both the makemigrations and migrate commands, we added projects to our command. This tells Django to only look at models and migrations in the projects app. Django comes with several models already created.

If you run makemigrations and migrate without the projects argument, then all migrations for all the default models in your Django project will be created and applied. This is not a problem, but for the purposes of this section, they are not needed.

You should also see that a file called db.sqlite3 has been created in the root of your project. Now your database is set up and ready to go. You can now create rows in your table that are the various projects you want to show on your portfolio site.

To create instances of our Project class, we’re going to have to use the Django shell. The Django shell is similar to the Python shell but allows you to access the database and create entries. To access the Django shell, we use another Django management command:

$   python manage.py shell 

Once you’ve accessed the shell, you’ll notice that the command prompt will change from $ to >>>. You can then import your models:

>>> from projects.models import Project 

We’re first going to create a new project with the following attributes:

  • title: My First Project
  • description: A web development project.
  • technology: Django
  • image: img/project1.png

To do this, we create an instance of the Project class in the Django shell:

>>> p1 = Project(
...     title='My First Project',
...     description='A web development project.',
...     technology='Django',
...     image='img/project1.png'
... )
>>> p1.save()

This creates a new entry in your projects table and saves it to the database. Now you have created a project that you can display on your portfolio site.

The final step in this section is to create two more sample projects:

>>> p2 = Project(
...     title='My Second Project',
...     description='Another web development project.',
...     technology='Flask',
...     image='img/project2.png'
... )
>>> p2.save()
>>> p3 = Project(
...     title='My Third Project',
...     description='A final development project.',
...     technology='Django',
...     image='img/project3.png'
... )
>>> p3.save()

Well done for reaching the end of this section! You now know how to create models in Django and build migration files so that you can translate these model classes into database tables. You’ve also used the Django shell to create three instances of your model class.

In the next section, we’ll take these three projects you created and create a view function to display them to users on a web page. You can find the source code for this section of the tutorial on GitHub.

Projects App: Views

Now that you’ve created the projects to display on your portfolio site, you’ll need to create view functions to send the data from the database to the HTML templates.

In the projects app, you’ll create two different views:

  1. An index view that shows a snippet of information about each project
  2. A detail view that shows more information on a particular project

Let’s start with the index view, as the logic is slightly simpler. Inside views.py, you’ll need to import the Project class from models.py and create a function project_index() that renders a template called project_index.html. In the body of this function, you’ll make a Django ORM query to select all objects in the Project table:

 1 from django.shortcuts import render
 2 from projects.models import Project
 3
 4 def project_index(request):
 5     projects = Project.objects.all()
 6     context = {
 7         'projects': projects
 8     }
 9     return render(request, 'project_index.html', context)

There’s quite a lot going on in this code block, so let’s break it down.

In line 5, you perform a query. A query is simply a command that allows you to create, retrieve, update, or delete objects (or rows) in your database. In this case, you’re retrieving all objects in the projects table.

A database query returns a collection of all objects that match the query, known as a Queryset. In this case, you want all objects in the table, so it will return a collection of all projects.

In line 6 of the code block above, we define a dictionary context. The dictionary has only one entry, projects, to which we assign our Queryset containing all projects. The context dictionary is used to send information to our template. Every view function in this tutorial that passes data to a template builds a context dictionary like this one.

In line 9, context is added as an argument to render(). Any entries in the context dictionary are available in the template, as long as the context argument is passed to render(). You’ll create a context dictionary and pass it to render() in each view function in this tutorial.

We also render a template named project_index.html, which doesn’t exist yet. Don’t worry about that for now. You’ll create the templates for these views in the next section.

Next, you’ll need to create the project_detail() view function. This function will need an additional argument: the id of the project that’s being viewed.

Otherwise, the logic is similar:

13 def project_detail(request, pk):
14     project = Project.objects.get(pk=pk)
15     context = {
16         'project': project
17     }
18     return render(request, 'project_detail.html', context)

In line 14, we perform another query. This query retrieves the project with primary key, pk, equal to that in the function argument. We then assign that project in our context dictionary, which we pass to render(). Again, there’s a template project_detail.html, which we have yet to create.

Once your view functions are created, we need to hook them up to URLs. We’ll start by creating a file projects/urls.py to hold the URL configuration for the app. This file should contain the following code:

 1 from django.urls import path
 2 from . import views
 3
 4 urlpatterns = [
 5     path("", views.project_index, name="project_index"),
 6     path("<int:pk>/", views.project_detail, name="project_detail"),
 7 ]

In line 5, we hook up the root URL of our app to the project_index view. It is slightly more complicated to hook up the project_detail view. To do this, we want the URL to be /1, or /2, and so on, depending on the pk of the project.

The pk value in the URL is the same pk passed to the view function, so you need to dynamically generate these URLs depending on which project you want to view. To do this, we used the <int:pk> notation. This just tells Django that the value passed in the URL is an integer, and its variable name is pk.
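One way to picture what the <int:pk> notation does is to translate it into a regular expression with a named group. The sketch below is a simplified illustration, not Django’s implementation: compile_route() and match_route() are hypothetical helpers, only the int converter is handled, and other special characters in the route are not escaped.

```python
import re


def compile_route(route: str) -> "re.Pattern":
    """Turn a route such as '<int:pk>/' into a compiled regular expression."""
    # Replace each <int:name> with a named group that matches digits only.
    pattern = re.sub(r"<int:(\w+)>", r"(?P<\1>[0-9]+)", route)
    return re.compile(f"^{pattern}$")


def match_route(route: str, path: str):
    """Return the captured keyword arguments, or None if the path does not match."""
    match = compile_route(route).match(path)
    if match is None:
        return None
    # Convert the captured strings to integers, as the int converter implies.
    return {name: int(value) for name, value in match.groupdict().items()}


print(match_route("<int:pk>/", "3/"))      # {'pk': 3}
print(match_route("<int:pk>/", "three/"))  # None
```

The dictionary returned for a matching path corresponds to the keyword arguments passed to the view, which is why project_detail() receives pk as an integer rather than a string.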

With those now set up, we need to hook these URLs up to the project-level URL configuration. In personal_portfolio/urls.py, add the line that includes projects.urls:

from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path("admin/", admin.site.urls),
    path("projects/", include("projects.urls")),
]

This line of code includes all the URLs in the projects app but means they are accessed when prefixed by projects/. There are now two full URLs that can be accessed with our project:

  • localhost:8000/projects: The project index page
  • localhost:8000/projects/3: The detail view for the project with pk=3

These URLs still won’t work properly because we don’t have any HTML templates. But our views and logic are up and running so all that’s left to do is create those templates. If you want to check your code, take a look at the source code for this section.

Projects App: Templates

Phew! You’re nearly there with this app. Our final step is to create two templates:

  1. The project_index template
  2. The project_detail template

As we’ve added Bootstrap styles to our application, we can use some pre-styled components to make the views look nice. Let’s start with the project_index template.

For the project_index template, you’ll create a grid of Bootstrap cards, with each card displaying details of the project. Of course, we don’t know how many projects there are going to be. In theory, there could be hundreds to display.

We don’t want to have to create 100 different Bootstrap cards and hard-code in all the information to each project. Instead, we’re going to use a feature of the Django template engine: for loops.

Using this feature, you’ll be able to loop through all the projects and create a card for each one. The for loop syntax in the Django template engine is as follows:

{% for project in projects %}
{# Do something... #}
{% endfor %}

Now that you know how for loops work, you can add the following code to a file named projects/templates/project_index.html:

 1 {% extends "base.html" %}
 2 {% load static %}
 3 {% block page_content %}
 4 <h1>Projects</h1>
 5 <div class="row">
 6 {% for project in projects %}
 7     <div class="col-md-4">
 8         <div class="card mb-2">
 9             <img class="card-img-top" src="{% static project.image %}">
10             <div class="card-body">
11                 <h5 class="card-title">{{ project.title }}</h5>
12                 <p class="card-text">{{ project.description }}</p>
13                 <a href="{% url 'project_detail' project.pk %}"
14                    class="btn btn-primary">
15                     Read More
16                 </a>
17             </div>
18         </div>
19     </div>
20     {% endfor %}
21 </div>
22 {% endblock %}

There’s a lot of Bootstrap HTML here, which is not the focus of this tutorial. Feel free to copy and paste and take a look at the Bootstrap docs if you’re interested in learning more. Instead of focusing on the Bootstrap, there are a few things to highlight in this code block.

In line 1, we extend base.html as we did in the Hello, World! app tutorial. I’ve added some more styling to this file to include a navigation bar and so that all the content is contained in a Bootstrap container. The changes to base.html can be seen in the source code on GitHub.

On line 2, we include a {% load static %} tag to include static files such as images. Remember back in the section on Django models, when you created the Project model? One of its attributes was a file path. That file path is where we’re going to store the actual images for each project.

Django automatically registers static files stored in a directory named static/ in each application. Our image file path names were of the structure: img/<photo_name>.png.

When loading static files, Django looks in the static/ directory for files matching a given filepath within static/. So, we need to create a directory named static/ with another directory named img/ inside. Inside img/, you can copy over the images from the source code on GitHub.

On line 6, we begin the for loop, looping over all projects passed in by the context dictionary.

Inside this for loop, we can access each individual project. To access the project’s attributes, you can use dot notation inside double curly brackets. For example, to access the project’s title, you use {{ project.title }}. The same notation can be used to access any of the project’s attributes.
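As a rough mental model of dot notation, you can think of the template engine walking the dotted name one part at a time. The sketch below is loosely modelled on that idea and is not Django’s code: resolve() is a hypothetical helper that tries a dictionary key first and then an attribute, and SimpleNamespace stands in for a Project instance.

```python
from types import SimpleNamespace


def resolve(context: dict, dotted: str):
    """Resolve a name such as 'project.title' against a context dictionary."""
    value = context
    for part in dotted.split("."):
        if isinstance(value, dict) and part in value:
            # Dictionary lookup, as used for the top-level context entries.
            value = value[part]
        else:
            # Attribute lookup, as used for fields on a model instance.
            value = getattr(value, part)
    return value


context = {
    "project": SimpleNamespace(title="My First Project", technology="Django")
}
print(resolve(context, "project.title"))
```

The real engine handles more cases than this sketch, but the idea is the same: the first part of the name comes from the context dictionary, and the remaining parts are looked up on the object it refers to.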

On line 9, we include our project image. Inside the src attribute, we add the code {% static project.image %}. This tells Django to look inside the static files to find a file matching project.image.

The final point that we need to highlight is the link on line 13. This is the link to our project_detail page. Accessing URLs in Django is similar to accessing static files. The code for the URL has the following form:

{% url '<url path name>' <view_function_arguments> %} 

In this case, we are accessing a URL path named project_detail, which takes integer arguments corresponding to the pk number of the project.
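Conceptually, the url tag turns a path name plus its arguments back into a URL string. Here is a minimal sketch of that reverse lookup in plain Python. NAMED_ROUTES and reverse_url() are hypothetical, and the paths assume the app’s URLs are mounted under projects/ as configured earlier.

```python
# Hypothetical registry of named routes, mirroring projects/urls.py
# once it is included under the "projects/" prefix.
NAMED_ROUTES = {
    "project_index": "/projects/",
    "project_detail": "/projects/{pk}/",
}


def reverse_url(name: str, **kwargs) -> str:
    """Build the URL for a named route by filling in its arguments."""
    return NAMED_ROUTES[name].format(**kwargs)


for pk in (1, 2, 3):
    print(reverse_url("project_detail", pk=pk))
```

Referring to URLs by name means that if the prefix ever changes, the links in your templates don’t need to be edited.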

With all that in place, if you start the Django server and visit localhost:8000/projects, then you should see something like this:

project index view

With the project_index.html template in place, it’s time to create the project_detail.html template. The code for this template is below:

{% extends "base.html" %}
{% load static %}

{% block page_content %}
<h1>{{ project.title }}</h1>
<div class="row">
    <div class="col-md-8">
        <img src="{% static project.image %}" alt="" width="100%">
    </div>
    <div class="col-md-4">
        <h5>About the project:</h5>
        <p>{{ project.description }}</p>
        <br>
        <h5>Technology used:</h5>
        <p>{{ project.technology }}</p>
    </div>
</div>
{% endblock %}

The code in this template has the same functionality as each project card in the project_index.html template. The only difference is the introduction of some Bootstrap columns.

If you visit localhost:8000/projects/1, you should see the detail page for that first project you created:

project detail view

In this section, you learned how to use models, views, and templates to create a fully functioning app for your personal portfolio project. Check out the source code for this section on GitHub.

In the next section, you’ll build a fully functioning blog for your site, and you’ll also learn about the Django admin page and forms.

Share Your Knowledge With a Blog

A blog is a great addition to any personal portfolio site. Whether you update it monthly or weekly, it’s a great place to share your knowledge as you learn. In this section, you’re going to build a fully functioning blog that will allow you to perform the following tasks:

  • Create, update, and delete blog posts
  • Display posts to the user as either an index view or a detail view
  • Assign categories to posts
  • Allow users to comment on posts

You’ll also learn how to use the Django Admin interface, which is where you’ll create, update, and delete posts and categories as necessary.

Before you get into building out the functionality of this part of your site, create a new Django app named blog. Don’t delete projects. You’ll want both apps in your Django project:

$   python manage.py startapp blog 

This may start to feel familiar to you, as it’s your third time doing this. Don’t forget to add blog to your INSTALLED_APPS in personal_portfolio/settings.py:

INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    "projects",
    "blog",
]

Hold off on hooking up the URLs for now. As with the projects app, you’ll start by adding your models.

Blog App: Models

The models.py file in this app is much more complicated than in the projects app.

You’re going to need three separate database tables for the blog:

  1. Post
  2. Category
  3. Comment

These tables need to be related to one another. This is made easier because Django models come with fields specifically for this purpose.

Below is the code for the Category and Post models:

 1 from django.db import models
 2
 3 class Category(models.Model):
 4     name = models.CharField(max_length=20)
 5
 6 class Post(models.Model):
 7     title = models.CharField(max_length=255)
 8     body = models.TextField()
 9     created_on = models.DateTimeField(auto_now_add=True)
10     last_modified = models.DateTimeField(auto_now=True)
11     categories = models.ManyToManyField('Category', related_name='posts')

The Category model is very simple. All that’s needed is a single CharField in which we store the name of the category.

The title and body fields on the Post model are the same field types as you used in the Project model. We only need a CharField for the title as we only want a short string for the post title. The body needs to be a long-form piece of text, so we use a TextField.

The next two fields, created_on and last_modified, are Django DateTimeFields. These store a datetime object containing the date and time when the post was created and modified respectively.

On line 9, the DateTimeField takes an argument auto_now_add=True. This assigns the current date and time to this field whenever an instance of this class is created.

On line 10, the DateTimeField takes an argument auto_now=True. This assigns the current date and time to this field whenever an instance of this class is saved. That means whenever you edit an instance of this class, last_modified is updated.
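The difference between auto_now_add and auto_now is easy to see without Django at all. Here is a minimal plain-Python analogue, not Django's implementation: the TimestampedRecord class and its save() method are hypothetical names used only for illustration.

```python
from datetime import datetime, timezone


class TimestampedRecord:
    """Hypothetical stand-in for a model with both timestamp fields."""

    def __init__(self):
        self.created_on = None      # behaves like auto_now_add=True
        self.last_modified = None   # behaves like auto_now=True

    def save(self):
        now = datetime.now(timezone.utc)
        if self.created_on is None:
            # Set only once, on the first save.
            self.created_on = now
        # Refreshed on every save, including the first.
        self.last_modified = now


record = TimestampedRecord()
record.save()
first_created = record.created_on

record.save()  # simulate editing the record and saving it again
print(record.created_on == first_created)         # created_on did not change
print(record.last_modified >= record.created_on)  # last_modified was refreshed
```

The first timestamp is written once and then left alone, while the second is overwritten on every save.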

The final field on the Post model is the most interesting. We want to link our models for categories and posts in such a way that many categories can be assigned to many posts. Luckily, Django makes this easier for us by providing a ManyToManyField field type. This field links the Post and Category models and allows us to create a relationship between the two tables.

The ManyToManyField takes two arguments. The first is the model on the other side of the relationship, which in this case is Category. The second allows us to access the relationship from a Category object, even though we haven’t added a field there. By adding a related_name of posts, we can access category.posts to get the posts with that category.
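Under the hood, a many-to-many relationship is stored in an extra join table that pairs rows from the two related tables. The sketch below shows that idea with Python's built-in sqlite3 module. The table names, column names, and the posts_in() helper are assumptions made for illustration and are not the names Django generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    -- The join table: one row per (post, category) pairing.
    CREATE TABLE post_categories (
        post_id INTEGER REFERENCES post(id),
        category_id INTEGER REFERENCES category(id)
    );
    INSERT INTO post VALUES (1, 'Intro to Django'), (2, 'Flask vs Django');
    INSERT INTO category VALUES (1, 'python'), (2, 'django');
    INSERT INTO post_categories VALUES (1, 1), (1, 2), (2, 1);
    """
)


def posts_in(category_name):
    """Rough equivalent of looking up category.posts for the named category."""
    rows = conn.execute(
        """
        SELECT post.title FROM post
        JOIN post_categories ON post_categories.post_id = post.id
        JOIN category ON category.id = post_categories.category_id
        WHERE category.name = ?
        ORDER BY post.id
        """,
        (category_name,),
    )
    return [title for (title,) in rows]


print(posts_in("python"))
print(posts_in("django"))
```

Neither the post table nor the category table holds a column pointing at the other. The relationship lives entirely in the join table, which is why it can be followed from either side.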

The third and final model we need to add is Comment. We’ll use another relationship field similar to the ManyToManyField that relates Post and Category. However, we only want the relationship to go one way: one post should have many comments.

You’ll see how this works after we define the Comment class:

16 class Comment(models.Model):
17     author = models.CharField(max_length=60)
18     body = models.TextField()
19     created_on = models.DateTimeField(auto_now_add=True)
20     post = models.ForeignKey('Post', on_delete=models.CASCADE)

The first three fields on this model should look familiar. There’s an author field for users to add a name or alias, a body field for the body of the comment, and a created_on field that is identical to the created_on field on the Post model.

On line 20, we use another relational field, the ForeignKey field. This is similar to the ManyToManyField but instead defines a many-to-one relationship. The reasoning behind this is that many comments can be assigned to one post, but a comment can’t correspond to many posts.

The ForeignKey field takes two arguments. The first is the other model in the relationship, in this case, Post. The second tells Django what to do when a post is deleted. If a post is deleted, then we don’t want the comments related to it hanging around. We, therefore, want to delete them as well, so we add the argument on_delete=models.CASCADE.
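To see what cascading deletion looks like in practice, here is a small sketch that uses Python's built-in sqlite3 module. It relies on SQLite's own ON DELETE CASCADE constraint purely to show the resulting behavior. Django emulates the cascade itself, and the table and column names here are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript(
    """
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comment (
        id INTEGER PRIMARY KEY,
        body TEXT,
        post_id INTEGER NOT NULL REFERENCES post(id) ON DELETE CASCADE
    );
    INSERT INTO post VALUES (1, 'First post'), (2, 'Second post');
    INSERT INTO comment VALUES (1, 'Nice!', 1), (2, 'Thanks', 1), (3, 'Hello', 2);
    """
)

# Deleting a post also removes every comment that points at it.
conn.execute("DELETE FROM post WHERE id = 1")
remaining = conn.execute("SELECT body FROM comment ORDER BY id").fetchall()
print(remaining)
```

Only the comment attached to the second post survives, which is the same outcome you get when you delete a Post whose comments use on_delete=models.CASCADE.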

Once you’ve created the models, you can create the migration files with makemigrations:

$   python manage.py makemigrations blog 

The final step is to migrate the tables. This time, don’t add the app-specific flag. Later on, you’ll need the User model that Django creates for you:

$   python manage.py migrate 

Now that you’ve created the models, we can start to add some posts and categories. You won’t be doing this from the command line as you did with the projects, as typing out a whole blog post into the command line would be unpleasant to say the least!

Instead, you’ll learn how to use the Django Admin, which will allow you to create instances of your model classes in a nice web interface.

Don’t forget that you can check out the source code for this section on GitHub before moving on to the next section.

Blog App: Django Admin

The Django Admin is a fantastic tool and one of the great benefits of using Django. As you’re the only person who’s going to be writing blog posts and creating categories, there’s no need to create a user interface to do so.

On the other hand, you don’t want to have to write blog posts in the command line. This is where the admin comes in. It allows you to create, update, and delete instances of your model classes and provides a nice interface for doing so.

Before you can access the admin, you need to add yourself as a superuser. This is why, in the previous section, you applied migrations project-wide as opposed to just for the app. Django comes with built-in user models and a user management system that will allow you to log in to the admin.

To start off, you can add yourself as superuser using the following command:

$   python manage.py createsuperuser 

You’ll then be prompted to enter a username followed by your email address and password. Once you’ve entered the required details, you’ll be notified that the superuser has been created. Don’t worry if you make a mistake since you can just start again:

Username (leave blank to use 'jasmine'): jfiner
Email address: jfiner@example.com
Password:
Password (again):
Superuser created successfully.

Navigate to localhost:8000/admin and log in with the credentials you just used to create a superuser. You’ll see a page similar to the one below:

The Default Django Admin

The User and Groups models should appear, but you’ll notice that there’s no reference to the models you’ve created yourself. That’s because you need to register them inside the admin.

In the blog directory, open the file admin.py and type the following lines of code:

 1 from django.contrib import admin
 2 from blog.models import Post, Category
 3
 4 class PostAdmin(admin.ModelAdmin):
 5     pass
 6
 7 class CategoryAdmin(admin.ModelAdmin):
 8     pass
 9
10 admin.site.register(Post, PostAdmin)
11 admin.site.register(Category, CategoryAdmin)

On line 2, you import the models you want to register on the admin page.

Note: We’re not adding the comments to the admin. That’s because it’s not usually necessary to edit or create comments yourself.

If you want to add a feature where comments are moderated, then go ahead and add the Comment model too. The steps to do so are exactly the same!

On line 4 and line 7, you define empty classes PostAdmin and CategoryAdmin. Classes like these are used to customize what is shown on the admin pages. For the purposes of this tutorial, the default configuration is enough, so you don’t need to add any attributes or methods to them.

The last two lines are the most important. These register the models with the admin classes. If you now visit localhost:8000/admin, then you should see that the Post and Category models are now visible:

Django Admin with Posts and Categories

If you click into Posts or Categories, you should be able to add new instances of both models. I like to add the text of fake blog posts by using lorem ipsum dummy text.

Create a couple of fake posts and assign them fake categories before moving on to the next section. That way, you’ll have posts you can view when we create our templates.

Don’t forget to check out the source code for this section before moving on to building out the views for our app.

Blog App: Views

You’ll need to create three view functions in the views.py file in the blog directory:

  • blog_index will display a list of all your posts.
  • blog_detail will display the full post as well as comments and a form to allow users to create new comments.
  • blog_category will be similar to blog_index, but the posts viewed will only be of a specific category chosen by the user.

The simplest view function to start with is blog_index(). This will be very similar to the project_index() view from your projects app. You’ll just query the Post model and retrieve all its objects:

 1 from django.shortcuts import render
 2 from blog.models import Post
 3
 4 def blog_index(request):
 5     posts = Post.objects.all().order_by('-created_on')
 6     context = {
 7         "posts": posts,
 8     }
 9     return render(request, "blog_index.html", context)

On line 2, you import the Post model, and on line 5 inside the view function, you obtain a QuerySet containing all the posts in the database. order_by() orders the QuerySet according to the argument given. The minus sign tells Django to start with the largest value rather than the smallest. We use this because we want the posts to be ordered with the most recent post first.

Finally, you define the context dictionary and render the template. Don’t worry about creating the template yet. You’ll get to that in the next section.
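If the leading minus sign in order_by() feels unfamiliar, it may help to compare it with sorting a plain Python list in reverse. The snippet below is only an analogy with made-up sample data. It doesn't touch the database or the Django ORM.

```python
from datetime import date

# Made-up sample data standing in for rows in the Post table.
posts = [
    {"title": "Oldest", "created_on": date(2019, 1, 5)},
    {"title": "Newest", "created_on": date(2019, 3, 20)},
    {"title": "Middle", "created_on": date(2019, 2, 11)},
]

# reverse=True plays the role of the minus sign: largest value first.
newest_first = sorted(posts, key=lambda post: post["created_on"], reverse=True)
titles = [post["title"] for post in newest_first]
print(titles)
```

One difference worth keeping in mind is that order_by() asks the database to do the sorting, so the posts arrive already in order.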

Next, you can start to create the blog_category() view. The view function will need to take a category name as an argument and query the Post database for all posts that have been assigned the given category:

13 def blog_category(request, category):
14     posts = Post.objects.filter(
15         categories__name__contains=category
16     ).order_by(
17         '-created_on'
18     )
19     context = {
20         "category": category,
21         "posts": posts
22     }
23     return render(request, "blog_category.html", context)

On line 14, you’ve used a Django Queryset filter. The argument of the filter tells Django what conditions need to be met for an object to be retrieved. In this case, we only want posts whose categories contain the category with the name corresponding to that given in the argument of the view function. Again, you’re using order_by() to order posts starting with the most recent.
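It’s worth noting that the contains lookup performs a substring match, not an exact one. The following plain-Python sketch mimics that behavior on made-up sample data. The posts list and the posts_for() helper are hypothetical, and the sketch ignores database details such as case sensitivity, which depends on the backend you use.

```python
# Made-up sample data: each post lists the names of its categories.
posts = [
    {"title": "Decorators", "categories": ["python"]},
    {"title": "Querysets", "categories": ["django", "python"]},
    {"title": "Flexbox", "categories": ["css"]},
]


def posts_for(category):
    """Keep posts where any category name contains the given text."""
    return [
        post["title"]
        for post in posts
        if any(category in name for name in post["categories"])
    ]


print(posts_for("python"))
print(posts_for("py"))
print(posts_for("html"))
```

Because "py" is contained in "python", the second call returns the same posts as the first. If you’d rather require the full category name, an exact lookup such as categories__name=category is the stricter alternative.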

We then add these posts and the category to the context dictionary, and render our template.

The last view function to add is blog_detail(). This is more complicated as we are going to include a form. Before you add the form, just set up the view function to show a specific post along with the comments associated with it. This function will be almost equivalent to the project_detail() view function in the projects app:

21 def blog_detail(request, pk):
22     post = Post.objects.get(pk=pk)
23     comments = Comment.objects.filter(post=post)
24     context = {
25         "post": post,
26         "comments": comments,
27     }
28
29     return render(request, "blog_detail.html", context)

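The view function takes a pk value as an argument and, on line 22, retrieves the object with the given pk. Because this view also uses the Comment model, remember to extend the import at the top of views.py so that it reads from blog.models import Post, Comment.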

On line 23, we retrieve all the comments assigned to the given post using Django filters again.

Lastly, add both post and comments to the context dictionary and render the template.

To add a form to the page, you’ll need to create another file in the blog directory named forms.py. Django forms are very similar to models. A form consists of a class where the class attributes are form fields. Django comes with some built-in form fields that you can use to quickly create the form you need.

For this form, the only fields you’ll need are author, which should be a CharField, and body, which can also be a CharField.

Note: If the CharField of your form corresponds to a model CharField, make sure both have the same max_length value.

blog/forms.py should contain the following code:

from django import forms

class CommentForm(forms.Form):
    author = forms.CharField(
        max_length=60,
        widget=forms.TextInput(attrs={
            "class": "form-control",
            "placeholder": "Your Name"
        })
    )
    body = forms.CharField(widget=forms.Textarea(
        attrs={
            "class": "form-control",
            "placeholder": "Leave a comment!"
        })
    )

You’ll notice that a widget argument has been passed to both fields. The author field has the forms.TextInput widget. This tells Django to load this field as an HTML text input element in the templates. The body field uses a forms.Textarea widget instead, so that the field is rendered as an HTML text area element.

These widgets also take an argument attrs, which is a dictionary and allows us to specify some CSS classes, which will help with formatting the template for this view later. It also allows us to add some placeholder text.

When a form is posted, a POST request is sent to the server. So, in the view function, we need to check if a POST request has been received. We can then create a comment from the form fields. Django forms come with a handy is_valid() method, so we can check that all the fields have been entered correctly.

Once you’ve created the comment from the form, you’ll need to save it using save() and then query the database for all the comments assigned to the given post. Your view function should contain the following code:

21 def blog_detail(request, pk):
22     post = Post.objects.get(pk=pk)
23
24     form = CommentForm()
25     if request.method == 'POST':
26         form = CommentForm(request.POST)
27         if form.is_valid():
28             comment = Comment(
29                 author=form.cleaned_data["author"],
30                 body=form.cleaned_data["body"],
31                 post=post
32             )
33             comment.save()
34
35     comments = Comment.objects.filter(post=post)
36     context = {
37         "post": post,
38         "comments": comments,
39         "form": CommentForm(),
40     }
41     return render(request, "blog_detail.html", context)

On line 24, we create an instance of our form class. Don’t forget to import your form at the beginning of the file:

from .forms import CommentForm

We then go on to check if a POST request has been received. If it has, then we create a new instance of our form, populated with the data entered into the form.

The form is then validated using is_valid(). If the form is valid, a new instance of Comment is created. You can access the data from the form using form.cleaned_data, which is a dictionary.

The keys of the dictionary correspond to the form fields, so you can access the author using form.cleaned_data['author']. Don’t forget to add the current post to the comment when you create it.

Note: The life cycle of submitting a form can be a little complicated, so here’s an outline of how it works:

  1. When a user visits a page containing a form, they send a GET request to the server. In this case, there’s no data entered in the form, so we just want to render the form and display it.
  2. When a user enters information and clicks the Submit button, a POST request, containing the data submitted with the form, is sent to the server. At this point, the data must be processed, and two things can happen:
    • The form is valid, and the user is redirected to the next page.
    • The form is invalid, and an empty form is once again displayed. The user is back at step 1, and the process repeats.

The Django forms module will output some errors, which you can display to the user. This is beyond the scope of this tutorial, but you can read more about rendering form error messages in the Django documentation.
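The life cycle outlined above can be sketched in plain Python, with no web framework involved. Everything here is a simplified assumption: validate() is a hypothetical stand-in for is_valid() and cleaned_data, handle_request() stands in for the view, and a list stands in for the Comment table.

```python
def validate(data):
    """Return cleaned data, or None if a field is missing or the name is too long."""
    author = data.get("author", "").strip()
    body = data.get("body", "").strip()
    if not author or not body or len(author) > 60:
        return None
    return {"author": author, "body": body}


def handle_request(method, data, comments):
    """Mimic the view: save a comment on a valid POST, then show an empty form."""
    if method == "POST":
        cleaned = validate(data)
        if cleaned is not None:
            comments.append(cleaned)
    # GET, valid POST, and invalid POST all end by rendering a fresh form.
    return {"form": {"author": "", "body": ""}, "comments": list(comments)}


stored = []
handle_request("GET", {}, stored)                                  # step 1: just show the form
handle_request("POST", {"author": "", "body": "No name"}, stored)  # invalid, nothing is saved
page = handle_request("POST", {"author": "Jasmine", "body": "Great post!"}, stored)
print(page["comments"])
```

Only the valid submission ends up in the stored comments, and every request finishes by returning an empty form, which mirrors how the view in this tutorial behaves.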

On line 33, you save the comment. You then go on to add the form to the context dictionary so you can access the form in the HTML template.

The final step before you get to create the templates and actually see this blog up and running is to hook up the URLs. You’ll need to create another urls.py file inside blog/ and add the URLs for the three views:

from django.urls import path
from . import views

urlpatterns = [
    path("", views.blog_index, name="blog_index"),
    path("<int:pk>/", views.blog_detail, name="blog_detail"),
    path("<category>/", views.blog_category, name="blog_category"),
]

Once the blog-specific URLs are in place, you need to add them to the project’s URL configuration using include():

from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path("admin/", admin.site.urls),
    path("projects/", include("projects.urls")),
    path("blog/", include("blog.urls")),
]

With this set up, all the blog URLs will be prefixed with blog/, and you’ll have the following URL paths:

  • localhost:8000/blog: Blog index
  • localhost:8000/blog/1: Blog detail view of blog with pk=1
  • localhost:8000/blog/python: Blog index view of all posts with category python

These URLs won’t work just yet as you still need to create the templates.
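The order of the patterns in blog/urls.py matters, because Django tries them from top to bottom and uses the first one that matches. Here is a rough sketch of that first-match-wins behavior using regular expressions. The ROUTES list and the resolve() function are illustrative assumptions and not Django’s actual resolver. Unlike Django’s int converter, this sketch also leaves the captured pk as a string.

```python
import re

# Patterns are tried in order; the first one that matches wins.
ROUTES = [
    (re.compile(r"^$"), "blog_index"),
    (re.compile(r"^(?P<pk>[0-9]+)/$"), "blog_detail"),
    (re.compile(r"^(?P<category>[^/]+)/$"), "blog_category"),
]


def resolve(path):
    """Return (view name, captured arguments) for the part of the URL after blog/."""
    for pattern, view_name in ROUTES:
        match = pattern.match(path)
        if match:
            return view_name, match.groupdict()
    return None


print(resolve(""))
print(resolve("1/"))
print(resolve("python/"))
```

Since "1/" would also satisfy the category pattern, the detail pattern has to come first. If the two were swapped, every numeric URL would be treated as a category name.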

In this section, you created all the views for your blog application. You learned how to use filters when making queries and how to create Django forms. It won’t be long now until you can see your blog app in action!

As always, don’t forget that you can check out the source code for this section on GitHub.

Blog App: Templates

The final piece of our blog app is the templates. By the end of this section, you’ll have created a fully functioning blog.

You’ll notice there are some bootstrap elements included in the templates to make the interface prettier. These aren’t the focus of the tutorial so I’ve glossed over what they do but do check out the Bootstrap docs to find out more.

The first template you’ll create is for the blog index in a new file blog/templates/blog_index.html. This will be very similar to the projects index view.

You’ll use a for loop to loop over all the posts. For each post, you’ll display the title and a snippet of the body. As always, you’ll extend the base template personal_portfolio/templates/base.html, which contains our navigation bar and some extra formatting:

 1 {% extends "base.html" %}
 2 {% block page_content %}
 3 <div class="col-md-8 offset-md-2">
 4     <h1>Blog Index</h1>
 5     <hr>
 6     {% for post in posts %}
 7     <h2><a href="{% url 'blog_detail' post.pk%}">{{ post.title }}</a></h2>
 8     <small>
 9         {{ post.created_on.date }} |&nbsp;
10         Categories:&nbsp;
11         {% for category in post.categories.all %}
12         <a href="{% url 'blog_category' category.name %}">
13             {{ category.name }}
14         </a>&nbsp;
15         {% endfor %}
16     </small>
17     <p>{{ post.body | slice:":400" }}...</p>
18     {% endfor %}
19 </div>
20 {% endblock %}

On line 7, we have the post title, which is a hyperlink. The link is a Django link where we are pointing to the URL named blog_detail, which takes an integer as its argument and should correspond to the pk value of the post.

Underneath the title, we’ll display the created_on attribute of the post as well as its categories. On line 11, we use another for loop to loop over all the categories assigned to the post.

On line 17, we use a template filter slice to cut off the post body at 400 characters so that the blog index is more readable.
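The slice filter uses the same syntax as Python’s own slicing, so the snippet on the index page is comparable to the following plain-Python sketch. The preview() function is a hypothetical helper written for illustration.

```python
def preview(body, limit=400):
    """Mirror the template: the first `limit` characters followed by a literal '...'."""
    return body[:limit] + "..."


long_body = "x" * 1000
short_body = "A short post."

print(len(preview(long_body)))  # 403: 400 characters plus the three dots
print(preview(short_body))      # A short post....
```

Notice that the three dots are added even when the post is shorter than 400 characters, because they are plain text in the template and not part of the filter. If that bothers you, Django’s built-in truncatechars filter is one alternative to look into.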

Once that’s in place, you should be able to access this page by visiting localhost:8000/blog:

Blog Index View

Next, create another HTML file blog/templates/blog_category.html where your blog_category template will live. This should be identical to blog_index.html, except with the category name inside the h1 tag instead of Blog Index:

{% extends "base.html" %}
{% block page_content %}
<div class="col-md-8 offset-md-2">
    <h1>{{ category | title }}</h1>
    <hr>
    {% for post in posts %}
        <h2><a href="{% url 'blog_detail' post.pk%}">{{ post.title }}</a></h2>
        <small>
            {{ post.created_on.date }} |&nbsp;
            Categories:&nbsp;
            {% for category in post.categories.all %}
            <a href="{% url 'blog_category' category.name %}">
                {{ category.name }}
            </a>&nbsp;
            {% endfor %}
        </small>
        <p>{{ post.body | slice:":400" }}...</p>
    {% endfor %}
</div>
{% endblock %}

Most of this template is identical to the previous template. The only difference is on line 4, where we use another Django template filter title. This applies titlecase to the string and makes words start with an uppercase character.

With that template finished, you’ll be able to access your category view. If you defined a category named python, you should be able to visit localhost:8000/blog/python and see all the posts with that category:

Blog Category View

The last template to create is the blog_detail template, which lives in blog/templates/blog_detail.html. In this template, you’ll display the title and full body of a post.

Between the title and the body of the post, you’ll display the date the post was created and any categories. Underneath that, you’ll include a comments form so users can add a new comment. Under this, there will be a list of comments that have already been left:

 1 {% extends "base.html" %}
 2 {% block page_content %}
 3 <div class="col-md-8 offset-md-2">
 4     <h1>{{ post.title }}</h1>
 5     <small>
 6         {{ post.created_on.date }} |&nbsp;
 7         Categories:&nbsp;
 8         {% for category in post.categories.all %}
 9         <a href="{% url 'blog_category' category.name %}">
10             {{ category.name }}
11         </a>&nbsp;
12         {% endfor %}
13     </small>
14     <p>{{ post.body | linebreaks }}</p>
15     <h3>Leave a comment:</h3>
16     <form action="/blog/{{ post.pk }}/" method="post">
17         {% csrf_token %}
18         <div class="form-group">
19             {{ form.author }}
20         </div>
21         <div class="form-group">
22             {{ form.body }}
23         </div>
24         <button type="submit" class="btn btn-primary">Submit</button>
25     </form>
26     <h3>Comments:</h3>
27     {% for comment in comments %}
28     <p>
29         On {{comment.created_on.date }}&nbsp;
30         <b>{{ comment.author }}</b> wrote:
31     </p>
32     <p>{{ comment.body }}</p>
33     <hr>
34     {% endfor %}
35 </div>
36 {% endblock %}

The first few lines of the template, in which we display the post title, date, and categories, use the same logic as the previous templates. This time, when rendering the post body, use a linebreaks template filter. This filter registers line breaks as new paragraphs, so the body doesn’t appear as one long block of text.

Underneath the post, on line 16, you’ll display your form. The form action points to the URL path of the page to which you’re sending the POST request. In this case, it’s the same as the page that is currently being visited. You then add a csrf_token, which provides security, and render the body and author fields of the form, followed by a submit button.

To get the bootstrap styling on the author and body fields, you need to add the form-control class to the text inputs.

Because Django renders the inputs for you when you include {{ form.body }} and {{ form.author }}, you can’t add these classes in the template. That’s why you added the attributes to the form widgets in the previous section.

Underneath the form, there’s another for loop that loops over all the comments on the given post. Each comment’s body, author, and created_on attributes are displayed.

Once that template is in place, you should be able to visit localhost:8000/blog/1 and view your first post:

Blog Detail View

You should also be able to access the post detail pages by clicking on their title in the blog_index view.

The final finishing touch is to add a link to the blog_index to the navigation bar in base.html. This way, when you click on Blog in the navigation bar, you’ll be able to visit the blog. Check out the updates to base.html in the source code to see how to add that link.

With that now in place, your personal portfolio site is complete, and you’ve created your first Django site. The final version of the source code containing all the features can be found on GitHub, so check it out! Click around the site a bit to see all the functionality and try leaving some comments on your posts!

You may find a few things here and there that you think need polishing. Go ahead and tidy them up. The best way to learn more about this web framework is through practice, so try to extend this project and make it even better! If you’re not sure where to start, I’ve left a few ideas for you in the conclusion below!

Conclusion

Congratulations, you’ve reached the end of the tutorial! We’ve covered a lot, so make sure to keep practicing and building. The more you build, the easier it will become and the less you’ll have to refer back to this article or the documentation. You’ll be building sophisticated web applications in no time.

In this tutorial you’ve seen:

  • How to create Django projects and apps
  • How to add web pages with views and templates
  • How to get user input with forms
  • How to hook your views and templates up with URL configurations
  • How to add data to your site using relational databases with Django’s Object Relational Mapper
  • How to use the Django Admin to manage your models

In addition, you’ve learned about the MVT structure of Django web applications and why Django is such a good choice for web development.

If you want to learn more about Django, do check out the documentation and make sure to stay tuned for Part 2 of this series!



Planet Python