The Code Bits: Introduction to Generators in Python

In this post, we will learn what generators are, how to create them, how they work and how to use them in Python.

Generator function

Generators are functions that allow us to create iterators in Python. They provide a convenient, simple and memory-efficient approach to creating iterators. These are useful when dealing with large amounts of data.

Before starting with generators, it would be good to understand how a for-loop works in Python. It will also be useful to know what an iterable, an iterator and the iterator protocol are.

An example: Generate even numbers using a generator function

Let us start with a simple example. We will be creating a generator function which generates a specific count of even numbers starting from a given value. We will be using this same example throughout this post.

def generate_even_numbers(start, count):
    # Make sure that the first number is even.
    start = start if start % 2 == 0 else start + 1

    while count > 0:
        yield start
        start += 2
        count -= 1

Note that we used a yield statement within the function body to return our data. If you don’t understand it right away, no need to worry, we will get to its roots soon enough!

Let us see how we would use this generator function in a for-loop.

>>> generator_iterator = generate_even_numbers(0, 3)
>>> for num in generator_iterator:
...     print(num)
...
0
2
4

As you can see, we were able to use the value returned by the generator function in a for-loop, so it must have been an iterable.

Generator function returns a generator iterator

Let us check the type of the value returned by the generator function.

>>> generator_iterator = generate_even_numbers(0, 3)
>>> type(generator_iterator)
<class 'generator'>

Okay, so the value returned is of type ‘generator’. This value is usually referred to as the generator iterator, even though the term generator is sometimes used interchangeably to refer to both the generator function as well as the generator iterator.

Now let us confirm that the generator_iterator is indeed an iterator. As per the iterator protocol, an iterator must:

  1. return its elements one by one when next() is called on it. When all the elements are exhausted, it must raise StopIteration.

     >>> generator_iterator = generate_even_numbers(0, 3)
     >>> next(generator_iterator)
     0
     >>> next(generator_iterator)
     2
     >>> next(generator_iterator)
     4
     >>> next(generator_iterator)
     Traceback (most recent call last):
       File "<stdin>", line 1, in <module>
     StopIteration

  2. return itself when iter() is called on it.

     >>> generator_iterator = generate_even_numbers(0, 3)
     >>> generator_iterator
     <generator object generate_even_numbers at 0x10cb431b0>
     >>> iter(generator_iterator)
     <generator object generate_even_numbers at 0x10cb431b0>
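For comparison, here is a rough sketch of what the same iterator might look like if written by hand as a class. The class name EvenNumbers and its attributes are made up for illustration; the point is only that both requirements of the iterator protocol have to be implemented explicitly.

```python
class EvenNumbers:
    """Hand-written iterator that mirrors generate_even_numbers() (illustrative only)."""

    def __init__(self, start, count):
        # Make sure that the first number is even.
        self.current = start if start % 2 == 0 else start + 1
        self.remaining = count

    def __iter__(self):
        # An iterator returns itself when iter() is called on it.
        return self

    def __next__(self):
        # Raise StopIteration once all the elements are exhausted.
        if self.remaining <= 0:
            raise StopIteration
        value = self.current
        self.current += 2
        self.remaining -= 1
        return value


print(list(EvenNumbers(3, 2)))  # [4, 6]
```

The generator function achieves the same behaviour in a handful of lines, because Python supplies __iter__() and __next__() for it.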

So now we know that the generator function is a convenient way to create an iterator. But what makes this function different from ordinary functions in Python? How does it return an iterator? The answer lies in the yield statement.

How does the generator function work?

Let us revisit our generator example, now with some prints so that we can clearly understand how it works.

def generate_even_numbers(start, count):
    print("In the generator function")

    # Make sure that the first number is even.
    start = start if start % 2 == 0 else start + 1

    while count > 0:
        print("[count:{}] Hello! Before I yield....".format(count))
        yield start
        print("[count:{}] Hey! I am back!!".format(count))
        start += 2
        count -= 1
    print("[count:{}] That's all I have got...".format(count))

Now let us see its usage.

>>> generator_iterator = generate_even_numbers(3, 2)
>>> for num in generator_iterator:
...     print("Processing even number: {}".format(num))
...
In the generator function
[count:2] Hello! Before I yield....
Processing even number: 4
[count:2] Hey! I am back!!
[count:1] Hello! Before I yield....
Processing even number: 6
[count:1] Hey! I am back!!
[count:0] That's all I have got...

There are a couple of things you should notice:

  1. Lazy evaluation
  2. How the yield statement works

Let us discuss these.

Lazy evaluation

Calling generate_even_numbers(3, 2) just returns a generator iterator; it does not start executing the function body. This is called lazy evaluation. The generator starts executing and yielding values only when they are needed, that is, when next() is called. As a result, only one element is produced and held in memory at a time, rather than the whole sequence. This makes generators memory-efficient and hence useful when dealing with large amounts of data.
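A small sketch can make the laziness visible. The generator count_up() below is a hypothetical example that never terminates on its own, so it could not be materialised as a list; itertools.islice() from the standard library is used to pull just the first few values.

```python
import itertools


def count_up():
    """Hypothetical endless generator: yields 0, 1, 2, ... forever."""
    n = 0
    while True:
        yield n
        n += 1


numbers = count_up()  # Returns immediately; none of the body has run yet.
first_five = list(itertools.islice(numbers, 5))  # Only five values are ever computed.
print(first_five)  # [0, 1, 2, 3, 4]
```

Because values are produced on demand, the endless loop inside count_up() is harmless as long as the caller stops asking for more.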

How does the yield statement work?

By now, you may have gathered that the only special thing about a generator function, compared with a normal function, is that it uses yield to return its values. However, the yield statement is very different from a normal return statement.

The yield statement makes a function a generator.

When next() is called on the generator iterator, the generator function executes till a yield statement is encountered. When the yield statement is reached, the execution state of the function is remembered (including the local variables and any try statements) and the function’s execution is temporarily suspended. The value associated with the yield statement is returned by the next() method.

When next() is called again, the generator function resumes execution. The saved local execution state is recollected and the statement next to yield is executed first. Then it continues executing till the next yield statement is encountered. Thus goes the process.

Finally, if there is no more yield in the generator function when next() is called, it ends up raising StopIteration. At this point, the for-loop would exit.

A simpler example: Generator function to yield some strings

Let us make sure that all of that is clear with a simpler example.

def generate_hello_world():
    print("....Started executing the generator function")
    yield "Hello"
    print("....Between yields!")
    yield "World"
    print("....Done with yields!")

Let us see how to use the iterator returned by the generator function using next() method.

>>> # We get the generator iterator
>>> generator_iterator = generate_hello_world()

>>> # When next() is called, the function executes till the first yield statement
>>> next(generator_iterator)
....Started executing the generator function
'Hello'

>>> # When next() is called again, it picks up where it left off and executes till the next yield statement
>>> next(generator_iterator)
....Between yields!
'World'

>>> # When there are no more yields, calling next() raises StopIteration
>>> next(generator_iterator)
....Done with yields!
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Now let us see how to use the generator function in a for-loop.

>>> for word in generate_hello_world():
...     print(word)
...
....Started executing the generator function
Hello
....Between yields!
World
....Done with yields!
>>>

On a side note, pay attention to how we did not use a separate variable to hold the generator iterator as in our previous examples. We directly called the generator function with the for-loop. This is doable because of how a for-loop works in Python. The expression following “in” is evaluated only once. This expression is expected to result in an iterable. In this case, it will result in the generator iterator. Then the method iter() is called on the iterable to get the iterator associated with it. Then next() is called repeatedly on the iterator until the iterator is exhausted.
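The steps just described can be sketched by writing the loop out by hand. The helper consume() is a made-up name used only for illustration, and the print calls are dropped from the generator to keep the output short; it does roughly what a for-loop that appends each word to a list would do.

```python
def generate_hello_world():
    yield "Hello"
    yield "World"


def consume(iterable):
    """Roughly what `for word in iterable: words.append(word)` does."""
    words = []
    iterator = iter(iterable)      # iter() is called once, up front
    while True:
        try:
            word = next(iterator)  # next() is called once per pass
        except StopIteration:
            break                  # the iterator is exhausted, so the loop ends
        words.append(word)
    return words


print(consume(generate_hello_world()))  # ['Hello', 'World']
```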

Conclusion

In this post, we learned how to create generator functions in Python, how they work and how to use them.

Planet Python

Real Python: How to Work With a PDF in Python

The Portable Document Format or PDF is a file format that can be used to present and exchange documents reliably across operating systems. While the PDF was originally invented by Adobe, it is now an open standard that is maintained by the International Organization for Standardization (ISO). You can work with a preexisting PDF in Python by using the PyPDF2 package.

PyPDF2 is a pure-Python package that you can use for many different types of PDF operations.

By the end of this article, you’ll know how to do the following:

  • Extract document information from a PDF in Python
  • Rotate pages
  • Merge PDFs
  • Split PDFs
  • Add watermarks
  • Encrypt a PDF

Let’s get started!


History of pyPdf, PyPDF2, and PyPDF4

The original pyPdf package was released way back in 2005. The last official release of pyPdf was in 2010. After a lapse of around a year, a company called Phasit sponsored a fork of pyPdf called PyPDF2. The code was written to be backwards compatible with the original and worked quite well for several years, with its last release being in 2016.

There was a brief series of releases of a package called PyPDF3, and then the project was renamed to PyPDF4. All of these projects do pretty much the same thing, but the biggest difference between pyPdf and PyPDF2+ is that the latter versions added Python 3 support. There is a different Python 3 fork of the original pyPdf for Python 3, but that one has not been maintained for many years.

While PyPDF2 was recently abandoned, the new PyPDF4 does not have full backwards compatibility with PyPDF2. Most of the examples in this article will work perfectly fine with PyPDF4, but there are some that cannot, which is why PyPDF4 is not featured more heavily in this article. Feel free to swap out the imports for PyPDF2 with PyPDF4 and see how it works for you.

pdfrw: An Alternative

Patrick Maupin created a package called pdfrw that can do many of the same things that PyPDF2 does. You can use pdfrw for all of the same sorts of tasks that you will learn how to do in this article for PyPDF2, with the notable exception of encryption.

The biggest difference when it comes to pdfrw is that it integrates with the ReportLab package so that you can take a preexisting PDF and build a new one with ReportLab using some or all of the preexisting PDF.

Installation

Installing PyPDF2 can be done with pip or conda if you happen to be using Anaconda instead of regular Python.

Here’s how you would install PyPDF2 with pip:

$ pip install pypdf2

The install is quite quick as PyPDF2 does not have any dependencies. You will likely spend as much time downloading the package as you will installing it.

Now let’s move on and learn how to extract some information from a PDF.

How to Extract Document Information From a PDF in Python

You can use PyPDF2 to extract metadata and some text from a PDF. This can be useful when you’re doing certain types of automation on your preexisting PDF files.

Here are the current types of data that can be extracted:

  • Author
  • Creator
  • Producer
  • Subject
  • Title
  • Number of pages

You need to go find a PDF to use for this example. You can use any PDF you have handy on your machine. To make things easy, I went to Leanpub and grabbed a sample of one of my books for this exercise. The sample you want to download is called reportlab-sample.pdf.

Let’s write some code using that PDF and learn how you can get access to these attributes:

# extract_doc_info.py

from PyPDF2 import PdfFileReader

def extract_information(pdf_path):
    with open(pdf_path, 'rb') as f:
        pdf = PdfFileReader(f)
        information = pdf.getDocumentInfo()
        number_of_pages = pdf.getNumPages()

    txt = f"""
    Information about {pdf_path}:

    Author: {information.author}
    Creator: {information.creator}
    Producer: {information.producer}
    Subject: {information.subject}
    Title: {information.title}
    Number of pages: {number_of_pages}
    """

    print(txt)
    return information

if __name__ == '__main__':
    path = 'reportlab-sample.pdf'
    extract_information(path)

Here you import PdfFileReader from the PyPDF2 package. The PdfFileReader is a class with several methods for interacting with PDF files. In this example, you call .getDocumentInfo(), which will return an instance of DocumentInformation. This contains most of the information that you’re interested in. You also call .getNumPages() on the reader object, which returns the number of pages in the document.

Note: That last code block uses Python 3’s new f-strings for string formatting. If you’d like to learn more, you can check out Python 3’s f-Strings: An Improved String Formatting Syntax (Guide).

The information variable has several instance attributes that you can use to get the rest of the metadata you want from the document. You print out that information and also return it for potential future use.

While PyPDF2 has .extractText(), which can be used on its page objects (not shown in this example), it does not work very well. Some PDFs will return text and some will return an empty string. When you want to extract text from a PDF, you should check out the PDFMiner project instead. PDFMiner is much more robust and was specifically designed for extracting text from PDFs.

Now you’re ready to learn about rotating PDF pages.

How to Rotate Pages

Occasionally, you will receive PDFs that contain pages that are in landscape mode instead of portrait mode. Or perhaps they are even upside down. This can happen when someone scans a document to PDF or email. You could print the document out and read the paper version or you can use the power of Python to rotate the offending pages.

For this example, you can go and pick out a Real Python article and print it to PDF.

Let’s learn how to rotate a few of the pages of that article with PyPDF2:

# rotate_pages.py

from PyPDF2 import PdfFileReader, PdfFileWriter

def rotate_pages(pdf_path):
    pdf_writer = PdfFileWriter()
    pdf_reader = PdfFileReader(pdf_path)
    # Rotate page 90 degrees to the right
    page_1 = pdf_reader.getPage(0).rotateClockwise(90)
    pdf_writer.addPage(page_1)
    # Rotate page 90 degrees to the left
    page_2 = pdf_reader.getPage(1).rotateCounterClockwise(90)
    pdf_writer.addPage(page_2)
    # Add a page in normal orientation
    pdf_writer.addPage(pdf_reader.getPage(2))

    with open('rotate_pages.pdf', 'wb') as fh:
        pdf_writer.write(fh)

if __name__ == '__main__':
    path = 'Jupyter_Notebook_An_Introduction.pdf'
    rotate_pages(path)

For this example, you need to import the PdfFileWriter in addition to PdfFileReader because you will need to write out a new PDF. rotate_pages() takes in the path to the PDF that you want to modify. Within that function, you will need to create a writer object that you can name pdf_writer and a reader object called pdf_reader.

Next, you can use .getPage() to get the desired page. Here you grab page zero, which is the first page. Then you call the page object’s .rotateClockwise() method and pass in 90 degrees. Then for the second page, you call .rotateCounterClockwise() and pass it 90 degrees as well.

Note: The PyPDF2 package only allows you to rotate a page in increments of 90 degrees. You will receive an AssertionError otherwise.

After each call to the rotation methods, you call .addPage(). This will add the rotated version of the page to the writer object. The last page that you add to the writer object is page 3 without any rotation done to it.

Finally you write out the new PDF using .write(). It takes a file-like object as its parameter. This new PDF will contain three pages. The first two will be rotated in opposite directions of each other and be in landscape while the third page is a normal page.

Now let’s learn how you can merge multiple PDFs into one.

How to Merge PDFs

There are many situations where you will want to take two or more PDFs and merge them together into a single PDF. For example, you might have a standard cover page that needs to go on to many types of reports. You can use Python to help you do that sort of thing.

For this example, you can open up a PDF and print a page out as a separate PDF. Then do that again, but with a different page. That will give you a couple of inputs to use for example purposes.

Let’s go ahead and write some code that you can use to merge PDFs together:

# pdf_merging.py

from PyPDF2 import PdfFileReader, PdfFileWriter

def merge_pdfs(paths, output):
    pdf_writer = PdfFileWriter()

    for path in paths:
        pdf_reader = PdfFileReader(path)
        for page in range(pdf_reader.getNumPages()):
            # Add each page to the writer object
            pdf_writer.addPage(pdf_reader.getPage(page))

    # Write out the merged PDF
    with open(output, 'wb') as out:
        pdf_writer.write(out)

if __name__ == '__main__':
    paths = ['document1.pdf', 'document2.pdf']
    merge_pdfs(paths, output='merged.pdf')

You can use merge_pdfs() when you have a list of PDFs that you want to merge together. You will also need to know where to save the result, so this function takes a list of input paths and an output path.

Then you loop over the inputs and create a PDF reader object for each of them. Next you iterate over all the pages in the PDF file and use .addPage() to add each of those pages to the writer object.

Once you’re finished iterating over all of the pages of all of the PDFs in your list, you will write out the result at the end.

One item I would like to point out is that you could enhance this script a bit by adding in a range of pages to be added if you didn’t want to merge all the pages of each PDF. If you’d like a challenge, you could also create a command line interface for this function using Python’s argparse module.
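As one possible starting point for the page-range enhancement, here is a small sketch of a helper that works out which page indexes to copy. The function pages_to_merge() and its first and last parameters are hypothetical and not part of PyPDF2; the helper is plain Python so it can be tried without any PDF at hand.

```python
def pages_to_merge(num_pages, first=0, last=None):
    """Return the zero-based page indexes to copy from one input PDF.

    `first` and `last` are hypothetical parameters, not part of PyPDF2.
    `last` is inclusive; None means "through the final page".
    """
    if last is None or last >= num_pages:
        last = num_pages - 1
    first = max(first, 0)
    return list(range(first, last + 1))


# Inside merge_pdfs(), the inner loop could then become something like:
#     for page in pages_to_merge(pdf_reader.getNumPages(), first, last):
#         pdf_writer.addPage(pdf_reader.getPage(page))
print(pages_to_merge(10, first=2, last=4))  # [2, 3, 4]
print(pages_to_merge(3))                    # [0, 1, 2]
```

Clamping the range to the document length is a design choice made here so that short PDFs do not raise an error; you may prefer to fail loudly instead.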

Let’s find out how to do the opposite of merging!

How to Split PDFs

There are times where you might have a PDF that you need to split up into multiple PDFs. This is especially true of PDFs that contain a lot of scanned-in content, but there are a plethora of good reasons for wanting to split a PDF.

Here’s how you can use PyPDF2 to split your PDF into multiple files:

# pdf_splitting.py

from PyPDF2 import PdfFileReader, PdfFileWriter

def split(path, name_of_split):
    pdf = PdfFileReader(path)
    for page in range(pdf.getNumPages()):
        pdf_writer = PdfFileWriter()
        pdf_writer.addPage(pdf.getPage(page))

        output = f'{name_of_split}{page}.pdf'
        with open(output, 'wb') as output_pdf:
            pdf_writer.write(output_pdf)

if __name__ == '__main__':
    path = 'Jupyter_Notebook_An_Introduction.pdf'
    split(path, 'jupyter_page')

In this example, you once again create a PDF reader object and loop over its pages. For each page in the PDF, you will create a new PDF writer instance and add a single page to it. Then you will write that page out to a uniquely named file. When the script is finished running, you should have each page of the original PDF split into separate PDFs.

Now let’s take a moment to learn how you can add a watermark to your PDF.

How to Add Watermarks

Watermarks are identifying images or patterns on printed and digital documents. Some watermarks can only be seen in special lighting conditions. The reason watermarking is important is that it allows you to protect your intellectual property, such as your images or PDFs. Another term for watermark is overlay.

You can use Python and PyPDF2 to watermark your documents. You need to have a PDF that only contains your watermark image or text.

Let’s learn how to add a watermark now:

# pdf_watermarker.py

from PyPDF2 import PdfFileWriter, PdfFileReader

def create_watermark(input_pdf, output, watermark):
    watermark_obj = PdfFileReader(watermark)
    watermark_page = watermark_obj.getPage(0)

    pdf_reader = PdfFileReader(input_pdf)
    pdf_writer = PdfFileWriter()

    # Watermark all the pages
    for page in range(pdf_reader.getNumPages()):
        page = pdf_reader.getPage(page)
        page.mergePage(watermark_page)
        pdf_writer.addPage(page)

    with open(output, 'wb') as out:
        pdf_writer.write(out)

if __name__ == '__main__':
    create_watermark(
        input_pdf='Jupyter_Notebook_An_Introduction.pdf',
        output='watermarked_notebook.pdf',
        watermark='watermark.pdf')

create_watermark() accepts three arguments:

  1. input_pdf: the PDF file path to be watermarked
  2. output: the path you want to save the watermarked version of the PDF
  3. watermark: a PDF that contains your watermark image or text

In the code, you open up the watermark PDF and grab just the first page from the document as that is where your watermark should reside. Then you create a PDF reader object using the input_pdf and a generic pdf_writer object for writing out the watermarked PDF.

The next step is to iterate over the pages in the input_pdf. This is where the magic happens. You will need to call .mergePage() and pass it the watermark_page. When you do that, it will overlay the watermark_page on top of the current page. Then you add that newly merged page to your pdf_writer object.

Finally, you write the newly watermarked PDF out to disk, and you’re done!

The last topic you will learn about is how PyPDF2 handles encryption.

How to Encrypt a PDF

PyPDF2 currently only supports adding a user password and an owner password to a preexisting PDF. In PDF land, an owner password will basically give you administrator privileges over the PDF and allow you to set permissions on the document. On the other hand, the user password just allows you to open the document.

As far as I can tell, PyPDF2 doesn’t actually allow you to set any permissions on the document even though it does allow you to set the owner password.

Regardless, this is how you can add a password, which will also inherently encrypt the PDF:

# pdf_encrypt.py

from PyPDF2 import PdfFileWriter, PdfFileReader

def add_encryption(input_pdf, output_pdf, password):
    pdf_writer = PdfFileWriter()
    pdf_reader = PdfFileReader(input_pdf)

    for page in range(pdf_reader.getNumPages()):
        pdf_writer.addPage(pdf_reader.getPage(page))

    pdf_writer.encrypt(user_pwd=password, owner_pwd=None,
                       use_128bit=True)

    with open(output_pdf, 'wb') as fh:
        pdf_writer.write(fh)

if __name__ == '__main__':
    add_encryption(input_pdf='reportlab-sample.pdf',
                   output_pdf='reportlab-encrypted.pdf',
                   password='twofish')

add_encryption() takes in the input and output PDF paths as well as the password that you want to add to the PDF. It then opens a PDF writer and a reader object, as before. Since you will want to encrypt the entire input PDF, you will need to loop over all of its pages and add them to the writer.

The final step is to call .encrypt(), which takes the user password, the owner password, and whether or not 128-bit encryption should be added. The default is for 128-bit encryption to be turned on. If you set it to False, then 40-bit encryption will be applied instead.

Note: PDF encryption uses either RC4 or AES (Advanced Encryption Standard) to encrypt the PDF according to pdflib.com.

Just because you have encrypted your PDF does not mean it is necessarily secure. There are tools to remove passwords from PDFs. If you’d like to learn more, Carnegie Mellon University has an interesting paper on the topic.

Conclusion

The PyPDF2 package is quite useful and is usually pretty fast. You can use PyPDF2 to automate large jobs and leverage its capabilities to help you do your job better!

In this tutorial, you learned how to do the following:

  • Extract metadata from a PDF
  • Rotate pages
  • Merge and split PDFs
  • Add watermarks
  • Add encryption

Also keep an eye on the newer PyPDF4 package as it will likely replace PyPDF2 soon. You might also want to check out pdfrw, which can do many of the same things that PyPDF2 can do.



codingdirectional: Reverse a number with Python

In this snippet, we are going to create a Python function that reverses the digits of a number. This is one of the questions on Codewars. If you pass -123 to the function you will get -321. If you pass 1000 you will get 1. Below is the entire solution.

def reverse_number(n):
    num_list = list(str(n))
    num_list.reverse()

    if "-" in num_list:
        num_list.pop(len(num_list)-1)
        num_list.insert(0, "-")

    return int("".join(num_list))
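As an alternative sketch, the same result can be reached without moving the minus sign around in a list: reverse the digits of the absolute value with a slice and reapply the sign at the end. The name reverse_number_sliced is made up here to keep it distinct from the solution above.

```python
def reverse_number_sliced(n):
    """Reverse the digits of an integer, keeping its sign."""
    sign = -1 if n < 0 else 1
    # str(abs(n))[::-1] reverses the digit string; int() drops leading zeros.
    return sign * int(str(abs(n))[::-1])


print(reverse_number_sliced(-123))  # -321
print(reverse_number_sliced(1000))  # 1
```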

If you do follow my website you know that I always write simple python code and post them here, but starting from the next article, I will stop posting python code for a while and start to talk about my python journey and the cool software which is related to python. I hope you will appreciate this new style of writing and thus will make learning python for everyone a lot more fun than just staring at the boring and sometimes long python snippet.


If you have any solution for this problem do comment below.


Stack Abuse: Introduction to the Python Calendar Module

Introduction

Python has a built-in module named calendar that contains useful classes and functions to support a variety of calendar operations. By default, the Calendar module follows the Gregorian calendar, where Monday is the first day of the week (index 0) and Sunday is the last day of the week (index 6).

In Python, datetime and time modules also provide low-level calendar-related functionalities. In addition to these modules, the Calendar module provides essential functions related to displaying and manipulating calendars.

To print and manipulate calendars, the Calendar module has 3 important classes: Calendar, TextCalendar, and HTMLCalendar. In this article, we will see how these classes can help implement a variety of calendar related functions.

Functionalities of the Calendar Module

To use the Calendar module, we need to first import the module using:

import calendar   

Let’s take a look at the list of useful functions in this module.

Printing Calendar for a Specific Month

We can print the calendar for a specific month, by using the below function:

calendar.month(yyyy, m, w, l)   

The arguments passed to this function are the year (yyyy), month (m), date column width (w), and the number of lines per week (l), respectively. For example, let’s use this function to print the calendar of March, 2019:

print("Calendar of March 2019 is:")
print(calendar.month(2019, 3, 2, 1))

Output:

Calendar of March 2019 is:
     March 2019
Mo Tu We Th Fr Sa Su
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31

Printing Calendar for a Specific Year

We can print the calendar for a whole year, using the below function:

calendar.calendar(yyyy, w, l, c, m)   

The above function returns the calendar for the entire year, for the year specified as an argument. The arguments passed to this function are the year (yyyy), date column width (w), number of lines per week (l), number of spaces between month’s column (c), number of columns (m).

For example, to print the calendar of the year 2019, use:

print(calendar.calendar(2019, 2, 2, 6, 3))   

Output:

      January                   February                   March
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
    1  2  3  4  5  6                   1  2  3                   1  2  3
 7  8  9 10 11 12 13       4  5  6  7  8  9 10       4  5  6  7  8  9 10
14 15 16 17 18 19 20      11 12 13 14 15 16 17      11 12 13 14 15 16 17
21 22 23 24 25 26 27      18 19 20 21 22 23 24      18 19 20 21 22 23 24
28 29 30 31               25 26 27 28               25 26 27 28 29 30 31

       April                      May                       June
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
 1  2  3  4  5  6  7             1  2  3  4  5                      1  2
 8  9 10 11 12 13 14       6  7  8  9 10 11 12       3  4  5  6  7  8  9
15 16 17 18 19 20 21      13 14 15 16 17 18 19      10 11 12 13 14 15 16
22 23 24 25 26 27 28      20 21 22 23 24 25 26      17 18 19 20 21 22 23
29 30                     27 28 29 30 31            24 25 26 27 28 29 30

        July                     August                  September
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
 1  2  3  4  5  6  7                1  2  3  4                         1
 8  9 10 11 12 13 14       5  6  7  8  9 10 11       2  3  4  5  6  7  8
15 16 17 18 19 20 21      12 13 14 15 16 17 18       9 10 11 12 13 14 15
22 23 24 25 26 27 28      19 20 21 22 23 24 25      16 17 18 19 20 21 22
29 30 31                  26 27 28 29 30 31         23 24 25 26 27 28 29
                                                    30

      October                   November                  December
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
    1  2  3  4  5  6                   1  2  3                         1
 7  8  9 10 11 12 13       4  5  6  7  8  9 10       2  3  4  5  6  7  8
14 15 16 17 18 19 20      11 12 13 14 15 16 17       9 10 11 12 13 14 15
21 22 23 24 25 26 27      18 19 20 21 22 23 24      16 17 18 19 20 21 22
28 29 30 31               25 26 27 28 29 30         23 24 25 26 27 28 29
                                                    30 31

Note: Instead of calling print(), we can use the calendar.prmonth() and calendar.prcal() functions, which print the month and year calendars directly to the terminal.

Checking for a Leap Year

We can use the isleap() function to check whether a year is a leap year. The year is passed as an argument, and the function returns True if it is a leap year and False otherwise. Let’s use this function to see if 2016 is a leap year:

calendar.isleap(2016)   

Output:

True   
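For reference, the Gregorian leap-year rule itself is short enough to write out. The function is_leap() below is a hand-written sketch of that rule, shown side by side with calendar.isleap() so the two can be compared; it is not taken from the module's source.

```python
import calendar


def is_leap(year):
    """Gregorian rule: divisible by 4, except century years not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


for year in (1900, 2000, 2016, 2019):
    # Each line prints the year, our result, and the module's result.
    print(year, is_leap(year), calendar.isleap(year))
```

The century exception is why 1900 is not a leap year while 2000 is.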

Number of Leap Years within Range

It is also possible to check the number of leap years in a given range of years, specified as an argument to the below function:

calendar.leapdays(year1, year2)   

The arguments passed to the function are 2 valid year values. This function returns the number of leap years in the range from year1 up to, but not including, year2.

Example:

calendar.leapdays(2000, 2017)   

Output:

5   

As seen, there are 5 leap years between 2000 and 2017, hence the output is 5.
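A quick way to see which years are counted is to tally them one by one with isleap(). The sketch below assumes nothing beyond the two functions already introduced and shows that the end year is excluded from the range.

```python
import calendar

# Count leap years one at a time over the half-open range [2000, 2017).
manual_count = sum(calendar.isleap(year) for year in range(2000, 2017))
print(manual_count)                   # 5
print(calendar.leapdays(2000, 2017))  # 5
print(calendar.leapdays(2000, 2016))  # 4, because 2016 itself is excluded
```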

Return the Day of a Week

The weekday method takes 3 arguments, namely: year, month, and day. The function returns the day of a week, with Monday having an index of 0 and Sunday having an index of 6. For example:

calendar.weekday(2019, 3, 21)   

Output:

3   

As seen, this function returns index value “3”, which is “Thursday”.
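To turn the index into a readable name, the module's day_name and day_abbr sequences can be indexed with it. This is a small sketch; the names returned depend on the current locale, so the English names in the comments assume the default locale.

```python
import calendar

index = calendar.weekday(2019, 3, 21)
print(index)                     # 3
print(calendar.day_name[index])  # Thursday (assuming the default locale)
print(calendar.day_abbr[index])  # Thu (assuming the default locale)
```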

Getting Abbreviated Weekday Names

The function weekheader takes an argument n, which specifies the number of characters for a particular weekday name and returns a header containing abbreviated weekday names.

For example:

print (calendar.weekheader(2))   

Output:

Mo Tu We Th Fr Sa Su   

Similarly,

print (calendar.weekheader(3))   

Output:

Mon Tue Wed Thu Fri Sat Sun   

Getting Number of Days in a Month

The monthrange function takes 2 arguments: year and month. This function returns a tuple containing the index of the day of the week in which the month starts and the number of days in the month.

For example:

print (calendar.monthrange(1983, 12))   

Output:

(3, 31)

Since the first day of December, 1983 was a Thursday, the function returns index value of Thursday as the first element of the tuple, and 31 since that is the number of days in December.
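Since the result is a tuple, it can be unpacked directly into two variables. The variable names below are only illustrative; the February lines show why the year has to be passed along with the month.

```python
import calendar

first_weekday, days_in_month = calendar.monthrange(1983, 12)
print(first_weekday)  # 3
print(days_in_month)  # 31

# The number of days in February depends on the year.
print(calendar.monthrange(2016, 2)[1])  # 29
print(calendar.monthrange(2019, 2)[1])  # 28
```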

Get the Weeks in a Month

The monthcalendar function takes 2 arguments: year and month and returns a matrix, in which each row represents a week in that month.

For example:

print(calendar.monthcalendar(1983, 11))   

Output:

[[0, 1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12, 13], [14, 15, 16, 17, 18, 19, 20], [21, 22, 23, 24, 25, 26, 27], [28, 29, 30, 0, 0, 0, 0]]

As you can see, each week array begins with Monday and days outside of the month are represented with zeroes. So the first array indicates that the first day of the month is a Tuesday.
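Because every row has the same weekday layout, a single column of the matrix gives all dates that fall on one weekday. A small sketch that collects the Fridays of November 1983, assuming the default Monday-first week:

```python
import calendar

weeks = calendar.monthcalendar(1983, 11)

# Column calendar.FRIDAY (index 4) holds the Fridays; 0 means "not in this month".
fridays = [week[calendar.FRIDAY] for week in weeks if week[calendar.FRIDAY] != 0]
print(len(weeks), fridays)
```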

Modifying Default Settings

Default calendar settings can be modified to fit your needs. For example, the Calendar class accepts a firstweekday argument; the following constructor call sets Monday (index 0) as the first day of the week.

calendar.Calendar(firstweekday=0)

By default, calendars follow the European convention, with Monday as the first day of the week and Sunday as the last. Also, the month January has the index value 1 and December has the index value 12.
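To see the effect of a different setting, the sketch below builds a calendar whose weeks start on Sunday. The variable name sunday_first is illustrative; the module-level functions can be switched in a similar way with calendar.setfirstweekday().

```python
import calendar

# Weeks start on Sunday (index 6) instead of the default Monday (index 0).
sunday_first = calendar.Calendar(firstweekday=calendar.SUNDAY)
print(list(sunday_first.iterweekdays()))

# The first week of January 2019 now begins with Sunday, 30 December 2018,
# so two padding zeros come before day 1 (a Tuesday).
print(sunday_first.monthdayscalendar(2019, 1)[0])
```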

Useful Methods of the Calendar Class

The following are some of the most useful methods of the Calendar class.

The iterweekdays() Method

This method returns an iterator over the weekday indexes for one week, starting with the first weekday of the calendar.

For example:

import calendar

c = calendar.Calendar()
for i in c.iterweekdays():
    print(i, end=" ")

Output:

0 1 2 3 4 5 6   

The itermonthdates() Method

The itermonthdates() method takes 2 arguments: year and month. It returns an iterator over all days of the given month as datetime.date objects. The days before the start of the month and after the end of the month that are required to complete the weeks are also included.

Example:

import calendar

c = calendar.Calendar()
for i in c.itermonthdates(2019, 1):
    print(i, end=" ")

Output:

2018-12-31 2019-01-01 2019-01-02 2019-01-03 ..............2019-02-03   
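Since the iterator pads the month out to full weeks, a common follow-up step is to drop the dates that belong to the neighbouring months. A minimal sketch (the list name january is illustrative):

```python
import calendar

c = calendar.Calendar()
# Keep only the dates that actually fall in January 2019.
january = [d for d in c.itermonthdates(2019, 1) if d.month == 1]
print(january[0], january[-1], len(january))
```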

The itermonthdays() Method

This method is similar to the itermonthdates() method, but it only returns the day numbers.

Example:

import calendar

c = calendar.Calendar()
for i in c.itermonthdays(2019, 1):
    print(i, end=" ")

Output:

0 1 2 3 4 5 6........ 31 0 0 0   

As you can see, the days before the start of the month and after the end of the month that are needed to complete the weeks are set to “0”.

The itermonthdays2() Method

This method returns an iterator of tuples, each consisting of the day of the month and the weekday number.

Example:

import calendar

c = calendar.Calendar()
for i in c.itermonthdays2(2019, 1):
    print(i, end=" ")

Output:

(0, 0) (1, 1) (2, 2) (3, 3) (4, 4) (5, 5) (6, 6) (7, 0) (8, 1) (9, 2) ...........
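Having the weekday next to the day number makes filtering straightforward. As a sketch, the snippet below collects every Monday in January 2019 (the list name mondays is illustrative):

```python
import calendar

c = calendar.Calendar()
# day == 0 marks padding outside the month; weekday 0 is Monday.
mondays = [day for day, weekday in c.itermonthdays2(2019, 1)
           if day != 0 and weekday == calendar.MONDAY]
print(mondays)
```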

The itermonthdays3() Method

This method is similar to the itermonthdays2() method, except that it returns a tuple of the year, month, and day of the month. It is available in Python 3.7 and later.

Example:

import calendar

c = calendar.Calendar()
for i in c.itermonthdays3(2019, 1):
    print(i, end=" ")

Output:

(2018, 12, 31) (2019, 1, 1) (2019, 1, 2).....(2019, 1, 31) (2019, 2, 1) (2019, 2, 2) (2019, 2, 3)

The monthdatescalendar() Method

This method takes year and month as arguments and returns a list of full weeks in the month. Each week is a list of 7 datetime.date objects.

Example:

import calendar

c = calendar.Calendar()
for i in c.monthdatescalendar(2019, 1):
    print(i, end=" ")

Output:

[datetime.date(2018, 12, 31), datetime.date(2019, 1, 1), datetime.date(2019, 1, 2), datetime.date(2019, 1, 3), datetime.date(2019, 1, 4), datetime.date(2019, 1, 5), datetime.date(2019, 1, 6)] ..... datetime.date(2019, 2, 3)]

The monthdays2calendar() Method

This method takes year and month as arguments and returns a list of weeks, with each week being a list of 7 tuples of the day of the month and the day of the week.

Example:

import calendar

c = calendar.Calendar()
for i in c.monthdays2calendar(2019, 1):
    print(i, end=" ")

Output:

[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)] [(7, 0), (8, 1), (9, 2), (10, 3), (11, 4), (12, 5), (13, 6)] ....

As you can see, the first value of each tuple is the day of the month (0-31, where 0 marks a day outside the month) and the second value is the weekday number (0-6).

The monthdayscalendar() Method

This method takes year and month as arguments and returns a list of full weeks, with each week being a list of days of a month.

Example:

import calendar

c = calendar.Calendar()
for i in c.monthdayscalendar(2019, 1):
    print(i, end=" ")

Sample Output:

[0, 1, 2, 3, 4, 5, 6] [7, 8, 9, 10, 11, 12, 13]....[28, 29, 30, 31, 0, 0, 0]

The yeardatescalendar() Method

This method takes the year (yyyy) and the number of months in a month row (width). By default, width is 3. It returns a list of month rows; each month row contains up to width months, each month is a list of weeks, and each week is a list of 7 datetime.date objects.

Example:

import calendar

c = calendar.Calendar()
for i in c.yeardatescalendar(2019, 3):
    print(i, end=" ")

Output:

[[[datetime.date(2018, 12, 31), datetime.date(2019, 1, 1), datetime.date(2019, 1, 2), datetime.date(2019, 1, 3), datetime.date(2019, 1, 4), datetime.date(2019, 1, 5), datetime.date(2019, 1, 6)], [datetime.date(2019, 1, 7), datetime.date(2019, 1, 8), datetime.date(2019, 1, 9), datetime.date(2019, 1, 10), datetime.date(2019, 1, 11), datetime.date(2019, 1, 12), datetime.date(2019, 1, 13)], [datetime.date(2019, 1, 14), datetime.date(2019, 1, 15), datetime.date(2019, 1, 16), datetime.date(2019, 1, 17), datetime.date(2019, 1, 18), datetime.date(2019, 1, 19), datetime.date(2019, 1, 20)], [datetime.date(2019, 1, 21), datetime.date(2019, 1, 22), datetime.date(2019, 1, 23), datetime.date(2019, 1, 24), datetime.date(2019, 1, 25), datetime.date(2019, 1, 26), datetime.date(2019, 1, 27)], [datetime.date(2019, 1, 28), datetime.date(2019, 1, 29), datetime.date(2019, 1, 30), datetime.date(2019, 1, 31), datetime.date(2019, 2, 1), datetime.date(2019, 2, 2), datetime.date(2019, 2, 3)]] ... ] 

The yeardays2calendar() Method

This method takes the year (yyyy) and the number of months we want in a month row (width). By default, width is 3. It returns a list of month rows in the same nested layout as yeardatescalendar(), except that the entries in each week are tuples of the day of the month and the weekday number.

Example:

import calendar

c = calendar.Calendar()
for i in c.yeardays2calendar(2019, 3):
    print(i, end=" ")

Output:

[[[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)], [(7, 0), (8, 1), (9, 2), (10, 3), (11, 4), (12, 5), (13, 6)], [(14, 0), (15, 1), (16, 2), (17, 3), (18, 4), (19, 5), (20, 6)], [(21, 0), (22, 1), (23, 2), (24, 3), (25, 4), (26, 5), (27, 6)], [(28, 0), (29, 1), (30, 2), (31, 3), (0, 4), (0, 5), (0, 6)]], [[(0, 0), (0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (3, 6)], [(4, 0), (5, 1), (6, 2), (7, 3), (8, 4), (9, 5), (10, 6)], [(11, 0), (12, 1), (13, 2), (14, 3), (15, 4), (16, 5), (17, 6)], [(18, 0), (19, 1), (20, 2), (21, 3), (22, 4), (23, 5), (24, 6)], [(25, 0), (26, 1), (27, 2), (28, 3), (0, 4), (0, 5), (0, 6)]], [[(0, 0), (0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (3, 6)] ... ]] 

The yeardayscalendar() Method

This method takes the year (yyyy) and the number of months we want in a month row (width). By default, width is 3. It returns a list of month rows in the same nested layout, where the entries in each week are day-of-month numbers and days outside the month are 0.

Example:

import calendar

c = calendar.Calendar()
for i in c.yeardayscalendar(2019, 3):
    print(i, end=" ")

Output:

[[[0, 1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12, 13], [14, 15, 16, 17, 18, 19, 20], [21, 22, 23, 24, 25, 26, 27], [28, 29, 30, 31, 0, 0, 0]], [[0, 0, 0, 0, 1, 2, 3], [4, 5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15, 16, 17], [18, 19, 20, 21, 22, 23, 24], [25, 26, 27, 28, 0, 0, 0]], [[0, 0, 0, 0, 1, 2, 3], [4, 5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15, 16, 17], [18, 19, 20, 21, 22, 23, 24], [25, 26, 27, 28, 29, 30, 31]]] [[[1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, 14], [15, 16, 17, 18, 19, 20, 21], [22, 23, 24, 25, 26, 27, 28], [29, 30, 0, 0, 0, 0, 0]] ... ]] 
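The nesting in these year-level results is easy to lose track of in the printed output. The sketch below walks the levels of yeardayscalendar() one at a time and then counts the non-zero entries as a sanity check; the names rows and day_count are illustrative.

```python
import calendar

c = calendar.Calendar()
rows = c.yeardayscalendar(2019, 3)

# Nesting: month rows -> months -> weeks -> day numbers.
print(len(rows))        # number of month rows
print(len(rows[0]))     # months in the first row
print(rows[0][0][0])    # first week of January

# Zeros are padding, so counting non-zero entries gives the days in the year.
day_count = sum(1 for row in rows for month in row
                for week in month for day in week if day != 0)
print(day_count)
```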

The TextCalendar Class

The TextCalendar class is used to generate plain text calendars. Like the Calendar class, its constructor takes a firstweekday argument, which is set to 0 by default. Let’s look at the methods provided by the TextCalendar class.

The formatmonth() Method

This method takes 4 arguments: year, month, the width of the day columns (w), and the number of lines used by each week (l). It returns a multi-line string.

Example:

import calendar

c = calendar.TextCalendar()
print(c.formatmonth(2019, 1))

This displays the calendar for January 2019.

Output:

    January 2019
Mo Tu We Th Fr Sa Su
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31
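The optional w and l arguments are not used in the example above, so here is a small sketch of their effect. With w=3 each day column is three characters wide, which also switches the header to three-letter weekday names; the variable names wide and lines are illustrative.

```python
import calendar

c = calendar.TextCalendar()
# w=3 widens each day column to three characters; l=1 keeps one line per week.
wide = c.formatmonth(2019, 1, w=3, l=1)
print(wide)

# The first two lines are the month title and the weekday header.
lines = wide.splitlines()
```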

The prmonth() Method

This method prints a month’s calendar as returned by the formatmonth() method, so we do not need to call print() ourselves to show the calendar on the terminal.

To print the January 2019 calendar, use:

c.prmonth(2019, 1)   

The formatyear() Method

This method returns an m-column calendar for the entire year as a multi-line string. The arguments passed to this method are the year (yyyy), the date column width (w), the number of lines per week (l), the number of spaces between month columns (c), and the number of month columns (m).
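As a quick illustration of these arguments, the sketch below lays out 2019 with four months per row instead of the default three. The variable names tc and year_text are illustrative.

```python
import calendar

tc = calendar.TextCalendar()
# m=4 lays the year out as four month columns, i.e. three rows of months.
year_text = tc.formatyear(2019, w=2, l=1, c=6, m=4)
print(year_text)
```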

The LocaleTextCalendar Class

This is a subclass of the TextCalendar class. Its constructor takes an additional argument, locale, and it returns month and weekday names in the specified locale. This lets us create a text calendar in our native language, using a locale other than the current default one, provided that locale is available on the system. The module also exposes locale-dependent data directly; for example, calendar.month_name holds the month names in the current locale:

import calendar

for name in calendar.month_name:
    print(name)

This will print the names of the months as per the current locale. Note that calendar.month_name[0] is an empty string, so the loop prints an empty line first.

Output:

January
February
March
April
May
June
July
August
September
October
November
December
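To use the LocaleTextCalendar class itself, a locale name is passed to the constructor. The sketch below is written defensively because the requested locale may not be installed: the helper month_text is hypothetical, and 'de_DE.UTF-8' is only an example locale name that may need adjusting for your system.

```python
import calendar
import locale

def month_text(year, month, locale_name):
    """Return a month calendar in locale_name, or with default names if unavailable."""
    try:
        return calendar.LocaleTextCalendar(locale=locale_name).formatmonth(year, month)
    except locale.Error:
        # The requested locale is not installed on this system.
        return calendar.TextCalendar().formatmonth(year, month)

# 'de_DE.UTF-8' is an assumed example; it may not exist on your system.
text = month_text(2019, 1, "de_DE.UTF-8")
print(text)
```

If the locale is available, the month and weekday names appear in that language; otherwise the fallback prints the default names.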

The HTMLCalendar Class

This is similar to the TextCalendar class, but it generates an HTML calendar. The constructor for this class has firstweekday set to 0 by default.

Below are some of the methods provided by the HTMLCalendar class.

The formatmonth() Method

This method returns the calendar of a month as an HTML table. We can display the April 2019 calendar as an HTML table using:

import calendar

hc = calendar.HTMLCalendar()
print(hc.formatmonth(2019, 4))

Output:

<table border="0" cellpadding="0" cellspacing="0" class="month">
<tr><th colspan="7" class="month">April 2019</th></tr>
<tr><th class="mon">Mon</th><th class="tue">Tue</th><th class="wed">Wed</th><th class="thu">Thu</th><th class="fri">Fri</th><th class="sat">Sat</th><th class="sun">Sun</th></tr>
<tr><td class="mon">1</td><td class="tue">2</td><td class="wed">3</td><td class="thu">4</td><td class="fri">5</td><td class="sat">6</td><td class="sun">7</td></tr>
<tr><td class="mon">8</td><td class="tue">9</td><td class="wed">10</td><td class="thu">11</td><td class="fri">12</td><td class="sat">13</td><td class="sun">14</td></tr>
<tr><td class="mon">15</td><td class="tue">16</td><td class="wed">17</td><td class="thu">18</td><td class="fri">19</td><td class="sat">20</td><td class="sun">21</td></tr>
<tr><td class="mon">22</td><td class="tue">23</td><td class="wed">24</td><td class="thu">25</td><td class="fri">26</td><td class="sat">27</td><td class="sun">28</td></tr>
<tr><td class="mon">29</td><td class="tue">30</td><td class="noday">&nbsp;</td><td class="noday">&nbsp;</td><td class="noday">&nbsp;</td><td class="noday">&nbsp;</td><td class="noday">&nbsp;</td></tr>
</table>

The formatyear() Method

This method takes the year and the number of months in a row (width) as arguments and returns the entire year’s calendar as an HTML table. By default, the width is set to 3. We can display the 2019 calendar as an HTML table with 4 months per row using:

import calendar

hc = calendar.HTMLCalendar()
print(hc.formatyear(2019, 4))

The formatyearpage() Method

This method takes a year, the number of months in a row (width), the name of a cascading style sheet (css), and an encoding as arguments. The css and encoding arguments can be set to None if we do not want to use a style sheet or a specific encoding. The method returns an entire year’s calendar as a complete HTML page, with a default width of 3. Note that the result is a bytes object rather than a string. We can print the 2019 calendar as an HTML page using:

import calendar

hc = calendar.HTMLCalendar()
print(hc.formatyearpage(2019, 3, css=None, encoding=None))

Output:

b'<?xml version="1.0" encoding="utf-8"?>\n<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n<html>\n<head>\n<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />\n<title>Calendar for 2019</title>\n</head>\n<body>\n<table border="0" cellpadding="0" cellspacing="0" class="year">\n<tr><th colspan="3" class="year">2019</th></tr><tr><td><table border="0" cellpadding="0" cellspacing="0" class="month">\n<tr><th colspan="7" class="month">January</th></tr>\n<tr><th class="mon">Mon</th><th class="tue">Tue</th><th class="wed">Wed</th><th class="thu">Thu</th><th class="fri">Fri</th><th class="sat">Sat</th><th class="sun">Sun</th></tr>\n<tr><td class="noday">&nbsp;</td><td class="tue">1</td><td class="wed">2</td><td class="thu">3</td><td class="fri">4</td><td class="sat">5</td><td class="sun">6</td></tr> ... </table></body>\n</html>\n'   

The HTMLCalendar output looks similar to the plain text version, but it is wrapped in HTML tags. Each cell of the HTML table carries a class attribute corresponding to the day of the week. Therefore, the HTML calendar can be styled through CSS.
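As a rough sketch of that styling idea, the snippet below wraps a month table in a minimal page with an inline style block. The style rules and the page wrapper are hypothetical choices for illustration; only the class names ("sat", "sun", "noday") come from the generated table.

```python
import calendar

hc = calendar.HTMLCalendar()
table = hc.formatmonth(2019, 4)

# Hypothetical style rules; the selectors rely on the weekday class names
# ("sat", "sun") and the "noday" class that appear in the generated cells.
style = "<style>td.sat, td.sun { color: #c00; } td.noday { background: #eee; }</style>"
page = "<html><head>" + style + "</head><body>" + table + "</body></html>"
print(page)
```

Saving the printed page to a file and opening it in a browser should show the weekend columns in a different colour.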

The LocaleHTMLCalendar Class

This is a subclass of the HTMLCalendar class. Its constructor takes an additional argument, locale, and it returns month and weekday names in the specified locale as an HTML table. This lets us create an HTML calendar in our native language. For example, we can generate the April 2019 calendar as an HTML table in the ‘en_AU’ locale (spelled 'en_AU.utf8' here; the locale must be installed on the system) using:

import calendar

cal = calendar.LocaleHTMLCalendar(locale='en_AU.utf8')
print(cal.formatmonth(2019, 4))

Output:

<table border="0" cellpadding="0" cellspacing="0" class="month">
<tr><th colspan="7" class="month">April 2019</th></tr>
<tr><th class="mon">Mon</th><th class="tue">Tue</th><th class="wed">Wed</th><th class="thu">Thu</th><th class="fri">Fri</th><th class="sat">Sat</th><th class="sun">Sun</th></tr>
<tr><td class="mon">1</td><td class="tue">2</td><td class="wed">3</td><td class="thu">4</td><td class="fri">5</td><td class="sat">6</td><td class="sun">7</td></tr>
<tr><td class="mon">8</td><td class="tue">9</td><td class="wed">10</td><td class="thu">11</td><td class="fri">12</td><td class="sat">13</td><td class="sun">14</td></tr>
<tr><td class="mon">15</td><td class="tue">16</td><td class="wed">17</td><td class="thu">18</td><td class="fri">19</td><td class="sat">20</td><td class="sun">21</td></tr>
<tr><td class="mon">22</td><td class="tue">23</td><td class="wed">24</td><td class="thu">25</td><td class="fri">26</td><td class="sat">27</td><td class="sun">28</td></tr>
<tr><td class="mon">29</td><td class="tue">30</td><td class="noday">&nbsp;</td><td class="noday">&nbsp;</td><td class="noday">&nbsp;</td><td class="noday">&nbsp;</td><td class="noday">&nbsp;</td></tr>
</table>

Conclusion

In this tutorial, we discussed the different classes and subclasses of the calendar module in Python for working with week-, month- and year-oriented values. We also discussed the module-level functions of the calendar module. Along with this, we used the TextCalendar and HTMLCalendar classes to produce pre-formatted output. I hope the tutorial was informative!
