Robin Wilson: I am now a freelancer in Remote Sensing, GIS, Data Science & Python

I’ve been doing a bit of freelancing ‘on the side’ for a while – but now I’ve made it official: I am available for freelance work. Please look at my new website or contact me if you’re interested in what I can do for you, or carry on reading for more details.

Since I stopped working as an academic, and took time out to focus on my health and look after my new baby, I’ve been trying to find something which allows me to fit my work nicely around the rest of my life. I’ve done some short part-time contracts and various bits of freelance work – and I’ve now decided that freelancing is the way forward.

I’ve created a new freelance website which explains what I do and the experience I have – but to summarise here, my areas of focus are:

  • Remote Sensing – I am an expert at processing satellite and aerial imagery, and have processed time-series of thousands of images for a range of clients. I can help you produce useful information from raw satellite data, and am particularly experienced at atmospheric remote sensing and atmospheric correction.
  • GIS – I can process geographic data from a huge range of sources into a coherent data library, perform analyses and produce outputs in the form of static maps, webmaps and reports.
  • Data science – I have experience processing terabytes of data to produce insights which were used directly by the United Nations, and I can apply the same skills to processing your data: whether it is a single questionnaire or a huge automatically-generated dataset. I am particularly experienced at making research reproducible and self-documenting.
  • Python – I am an experienced Python programmer, and maintain a number of open-source modules (such as Py6S). I produce well-written, Pythonic code with high-quality tests and documentation.

The testimonials on my website show how much previous clients have valued the work I’ve done for them.

I’ve heard from various people that they were rather put off by the nature of the auction that I ran for a day’s work from me – so if you were interested in working with me but wanted a more standard sort of contract, and more than a day’s work, then please get in touch and we can discuss how we could work together.

(I’m aware that the last few posts on the blog have been focused on the auction for work, and this announcement of freelance work. Don’t worry – I’ve got some more posts lined up which are more along my usual lines. Stay tuned for posts on Leaflet webmaps and machine learning of large raster stacks)

Planet Python

Dashboard Confessions: Being a Data Nonconformist

I have no formal background in data or statistics. The skills and experience I’ve gained have been a direct result of working in analytics, figuring it out as I’ve gone along and working hard to solve problems about which I’m passionate. Big data, machine learning, complex algorithms—these are all now common associations with the world of analytics. While I have colleagues who specialize in these technical areas of our work, my journey with data has been unconventional. I often describe my own work as “finding the story” within a dataset.

The Art of Data Science

Data and narrative may seem like an odd pairing. Typically, we think of data as a hard science, primarily a left-brained activity. We associate stories with the arts and creative right-brained activities, and the idea that these can coexist can be difficult to grasp. Although some areas of data analytics are certainly more science than art, we live in a world where data is so common that we’ve had to democratize the analytics process.

Gone are the days when a small group of coders are the only ones trusted to develop insights. In many companies today, everyone is expected to analyze data, requiring even more overlap and nontraditional approaches to data. Tools like Tableau and Alteryx have proven invaluable for turning ordinary people into analysts, but jumping into analytics without the typical background can still feel like a bit of a leap.

In my last job, I worked as an analyst for KIPP DC, part of a national network of public charter schools. I helped district leaders, principals and teachers understand and interpret data about our schools so that we could make them stronger. The idea of writing a formula in Excel made most of these people cringe. But I quickly discovered that the key to success in my role was not being able to write complex calculations or model polynomial regressions. Rather, it was something far simpler: understanding what insights people needed and telling a story with data that they could understand. In some regards, this required me to take a step back from the typical left-brained approach to analysis and assume a more organic and creative posture—it wasn’t just “What does the data say?” but “What is the data prompting us to do?”


Thinking About Data in Terms of Design

In this same vein of blurring the lines between science and art, left brain and right brain, I’ve recently been learning about design thinking. In the simplest of terms, it’s just understanding what users need and how to meet those needs in an intuitive and enjoyable way. Although I didn’t realize it at the time, in retrospect, I’ve realized that my early approach to data analytics was through the lens of design thinking. I was constantly asking myself, “What are people going to do with this data?” and I tried to anticipate their next question and what other data they might need. I put myself in the shoes of the user and developed all my dashboards with them in mind. Many times, this took me outside the stereotypical box of an analyst and forced me to consider the data differently.

A couple of months into my job at KIPP, I overheard a specific goal that district leaders had for our schools. Although we already had a dashboard with data related to this goal, I found myself running into dead ends and unanswered questions when I used it. So I added several new ways of breaking down the data. With each new graph I built, I asked myself, “What can someone do with this data?” and if that answer didn’t lead to practical action, I knew my work wasn’t done and found a way to make it better. This stands in direct contrast to how some people perceive data analysis: inputting data, yielding results and communicating them as is. However, I’ve learned that data analysis isn’t like this—especially at InterWorks. It’s much more about iterating: trying different things over and over, and thinking about data in ways you never have before, in order to glean meaningful insights and deliver better reports.

Above: Me with some KIPP students at the Washington Monument

Crafting a Data Narrative

When I presented the updated dashboard to school leaders, they were excited about the new possibilities it unlocked. My graphs led to insights that started conversations that produced better performing schools. But none of this was the result of something I had learned in a statistics class. Instead, it was simply the result of a mindset that anyone can develop: understanding your audience and designing solutions with their specific needs in mind.

Every organization has unique questions it needs its data to answer, and more often than not, getting to those answers isn’t a straight and narrow path. It’s one that necessitates adaptability, living in the grey rather than sticking to black or white, and it invites creativity. While my path into analytics wasn’t what the textbook may have prescribed, I’m thankful for the outsider’s perspective it lent me. It makes balancing art and science easier, it allows me the freedom to bend and not break, and it more closely resembles what big data really looks like today: a blend of beautiful design and actionable insights—a unifying of both sides of the brain.

The post Dashboard Confessions: Being a Data Nonconformist appeared first on InterWorks.

InterWorks

Data School: Should you use “dot notation” or “bracket notation” with pandas?

If you’ve ever used the pandas library in Python, you probably know that there are two ways to select a Series (meaning a column) from a DataFrame:

# dot notation
df.col_name

# bracket notation
df['col_name']

Which method should you use? I’ll make the case for each, and then you can decide…

Why use bracket notation?

The case for bracket notation is simple: It always works.

Here are the specific cases in which you must use bracket notation, because dot notation would fail:

# column name includes a space
df['col name']

# column name matches a DataFrame method
df['count']

# column name is stored in a variable
var = 'col_name'
df[var]

# new column is created through assignment
df['new'] = 0
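To see the method-name collision concretely, here’s a minimal example (the DataFrame and its column names are hypothetical, chosen to match the failure cases above):

```python
import pandas as pd

# a column whose name contains a space, and one that shadows a method
df = pd.DataFrame({'col name': [1, 2], 'count': [3, 4]})

# df.count resolves to the DataFrame's count() method, not the column
print(callable(df.count))        # True: you got the method, not the data

# bracket notation retrieves the actual column
print(df['count'].tolist())      # [3, 4]
```

Note that pandas raises no error here: `df.count` silently returns the method, which makes this class of bug easy to miss.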

In other words, bracket notation always works, whereas dot notation only works under certain circumstances. That’s a pretty compelling case for bracket notation!

As stated in the Zen of Python:

There should be one– and preferably only one –obvious way to do it.

Why use dot notation?

If you’ve watched any of my pandas videos, you may have noticed that I use dot notation. Here are four reasons why:

Reason 1: Dot notation is easier to type

Dot notation is three fewer characters to type than bracket notation. And in terms of finger movement, typing a single period is much more convenient than typing brackets and quotes.

This might sound like a trivial reason, but if you’re selecting columns dozens (or hundreds) of times a day, it makes a real difference!

Reason 2: Dot notation is easier to read

Most of my pandas code is made up of chains of selections and methods. By using dot notation, my code is mostly adorned with periods and parentheses (plus an occasional quotation mark):

# dot notation
df.col_one.sum()
df.col_one.isna().sum()
df.groupby('col_two').col_one.sum()

If you instead use bracket notation, your code is adorned with periods and parentheses plus lots of brackets and quotation marks:

# bracket notation
df['col_one'].sum()
df['col_one'].isna().sum()
df.groupby('col_two')['col_one'].sum()

I find the dot notation code easier to read, as well as more aesthetically pleasing.

Reason 3: Dot notation is easier to remember

With dot notation, every component in a chain is separated by a period on both sides. For example, this line of code has 4 components, and thus there are 3 periods separating the individual components:

# dot notation
df.groupby('col_two').col_one.sum()

If you instead use bracket notation, some of your components are separated by periods, and some are not:

# bracket notation
df.groupby('col_two')['col_one'].sum()

With bracket notation, I often forget whether there’s supposed to be a period before ['col_one'], after ['col_one'], or both before and after ['col_one'].

With dot notation, it’s easier for me to remember the correct syntax.

Reason 4: Dot notation limits the usage of brackets

Brackets can be used for many purposes in pandas:

# selecting multiple columns
df[['col_one', 'col_two']]

# positional indexing
df.iloc[4, 2]

# label-based indexing
df.loc['row_label', 'col_one':'col_three']

# selecting a row from a Series
df.col_one['row_label']

# boolean filtering
df[(df.col_one > 5) & (df.col_two == 'value')]

If you also use bracket notation for Series selection, you end up with even more brackets in your code:

df['col_one']['row_label']
df[(df['col_one'] > 5) & (df['col_two'] == 'value')]

As you use more brackets, each bracket becomes slightly more ambiguous as to its purpose, imposing a higher mental burden on the person reading the code. By using dot notation for Series selection, you reduce bracket usage to only the essential cases.

Conclusion

If you prefer bracket notation, then you can use it all of the time! However, you still have to be familiar with dot notation in order to read other people’s code.

If you prefer dot notation, then you can use it most of the time, as long as you are diligent about renaming columns when they contain spaces or collide with DataFrame methods. However, you still have to use bracket notation when creating new columns.
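If you go the renaming route, here’s one way it might look (the DataFrame and the replacement names are hypothetical, for illustration only):

```python
import pandas as pd

# a DataFrame with dot-unfriendly column names
df = pd.DataFrame({'col name': [1, 2], 'count': [3, 4]})

# rename the method-colliding column, then replace spaces with underscores
df = df.rename(columns={'count': 'count_col'})
df.columns = df.columns.str.replace(' ', '_')

# now every column is safe to access with dot notation
print(df.col_name.sum())   # 3
print(df.count_col.sum())  # 7
```

Doing this once, right after loading the data, means the rest of your analysis can use dot notation consistently.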

Which do you prefer? Let me know in the comments!


Planet Python

Data School: Learn a new pandas trick every day!

Every weekday, I share a new "pandas trick" on social media. Each trick takes only a minute to read, yet you’ll learn something new that will save you time and energy in the future!


Want to read the 59 tricks that I’ve already posted? See below 👇

Want to see the daily trick in your social media feed? Follow me on Twitter, Facebook, LinkedIn, and YouTube

Want to watch a live demo of my top 25 tricks? Watch this video 🎥

Want to support daily pandas tricks? Become a Data School Insider 🙏


Categories

Reading files

Creating example DataFrames

Renaming columns

Selecting rows and columns

Filtering rows by condition

Manipulating strings

Working with data types

Encoding data

Extracting data from lists

Working with time series data

Handling missing values

Using aggregation functions

Random sampling

Merging DataFrames

Styling DataFrames

Exploring a dataset

Other


Planet Python